
Downloading punkt and stopwords for NLTK


1. Downloading punkt for NLTK and placing the files

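from nltk import word_tokenize

# Placeholder sentences: the original script's sentences are not shown.
sent1 = "This is the first example sentence."
sent2 = "This is the second example sentence."

sents = [sent1, sent2]
print(word_tokenize(sent1))  # requires the punkt resource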

Running the script raises this error:

D:\Anaconda3\python.exe "D:/002 知识总结/007 NLP/NLP入门文章/词袋模型与句子相似度.py"
[nltk_data] Error loading punkt: <urlopen error [SSL:
[nltk_data]     CERTIFICATE_VERIFY_FAILED] certificate verify failed:
[nltk_data]     unable to get local issuer certificate (_ssl.c:1123)>
Traceback (most recent call last):
  File "D:/002 知识总结/007 NLP/NLP入门文章/词袋模型与句子相似度.py", line 11, in <module>
    print(word_tokenize(sent1))
  File "D:\Anaconda3\lib\site-packages\nltk\tokenize\__init__.py", line 129, in word_tokenize
    sentences = [text] if preserve_line else sent_tokenize(text, language)
  File "D:\Anaconda3\lib\site-packages\nltk\tokenize\__init__.py", line 106, in sent_tokenize
    tokenizer = load("tokenizers/punkt/{0}.pickle".format(language))
  File "D:\Anaconda3\lib\site-packages\nltk\data.py", line 752, in load
    opened_resource = _open(resource_url)
  File "D:\Anaconda3\lib\site-packages\nltk\data.py", line 877, in _open
    return find(path_, path + [""]).open()
  File "D:\Anaconda3\lib\site-packages\nltk\data.py", line 585, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************
  Resource punkt not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt')
  
  For more information see: https://www.nltk.org/data.html

  Attempted to load tokenizers/punkt/english.pickle

  Searched in:
    - 'C:\\Users\\29617/nltk_data'
    - 'D:\\Anaconda3\\nltk_data'
    - 'D:\\Anaconda3\\share\\nltk_data'
    - 'D:\\Anaconda3\\lib\\nltk_data'
    - 'C:\\Users\\29617\\AppData\\Roaming\\nltk_data'
    - 'C:\\nltk_data'
    - 'D:\\nltk_data'
    - 'E:\\nltk_data'
    - ''
**********************************************************************


Process finished with exit code 1

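Solution:
The automatic download fails with the SSL certificate error shown above, so punkt has to be downloaded manually and placed in one of the directories listed under "Searched in", such that tokenizers/punkt/english.pickle can be found there. For the detailed steps, see "[Python] Fixing the punkt installation error when using the nltk library": https://blog.csdn.net/weixin_43896318/article/details/106191856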
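The placement step can be sketched in a few lines of Python. This is a minimal illustration, not the linked article's method: the helper names `install_nltk_zip` and `punkt_is_installed` are hypothetical, and it assumes a manually downloaded `punkt.zip` whose top-level folder is `punkt/`. The expected file path comes from the traceback ("Attempted to load tokenizers/punkt/english.pickle"); other NLTK versions may ask for a different resource name.

```python
import zipfile
from pathlib import Path


def install_nltk_zip(zip_path, nltk_data_dir, category):
    """Unpack a manually downloaded NLTK package into nltk_data/<category>/.

    category is "tokenizers" for punkt (per the path in the traceback).
    """
    target = Path(nltk_data_dir) / category
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


def punkt_is_installed(nltk_data_dir):
    """Check for the file the traceback says NLTK tried to load."""
    expected = Path(nltk_data_dir) / "tokenizers" / "punkt" / "english.pickle"
    return expected.is_file()


# Example call (hypothetical download location; adjust to your machine):
# data_dir = Path.home() / "nltk_data"   # first entry under "Searched in"
# install_nltk_zip(Path.home() / "Downloads" / "punkt.zip", data_dir, "tokenizers")
# print(punkt_is_installed(data_dir))
```

Any directory from the "Searched in" list works as `nltk_data_dir`; the user-level `nltk_data` folder is used here only because it needs no administrator rights.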

2. Downloading stopwords for NLTK and placing the files

See: "NLTK, a tool for tokenizing English text"
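To confirm that a manually placed stopwords package is readable, a small check like the following may help. It is a sketch under assumptions: the helper names `load_stopwords` and `remove_stopwords` are hypothetical, and it assumes the package was unpacked to `corpora/stopwords/` inside an `nltk_data` folder, with one plain-text file per language and one word per line. In normal use, `nltk.corpus.stopwords.words("english")` returns the same list through NLTK itself.

```python
from pathlib import Path


def load_stopwords(nltk_data_dir, language="english"):
    """Read corpora/stopwords/<language> directly from an nltk_data folder."""
    path = Path(nltk_data_dir) / "corpora" / "stopwords" / language
    with open(path, encoding="utf-8") as handle:
        # One stopword per line; skip blank lines.
        return {line.strip() for line in handle if line.strip()}


def remove_stopwords(tokens, stopword_set):
    """Drop tokens whose lowercase form is in the stopword set."""
    return [tok for tok in tokens if tok.lower() not in stopword_set]


# Example call (hypothetical location; adjust to your machine):
# stops = load_stopwords(Path.home() / "nltk_data")
# print(remove_stopwords(["This", "is", "a", "test"], stops))
```

If `load_stopwords` raises `FileNotFoundError`, the folder layout does not match what NLTK expects, and the files need to be moved.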
