
Basic NLP Techniques, Tools Series: TextBlob


Introduction to TextBlob

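  1. TextBlob is an open-source text processing library written in Python. It handles many common natural language processing tasks, such as part-of-speech tagging, noun phrase extraction, sentiment analysis, and text translation (translation was deprecated in later TextBlob releases).
  2. GitHub: https://github.com/sloria/TextBlob
  3. Official documentation: https://textblob.readthedocs.io/en/dev/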

TextBlob in Practice

Installation: pip install textblob

If the download is too slow, install from a mirror inside China instead: pip install textblob -i https://pypi.tuna.tsinghua.edu.cn/simple

In [2]:

!pip install textblob
Requirement already satisfied: textblob in /opt/conda/lib/python3.6/site-packages
Requirement already satisfied: nltk>=3.1 in /opt/conda/lib/python3.6/site-packages (from textblob)
Requirement already satisfied: six in /opt/conda/lib/python3.6/site-packages (from nltk>=3.1->textblob)
You are using pip version 9.0.1, however version 18.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Download the NLTK resource packages that TextBlob relies on:

import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('brown')
nltk.download('wordnet')

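If these downloads fail, the data can be obtained from Baidu Netdisk instead:

Link: https://pan.baidu.com/s/1n4AX-_GAI-SVBo9hhCTeVA
Extraction code: 9utb

After downloading, place the folder at this local path: C:\Users\wangchaojie\AppData\Roaming\nltk_data

Here wangchaojie is a Windows user name; replace it with your own.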
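
The path above follows the Windows convention of a per-user nltk_data folder under %APPDATA%. As a minimal sketch, assuming the APPDATA environment variable is set as on a standard Windows installation, the target folder can be derived instead of typed by hand. The helper name windows_nltk_data_dir is hypothetical and is not part of NLTK or TextBlob.

```python
import ntpath
import os


def windows_nltk_data_dir(env=None):
    """Return the per-user nltk_data folder implied by %APPDATA%.

    Hypothetical helper for illustration: it only builds the path and
    does not check whether NLTK will actually find data there.
    """
    env = os.environ if env is None else env
    appdata = env.get("APPDATA")
    if appdata is None:
        return None  # not on Windows, or the variable is unset
    # ntpath applies Windows path rules even when run on another OS
    return ntpath.join(appdata, "nltk_data")


# Example with an assumed user name "alice"
print(windows_nltk_data_dir({"APPDATA": r"C:\Users\alice\AppData\Roaming"}))
```

Once NLTK is installed, nltk.data.path lists the directories that are actually searched, which is a quick way to confirm the folder was placed correctly.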

from textblob import TextBlob
text = 'I love natural language processing! I am not like fish!'
blob = TextBlob(text)

1. Part-of-speech tagging

print(blob.tags)

Out[13]:

[('I', 'PRP'),
 ('love', 'VBP'),
 ('natural', 'JJ'),
 ('language', 'NN'),
 ('processing', 'NN'),
 ('I', 'PRP'),
 ('am', 'VBP'),
 ('not', 'RB'),
 ('like', 'IN'),
 ('fish', 'NN')]
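
The tags in this output are Penn Treebank part-of-speech tags, and punctuation such as "!" does not appear in the list. The sketch below attaches a readable description to each tag; the glossary covers only the six tags seen here, and the names TAG_MEANINGS and describe are illustrative, not part of TextBlob.

```python
# Penn Treebank tags that appear in the output above
TAG_MEANINGS = {
    "PRP": "personal pronoun",
    "VBP": "verb, non-3rd person singular present",
    "JJ": "adjective",
    "NN": "noun, singular or mass",
    "RB": "adverb",
    "IN": "preposition or subordinating conjunction",
}

# Copied from the blob.tags output shown above
tags = [("I", "PRP"), ("love", "VBP"), ("natural", "JJ"), ("language", "NN"),
        ("processing", "NN"), ("I", "PRP"), ("am", "VBP"), ("not", "RB"),
        ("like", "IN"), ("fish", "NN")]


def describe(tagged_words):
    """Pair each word with a readable description of its tag."""
    return [(word, TAG_MEANINGS.get(tag, "unknown tag"))
            for word, tag in tagged_words]


for word, meaning in describe(tags):
    print(f"{word:<12}{meaning}")
```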

2. Noun phrase extraction
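# noun_phrases is a list-like collection of the noun phrases found in the text
noun_phrases = blob.noun_phrases
for phrase in noun_phrases:
    print(phrase)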
natural language processing

3. Computing sentence sentiment

In [15]:

for sentence in blob.sentences:
    # polarity is a float in [-1.0, 1.0]; negative values indicate negative sentiment
    print(sentence + '------>' +  str(sentence.sentiment.polarity))
I love natural language processing!------>0.3125
I am not like fish!------>0.0
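
Polarity is a number between -1.0 and 1.0, so downstream code usually needs to turn it into a label. A minimal sketch follows; the 0.1 threshold is an arbitrary assumption for illustration and not a TextBlob setting, and label_polarity is a hypothetical helper.

```python
def label_polarity(polarity, threshold=0.1):
    """Map a polarity score in [-1.0, 1.0] to a coarse label.

    The threshold is an assumed cut-off, chosen only for this example.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


# Scores copied from the output above
for text, score in [("I love natural language processing!", 0.3125),
                    ("I am not like fish!", 0.0)]:
    print(text, "->", label_polarity(score))
```

Note that the second sentence scores 0.0 despite the negation, so a neutral label here reflects the score rather than the sentence's intended meaning.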

4. Tokenization (splitting text into sentences or words)
token = blob.words
for w in token:
    print(w)
I
love
natural
language
processing
I
am
not
like
fish
sentence = blob.sentences
for s in sentence:
    print(s)
I love natural language processing!
I am not like fish!
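
To make clear what the two properties produce, here is a rough standard-library approximation for this particular input. It is a simplified sketch based on regular expressions, not the tokenizer TextBlob uses internally, and it will behave differently on abbreviations, numbers, and other harder cases.

```python
import re

text = 'I love natural language processing! I am not like fish!'


def split_words(s):
    """Keep runs of letters and apostrophes, dropping punctuation."""
    return re.findall(r"[A-Za-z']+", s)


def split_sentences(s):
    """Split after '.', '!' or '?' when followed by whitespace."""
    return re.split(r"(?<=[.!?])\s+", s.strip())


print(split_words(text))
print(split_sentences(text))
```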

5. Word inflection

In [18]:

token = blob.words
for w in token:
    # pluralize
    print(w.pluralize())
    # singularize
    print(w.singularize())
we
I
love
love
naturals
natural
languages
language
processings
processing
we
I
ams
am
nots
not
likes
like
fish
fish
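
The output shows that inflection is applied to every token regardless of its part of speech, which is how forms such as "ams" and "nots" arise. One possible remedy, sketched below, is to select nouns by their tags before inflecting. The helper nouns_only is hypothetical, and the tag list is copied from the output in section 1.

```python
# Copied from the blob.tags output in section 1
tags = [("I", "PRP"), ("love", "VBP"), ("natural", "JJ"), ("language", "NN"),
        ("processing", "NN"), ("I", "PRP"), ("am", "VBP"), ("not", "RB"),
        ("like", "IN"), ("fish", "NN")]


def nouns_only(tagged_words):
    """Keep words whose Penn Treebank tag starts with 'NN' (NN, NNS, NNP, NNPS)."""
    return [word for word, tag in tagged_words if tag.startswith("NN")]


print(nouns_only(tags))  # ['language', 'processing', 'fish']
```

Only the words that survive this filter would then be passed to pluralize() or singularize().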

6. Lemmatization
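from textblob import Word
# Pass 'v' so that 'went' is lemmatized as a verb
w = Word('went')
print(w.lemmatize('v'))
# With no POS argument, a bare Word is lemmatized as a noun
w = Word('octopi')
print(w.lemmatize())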
go
octopus

7. WordNet integration

In [22]:

from textblob.wordnet import VERB
word = Word('octopus')
syn_word = word.synsets
for syn in syn_word:
    print(syn)
Synset('octopus.n.01')
Synset('octopus.n.02')

Restrict the returned synsets to verbs:

In [23]:

syn_word1 = Word("hack").get_synsets(pos=VERB)
for syn in syn_word1:
    print(syn)
Synset('chop.v.05')
Synset('hack.v.02')
Synset('hack.v.03')
Synset('hack.v.04')
Synset('hack.v.05')
Synset('hack.v.06')
Synset('hack.v.07')
Synset('hack.v.08')

Look up the definitions of a word's synsets:

In [24]:

Word("beautiful").definitions

Out[24]:

['delighting the senses or exciting intellectual or emotional admiration',
 '(of weather) highly enjoyable']

8. Spelling correction

In [25]:

sen = 'I lvoe naturl language processing!'
sen = TextBlob(sen)
print(sen.correct())
I love nature language processing!

Word.spellcheck() returns spelling suggestions together with their confidence scores:

In [26]:

w1 = Word('good')
w2 = Word('god')
w3 = Word('gd')
print(w1.spellcheck())
print(w2.spellcheck())
print(w3.spellcheck())
[('good', 1.0)]
[('god', 1.0)]
[('go', 0.586139896373057), ('god', 0.23510362694300518), ('d', 0.11658031088082901), ('g', 0.03626943005181347), ('ed', 0.009067357512953367), ('rd', 0.006476683937823834), ('nd', 0.0038860103626943004), ('gr', 0.0025906735751295338), ('sd', 0.0006476683937823834), ('md', 0.0006476683937823834), ('id', 0.0006476683937823834), ('gdp', 0.0006476683937823834), ('ga', 0.0006476683937823834), ('ad', 0.0006476683937823834)]
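
The TextBlob documentation describes its spelling correction as based on Peter Norvig's "How to Write a Spelling Corrector". The sketch below illustrates that general idea under simplifying assumptions: candidates are limited to one edit away, and confidence is each candidate's share of the total word count. WORD_COUNTS is a made-up toy dictionary and spellcheck_sketch is a hypothetical name, so the numbers will not match the real output above.

```python
from collections import Counter

# Toy word counts; a hypothetical stand-in for a real frequency dictionary
WORD_COUNTS = Counter({"go": 90, "god": 36, "good": 50, "gdp": 1})
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def spellcheck_sketch(word):
    """Return (candidate, confidence) pairs, most likely first."""
    if word in WORD_COUNTS:
        return [(word, 1.0)]  # known words are accepted as-is
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return [(word, 0.0)]  # nothing known nearby
    total = sum(WORD_COUNTS[w] for w in candidates)
    scored = [(w, WORD_COUNTS[w] / total) for w in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


print(spellcheck_sketch("good"))
print(spellcheck_sketch("gd"))
```

Because the top suggestion is simply the most frequent nearby word, it need not be the intended one. In the correct() example earlier, "naturl" became "nature" rather than "natural".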

9. Parsing

In [27]:

text = TextBlob('I lvoe naturl language processing!')
print(text.parse())
I/PRP/B-NP/O lvoe/NN/I-NP/O naturl/NN/I-NP/O language/NN/I-NP/O processing/NN/I-NP/O !/./O/O
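
Each token in this output carries four slash-separated fields. As far as I understand the default format, these are the word, its part-of-speech tag, its chunk tag (B-NP begins a noun phrase, I-NP continues one, O is outside any chunk), and a prepositional noun phrase tag. The sketch below splits the string into tuples; split_parse is a hypothetical helper and assumes no token contains a literal slash.

```python
# Output string copied from the cell above
parsed = ("I/PRP/B-NP/O lvoe/NN/I-NP/O naturl/NN/I-NP/O "
          "language/NN/I-NP/O processing/NN/I-NP/O !/./O/O")


def split_parse(parsed_text):
    """Turn 'word/POS/chunk/PNP' tokens into 4-tuples."""
    return [tuple(token.split("/")) for token in parsed_text.split()]


for word, pos, chunk, pnp in split_parse(parsed):
    print(f"{word:<12}{pos:<5}{chunk:<6}{pnp}")
```

The misspelled words "lvoe" and "naturl" are both tagged NN, so the whole sentence is chunked as a single noun phrase.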

10. N-Grams

In [28]:

text = TextBlob('I lvoe naturl language processing!')
print(text.ngrams(n=2))
[WordList(['I', 'lvoe']), WordList(['lvoe', 'naturl']), WordList(['naturl', 'language']), WordList(['language', 'processing'])]
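
An n-gram list is a sliding window of n consecutive words. The following plain-Python sketch reproduces the bigrams above as ordinary lists. It illustrates the idea and is not TextBlob's implementation; the word list is written out by hand with punctuation already removed.

```python
def ngrams(words, n=2):
    """Return every run of n consecutive words, in order."""
    return [words[i:i + n] for i in range(len(words) - n + 1)]


words = ['I', 'lvoe', 'naturl', 'language', 'processing']
print(ngrams(words, n=2))
```

When n is larger than the number of words, the result is an empty list.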

The code has also been uploaded to GitHub: https://github.com/yuquanle/StudyForNLP/blob/master/NLPtools/TextBlobDemo.ipynb
