pip install spacy
spaCy also needs a language model, which can be downloaded with pip:
python3 -m spacy download en_core_web_sm
However, the download can be very slow due to network issues, so you can instead download the model directly from GitHub (make sure it matches your spaCy version): GitHub download link
Just download the *.tar.gz file.
Then change into the download directory and run:
pip install en_core_web_sm-3.1.0.tar.gz
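After the local install, a quick way to confirm that the model works with your spaCy version is to load it and print both versions (a minimal sketch; adjust the model name if you installed a different package, and you can also run python3 -m spacy validate for a compatibility check):
import spacy

nlp = spacy.load("en_core_web_sm")      # raises OSError if the model is not installed
print("spaCy:", spacy.__version__)      # installed spaCy version
print("model:", nlp.meta["version"])    # installed model version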
spaCy is quite powerful for NLP-related tasks; here are some basic usage examples:
import spacy
from spacy import displacy
from spacy.matcher import Matcher

nlp = spacy.load("en_core_web_sm")
text = """
Go to the bedroom with the guitars and black bed and empty the board.
"""
doc = nlp(text)

''' Part-of-speech extraction '''
print([(w.text, w.tag_) for w in doc])          # POS tags, fine-grained
print([(w.text, w.pos_) for w in doc])          # POS tags, coarse-grained
print([(w.text, w.label_) for w in doc.ents])   # named entity extraction

''' Visualize the dependency parse '''
html_str = displacy.render(doc, style="dep")
with open('spacy_display.html', 'w', encoding='utf-8') as f:
    f.write(html_str)

''' Rule-based matching '''
matcher = Matcher(nlp.vocab)
pattern_1 = [
    {"LOWER": "go"},
    {"TEXT": "to"},
    {"TEXT": "the", "OP": "?"},
    {"POS": "NOUN"}
]  # go to the xxx
pattern_2 = [
    {"POS": "VERB"},
    {"TEXT": "the", "OP": "?"},
    {"POS": "NOUN", "OP": "+"}
]
matcher.add("go_to_pattern", [pattern_1])
matcher.add("verb_target_pattern", [pattern_2])
matches = matcher(doc)
for match_id, start, end in matches:
    print(nlp.vocab.strings[match_id])          # pattern name
    matched_span = doc[start:end]
    print(matched_span.text)                    # matched text
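Because the patterns above use optional ("?") and repeated ("+") operators, the Matcher can return several overlapping spans for the same phrase. If you only want the longest non-overlapping matches, spacy.util.filter_spans can deduplicate them; this small follow-up sketch assumes the doc and matches variables from the code above:
from spacy.util import filter_spans

# Collect candidate spans from the Matcher results, then keep only the
# longest, non-overlapping ones.
spans = [doc[start:end] for _, start, end in matches]
for span in filter_spans(spans):
    print(span.text)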