
Keyword extraction with spaCy in Python

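This is a minimal example that extracts keywords from an English article. It does not involve part-of-speech analysis or similar steps.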

# -*- coding: utf-8 -*-
# @Author  : meng_zhihao
# @Email   : 312141830@qq.com
# @File    : text-rank.py
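import spacy
import pytextrank  # required: importing it registers the "textrank" component with spaCy

# load a spaCy model, depending on language, scale, etc.
# install it first with: python -m spacy download en_core_web_sm
# Loading the model is the slowest step, so when running this as a service, load it once and reuse it.
# en_core_web_sm is the small package; en_core_web_md or en_core_web_lg are worth trying too.
nlp = spacy.load("en_core_web_sm")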

# example text
# text = "Compatibility of systems of linear constraints over the set of natural numbers. Criteria of compatibility of a system of linear Diophantine equations, strict inequations, and nonstrict inequations are considered. Upper bounds for components of a minimal set of solutions and algorithms of construction of minimal generating sets of solutions for all types of systems are given. These criteria and the corresponding algorithms for constructing a minimal supporting set of solutions can be used in solving all the considered types systems and systems of mixed types."
# read the article to analyze; 5.txt is the author's input file
with open('5.txt', encoding='utf-8') as f:
    text = f.read()

# add PyTextRank to the spaCy pipeline
nlp.add_pipe("textrank")
doc = nlp(text)

words = []
# examine the top-ranked phrases in the document
for phrase in doc._.phrases:
    # print(phrase.rank, phrase.count)
    print(phrase.chunks[0])
    words.append(phrase.chunks[0])

print(words)
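
The script above collects every phrase, but in practice only the top few are usually wanted. Below is a minimal sketch of a post-processing step, assuming each phrase exposes `text`, `rank` and `count` attributes as pytextrank phrases do. The `top_keywords` helper and the `RankedPhrase` stand-in are hypothetical names introduced here for illustration, and the sample ranks are made up so the block runs without spaCy installed.

```python
from typing import Iterable, List, NamedTuple


class RankedPhrase(NamedTuple):
    """Hypothetical stand-in with the same text/rank/count fields as a pytextrank phrase."""
    text: str
    rank: float
    count: int


def top_keywords(phrases: Iterable, limit: int = 10, min_rank: float = 0.0) -> List[str]:
    """Return up to `limit` distinct phrase texts, highest rank first."""
    if limit <= 0:
        return []
    seen = set()
    keywords = []
    # Sort explicitly so the helper does not depend on the input order.
    for phrase in sorted(phrases, key=lambda p: p.rank, reverse=True):
        key = phrase.text.lower()  # treat case variants as the same keyword
        if phrase.rank < min_rank or key in seen:
            continue
        seen.add(key)
        keywords.append(phrase.text)
        if len(keywords) == limit:
            break
    return keywords


if __name__ == "__main__":
    # Made-up ranks for illustration only, not real pytextrank output.
    sample = [
        RankedPhrase("linear constraints", 0.12, 2),
        RankedPhrase("Natural numbers", 0.09, 1),
        RankedPhrase("natural numbers", 0.08, 1),
        RankedPhrase("systems", 0.03, 4),
    ]
    print(top_keywords(sample, limit=2))
```

With the script above, the same helper could be called as `top_keywords(doc._.phrases, limit=10)`; it returns plain strings, which are easier to serialize than spaCy spans.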

