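This is a minimal example that extracts keywords from an English article. It does not involve part-of-speech analysis or similar steps.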
# -*- coding: utf-8 -*-
# @Author  : meng_zhihao
# @Email   : 312141830@qq.com
# @File    : text-rank.py
import spacy
import pytextrank  # this import is required: it registers the "textrank" component

# Load a spaCy model, depending on language, scale, etc.
# Install it first with: python -m spacy download en_core_web_sm
# Loading the model is the most time-consuming step, so when running this
# as a service, avoid loading it repeatedly!
# This is the small package; you can also try en_core_web_lg.
nlp = spacy.load("en_core_web_sm")

# Example text
# text = "Compatibility of systems of linear constraints over the set of natural numbers. Criteria of compatibility of a system of linear Diophantine equations, strict inequations, and nonstrict inequations are considered. Upper bounds for components of a minimal set of solutions and algorithms of construction of minimal generating sets of solutions for all types of systems are given. These criteria and the corresponding algorithms for constructing a minimal supporting set of solutions can be used in solving all the considered types systems and systems of mixed types."
with open('5.txt') as f:
    text = f.read()

# Add PyTextRank to the spaCy pipeline
nlp.add_pipe("textrank")
doc = nlp(text)

words = []
# Examine the top-ranked phrases in the document
for phrase in doc._.phrases:
    # print(phrase.rank, phrase.count)
    print(phrase.chunks[0])
    words.append(phrase.chunks[0])
print(words)
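To give a rough feel for what a TextRank-style ranking does, here is a small pure-Python sketch. It is not how PyTextRank is implemented (PyTextRank works on spaCy tokens and noun chunks). It is a simplified word-level version that assumes a tiny hand-written stopword list, an unweighted co-occurrence graph, and a fixed number of PageRank iterations. The names `textrank_keywords`, `STOPWORDS`, `window`, `damping`, `iterations`, and `top_k` are all made up for this illustration.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; real pipelines use a fuller list).
STOPWORDS = {"a", "an", "the", "of", "and", "for", "in", "on", "to", "is", "are",
             "be", "can", "all", "over", "these", "used"}


def textrank_keywords(text, window=3, damping=0.85, iterations=50, top_k=5):
    """Rank words with a simplified TextRank: co-occurrence graph + PageRank."""
    # Lowercase, keep alphabetic tokens only, and drop stopwords.
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

    # Build an undirected graph: words within `window` tokens of each other are linked.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1:i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Run a fixed number of PageRank updates on the unweighted graph.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping) + damping * sum(
                scores[n] / len(neighbors[n]) for n in neighbors[word]
            )
            for word in neighbors
        }

    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


if __name__ == "__main__":
    sample = ("Compatibility of systems of linear constraints over the set of "
              "natural numbers. Criteria of compatibility of a system of linear "
              "Diophantine equations are considered.")
    print(textrank_keywords(sample))
```

Words that co-occur with many different neighbors end up with higher scores, which is the intuition behind the ranked phrases that `doc._.phrases` returns in the script above.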