<span style="font-size:24px;">Grouping the input documents from the previous post into two topics</span>
- from gensim import corpora, models, similarities
- dictionary = corpora.Dictionary.load('/tmp/deerwester.dict')
- corpus = corpora.MmCorpus('/tmp/deerwester.mm')
- print(corpus)
-
- tfidf = models.TfidfModel(corpus)
-
- doc_bow = [(0, 1), (1, 1)]
- print(tfidf[doc_bow])  # apply the tf-idf transformation to a single bag-of-words vector
-
- corpus_tfidf = tfidf[corpus]
-
- # initialize an LSI transformation with two topics
- lsi = models.LsiModel(corpus_tfidf, id2word=dictionary, num_topics=2)
- # transform the tf-idf corpus via LSI into a latent 2-D space
- corpus_lsi = lsi[corpus_tfidf]
- lsi.print_topics(2)  # logs the topics; visible once logging is set to INFO level
- for doc in corpus_lsi:
-     print(doc)
-
- lsi.save('/tmp/model.lsi')  # same for tfidf, lda
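To make the `tfidf[doc_bow]` step above less of a black box, here is a minimal pure-Python sketch of the weighting, assuming gensim's documented default scheme: raw term frequency multiplied by log2(N / df), followed by L2 normalization. The function name `tfidf_weights` and the toy corpus are hypothetical and exist only for illustration. This is not gensim's actual implementation, which among other things also filters out very small weights.

```python
import math


def tfidf_weights(doc_bow, doc_freqs, num_docs):
    """Weight a bag-of-words vector by tf * log2(N / df), then L2-normalize.

    Hypothetical helper for illustration; assumes gensim's default scheme.
    """
    weighted = []
    for term_id, tf in doc_bow:
        idf = math.log2(num_docs / doc_freqs[term_id])
        weight = tf * idf
        # A term that occurs in every document has idf 0 and is dropped.
        if weight != 0.0:
            weighted.append((term_id, weight))

    norm = math.sqrt(sum(w * w for _, w in weighted))
    if norm == 0.0:
        return []
    return [(term_id, w / norm) for term_id, w in weighted]


# Hypothetical toy corpus: three documents over term ids 0, 1 and 2.
toy_corpus = [
    [(0, 1), (1, 1)],
    [(0, 1), (2, 2)],
    [(0, 1), (1, 1), (2, 1)],
]

# Document frequency: the number of documents each term id appears in.
doc_freqs = {}
for doc in toy_corpus:
    for term_id, _ in doc:
        doc_freqs[term_id] = doc_freqs.get(term_id, 0) + 1

result = tfidf_weights([(0, 1), (1, 1)], doc_freqs, len(toy_corpus))
print(result)
```

In this toy corpus term 0 appears in every document, so it carries no weight and drops out of the result. Only term 1 helps tell the documents apart, which is the effect tf-idf is meant to have before the vectors are handed to LSI.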
Copyright © 2003-2013 www.wpsshop.cn. All rights reserved.