
Getting Started with NLP: spaCy Setup (Installing and Configuring the spaCy Library)

Installing and configuring the spaCy library


I'm using Python. Open a terminal and run the following commands (reference: https://spacy.io/usage):

pip install -U pip setuptools wheel
pip install -U spacy
python -m spacy download en_core_web_sm

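When I ran the second command, pip install -U spacy, I was prompted to set up a C++ build environment. Following the prompt, I downloaded vs_buildtools, installed and opened it, ticked almost every item in the right-hand list of C++ components, and installed them. This took about half an hour. (I don't know which components are actually required, so I simply installed all of them.)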

After that, I re-ran the second command above and it succeeded.
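To confirm what actually got installed at any point in these steps, a small check like the one below can help. It is only a sketch that uses the Python standard library; the helper names installed_version and is_importable are my own, not part of spaCy, and it assumes that spaCy model packages are importable under their own names (for example en_core_web_sm).

```python
from importlib import metadata, util


def installed_version(package):
    """Return the installed version of a pip package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_importable(module):
    """Return True if a top-level module with this name can be found."""
    return util.find_spec(module) is not None


# Report the spaCy version (None means pip has not installed it).
print("spacy version:", installed_version("spacy"))

# Report whether each model package is present in this environment.
for model in ("en_core_web_sm", "en_core_web_trf"):
    print(model, "importable:", is_importable(model))
```

If the first line prints None, the pip install step has not completed in the Python environment you are running.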

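However, the third command, python -m spacy download en_core_web_sm, also failed, so I had to download the model manually.
Run pip list to check your spaCy version; mine is 3.7.5. Then go to https://github.com/explosion/spacy-models/releases, find en_core_web_trf-3.7.3.tar.gz at the bottom of the page, download it, and try installing it with pip.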
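The reason for checking the spaCy version first is that, as far as I understand the model versioning scheme, the first two numbers of a model version are meant to match the first two numbers of the spaCy version (3.7.x here). The sketch below illustrates that check and builds the archive file name to look for on the releases page. The function names are hypothetical helpers written for this post, not spaCy APIs.

```python
def release_line(version):
    """Return the 'major.minor' part of a version string such as '3.7.5'."""
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}"


def model_matches_spacy(spacy_version, model_version):
    """Assumed rule: a model fits when its major.minor equals spaCy's."""
    return release_line(spacy_version) == release_line(model_version)


def model_archive_name(model, model_version):
    """File name of the .tar.gz archive as it appears on the releases page."""
    return f"{model}-{model_version}.tar.gz"


# The versions used in this post.
print(model_matches_spacy("3.7.5", "3.7.3"))            # True
print(model_archive_name("en_core_web_trf", "3.7.3"))   # en_core_web_trf-3.7.3.tar.gz
```

Once the archive is downloaded, running pip install followed by the path to the local .tar.gz file installs the model.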

That worked. I then ran the following program:

# chatgpt generated 

import spacy

# Load spaCy's pre-trained model
nlp = spacy.load("en_core_web_trf")
 
# Transcribed sentence from Vosk
sentence = "Please pick up the red box and put it on the desk in front of me."

# Process the sentence using spaCy
doc = nlp(sentence)

# Extracting actions (verbs) and their objects
actions = []
objects = []
places = []

for token in doc:
    if token.pos_ == "VERB":
        actions.append(token.text)
        for child in token.children:
            if child.dep_ in ("dobj", "attr"):  # Direct object or attribute
                objects.append(child.text)
                # Head noun followed by its adjective/compound modifiers.
                # The head comes first, so "red box" is stored as "box red".
                noun_phrase = ' '.join([child.text] + [grandchild.text for grandchild in child.children if grandchild.dep_ in ("amod", "compound")])
                objects.append(noun_phrase)
            if child.dep_ == "prep":  # Prepositional phrase; ("prep") is a plain string, not a tuple
                # Preposition followed by its direct children, e.g. "on desk"
                place_phrase = ' '.join([child.text] + [grandchild.text for grandchild in child.children])
                places.append(place_phrase)

# Extracting target locations
for chunk in doc.noun_chunks:
    if any(substring in chunk.text for substring in ["desk", "me"]):
        places.append(chunk.text)

# Remove duplicates
objects = list(set(objects))
places = list(set(places))

# Display results
print("Actions:", actions)
print("Targeted Objects:", objects)
print("Targeted Places:", places)


The output is as follows:

# With the trf model (accuracy):
[Running] set PYTHONIOENCODING=utf8 && python -u "d:\Software\Python\pyCode\voice\spaCy\test.py"
Actions: ['pick', 'put']
Targeted Objects: ['box', 'box red', 'it']
Targeted Places: ['on desk', 'me', 'the desk']

[Done] exited with code=0 in 5.135 seconds


# With the sm model (efficiency):
[Running] set PYTHONIOENCODING=utf8 && python -u "d:\Software\Python\pyCode\voice\spaCy\test.py"
Actions: ['pick', 'put']
Targeted Objects: ['box', 'it', 'box red']
Targeted Places: ['the desk', 'on desk', 'me']

[Done] exited with code=0 in 2.182 seconds
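The two runs above contain the same items but list them in a different order. My guess is that this comes from the list(set(...)) de-duplication step rather than from the models, since a Python set does not guarantee any particular order. A minimal sketch of an order-preserving alternative follows; dedupe_keep_order is a hypothetical helper name, and the sample list is what the program collects for this sentence before de-duplication.

```python
def dedupe_keep_order(items):
    """Drop repeated items while keeping the order of first appearance."""
    # dict keys are unique and remember insertion order.
    return list(dict.fromkeys(items))


# Objects collected for the example sentence, before de-duplication.
objects = ["box", "box red", "it", "it"]
print(dedupe_keep_order(objects))  # ['box', 'box red', 'it']
```

Swapping this in for list(set(objects)) and list(set(places)) should make the printed order stable from run to run.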


If you install these two models manually, you can download them from https://spacy.io/models/en#en_core_web_trf.
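One more note on the results: both runs report the object as "box red" instead of "red box", because the program joins the head noun first and its modifiers afterwards. A possible fix is to sort the pieces by their position in the sentence before joining. The sketch below shows the idea with stand-in objects that carry only the attributes used here (i, text, dep_, children), which are the same attribute names a spaCy token exposes. noun_phrase_in_order is a hypothetical helper, not a spaCy function.

```python
from types import SimpleNamespace


def noun_phrase_in_order(head, modifier_deps=("amod", "compound")):
    """Join a head token and its selected modifiers in sentence order."""
    parts = [head] + [c for c in head.children if c.dep_ in modifier_deps]
    parts.sort(key=lambda t: t.i)  # t.i is the token's index in the sentence
    return " ".join(t.text for t in parts)


# Stand-ins for the tokens of "the red box" in the example sentence.
the = SimpleNamespace(i=3, text="the", dep_="det", children=[])
red = SimpleNamespace(i=4, text="red", dep_="amod", children=[])
box = SimpleNamespace(i=5, text="box", dep_="dobj", children=[the, red])

print(noun_phrase_in_order(box))  # red box
```

In the program above, the same function could be called with child in place of box, since it only reads attributes that spaCy tokens also provide.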
