Understanding and Practicing Language Models
(Each line of the data is one dialogue; the sentences within it are separated by __eou__)
How much can I change 100 dollars for ? __eou__ What kind of currency do you want ? __eou__ How much will it be in Chinese currency ? __eou__ That’s 680 Yuan . __eou__
What kind of account do you prefer ? Checking account or savings account ? __eou__ I would like to open a checking account . __eou__ Ok , please just fill out this form and show us your ID card . __eou__ Here you are . __eou__
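To make the data format concrete, here is a minimal sketch of turning one dialogue line into a list of tokenized sentences. The helper name `split_dialogue` is hypothetical, and plain whitespace splitting is an assumption that works only because the sample text already has spaces around punctuation; it is not the tokenizer the post itself uses.

```python
EOU = "__eou__"  # sentence separator used in the data


def split_dialogue(line):
    """Split one dialogue line into a list of lowercased, tokenized sentences."""
    sentences = []
    for part in line.split(EOU):
        tokens = part.lower().split()  # assumption: whitespace tokenization
        if tokens:  # skip the empty piece after the final __eou__
            sentences.append(tokens)
    return sentences


sample = ("How much can I change 100 dollars for ? __eou__ "
          "What kind of currency do you want ? __eou__")

for sentence in split_dialogue(sample):
    print(sentence)
```

Each dialogue becomes a list of token lists, which is a convenient shape for counting word pairs later on.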
import nltk
nltk.download()  # opens the NLTK downloader; word_tokenize needs the Punkt tokenizer data
from nltk.tokenize import word_tokenize
from nltk import bigrams, FreqDist
from math import log
# Read the data, lowercase it, replace punctuation, split into sentences
dataset = open("train_LM.txt", 'r+', encoding='utf-8').read().lower()\
.replace(',',' ').replace('.',' ').replace('?',