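While merging the contents of several txt files, Python raised UnicodeDecodeError: 'gbk' codec can't decode byte. Anyone doing NLP work runs into this error sooner or later. It shows up when the file being read contains Chinese text and its actual encoding does not match the codec Python uses to decode it (on a Chinese-language Windows system the default is GBK).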
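To make the cause concrete, here is a minimal sketch that reproduces the same exception without touching any file. The helper name try_decode is made up for this illustration; the sample bytes are the UTF-8 encoding of a single Chinese character, decoded with the GBK codec.

```python
def try_decode(data: bytes, encoding: str):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc


raw = "中".encode("utf-8")          # three bytes: b'\xe4\xb8\xad'
as_gbk = try_decode(raw, "gbk")     # wrong codec for these bytes
as_utf8 = try_decode(raw, "utf-8")  # matching codec

print(type(as_gbk).__name__, "|", as_utf8)
```

The bytes are identical in both calls; only the codec differs. That is why the fix is always to name the encoding the file was really saved in.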
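The fix I found online:
Change with open(file) as f: to with open(file, 'r', encoding='utf-8') as f:
That still failed when I ran it.
It turned out that the txt files I was merging were saved in ANSI encoding, not UTF-8.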
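If you are not sure how a file was saved, one rough way to check is to read the raw bytes and try a few candidate codecs in order. The sketch below is only a heuristic, not a reliable detector: the function name detect_encoding and the candidate list are my own assumptions, and a file can decode without error under the wrong codec.

```python
def detect_encoding(path, candidates=("utf-8-sig", "gb18030")):
    """Return the first candidate codec that decodes the whole file, else None.

    utf-8-sig accepts UTF-8 with or without a BOM; gb18030 is a superset of GBK.
    UTF-8 is tried first because GBK-encoded Chinese text usually fails to
    decode as UTF-8, while the reverse is not guaranteed.
    """
    with open(path, "rb") as f:
        raw = f.read()
    for enc in candidates:
        try:
            raw.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    return None
```

Calling detect_encoding on each input file before merging tells you which encoding to pass to open().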
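The fix that worked for me:
import os

if os.path.isfile(text_ml):
    # 'ANSI' is a Windows-only alias for the mbcs codec (the system ANSI code page)
    with open(text_ml, 'r', encoding='ANSI') as fd1, \
            open('yfys/yfys_out.txt', 'a+', encoding='ANSI') as fout1:
        text_out = fd1.read()
        print(text_out)
        # text_list_no is not defined in this snippet; presumably it is set elsewhere in the script
        fout1.write(text_list_no)
        fout1.write(text_out)
    # the with block closes both files, so no explicit close() is needed
    text_out = []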
After this change the script ran normally.
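One caveat: as far as I know, the 'ANSI' codec name only exists in Python on Windows, so the snippet above will not run on Linux or macOS. Below is a portable sketch of the same append step. It assumes the source files are GBK-compatible (the usual ANSI code page on Simplified Chinese Windows) and it writes the merged output as UTF-8. The function name append_file and its parameters are made up for this illustration.

```python
import os


def append_file(src_path, dst_path, src_encoding="gb18030", header=""):
    """Append the text of src_path to dst_path, writing the output as UTF-8.

    Returns False if src_path is not an existing file, True otherwise.
    src_encoding="gb18030" is an assumption; it is a superset of GBK.
    """
    if not os.path.isfile(src_path):
        return False
    with open(src_path, "r", encoding=src_encoding) as src:
        text = src.read()
    # Create the output directory if it is missing ("." when dst has no directory part)
    os.makedirs(os.path.dirname(dst_path) or ".", exist_ok=True)
    with open(dst_path, "a", encoding="utf-8") as dst:
        dst.write(header)   # optional label written before each file's text
        dst.write(text)
    return True
```

Writing the output as UTF-8 means the merged file can later be opened with encoding='utf-8' on any platform, which avoids running into the same error again downstream.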
Hope this helps!