
Solution for the error: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode

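Environment: Python 2.7
【Scenario】
I have been doing text processing lately and ran into a lot of encoding problems. chardet is a handy library that wraps encoding detection, and decoding or encoding text with the encoding it reports is fairly convenient. I did hit one small problem, though: when I passed a single word extracted from a file to chardet.detect(), it printed a screenful of warnings. The code still ran to completion, but the console output was very ugly.
【Warning output】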
/usr/lib/python2.7/dist-packages/chardet/universaldetector.py:69: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if aBuf[:3] == '\xEF\xBB\xBF':
/usr/lib/python2.7/dist-packages/chardet/universaldetector.py:72: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  elif aBuf[:4] == '\xFF\xFE\x00\x00':
/usr/lib/python2.7/dist-packages/chardet/universaldetector.py:75: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  elif aBuf[:4] == '\x00\x00\xFE\xFF':
/usr/lib/python2.7/dist-packages/chardet/universaldetector.py:78: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  elif aBuf[:4] == '\xFE\xFF\x00\x00':
/usr/lib/python2.7/dist-packages/chardet/universaldetector.py:81: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  elif aBuf[:4] == '\x00\x00\xFF\xFE':
/usr/lib/python2.7/dist-packages/chardet/universaldetector.py:84: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  elif aBuf[:2] == '\xFF\xFE':
/usr/lib/python2.7/dist-packages/chardet/universaldetector.py:87: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  elif aBuf[:2] == '\xFE\xFF':
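【Solution】
The warning reads roughly as follows:
UnicodeWarning: in a Unicode equality comparison, converting both arguments to Unicode failed, so the comparison is abandoned and the two are treated as unequal.
Python 2 generally handles text as either unicode or str objects. The warning appears when one side of == is a unicode object and the other is a str containing non-ASCII bytes: Python tries to decode the str with the default ascii codec, fails, and treats the two as unequal. The literals chardet compares against, such as '\xEF\xBB\xBF', are byte strings, so the word taken from the file must have reached chardet.detect() as a unicode object. Making both sides the same type removes the warning: pass chardet a str (bytes), for example text.encode("utf-8") when text is unicode, and decode with the detected encoding afterwards.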
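The comparisons quoted in the warning are all against byte-string literals, so one way to keep both sides the same type is to hand the detector bytes. Below is a minimal sketch of that idea, not chardet's own code. The helper names ensure_bytes and starts_with_utf8_bom are hypothetical, the UTF-8 default is an assumption, and chardet is imported only if it happens to be installed.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import codecs

try:
    import chardet  # optional third-party library; the sketch runs without it
except ImportError:
    chardet = None

# Text type: `unicode` on Python 2, `str` on Python 3.
text_type = type(u"")


def ensure_bytes(value, encoding="utf-8"):
    """Hypothetical helper: return `value` as a byte string.

    Unicode text is encoded first (UTF-8 is an assumed default);
    byte strings are returned unchanged.
    """
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value


def starts_with_utf8_bom(buf):
    """Bytes-to-bytes BOM check, similar in spirit to the quoted chardet lines."""
    return ensure_bytes(buf)[:3] == codecs.BOM_UTF8


word = u"\u4e2d\u6587"      # a unicode word, as if read from an already-decoded file
raw = ensure_bytes(word)    # bytes are what the detector compares against
print(starts_with_utf8_bom(raw))
if chardet is not None:
    print(chardet.detect(raw))
```

Because the comparison now happens between two byte strings, Python 2 has nothing to decode implicitly and no UnicodeWarning is raised.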

Also, when the Python program itself is UTF-8 encoded, text held in a str object is converted to unicode with text.decode("utf-8").
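The decode step can be sketched as a small helper. The name to_unicode is hypothetical, and the UTF-8 default is an assumption that only holds when the bytes really are UTF-8.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Text type: `unicode` on Python 2, `str` on Python 3.
text_type = type(u"")


def to_unicode(value, encoding="utf-8"):
    """Hypothetical helper: decode a byte string to unicode text.

    Values that are already unicode are returned unchanged, so the
    function is safe to call on either type.
    """
    if isinstance(value, text_type):
        return value
    return value.decode(encoding)


raw = b"\xe4\xb8\xad\xe6\x96\x87"   # UTF-8 bytes for two Chinese characters
text = to_unicode(raw)
print(len(raw), len(text))          # byte count versus character count
```

If the bytes are not valid in the given encoding, decode() raises UnicodeDecodeError, which is where an encoding reported by a detector would be used instead of the default.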

P.S. To all the brothers (and, well, the sisters too) who have resolved to "find a girlfriend right away" this year: good luck~
 
