Different default encodings across operating systems
First, take a look at the same code in three environments.
Linux, Python 2.7
>>> B = b'\xc3\x84\xc3\xa8'
>>> B.decode('utf-8')
u'\xc4\xe8'
>>> type(B)
<type 'str'>
>>>
Windows, Python 2.7, Python shell
>>> B = b'\xc3\x84\xc3\xa8'
>>> B.decode('utf-8')
u'\xc4\xe8'
>>> print B.decode('utf-8')
Äè
>>>
Windows, Python 2.7, Python in the cmd console
>>> B = b'\xc3\x84\xc3\xa8'
>>> B.decode('utf-8')
u'\xc4\xe8'
>>> print B.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'gbk' codec can't encode character u'\xc4' in position 0: illegal multibyte sequence
>>>
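Why the three environments produce different output:
print has to encode the unicode string into the encoding of the output device. The Windows console defaults to GBK, which has no mapping for the character u'\xc4' (Ä), so the encode step raises UnicodeEncodeError. Linux defaults to UTF-8, which can encode every Unicode character.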
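The failure can be reproduced without a Windows console by encoding explicitly. The following is a minimal sketch, not part of the original post; the helper name try_encode is hypothetical, and the code is written so it should run on both Python 2.7 and Python 3.

```python
# Same bytes as in the sessions above.
raw = b'\xc3\x84\xc3\xa8'
text = raw.decode('utf-8')  # the two characters U+00C4 and U+00E8


def try_encode(value, encoding):
    """Hypothetical helper: return the encoded bytes, or None on failure."""
    try:
        return value.encode(encoding)
    except UnicodeEncodeError:
        return None


# UTF-8 can represent every Unicode character, so this round-trips.
utf8_result = try_encode(text, 'utf-8')

# GBK has no mapping for U+00C4, so encoding fails, as in the cmd console.
gbk_result = try_encode(text, 'gbk')

# With errors='replace', unencodable characters become '?' instead of raising.
gbk_fallback = text.encode('gbk', 'replace')
```

Here utf8_result equals the original bytes, gbk_result is None, and gbk_fallback starts with a question mark in place of the character GBK cannot represent.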
------------------------------------------------------
Check the default encoding on Linux:
# env | grep LANG
LANG=zh_CN.UTF-8
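The same information can also be read from inside Python on either system. This is a rough sketch that the original post does not include; the function name describe_encodings is an assumption, while sys.stdout.encoding, locale.getpreferredencoding() and sys.getfilesystemencoding() are standard library calls.

```python
import locale
import sys


def describe_encodings():
    """Hypothetical helper: collect the encodings Python reports."""
    return {
        # Encoding used when print writes text to the terminal; may be
        # None when output is redirected (Python 2) or stdout is replaced.
        'stdout': getattr(sys.stdout, 'encoding', None),
        # Encoding the locale suggests for text data.
        'preferred': locale.getpreferredencoding(),
        # Encoding used for file names.
        'filesystem': sys.getfilesystemencoding(),
    }


info = describe_encodings()
for name in sorted(info):
    print('%s: %s' % (name, info[name]))
```

The values depend on the machine, so no fixed output is shown here.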
------------------------------------------------------
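Check the default encoding of the Windows console:
Open a console with cmd ----> Properties ----> the code page shown is 936 (Simplified Chinese, GBK)
(Creating text files on Linux and on Windows and checking their encodings confirmed this.)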
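The text-file check can be sketched in Python as well. This is an illustration under assumptions, not the procedure the author used: the helper encoded_bytes_on_disk and the sample character are hypothetical, and the encoding is passed explicitly instead of relying on the platform default.

```python
import io
import os
import tempfile


def encoded_bytes_on_disk(text, encoding):
    """Hypothetical helper: write text with the given encoding and
    return the raw bytes that end up in the file."""
    fd, path = tempfile.mkstemp(suffix='.txt')
    os.close(fd)
    try:
        with io.open(path, 'w', encoding=encoding) as handle:
            handle.write(text)
        with io.open(path, 'rb') as handle:
            return handle.read()
    finally:
        os.remove(path)


sample = u'\u4e2d'  # a Chinese character that both encodings can represent

utf8_bytes = encoded_bytes_on_disk(sample, 'utf-8')  # three bytes
gbk_bytes = encoded_bytes_on_disk(sample, 'gbk')     # two bytes
```

The same character is stored as three bytes under UTF-8 and two bytes under GBK, which is why a file written with one default and read with the other shows up as garbled text.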
Reposted from: https://www.cnblogs.com/Micang/p/9733028.html