
Scraping a job site with Scrapy: a problem when converting items to dict

Pipelines code

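import json

class TencentJsonPipeline(object):
    def __init__(self):
        # Binary mode ('wb'): write() accepts only bytes-like objects
        self.file = open('tencent.json','wb')

    def process_item(self, item, spider):
        # json.dumps returns a str in Python 3
        content = json.dumps(dict(item),ensure_ascii=False)+"\n"
        self.file.write(content)
        return item

    # Scrapy calls close_spider(spider); a method named close_project is never invoked
    def close_spider(self, spider):
        self.file.close()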

Error:

self.file.write(content)
TypeError: a bytes-like object is required, not 'str'      

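This is a basic encoding/decoding problem. In Python 3, json.dumps returns a str, but a file opened with 'wb' only accepts bytes. The JSON file should be opened with 'w' instead, with the encoding set to utf-8.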
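The difference between the two modes can be reproduced outside Scrapy. The following is a minimal sketch, assuming Python 3; the item dict and the temporary file path are made up for illustration and are not from the original project.

```python
import json
import os
import tempfile

# Hypothetical item standing in for a scraped job posting.
item = {"position": "工程師", "location": "深圳"}

# json.dumps returns a str; ensure_ascii=False keeps non-ASCII characters unescaped.
content = json.dumps(item, ensure_ascii=False) + "\n"

# Hypothetical output path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.json")

# Binary mode accepts only bytes-like objects, so writing a str raises TypeError.
binary_mode_error = None
try:
    with open(path, "wb") as f:
        f.write(content)
except TypeError as exc:
    binary_mode_error = exc

# Text mode with an explicit encoding accepts str and encodes it on write.
with open(path, "w", encoding="utf-8") as f:
    f.write(content)

# Read the line back to confirm the data survived the round trip.
with open(path, encoding="utf-8") as f:
    round_tripped = json.loads(f.readline())
```

Keeping 'wb' and writing content.encode('utf-8') would presumably work as well; opening in text mode simply leaves the encoding step to the file object.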

Corrected code:

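import json

class TencentJsonPipeline(object):
    def __init__(self):
        # Text mode with an explicit encoding, so write() accepts str
        self.file = open('tencent.json','w',encoding='utf-8')

    def process_item(self, item, spider):
        # One JSON object per line; ensure_ascii=False keeps Chinese text readable
        content = json.dumps(dict(item),ensure_ascii=False)+"\n"
        self.file.write(content)
        return item

    # Scrapy calls close_spider(spider); a method named close_project is never invoked
    def close_spider(self, spider):
        self.file.close()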

With this change the pipeline runs correctly.
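Scrapy's item pipeline interface also provides open_spider and close_spider hooks, which are called when the spider starts and finishes. The following is a sketch of the same idea using those hooks, not the original author's code: the class name TencentJsonLinesPipeline and the path parameter are hypothetical, added so the class can be exercised without running a crawl.

```python
import json


class TencentJsonLinesPipeline:
    """Hypothetical variant that opens the file when the spider starts."""

    def __init__(self, path="tencent.json"):
        # path is an assumed parameter; it defaults to the file name used above.
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Text mode with an explicit encoding, so write() accepts str.
        self.file = open(self.path, "w", encoding="utf-8")

    def process_item(self, item, spider):
        # Write one JSON object per line and pass the item on unchanged.
        self.file.write(json.dumps(dict(item), ensure_ascii=False) + "\n")
        return item

    def close_spider(self, spider):
        # Called by Scrapy when the spider finishes.
        self.file.close()
```

Opening the file in open_spider rather than in __init__ ties the file's lifetime to the crawl, so it is closed as soon as the spider finishes.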

Reference: https://stackoverflow.com/questions/44682018/typeerror-object-of-type-bytes-is-not-json-serializable