Background
Using the PyCharm editor and Python 3.x, write a spider program that crawls Baidu Tieba (Baidu's forum site).
The code is as follows:
import urllib.request
import urllib.parse

def loadPage(url):
    # Send a User-Agent header so the request looks like a normal browser
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36 LBBROWSER"}
    # Construct a Request object carrying the custom headers
    request = urllib.request.Request(url, headers=headers)
    response = urllib.request.urlopen(request)
    return response.read()
def writePage(html, filename):
    """
    Purpose: write the html content to a local file
    html: the response body returned by the server
    """
    # Write the file
    with open(filename, "w") as f:
        f.write(html)
    print("_" * 30)
def tiebaSpider(url, beginPage, endPage):
    """
    Purpose: spider scheduler, builds and processes the url for each page
    url: the common prefix of the tieba urls
    beginPage: first page to crawl
    endPage: last page to crawl
    """
    for page in range(beginPage, endPage + 1):
        pn = (page - 1) * 50  # Tieba lists 50 posts per page
        filename = "page_" + str(page) + ".html"
        fullurl = url + "&pn=" + str(pn)  # note the "=": "&pn" alone produces a malformed parameter
        html = loadPage(fullurl)
        writePage(html, filename)
if __name__ == "__main__":
    kw = input("Enter the name of the tieba to crawl: ")
    beginPage = int(input("Enter the first page: "))
    endPage = int(input("Enter the last page: "))
    url = "http://tieba.baidu.com/f?"
    key = urllib.parse.urlencode({"kw": kw})  # percent-encode the forum name
    fullurl = url + key
    tiebaSpider(fullurl, beginPage, endPage)
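As a quick check of how the page urls are assembled, the snippet below (a minimal sketch, assuming the forum name "python" as sample input) shows what urllib.parse.urlencode produces and how the pn offset maps page numbers onto Tieba's 50-posts-per-page pagination:

```python
import urllib.parse

url = "http://tieba.baidu.com/f?"
key = urllib.parse.urlencode({"kw": "python"})  # encodes the query value
fullurl = url + key
print(fullurl)  # http://tieba.baidu.com/f?kw=python

# Page 3 starts at offset (3 - 1) * 50 = 100
page = 3
pn = (page - 1) * 50
print(fullurl + "&pn=" + str(pn))  # http://tieba.baidu.com/f?kw=python&pn=100
```

For a non-ASCII forum name, urlencode percent-encodes the UTF-8 bytes, which is why the raw kw value cannot simply be concatenated into the url.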
Running the program:
Checking the documentation shows that in Python 3, response.read() returns bytes rather than str, so writing it to a file opened in text mode raises a TypeError. The fix is to open the file in binary mode in writePage: change with open(filename, "w") as f: to with open(filename, "wb") as f:
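The bytes-vs-str distinction behind this fix can be seen in isolation. In this sketch (using a temporary file rather than the spider's own output), writing bytes to a text-mode file fails, while binary mode accepts them:

```python
import os
import tempfile

# Simulate response.read(): a bytes object, like the raw page body
data = "<html>示例</html>".encode("utf-8")

path = os.path.join(tempfile.mkdtemp(), "page_1.html")

try:
    with open(path, "w") as f:  # text mode: f.write() expects str
        f.write(data)
except TypeError as e:
    print("text mode failed:", e)

with open(path, "wb") as f:  # binary mode: f.write() accepts bytes
    f.write(data)
print("wrote", os.path.getsize(path), "bytes")
```

An alternative is to keep text mode and decode first, e.g. f.write(html.decode("utf-8")), but writing the raw bytes avoids guessing the page's encoding.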
The result of running it is as follows: