Downloading a Single Web Page (Python 2.7)

2021-11-15 23:50:00

1. Function and purpose

The goal is simply to download the source code of one web page. The address used here is a CSDN blog:

http://blog.csdn.net/woshisangsang

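2. Downloading a web page

The urlopen function in the urllib2 module fetches the HTML source behind an address. Note that on Linux the script needs to state the interpreter path and declare its encoding (otherwise Chinese text cannot be used in the source file).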

#!/usr/bin/python2.7
# coding=UTF-8
import urllib2

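# Variables
url="http://blog.csdn.net/woshisangsang"  # the address to download

# Functions
def downloadWebsite(url):
    """Download the page at url; return its HTML, or None on failure."""
    print("start to download："+url)
    result=None  # stays None if the request fails
    try:
        result=urllib2.urlopen(url).read()
    except urllib2.URLError as ex:
        # ex.reason is not always a string, so convert it before concatenating
        print("download error,the reason is:"+str(ex.reason))
    return result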

result=downloadWebsite(url)
print(result)

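The output looks like this (only an excerpt, enough to show that the program ran successfully). From it we can infer that each href holds the address of one blog post.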

<span class="link_title"><a href="/woshisangsang/article/details/77222166">
        Java 線程與程序的速度比較（繼承Thread實作）            
        </a>
        </span>
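
The observation about href can be sketched in code. This is a minimal sketch and not part of the original program: the function name extract_links, the base address, and the regular expression are assumptions, and the pattern is inferred from the single link shown in the excerpt, so other page layouts may need a different one.

```python
import re

# Sample shaped like the excerpt above; the post title is replaced by a placeholder.
html = '''<span class="link_title"><a href="/woshisangsang/article/details/77222166">
        (post title)
        </a>
        </span>'''

def extract_links(html, base="http://blog.csdn.net"):
    """Return absolute URLs for hrefs that look like post links.

    Assumption: post paths end in /article/details/<digits>, as in the sample.
    """
    paths = re.findall(r'href="(/[^"]+/article/details/\d+)"', html)
    return [base + path for path in paths]

print(extract_links(html))
```

A regular expression is enough for a fixed, known layout like this one; for arbitrary HTML a real parser would be the safer choice.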

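3. Retrying after a download error

Failed web requests are unavoidable, so the download can be retried, and the error message is shown only if the last attempt still fails. The implementation is as follows: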

#!/usr/bin/python2.7
# coding=UTF-8
import urllib2

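# Variables
url="http://blog.csdn.net/woshisangsang11"  # the address to download

# Functions
def downloadWebsite(url,retry_time=5):
    """Download url, trying up to retry_time times before reporting the error."""
    print("start to download："+url+",the retry time is:"+str(retry_time))
    try:
        result=urllib2.urlopen(url).read()
    except urllib2.URLError as ex:
        if retry_time<=1:
            # Only HTTPError carries a status code; a plain URLError does not
            code=getattr(ex,"code","")
            return "download error,the reason is:"+str(ex.reason)+",error code"+str(code)
        else:
            return downloadWebsite(url,retry_time-1)
    return result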

result=downloadWebsite(url)
print(result)

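The result is shown below. The program stubbornly tries the download 5 times and reports the error only when the last attempt fails. The final error code is 403, which means the site could be reached but refused permission to view the page; the url used here is one I made up (by appending 11).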

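start to download：http://blog.csdn.net/woshisangsang11,the retry time is:5
start to download：http://blog.csdn.net/woshisangsang11,the retry time is:4
start to download：http://blog.csdn.net/woshisangsang11,the retry time is:3
start to download：http://blog.csdn.net/woshisangsang11,the retry time is:2
start to download：http://blog.csdn.net/woshisangsang11,the retry time is:1
download error,the reason is:Forbidden,error code403
[Finished in 1.6s]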
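
The same retry idea can also be written as a loop instead of recursion. The following is only a sketch under stated assumptions, not the program above: the names fetch_page and download_with_retry, the delay parameter, and the injectable fetch argument are all hypothetical, and the rule of giving up early on a 4xx status is a policy choice of this sketch. The import falls back to urllib.request so the sketch also runs on Python 3.

```python
import time

try:
    import urllib2 as request_lib            # Python 2.7, as used in this post
    from urllib2 import URLError, HTTPError
except ImportError:
    import urllib.request as request_lib     # Python 3 fallback
    from urllib.error import URLError, HTTPError

def fetch_page(url):
    """Fetch url and return the raw response body."""
    return request_lib.urlopen(url).read()

def download_with_retry(url, retry_time=5, delay=0, fetch=fetch_page):
    """Call fetch(url) up to retry_time times; return (body, error_message)."""
    last_error = "no attempt made"
    for attempt in range(1, retry_time + 1):
        try:
            return fetch(url), None
        except HTTPError as ex:               # must come before URLError (subclass)
            last_error = "HTTP %s %s" % (ex.code, ex.reason)
            if 400 <= ex.code < 500:
                break                         # assumption: a 4xx answer will not change
        except URLError as ex:
            last_error = str(ex.reason)       # no status code on a plain URLError
        if attempt < retry_time:
            time.sleep(delay)                 # optional pause between attempts
    return None, last_error
```

Under this policy a 403 like the one above would be reported after the first attempt, while timeouts and 5xx responses would still be retried. Returning the error separately from the body also keeps an error message from being mistaken for page content.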
