Learning Python: the network request modules urllib and requests


Python has a reputation for making web scraping easy, and that productivity comes mainly from two modules: urllib and requests.

Introduction to urllib

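urllib.request provides the urlopen function for fetching pages. It supports multiple protocols, basic authentication, cookies, proxies and similar features.
In Python 2 there were two separate modules, urllib and urllib2.
urllib2 could accept a Request object, while urllib could only accept a URL.
urllib provided the urlencode function for encoding the parameters of a GET request; urllib2 had no equivalent.
In Python 3 these were merged into the urllib package: urlopen and Request live in urllib.request, urlencode lives in urllib.parse, and the URLError and HTTPError exceptions, which cover client-side and server-side failures, live in urllib.error.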
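As a minimal sketch of how the two exceptions can be told apart, the helper below catches HTTPError before URLError, because HTTPError is a subclass of URLError. The name describe_fetch and the wording of the returned strings are assumptions made for illustration, not part of urllib.

```python
import urllib.error
import urllib.request


def describe_fetch(url, timeout=5):
    """Hypothetical helper: return a short status string for `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return "ok: HTTP {}".format(response.status)
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        # HTTPError subclasses URLError, so it has to be caught first.
        return "server error: HTTP {}".format(err.code)
    except urllib.error.URLError as err:
        # No usable answer at all: DNS failure, refused connection,
        # missing local file, and so on.
        return "connection error: {}".format(err.reason)
```

Calling describe_fetch('http://www.baidu.com') would issue a real request and report which of the three branches was taken.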

Introduction to Requests

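Requests is a simple, easy-to-use HTTP library written in Python. It lets us complete an HTTP request with a few simple arguments, without assembling every detail by hand as with urllib. It also decodes the response body to Unicode automatically and offers rich error handling. Its features include: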

International Domains and URLs
Keep-Alive & Connection Pooling
Sessions with Cookie Persistence
Browser-style SSL Verification
Basic/Digest Authentication
Elegant Key/Value Cookies
Automatic Decompression
Unicode Response Bodies
Multipart File Uploads
Connection Timeouts
.netrc support
Python 2.6—3.4
Thread-safe
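
A few of the features listed above (sessions with cookie persistence, connection timeouts, error handling) can be sketched roughly as follows. The helper name fetch_text, the cookie name and value, and the example.com URL are hypothetical; the last lines only prepare a request without sending it, so the cookie attached by the session can be inspected offline.

```python
import requests


def fetch_text(session, url, timeout=5.0):
    """Hypothetical helper: GET `url` through `session` and return the body as str."""
    response = session.get(url, timeout=timeout)  # give up if the server is too slow
    response.raise_for_status()                   # raise requests.HTTPError on 4xx/5xx
    return response.text                          # body decoded to Unicode


session = requests.Session()
session.cookies.set("token", "abc123")  # hypothetical cookie kept by the session

# Build the request without sending it, to see what the session would attach.
prepared = session.prepare_request(
    requests.Request("GET", "http://example.com/profile")
)
print(prepared.headers.get("Cookie"))
```

Every request made through the same Session object reuses its cookies, which is what makes a login followed by further page fetches convenient.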

Below is some sample code. The environment used in this article is Python 3.6.

Requesting a single page with no parameters

import urllib.request
import requests

# Fetch the page with urllib
response = urllib.request.urlopen('http://www.baidu.com')
# read() returns the raw bytes sent by the server; decode() converts them to str
print(response.read().decode())

# Fetch the page with requests
resp = requests.get('http://www.baidu.com')
print(resp)       # the Response object, e.g. <Response [200]>
print(resp.text)  # the body decoded to str
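
Calling decode() with no argument assumes UTF-8, which fails or garbles the text when a server declares another encoding such as GBK. A possible sketch of picking the charset from the Content-Type header is shown here; the helper name read_text and the UTF-8 fallback are assumptions, and the data: URL is used only so that the example runs without network access.

```python
import urllib.request


def read_text(url, fallback="utf-8", timeout=5):
    """Hypothetical helper: fetch `url` and decode the body with its declared charset."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()  # bytes exactly as the server sent them
        # get_content_charset() reads the charset parameter of the
        # Content-Type header and returns None when it is absent.
        charset = response.headers.get_content_charset() or fallback
    return raw.decode(charset, errors="replace")


# A data: URL carries its own Content-Type, so no server is needed here.
print(read_text("data:text/plain;charset=utf-8,hello%20urllib"))
```

requests performs a comparable step on its own when resp.text is accessed, which is one reason the requests version above is shorter.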

HTTP works on a request/response model, and urllib.request provides a Request object to represent the request, so the code above can also be written like this:

req = urllib.request.Request('http://www.baidu.com')
with urllib.request.urlopen(req) as response:
    print(response.read())

Headers can be added to a Request object

req = urllib.request.Request('http://www.baidu.com')
req.add_header('User-Agent', 'Mozilla/6.0 (iPhone; CPU iPhone OS 8_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/8.0 Mobile/10A5376e Safari/8536.25')
with urllib.request.urlopen(req) as response:
    print(response.read())

Alternatively, pass the headers directly to the Request constructor.
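
A short sketch of the constructor form, using the headers keyword argument of urllib.request.Request. The User-Agent string is an illustrative placeholder, and nothing is sent over the network; the request object is only built and inspected.

```python
import urllib.request

# Illustrative header value, not a real browser string.
headers = {"User-Agent": "Mozilla/5.0 (compatible; example-client)"}

req = urllib.request.Request("http://www.baidu.com", headers=headers)

# Request stores header names via str.capitalize(), so the key
# "User-Agent" is looked up as "User-agent".
print(req.get_header("User-agent"))
```

The resulting req can be passed to urllib.request.urlopen(req) exactly like the one built with add_header.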

GET requests with parameters

A request with parameters is essentially the same as the examples above: build the URL string in advance, then issue the request.

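This example uses Tencent's stock API, which accepts a stock code and a date and returns the price and trading information for that stock at that time.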

# Call an API whose URL carries a parameter
tencent_api = "http://qt.gtimg.cn/q=sh601939"

response = urllib.request.urlopen(tencent_api)
# read() returns the raw bytes sent by the server; call decode() to convert them to str
print(response.read())

resp = requests.get(tencent_api)
print(resp)
print(resp.text)
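
When a URL takes several parameters, the query string does not have to be concatenated by hand. The sketch below builds it with urllib.parse.urlencode; the example.com endpoint and the parameter names are hypothetical and unrelated to the Tencent API above.

```python
from urllib.parse import urlencode

base_url = "http://example.com/search"      # hypothetical endpoint
params = {"q": "python urllib", "page": 2}  # hypothetical parameters

# urlencode() escapes each value and joins the pairs with "&".
query = urlencode(params)
full_url = base_url + "?" + query

print(full_url)  # http://example.com/search?q=python+urllib&page=2
```

With requests the same dictionary can be handed over directly, as in requests.get(base_url, params=params), and the library does the encoding.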

import urllib.parse
import urllib.request

url = 'http://www.someserver.com/cgi-bin/register.cgi'
values = {'name': 'Michael Foord',
          'location': 'Northampton',
          'language': 'Python'}
# urlencode() turns the dict into a query string such as name=Michael+Foord&...
data = urllib.parse.urlencode(values)
data = data.encode('ascii')  # data should be bytes
# Supplying data makes urllib send a POST request
req = urllib.request.Request(url, data)
with urllib.request.urlopen(req) as response:
    the_page = response.read()
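
Whether urllib issues a GET or a POST depends on whether the Request carries data. The following sketch checks this with Request.get_method() without contacting the server; it reuses the URL from the example above with a shortened values dictionary.

```python
import urllib.parse
import urllib.request

url = "http://www.someserver.com/cgi-bin/register.cgi"
data = urllib.parse.urlencode({"name": "Michael Foord"}).encode("ascii")

get_req = urllib.request.Request(url)              # no body
post_req = urllib.request.Request(url, data=data)  # body supplied

print(get_req.get_method())   # GET
print(post_req.get_method())  # POST
print(post_req.data)          # b'name=Michael+Foord'
```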
