The requests Module in Python 3

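　　Python's standard library provides modules such as urllib for making HTTP requests, but the API is awkward. It was built for a different era and a different web, and it takes an enormous amount of work, even method overrides, to accomplish the simplest tasks.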

　　Sending a GET request

import urllib.request

f = urllib.request.urlopen('http://www.webxml.com.cn//webservices/qqOnlineWebService.asmx/qqCheckOnline?qqCode=424662508')
result = f.read().decode('utf-8')

　　Sending a GET request with request headers

import urllib.request

req = urllib.request.Request('http://www.example.com/')
req.add_header('Referer', 'http://www.python.org/')
r = urllib.request.urlopen(req)

result = r.read().decode('utf-8')

　　See the official documentation for more details.

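　　Requests is an HTTP library written in Python and released under the Apache2 license. It is a high-level wrapper over lower-level modules that makes network requests far more pleasant for Python developers. With Requests you can easily perform almost any operation a browser can.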

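Features of the requests library:

Keep-Alive & connection pooling
International domains and URLs
Sessions with persistent cookies
Browser-style SSL verification
Automatic content decoding
Basic/Digest authentication
Elegant key/value cookies
Automatic decompression
Unicode response bodies
HTTP(S) proxy support
Multipart file uploads
Streaming downloads
Connection timeouts
Chunked requests
.netrc support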
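
Connection pooling with Keep-Alive and persistent cookies are both reached through a Session object. The following is a minimal sketch rather than a complete recipe: the cookie name `sessionid`, its value, and the `X-Example` header are made up for illustration, and the request is only prepared, never sent, so it runs without network access.

```python
import requests

# A Session reuses connections (Keep-Alive) and remembers cookies and
# default headers across requests.
session = requests.Session()
session.headers.update({'X-Example': 'demo'})  # hypothetical header
session.cookies.set('sessionid', 'abc123',     # hypothetical cookie
                    domain='httpbin.org', path='/')

# Build the request this session would send, without touching the network.
request = requests.Request('GET', 'http://httpbin.org/cookies')
prepared = session.prepare_request(request)

print(prepared.headers.get('X-Example'))
print(prepared.headers.get('Cookie'))
```

Calling session.get(url) instead of preparing the request by hand sends it over the pooled connection with the same headers and cookies.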

1. Installing the module

Install:
	pip install requests
Upgrade:
	pip install --upgrade requests

2. Using the module

　　HTTP request methods include POST, GET, PUT, DELETE, HEAD and OPTIONS; POST and GET are the most commonly used.

　　GET requests

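import requests
# Example without parameters
r = requests.get('https://httpbin.org/get')

Passing URL parameters:
    URLs often carry a query string after a ? character, as in http://httpbin.org/get?key=val. requests builds this query string from the params argument.
d = {'k1':'v1', 'k2':'v2', 'k3':None, 'k4':['v4','v5']}
# Keys whose value is None are not added to the URL
# Multiple key/value pairs are joined with &
# A value may be a list, as with 'k4'

# Example with parameters
r = requests.get('http://httpbin.org/get', params=d)
print(r.url)
Output: http://httpbin.org/get?k1=v1&k2=v2&k4=v4&k4=v5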
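
The encoding rules for params can also be checked offline. This sketch prepares a request instead of sending it; `prepare()` applies the same URL encoding that `requests.get()` would use, so the resulting URL can be inspected without a network connection.

```python
import requests

params = {'k1': 'v1', 'k2': 'v2', 'k3': None, 'k4': ['v4', 'v5']}

# prepare() builds the final URL but sends nothing.
prepared = requests.Request('GET', 'http://httpbin.org/get', params=params).prepare()

# k3 is dropped because its value is None; k4 appears once per list item.
print(prepared.url)
```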

　　POST requests

# 1. Basic POST example
 
import requests
 
payload = {'key1': 'value1', 'key2': 'value2'}
ret = requests.post("http://httpbin.org/post", data=payload)
print(ret.text)
# Output
{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "key1": "value1", 
    "key2": "value2"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Connection": "close", 
    "Content-Length": "23", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.18.4"
  }, 
  "json": null, 
  "origin": "(your IP address is returned here)", 
  "url": "http://httpbin.org/post"
}

 
# 2. Example sending request headers and data
 
import requests
import json

url = 'https://api.github.com/some/endpoint'
payload = {'some': 'data'}
headers = {'content-type': 'application/json'}
ret = requests.post(url, data=json.dumps(payload), headers=headers)
print(ret.text)
print(ret.cookies)
# Output
{"message":"Not Found","documentation_url":"https://developer.github.com/v3"}
<RequestsCookieJar[]>
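
As an alternative to calling json.dumps() and setting the content type by hand, `requests.post()` also accepts a `json` argument. The sketch below compares the two encodings on prepared requests (nothing is sent); the httpbin URL is used only as a placeholder target.

```python
import json
import requests

url = 'http://httpbin.org/post'
payload = {'some': 'data'}

# data=dict is sent as a URL-encoded form.
form = requests.Request('POST', url, data=payload).prepare()

# json=dict is serialized to JSON and the Content-Type header is set for us.
as_json = requests.Request('POST', url, json=payload).prepare()

print(form.headers['Content-Type'], form.body)
print(as_json.headers['Content-Type'], json.loads(as_json.body))
```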

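　　Response content

　The requests module returns a Response object, from which you can obtain the information you need. Below, r stands for a Response object.

r.text response body as text
r.content response body as bytes
r.json() response body decoded as JSON
r.raw raw response stream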


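# Text response content
    A Response object carries a lot of information, and Requests automatically decodes content from most Unicode character sets.
    After a request is made, Requests makes an educated guess about the response encoding based on the HTTP headers.
    You can read the encoding from r.encoding, and you can assign to r.encoding to change it.

# Binary response content
    For non-text responses, use r.content to access the body as bytes. Requests automatically decodes the gzip and deflate transfer-encodings.

# JSON response content
    Note that r.json() raises an exception if JSON decoding fails. A successful call to r.json() does not mean the request succeeded, however, because some servers
    include a JSON object in their error responses, and that JSON is decoded and returned. To check whether a request succeeded, use r.raise_for_status()
    or check that r.status_code is what you expect.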
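
The advice about checking the status before trusting r.json() can be sketched as a small helper. The function name `parse_json_response` is made up for this example; it relies only on `raise_for_status()` and `json()`, and on the fact that the decode errors raised by `json()` are subclasses of ValueError.

```python
import requests

def parse_json_response(response):
    """Return the decoded JSON body, or None if the body is not JSON.

    Raises requests.exceptions.HTTPError for 4XX/5XX responses, so an
    error payload that happens to be valid JSON is never mistaken for
    a successful result.
    """
    response.raise_for_status()
    try:
        return response.json()
    except ValueError:  # JSON decode errors from r.json() subclass ValueError
        return None
```

A typical call would look like parse_json_response(requests.get('https://httpbin.org/get', timeout=5)).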

# Raw response content
    To get the raw socket response from the server, use r.raw, and make sure stream=True is set in the initial request.
r = requests.get('https://httpbin.org/get', stream=True)
print(r.raw)
print(r.raw.read(10))
# Output
<urllib3.response.HTTPResponse object at 0x061665F0>
b'{\n  "args"'
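
For saving a streamed body to disk, iterating with `iter_content()` is usually more convenient than reading r.raw directly. This is a sketch under assumptions: the helper name `save_stream` and the default chunk size are choices made for this example, and the response is expected to come from a call made with stream=True.

```python
import requests

def save_stream(response, path, chunk_size=8192):
    """Write a streamed response body to `path` chunk by chunk.

    Returns the number of bytes written. `response` is expected to come
    from a call such as requests.get(url, stream=True).
    """
    written = 0
    with open(path, 'wb') as fh:
        for chunk in response.iter_content(chunk_size=chunk_size):
            fh.write(chunk)
            written += len(chunk)
    return written
```

Because the body is consumed in chunks, memory use stays bounded even for large downloads.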


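　　Custom headers

To add HTTP headers to a request, pass a dict to the headers parameter. Note: all header values must be a string, bytestring, or unicode. Passing unicode header values is permitted but not recommended.

Note: custom headers take lower precedence than more specific sources of information. For example:

Authorization set with headers= is overridden if credentials are specified in .netrc, which in turn is overridden by the auth= parameter.
Authorization headers are removed if you are redirected to a different host.
Proxy-Authorization headers are overridden by proxy credentials provided in the URL.
Content-Length headers are overridden when the length of the content can be determined.

Furthermore, Requests does not change its behavior based on which custom headers are specified. The headers are simply passed on into the final request.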

url = 'https://api.github.com/some/endpoint'
headers = {'user-agent': 'my-app/0.0.1'}
r = requests.get(url, headers=headers)
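
To see how a custom header combines with the defaults, the request can be prepared through a Session without being sent. This is an illustrative sketch that reuses the URL and header from the example above.

```python
import requests

url = 'https://api.github.com/some/endpoint'
headers = {'user-agent': 'my-app/0.0.1'}

session = requests.Session()
prepared = session.prepare_request(requests.Request('GET', url, headers=headers))

# The custom value replaces the default User-Agent (header names are
# case-insensitive); the other session defaults are kept.
for name, value in prepared.headers.items():
    print(f'{name}: {value}')
```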

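　　Response status codes

The response status code tells you the result of a request; 200 generally means success. Requests also comes with a built-in status code lookup object, requests.codes: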

>>> r = requests.get('http://httpbin.org/get')
>>> r.status_code
200
>>> r.status_code == requests.codes.ok
True

# If we made a bad request (a 4XX client error or a 5XX server error response), we can raise an exception with Response.raise_for_status():

>>> bad_r = requests.get('http://httpbin.org/status/404')
>>> bad_r.status_code
404

>>> bad_r.raise_for_status()
Traceback (most recent call last):
  File "requests/models.py", line 832, in raise_for_status
    raise http_error
requests.exceptions.HTTPError: 404 Client Error

# But since the status_code of r in our example was 200, calling raise_for_status() gives:
>>> r.raise_for_status()
None
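
The named constants in requests.codes can replace magic numbers in comparisons. A small sketch follows; the function `status_family` and its labels are invented for this example, and the 400 boundary mirrors raise_for_status(), which raises only for 4XX and 5XX.

```python
import requests

def status_family(status_code):
    """Classify a status code using the named constants in requests.codes."""
    if status_code < requests.codes.bad_request:            # below 400
        return 'ok'
    if status_code < requests.codes.internal_server_error:  # 400 to 499
        return 'client error'
    return 'server error'

print(status_family(requests.codes.ok))           # ok
print(status_family(requests.codes.not_found))    # client error
print(status_family(requests.codes.bad_gateway))  # server error
```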

　　Response headers

>>> r.headers
{
    'content-encoding': 'gzip',
    'transfer-encoding': 'chunked',
    'connection': 'close',
    'server': 'nginx/1.0.4',
    'x-runtime': '148ms',
    'etag': '"e1ca502697e5c9317743dc078f67693f"',
    'content-type': 'application/json'
}

# This dictionary is special, though: it is made just for HTTP headers. According to RFC 2616, HTTP header names are case-insensitive.

>>> r.headers['Content-Type']
'application/json'

>>> r.headers.get('content-type')
'application/json'
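
The case-insensitive behavior comes from the dictionary type used for r.headers, `requests.structures.CaseInsensitiveDict`, which can be tried on its own without making a request:

```python
from requests.structures import CaseInsensitiveDict

headers = CaseInsensitiveDict({'Content-Type': 'application/json'})

# Lookups ignore case, but the original spelling is kept when iterating.
print(headers['content-type'])
print(headers.get('CONTENT-TYPE'))
print(list(headers))
```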

　　Cookies

# If a response contains cookies, you can access them quickly:
>>> url = 'http://example.com/some/cookie/setting/url'
>>> r = requests.get(url)

>>> r.cookies['example_cookie_name']
'example_cookie_value'

# To send your own cookies to the server, use the cookies parameter
>>> url = 'http://httpbin.org/cookies'
>>> cookies = dict(cookies_are='working')
>>> r = requests.get(url, cookies=cookies)
>>> r.text
'{"cookies": {"cookies_are": "working"}}'

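# Cookies are returned in a RequestsCookieJar, which behaves like a dict but offers a more complete interface, suitable for use across multiple domains and paths. Cookie jars can also be passed in to requests: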
>>> jar = requests.cookies.RequestsCookieJar()
>>> jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
>>> jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')
>>> url = 'http://httpbin.org/cookies'
>>> r = requests.get(url, cookies=jar)
>>> r.text
'{"cookies": {"tasty_cookie": "yum"}}'
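
The path matching shown above can be reproduced offline. This sketch prepares the request instead of sending it and inspects the Cookie header that would go out; it uses the same jar contents as the example above.

```python
import requests

jar = requests.cookies.RequestsCookieJar()
jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')

# Dict-style view of the cookies stored for one domain and path.
print(jar.get_dict(domain='httpbin.org', path='/cookies'))

# Only the cookie whose path matches the URL ends up in the Cookie header.
prepared = requests.Request('GET', 'http://httpbin.org/cookies', cookies=jar).prepare()
print(prepared.headers.get('Cookie'))
```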

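　　Timeouts

You can tell requests to stop waiting for a response after a given number of seconds with the timeout parameter. Nearly all production code should use this parameter; without it, your program may hang indefinitely.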

>>> requests.get('http://github.com', timeout=0.001)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
requests.exceptions.Timeout: HTTPConnectionPool(host='github.com', port=80): Request timed out. (timeout=0.001)

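# Notes
    timeout is not a time limit on the entire response download. Instead, an exception is raised if the server has not issued a response for timeout seconds
    (more precisely, if no bytes have been received on the underlying socket for timeout seconds).
    If no timeout is specified explicitly, requests do not time out.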
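
The timeout argument also accepts a (connect, read) tuple so the two phases can be limited separately. Below is a minimal sketch with assumptions stated up front: the helper name `get_or_none` and the default values are arbitrary, and the `http_get` parameter exists only so the function can be exercised without network access.

```python
import requests

def get_or_none(url, connect_timeout=3.05, read_timeout=10, http_get=requests.get):
    """Fetch `url`, returning None instead of raising when a timeout occurs.

    `http_get` is injectable only for testing; by default it is requests.get.
    """
    try:
        # A (connect, read) tuple sets the two timeouts separately.
        return http_get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.Timeout:
        return None
```

Both ConnectTimeout and ReadTimeout inherit from Timeout, so a single except clause covers either phase.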

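　　Errors and exceptions

In the event of a network problem (such as a DNS failure or a refused connection), Requests raises a ConnectionError exception.

If an HTTP request returns an unsuccessful status code, Response.raise_for_status() raises an HTTPError exception.

If a request times out, a Timeout exception is raised.

If a request exceeds the configured maximum number of redirects, a TooManyRedirects exception is raised.

All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException.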
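
Because the exceptions form a hierarchy, the order of the checks matters. The sketch below maps an exception to a short label; the function name `describe_failure` and the labels are invented for this example.

```python
import requests

def describe_failure(exc):
    """Map a requests exception to a short label.

    Timeout is tested before ConnectionError because ConnectTimeout
    inherits from both.
    """
    if isinstance(exc, requests.exceptions.Timeout):
        return 'timeout'
    if isinstance(exc, requests.exceptions.ConnectionError):
        return 'connection error'
    if isinstance(exc, requests.exceptions.HTTPError):
        return 'bad status'
    if isinstance(exc, requests.exceptions.TooManyRedirects):
        return 'too many redirects'
    if isinstance(exc, requests.exceptions.RequestException):
        return 'other requests error'
    raise TypeError('not a requests exception')
```

In practice it would be called from an except requests.exceptions.RequestException block wrapped around requests.get() and raise_for_status().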

　　Other request methods

requests.get(url, params=None, **kwargs)
requests.post(url, data=None, json=None, **kwargs)
requests.put(url, data=None, **kwargs)
requests.head(url, **kwargs)
requests.delete(url, **kwargs)
requests.patch(url, data=None, **kwargs)
requests.options(url, **kwargs)
 
# All of the methods above are built on top of this one
requests.request(method, url, **kwargs)

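3. HTTP requests and XML examples

Example: checking whether a QQ account is online

import urllib.request
import requests
from xml.etree import ElementTree as ET

# Send the HTTP request with the built-in urllib module and get the XML content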
"""
f = urllib.request.urlopen('http://www.webxml.com.cn//webservices/qqOnlineWebService.asmx/qqCheckOnline?qqCode=424662508')
result = f.read().decode('utf-8')
"""


# Send the HTTP request with the third-party requests module and get the XML content
r = requests.get('http://www.webxml.com.cn//webservices/qqOnlineWebService.asmx/qqCheckOnline?qqCode=424662508')
result = r.text

# Parse the XML content
node = ET.XML(result)

# Read the result
if node.text == "Y":
    print("online")
else:
    print("offline")
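
The parsing step can be tried without calling the service. The XML string below is an assumed example of what such an endpoint might return (a single root element whose text is the status), not a captured response.

```python
from xml.etree import ElementTree as ET

# Assumed sample payload: one root element whose text is the status flag.
sample = '<string xmlns="http://WebXml.com.cn/">Y</string>'

node = ET.XML(sample)
status = (node.text or '').strip()  # guard against an empty element

print(node.tag)
print('online' if status == 'Y' else 'offline')
```

Note that when the payload declares a default namespace, ElementTree includes it in the tag name, which matters when searching for child elements by name.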

Example: looking up train stop information

import urllib.request
import requests
from xml.etree import ElementTree as ET

# Send the HTTP request with the built-in urllib module and get the XML content
"""
f = urllib.request.urlopen('http://www.webxml.com.cn/WebServices/TrainTimeWebService.asmx/getDetailInfoByTrainCode?TrainCode=G666&UserID=')
result = f.read().decode('utf-8')
"""

# Send the HTTP request with the third-party requests module and get the XML content
r = requests.get('http://www.webxml.com.cn/WebServices/TrainTimeWebService.asmx/getDetailInfoByTrainCode?TrainCode=G666&UserID=')
result = r.text

# Parse the XML content
root = ET.XML(result)
# Print the station name, departure time, tag and attributes of each stop
for node in root.iter('TrainDetailInfo'):
    print(node.find('TrainStation').text, node.find('StartTime').text, node.tag, node.attrib)
