middlewares.py
from w3lib.http import basic_auth_header

class CustomProxyMiddleware(object):
    def process_request(self, request, spider):
        request.meta['proxy'] = "https://<PROXY_IP_OR_URL>:<PROXY_PORT>"
        request.headers['Proxy-Authorization'] = basic_auth_header(
            '<PROXY_USERNAME>', '<PROXY_PASSWORD>')
settings.py
DOWNLOADER_MIDDLEWARES = {
    '<PROJECT_NAME>.middlewares.CustomProxyMiddleware': 350,
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 400,
}
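The numbers in the settings decide the call order: Scrapy invokes each downloader middleware's process_request in increasing order of its number, so the custom middleware at 350 runs before the built-in HttpProxyMiddleware at 400. A minimal sketch of that ordering in plain Python, assuming a hypothetical project name `myproject` and ignoring Scrapy's other built-in middlewares:

```python
# Hypothetical project name "myproject"; the priorities mirror the settings above.
DOWNLOADER_MIDDLEWARES = {
    "scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware": 400,
    "myproject.middlewares.CustomProxyMiddleware": 350,
}


def request_order(middlewares):
    """Return middleware paths sorted by ascending priority number."""
    return [path for path, _ in sorted(middlewares.items(), key=lambda kv: kv[1])]


order = request_order(DOWNLOADER_MIDDLEWARES)
for position, path in enumerate(order, start=1):
    print(position, path)
```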
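Problems
1. If the proxy authentication is configured incorrectly, the proxy responds with status code 407:
407 Proxy Authentication Required
At first I configured the proxy in the format below. Some requests went through, but only after one retry, and others failed outright: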
request.meta['proxy'] = "https://<PROXY_USERNAME>:<PROXY_PASSWORD>@<PROXY_IP_OR_URL>:<PROXY_PORT>"
The correct approach is to keep the credentials out of the proxy URL and set them in the Proxy-Authorization request header instead.
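The fix described here can be sketched with the standard library alone: take a proxy URL that embeds credentials, strip them out, and build the header value separately. The helper name `split_proxy_credentials` and the host `proxy.example.com` are hypothetical. The header value follows the HTTP Basic scheme (base64 of `username:password`), which is assumed here to match what `basic_auth_header` produces for plain ASCII credentials.

```python
import base64
from urllib.parse import urlsplit


def split_proxy_credentials(proxy_url):
    """Return (proxy URL without credentials, Proxy-Authorization value or None).

    Percent-encoded credentials are not decoded in this sketch.
    """
    parts = urlsplit(proxy_url)
    host = parts.hostname
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    clean_url = f"{parts.scheme}://{host}"
    if parts.username is None:
        # No credentials embedded in the URL, so there is no header to build.
        return clean_url, None
    token = f"{parts.username}:{parts.password or ''}".encode("latin-1")
    return clean_url, b"Basic " + base64.b64encode(token)


clean_url, auth_header = split_proxy_credentials(
    "https://user:pass@proxy.example.com:8080")
print(clean_url)
print(auth_header)
```

In the middleware, `clean_url` would go into request.meta['proxy'] and `auth_header` into request.headers['Proxy-Authorization'].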
References