Environment
Install the dependencies
pip install flask-cors flask selenium
Install chromedriver
See: Installing selenium + phantomjs + chromedriver on macOS
Implementation
1. hook.js
Listens for XMLHttpRequest requests
// Open the links below and paste their code here
// https://unpkg.com/[email protected]/dist/ajaxhook.min.js
// https://unpkg.com/axios/dist/axios.min.js
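ah.proxy({
  // Called after a request completes successfully
  onResponse: (response, handler) => {
    if (response.config.url.startsWith('/api/movie')) {
      // Forward the page URL and the response body to the local receiver
      axios.post('http://localhost:5000/receiver/movie', {
        url: window.location.href,
        data: response.response
      })
      console.log(response.response)
    }
    // Always pass the response on, so that other requests are not left pending
    handler.next(response)
  }
})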
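The hook above posts an object with two fields, url and data, to the receiver. As a rough sketch of how the receiving side might unpack it in Python: decode_payload is a hypothetical helper (not part of the original code), and it assumes the intercepted API returns JSON text, so the data field may need a second decoding pass.

```python
import json


def decode_payload(raw_body):
    """Unpack the JSON body posted by the hook into (page_url, api_data)."""
    payload = json.loads(raw_body)
    api_data = payload.get("data")
    # Assumption: the hook forwards the raw XHR response, which may arrive
    # as JSON text; decode it a second time when that is the case.
    if isinstance(api_data, str):
        try:
            api_data = json.loads(api_data)
        except ValueError:
            pass  # not JSON after all; keep the plain string
    return payload.get("url"), api_data
```

If the response body is not valid JSON, the helper simply returns it unchanged as a string.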
2. main.py
Drives Chrome
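# -*- coding: utf-8 -*-
import time

from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get('https://dynamic2.scrape.center/')
# Inject the hook script into the loaded page
with open('hook.js', encoding='utf-8') as f:
    browser.execute_script(f.read())
time.sleep(2)

for index in range(3):
    print('current page', index)
    # find_element_by_css_selector was removed in Selenium 4.3
    btn_next = browser.find_element(By.CSS_SELECTOR, '.btn-next')
    btn_next.click()
    time.sleep(2)

# quit() closes every window and ends the driver session
browser.quit()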
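The script above pauses with fixed time.sleep(2) calls, which can be too short on a slow connection and wasteful on a fast one. One possible alternative is to poll for a condition instead. The wait_until helper below is a hypothetical, library-independent sketch, not part of the original script; Selenium also ships its own WebDriverWait in selenium.webdriver.support.ui, which may be preferable in practice.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.2):
    """Poll predicate() until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f seconds" % timeout)
        time.sleep(interval)
```

In main.py this could be used as wait_until(lambda: browser.find_elements(By.CSS_SELECTOR, '.btn-next')), which returns as soon as the button is present rather than after a fixed delay.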
3. server.py
A service that receives the data; from here the data can also be stored in a database
# -*- coding: utf-8 -*-
import json

from flask import Flask, request, jsonify
from flask_cors import CORS

app = Flask(__name__)
CORS(app)  # allow cross-origin POSTs from the scraped page


@app.route('/receiver/movie', methods=['POST'])
def receive():
    content = json.loads(request.data)
    print(content)
    # do something with the data, e.g. store it
    return jsonify({'status': True})


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=True)
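The note for server.py mentions that the received data can be stored in a database. A minimal sketch of that step using SQLite from the standard library follows. The table name captured and the helpers open_store and save_capture are assumptions made for illustration, not part of the original code; the payload shape {'url': ..., 'data': ...} matches what hook.js posts.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a SQLite database with a table for captured responses."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS captured ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "page_url TEXT NOT NULL, "
        "body TEXT NOT NULL)"
    )
    return conn


def save_capture(conn, content):
    """Store one payload of the form {'url': ..., 'data': ...}; return its row id."""
    body = content["data"]
    if not isinstance(body, str):
        # Serialize already-decoded data back to JSON text for storage
        body = json.dumps(body, ensure_ascii=False)
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO captured (page_url, body) VALUES (?, ?)",
            (content["url"], body),
        )
    return cursor.lastrowid
```

Inside receive(), one could call save_capture(conn, content) after parsing the request. Since a sqlite3 connection is by default bound to the thread that created it, opening a connection per request with a file path (rather than ":memory:") is the safer choice under Flask.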
References
How to process and store Ajax data in real time with Hook