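Contents
I. Installing Python 3.6.8 offline
II. Downloading the dependency modules for offline use
III. Installing the crawler modules offline
IV. Downloading and installing the browser drivers
V. Verifying versions and dependencies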
Python download page 1: https://www.python.org/downloads/
Python download page 2: https://www.python.org/ftp/python/3.6.8/
Windows installer: python-3.6.8-amd64.exe
Windows portable (embeddable) build: python-3.6.8-embed-amd64.zip
Source archive (to compile yourself): Python-3.6.8.tgz

Python 3.6 dependency module search: https://pypi.org/search/?c=Programming+Language+%3A%3A+Python+%3A%3A+3.6
Python extension package mirror: https://www.lfd.uci.edu/~gohlke/pythonlibs/
Selenium documentation (Chinese): https://python-selenium-zh.readthedocs.io/zh_CN/latest/
Python crawler dependency modules

| Function | Module | Official URL | Package file |
| --- | --- | --- | --- |
| pip dependency | setuptools | https://pypi.org/project/setuptools/ | setuptools-51.0.0-py3-none-any.whl |
| Package installer | pip | https://pypi.org/project/pip/ | pip-20.3.3-py2.py3-none-any.whl |
| requests dependency | certifi | https://pypi.org/project/certifi/ | certifi-2020.12.5-py2.py3-none-any.whl |
| requests dependency | chardet | https://pypi.org/project/chardet/ | chardet-4.0.0-py2.py3-none-any.whl |
| requests dependency | idna | https://pypi.org/project/idna/ | idna-2.10-py2.py3-none-any.whl |
| requests dependency | urllib3 | https://pypi.org/project/urllib3/ | urllib3-1.26.2-py2.py3-none-any.whl |
| HTTP library | requests | https://pypi.org/project/requests/ | requests-2.25.1-py2.py3-none-any.whl |
| XML parsing library | lxml | https://pypi.org/project/lxml/ | lxml-4.6.2-cp36-cp36m-win_amd64.whl |
| Browser automation framework | selenium | https://pypi.org/project/selenium/ | selenium-3.141.0-py2.py3-none-any.whl |
| Text recognition (OCR) library | pytesseract | https://pypi.org/project/pytesseract/ | pytesseract-0.3.7.tar.gz |
| tesserocr dependency | tesseract | https://pypi.org/project/tesseract/ | tesseract-0.1.3.tar.gz |
| Image recognition library | tesserocr | https://pypi.org/project/tesserocr/ | tesserocr-2.5.1.tar.gz |
| Image recognition library | tesserocr (Windows build) | https://github.com/simonflueckiger/tesserocr-windows_build/releases | tesserocr-2.4.0-cp36-cp36m-win_amd64.whl |
| Text recognition engine | tesseract-ocr | https://digi.bib.uni-mannheim.de/tesseract/ | tesseract-ocr-w64-setup-v4.0.0.20181030.exe |
| Matrix and array computation library | numpy | https://pypi.org/project/numpy/ | numpy-1.19.4-cp36-cp36m-win_amd64.whl |
| Computer vision library | opencv-python | https://pypi.org/project/opencv-python/ | opencv_python-4.4.0.46-cp36-cp36m-win_amd64.whl |
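
The wheel file names listed above encode which interpreter and platform each file targets, which matters when choosing files for a machine with no network access. The sketch below splits a name into its tags following the wheel naming convention `name-version[-build]-python-abi-platform.whl`. The names `parse_wheel_name` and `fits_cp36_win64` are illustrative helpers, not part of pip, and the compatibility check is a rough heuristic rather than pip's real resolution logic.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelTags:
    """Split a wheel file name into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are the python, abi and platform tags.
    head, python_tag, abi_tag, platform_tag = stem.rsplit("-", 3)
    # The head is name-version, optionally followed by a build tag.
    parts = head.split("-")
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)


def fits_cp36_win64(filename: str) -> bool:
    """Heuristic check for CPython 3.6 on 64-bit Windows (an assumption, not pip's logic)."""
    tags = parse_wheel_name(filename)
    python_ok = any(t in ("cp36", "py36", "py3") for t in tags.python_tag.split("."))
    abi_ok = any(a in ("cp36m", "none") for a in tags.abi_tag.split("."))
    platform_ok = any(p in ("win_amd64", "any") for p in tags.platform_tag.split("."))
    return python_ok and abi_ok and platform_ok
```

Pure-Python wheels tagged `py2.py3-none-any` pass the check on any platform, while compiled wheels such as lxml or numpy must carry the `cp36-cp36m-win_amd64` tags to match a 64-bit Python 3.6 install.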
1. Offline installation of .whl packages
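
A minimal sketch of the offline wheel install, assuming all downloaded `.whl` files sit in one local folder (the path `D:\offline_whl` is hypothetical). It builds a `pip install` command with `--no-index` and `--find-links`, so pip looks for packages and their dependencies in that folder instead of contacting PyPI.

```python
import subprocess
import sys
from typing import List


def pip_offline_command(wheel_dir: str, *packages: str) -> List[str]:
    """Build a pip command that installs packages using only files in wheel_dir."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",               # never contact PyPI
        "--find-links", wheel_dir,  # search this folder for package files instead
        *packages,
    ]


def run_offline_install(wheel_dir: str, *packages: str) -> None:
    """Run the install; raises CalledProcessError if pip reports a failure."""
    subprocess.run(pip_offline_command(wheel_dir, *packages), check=True)


if __name__ == "__main__":
    # Hypothetical folder holding the downloaded .whl files.
    print(" ".join(pip_offline_command(r"D:\offline_whl", "requests", "lxml", "selenium")))
```

A single wheel can also be installed by passing its file path to `pip install` directly; the folder-based form is convenient because dependencies such as certifi, chardet, idna and urllib3 are picked up automatically when requests is installed.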
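2. Offline installation of .tar.gz packages
After extracting the archive, cd into the extracted directory and run the package's setup script, typically python setup.py install.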
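
The extract-and-install step can be sketched in Python as follows. This is a sketch under assumptions: the archive name in the usage comment is only an example, and the code assumes the archive unpacks into a single top-level folder containing `setup.py`, which is the usual layout for source archives of this kind.

```python
import os
import subprocess
import sys
import tarfile


def extract_sdist(archive_path: str, dest_dir: str) -> str:
    """Extract a .tar.gz source archive and return its top-level folder.

    Only extract archives from a source you trust.
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        # The first path component of the first member is the top-level folder.
        top_level = archive.getnames()[0].split("/")[0]
        archive.extractall(dest_dir)
    return os.path.join(dest_dir, top_level)


def install_sdist(source_dir: str) -> None:
    """Run `python setup.py install` inside the extracted folder."""
    subprocess.run([sys.executable, "setup.py", "install"], cwd=source_dir, check=True)


# Example usage (not executed here):
#   source_dir = extract_sdist("pytesseract-0.3.7.tar.gz", ".")
#   install_sdist(source_dir)
```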
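3. Installing tesseract-ocr
Python tesserocr installation tutorial: https://jingyan.baidu.com/article/6b18230972e3e6fb59e15909.html
(1) During installation, select the option to download the additional language data.
(2) Add the Tesseract-OCR install directory to the system environment variables (PATH).
(3) After installation, copy the tessdata folder from the Tesseract-OCR root directory into the Python root directory; otherwise an error is raised.
(4) Set the tesseract_cmd variable to the path of the installed tesseract.exe.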
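
Step (4) refers to the `tesseract_cmd` attribute in the `pytesseract.pytesseract` module. A minimal sketch of locating the executable and assigning it follows. The candidate install paths are assumptions about a default Windows setup, and `find_tesseract` is an illustrative helper, not part of pytesseract.

```python
import os
import shutil
from typing import Iterable, Optional

# Assumed default install locations on Windows; adjust to the actual install folder.
DEFAULT_CANDIDATES = (
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
)


def find_tesseract(candidates: Iterable[str] = DEFAULT_CANDIDATES) -> Optional[str]:
    """Return the first existing candidate path, else whatever is found on PATH."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return shutil.which("tesseract")


def configure_pytesseract(executable: str) -> None:
    """Point pytesseract at the given tesseract executable."""
    import pytesseract  # imported here so find_tesseract works without it installed
    pytesseract.pytesseract.tesseract_cmd = executable


# Example usage (not executed here):
#   exe = find_tesseract()
#   if exe is not None:
#       configure_pytesseract(exe)
```

If step (2) was completed and the executable is already on PATH, the `shutil.which` fallback finds it without any hard-coded path.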
Selenium WebDriver downloads

| Browser | Check version | Mirror / download URL | Driver file |
| --- | --- | --- | --- |
| Google Chrome | chrome://version/ | http://chromedriver.storage.googleapis.com/index.html http://npm.taobao.org/mirrors/chromedriver | chromedriver_win32.zip |
| Firefox | about:support | https://npm.taobao.org/mirrors/geckodriver https://github.com/mozilla/geckodriver/releases | geckodriver-v0.26.0-win64.zip |
| Microsoft Edge | edge://version/ | https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/ | edgedriver_win64.zip |
| Opera | | https://github.com/operasoftware/operachromiumdriver/releases | operadriver_win64.zip |
| Internet Explorer | Settings - About IE | http://selenium-release.storage.googleapis.com/index.html | IEDriverServer_x64_3.9.0.zip |
| PhantomJS | | https://phantomjs.org/download.html https://bitbucket.org/ariya/phantomjs/downloads | phantomjs-2.1.1-windows.zip |
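
For the verification step named in the contents, a small script along these lines can report the interpreter version, which modules import cleanly, and whether the driver executables are reachable on PATH. The module list mirrors the dependency packages above; the driver executable names are assumed defaults for what the archives contain, and the helper names are illustrative.

```python
import importlib
import shutil
import sys
from typing import Dict, Iterable, Optional

# Import names for the installed packages (opencv-python is imported as cv2).
MODULES = ("requests", "lxml", "selenium", "pytesseract", "tesserocr", "numpy", "cv2")
# Assumed executable names inside the driver archives.
DRIVERS = ("chromedriver", "geckodriver", "msedgedriver", "operadriver",
           "IEDriverServer", "phantomjs")


def module_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each module name to its version string, or None if the import fails."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:  # a broken native extension can raise more than ImportError
            found[name] = None
        else:
            found[name] = str(getattr(module, "__version__", "unknown"))
    return found


def drivers_on_path(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each driver name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def report() -> None:
    """Print the interpreter version, module versions and driver locations."""
    print("Python", sys.version.split()[0])
    for name, version in module_versions(MODULES).items():
        print("{:<12} {}".format(name, version or "NOT INSTALLED"))
    for name, path in drivers_on_path(DRIVERS).items():
        print("{:<15} {}".format(name, path or "not on PATH"))


if __name__ == "__main__":
    report()
```

Only the drivers for browsers actually in use need to be present, so a "not on PATH" line for the others is expected.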