Make Your Browser Download the Data You Want Automatically: A Handy Trick Programmers Know

2021-11-09 16:14:13

Free online video tutorials with worked examples on Python web scraping, data analysis, web development, and more

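Preface

E-commerce platforms now expose a large amount of product data, and collecting that data can be a real advantage in price competition.

Today we will collect data from JD.com, one such e-commerce platform.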

Environment:

Python 3.6

PyCharm

selenium

csv

time

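First, set up the development environment.

Start by finding the version of your Google Chrome browser.

Copy the address, then paste it into the address bar of any folder window and open it.

You will then see your Chrome version.

Search Baidu for the Chrome browser driver; it is the first result.

Download the driver version that matches your browser version, or the closest one available.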
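To make "the same or close" a little more concrete, here is a minimal sketch that compares two version strings. It assumes that "close enough" means the browser and the driver share the same major version number; the helper names and the version strings are hypothetical and only serve as illustration.

```python
def major_version(version: str) -> int:
    """Return the leading (major) number of a dotted version string."""
    return int(version.strip().split(".")[0])


def versions_match(browser_version: str, driver_version: str) -> bool:
    """True when the browser and the driver share the same major version."""
    return major_version(browser_version) == major_version(driver_version)


# Hypothetical version strings, for illustration only.
print(versions_match("95.0.4638.69", "95.0.4638.54"))
print(versions_match("95.0.4638.69", "94.0.4606.61"))
```

If the check fails, download a different driver build before going further.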

Now we can start writing code.

Install the selenium module

pip install selenium

Then import the module and create a browser object

# Browser automation
from selenium import webdriver

driver = webdriver.Chrome()
driver.get('https://www.jd.com/')

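Run the code and the browser automatically opens the URL you passed in.

Since we can open a page automatically, we might as well make the product search fully automatic too.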

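from selenium.webdriver.common.by import By  # locator strategies for find_element

def get_product(key):
    """Search for a product."""
    # Type the keyword into the search box, then click the search button
    driver.find_element(By.CSS_SELECTOR, '#key').send_keys(key)
    driver.find_element(By.CSS_SELECTOR, '#search > div > div.form > button').click()

keyword = input('Enter a product search keyword: ')
get_product(keyword)  # run the search with the keyword entered above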
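The environment list mentions time, but none of the snippets use it. One plausible use, sketched here as an assumption rather than the author's method, is to scroll the results page in steps and pause after each one so that more items have time to load before parsing. The helper name scroll_to_bottom and its parameters are hypothetical; the only Selenium call it relies on is execute_script.

```python
import time


def scroll_to_bottom(driver, steps=5, pause=1.0):
    """Scroll the page down in equal steps, pausing after each one.

    `driver` is any object that offers Selenium's execute_script method.
    """
    for step in range(1, steps + 1):
        # Scroll to step/steps of the full page height.
        driver.execute_script(
            "window.scrollTo(0, document.body.scrollHeight * arguments[0] / arguments[1]);",
            step,
            steps,
        )
        time.sleep(pause)  # give the page time to load more content
```

A call such as scroll_to_bottom(driver) would go after get_product(keyword) and before the page is parsed. Whether the results page actually loads items on scroll is an assumption to verify in the browser.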

Parse the product data on the search results page

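from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

def parse_data():
    """Parse the data on the results page."""
    lis = driver.find_elements(By.CSS_SELECTOR, '.gl-item')  # all product <li> tags

    for li in lis:
        try:
            name = li.find_element(By.CSS_SELECTOR, 'div.p-name a em').text  # product name
            # Strip the store tag (Simplified or Traditional), double quotes and line breaks
            name = name.replace('京东超市', '').replace('京東超市', '').replace('"', '').replace('\n', '')
            price = li.find_element(By.CSS_SELECTOR, 'div.p-price strong i').text + '元'  # product price
            deal = li.find_element(By.CSS_SELECTOR, 'div.p-commit strong a').text  # number of reviews
            title = li.find_element(By.CSS_SELECTOR, 'span.J_im_icon a').get_attribute('title')  # shop name
            print(name, price, deal, title, sep=' | ')
        except NoSuchElementException:
            # Skip items that lack one of the expected elements
            continue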
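The field clean-up in the parsing step can also be pulled into small functions that are easy to check without a browser. This is only a sketch: the names clean_name and format_price are hypothetical, and the list of unwanted strings assumes the store tag may appear in either Simplified or Traditional characters.

```python
# Strings removed from product names (assumed; adjust to what the page shows).
UNWANTED = ("京东超市", "京東超市", '"', "\n")


def clean_name(raw: str) -> str:
    """Strip the store tag, double quotes and line breaks from a product name."""
    for unwanted in UNWANTED:
        raw = raw.replace(unwanted, "")
    return raw.strip()


def format_price(raw: str) -> str:
    """Append the currency unit used in the printed output."""
    return raw.strip() + "元"


# Hypothetical raw values, for illustration only.
print(clean_name('京东超市"Milk"\n1L'), format_price(" 12.90 "), sep=" | ")
```

Keeping the clean-up separate from the Selenium calls means the string handling can be tested on its own.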

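The last step is to save the data

import csv  # built-in module for saving the data

# Place this inside the for loop in parse_data(), after the fields are extracted,
# so that one row is appended per product.
with open('京東資料.csv', mode='a', encoding='utf-8', newline='') as f:
    csv_write = csv.writer(f)
    csv_write.writerow([name, price, deal, title])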
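Because the file is opened in append mode, it never gets a header row. A minimal sketch of one way to add one, assuming hypothetical column names and a hypothetical helper called append_row, is to write the header only when the file is new or empty:

```python
import csv
import os

HEADER = ["name", "price", "deal", "title"]  # hypothetical column names


def append_row(path, row):
    """Append one row, writing the header first if the file is new or empty."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, mode="a", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(HEADER)
        writer.writerow(row)
```

Inside the parsing loop, a call such as append_row('京東資料.csv', [name, price, deal, title]) could then stand in for the with block.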

Run the code to see the result: each product is printed to the console and appended to the CSV file.
