Learning Web Scraping as a Beginner: Fetching Data with Python

2023-08-07 05:24:47

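I'm a beginner, so without further ado, here is the code; take it if you need it.
The script does not run automatically every day yet; I'll look into that later. By default it appends to water_level.csv in the current working directory (the script's own directory if you run it from there), and it keeps only a few useful fields.
Next I want to look at the data at http://www.cjh.com.cn/sqindex.html. From a quick look at the page, the data is stored in JavaScript, and I haven't yet worked out how to scrape data out of JS.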
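
For data that sits inside JavaScript, one common approach is to download the page or .js file as text, cut out the array literal with a regular expression, and hand it to `json.loads`. The sketch below only illustrates that idea: the sample string, the variable name `waterData`, and the helper `extract_js_array` are hypothetical and are not taken from the site above, whose real structure I have not checked. It also assumes the literal is valid JSON (double-quoted keys and strings).

```python
import json
import re

# Hypothetical snippet standing in for downloaded page or .js text.
# The variable name and the values are made up for illustration only.
SAMPLE_JS = '''
var waterData = [{"stnm": "Station A", "z": "170.25"}, {"stnm": "Station B", "z": "65.10"}];
'''


def extract_js_array(source, var_name):
    """Return the JSON array assigned to `var_name` in `source`, or None if not found."""
    # Non-greedy match from the '[' after the assignment up to the first '];'.
    pattern = r'var\s+' + re.escape(var_name) + r'\s*=\s*(\[.*?\])\s*;'
    match = re.search(pattern, source, re.DOTALL)
    if match is None:
        return None
    return json.loads(match.group(1))


rows = extract_js_array(SAMPLE_JS, 'waterData')
for row in rows:
    print(row['stnm'], row['z'])
```

If the real page builds its data at runtime instead of embedding a literal, this will not work, and checking the browser's network tab for the underlying request is usually the better starting point.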

# -*- encoding: utf-8 -*-
'''
@File    :   cq_water_level.py
@Time    :   2022/12/03 10:31:18
@Author  :   erqie
@Version :   1.0
@Contact :   [email protected]
@Function:   Fetch the daily water-level data
'''
__author__ = 'erqie'


import requests
import json

url = 'http://cqsw.slj.cq.gov.cn/hydrologyapi/stRiverR/dayWaterNotice'
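# Change the User-Agent to match your own browser so the request is not rejected.
# I have not tested whether the server refuses requests that omit this header.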
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.139 Safari/537.36'
}
resp = requests.post(url, headers=headers, timeout=10)
resp.raise_for_status()  # fail early on an HTTP error instead of parsing an error page
# print(resp.text)
data_list = json.loads(resp.text)["data"]
# print(data_list)
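def clean(value):
    # Remove all whitespace; a missing field becomes '' instead of raising AttributeError.
    return "" if value is None else "".join(str(value).split())


# Append mode: every run adds its rows to the end of water_level.csv.
with open('water_level.csv', 'a', encoding='utf-8') as f:
    for s in data_list:
        # print(f'{s.get("stcd")}----------{s.get("stnm")}--------{s.get("tm")}--------{s.get("z")}')
        # Columns written: stcd, addvnm, stnm, tm, z
        f.write(
            f'{clean(s.get("stcd"))},{clean(s.get("addvnm"))},{clean(s.get("stnm"))},{clean(s.get("tm"))},{s.get("z")}\n'
        )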
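
Because the script opens water_level.csv in append mode, running it more than once for the same day (for example from a scheduler) would write the same readings again. Below is a minimal sketch of one way to guard against that, assuming the same column order as the script above (stcd, addvnm, stnm, tm, z) and treating the pair (stcd, tm) as the identity of a reading. The helper name `append_new_rows` is hypothetical and is not part of the original script.

```python
import csv
import os


def append_new_rows(path, rows):
    """Append rows to the CSV at `path`, skipping (stcd, tm) pairs already stored.

    Each row is a list in the order: stcd, addvnm, stnm, tm, z.
    Returns the number of rows actually written.
    """
    seen = set()
    if os.path.exists(path):
        with open(path, newline='', encoding='utf-8') as f:
            for record in csv.reader(f):
                if len(record) >= 4:
                    seen.add((record[0], record[3]))

    added = 0
    with open(path, 'a', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        for row in rows:
            key = (str(row[0]), str(row[3]))
            if key in seen:
                continue  # this station/time pair is already in the file
            writer.writerow(row)
            seen.add(key)
            added += 1
    return added
```

With something like this in place, the daily run itself could be left to the operating system, for example cron on Linux or Task Scheduler on Windows, rather than to a loop inside the script.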
