Python async crawler: batch-downloading beauty photos

2023-08-07 03:22:16

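This is a practice project for an asynchronous crawler. If you spot anything that could be improved, please point it out. Thanks 🙏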

'''
Asynchronously scrape the beauty portrait photos on umei.cc
'''
import asyncio  # asynchronous I/O module
import aiohttp  # asynchronous HTTP client module
from lxml import etree
import time

# Number of category pages to crawl
END_PAG = 20

# URL template for the beauty-portrait category pages
URL = "https://www.umei.cc/meinvtupian/meinvxiezhen/{}.htm"


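# Main function
async def main():

    tasks = []

    # One task per category page
    for page in range(1, END_PAG + 1):

        # Wrap each coroutine in a task and collect it in the tasks list
        tasks.append(asyncio.create_task(get_html(URL.format(page), page)))

    # Wait until every page task has finished
    await asyncio.wait(tasks)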

# Fetch the HTML of one category page
async def get_html(url, page):

    t1 = time.time()
    print(f'Start crawling page [{page}]!')

    # Create a session object, similar to a requests session
    async with aiohttp.ClientSession() as session:
        # Send the request
        async with session.get(url) as rsp:
            # Read the page body
            html = await rsp.text()

            # Parse the page
            html = etree.HTML(html)
            await downloads_img(html, session)

    t2 = time.time()

    print(f'Page [{page}] finished! Elapsed: {t2 - t1:.2f}s')

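# Extract the image URLs from a category page and download them
async def downloads_img(html, session):
    # Match every img link on the page
    img_lists = html.xpath('//div[@class="TypeList"]//li//img/@src')

    # Loop over the image links and download them one by one
    for i in img_lists:
        # Build the local path from the last URL segment
        # (the ./meitu/ folder must already exist)
        file_name = './meitu/' + i.rsplit('/', 1)[1]

        # Request the image
        async with session.get(i) as rsp:

            # Save the image
            with open(file_name, 'wb') as f:
                f.write(await rsp.content.read())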


if __name__ == '__main__':

    # Start the event loop
    asyncio.run(main())
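
One possible improvement: the loop in downloads_img awaits each image in turn, so the images within a page are fetched sequentially. Below is a minimal sketch of how they could be fetched concurrently with asyncio.gather, capped by a semaphore. It is only an illustration under stated assumptions: fake_download, download_all, demo and MAX_CONCURRENT are hypothetical names that are not part of the script above, the example.com URLs are placeholders, and the network request is replaced by asyncio.sleep so the sketch runs offline.

```python
import asyncio

# Hypothetical cap on simultaneous downloads (not in the original script)
MAX_CONCURRENT = 5


async def fake_download(url, semaphore):
    """Stand-in for the real aiohttp request; sleeps instead of using the network."""
    async with semaphore:
        # At most MAX_CONCURRENT coroutines are inside this block at once
        await asyncio.sleep(0.01)
        # Same file-name rule as the script: the text after the last '/'
        return './meitu/' + url.rsplit('/', 1)[1]


async def download_all(urls):
    """Schedule every download together and wait for all of them."""
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    # gather returns the results in the same order as the input URLs
    return await asyncio.gather(*(fake_download(u, semaphore) for u in urls))


def demo():
    # Placeholder URLs, used only for this illustration
    urls = [f'https://example.com/img/{n}.jpg' for n in range(10)]
    return asyncio.run(download_all(urls))


if __name__ == '__main__':
    print(demo())
```

In the real script, the body of fake_download would be the session.get call and the file write from downloads_img. The semaphore keeps the crawler from opening too many connections to the site at once.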
