Batch-scraping Duitang images with XPath

2023-07-20 17:45:10

import requests
import os
from lxml import etree

kw = input("Enter a search keyword: ")
url = "https://www.duitang.com/search/?kw={}&type=feed".format(kw)
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'
}

# Resolve the output folder: a directory named after the keyword, next to this script
base_path = os.path.dirname(os.path.abspath(__file__))
path = os.path.join(base_path, kw)
os.makedirs(path, exist_ok=True)
# print(path)

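# Fetch the search results page and parse it into an element tree
html = requests.get(url, headers=headers)
xhtml = etree.HTML(html.text)
# Image URLs and their root ids, read from each <img> inside <a class="a">
src = xhtml.xpath('//a[@class="a"]/img/@src')
title = xhtml.xpath('//a[@class="a"]/img/@data-rootid')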

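# print(title)
# Download each image into the keyword folder as <keyword><rootid>.jpg
for i in range(len(src)):
    img_src = requests.get(url=src[i], headers=headers).content
    # Join with os.path.join; the original path + './...' pointed at a non-existent "<keyword>." directory
    pic_path = os.path.join(path, '{}{}.jpg'.format(kw, title[i]))
    with open(pic_path, 'wb') as f:
        f.write(img_src)
    print("<<==== Saving image {}, {} remaining ====>>".format(i + 1, len(src) - i - 1))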
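To see what the two XPath expressions in the script select, here is a minimal sketch that runs offline. It swaps in the standard library's xml.etree.ElementTree instead of lxml, and the markup in SAMPLE is made up to match the selectors, not copied from Duitang. ElementTree supports only a subset of XPath and has no /@src step, so the attributes are read with .get() after the elements are matched. The name extract_images is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup modelled on the selectors used in the script.
SAMPLE = """
<div>
  <a class="a" href="/blog/?id=1"><img src="https://example.com/1.jpg" data-rootid="1001"/></a>
  <a class="b" href="/other"><img src="https://example.com/skip.jpg" data-rootid="9999"/></a>
  <a class="a" href="/blog/?id=2"><img src="https://example.com/2.jpg" data-rootid="1002"/></a>
</div>
"""


def extract_images(markup):
    """Return (src, data-rootid) pairs for every <img> directly inside <a class="a">."""
    root = ET.fromstring(markup)
    # Same element path as //a[@class="a"]/img in the script.
    imgs = root.findall(".//a[@class='a']/img")
    return [(img.get("src"), img.get("data-rootid")) for img in imgs]


for src_url, root_id in extract_images(SAMPLE):
    print(root_id, src_url)
```

Only the two links with class "a" are matched; the link with class "b" is skipped, which is why the script's src and title lists stay the same length and line up by index.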

1. Enter a search keyword

2. Create a folder named after the keyword

3. Save the images in bulk to that folder
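The three steps above can be sketched as small helpers that do no network access, which makes the URL and path logic easy to check on its own. The function names (build_search_url, ensure_folder, image_path) are hypothetical and not part of the original script. Using urlencode for the query string is an assumption about how the keyword should be escaped; the script itself formats the keyword straight into the URL.

```python
import os
import tempfile
from urllib.parse import urlencode

SEARCH_URL = "https://www.duitang.com/search/"  # base URL taken from the script


def build_search_url(keyword):
    """Step 1: turn the keyword into a search URL, percent-encoding it."""
    return SEARCH_URL + "?" + urlencode({"kw": keyword, "type": "feed"})


def ensure_folder(base_dir, keyword):
    """Step 2: create (if needed) and return the folder named after the keyword."""
    folder = os.path.join(base_dir, keyword)
    os.makedirs(folder, exist_ok=True)
    return folder


def image_path(folder, keyword, root_id):
    """Step 3: build the file path an image is saved to."""
    return os.path.join(folder, "{}{}.jpg".format(keyword, root_id))


# Usage with a temporary directory standing in for the script's own folder.
with tempfile.TemporaryDirectory() as base:
    folder = ensure_folder(base, "cat")
    print(build_search_url("cat"))
    print(image_path(folder, "cat", "1001"))
```

Keeping the path logic in one function means the file name format only has to be right in one place.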

The result looks like this:

[Screenshots: Batch-scraping Duitang images with XPath]
