Notes on Using Python BeautifulSoup

2023-04-17 17:51:38


Note: I have recently been using BeautifulSoup to parse HTML, so here are a few notes.

I. Some BeautifulSoup usage rules

1. Import the BeautifulSoup module

from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3; with Beautiful Soup 4, use: from bs4 import BeautifulSoup

2. Create a BeautifulSoup object

soup = BeautifulSoup(html)  # html: the HTML string to parse
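Steps 1 and 2 can be sketched as follows. This is a minimal sketch, assuming the newer bs4 package (Beautiful Soup 4) is installed rather than the BeautifulSoup 3 module; the HTML snippet and the variable name html are made up for illustration.

```python
from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4 is installed

# Hypothetical HTML snippet, used only for illustration
html = "<html><body><div id='newscontent'><em>one</em></div></body></html>"

# Naming the parser explicitly keeps the result independent of
# whichever optional parsers happen to be installed
soup = BeautifulSoup(html, "html.parser")

print(soup.div["id"])   # attribute of the first <div>
print(soup.em.string)   # text inside the first <em>
```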

3. Get an object by ID

soup.find(id='newscontent')  # returns the first object matching id='newscontent'

soup.findAll(id='newscontent')  # returns a list of all Tags with id='newscontent'
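The difference between the two calls can be illustrated with a small sketch. It assumes Beautiful Soup 4 (bs4), where find_all is the current spelling of findAll; the HTML and the id "no-such-id" are made up.

```python
from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4 is installed

# Hypothetical HTML, used only for illustration
html = """
<div id="newscontent"><p>first</p></div>
<div id="other"><p>second</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

first = soup.find(id="newscontent")        # a single Tag, or None if nothing matches
matches = soup.find_all(id="newscontent")  # a list-like collection of matching Tags
missing = soup.find(id="no-such-id")       # no match, so this is None

print(first.p.string)
print(len(matches))
print(missing)
```

Because find() returns None when nothing matches, it may be worth checking the result before accessing attributes on it.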

4. Get an object by its class attribute

soup.find(attrs={'class':'pagelink'})  # returns the first object matching class='pagelink'
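A short sketch of class-based lookup, assuming Beautiful Soup 4 (bs4). The class_ keyword is the bs4 alternative to the attrs dictionary; the HTML and the href values are made up for illustration.

```python
from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4 is installed

# Hypothetical pagination links, used only for illustration
html = (
    '<a class="pagelink" href="/page/2">2</a>'
    '<a class="pagelink next" href="/page/3">3</a>'
)
soup = BeautifulSoup(html, "html.parser")

by_attrs = soup.find(attrs={"class": "pagelink"})  # same form as in the note above
by_keyword = soup.find(class_="pagelink")          # class_ because class is a Python keyword
all_links = soup.find_all(class_="pagelink")       # also matches class="pagelink next"

print(by_attrs["href"], by_keyword["href"])
print([a["href"] for a in all_links])
```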

5. Get an object by tag name

soup.find('em')   # the first <em> tag, e.g. <em>one</em>

soup.findAll('em')  # all <em> tags

soup.findAll('em')[0]  # the first of all the <em> tags

# all <dd> tags that appear after the first <em> tag in the document
soup.findAll('em')[0].findAllNext('dd')
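The tag lookups above can be tried on a small document. This sketch assumes Beautiful Soup 4 (bs4), where find_all and find_all_next are the current spellings of findAll and findAllNext; the HTML is made up. It also shows that find_all_next is not limited to siblings: it collects every later match in document order.

```python
from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4 is installed

# Hypothetical HTML, used only for illustration
html = (
    "<dl><dt><em>one</em></dt><dd>a</dd><dd>b</dd></dl>"
    "<p><em>two</em></p>"
    "<dl><dd>c</dd></dl>"
)
soup = BeautifulSoup(html, "html.parser")

ems = soup.find_all("em")                # every <em> tag in the document
following = ems[0].find_all_next("dd")   # every <dd> after the first <em>
dd_texts = [dd.string for dd in following]

print(len(ems))
print(dd_texts)
```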

If a tag has only one child node and that child is a string, the child can be accessed as tag.string, which is equivalent to tag.contents[0].

soup.find('em').string      # <em>one</em> -> one
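The single-child condition matters, as the following sketch suggests. It assumes Beautiful Soup 4 (bs4); the HTML is made up, and get_text() is shown only as one possible way to read text from a tag with several children.

```python
from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4 is installed

# Hypothetical HTML, used only for illustration
html = "<em>one</em><p>two <b>three</b></p>"
soup = BeautifulSoup(html, "html.parser")

em = soup.find("em")
single = em.string           # the only child is a string
same = em.contents[0]        # the same child, reached through the child list

p = soup.find("p")
ambiguous = p.string         # <p> has two children, so .string is None
all_text = p.get_text()      # concatenates the text of all descendants

print(single, same)
print(ambiguous)
print(all_text)
```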

Use the get() method to read the value of a tag's attribute

soup.find('a').get('href')  # <a href='http://cn.bing.com/'> </a> -> 'http://cn.bing.com/'
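A brief sketch of how get() behaves when the attribute is missing, assuming Beautiful Soup 4 (bs4). The title attribute and the fallback string are made up for illustration.

```python
from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4 is installed

html = "<a href='http://cn.bing.com/'> </a>"
soup = BeautifulSoup(html, "html.parser")

a = soup.find("a")
href = a.get("href")                     # value of an existing attribute
no_title = a.get("title")                # missing attribute: None instead of an error
fallback = a.get("title", "(no title)")  # hypothetical default for a missing attribute

print(href)
print(no_title)
print(fallback)
```

Dictionary-style access such as a["href"] also works, but it raises KeyError when the attribute is absent, so get() tends to be the safer choice for optional attributes.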
