Notes on Using Python BeautifulSoup

2023-04-17 17:51:38

Note: I have recently been using BeautifulSoup to parse HTML, so here are a few notes for future reference.

I. Some BeautifulSoup usage rules

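1. Import the BeautifulSoup module

from bs4 import BeautifulSoup  # BeautifulSoup 4; the older BeautifulSoup 3 used: from BeautifulSoup import BeautifulSoup

2. Create a BeautifulSoup object

soup = BeautifulSoup(html, 'html.parser')  # html is the HTML string to parse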
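The two steps above can be put together as a small sketch. It assumes BeautifulSoup 4 (the bs4 package) and Python's built-in 'html.parser'. The html_doc string is made-up sample markup, not something from a real page.

```python
from bs4 import BeautifulSoup  # assumes BeautifulSoup 4 is installed (pip install beautifulsoup4)

# Hypothetical sample markup, invented for illustration only.
html_doc = "<html><body><div id='newscontent'><em>one</em></div></body></html>"

# Naming the parser explicitly keeps the result independent of which
# optional parsers (lxml, html5lib) happen to be installed.
soup = BeautifulSoup(html_doc, "html.parser")

print(type(soup).__name__)  # BeautifulSoup
print(soup.em)              # <em>one</em>
```

Avoid calling the input variable str, since that shadows Python's built-in type of the same name.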

3. Get a specified object by ID

soup.find(id='newscontent')  # returns the first matching Tag with id='newscontent', or None if there is none

soup.findAll(id='newscontent')  # returns a list of all Tags with id='newscontent'
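A short sketch of the difference between the two calls, assuming BeautifulSoup 4, where find_all is the current spelling of findAll. The markup and the id 'no-such-id' are invented for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup.
html_doc = """
<div id="newscontent"><p>first</p></div>
<div id="sidebar"><p>other</p></div>
"""
soup = BeautifulSoup(html_doc, "html.parser")

first = soup.find(id="newscontent")        # a single Tag, or None when nothing matches
matches = soup.find_all(id="newscontent")  # always a list, possibly empty
missing = soup.find(id="no-such-id")       # no element has this id

print(first.name)    # div
print(len(matches))  # 1
print(missing)       # None
```

Because find() can return None, it is worth checking the result before chaining further calls onto it.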

4. Get a specified object by class attribute

soup.find(attrs={'class':'pagelink'})  # returns the first matching Tag with class='pagelink'
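As a sketch, assuming BeautifulSoup 4: the same lookup can also be written with the class_ keyword argument, since class itself is a reserved word in Python. The markup is invented for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the first link carries two class values.
html_doc = (
    '<a class="pagelink next" href="/page/2">2</a>'
    '<a class="other" href="/x">x</a>'
)
soup = BeautifulSoup(html_doc, "html.parser")

by_attrs = soup.find(attrs={"class": "pagelink"})
by_keyword = soup.find(class_="pagelink")  # trailing underscore avoids the reserved word

print(by_attrs is by_keyword)  # True: both calls find the same Tag
print(by_attrs["class"])       # ['pagelink', 'next']
print(by_attrs["href"])        # /page/2
```

Note that searching for 'pagelink' also matches a tag whose class attribute holds several values, as long as one of them is 'pagelink'.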

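5. Get a specified object by tag name

soup.find('em')   # the first <em> tag, e.g. <em>one</em>

soup.findAll('em')  # a list of all <em> tags

soup.findAll('em')[0]  # the first of all the <em> tags

# get every <dd> tag that comes after the first <em> tag in document order (not only its siblings)
soup.findAll('em')[0].findAllNext('dd')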
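The sketch below, assuming BeautifulSoup 4 (where find_all_next is the current spelling of findAllNext), shows that the search continues through the rest of the document. The markup is invented, and find_next_siblings is included only as a contrast for when sibling tags alone are wanted.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with <dd> tags in two separate lists.
html_doc = """
<dl>
  <dt><em>one</em></dt>
  <dd>first dd</dd>
  <dd>second dd</dd>
</dl>
<dl><dd>dd in a later list</dd></dl>
<em>two</em>
"""
soup = BeautifulSoup(html_doc, "html.parser")

ems = soup.find_all("em")

# Every <dd> after the first <em>, anywhere later in the document.
following = [dd.get_text() for dd in ems[0].find_all_next("dd")]

# Only the <dd> tags that are siblings of the <dt> enclosing the first <em>.
siblings_only = [dd.get_text() for dd in ems[0].parent.find_next_siblings("dd")]

print([em.get_text() for em in ems])  # ['one', 'two']
print(following)      # ['first dd', 'second dd', 'dd in a later list']
print(siblings_only)  # ['first dd', 'second dd']
```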

If a tag has exactly one child node and that child is a string, the child can be accessed as tag.string, which is equivalent to tag.contents[0].

soup.find('em').string      #<em>one</em> ->one
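A quick sketch of where this shortcut applies and where it does not, assuming BeautifulSoup 4. When a tag has more than one child, .string is None, and get_text() is one way to collect all the text instead. The markup is invented for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: <em> has one string child, <p> has two children.
soup = BeautifulSoup("<em>one</em><p>two <b>three</b></p>", "html.parser")

em = soup.find("em")
p = soup.find("p")

print(em.string == em.contents[0])  # True
print(len(p.contents))              # 2
print(p.string)                     # None, because <p> has more than one child
print(p.get_text())                 # two three
```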

Use the get() method to read the value of a tag's attribute

soup.find('a').get('href') #<a href='http://cn.bing.com/'> </a> -> 'http://cn.bing.com/'
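As a sketch, assuming BeautifulSoup 4: get() behaves like dict.get(), so a missing attribute gives None (or a default you supply), whereas indexing the tag with square brackets raises KeyError. The 'title' attribute here is just an example of an attribute the tag does not have.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<a href='http://cn.bing.com/'> </a>", "html.parser")
link = soup.find("a")

href = link.get("href")                  # attribute present
title = link.get("title")                # attribute absent, so None
title_or_default = link.get("title", "n/a")

print(href)              # http://cn.bing.com/
print(title)             # None
print(title_or_default)  # n/a

# Square-bracket access is stricter and raises KeyError for a missing attribute.
try:
    link["title"]
    raised = False
except KeyError:
    raised = True
print(raised)            # True
```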
