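Python data analysis: lxml is installed, but using it in PyCharm still raises an error

I was following the code in a book (《python資料分析入門 從資料擷取到可視化》 by 沈祥壯) to learn web scraping, but got stuck on the error in the title. I tried many things: running pip install lxml and pip uninstall lxml, downloading the matching lxml version from the official site and installing it from an absolute path, and so on. None of them fixed it.

Along the way I saw many messages, including this one: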

Requirement already satisfied: lxml in c:\users\許逍遙\appdata\local\programs\python\python37\lib\site-packages (4.4.1)

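The message is clear: lxml is already installed, so the problem lies in the PyCharm configuration. Most likely the project is using a different interpreter from the one pip installed lxml into. For the fix, see the article below, mainly the part that begins "Attention, this is the key point! In PyCharm, choose File > Settings > Project Interpreter":

Problems when installing the lxml package in Python (python 中安裝lxml包出現的問題)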
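To check whether an interpreter mismatch is really the cause, a small script can be run from inside PyCharm. The following is a minimal sketch, not taken from the referenced article; `describe_module` is a hypothetical helper name. It uses only the standard library to report which interpreter is running the script and whether lxml can be found from it.

```python
import importlib.util
import sys


def describe_module(name):
    """Report the running interpreter and whether `name` is importable from it."""
    spec = importlib.util.find_spec(name)  # None if the module cannot be found
    return {
        "interpreter": sys.executable,
        "module": name,
        "installed": spec is not None,
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    info = describe_module("lxml")
    print("Interpreter:", info["interpreter"])
    print("lxml found: ", info["installed"])
    print("Location:   ", info["location"])
```

If the interpreter path printed here differs from the python37 path in the pip message above, that would suggest lxml was installed for a different interpreter than the one the project is using.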

import requests
from bs4 import BeautifulSoup

url = 'https://book.douban.com/latest'

# Fetch the page that lists newly published books
data = requests.get(url)
# print(data.text)

# Parse the HTML with the lxml parser; this step needs lxml to be installed
# for the interpreter that runs the script
soup = BeautifulSoup(data.text, 'lxml')

# The books sit in two <ul> lists (left and right), each with its own class
books_left = soup.find('ul', {'class': 'cover-col-4 clearfix'})
books_left = books_left.find_all('li')

books_right = soup.find('ul', {'class': 'cover-col-4 pl20 clearfix'})
books_right = books_right.find_all('li')

books = list(books_left) + list(books_right)
# print(soup)

img_urls = []
titles = []
ratings = []
authors = []
details = []
for book in books:
    # Cover image URL
    img_url = book.find_all('a')[0].find('img').get('src')
    img_urls.append(img_url)

    # Book title
    title = book.find_all('a')[1].get_text()
    titles.append(title)

    # Star rating
    rating = book.find('p', {'class': 'rating'}).get_text()
    rating = rating.replace('\n', '').replace(' ', '')
    ratings.append(rating)

    # Author and publication details
    author = book.find('p', {'class': 'color-gray'}).get_text()
    author = author.replace('\n', '').replace(' ', '')
    authors.append(author)

    # Book summary
    detail = book.find_all('p')[2].get_text()
    detail = detail.replace('\n', '').replace(' ', '')
    details.append(detail)

# Print the collected lists once, after the loop has finished
print("img_urls: ", img_urls)
print("titles: ", titles)
print("ratings: ", ratings)
print("authors: ", authors)
print("details: ", details)
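
If the interpreter cannot be switched right away, one possible workaround is to fall back to the parser that ships with Python. The following is a minimal sketch under that assumption; `pick_parser` is a hypothetical helper name, while 'html.parser' is the built-in parser name that BeautifulSoup accepts. The built-in parser may handle some malformed pages slightly differently from lxml, so results are not guaranteed to be identical.

```python
import importlib.util


def pick_parser(preferred="lxml", fallback="html.parser"):
    """Return `preferred` if that module is importable, otherwise `fallback`."""
    if importlib.util.find_spec(preferred) is not None:
        return preferred
    return fallback
```

With this helper, the parsing call in the script above could be written as BeautifulSoup(data.text, pick_parser()), which would keep the script running even where lxml is missing.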