Python AI: Novel Scraping and Speech Synthesis

2022-09-22 20:22:09

1. Scraping the novels

Test site: https://www.qidian.com/

The goal is to save the novels to local disk.

Here is the code:

import requests
from lxml import etree
import os

# Design approach: object-oriented


class Spider(object):
    def state_request(self):
        # Request the listing page, extract novel titles and links, and create one folder per novel
        response = requests.get('https://www.qidian.com/all')       # fetch the listing page
        xml = etree.HTML(response.text)         # parse the HTML into an element tree
        novel_list = xml.xpath('//div[@class="book-mid-info"]/h4/a/text()')      # novel titles
        novel_url_list = xml.xpath('//div[@class="book-mid-info"]/h4/a/@href')      # novel links
        for novel, novel_url in zip(novel_list, novel_url_list):
            if not os.path.exists(novel):
                os.mkdir(novel)

            self.next_section(novel, novel_url)

    def next_section(self, novel, novel_url):
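        # Request the novel page and extract chapter titles and chapter links
        # (the hrefs are protocol-relative, so a scheme is prepended)
        response = requests.get('http:' + novel_url)
        xml = etree.HTML(response.text)
        section_list = xml.xpath('//ul[@class="cf"]/li/a/text()')       # chapter titles
        section_url_list = xml.xpath('//ul[@class="cf"]/li/a/@href')    # chapter links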
        for section, section_url in zip(section_list, section_url_list):
            self.finally_file(novel, section, section_url)

    def finally_file(self, novel, section, section_url):
        # Request the chapter page, extract the text, and write it to <novel>/<chapter>.txt
        response = requests.get('http:' + section_url)
        xml = etree.HTML(response.text)
        content = "\n".join(xml.xpath('//div[@class="read-content j_readContent"]/p/text()'))
        fileName = os.path.join(novel, section + ".txt")    # portable path instead of a hard-coded backslash
        print("Saving novel file: " + fileName)
        with open(fileName, 'w', encoding='utf-8') as f:
            f.write(content)


spider = Spider()
spider.state_request()

That is the complete scraper code.
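The scraper builds absolute URLs by prepending 'http:' to hrefs that start with `//`. A slightly more robust variant is sketched below using `urljoin` from the standard library, which also copes with root-relative and already-absolute links. The helper name `absolute_url` and the sample hrefs are made up for illustration; they are not part of the original code.

```python
from urllib.parse import urljoin

BASE_URL = 'https://www.qidian.com/all'  # the page the hrefs were scraped from


def absolute_url(href, base=BASE_URL):
    """Resolve an href found on `base` into an absolute URL."""
    return urljoin(base, href)


# The hrefs below are hypothetical examples, not real pages.
print(absolute_url('//book.qidian.com/info/1'))  # protocol-relative link
print(absolute_url('/rank'))                     # root-relative link
print(absolute_url('https://example.com/x'))     # already absolute, left unchanged
```

With a helper like this, `requests.get('http:' + novel_url)` could become `requests.get(absolute_url(novel_url))`, and the scheme would follow the page that was scraped.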

2. Speech synthesis

Here is the code.

You first need to register on the Baidu AI open platform: https://login.bce.baidu.com/?account=

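# Voice selection. Basic voices: 0 = 度小美, 1 = 度小宇, 3 = 度逍遙, 4 = 度丫丫
# Premium voices: 5 = 度小嬌, 103 = 度米朵, 106 = 度博文, 110 = 度小童, 111 = 度小萌. Default: 度小美
PER = 0
# Speech rate, 0-9; the default 5 is medium speed
SPD = 5
# Pitch, 0-9; the default 5 is medium pitch
PIT = 5
# Volume, 0-9; the default 5 is medium volume
VOL = 5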

from aip import AipSpeech

app_id = '17370766'
api_key = 'icSpdlysLxpPYe5QbCMNhxvY'
secret_key = 'XwnoVuHrVLioKDc2LlgTOPwQVSTVwb5L'

client = AipSpeech(app_id, api_key, secret_key)

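result = client.synthesis("烈日當空，灼燒大地，盡管已經是八月末了，但炎熱的夏季依舊在散發着陣陣餘威。", "zh", 2, {
    "vol": 9,   # volume
    "spd": 3.5,   # speech rate
    "pit": 3,   # pitch
    "per": 5,   # voice
})

# synthesis() returns the audio as bytes on success and a dict describing the error on failure
if not isinstance(result, dict):
    with open("audio.mp3", "wb") as f:
        f.write(result)
else:
    print(result)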
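The example above synthesizes a single sentence, but a scraped chapter is much longer, and the synthesis API limits how much text one request may carry. Here is a rough sketch of splitting a chapter into sentence-aligned chunks before sending them. The function name `split_text` and the default `max_chars=500` are my own placeholders; check the Baidu documentation for the real limit.

```python
import re


def split_text(text, max_chars=500):
    """Split text into chunks of at most max_chars characters,
    breaking after sentence-ending punctuation where possible."""
    # Split after each sentence terminator, keeping the terminator;
    # whitespace-only pieces are dropped.
    sentences = [s for s in re.split(r'(?<=[。！？!?\n])', text) if s.strip()]
    chunks, current = [], ''
    for sentence in sentences:
        # A single sentence longer than the limit is cut hard.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ''
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        # Start a new chunk when adding this sentence would overflow.
        if len(current) + len(sentence) > max_chars:
            chunks.append(current)
            current = ''
        current += sentence
    if current:
        chunks.append(current)
    return chunks


for i, chunk in enumerate(split_text('一二三。四五六！七八九？', max_chars=8)):
    print(i, chunk)
```

Each chunk can then be passed to `client.synthesis` in turn, with every result written to its own numbered file (for example `audio_0.mp3`, `audio_1.mp3`).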

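Core techniques used by the scraper:

requests: sends the HTTP requests

etree: parses the page into an element tree

xpath: locates and extracts the data

os: creates the folders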
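One thing to watch with the os step: novel and chapter titles go straight into folder and file names, and titles may contain characters such as `?` or `/` that are not allowed in file names on Windows. Below is a minimal sketch of a safer version. The helpers `safe_name` and `chapter_path` are hypothetical names of mine, and the sample titles are invented.

```python
import os
import re
import tempfile


def safe_name(name, replacement='_'):
    """Replace characters that Windows forbids in file names."""
    cleaned = re.sub(r'[\\/:*?"<>|\r\n\t]', replacement, name).strip()
    return cleaned or 'untitled'


def chapter_path(root, novel, section):
    """Create the novel folder if needed and return the chapter file path."""
    folder = os.path.join(root, safe_name(novel))
    os.makedirs(folder, exist_ok=True)  # no error if the folder already exists
    return os.path.join(folder, safe_name(section) + '.txt')


# Demonstration in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as root:
    path = chapter_path(root, 'Demo: Novel', 'Chapter 1/2?')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('sample text')
    print(os.path.basename(path))
```

`os.makedirs(..., exist_ok=True)` replaces the separate existence check followed by `os.mkdir`, and it also creates any missing parent folders.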

Core technique used for speech synthesis:

Baidu's aip SDK

Installation (the SDK that provides the aip module is published on PyPI as baidu-aip):

pip install baidu-aip
