
An Article-to-Video Tool Built with Python

Author: 網際網路進階架構師 (Internet Senior Architect)

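Requirements

There are plenty of articles online about turning written scripts into video, but they tend to rely on large models such as Stable Diffusion (for example, link), which are not convenient to deploy or use. So I decided to build an article-to-video tool on Python and open-source libraries that converts illustrated web articles into videos with little effort.

The code has been uploaded to GitHub: github.com/sim4d/text2…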

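Steps

  • Fetch the article content with requests
  • Generate speech and subtitles with edge-tts
  • Extract the article's images with BeautifulSoup
  • Stitch the images into a video with moviepy, and match it to the speech and subtitles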
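
The image-extraction step can be sketched as follows. This is a minimal, dependency-free illustration rather than the project's code: it swaps in Python's standard-library html.parser for BeautifulSoup, and the names ImageCollector and extract_image_urls, the sample HTML, and the preference for a data-src attribute (often used by lazily loaded pages) are all assumptions.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect image URLs from <img> tags, in document order."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        # Assumption: lazily loaded pages keep the real URL in data-src,
        # so prefer it and fall back to the plain src attribute.
        url = attr.get("data-src") or attr.get("src")
        if url and url not in self.urls:
            self.urls.append(url)


def extract_image_urls(html):
    """Return the de-duplicated image URLs found in an HTML string."""
    collector = ImageCollector()
    collector.feed(html)
    return collector.urls


# Hypothetical sample markup, not taken from a real article.
sample = (
    '<div><img data-src="https://example.com/a.png" src="placeholder.gif">'
    '<img src="https://example.com/b.jpg"><img alt="no source"></div>'
)
print(extract_image_urls(sample))
```

In a real run the HTML string would come from the requests call in the first step, and each URL would then be downloaded into the image directory that the video step reads.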

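Project features

  • Built entirely on Python and third-party libraries, with no dependency on large models such as stable-diffusion/midjourney.
  • The implementation uses WeChat official account articles as its example, so the lookups inside its functions (such as get_title / get_wechat_article in make_audio.py) are based on how those articles are structured. For other kinds of articles, the tag and class information being queried may need to be adjusted.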
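
To illustrate what "adjusting the tag and class information" means in practice, here is a small sketch that pulls the text out of one element selected by its id. It is not the project's get_title / get_wechat_article code: it uses the standard-library html.parser instead of BeautifulSoup, and the ids activity-name and js_content are assumed examples of selectors for a WeChat-style page. For another site you would pass different selectors.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class TextById(HTMLParser):
    """Gather the text inside the first element whose id matches."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0       # 0 means we are outside the target element
        self.chunks = []
        self.done = False    # set once the target element has been closed

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS or self.done:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def text_by_id(html, target_id):
    """Return the whitespace-normalized text of the element with this id."""
    parser = TextById(target_id)
    parser.feed(html)
    return " ".join(parser.chunks)


# Hypothetical sample page; the ids are assumptions, not verified selectors.
sample_page = (
    '<h1 id="activity-name"> Demo title </h1>'
    '<div id="js_content"><p>First paragraph.</p><img src="a.png">'
    '<p>Second <b>bold</b> one.</p></div><p>footer</p>'
)
print(text_by_id(sample_page, "activity-name"))
print(text_by_id(sample_page, "js_content"))
```

Keeping the selector as a parameter means that supporting a different article source only requires changing the id that is passed in, not the parsing logic.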

Development environment

Windows 11, WSL2 + Rocky 9.3, Python 3.12

Install Rocky Linux 9.3 for Windows Subsystem for Linux 2 (WSL2); refer to link

Invoke-WebRequest -Uri https://dl.rockylinux.org/pub/rocky/9/images/x86_64/Rocky-9-Container-Base.latest.x86_64.tar.xz -OutFile ./Rocky-9-Container-Base.latest.x86_64.tar.xz
mkdir wsl-rocky
wsl --import rocky9 ./wsl-rocky ./Rocky-9-Container-Base.latest.x86_64.tar.xz --version 2
wsl -d rocky9           

Update the system and add a regular user

# dnf install epel-release
# dnf update && dnf upgrade
#
# dnf install sudo
# adduser wsl
# passwd wsl
# usermod -aG wheel wsl
# exit           

Enter Rocky9 as the regular user

wsl -d rocky9 -u wsl           

Install pip and ImageMagick

sudo dnf install python3-pip
sudo dnf install ImageMagick
sudo dnf install git vim           

Prepare the sandbox (this requires that your public key is already set up in your GitHub account profile)

cd ~/
mkdir sandbox
cd sandbox
git clone git@github.com:sim4d/text2video.git text2video

Install dependencies

cd ~/sandbox/text2video
python3 -m venv my_venv
source ./my_venv/bin/activate
pip3 install -r requirements.txt           

Set the url, then run text2video.py

python3 text2video.py           

References

  • zulko.github.io/moviepy/ref…
  • github.com/rany2/edge-…

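Other issues

1. When generating vtt subtitles, edge-tts only uses words as the Boundary

Once rendered into the video, the subtitles look messy, as shown in the figure.

[Figure: subtitles split at word boundaries]

Solution

Import the edge-tts project into the repository and patch it so that sentences are used as the Boundary instead.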

Patch diff

$ diff -u communicate.py-original communicate.py
--- communicate.py-original     2024-05-12 16:28:58.031623420 +0800
+++ communicate.py      2024-05-13 00:20:53.785773846 +0800
@@ -331,7 +331,7 @@
                 "Content-Type:application/json; charset=utf-8\r\n"
                 "Path:speech.config\r\n\r\n"
                 '{"context":{"synthesis":{"audio":{"metadataoptions":{'
-                '"sentenceBoundaryEnabled":false,"wordBoundaryEnabled":true},'
+                '"sentenceBoundaryEnabled":true,"wordBoundaryEnabled":false},'
                 '"outputFormat":"audio-24khz-48kbitrate-mono-mp3"'
                 "}}}}\r\n"
             )
@@ -359,7 +359,7 @@
         def parse_metadata() -> Dict[str, Any]:
             for meta_obj in json.loads(data)["Metadata"]:
                 meta_type = meta_obj["Type"]
-                if meta_type == "WordBoundary":
+                if meta_type in ("WordBoundary", "SentenceBoundary"):
                     current_offset = meta_obj["Data"]["Offset"] + offset_compensation
                     current_duration = meta_obj["Data"]["Duration"]
                     return {           

The new subtitle result

[Figure: subtitles split at sentence boundaries]
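
To make the difference between the two boundary modes concrete, here is a small sketch that turns boundary events into WebVTT cues. It is illustrative only and not part of edge-tts: the dictionary keys (offset, duration, text), the assumption that offsets and durations are expressed in 100-nanosecond ticks, and the sample events themselves are all hypothetical.

```python
def ticks_to_timestamp(ticks):
    """Format a time given in 100-nanosecond ticks as HH:MM:SS.mmm."""
    total_ms = ticks // 10_000
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, ms = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


def boundaries_to_vtt(events):
    """Build WebVTT text with one cue per boundary event."""
    lines = ["WEBVTT", ""]
    for ev in events:
        start = ev["offset"]
        end = start + ev["duration"]
        lines.append(f"{ticks_to_timestamp(start)} --> {ticks_to_timestamp(end)}")
        lines.append(ev["text"])
        lines.append("")  # a blank line terminates each cue
    return "\n".join(lines)


# Hypothetical events: the same audio described word by word, then as a sentence.
word_events = [
    {"offset": 1_000_000, "duration": 3_000_000, "text": "Hello"},
    {"offset": 4_500_000, "duration": 4_000_000, "text": "world"},
]
sentence_events = [
    {"offset": 1_000_000, "duration": 7_500_000, "text": "Hello world."},
]

print(boundaries_to_vtt(word_events))
print(boundaries_to_vtt(sentence_events))
```

With word events every cue carries a single word, so the on-screen text changes several times per second. With sentence events one cue covers the whole sentence, which matches the cleaner result after the patch.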

2. The same code runs into the following problem on WSL + Ubuntu. Switching to Ubuntu 24.04 made no difference. It only worked after finally switching to Rocky 9.3.

Traceback (most recent call last):
  File "/home/wsl/sandbox/text2video/text2video.py", line 116, in <module>
    main(url, font_path)
  File "/home/wsl/sandbox/text2video/text2video.py", line 107, in main
    generate_video(images_dir, audio_path, vtt_path, font_path, output_path, front_txt, title_txt)
  File "/home/wsl/sandbox/text2video/text2video.py", line 36, in generate_video
    front_clip = mp.TextClip(front_txt, color='black', bg_color='white', font=font_path, align='West', kerning=5, fontsize=18)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wsl/sandbox/text2video/my_venv/lib/python3.12/site-packages/moviepy/video/VideoClip.py", line 1146, in __init__
    raise IOError(error)
OSError: MoviePy Error: creation of None failed because of the following error:

convert-im6.q16: attempt to perform an operation not allowed by the security policy `@/tmp/tmpn53ke08b.txt' @ error/property.c/InterpretImageProperties/3771.
convert-im6.q16: label expected `@/tmp/tmpn53ke08b.txt' @ error/annotate.c/GetMultilineTypeMetrics/782.
convert-im6.q16: no images defined `PNG32:/tmp/tmp3vlxrrq6.png' @ error/convert.c/ConvertImageCommand/3234.
.

.This error can be due to the fact that ImageMagick is not installed on your computer, or (for Windows users) that you didn't specify the path to the ImageMagick binary in file conf.py, or that the path you specified is incorrect           

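Solution

This may be a problem specific to Windows + Python + ImageMagick. This link appears to describe a similar problem that was resolved successfully. Note that the error text itself attributes the failure to ImageMagick's security policy rejecting the `@/tmp/...txt` argument.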
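
As a hedged aid for diagnosing this, the sketch below lists the path rules in an ImageMagick policy document that grant no rights, which is the kind of rule that would reject an `@file` argument. The function name blocking_path_rules and the sample policy are illustrative assumptions, and the sample is not copied from any particular distribution. Whether your system's policy actually contains such a rule needs to be checked locally.

```python
import xml.etree.ElementTree as ET


def blocking_path_rules(policy_xml):
    """Return the patterns of path policies that grant no rights."""
    root = ET.fromstring(policy_xml)
    return [
        policy.get("pattern")
        for policy in root.iter("policy")
        if policy.get("domain") == "path" and policy.get("rights") == "none"
    ]


# Hypothetical policy content used only to demonstrate the function.
sample_policy = """<policymap>
  <policy domain="resource" name="memory" value="256MiB"/>
  <policy domain="path" rights="none" pattern="@*"/>
</policymap>"""

print(blocking_path_rules(sample_policy))
```

To use it on a real machine, read the installed policy file into a string first. Its location varies by distribution and ImageMagick version, so any specific path should be treated as an assumption to verify.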

Author: Simford

Link: https://juejin.cn/post/7368637177428426752

Source: 稀土掘金 (Juejin)
