Requirements

Many approaches described online for converting articles to video depend on large models such as Stable Diffusion (for example, link), which are hard to deploy and use. I therefore decided to build an article-to-video tool using only Python and open-source libraries, so that illustrated web articles can be converted into videos easily.

The code has been uploaded to GitHub: github.com/sim4d/text2...

Specific steps:

Fetch the article content with requests
Generate the speech and subtitles with edge-tts
Extract the article's images with BeautifulSoup
Stitch the images into a video with MoviePy and sync it with the speech and subtitles
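
As a rough illustration of the image-collection step, the sketch below gathers image URLs from an HTML string. It uses the standard library's html.parser in place of BeautifulSoup so that it runs without extra packages. The names ImageCollector and collect_image_urls, the sample HTML, and the assumption that lazily loaded images keep their real URL in a data-src attribute are illustrative and not taken from the project's code.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect image URLs from <img> tags (hypothetical helper, not from the repo)."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # Assumption: lazily loaded images store the real URL in data-src,
        # so prefer it and fall back to the ordinary src attribute.
        url = attr_map.get("data-src") or attr_map.get("src")
        if url:
            self.urls.append(url)


def collect_image_urls(html):
    """Return the image URLs found in the given HTML, in document order."""
    parser = ImageCollector()
    parser.feed(html)
    parser.close()
    return parser.urls


# Made-up sample markup; a real article would be fetched with requests first.
sample_html = """
<div>
  <p>First paragraph</p>
  <img data-src="https://example.com/a.png" src="placeholder.gif">
  <img src="https://example.com/b.jpg">
  <img alt="no source">
</div>
"""
image_urls = collect_image_urls(sample_html)
print(image_urls)
```

Images without any usable URL are skipped, so the later video step only receives entries it can actually download.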

Project features:

Built entirely on Python and its third-party libraries, with no dependence on large models such as stable-diffusion or midjourney.
WeChat official account articles are used as the example, so the lookup logic is tailored to their markup (see the get_title / get_wechat_article functions in make_audio.py). For other kinds of articles, you may need to adjust the tags and classes being queried.
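
One way to keep the tag and class information adjustable is to store it per site in a small table, as sketched below. This is not how make_audio.py is written: the SITE_SELECTORS table, the site names, the tag and class values, and the extract_title helper are all hypothetical placeholders, and the standard library's html.parser stands in for BeautifulSoup so the example has no dependencies.

```python
from html.parser import HTMLParser

# Hypothetical per-site settings; the tag and class values are placeholders.
SITE_SELECTORS = {
    "site_a": {"tag": "h1", "class": "article-title"},
    "site_b": {"tag": "h2", "class": "post-heading"},
}


class FirstMatchText(HTMLParser):
    """Capture the text of the first element with the given tag and class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.depth = 0      # greater than 0 while inside the matched element
        self.done = False   # True once the first match has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            # Track nested elements of the same tag so the right end tag closes the match.
            if tag == self.tag:
                self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_title(html, site):
    """Look up the selector for the site and return the matching element's text."""
    selector = SITE_SELECTORS[site]
    parser = FirstMatchText(selector["tag"], selector["class"])
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


page = '<h1 class="article-title big"> Hello <span>World</span></h1>'
print(extract_title(page, "site_a"))
```

Supporting a new kind of article then means adding one entry to the table rather than editing the extraction logic.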

Development environment

Windows 11, WSL2 + Rocky 9.3, Python 3.12

Install Rocky Linux 9.3 for Windows Subsystem for Linux 2 (WSL2); refer to link

Invoke-WebRequest -Uri https://dl.rockylinux.org/pub/rocky/9/images/x86_64/Rocky-9-Container-Base.latest.x86_64.tar.xz -OutFile ./Rocky-9-Container-Base.latest.x86_64.tar.xz
mkdir wsl-rocky
wsl --import rocky9 ./wsl-rocky ./Rocky-9-Container-Base.latest.x86_64.tar.xz --version 2
wsl -d rocky9

Upgrade the system and add a regular user

# dnf install epel-release
# dnf update && dnf upgrade
#
# dnf install sudo
# adduser wsl
# passwd wsl
# usermod -aG wheel wsl
# exit

Enter Rocky 9 as the regular user

wsl -d rocky9 -u wsl

Install pip and ImageMagick

sudo dnf install python3-pip
sudo dnf install ImageMagick
sudo dnf install git vim

Prepare the sandbox (your public key must already be set up in your GitHub account profile)

cd ~/
mkdir sandbox
cd sandbox
git clone git@github.com:sim4d/text2video.git text2video

Install dependencies

cd ~/sandbox/text2video
python3 -m venv my_venv
source ./my_venv/bin/activate
pip3 install -r requirements.txt

Set the URL and run text2video.py

python3 text2video.py

References

zulko.github.io/moviepy/ref…
github.com/rany2/edge-…

Miscellaneous

1. When edge-tts generates VTT subtitles, it can only use words as the boundary

In the finished video the subtitles look messy, as shown in the figure.

[Figure: Article-to-video artifact built on Python]

Solution

Vendor the edge-tts project into the repository and patch it to use sentences as the boundary

Patch diff

$ diff -u communicate.py-original communicate.py
--- communicate.py-original     2024-05-12 16:28:58.031623420 +0800
+++ communicate.py      2024-05-13 00:20:53.785773846 +0800
@@ -331,7 +331,7 @@
                 "Content-Type:application/json; charset=utf-8\r\n"
                 "Path:speech.config\r\n\r\n"
                 '{"context":{"synthesis":{"audio":{"metadataoptions":{'
-                '"sentenceBoundaryEnabled":false,"wordBoundaryEnabled":true},'
+                '"sentenceBoundaryEnabled":true,"wordBoundaryEnabled":false},'
                 '"outputFormat":"audio-24khz-48kbitrate-mono-mp3"'
                 "}}}}\r\n"
             )
@@ -359,7 +359,7 @@
         def parse_metadata() -> Dict[str, Any]:
             for meta_obj in json.loads(data)["Metadata"]:
                 meta_type = meta_obj["Type"]
-                if meta_type == "WordBoundary":
+                if meta_type in ("WordBoundary", "SentenceBoundary"):
                     current_offset = meta_obj["Data"]["Offset"] + offset_compensation
                     current_duration = meta_obj["Data"]["Duration"]
                     return {
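
To make the effect of the patch concrete, the sketch below turns boundary events into WebVTT cues: with sentence boundaries, each cue carries a whole sentence instead of a single word. It is an illustration only, not the edge-tts subtitle code. It assumes that offsets and durations are expressed in 100-nanosecond units and that each event is a dict with "offset", "duration", and "text" keys; the helper names and the sample event are made up.

```python
TICKS_PER_MS = 10_000  # assumption: offsets and durations are in 100 ns units


def format_vtt_time(ticks):
    """Format a time given in 100-nanosecond ticks as HH:MM:SS.mmm."""
    total_ms = ticks // TICKS_PER_MS
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, ms = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


def build_vtt(events):
    """Build a WebVTT document with one cue per boundary event."""
    lines = ["WEBVTT", ""]
    for event in events:
        start = event["offset"]
        end = start + event["duration"]
        lines.append(f"{format_vtt_time(start)} --> {format_vtt_time(end)}")
        lines.append(event["text"])
        lines.append("")  # a blank line terminates each cue
    return "\n".join(lines)


# Made-up sentence-level event, shaped like the assumed dict described above.
sentence_events = [
    {"offset": 1_000_000, "duration": 25_000_000, "text": "Hello world."},
]
print(build_vtt(sentence_events))
```

Feeding the same function word-level events would produce one cue per word, which is the fragmented layout the patch avoids.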

Subtitle result after the patch

2. The same code fails on WSL + Ubuntu with the error below, and Ubuntu 24.04 behaves the same way. In the end I switched to Rocky 9.3.

Traceback (most recent call last):
  File "/home/wsl/sandbox/text2video/text2video.py", line 116, in <module>
    main(url, font_path)
  File "/home/wsl/sandbox/text2video/text2video.py", line 107, in main
    generate_video(images_dir, audio_path, vtt_path, font_path, output_path, front_txt, title_txt)
  File "/home/wsl/sandbox/text2video/text2video.py", line 36, in generate_video
    front_clip = mp.TextClip(front_txt, color='black', bg_color='white', font=font_path, align='West', kerning=5, fontsize=18)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wsl/sandbox/text2video/my_venv/lib/python3.12/site-packages/moviepy/video/VideoClip.py", line 1146, in __init__
    raise IOError(error)
OSError: MoviePy Error: creation of None failed because of the following error:

convert-im6.q16: attempt to perform an operation not allowed by the security policy `@/tmp/tmpn53ke08b.txt' @ error/property.c/InterpretImageProperties/3771.
convert-im6.q16: label expected `@/tmp/tmpn53ke08b.txt' @ error/annotate.c/GetMultilineTypeMetrics/782.
convert-im6.q16: no images defined `PNG32:/tmp/tmp3vlxrrq6.png' @ error/convert.c/ConvertImageCommand/3234.
.

.This error can be due to the fact that ImageMagick is not installed on your computer, or (for Windows users) that you didn't specify the path to the ImageMagick binary in file conf.py, or that the path you specified is incorrect

Solution

This may be a problem specific to Windows + Python + ImageMagick; the post at this link appears to describe a similar issue that was successfully resolved.
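
The "not allowed by the security policy `@/tmp/...`" line in the traceback suggests that ImageMagick's own policy configuration refused to read the temporary text file passed as an "@file" argument. A commonly reported cause is a policy.xml rule that denies the path pattern "@*". The sketch below checks a policy document for such a rule. The function name and the sample XML are made up, the usual location of the file (often /etc/ImageMagick-6/policy.xml) is an assumption, and none of this was verified on the author's system.

```python
import xml.etree.ElementTree as ET


def blocked_at_patterns(policy_xml):
    """Return path patterns starting with '@' that the policy denies entirely."""
    root = ET.fromstring(policy_xml)
    blocked = []
    for policy in root.iter("policy"):
        pattern = policy.get("pattern") or ""
        if (policy.get("domain") == "path"
                and policy.get("rights") == "none"
                and pattern.startswith("@")):
            blocked.append(pattern)
    return blocked


# Made-up policy document for illustration; a real check would read the
# policy.xml installed with ImageMagick (its location is an assumption).
sample_policy = """<policymap>
  <policy domain="resource" name="memory" value="256MiB"/>
  <policy domain="path" rights="none" pattern="@*"/>
  <policy domain="coder" rights="none" pattern="PDF"/>
</policymap>"""

print(blocked_at_patterns(sample_policy))
```

A non-empty result would point to the policy file as the place to investigate before blaming the Python code.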

Author: Simford

Link: https://juejin.cn/post/7368637177428426752

Source: Juejin (稀土掘金)
