Today I'd like to share a third-party Python library for working with Word documents: Python-Docx.
What is Python-Docx?
Python-Docx is a Python library for creating and updating Microsoft Word (.docx) files.
If you work with Word documents on a daily basis, the free third-party package Python-Docx is very convenient.
Used together with the pandas package, it also lets you insert Excel sheets into Word as tables, saving a lot of time otherwise spent copying, pasting, and adjusting table styles.
But beware: Python-Docx can only handle .docx files! The older .doc format is not supported.
Let's take a look at how to use Python-Docx.
Installation
- python-docx is published on PyPI, so it can be installed with pip as follows:
pip install python-docx
- If you have both Python 2 and Python 3 installed, pip may point to Python 2. In that case, install with pip3 instead:
pip3 install python-docx
Documentation
The official Python-Docx website provides usage documentation that explains how to use all of Python-Docx's features and includes a complete API reference. The capabilities of Python-Docx are also well demonstrated in the examples included in the download.
Usage demo
Here is some code showing how to generate the document shown in the figure above, including:
- Import the python-docx library
- Create a Word document with Level 1, Level 2, and Level 3 headings and body paragraphs
- Format fonts
- Add an image in the specified location
- Add a table in the specified location
- Save the document as a new file
# 1. Import the python-docx library
from docx import Document
from docx.shared import Inches
# 2. Create a Word document with a title, headings, and body paragraphs
document = Document()
document.add_heading('Document Title', 0)
p = document.add_paragraph('A plain paragraph having some ')
# 3. Set font formatting (bold and italic runs)
p.add_run('bold').bold = True
p.add_run(' and some ')
p.add_run('italic.').italic = True
document.add_heading('Heading, level 1', level=1)
document.add_paragraph('Intense quote', style='Intense Quote')
document.add_paragraph('first item in unordered list', style='List Bullet')
document.add_paragraph('first item in ordered list', style='List Number')
# 4. Add a picture at the specified location
# (monty-truth.png must exist in the working directory)
document.add_picture('monty-truth.png', width=Inches(1.25))
records = (
    (3, '101', 'Spam'),
    (7, '422', 'Eggs'),
    (4, '631', 'Spam, spam, eggs, and spam'),
)
# 5. Add a table at the specified location
table = document.add_table(rows=1, cols=3)
hdr_cells = table.rows[0].cells
hdr_cells[0].text = 'Qty'
hdr_cells[1].text = 'Id'
hdr_cells[2].text = 'Desc'
for qty, item_id, desc in records:
    # Append one row per record; cell text must be a string
    row_cells = table.add_row().cells
    row_cells[0].text = str(qty)
    row_cells[1].text = item_id
    row_cells[2].text = desc
document.add_page_break()
# 6. Save the document as a new file
document.save('demo.docx')
Additional Resources:
More sample code can be found on the GitHub page of Python-Docx.
https://github.com/python-openxml/python-docx