Protecting cultural heritage, highlighting its new meaning for the times丨Digitization puts ancient books within reach

Author: China Industry Network

Original title: The "Reading Ancient Books" Platform Brings Together More Than 2,200 Ancient Books, Including the "Yongle Canon" (kicker)

Digitization Puts Ancient Books at Your Fingertips (main headline)

People's Daily reporter Wu Dan

Core Reading

China has about 200,000 titles of ancient books, and it may take more than 300 years to restore and collate all of them. Digitizing ancient books is urgent. How do you "move" an ancient book from the page to a web page? Extensive collection of image materials and the use of modern technology to refine the ...... Since its launch in October 2022, the "Reading Ancient Books" platform has collected more than 2,200 ancient books across the four traditional categories (classics, histories, masters, and collections) and opened them to the public free of charge, a useful attempt to resolve the tension between protecting ancient books and using them.

Tu Youyou drew inspiration from ancient books and discovered artemisinin. Combing the vast body of ancient texts for fine detail, Zhu Kezhen drew a curve of phenological change, the "Zhu Kezhen Curve", which condenses 5,000 years of cold and warm periods in China.

China's vast body of ancient books condenses the wisdom of earlier generations, records a brilliant culture, and tells the story of an unbroken civilization. But some ancient books are slowly "aging": they fade, turn brittle, corrode, and can be damaged by even light reading.

What happens when ancient texts meet modern technology?

A whole new way to "open" ancient books

"First Sight", "Streamer", "Shocking", "Zhulian", "Embellishment"...... Click a tag on the web page and the story of the "Yongle Canon" appears in front of you, covering its past and present, how it was compiled, and its historical value, accompanied by animation and sound effects.

Click "Reading Dictionary" in the upper right corner of the page to enter the text reading platform. The original images of the "Yongle Canon" are shown side by side with the digital text, and you can switch between traditional and simplified Chinese characters at any time. When you meet a rare character or phrase, you can choose the modern Chinese version or click "View Citation", where the source is clear and verifiable.

The "Yongle Canon" is the largest compendium of its kind in ancient China. It brings together all kinds of texts from the pre-Qin period to the early Ming Dynasty and is known as "the largest encyclopedia in the history of the world". After being dispersed several times, however, less than 4% of the original survives. For scholars, the "Yongle Canon" is an important source for research, but for general readers ancient texts are often hard to understand and hard to get access to.

Today, the high-definition image database (first series) of the "Yongle Canon" has officially launched on the ancient book digitization platform "Reading Ancient Books" and is open to the public free of charge. With modern digital technology, the thick volumes are condensed onto a small screen, and a long-dusty scroll of history slowly unrolls, becoming a cultural resource at your fingertips.

"The interactive and visual presentation is better suited to the reading habits of contemporary people, and the immersive reading experience shortens the distance between ancient books and ordinary readers," said Wei Tong, one of the project leaders of the "Reading Ancient Books" platform and an assistant professor in the Department of Information Management of Peking University.

Since its launch in October 2022, the "Reading Ancient Books" platform has collected more than 2,200 ancient books across the classics, histories, masters, and collections categories, all open free of charge to readers at home and abroad. The platform is built jointly by Peking University and Douyin and aims to provide users with free, open, stable, fast, and convenient search and reading services.

Wang Jun, project leader of the "Reading Ancient Books" platform and director of the Digital Humanities Research Center of Peking University, hopes that the "Reading Ancient Books" platform can promote the return of Chinese ancient books scattered overseas and promote the open sharing of ancient books.

An attempt to resolve the contradiction between protection and utilization

Why is the digitization of ancient books urgent?

Wang Jun has calculated that China has about 200,000 titles of ancient books. From 1949 to 2019, nearly 38,000 titles were restored and published; at that rate, it may take more than 300 years to restore and collate all that exist. In other words, the restoration of ancient books cannot keep pace with their aging.
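
The article does not say how the 300-year figure was derived. A minimal sketch of one plausible reading, assuming the 1949-2019 pace simply continues unchanged, is shown below; the function name and the constant-rate assumption are illustrative, not Wang Jun's stated method.

```python
# Figures quoted in the article (both are approximate).
TOTAL_TITLES = 200_000       # extant ancient titles
DONE_TITLES = 38_000         # titles restored and published, 1949-2019
PERIOD_YEARS = 2019 - 1949   # 70 years


def years_needed(titles: int, done: int = DONE_TITLES,
                 period: int = PERIOD_YEARS) -> float:
    """Years needed to process `titles` at the historical average pace.

    Assumption: the pace stays constant, which the article does not claim.
    """
    rate_per_year = done / period
    return titles / rate_per_year


# Time to work through the whole corpus at that pace (about 368 years).
whole_corpus = years_needed(TOTAL_TITLES)
# Time for only the titles not yet restored (about 298 years).
remaining_only = years_needed(TOTAL_TITLES - DONE_TITLES)

print(f"average pace: {DONE_TITLES / PERIOD_YEARS:.0f} titles per year")
print(f"whole corpus: {whole_corpus:.0f} years")
print(f"remaining titles only: {remaining_only:.0f} years")
```

Either way the estimate lands near three centuries, which is consistent with the order of magnitude quoted above.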

Restoration and collation are only the first step toward digitization. Ancient books are both cultural relics and documents; if restored books simply sit on the shelf, follow-up research cannot proceed, and their value for cultural inheritance goes unrealized.

Digitization is therefore both a change in production efficiency and an attempt to resolve the tension between protecting ancient books and using them.

How to "move" an ancient book from a page to a web page?

Opening the "Reading Ancient Books" platform, Yang Hao, the platform's designer and an associate researcher at the Institute of Artificial Intelligence of Peking University, began a demonstration: "Digitizing ancient books takes two steps. First, we cooperate with institutions at home and abroad that hold ancient books to collect digital images of them widely. Second comes textualization: artificial intelligence is used to recognize the text, put it in order, proofread it, structure it, punctuate it, and identify entities, refining the content."

Yang Hao uploaded an image of a page from an ancient book, and after a moment automatic text recognition and processing were complete. Small boxes of different colors appeared on the image. "Each box corresponds to one character, and the reading order is adjusted first. A red box is a reminder that human intervention is needed here for further judgment and processing."
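
The demonstration suggests two operations on the per-character boxes: putting them in reading order and flagging some for human review. The sketch below is a hypothetical illustration, not the platform's code. It assumes traditional vertical layout (columns read right to left, characters top to bottom), and it assumes review flags come from a recognition-confidence threshold; the `CharBox` type, field names, and threshold are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class CharBox:
    """One recognized character and its bounding box (hypothetical schema)."""
    char: str
    x: float           # left edge
    y: float           # top edge
    w: float           # width
    h: float           # height
    confidence: float  # recognition confidence in [0, 1]


def reading_order(boxes: list[CharBox], column_gap: float = 0.5) -> list[CharBox]:
    """Order boxes column by column, right to left, top to bottom."""
    if not boxes:
        return []
    # Visit boxes from the right side of the page to the left.
    by_x = sorted(boxes, key=lambda b: b.x + b.w / 2, reverse=True)
    columns = [[by_x[0]]]
    for box in by_x[1:]:
        anchor = columns[-1][0]
        shift = (anchor.x + anchor.w / 2) - (box.x + box.w / 2)
        # A large horizontal shift from the column's first box starts a new column.
        if shift > column_gap * anchor.w:
            columns.append([box])
        else:
            columns[-1].append(box)
    ordered: list[CharBox] = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: b.y))
    return ordered


def needs_review(boxes: list[CharBox], threshold: float = 0.8) -> list[CharBox]:
    """Boxes a human should check (the 'red boxes' in this sketch)."""
    return [b for b in boxes if b.confidence < threshold]


# Hand-made sample page: two columns, given in scrambled order.
page = [
    CharBox("玄", 60, 0, 20, 20, 0.97),
    CharBox("地", 100, 30, 20, 20, 0.95),
    CharBox("黃", 60, 30, 20, 20, 0.55),
    CharBox("天", 100, 0, 20, 20, 0.99),
]
print("".join(b.char for b in reading_order(page)))
print([b.char for b in needs_review(page)])
```

Real pages add complications this sketch ignores, such as double-line interlinear notes and skewed scans, which is one reason a human review step remains useful.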

At the same time, the automatically recognized text appears beside the image and can be corrected against the original. Yang Hao continued: "This process mainly uses artificial intelligence technologies such as character recognition, automatic punctuation, and named entity recognition. Character recognition segments the text in the digital image into single characters, then recognizes them and puts them in reading order. Automatic punctuation adds modern punctuation to the text by sequence labeling. Named entity recognition uses sequence labeling to identify the names of people, places, and books, as well as times, official titles, and other information in the text. After the machine's automatic recognition, a dedicated person reviews the results to further improve accuracy."
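
Sequence labeling means a model assigns one label to every character, and the labels are then decoded into punctuation or entity spans. The sketch below shows only that decoding step, under assumed conventions: for punctuation, each label is the mark that follows the character, or "O" for none; for entities, the common BIO scheme is used. The labels here are written by hand for illustration and are not model output, and the label sets are assumptions rather than the platform's actual scheme.

```python
def apply_punctuation(chars: str, labels: list[str]) -> str:
    """Insert the labeled punctuation mark after each character ('O' = none)."""
    out = []
    for ch, label in zip(chars, labels):
        out.append(ch)
        if label != "O":
            out.append(label)
    return "".join(out)


def decode_entities(chars: str, tags: list[str]) -> list[tuple[str, str]]:
    """Turn per-character BIO tags into (text, type) entity spans."""
    entities: list[tuple[str, str]] = []
    text, kind = "", None
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            # A new entity begins; close any entity still open.
            if kind is not None:
                entities.append((text, kind))
            text, kind = ch, tag[2:]
        elif tag.startswith("I-") and kind == tag[2:]:
            # Continuation of the currently open entity.
            text += ch
        else:
            # 'O' or an inconsistent tag closes the open entity.
            if kind is not None:
                entities.append((text, kind))
            text, kind = "", None
    if kind is not None:
        entities.append((text, kind))
    return entities


# Hand-labeled examples.
punctuated = apply_punctuation(
    "學而時習之不亦說乎",
    ["O", "O", "O", "O", "，", "O", "O", "O", "？"],
)
entities = decode_entities(
    "司馬遷著史記",
    ["B-PER", "I-PER", "I-PER", "O", "B-BOOK", "I-BOOK"],
)
print(punctuated)
print(entities)
```

Framing both tasks as one label per character lets the same kind of model handle punctuation and entity recognition, with only the label set changing.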

It is reported that on the "Reading Ancient Books" platform the accuracy of character recognition exceeds 96%, the accuracy of automatic punctuation has reached 94%, and the accuracy of named entity recognition on medieval historical materials is close to 98%.

"Most platforms for reading ancient books offer only scanned images or only the text, and some commercial databases charge high fees, so it is very hard to get access to resources." Liu Muhan, a student in the Department of History of Peking University, said that the "Reading Ancient Books" platform has rich search functions as well as filtering by category and by period, which can support academic research.

An intelligent collation platform covering the whole process

Collecting and displaying digital editions of ancient books is not all that the "Reading Ancient Books" platform does. The team has a bigger vision: to carry out every step of the intelligent collation of ancient books on one platform.

"The 'Reading Ancient Books' platform consists of two parts: the front end is the reading platform, and the back end is the ancient book collation platform." Wang Jun offered an analogy: "It's like the front hall and the back kitchen of a restaurant."

At present, the collation platform, the "back kitchen", has user roles such as team administrator, bibliographic administrator, reviewer, and collator. Next, it will bring in ancient book enthusiasts and researchers from all walks of life to advance collation projects and database building through crowdsourced proofreading and collaborative review, creating a whole-process system of "ancient book image uploading, text proofreading and collation, high-quality markup, and text output".

Liu Yuxin, a student at the School of History and Culture of Harbin Normal University, has already tried out the role of "collator".

"I hope to do what I can for seriously damaged ancient books." In April 2022, on seeing a recruitment notice from the Digital Humanities Research Center of Peking University, Liu Yuxin signed up right away and became a volunteer on the "Reading Ancient Books" platform.

"I took part in proofreading ancient books such as 'Annotations to the Zuo Commentary on the Spring and Autumn Annals', 'Records of the Grand Historian', and 'Book of Han'." When the talk turns to volunteer work, Liu Yuxin's enthusiasm is plain. "What impressed me most was that, to draw up the rules for labeling official titles of the Wei, Jin, and Southern and Northern Dynasties, I consulted a large number of documents and also read in detail the 21st examination of the 'Examination of Official Posts' in the 'Comprehensive Examination of Literature' (Wenxian Tongkao)."

"Work on ancient books in the new era needs versatile people who are familiar with classical philology, ancient book conservation, information technology, and digitization workflows, and who can bring these together." Yang Haizheng, a professor in the Department of Chinese at Peking University, suggested strengthening the theory and curriculum of ancient book studies, compiling professional teaching materials suited to the needs of the new era, and giving students more opportunities for practice, so as to build up a talent pool for work on ancient books.

"Ancient books are powerful proof that Chinese civilization has continued unbroken for thousands of years. Through the 'Reading Ancient Books' platform, summer workshops, academic seminars, and other activities, we hope to promote the inheritance and development of Chinese civilization and to show and spread the beauty of our culture to the world," Wang Jun said.

At the beginning of the new year, Yang Hao wrote a new outlook:

"China has a long history and a vast civilization. In 2024, I hope to collect more ancient books, improve the quality of sorting, and better protect the root of civilization!"

(Li Ye participated in the writing)

Source: People's Daily
