Verifying nucleic acid reports too tedious? A Fudan doctoral student wrote 130 lines of code to get it done

Oriental Network reporter Fu Wenjing and correspondent Yin Menghao reported on April 7: Open the "health cloud", take a screenshot of the nucleic acid report, upload it and submit it to the system... This sequence has become a routine for teachers and students on university campuses as part of epidemic prevention and control. But checking the screenshots by hand is time-consuming, laborious, and error-prone. Li Xiaokang, a doctoral student at the School of Information Science and Engineering of Fudan University, recently wrote a small program that can verify the nucleic acid test screenshots of hundreds of people in a few minutes, greatly improving both the efficiency and the accuracy of verification.

From 1 hour to 2 minutes

Li Xiaokang, a doctoral student in biomedical engineering at the School of Information Science and Engineering, serves as counselor for Information Class 1 of the school's 2019 cohort. After the university entered quasi-closed management, and while busy with epidemic prevention work, he began thinking about one tedious daily task. The university has recently been carrying out frequent campus-wide nucleic acid testing. To ensure that every student on campus takes part, each class counselor must collect the students' "health cloud" screenshots. If the check shows that someone has not been tested, the counselor has to remind them to get tested as soon as possible, so that "no one is missed" on the day.

Li Xiaokang served as a volunteer

"This work may sound simple, but in practice it can take half an hour to check the screenshots of one class. For a department with many students it can take longer, and it is easy to misread or miss something," Li Xiaokang said.

The task struck him as monotonous, time-consuming, and highly repetitive, which is exactly the kind of work a computer program does well. That gave him the idea of writing a program to check the nucleic acid test screenshots automatically.

As soon as the program was written, Li Xiaokang tested it on his own class's nucleic acid screenshots. The accuracy was very high, and the program even caught problems that earlier manual checks had missed. It also runs quickly: more than 80 images take just over 20 seconds, which saves a great deal of time and manpower.

Later, Li Xiaokang learned that Gao Limei, head of the graduate student affairs group of the School of Information Science and Engineering, had to verify the nucleic acid screenshots of all the school's graduate students each time, which was slow and especially hard work. To lighten her workload, Li Xiaokang invited her to use his program for verification. Checking 800 screenshots used to take several people more than an hour of tedious work. Now the results are ready after a wait of about 2 minutes. The program has been in use at the school for 2 weeks.

Just over an hour to get the code running

Li Xiaokang says the principle behind the program is not complicated. As a doctoral student in biomedical engineering, he works on medical imaging and artificial intelligence and regularly deals with image processing methods. Even during the current busy period of epidemic prevention work, his advisors Wang Yuanyuan and Guo Yi still discuss his research progress with him every week and continue to look after his research and daily life. Thanks to long-standing research habits and a feel for code, when faced with automatically verifying nucleic acid screenshots, Li Xiaokang first thought of OCR (Optical Character Recognition), a technology he had learned earlier.

"OCR can recognize the text in an image and convert it into text data, which makes verification easy. And because the text in a nucleic acid screenshot is in a printed font, the recognition rate is very high, almost 100% accurate," Li Xiaokang said.

A screenshot contains a lot of text, including the masked name, certificate type, certificate number, sampling time, and testing institution, but not all of it is useful. The name, the sampling time, and whether a sample has been taken are the key fields, and these are what the program needs to find and extract.

To do that, he turned to regular expressions in Python, which are patterns used to search for specific text within a string. "Using regular expressions, you can filter the information you want out of the text that OCR recognizes. Finally, after confirming the name, the sampling time, and whether a sample has been taken in each screenshot, the results for everyone are written to an Excel file for manual confirmation."
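The filtering step can be illustrated with a minimal sketch. It is not Li Xiaokang's code: the field labels ("Name", "Sampling time", "Status"), the date format, and the sample text are all assumptions made for illustration, since the article does not describe the exact layout of a "health cloud" screenshot.

```python
import re

# Hypothetical field labels and formats: the real screenshot layout is not
# given in the article, so these patterns are assumptions.
NAME_RE = re.compile(r"^[ \t]*Name[:：][ \t]*(.+)$", re.MULTILINE)
TIME_RE = re.compile(
    r"^[ \t]*Sampling time[:：][ \t]*(\d{4}-\d{2}-\d{2} \d{2}:\d{2})",
    re.MULTILINE,
)
STATUS_RE = re.compile(r"^[ \t]*Status[:：][ \t]*(.+)$", re.MULTILINE)

# Made-up example of what OCR might return for one screenshot.
SAMPLE = """Name: Zhang San
Certificate type: ID card
Sampling time: 2022-03-15 09:30
Institution: Example Testing Lab
Status: Sampled"""


def parse_report(ocr_text):
    """Pull the three key fields out of OCR text; missing fields become None/False."""
    name = NAME_RE.search(ocr_text)
    time = TIME_RE.search(ocr_text)
    status = STATUS_RE.search(ocr_text)
    return {
        "name": name.group(1).strip() if name else None,
        "sampling_time": time.group(1) if time else None,
        "sampled": bool(status) and status.group(1).strip() == "Sampled",
    }


print(parse_report(SAMPLE))
```

Anchoring each pattern to the start of a line keeps unrelated fields, such as the certificate number, from being picked up by mistake.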

After thinking it through, Li Xiaokang settled on the basic approach: OCR text recognition plus regular expression filtering. He got straight to work. On the evening of March 15, he spent a little over an hour writing the initial code, 130 lines in total, and found that it did run and ran efficiently.
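The overall flow of "OCR, then regular expressions, then a results table" might look roughly like the sketch below. It rests on several assumptions: the OCR step is passed in as a plain function because the article does not name the tool used, the field labels are hypothetical, the check against a class roster is one plausible way to find missing students, and CSV (which Excel can open) stands in for the Excel file.

```python
import csv
import io
import re

# Hypothetical field labels; the real screenshot layout is an assumption.
NAME_RE = re.compile(r"^[ \t]*Name[:：][ \t]*(.+)$", re.MULTILINE)
TIME_RE = re.compile(r"^[ \t]*Sampling time[:：][ \t]*(.+)$", re.MULTILINE)


def check_screenshots(image_paths, roster, ocr):
    """Return one result row per student on the roster.

    `ocr` is any callable that maps an image path to recognized text,
    so the OCR tool can be swapped without touching this function.
    """
    found = {}
    for path in image_paths:
        text = ocr(path) or ""
        name = NAME_RE.search(text)
        time = TIME_RE.search(text)
        if name:
            found[name.group(1).strip()] = time.group(1).strip() if time else ""
    return [
        {
            "name": student,
            "submitted": student in found,
            "sampling_time": found.get(student, ""),
        }
        for student in roster
    ]


def write_report(rows, out):
    """Write the rows as CSV to an open text file for manual confirmation."""
    writer = csv.DictWriter(out, fieldnames=["name", "submitted", "sampling_time"])
    writer.writeheader()
    writer.writerows(rows)


# Usage with a fake OCR function standing in for a real OCR tool.
fake_ocr = {
    "a.png": "Name: Zhang San\nSampling time: 2022-03-15 09:30",
    "b.png": "Name: Li Si\nSampling time: 2022-03-15 10:05",
}.get
rows = check_screenshots(["a.png", "b.png"], ["Zhang San", "Li Si", "Wang Wu"], fake_ocr)
buffer = io.StringIO()
write_report(rows, buffer)
print(buffer.getvalue())
```

Listing every student on the roster, rather than only the screenshots received, is what makes a missing submission visible in the output.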

Of course, he also ran into several technical difficulties: how to implement the OCR step, the fact that students submit screenshots in inconsistent formats, and the anxiety of waiting for the program when there are many screenshots. Li Xiaokang tried different tools one by one, analyzed the characteristics of the images, and worked out a solution for each problem.

Excel file of program output

In the future, it is expected to cover the whole school

Li Xiaokang said that his original intention in developing the program was to reduce the workload for himself and the teachers around him. "The principle is very simple, and anyone who can write code will understand it at once. But someone who does not do this work does not feel how time-consuming and laborious it is, and so naturally will not think of a solution. I am just using what I have learned to solve difficulties in real work," he said.

After Li Xiaokang posted about the program on WeChat Moments, many colleagues in student affairs expressed great interest, and he shared the code so that teachers who needed it could start using it right away.

"Because the program is written in Python and the code is well commented, anyone who can use Python can get started quickly." To make it easier for teachers who do not know how to program, Li Xiaokang eventually packaged the program. "When you need to use it, just enter one line on the command line and run it. It's very simple."
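A one-line command of this kind is commonly built in Python with the standard `argparse` module. The sketch below is only an illustration: the program name `check_nucleic` and the arguments `screenshot_dir`, `--roster`, and `--output` are hypothetical, as the article does not give the actual command.

```python
import argparse


def build_parser():
    """Build a command-line parser; all names and options here are hypothetical."""
    parser = argparse.ArgumentParser(
        prog="check_nucleic",
        description="Check nucleic acid screenshots against a class roster.",
    )
    parser.add_argument("screenshot_dir", help="folder containing the screenshots")
    parser.add_argument("--roster", required=True, help="text file with one student name per line")
    parser.add_argument("--output", default="result.csv", help="where to write the results table")
    return parser


# Equivalent to typing: check_nucleic screenshots --roster class1.txt
args = build_parser().parse_args(["screenshots", "--roster", "class1.txt"])
print(args.screenshot_dir, args.roster, args.output)
```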

The university's information office has reportedly been in contact with Li Xiaokang. "His thinking and approach have inspired us a lot." According to the person in charge, the office is collecting the management needs of second-level units, drawing up plans, and developing a new mini program to be included in the university's "One Netcom Office" platform. In the near future, teachers and students are expected to no longer need to collect nucleic acid screenshots manually through WeChat. Instead, they will upload pictures directly through the mini program, and the heads of second-level units will be able to view the statistics at any time in the back end.