"Past, Future, Old Road, New Road" - the road to digital transformation of biomedicine in the era of large models

author:The CIO's night reading hours

This article is based on a talk given by Xing Jie, COO of Shuimu Molecule, at the 12th CIAPH Pharmaceutical and Health Industry Digitalization Summit Forum.

Shuimu Molecule COO Xing Jie

Dear colleagues, I am Xing Jie, speaking on behalf of Shuimu Molecule. My career has spanned the personal computer, Internet, and mobile Internet eras, and I am now actively joining the era of large models. Last November I made an important decision: to move from being an investor to being a partner in a startup. The startup focuses on the pharmaceutical vertical and is committed to developing multimodal large models and their applications.

Shuimu Molecule is an AI-native company.

At every critical moment of technological change, I have committed myself to learning about the technology, taking part in it, and promoting its application in real-world scenarios. My deep worry has been that this era would pass me by, or rather that I would let it pass.

How not to miss this era: the only way is to get personally involved

Looking back, each previous industrial revolution took about a century. But from the coining of the term "artificial intelligence" at the Dartmouth conference in 1956, to the rise of deep learning and neural networks in 2012, to the emergence of ChatGPT and large models at the end of 2022, the pace of iteration in this new era has accelerated superlinearly.

Neural networks are not a native product of computer science; they grew out of a major advance in brain science. They mimic the way humans think about problems, activating domain-specific functions through attention-like mechanisms among neurons in order to solve specialized problems more effectively. Still, neural networks do not solve every problem. OpenAI's ChatGPT quickly became popular and sparked endless speculation about what artificial intelligence will feel like in the future. In a specific domain, however, you have to combine domain knowledge, data, compliance, and other factors, and strike a balance between generation and retrieval and between trustworthiness and traceability. A general-purpose model cannot simply solve this, which is why I chose to join a vertical model company.

We are in the midst of a seismic shift. How do we adapt, and what does it mean for the CIO?

Bill Gates has noted:

We've been exploring how AI can change the way humans interact with computers. Whether it is structured data or unstructured data, it will eventually be parsed and processed by large models, transformed into tasks, and executed by agents to achieve intent.

In biomedicine, human intuition and innovative thinking are indispensable. We believe that scientists' intuition is crucial and cannot be replaced by machine automation. We need to feed this intuition into our models so that they can schedule tasks more efficiently, carry out actions, and give better feedback. For the industry as a whole, large models will not replace the work of scientists, but this super assistant will certainly change the paradigm of their work.

The old road has become a new one. Now that we are on it, how should we move forward?

We have been on the road of enterprise digital transformation for more than 20 years. Now, facing a new path, how should we move forward? I am trying it out myself and would like to share some of what I have distilled so far.

The barriers in the biomedical field are very high, and as the platform's capabilities and technologies develop they will certainly find other ways to connect. The hardest part of a humanoid robot is not its humanoid form but its brain: if agents are to carry out all the tasks, it is the brain that must direct them well so that every task gets done.

Scientists' "intuition" steering "model + agent + automation" is the next paradigm in the pharmaceutical industry.

  • Up to the end of the 19th century, the first generation (TMDD): manual operation based on empiricism, characterized by low throughput, lack of systematization, long timelines, and high cost;
  • From the late 19th century to the mid-20th century, the second generation (CADD): computer-aided drug discovery and design, built on physical and chemical rules, characterized by high throughput and a tool-like nature, solving only one problem at a time and relying on the experience of researchers;
  • From the mid-20th century to the beginning of the 21st century, the third generation (AIDD): AI transforms drug discovery by learning discovery and design from training data, with only a loose, "dotted-line" connection between expert cognition and model knowledge; characterized by ultra-high throughput, a process orientation, little interaction between models and experts, and reliance on large-scale, high-quality annotated data;
  • From 2023 to the present, the fourth generation (ChatDD): conversational drug development through human-machine collaboration, which redefines the drug development model and is characterized by a direct link between expert cognition and large-model knowledge.

What distinguishes AIDD from ChatDD in the era of large models? AIDD may not look like a great success, but it has played a crucial role in some respects. The difference is one of generality. Tmall Genie executes simple commands well but is helpless with more complex questions; in the same way, earlier tools handle narrow tasks, whereas generative large models can generalize from one case to others and so cope better with complex problems in complex pharmaceutical scenarios.

Last year Shuimu Molecule released two pharmaceutical models: a 10B open-source model and a 100B closed-source model. Our open-source BioMedGPT model can be downloaded from GitHub. Everyone is welcome to use it, and we would appreciate your stars.

ChatDD-FM® is our 100-billion-parameter closed-source model. Guided by expert instructions and combined with retrieval augmentation over trusted internal and external data, it can directly give traceable and more rigorous answers. The model integrates native agents with external tool invocation, connecting a series of prompts, tools, and invoked tasks into a closed-loop "chain" from intent to execution. Such a model is better suited to the needs of pharmaceutical companies and can play a greater role in serious medicine.
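
The "chain" from intent to execution described above can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption rather than ChatDD-FM's actual design: the `CORPUS` contents, the tool names, and the keyword rule in `route` merely stand in for trusted databases, real tools, and model-driven intent recognition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Document:
    source: str  # provenance, kept so every answer stays traceable
    text: str


# Hypothetical trusted corpus standing in for internal and external databases.
CORPUS: List[Document] = [
    Document("internal/pipeline_report", "Compound A entered phase II trials in lung cancer."),
    Document("external/label_db", "Compound B is approved to treat type 2 diabetes."),
]


def retrieve(query: str) -> List[Document]:
    """Naive keyword retrieval: keep documents sharing a word with the query."""
    words = set(query.lower().split())
    return [d for d in CORPUS if words & set(d.text.lower().rstrip(".").split())]


def lookup_tool(query: str) -> Dict[str, object]:
    """Answer only from retrieved passages and report where they came from."""
    hits = retrieve(query)
    answer = " ".join(d.text for d in hits) if hits else "No trusted source found."
    return {"answer": answer, "sources": [d.source for d in hits]}


def count_tool(query: str) -> Dict[str, object]:
    """A second toy tool, so that routing has something to choose between."""
    return {"answer": f"{len(CORPUS)} documents indexed.",
            "sources": [d.source for d in CORPUS]}


TOOLS: Dict[str, Callable[[str], Dict[str, object]]] = {
    "lookup": lookup_tool,
    "count": count_tool,
}


def route(intent: str) -> str:
    """Stand-in for the model's intent recognition: a simple keyword rule."""
    return "count" if "how many" in intent.lower() else "lookup"


def run_chain(intent: str) -> Dict[str, object]:
    """Intent -> tool selection -> execution -> answer with its sources."""
    tool_name = route(intent)
    result = TOOLS[tool_name](intent)
    return {"tool": tool_name, **result}
```

Calling `run_chain("lung cancer pipeline status")` routes to the lookup tool and returns the matching passage together with its source, which is the traceability property the paragraph emphasizes. When nothing in the corpus matches, the sketch declines to answer instead of generating text.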

  • Memory lets the model retain more context. In many scenarios (medical customer service, for example) the history of questions and answers has to be kept, especially under a medical traceability system.
  • Prompting is an essential skill for interacting with the model correctly. It is very important to build Q&A templates for the biomedical domain and to keep accumulating high-quality prompt-based data, which helps fine-tune the model and further reinforce its learning.
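
The two roles above can be sketched in a few lines of Python. This is an assumption-laden illustration, not the product's implementation: the template wording, the `ConversationMemory` class, and the window size are all hypothetical.

```python
from collections import deque
from string import Template
from typing import Deque, Dict, List, Tuple

# Hypothetical Q&A template for a biomedical assistant.
QA_TEMPLATE = Template(
    "Answer only from the context and cite its source.\n"
    "History:\n$history\n"
    "Context:\n$context\n"
    "Question: $question\n"
)


class ConversationMemory:
    """Keeps a short window for the prompt and a full log for traceability."""

    def __init__(self, max_turns: int = 3) -> None:
        self.window: Deque[Tuple[str, str]] = deque(maxlen=max_turns)
        self.audit_log: List[Dict[str, str]] = []

    def add(self, question: str, answer: str) -> None:
        self.window.append((question, answer))  # oldest turn drops out when full
        self.audit_log.append({"question": question, "answer": answer})

    def render(self) -> str:
        return "\n".join(f"Q: {q}\nA: {a}" for q, a in self.window) or "(none)"


def build_prompt(memory: ConversationMemory, context: str, question: str) -> str:
    """Fill the template with recent history, retrieved context, and the new question."""
    return QA_TEMPLATE.substitute(
        history=memory.render(), context=context, question=question
    )
```

The design choice worth noting is the split between `window` and `audit_log`: the prompt only carries the most recent turns, while the complete history is kept separately, which is one simple way to meet a traceability requirement. The logged question-answer pairs are also the kind of data that could later be reviewed by experts and reused for fine-tuning.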
"Past, Future, Old Road, New Road" - the road to digital transformation of biomedicine in the era of large models

Using general-purpose large models in the pharmaceutical industry inevitably brings "hallucination" problems, especially on specialist questions. Asked which cancer has the highest mortality rate in China, for example, a general model may give an ambiguous answer. Every enterprise therefore needs internal experts to "guide" the model and keep improving its ability to recognize contextual intent, so that it generates enterprise-specific value.

Our blueprint is a three-layer architecture:

  • The bottom layer is Foundation, made up mainly of the 100-billion-parameter ChatDD-FM multimodal biomedical model and professional databases;
  • The middle layer is the DeepApp ecosystem, which focuses on R&D assistance, business intelligence, clinical R&D, medical affairs and marketing, process and production, and so on;
  • The top layer is the Deepin Pharma scenarios, applied in early R&D, business decision-making, pipeline tracking, clinical R&D, and marketing.

This blueprint will change a great deal over the next two years as the model ecosystem develops rapidly. Even so, I believe that connecting these elements to solve the scenarios above is the core problem for every enterprise, and that will not change.

If you talk only about technology and ignore the scenario, the technology itself is meaningless. Understanding the scenario is key to solving the industry's critical problems: you have to find the right scenario and then derive the most suitable solution from it.

Beyond this framework, I must emphasize that high-quality data is the most critical factor for every enterprise in the era of large models. Every CIO should treat data asset management as the most important task in strengthening the enterprise's core competitiveness.

Two examples help us picture what a closed-loop, dry-wet, model-driven laboratory of the future might look like. ChemCrow uses large language models to go from understanding intent and tasks to breaking tasks down, then calls APIs to close the loop between dry and wet experiments and collects timely, high-quality feedback data. The other example is an effort in materials science at Lawrence Berkeley National Laboratory, where 41 new compounds were synthesized from 58 targets in 17 days. If dry and wet experiments driven by large models can really be put into commercial practice, the working and R&D efficiency of scientists will change drastically.
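
As a rough illustration of the closed loop described above (decompose the task, call an experiment API, keep every outcome as feedback), here is a small simulated sketch. It is not the API of ChemCrow or of the Berkeley laboratory: `plan`, `run_steps`, and the success probabilities are hypothetical stand-ins, and the "experiment" is just a seeded random draw.

```python
import random
from typing import Dict, List


def plan(target: str) -> List[str]:
    """Stand-in for model-driven task decomposition."""
    return [f"design recipe for {target}", f"synthesize {target}", f"characterize {target}"]


def run_steps(steps: List[str], attempt: int, rng: random.Random) -> bool:
    """Simulated lab API call; the rising success probability is an assumption."""
    return rng.random() < min(0.3 + 0.2 * attempt, 0.9)


def closed_loop(targets: List[str], max_attempts: int = 3, seed: int = 0) -> List[Dict]:
    """Try each target up to max_attempts times, logging every attempt."""
    rng = random.Random(seed)
    log: List[Dict] = []
    for target in targets:
        steps = plan(target)
        for attempt in range(max_attempts):
            success = run_steps(steps, attempt, rng)
            # Every outcome, success or failure, is kept as feedback data.
            log.append({"target": target, "attempt": attempt, "success": success})
            if success:
                break
    return log


def success_rate(log: List[Dict]) -> float:
    """Fraction of attempted targets that succeeded at least once."""
    targets = {r["target"] for r in log}
    solved = {r["target"] for r in log if r["success"]}
    return len(solved) / len(targets) if targets else 0.0
```

The same ratio applied to the figures quoted above, 41 compounds from 58 targets, gives a success rate of about 71%. In a real setting the value of the loop lies less in that number than in the log, since failed attempts are also recorded and can inform the next plan.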

"Past, Future, Old Road, New Road" - the road to digital transformation of biomedicine in the era of large models

In the age of large models, we are all starting from the same place. My advice is: first do it, then do it right, then do it better. Trying certainly has a cost, but there is a world of difference between doing and not doing. If you have not experienced the changes that models bring for yourself, talking about models is just an empty slogan.

The video of the live talk contains far more than this written summary, including the speaker's on-site exchange with Professor Zhang Wei of the University of Hong Kong.

This article is reprinted with permission from the CIO Development Center
