The strongest open-source medical AI model based on Llama 3 was released, refreshing the list

author：The mountain monster Atu 2024-04-29 12:21:00

一家名为Saama AI Labs发布了他们基于Llama 3 微调的开源医疗AI大模型OpenBioLLM-Llama3-70B 和 OpenBioLLM-Llama3-8B,刷新抱抱脸上的医疗大模型榜单,并占据榜首。其在生物医学领域的测试性能超越 GPT-4、Gemini、Meditron-70B、Med-PaLM-2等行业巨头。

The strongest open-source medical AI model based on Llama 3 was released, refreshing the list

Benchmark performance

OpenBioLLM-70B demonstrated superior performance, surpassing larger models such as GPT-4, Gemini, Meditron-70B, and Med-PaLM-2 in 9 different biomedical datasets. Despite its small number of parameters compared to GPT-4 and Med-PaLM, it achieves the best results, with an impressive average score of 86.06%.

Detailed results of the accuracy of medical topics

Fine-tune the process

The fine-tuning process is carried out in two phases, using the LLama-3 70B and 8B models as the basis for fine-tuning

1. Strategy optimization: Optimize the DPO dataset and fine-tune the recipe with direct preference. Direct Preference Optimization: Your Language Model Is Actually a Reward Model arxiv.org/abs/2305.18290

2. Fine-tune the dataset: Customize the medical guidance dataset. It took about 4 months to collect the data, including 3000 healthcare and more than 10 medical subject data, working with medical experts to review their quality and filter out non-conforming examples. The dataset details have not yet been published.

An example of an official application

Summarize clinical records

OpenBioLLM efficiently analyzes and summarizes complex clinical records, EHR data, and discharge summaries, extracts key information, and generates concise, structured summaries of medical records.

Answer medical questions

OpenBioLLM can provide answers to a wide range of medical questions.

Classification of medical documents

Perform a variety of biomedical classification tasks, such as disease prediction, sentiment analysis, medical document classification

Clinical Entity Recognition

Advanced clinical entity recognition can be performed by identifying and extracting key medical concepts such as diseases, symptoms, medications, surgeries, and anatomical structures from unstructured clinical texts. By leveraging a deep understanding of medical terminology and context, the model can accurately annotate and classify clinical entities, enabling more efficient information retrieval, data analysis, and knowledge discovery from electronic health records, research articles, and other biomedical text sources. This capability can support a variety of downstream applications, such as clinical decision support, pharmacovigilance, and medical research.

Medical marker extraction

Identify and automatically delete patient information

Detect and delete personally identifiable information (PII) from medical records, ensure patient privacy, and comply with data protection regulations such as HIPAA.

Download and use the model

OpenBioLLM-70B 下载地址：huggingface.co/aaditya/Llama3-OpenBioLLM-70B

OpenBioLLM-8B 量化模型下载地址：huggingface.co/aaditya/OpenBioLLM-Llama3-8B-GGUF

It is important to note that you should use the chat templates provided by the Llama-3 guided version. Otherwise, performance will be degraded. Some users and I have already encountered the problem of answering the wrong question in the self-loop because of the prompt word template in the process of actually using Llama 3.

Officials also claim that while OpenBioLLM-70B and 8B utilize high-quality data sources, their output may still contain inaccuracies, biases, or inconsistencies, which could pose risks if relied upon for medical decisions without further testing and refinement. The performance of this model has not been rigorously evaluated in randomized controlled trials or in real-world healthcare settings.

If the model is deployed locally in the hospital and linked to the hospital's local knowledge base, a lot of practical applications can already be imagined, and it should become a super efficiency tool in the medical industry.

The strongest open-source medical AI model based on Llama 3 was released, refreshing the list

Read on

Medical samples from Hong Kong hospitals were mixed, and the woman's uterus was accidentally removed

The United States intends to impose tariffs on some medical consumables in China, which listed companies are the most hurt

The authenticity of Beterbiev's injury is questioned! WBO requirement: Submit a medical statement within 10 days

"Cure or buy?" The medical chaos has sparked heated discussions on the Internet

Medical Horror! Hong Kong woman's uterus was mistakenly cut, and the hospital is now threatening to apologize

Residents' health is the "heart power" of Yangpu Medical's "chain"

Review of China Resources Healthcare's closure of Huaiyin Hospital: Ten years ago, the celebrity acquisition case buried hidden dangers, and the loss of medical resources took away patients

Ministry of Health: Singapore's healthcare model does not need to learn from Europe

Shouyang County Medical Group 2024 Open Recruitment Staff Announcement

Big health leader Robust Medical: Occasional factors fade, business growth is resilient, and long-term layout is the right time to look at financial reports

National Medical Security Administration: "Seeking medical treatment under false names" may constitute a violation of the law

Think manatees are procrastinating for time? Malele picked up the Manatee Doctor's medical kit and put it directly off the field

Byte model released! "99% lower than the industry price", said Tan Cheng, president of Volcano Engine

Ant Bailing large model No. 1: The release of GPT-4o is not unexpected, and the direction of native multimodality is clear

The ByteDance large model made its debut with full staff: the price was 99% lower, and there was no parameter scale and running score

Tasly and Huawei released a large model of digital intelligence materia medica

99.3% cheaper than the industry! ByteDance's bean bag model is going to overturn the industry

What do you have to do to "tame" a large model that is not controlled?

【County News】The Wolong Branch conducts on-site supervision and assistance for radiation safety protection and daily management of medical waste

Original | How multimodal large models can help enterprises in digital transformation

Patriot missiles are pulled by pallet trucks and run all over the streets, real or model?

Huawei's whole-home intelligence "anti-follower": do not shout the slogan of large models, and intensively cultivate AI health care

Multi-functional RNA analysis, the RNA language model of the Baidu team was published in the journal Nature

OpenAI叇巼બ模५GPT-4o, 微๓ "I'm going to say "I'm going to go"

A province issued! There are 9 specialties and 32 provincial-level regional medical centers

Huawei's press conference was accused of fraud: the pictures generated by the large model are manually manipulated?

The 2023 annual reports of 58 listed banks took stock: net interest income grew negatively for the first time since 2017, accelerating the layout of large models

Former Hong Kong star Chen Cunzheng appeared in a medical program to tell the story of suffering from severe kidney disease, which attracted attention

Byte took the lead in launching a large model price war

The construction standard construction period model of the school project of China Construction Eighth Bureau 2022 is available for download

Zhang Shaohan revealed that he broke off the relationship with his parents: there were no medical expenses for illness, and all his property was transferred by his parents

Baidu released a new model of autonomous driving, saying that it is more than 10 times safer than real driving, and the comment area is lively