Med42-v2 - A Suite of Clinically-aligned Large Language Models

Med42-v2 is a suite of open-access clinical large language models (LLM) instruct and preference-tuned by M42 to expand access to medical knowledge. Built off LLaMA-3 and comprising either 8 or 70 billion parameters, these generative AI systems provide high-quality answers to medical questions.

Key performance metrics:

Med42-v2-70B outperforms GPT-4.0 in most of the MCQA tasks.
Med42-v2-70B achieves a MedQA zero-shot performance of 79.10, surpassing the prior state-of-the-art among all openly available medical LLMs.
Med42-v2-70B sits at the top of the Clinical Elo Rating Leaderboard.

Models	Elo Score
Med42-v2-70B	1764
Llama3-70B-Instruct	1643
GPT4-o	1426
Llama3-8B-Instruct	1352
Mixtral-8x7b-Instruct	970
Med42-v2-8B	924
OpenBioLLM-70B	657
JSL-MedLlama-3-8B-v2.0	447

Limitations & Safe Use

The Med42-v2 suite of models is not ready for real clinical use. Extensive human evaluation is undergoing as it is required to ensure safety.
Potential for generating incorrect or harmful information.
Risk of perpetuating biases in training data.

Use this suite of models responsibly! Do not rely on them for medical usage without rigorous safety testing.

Model Details

Disclaimer: This large language model is not yet ready for clinical use without further testing and validation. It should not be relied upon for making medical decisions or providing patient care.

Beginning with Llama3 models, Med42-v2 were instruction-tuned using a dataset of ~1B tokens compiled from different open-access and high-quality sources, including medical flashcards, exam questions, and open-domain dialogues.

Model Developers: M42 Health AI Team

Finetuned from model: Llama3 - 8B & 70B Instruct

Context length: 8k tokens

Input: Text only data

Output: Model generates text only

Status: This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we enhance the model’s performance.

License: Llama 3 Community License Agreement

Research Paper: Coming soon

Intended Use

The Med42-v2 suite of models is being made available for further testing and assessment as AI assistants to enhance clinical decision-making and access to LLMs for healthcare use. Potential use cases include: - Medical question answering - Patient record summarization - Aiding medical diagnosis - General health Q&A

More can be read here: https://huggingface.co/m42-health/Llama3-Med42-70B

m42-health/Llama3-Med42-70B in Ollama

Models

Readme

Med42-v2 - A Suite of Clinically-aligned Large Language Models

Key performance metrics:

Limitations & Safe Use

Model Details

Intended Use

More