Built on the Llama 3 architecture and fine-tuned on an extensive dataset of health information, this model draws on its medical knowledge to offer clear, comprehensive answers.

This build is based on Medichat-Llama3-8B and is quantized to q4_K_M, which trades a small amount of output quality for lower memory usage.

Medichat-Llama3-8B

The following YAML configuration was used to produce this model:


models:
  - model: Undi95/Llama-3-Unholy-8B
    parameters:
      weight: [0.25, 0.35, 0.45, 0.35, 0.25]
      density: [0.1, 0.25, 0.5, 0.25, 0.1]
  - model: Locutusque/llama-3-neural-chat-v1-8b
  - model: ruslanmv/Medical-Llama3-8B-16bit
    parameters:
      weight: [0.55, 0.45, 0.35, 0.45, 0.55]
      density: [0.1, 0.25, 0.5, 0.25, 0.1]
merge_method: dare_ties
base_model: Locutusque/llama-3-neural-chat-v1-8b
parameters:
  int8_mask: true
dtype: bfloat16
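
The five-element `weight` and `density` lists in the configuration above are easier to read once expanded per layer. The sketch below is an illustration only: it assumes the configuration follows the mergekit format, that list-valued parameters are spread across the layer stack by linear interpolation, and that the model has 32 transformer layers. The helper name `layer_values` is hypothetical and not part of any library.

```python
def layer_values(gradient, num_layers):
    """Linearly interpolate a short list of anchor values across num_layers layers."""
    if len(gradient) == 1 or num_layers == 1:
        return [float(gradient[0])] * num_layers
    values = []
    for layer in range(num_layers):
        # Position of this layer on the anchor axis, from 0 to len(gradient) - 1.
        pos = layer / (num_layers - 1) * (len(gradient) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(gradient) - 1)
        frac = pos - lo
        values.append(gradient[lo] * (1 - frac) + gradient[hi] * frac)
    return values


NUM_LAYERS = 32  # assumption: layer count of an 8B Llama 3 model

# Values copied from the ruslanmv/Medical-Llama3-8B-16bit entry in the config.
medical_weight = layer_values([0.55, 0.45, 0.35, 0.45, 0.55], NUM_LAYERS)
medical_density = layer_values([0.1, 0.25, 0.5, 0.25, 0.1], NUM_LAYERS)

for layer in (0, 8, 16, 24, 31):
    print(layer, round(medical_weight[layer], 3), round(medical_density[layer], 3))
```

Under these assumptions, the medical model contributes most strongly in the first and last layers, while the fraction of its parameter deltas that is kept (the density) peaks in the middle of the stack.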

Comparison Against Dr. Samantha 7B

Subject	Medichat-Llama3-8B Accuracy (%)	Dr. Samantha Accuracy (%)
Clinical Knowledge	71.70	52.83
Medical Genetics	78.00	49.00
Human Aging	70.40	58.29
Human Sexuality	73.28	55.73
College Medicine	62.43	38.73
Anatomy	64.44	41.48
College Biology	72.22	52.08
High School Biology	77.10	53.23
Professional Medicine	63.97	38.73
Nutrition	73.86	50.33
Professional Psychology	68.95	46.57
Virology	54.22	41.57
High School Psychology	83.67	66.60
Average	70.33	48.85

Medichat-Llama3-8B scores higher than the earlier Dr. Samantha model in every subject listed, with a reported average accuracy of 70.33% versus 48.85%.
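
To see where the gap is widest, the per-subject differences can be computed directly from the table. This is a small sketch that only re-uses the accuracy figures listed above. The variable names are illustrative.

```python
# Accuracy values (%) copied from the table: (Medichat-Llama3-8B, Dr. Samantha).
scores = {
    "Clinical Knowledge": (71.70, 52.83),
    "Medical Genetics": (78.00, 49.00),
    "Human Aging": (70.40, 58.29),
    "Human Sexuality": (73.28, 55.73),
    "College Medicine": (62.43, 38.73),
    "Anatomy": (64.44, 41.48),
    "College Biology": (72.22, 52.08),
    "High School Biology": (77.10, 53.23),
    "Professional Medicine": (63.97, 38.73),
    "Nutrition": (73.86, 50.33),
    "Professional Psychology": (68.95, 46.57),
    "Virology": (54.22, 41.57),
    "High School Psychology": (83.67, 66.60),
}

# Difference in percentage points for each subject.
gaps = {subject: round(new - old, 2) for subject, (new, old) in scores.items()}
largest = max(gaps, key=gaps.get)
smallest = min(gaps, key=gaps.get)

# Print subjects from the widest to the narrowest gap.
for subject, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{subject:<26}{gap:+.2f}")
```

The sorted output makes it easy to spot which subjects benefit most from the merge and which remain closest to the older model.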

Usage:

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("sethuiyer/Medichat-Llama3-8B")
model = AutoModelForCausalLM.from_pretrained("sethuiyer/Medichat-Llama3-8B").to("cuda")

# Function to format and generate response with prompt engineering using a chat template
def askme(question):
    sys_message = ''' 
    You are an AI Medical Assistant trained on a vast dataset of health information. Please be thorough and
    provide an informative answer. If you don't know the answer to a specific medical inquiry, advise seeking professional help.
    '''

    # Create messages structured for the chat template
    messages = [{"role": "system", "content": sys_message}, {"role": "user", "content": question}]

    # Applying chat template
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=512, use_cache=True)  # Adjust max_new_tokens for longer responses

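    # Decode only the newly generated tokens, skipping the prompt and special tokens
    generated = outputs[0][inputs["input_ids"].shape[-1]:]
    answer = tokenizer.decode(generated, skip_special_tokens=True).strip()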
    return answer

# Example usage
question = '''
Symptoms:
Dizziness, headache and nausea.

What is the differential diagnosis?
'''
print(askme(question))
