3logic/llama_dpo:latest

A fine-tuned Llama 3.1 8B model with supplementary DPO training.

Readme

Model Specifications

Base Model

  • Architecture: Llama 3.1
  • Size: 8B parameters
  • Type: Instruct model
  • Precision: FP16
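
For reference, a minimal sketch of querying this model through the Ollama Python client (`pip install ollama`). It assumes a running local Ollama server with this model already pulled (e.g. `ollama pull 3logic/llama_dpo`); the prompt is purely illustrative:

```python
import ollama

# Illustrative request against the locally pulled model.
response = ollama.chat(
    model="3logic/llama_dpo",
    messages=[{"role": "user", "content": "Summarize the DPO training objective."}],
)
print(response["message"]["content"])
```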

Fine-Tuned Model Parameters

  • Method: Full Precision LoRA
  • Epochs: 20
  • Rank: 64
  • Alpha: 16
  • Training Dataset: train_05_prompted_v2
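
For context, a minimal sketch of what a LoRA configuration with these values looks like in Hugging Face PEFT. The rank and alpha come from the list above; the target modules, dropout, and base checkpoint name are assumptions, since the card does not state them:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Base checkpoint name is illustrative; the card only says "Llama 3.1 8B Instruct".
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

lora_config = LoraConfig(
    r=64,           # Rank: 64 (from the card)
    lora_alpha=16,  # Alpha: 16 (from the card)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed, not stated
    lora_dropout=0.05,                                        # assumed, not stated
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # sanity check: only adapter weights are trainable
```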

DPO Parameters

  • Epochs: 3
  • Rank: 64
  • Alpha: 16
  • Training Dataset: dpo_en_demo
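
A minimal DPO sketch with TRL, wired up with the epochs, rank, and alpha listed above. Checkpoint paths and beta are placeholders, and the `dpo_en_demo` file path is hypothetical (the name matches the demo preference set shipped with LLaMA-Factory, but the card does not state its source):

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

sft_path = "path/to/finetuned-llama-3.1-8b"  # placeholder: the fine-tuned model above
model = AutoModelForCausalLM.from_pretrained(sft_path)
tokenizer = AutoTokenizer.from_pretrained(sft_path)

# TRL expects preference rows with "prompt", "chosen", and "rejected" fields.
dataset = load_dataset("json", data_files="dpo_en_demo.json", split="train")

args = DPOConfig(
    output_dir="llama_dpo",
    num_train_epochs=3,  # Epochs: 3 (from the card)
    beta=0.1,            # assumed; the card does not list a DPO beta
)
trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,  # `tokenizer=` in older TRL releases
    peft_config=LoraConfig(r=64, lora_alpha=16, task_type="CAUSAL_LM"),  # Rank/Alpha from the card
)
trainer.train()
```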

Performance Metrics

Metric                    DPO Score   Fine-Tuned Model Score   Base Model Score
Agentic Similarity        83.67       86                       84.67
CoT Contextual Accuracy   55 / 4      56 / 3                   54 / 5
Medical GPT Score         58.17       65                       51.75

Training Loss Curve

Figure: training_loss.png