
Reinforcement learning with a thought process, applied to Llama 3.2 3B to achieve search

007d910982fd · 158B
{
  "stop": [
    "<|start_header_id|>",
    "<|end_header_id|>",
    "<|eot_id|>",
    "<|reserved_special_token",
    "<|begin_of_text|>"
  ]
}
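
The sketch below illustrates what a "stop" list like this one typically does: generation output is cut at the first occurrence of any listed string. It is a minimal illustration, not the runtime's actual implementation; the helper name `truncate_at_stop` and the sample text are hypothetical. Note that `<|reserved_special_token` has no closing `|>`, so under plain substring matching it would cover every numbered reserved token.

```python
import json

# The stop list from the params blob above, parsed as JSON.
PARAMS = json.loads("""
{
  "stop": [
    "<|start_header_id|>",
    "<|end_header_id|>",
    "<|eot_id|>",
    "<|reserved_special_token",
    "<|begin_of_text|>"
  ]
}
""")


def truncate_at_stop(text, stops):
    """Return text cut at the earliest occurrence of any stop string.

    Hypothetical helper for illustration; assumes plain substring matching.
    """
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# Hypothetical raw model output that runs past the end of the turn.
sample = "The answer is 42.<|eot_id|><|start_header_id|>user"
print(truncate_at_stop(sample, PARAMS["stop"]))
```

Text that contains none of the stop strings is returned unchanged, so the helper is safe to apply to every completion.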