Imported from pmysl/c4ai-command-r-plus-GGUF, using the two system prompts recommended by Cohere: one for general chat and one for tool use/RAG.


Note: You need to run Ollama v0.1.32 (pre-release) for these models to work. You can find this release on the Ollama GitHub releases page. Also note: the upload is still in progress; I am not sure how long it will take, but I will update this readme when it is done.

In the Hugging Face CohereForAI/c4ai-command-r-v01 repo, two system prompts are specified in tokenizer_config.json:

“Default”:

```
You are Command-R, a brilliant, sophisticated, AI-assistant trained to assist human users by providing thorough responses. You are trained by Cohere.
```

“Tool_use” and “Rag” are the same:

```
## Task and Context
You help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user's needs as best you can, which will be wide-ranging.

## Style Guide
Unless the user asks for a different style of answer, you should answer in full sentences, using proper grammar and spelling.
```
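If you want to pull these templates straight from the source, here is a minimal sketch. It assumes you have the huggingface_hub package installed and that the chat_template field in tokenizer_config.json is either a single template string or a list of named templates; the prompts quoted above live inside those templates as default preambles.

```python
import json

from huggingface_hub import hf_hub_download  # pip install huggingface_hub

# Fetch tokenizer_config.json from the Hugging Face repo mentioned above.
path = hf_hub_download(
    repo_id="CohereForAI/c4ai-command-r-v01",
    filename="tokenizer_config.json",
)

with open(path, encoding="utf-8") as f:
    config = json.load(f)

chat_template = config["chat_template"]
if isinstance(chat_template, list):
    # Assumed shape: a list of {"name": ..., "template": ...} entries
    # ("default", "tool_use", "rag").
    for entry in chat_template:
        print(entry["name"], "->", entry["template"][:120], "...")
else:
    # Older configs store a single Jinja template string.
    print(chat_template[:120], "...")
```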

I am offering both options here in case they are useful. In my testing they sometimes make a difference, though it is not entirely clear to me how or why. I benchmarked using an example script from the langroid-examples repo, examples/docqa/chat-multi-extract-local.py, which uses multiple agents to extract information from a simple lease document. The document is short, and both a human and GPT-4-turbo can easily extract the requested information; even so, I have found that a lot of models fail to get the right answers. Results using command-r-plus with the different prompts are below (a minimal usage sketch for the two prompts follows the key at the end of this readme):

| Model | Quant | System prompt | Temperature | Start date | End date | Rent | Deposit | Address |
|-------|-------|---------------|-------------|------------|----------|------|---------|---------|
| command-r-plus | Q2_K | Default | 0.2 | | | | | |
| command-r-plus | Q2_K | Tool_use | 0.2 | | | | | ☑️ |
| command-r-plus | Q3_K_L | Default | 0.2 | | | | | |
| command-r-plus\* | Q3_K_L | Tool_use | 0.2 | | | | | |
| command-r-plus | Q4_K_M | Default | 0.2 | | | | | |
| command-r-plus | Q4_K_M | Tool_use | 0.2 | | | | | |
| command-r-plus | Q5_K_M | Default | 0.2 | | | | | |
| command-r-plus | Q5_K_M | Tool_use | 0.2 | | | | | |
| command-r-plus | Q6_K | Default | 0.2 | | | | | |
| command-r-plus | Q6_K | Tool_use | 0.2 | | | | | |
| command-r-plus | Q8_0 | Default | 0.2 | | | | | |
| command-r-plus | Q8_0 | Tool_use | 0.2 | | | | | |

\*This quant essentially got stuck in a loop with the script: it would only produce a JSON with three questions when five were asked for.

Key:

✅: Correct answer (an address without a zip code was accepted as correct)

☑️: Correct but incomplete answer (the address does not include the state)

❌: Incorrect answer

❔: Answer given was “DO NOT KNOW” or something similar

Note: the results above use the unaltered script. You may get better results by, for example, increasing the number of question variants to more than two.
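For completeness, here is a minimal sketch of switching between the two preambles when chatting through the ollama Python client. The model tag and the user question are placeholders (substitute whatever tag you pulled); the options dict mirrors the temperature of 0.2 used in the table above.

```python
import ollama  # pip install ollama

# The two preambles quoted earlier in this readme.
DEFAULT_PROMPT = (
    "You are Command-R, a brilliant, sophisticated, AI-assistant trained to "
    "assist human users by providing thorough responses. You are trained by Cohere."
)
TOOL_USE_PROMPT = (
    "## Task and Context\nYou help people answer their questions and other "
    "requests interactively. You will be asked a very wide array of requests "
    "on all kinds of topics. You will be equipped with a wide range of search "
    "engines or similar tools to help you, which you use to research your "
    "answer. You should focus on serving the user's needs as best you can, "
    "which will be wide-ranging.\n\n## Style Guide\nUnless the user asks for "
    "a different style of answer, you should answer in full sentences, using "
    "proper grammar and spelling."
)

# Model tag is a placeholder -- use the tag you actually pulled.
response = ollama.chat(
    model="command-r-plus:Q4_K_M",
    messages=[
        {"role": "system", "content": TOOL_USE_PROMPT},  # or DEFAULT_PROMPT
        {"role": "user", "content": "When does the lease start?"},
    ],
    options={"temperature": 0.2},  # the temperature used in the table above
)
print(response["message"]["content"])
```

As noted above, it is not obvious which preamble serves a given task better, so trying both is worthwhile.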