Tools 70B

3 Pulls · Updated 9 days ago

86f0a9f99006 · 43GB · 9 days ago

model
llama · 70.6B · Q4_K_M
params
{"stop":["<|start_header_id|>","<|end_header_id|>","<|eot_id|>"]}
template
{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>
{{- if .System }} {{ .System }}
{{- end }}
{{- if .Tools }} You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original user question.
{{- end }}
{{- end }}<|eot_id|>
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{- if and $.Tools $last }} Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt. Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables. {{ $.Tools }}
{{- end }} {{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|> {{ end }}
{{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
{{- if .ToolCalls }}
{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}
{{- else }} {{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}
{{- end }}
{{- else if eq .Role "tool" }}<|start_header_id|>ipython<|end_header_id|> {{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|> {{ end }}
{{- end }}
{{- end }}
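The template renders any tools passed with a request into the system and user turns and expects the model to reply with a JSON object of the form {"name": ..., "parameters": ...}. A minimal sketch of exercising this through Ollama's /api/chat endpoint is shown below; the model tag forge:70b and the get_current_weather function are illustrative assumptions, not part of this model.

import json
import requests

# Illustrative tool definition; the function name and schema are assumptions,
# not something that ships with this model.
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"],
        },
    },
}]

# Ollama renders the tools list through the template above; the stop tokens
# from the params section terminate each turn.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "forge:70b",  # assumed local tag
        "messages": [{"role": "user", "content": "What is the weather in Paris right now?"}],
        "tools": tools,
        "stream": False,
    },
    timeout=300,
)
resp.raise_for_status()

# The assistant turn either carries structured tool_calls or the raw
# {"name": ..., "parameters": ...} JSON the model generated.
print(json.dumps(resp.json()["message"], indent=2))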

Readme

Forge: A Meta Llama 3.1-Based Model

Overview

Forge is a language model built on Meta Llama 3.1, specifically the 70B Instruct variant. It has been fine-tuned on top of that base model and is distributed here as a Q4_K_M quantization, with a prompt template that supports tool calling.

Key Features

  • Architecture: Based on the well-established Llama (Large Language Model Meta AI) architecture
  • Base Model: Meta Llama 3.1 70B Instruct
  • Quantization Version: 2
  • File Type: Q4_K_M
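These fields can also be read back from a running Ollama instance. The sketch below queries the /api/show endpoint; the local tag forge:70b is an assumption, and older Ollama releases expect the request key "name" rather than "model".

import requests

# Ask a local Ollama server for the model's metadata.
resp = requests.post(
    "http://localhost:11434/api/show",
    json={"model": "forge:70b"},  # assumed tag; older servers use {"name": ...}
    timeout=60,
)
resp.raise_for_status()
details = resp.json().get("details", {})

print("family:             ", details.get("family"))              # e.g. "llama"
print("parameter size:     ", details.get("parameter_size"))      # e.g. "70.6B"
print("quantization level: ", details.get("quantization_level"))  # e.g. "Q4_K_M"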

Model Specifications

  • Attention Mechanism: Uses 64 attention heads with 8 key/value heads (grouped-query attention)
  • Layer Normalization: RMSNorm with an epsilon of 1e-05 for numerical stability
  • Block Count: Comprises 80 blocks, allowing for deep contextual understanding
  • Context Length: Supports sequences of up to 131,072 tokens; a back-of-the-envelope KV-cache estimate at this length is sketched just after this list
  • Embedding Length: Embeds input tokens into an 8192-dimensional space
  • Feed Forward Length: Expands the hidden state to 28,672 dimensions in each feed-forward layer
  • RoPE (Rotary Positional Encoding): Utilizes 128 dimensions and a frequency base of 500,000 for efficient positional encoding
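As a rough illustration of what these figures imply in practice, the sketch below estimates the KV-cache footprint at the full 131,072-token context, assuming fp16 cache entries and the grouped-query layout above (8 key/value heads, head dimension 128 = 8192 / 64). Actual runtimes may quantize the cache and land lower.

# Back-of-the-envelope KV-cache estimate from the specifications above.
# Assumes fp16 (2-byte) cache entries; runtimes that quantize the cache use less.
n_layers      = 80            # block count
n_kv_heads    = 8             # key/value heads (grouped-query attention)
head_dim      = 8192 // 64    # embedding length / attention heads = 128
bytes_per_val = 2             # fp16

# Keys and values are both cached, hence the factor of 2.
bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_val
print(f"KV cache per token:       {bytes_per_token / 1024:.0f} KiB")  # ~320 KiB

context_len = 131_072
print(f"KV cache at full context: {bytes_per_token * context_len / 2**30:.0f} GiB")  # ~40 GiB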

Vocabulary

  • Vocabulary Size: Comprises 128,256 unique tokens
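Together with the 8192-dimensional embedding length, this vocabulary fixes the size of the token-embedding table, as the short calculation below shows.

# Token-embedding table implied by the vocabulary and embedding length.
vocab_size    = 128_256
embedding_dim = 8_192

embedding_params = vocab_size * embedding_dim
print(f"Embedding parameters: {embedding_params / 1e9:.2f}B")  # ~1.05B of the 70.6B total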