kavai/gemma4-caveman:skill-26b
96 Downloads · Updated 5 days ago
🪨 caveman be better. more save token. token be less. easy be use.
Capabilities: vision, tools, thinking, audio
gemma4-caveman:skill-26b / model · 7121486771cb · 18GB
Metadata
general.architecture  gemma4
general.file_type  Q4_K_M
gemma4.attention.head_count  16
gemma4.attention.head_count_kv  [8, 8, 8, 8, 8, ...]
gemma4.attention.key_length  512
gemma4.attention.key_length_swa  256
gemma4.attention.layer_norm_rms_epsilon  1e-06
gemma4.attention.shared_kv_layers  0
gemma4.attention.sliding_window  1024
gemma4.attention.sliding_window_pattern  [true, true, true, true, true, ...]
gemma4.attention.value_length  512
gemma4.attention.value_length_swa  256
gemma4.block_count  30
gemma4.context_length  262144
gemma4.embedding_length  2816
gemma4.embedding_length_per_layer_input  0
gemma4.expert_count  128
gemma4.expert_feed_forward_length  704
gemma4.expert_used_count  8
gemma4.feed_forward_length  2112
gemma4.final_logit_softcapping  30
gemma4.rope.dimension_count  512
gemma4.rope.dimension_count_swa  256
gemma4.rope.freq_base  1e+06
gemma4.rope.freq_base_swa  10000
gemma4.vision.attention.head_count  16
gemma4.vision.attention.layer_norm_epsilon  1e-06
gemma4.vision.block_count  27
gemma4.vision.embedding_length  1152
gemma4.vision.feed_forward_length  4304
gemma4.vision.num_channels  3
gemma4.vision.patch_size  16
gemma4.vision.projector.scale_factor  3
tokenizer.ggml.add_bos_token  false
tokenizer.ggml.add_eos_token  false
tokenizer.ggml.add_mask_token  false
tokenizer.ggml.add_padding_token  false
tokenizer.ggml.add_unknown_token  false
tokenizer.ggml.bos_token_id  2
tokenizer.ggml.eos_token_id  1
tokenizer.ggml.eos_token_ids  [1, 106, 50]
tokenizer.ggml.mask_token_id  4
tokenizer.ggml.merges  [ , ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▁, , , ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▁▁, ...]
tokenizer.ggml.model  llama
tokenizer.ggml.padding_token_id  0
tokenizer.ggml.pre  gemma4
tokenizer.ggml.scores  [0, 1, 2, 3, 4, ...]
tokenizer.ggml.token_type  [3, 3, 3, 3, 3, ...]
tokenizer.ggml.tokens  [<pad>, <eos>, <bos>, <unk>, <mask>, ...]
tokenizer.ggml.unknown_token_id  3
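The expert settings above (expert_count 128, expert_used_count 8, expert_feed_forward_length 704, embedding_length 2816) determine how many expert parameters each layer stores versus actually activates per token. A rough back-of-the-envelope sketch, using only numbers from this listing and assuming the 1408-wide ffn_gate_up_exps tensor in the table below is the fused gate+up pair (2 × 704); routing and the dense FFN path are deliberately left out:

```python
# Per-layer expert tensor shapes, as listed in the tensor table:
#   ffn_gate_up_exps.weight: [2816, 1408, 128]  (fused gate+up, 1408 = 2 * 704)
#   ffn_down_exps.weight:    [704, 2816, 128]
embedding = 2816      # gemma4.embedding_length
expert_ffn = 704      # gemma4.expert_feed_forward_length
expert_count = 128    # gemma4.expert_count
experts_used = 8      # gemma4.expert_used_count
block_count = 30      # gemma4.block_count

# Parameters in one expert: fused gate+up projection plus down projection.
per_expert = embedding * (2 * expert_ffn) + expert_ffn * embedding

total_expert_params = per_expert * expert_count * block_count
active_expert_params = per_expert * experts_used * block_count

print(per_expert)            # 5947392  (~5.9M per expert per layer)
print(total_expert_params)   # 22837985280  (~22.8B stored expert params)
print(active_expert_params)  # 1427374080   (~1.4B active expert params per token)
```

Only 8 of 128 experts fire per token, so the expert weights dominate storage (~22.8B of the parameter count) while contributing only ~1.4B parameters of compute per forward pass.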
Tensor
Name  Type  Shape
token_embd.weight  Q6_K  [2816, 262144]
blk.0
blk.0.attn_k.weight  Q4_K  [2816, 2048]
blk.0.attn_k_norm.weight  F32  [256]
blk.0.attn_norm.weight  F32  [2816]
blk.0.attn_output.weight  Q4_K  [4096, 2816]
blk.0.attn_q.weight  Q4_K  [2816, 4096]
blk.0.attn_q_norm.weight  F32  [256]
blk.0.attn_v.weight  Q6_K  [2816, 2048]
blk.0.ffn_down.weight  Q8_0  [2112, 2816]
blk.0.ffn_down_exps.scale  F32  [128]
blk.0.ffn_down_exps.weight  Q8_0  [704, 2816, 128]
blk.0.ffn_gate.weight  Q4_K  [2816, 2112]
blk.0.ffn_gate_inp.scale  F32  [2816]
blk.0.ffn_gate_inp.weight  F32  [2816, 128]
blk.0.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.0.ffn_norm.weight  F32  [2816]
blk.0.ffn_up.weight  Q4_K  [2816, 2112]
blk.0.layer_output_scale.weight  F32  [1]
blk.0.post_attention_norm.weight  F32  [2816]
blk.0.post_ffw_norm.weight  F32  [2816]
blk.0.post_ffw_norm_1.weight  F32  [2816]
blk.0.post_ffw_norm_2.weight  F32  [2816]
blk.0.pre_ffw_norm_2.weight  F32  [2816]
blk.1
blk.1.attn_k.weight  Q4_K  [2816, 2048]
blk.1.attn_k_norm.weight  F32  [256]
blk.1.attn_norm.weight  F32  [2816]
blk.1.attn_output.weight  Q4_K  [4096, 2816]
blk.1.attn_q.weight  Q4_K  [2816, 4096]
blk.1.attn_q_norm.weight  F32  [256]
blk.1.attn_v.weight  Q6_K  [2816, 2048]
blk.1.ffn_down.weight  Q8_0  [2112, 2816]
blk.1.ffn_down_exps.scale  F32  [128]
blk.1.ffn_down_exps.weight  Q8_0  [704, 2816, 128]
blk.1.ffn_gate.weight  Q4_K  [2816, 2112]
blk.1.ffn_gate_inp.scale  F32  [2816]
blk.1.ffn_gate_inp.weight  F32  [2816, 128]
blk.1.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.1.ffn_norm.weight  F32  [2816]
blk.1.ffn_up.weight  Q4_K  [2816, 2112]
blk.1.layer_output_scale.weight  F32  [1]
blk.1.post_attention_norm.weight  F32  [2816]
blk.1.post_ffw_norm.weight  F32  [2816]
blk.1.post_ffw_norm_1.weight  F32  [2816]
blk.1.post_ffw_norm_2.weight  F32  [2816]
blk.1.pre_ffw_norm_2.weight  F32  [2816]
blk.2
blk.2.attn_k.weight  Q4_K  [2816, 2048]
blk.2.attn_k_norm.weight  F32  [256]
blk.2.attn_norm.weight  F32  [2816]
blk.2.attn_output.weight  Q4_K  [4096, 2816]
blk.2.attn_q.weight  Q4_K  [2816, 4096]
blk.2.attn_q_norm.weight  F32  [256]
blk.2.attn_v.weight  Q6_K  [2816, 2048]
blk.2.ffn_down.weight  Q8_0  [2112, 2816]
blk.2.ffn_down_exps.scale  F32  [128]
blk.2.ffn_down_exps.weight  Q8_0  [704, 2816, 128]
blk.2.ffn_gate.weight  Q4_K  [2816, 2112]
blk.2.ffn_gate_inp.scale  F32  [2816]
blk.2.ffn_gate_inp.weight  F32  [2816, 128]
blk.2.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.2.ffn_norm.weight  F32  [2816]
blk.2.ffn_up.weight  Q4_K  [2816, 2112]
blk.2.layer_output_scale.weight  F32  [1]
blk.2.post_attention_norm.weight  F32  [2816]
blk.2.post_ffw_norm.weight  F32  [2816]
blk.2.post_ffw_norm_1.weight  F32  [2816]
blk.2.post_ffw_norm_2.weight  F32  [2816]
blk.2.pre_ffw_norm_2.weight  F32  [2816]
blk.3
blk.3.attn_k.weight  Q4_K  [2816, 2048]
blk.3.attn_k_norm.weight  F32  [256]
blk.3.attn_norm.weight  F32  [2816]
blk.3.attn_output.weight  Q4_K  [4096, 2816]
blk.3.attn_q.weight  Q4_K  [2816, 4096]
blk.3.attn_q_norm.weight  F32  [256]
blk.3.attn_v.weight  Q6_K  [2816, 2048]
blk.3.ffn_down.weight  Q5_0  [2112, 2816]
blk.3.ffn_down_exps.scale  F32  [128]
blk.3.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.3.ffn_gate.weight  Q4_K  [2816, 2112]
blk.3.ffn_gate_inp.scale  F32  [2816]
blk.3.ffn_gate_inp.weight  F32  [2816, 128]
blk.3.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.3.ffn_norm.weight  F32  [2816]
blk.3.ffn_up.weight  Q4_K  [2816, 2112]
blk.3.layer_output_scale.weight  F32  [1]
blk.3.post_attention_norm.weight  F32  [2816]
blk.3.post_ffw_norm.weight  F32  [2816]
blk.3.post_ffw_norm_1.weight  F32  [2816]
blk.3.post_ffw_norm_2.weight  F32  [2816]
blk.3.pre_ffw_norm_2.weight  F32  [2816]
blk.4
blk.4.attn_k.weight  Q4_K  [2816, 2048]
blk.4.attn_k_norm.weight  F32  [256]
blk.4.attn_norm.weight  F32  [2816]
blk.4.attn_output.weight  Q4_K  [4096, 2816]
blk.4.attn_q.weight  Q4_K  [2816, 4096]
blk.4.attn_q_norm.weight  F32  [256]
blk.4.attn_v.weight  Q6_K  [2816, 2048]
blk.4.ffn_down.weight  Q5_0  [2112, 2816]
blk.4.ffn_down_exps.scale  F32  [128]
blk.4.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.4.ffn_gate.weight  Q4_K  [2816, 2112]
blk.4.ffn_gate_inp.scale  F32  [2816]
blk.4.ffn_gate_inp.weight  F32  [2816, 128]
blk.4.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.4.ffn_norm.weight  F32  [2816]
blk.4.ffn_up.weight  Q4_K  [2816, 2112]
blk.4.layer_output_scale.weight  F32  [1]
blk.4.post_attention_norm.weight  F32  [2816]
blk.4.post_ffw_norm.weight  F32  [2816]
blk.4.post_ffw_norm_1.weight  F32  [2816]
blk.4.post_ffw_norm_2.weight  F32  [2816]
blk.4.pre_ffw_norm_2.weight  F32  [2816]
blk.5
blk.5.attn_k.weight  Q4_K  [2816, 1024]
blk.5.attn_k_norm.weight  F32  [512]
blk.5.attn_norm.weight  F32  [2816]
blk.5.attn_output.weight  Q4_K  [8192, 2816]
blk.5.attn_q.weight  Q4_K  [2816, 8192]
blk.5.attn_q_norm.weight  F32  [512]
blk.5.ffn_down.weight  Q8_0  [2112, 2816]
blk.5.ffn_down_exps.scale  F32  [128]
blk.5.ffn_down_exps.weight  Q8_0  [704, 2816, 128]
blk.5.ffn_gate.weight  Q4_K  [2816, 2112]
blk.5.ffn_gate_inp.scale  F32  [2816]
blk.5.ffn_gate_inp.weight  F32  [2816, 128]
blk.5.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.5.ffn_norm.weight  F32  [2816]
blk.5.ffn_up.weight  Q4_K  [2816, 2112]
blk.5.layer_output_scale.weight  F32  [1]
blk.5.post_attention_norm.weight  F32  [2816]
blk.5.post_ffw_norm.weight  F32  [2816]
blk.5.post_ffw_norm_1.weight  F32  [2816]
blk.5.post_ffw_norm_2.weight  F32  [2816]
blk.5.pre_ffw_norm_2.weight  F32  [2816]
blk.6
blk.6.attn_k.weight  Q4_K  [2816, 2048]
blk.6.attn_k_norm.weight  F32  [256]
blk.6.attn_norm.weight  F32  [2816]
blk.6.attn_output.weight  Q4_K  [4096, 2816]
blk.6.attn_q.weight  Q4_K  [2816, 4096]
blk.6.attn_q_norm.weight  F32  [256]
blk.6.attn_v.weight  Q6_K  [2816, 2048]
blk.6.ffn_down.weight  Q5_0  [2112, 2816]
blk.6.ffn_down_exps.scale  F32  [128]
blk.6.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.6.ffn_gate.weight  Q4_K  [2816, 2112]
blk.6.ffn_gate_inp.scale  F32  [2816]
blk.6.ffn_gate_inp.weight  F32  [2816, 128]
blk.6.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.6.ffn_norm.weight  F32  [2816]
blk.6.ffn_up.weight  Q4_K  [2816, 2112]
blk.6.layer_output_scale.weight  F32  [1]
blk.6.post_attention_norm.weight  F32  [2816]
blk.6.post_ffw_norm.weight  F32  [2816]
blk.6.post_ffw_norm_1.weight  F32  [2816]
blk.6.post_ffw_norm_2.weight  F32  [2816]
blk.6.pre_ffw_norm_2.weight  F32  [2816]
blk.7
blk.7.attn_k.weight  Q4_K  [2816, 2048]
blk.7.attn_k_norm.weight  F32  [256]
blk.7.attn_norm.weight  F32  [2816]
blk.7.attn_output.weight  Q4_K  [4096, 2816]
blk.7.attn_q.weight  Q4_K  [2816, 4096]
blk.7.attn_q_norm.weight  F32  [256]
blk.7.attn_v.weight  Q4_K  [2816, 2048]
blk.7.ffn_down.weight  Q5_0  [2112, 2816]
blk.7.ffn_down_exps.scale  F32  [128]
blk.7.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.7.ffn_gate.weight  Q4_K  [2816, 2112]
blk.7.ffn_gate_inp.scale  F32  [2816]
blk.7.ffn_gate_inp.weight  F32  [2816, 128]
blk.7.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.7.ffn_norm.weight  F32  [2816]
blk.7.ffn_up.weight  Q4_K  [2816, 2112]
blk.7.layer_output_scale.weight  F32  [1]
blk.7.post_attention_norm.weight  F32  [2816]
blk.7.post_ffw_norm.weight  F32  [2816]
blk.7.post_ffw_norm_1.weight  F32  [2816]
blk.7.post_ffw_norm_2.weight  F32  [2816]
blk.7.pre_ffw_norm_2.weight  F32  [2816]
blk.8
blk.8.attn_k.weight  Q4_K  [2816, 2048]
blk.8.attn_k_norm.weight  F32  [256]
blk.8.attn_norm.weight  F32  [2816]
blk.8.attn_output.weight  Q4_K  [4096, 2816]
blk.8.attn_q.weight  Q4_K  [2816, 4096]
blk.8.attn_q_norm.weight  F32  [256]
blk.8.attn_v.weight  Q4_K  [2816, 2048]
blk.8.ffn_down.weight  Q8_0  [2112, 2816]
blk.8.ffn_down_exps.scale  F32  [128]
blk.8.ffn_down_exps.weight  Q8_0  [704, 2816, 128]
blk.8.ffn_gate.weight  Q4_K  [2816, 2112]
blk.8.ffn_gate_inp.scale  F32  [2816]
blk.8.ffn_gate_inp.weight  F32  [2816, 128]
blk.8.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.8.ffn_norm.weight  F32  [2816]
blk.8.ffn_up.weight  Q4_K  [2816, 2112]
blk.8.layer_output_scale.weight  F32  [1]
blk.8.post_attention_norm.weight  F32  [2816]
blk.8.post_ffw_norm.weight  F32  [2816]
blk.8.post_ffw_norm_1.weight  F32  [2816]
blk.8.post_ffw_norm_2.weight  F32  [2816]
blk.8.pre_ffw_norm_2.weight  F32  [2816]
blk.9
blk.9.attn_k.weight  Q4_K  [2816, 2048]
blk.9.attn_k_norm.weight  F32  [256]
blk.9.attn_norm.weight  F32  [2816]
blk.9.attn_output.weight  Q4_K  [4096, 2816]
blk.9.attn_q.weight  Q4_K  [2816, 4096]
blk.9.attn_q_norm.weight  F32  [256]
blk.9.attn_v.weight  Q6_K  [2816, 2048]
blk.9.ffn_down.weight  Q5_0  [2112, 2816]
blk.9.ffn_down_exps.scale  F32  [128]
blk.9.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.9.ffn_gate.weight  Q4_K  [2816, 2112]
blk.9.ffn_gate_inp.scale  F32  [2816]
blk.9.ffn_gate_inp.weight  F32  [2816, 128]
blk.9.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.9.ffn_norm.weight  F32  [2816]
blk.9.ffn_up.weight  Q4_K  [2816, 2112]
blk.9.layer_output_scale.weight  F32  [1]
blk.9.post_attention_norm.weight  F32  [2816]
blk.9.post_ffw_norm.weight  F32  [2816]
blk.9.post_ffw_norm_1.weight  F32  [2816]
blk.9.post_ffw_norm_2.weight  F32  [2816]
blk.9.pre_ffw_norm_2.weight  F32  [2816]
blk.10
blk.10.attn_k.weight  Q4_K  [2816, 2048]
blk.10.attn_k_norm.weight  F32  [256]
blk.10.attn_norm.weight  F32  [2816]
blk.10.attn_output.weight  Q4_K  [4096, 2816]
blk.10.attn_q.weight  Q4_K  [2816, 4096]
blk.10.attn_q_norm.weight  F32  [256]
blk.10.attn_v.weight  Q4_K  [2816, 2048]
blk.10.ffn_down.weight  Q5_0  [2112, 2816]
blk.10.ffn_down_exps.scale  F32  [128]
blk.10.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.10.ffn_gate.weight  Q4_K  [2816, 2112]
blk.10.ffn_gate_inp.scale  F32  [2816]
blk.10.ffn_gate_inp.weight  F32  [2816, 128]
blk.10.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.10.ffn_norm.weight  F32  [2816]
blk.10.ffn_up.weight  Q4_K  [2816, 2112]
blk.10.layer_output_scale.weight  F32  [1]
blk.10.post_attention_norm.weight  F32  [2816]
blk.10.post_ffw_norm.weight  F32  [2816]
blk.10.post_ffw_norm_1.weight  F32  [2816]
blk.10.post_ffw_norm_2.weight  F32  [2816]
blk.10.pre_ffw_norm_2.weight  F32  [2816]
blk.11
blk.11.attn_k.weight  Q4_K  [2816, 1024]
blk.11.attn_k_norm.weight  F32  [512]
blk.11.attn_norm.weight  F32  [2816]
blk.11.attn_output.weight  Q4_K  [8192, 2816]
blk.11.attn_q.weight  Q4_K  [2816, 8192]
blk.11.attn_q_norm.weight  F32  [512]
blk.11.ffn_down.weight  Q8_0  [2112, 2816]
blk.11.ffn_down_exps.scale  F32  [128]
blk.11.ffn_down_exps.weight  Q8_0  [704, 2816, 128]
blk.11.ffn_gate.weight  Q4_K  [2816, 2112]
blk.11.ffn_gate_inp.scale  F32  [2816]
blk.11.ffn_gate_inp.weight  F32  [2816, 128]
blk.11.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.11.ffn_norm.weight  F32  [2816]
blk.11.ffn_up.weight  Q4_K  [2816, 2112]
blk.11.layer_output_scale.weight  F32  [1]
blk.11.post_attention_norm.weight  F32  [2816]
blk.11.post_ffw_norm.weight  F32  [2816]
blk.11.post_ffw_norm_1.weight  F32  [2816]
blk.11.post_ffw_norm_2.weight  F32  [2816]
blk.11.pre_ffw_norm_2.weight  F32  [2816]
blk.12
blk.12.attn_k.weight  Q4_K  [2816, 2048]
blk.12.attn_k_norm.weight  F32  [256]
blk.12.attn_norm.weight  F32  [2816]
blk.12.attn_output.weight  Q4_K  [4096, 2816]
blk.12.attn_q.weight  Q4_K  [2816, 4096]
blk.12.attn_q_norm.weight  F32  [256]
blk.12.attn_v.weight  Q4_K  [2816, 2048]
blk.12.ffn_down.weight  Q5_0  [2112, 2816]
blk.12.ffn_down_exps.scale  F32  [128]
blk.12.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.12.ffn_gate.weight  Q4_K  [2816, 2112]
blk.12.ffn_gate_inp.scale  F32  [2816]
blk.12.ffn_gate_inp.weight  F32  [2816, 128]
blk.12.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.12.ffn_norm.weight  F32  [2816]
blk.12.ffn_up.weight  Q4_K  [2816, 2112]
blk.12.layer_output_scale.weight  F32  [1]
blk.12.post_attention_norm.weight  F32  [2816]
blk.12.post_ffw_norm.weight  F32  [2816]
blk.12.post_ffw_norm_1.weight  F32  [2816]
blk.12.post_ffw_norm_2.weight  F32  [2816]
blk.12.pre_ffw_norm_2.weight  F32  [2816]
blk.13
blk.13.attn_k.weight  Q4_K  [2816, 2048]
blk.13.attn_k_norm.weight  F32  [256]
blk.13.attn_norm.weight  F32  [2816]
blk.13.attn_output.weight  Q4_K  [4096, 2816]
blk.13.attn_q.weight  Q4_K  [2816, 4096]
blk.13.attn_q_norm.weight  F32  [256]
blk.13.attn_v.weight  Q6_K  [2816, 2048]
blk.13.ffn_down.weight  Q5_0  [2112, 2816]
blk.13.ffn_down_exps.scale  F32  [128]
blk.13.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.13.ffn_gate.weight  Q4_K  [2816, 2112]
blk.13.ffn_gate_inp.scale  F32  [2816]
blk.13.ffn_gate_inp.weight  F32  [2816, 128]
blk.13.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.13.ffn_norm.weight  F32  [2816]
blk.13.ffn_up.weight  Q4_K  [2816, 2112]
blk.13.layer_output_scale.weight  F32  [1]
blk.13.post_attention_norm.weight  F32  [2816]
blk.13.post_ffw_norm.weight  F32  [2816]
blk.13.post_ffw_norm_1.weight  F32  [2816]
blk.13.post_ffw_norm_2.weight  F32  [2816]
blk.13.pre_ffw_norm_2.weight  F32  [2816]
blk.14
blk.14.attn_k.weight  Q4_K  [2816, 2048]
blk.14.attn_k_norm.weight  F32  [256]
blk.14.attn_norm.weight  F32  [2816]
blk.14.attn_output.weight  Q4_K  [4096, 2816]
blk.14.attn_q.weight  Q4_K  [2816, 4096]
blk.14.attn_q_norm.weight  F32  [256]
blk.14.attn_v.weight  Q4_K  [2816, 2048]
blk.14.ffn_down.weight  Q8_0  [2112, 2816]
blk.14.ffn_down_exps.scale  F32  [128]
blk.14.ffn_down_exps.weight  Q8_0  [704, 2816, 128]
blk.14.ffn_gate.weight  Q4_K  [2816, 2112]
blk.14.ffn_gate_inp.scale  F32  [2816]
blk.14.ffn_gate_inp.weight  F32  [2816, 128]
blk.14.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.14.ffn_norm.weight  F32  [2816]
blk.14.ffn_up.weight  Q4_K  [2816, 2112]
blk.14.layer_output_scale.weight  F32  [1]
blk.14.post_attention_norm.weight  F32  [2816]
blk.14.post_ffw_norm.weight  F32  [2816]
blk.14.post_ffw_norm_1.weight  F32  [2816]
blk.14.post_ffw_norm_2.weight  F32  [2816]
blk.14.pre_ffw_norm_2.weight  F32  [2816]
blk.15
blk.15.attn_k.weight  Q4_K  [2816, 2048]
blk.15.attn_k_norm.weight  F32  [256]
blk.15.attn_norm.weight  F32  [2816]
blk.15.attn_output.weight  Q4_K  [4096, 2816]
blk.15.attn_q.weight  Q4_K  [2816, 4096]
blk.15.attn_q_norm.weight  F32  [256]
blk.15.attn_v.weight  Q4_K  [2816, 2048]
blk.15.ffn_down.weight  Q5_0  [2112, 2816]
blk.15.ffn_down_exps.scale  F32  [128]
blk.15.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.15.ffn_gate.weight  Q4_K  [2816, 2112]
blk.15.ffn_gate_inp.scale  F32  [2816]
blk.15.ffn_gate_inp.weight  F32  [2816, 128]
blk.15.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.15.ffn_norm.weight  F32  [2816]
blk.15.ffn_up.weight  Q4_K  [2816, 2112]
blk.15.layer_output_scale.weight  F32  [1]
blk.15.post_attention_norm.weight  F32  [2816]
blk.15.post_ffw_norm.weight  F32  [2816]
blk.15.post_ffw_norm_1.weight  F32  [2816]
blk.15.post_ffw_norm_2.weight  F32  [2816]
blk.15.pre_ffw_norm_2.weight  F32  [2816]
blk.16
blk.16.attn_k.weight  Q4_K  [2816, 2048]
blk.16.attn_k_norm.weight  F32  [256]
blk.16.attn_norm.weight  F32  [2816]
blk.16.attn_output.weight  Q4_K  [4096, 2816]
blk.16.attn_q.weight  Q4_K  [2816, 4096]
blk.16.attn_q_norm.weight  F32  [256]
blk.16.attn_v.weight  Q6_K  [2816, 2048]
blk.16.ffn_down.weight  Q5_0  [2112, 2816]
blk.16.ffn_down_exps.scale  F32  [128]
blk.16.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.16.ffn_gate.weight  Q4_K  [2816, 2112]
blk.16.ffn_gate_inp.scale  F32  [2816]
blk.16.ffn_gate_inp.weight  F32  [2816, 128]
blk.16.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.16.ffn_norm.weight  F32  [2816]
blk.16.ffn_up.weight  Q4_K  [2816, 2112]
blk.16.layer_output_scale.weight  F32  [1]
blk.16.post_attention_norm.weight  F32  [2816]
blk.16.post_ffw_norm.weight  F32  [2816]
blk.16.post_ffw_norm_1.weight  F32  [2816]
blk.16.post_ffw_norm_2.weight  F32  [2816]
blk.16.pre_ffw_norm_2.weight  F32  [2816]
blk.17
blk.17.attn_k.weight  Q4_K  [2816, 1024]
blk.17.attn_k_norm.weight  F32  [512]
blk.17.attn_norm.weight  F32  [2816]
blk.17.attn_output.weight  Q4_K  [8192, 2816]
blk.17.attn_q.weight  Q4_K  [2816, 8192]
blk.17.attn_q_norm.weight  F32  [512]
blk.17.ffn_down.weight  Q8_0  [2112, 2816]
blk.17.ffn_down_exps.scale  F32  [128]
blk.17.ffn_down_exps.weight  Q8_0  [704, 2816, 128]
blk.17.ffn_gate.weight  Q4_K  [2816, 2112]
blk.17.ffn_gate_inp.scale  F32  [2816]
blk.17.ffn_gate_inp.weight  F32  [2816, 128]
blk.17.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.17.ffn_norm.weight  F32  [2816]
blk.17.ffn_up.weight  Q4_K  [2816, 2112]
blk.17.layer_output_scale.weight  F32  [1]
blk.17.post_attention_norm.weight  F32  [2816]
blk.17.post_ffw_norm.weight  F32  [2816]
blk.17.post_ffw_norm_1.weight  F32  [2816]
blk.17.post_ffw_norm_2.weight  F32  [2816]
blk.17.pre_ffw_norm_2.weight  F32  [2816]
blk.18
blk.18.attn_k.weight  Q4_K  [2816, 2048]
blk.18.attn_k_norm.weight  F32  [256]
blk.18.attn_norm.weight  F32  [2816]
blk.18.attn_output.weight  Q4_K  [4096, 2816]
blk.18.attn_q.weight  Q4_K  [2816, 4096]
blk.18.attn_q_norm.weight  F32  [256]
blk.18.attn_v.weight  Q4_K  [2816, 2048]
blk.18.ffn_down.weight  Q5_0  [2112, 2816]
blk.18.ffn_down_exps.scale  F32  [128]
blk.18.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.18.ffn_gate.weight  Q4_K  [2816, 2112]
blk.18.ffn_gate_inp.scale  F32  [2816]
blk.18.ffn_gate_inp.weight  F32  [2816, 128]
blk.18.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.18.ffn_norm.weight  F32  [2816]
blk.18.ffn_up.weight  Q4_K  [2816, 2112]
blk.18.layer_output_scale.weight  F32  [1]
blk.18.post_attention_norm.weight  F32  [2816]
blk.18.post_ffw_norm.weight  F32  [2816]
blk.18.post_ffw_norm_1.weight  F32  [2816]
blk.18.post_ffw_norm_2.weight  F32  [2816]
blk.18.pre_ffw_norm_2.weight  F32  [2816]
blk.19
blk.19.attn_k.weight  Q4_K  [2816, 2048]
blk.19.attn_k_norm.weight  F32  [256]
blk.19.attn_norm.weight  F32  [2816]
blk.19.attn_output.weight  Q4_K  [4096, 2816]
blk.19.attn_q.weight  Q4_K  [2816, 4096]
blk.19.attn_q_norm.weight  F32  [256]
blk.19.attn_v.weight  Q4_K  [2816, 2048]
blk.19.ffn_down.weight  Q5_0  [2112, 2816]
blk.19.ffn_down_exps.scale  F32  [128]
blk.19.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.19.ffn_gate.weight  Q4_K  [2816, 2112]
blk.19.ffn_gate_inp.scale  F32  [2816]
blk.19.ffn_gate_inp.weight  F32  [2816, 128]
blk.19.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.19.ffn_norm.weight  F32  [2816]
blk.19.ffn_up.weight  Q4_K  [2816, 2112]
blk.19.layer_output_scale.weight  F32  [1]
blk.19.post_attention_norm.weight  F32  [2816]
blk.19.post_ffw_norm.weight  F32  [2816]
blk.19.post_ffw_norm_1.weight  F32  [2816]
blk.19.post_ffw_norm_2.weight  F32  [2816]
blk.19.pre_ffw_norm_2.weight  F32  [2816]
blk.20
blk.20.attn_k.weight  Q4_K  [2816, 2048]
blk.20.attn_k_norm.weight  F32  [256]
blk.20.attn_norm.weight  F32  [2816]
blk.20.attn_output.weight  Q4_K  [4096, 2816]
blk.20.attn_q.weight  Q4_K  [2816, 4096]
blk.20.attn_q_norm.weight  F32  [256]
blk.20.attn_v.weight  Q6_K  [2816, 2048]
blk.20.ffn_down.weight  Q8_0  [2112, 2816]
blk.20.ffn_down_exps.scale  F32  [128]
blk.20.ffn_down_exps.weight  Q8_0  [704, 2816, 128]
blk.20.ffn_gate.weight  Q4_K  [2816, 2112]
blk.20.ffn_gate_inp.scale  F32  [2816]
blk.20.ffn_gate_inp.weight  F32  [2816, 128]
blk.20.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.20.ffn_norm.weight  F32  [2816]
blk.20.ffn_up.weight  Q4_K  [2816, 2112]
blk.20.layer_output_scale.weight  F32  [1]
blk.20.post_attention_norm.weight  F32  [2816]
blk.20.post_ffw_norm.weight  F32  [2816]
blk.20.post_ffw_norm_1.weight  F32  [2816]
blk.20.post_ffw_norm_2.weight  F32  [2816]
blk.20.pre_ffw_norm_2.weight  F32  [2816]
blk.21
blk.21.attn_k.weight  Q4_K  [2816, 2048]
blk.21.attn_k_norm.weight  F32  [256]
blk.21.attn_norm.weight  F32  [2816]
blk.21.attn_output.weight  Q4_K  [4096, 2816]
blk.21.attn_q.weight  Q4_K  [2816, 4096]
blk.21.attn_q_norm.weight  F32  [256]
blk.21.attn_v.weight  Q4_K  [2816, 2048]
blk.21.ffn_down.weight  Q5_0  [2112, 2816]
blk.21.ffn_down_exps.scale  F32  [128]
blk.21.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.21.ffn_gate.weight  Q4_K  [2816, 2112]
blk.21.ffn_gate_inp.scale  F32  [2816]
blk.21.ffn_gate_inp.weight  F32  [2816, 128]
blk.21.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.21.ffn_norm.weight  F32  [2816]
blk.21.ffn_up.weight  Q4_K  [2816, 2112]
blk.21.layer_output_scale.weight  F32  [1]
blk.21.post_attention_norm.weight  F32  [2816]
blk.21.post_ffw_norm.weight  F32  [2816]
blk.21.post_ffw_norm_1.weight  F32  [2816]
blk.21.post_ffw_norm_2.weight  F32  [2816]
blk.21.pre_ffw_norm_2.weight  F32  [2816]
blk.22
blk.22.attn_k.weight  Q4_K  [2816, 2048]
blk.22.attn_k_norm.weight  F32  [256]
blk.22.attn_norm.weight  F32  [2816]
blk.22.attn_output.weight  Q4_K  [4096, 2816]
blk.22.attn_q.weight  Q4_K  [2816, 4096]
blk.22.attn_q_norm.weight  F32  [256]
blk.22.attn_v.weight  Q4_K  [2816, 2048]
blk.22.ffn_down.weight  Q5_0  [2112, 2816]
blk.22.ffn_down_exps.scale  F32  [128]
blk.22.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.22.ffn_gate.weight  Q4_K  [2816, 2112]
blk.22.ffn_gate_inp.scale  F32  [2816]
blk.22.ffn_gate_inp.weight  F32  [2816, 128]
blk.22.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.22.ffn_norm.weight  F32  [2816]
blk.22.ffn_up.weight  Q4_K  [2816, 2112]
blk.22.layer_output_scale.weight  F32  [1]
blk.22.post_attention_norm.weight  F32  [2816]
blk.22.post_ffw_norm.weight  F32  [2816]
blk.22.post_ffw_norm_1.weight  F32  [2816]
blk.22.post_ffw_norm_2.weight  F32  [2816]
blk.22.pre_ffw_norm_2.weight  F32  [2816]
blk.23
blk.23.attn_k.weight  Q4_K  [2816, 1024]
blk.23.attn_k_norm.weight  F32  [512]
blk.23.attn_norm.weight  F32  [2816]
blk.23.attn_output.weight  Q4_K  [8192, 2816]
blk.23.attn_q.weight  Q4_K  [2816, 8192]
blk.23.attn_q_norm.weight  F32  [512]
blk.23.ffn_down.weight  Q8_0  [2112, 2816]
blk.23.ffn_down_exps.scale  F32  [128]
blk.23.ffn_down_exps.weight  Q8_0  [704, 2816, 128]
blk.23.ffn_gate.weight  Q4_K  [2816, 2112]
blk.23.ffn_gate_inp.scale  F32  [2816]
blk.23.ffn_gate_inp.weight  F32  [2816, 128]
blk.23.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.23.ffn_norm.weight  F32  [2816]
blk.23.ffn_up.weight  Q4_K  [2816, 2112]
blk.23.layer_output_scale.weight  F32  [1]
blk.23.post_attention_norm.weight  F32  [2816]
blk.23.post_ffw_norm.weight  F32  [2816]
blk.23.post_ffw_norm_1.weight  F32  [2816]
blk.23.post_ffw_norm_2.weight  F32  [2816]
blk.23.pre_ffw_norm_2.weight  F32  [2816]
blk.24
blk.24.attn_k.weight  Q4_K  [2816, 2048]
blk.24.attn_k_norm.weight  F32  [256]
blk.24.attn_norm.weight  F32  [2816]
blk.24.attn_output.weight  Q4_K  [4096, 2816]
blk.24.attn_q.weight  Q4_K  [2816, 4096]
blk.24.attn_q_norm.weight  F32  [256]
blk.24.attn_v.weight  Q6_K  [2816, 2048]
blk.24.ffn_down.weight  Q5_0  [2112, 2816]
blk.24.ffn_down_exps.scale  F32  [128]
blk.24.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.24.ffn_gate.weight  Q4_K  [2816, 2112]
blk.24.ffn_gate_inp.scale  F32  [2816]
blk.24.ffn_gate_inp.weight  F32  [2816, 128]
blk.24.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.24.ffn_norm.weight  F32  [2816]
blk.24.ffn_up.weight  Q4_K  [2816, 2112]
blk.24.layer_output_scale.weight  F32  [1]
blk.24.post_attention_norm.weight  F32  [2816]
blk.24.post_ffw_norm.weight  F32  [2816]
blk.24.post_ffw_norm_1.weight  F32  [2816]
blk.24.post_ffw_norm_2.weight  F32  [2816]
blk.24.pre_ffw_norm_2.weight  F32  [2816]
blk.25
blk.25.attn_k.weight  Q4_K  [2816, 2048]
blk.25.attn_k_norm.weight  F32  [256]
blk.25.attn_norm.weight  F32  [2816]
blk.25.attn_output.weight  Q4_K  [4096, 2816]
blk.25.attn_q.weight  Q4_K  [2816, 4096]
blk.25.attn_q_norm.weight  F32  [256]
blk.25.attn_v.weight  Q4_K  [2816, 2048]
blk.25.ffn_down.weight  Q5_0  [2112, 2816]
blk.25.ffn_down_exps.scale  F32  [128]
blk.25.ffn_down_exps.weight  Q5_0  [704, 2816, 128]
blk.25.ffn_gate.weight  Q4_K  [2816, 2112]
blk.25.ffn_gate_inp.scale  F32  [2816]
blk.25.ffn_gate_inp.weight  F32  [2816, 128]
blk.25.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.25.ffn_norm.weight  F32  [2816]
blk.25.ffn_up.weight  Q4_K  [2816, 2112]
blk.25.layer_output_scale.weight  F32  [1]
blk.25.post_attention_norm.weight  F32  [2816]
blk.25.post_ffw_norm.weight  F32  [2816]
blk.25.post_ffw_norm_1.weight  F32  [2816]
blk.25.post_ffw_norm_2.weight  F32  [2816]
blk.25.pre_ffw_norm_2.weight  F32  [2816]
blk.26
blk.26.attn_k.weight  Q4_K  [2816, 2048]
blk.26.attn_k_norm.weight  F32  [256]
blk.26.attn_norm.weight  F32  [2816]
blk.26.attn_output.weight  Q4_K  [4096, 2816]
blk.26.attn_q.weight  Q4_K  [2816, 4096]
blk.26.attn_q_norm.weight  F32  [256]
blk.26.attn_v.weight  Q4_K  [2816, 2048]
blk.26.ffn_down.weight  Q8_0  [2112, 2816]
blk.26.ffn_down_exps.scale  F32  [128]
blk.26.ffn_down_exps.weight  Q8_0  [704, 2816, 128]
blk.26.ffn_gate.weight  Q4_K  [2816, 2112]
blk.26.ffn_gate_inp.scale  F32  [2816]
blk.26.ffn_gate_inp.weight  F32  [2816, 128]
blk.26.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.26.ffn_norm.weight  F32  [2816]
blk.26.ffn_up.weight  Q4_K  [2816, 2112]
blk.26.layer_output_scale.weight  F32  [1]
blk.26.post_attention_norm.weight  F32  [2816]
blk.26.post_ffw_norm.weight  F32  [2816]
blk.26.post_ffw_norm_1.weight  F32  [2816]
blk.26.post_ffw_norm_2.weight  F32  [2816]
blk.26.pre_ffw_norm_2.weight  F32  [2816]
blk.27
blk.27.attn_k.weight  Q4_K  [2816, 2048]
blk.27.attn_k_norm.weight  F32  [256]
blk.27.attn_norm.weight  F32  [2816]
blk.27.attn_output.weight  Q4_K  [4096, 2816]
blk.27.attn_q.weight  Q4_K  [2816, 4096]
blk.27.attn_q_norm.weight  F32  [256]
blk.27.attn_v.weight  Q6_K  [2816, 2048]
blk.27.ffn_down.weight  Q8_0  [2112, 2816]
blk.27.ffn_down_exps.scale  F32  [128]
blk.27.ffn_down_exps.weight  Q8_0  [704, 2816, 128]
blk.27.ffn_gate.weight  Q4_K  [2816, 2112]
blk.27.ffn_gate_inp.scale  F32  [2816]
blk.27.ffn_gate_inp.weight  F32  [2816, 128]
blk.27.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.27.ffn_norm.weight  F32  [2816]
blk.27.ffn_up.weight  Q4_K  [2816, 2112]
blk.27.layer_output_scale.weight  F32  [1]
blk.27.post_attention_norm.weight  F32  [2816]
blk.27.post_ffw_norm.weight  F32  [2816]
blk.27.post_ffw_norm_1.weight  F32  [2816]
blk.27.post_ffw_norm_2.weight  F32  [2816]
blk.27.pre_ffw_norm_2.weight  F32  [2816]
blk.28
blk.28.attn_k.weight  Q4_K  [2816, 2048]
blk.28.attn_k_norm.weight  F32  [256]
blk.28.attn_norm.weight  F32  [2816]
blk.28.attn_output.weight  Q4_K  [4096, 2816]
blk.28.attn_q.weight  Q4_K  [2816, 4096]
blk.28.attn_q_norm.weight  F32  [256]
blk.28.attn_v.weight  Q4_K  [2816, 2048]
blk.28.ffn_down.weight  Q8_0  [2112, 2816]
blk.28.ffn_down_exps.scale  F32  [128]
blk.28.ffn_down_exps.weight  Q8_0  [704, 2816, 128]
blk.28.ffn_gate.weight  Q4_K  [2816, 2112]
blk.28.ffn_gate_inp.scale  F32  [2816]
blk.28.ffn_gate_inp.weight  F32  [2816, 128]
blk.28.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.28.ffn_norm.weight  F32  [2816]
blk.28.ffn_up.weight  Q4_K  [2816, 2112]
blk.28.layer_output_scale.weight  F32  [1]
blk.28.post_attention_norm.weight  F32  [2816]
blk.28.post_ffw_norm.weight  F32  [2816]
blk.28.post_ffw_norm_1.weight  F32  [2816]
blk.28.post_ffw_norm_2.weight  F32  [2816]
blk.28.pre_ffw_norm_2.weight  F32  [2816]
blk.29
blk.29.attn_k.weight  Q4_K  [2816, 1024]
blk.29.attn_k_norm.weight  F32  [512]
blk.29.attn_norm.weight  F32  [2816]
blk.29.attn_output.weight  Q4_K  [8192, 2816]
blk.29.attn_q.weight  Q4_K  [2816, 8192]
blk.29.attn_q_norm.weight  F32  [512]
blk.29.ffn_down.weight  Q8_0  [2112, 2816]
blk.29.ffn_down_exps.scale  F32  [128]
blk.29.ffn_down_exps.weight  Q8_0  [704, 2816, 128]
blk.29.ffn_gate.weight  Q4_K  [2816, 2112]
blk.29.ffn_gate_inp.scale  F32  [2816]
blk.29.ffn_gate_inp.weight  F32  [2816, 128]
blk.29.ffn_gate_up_exps.weight  Q4_K  [2816, 1408, 128]
blk.29.ffn_norm.weight  F32  [2816]
blk.29.ffn_up.weight  Q4_K  [2816, 2112]
blk.29.layer_output_scale.weight  F32  [1]
blk.29.post_attention_norm.weight  F32  [2816]
blk.29.post_ffw_norm.weight  F32  [2816]
blk.29.post_ffw_norm_1.weight  F32  [2816]
blk.29.post_ffw_norm_2.weight  F32  [2816]
blk.29.pre_ffw_norm_2.weight  F32  [2816]
mm.input_projection.weight
F16
F16
[1152, 2816]
rope_freqs.weight
F32
F32
[256]
v.blk.0
v.blk.0.attn_k.weight  F16  [1152, 1152]
v.blk.0.attn_k_norm.weight  F32  [72]
v.blk.0.attn_out.weight  F16  [1152, 1152]
v.blk.0.attn_post_norm.weight  F32  [1152]
v.blk.0.attn_q.weight  F16  [1152, 1152]
v.blk.0.attn_q_norm.weight  F32  [72]
v.blk.0.attn_v.weight  F16  [1152, 1152]
v.blk.0.ffn_down.weight  F16  [4304, 1152]
v.blk.0.ffn_gate.weight  F16  [1152, 4304]
v.blk.0.ffn_post_norm.weight  F32  [1152]
v.blk.0.ffn_up.weight  F16  [1152, 4304]
v.blk.0.ln1.weight  F32  [1152]
v.blk.0.ln2.weight  F32  [1152]
v.blk.1
v.blk.1.attn_k.weight  F16  [1152, 1152]
v.blk.1.attn_k_norm.weight  F32  [72]
v.blk.1.attn_out.weight  F16  [1152, 1152]
v.blk.1.attn_post_norm.weight  F32  [1152]
v.blk.1.attn_q.weight  F16  [1152, 1152]
v.blk.1.attn_q_norm.weight  F32  [72]
v.blk.1.attn_v.weight  F16  [1152, 1152]
v.blk.1.ffn_down.weight  F16  [4304, 1152]
v.blk.1.ffn_gate.weight  F16  [1152, 4304]
v.blk.1.ffn_post_norm.weight  F32  [1152]
v.blk.1.ffn_up.weight  F16  [1152, 4304]
v.blk.1.ln1.weight  F32  [1152]
v.blk.1.ln2.weight  F32  [1152]
v.blk.2
v.blk.2.attn_k.weight  F16  [1152, 1152]
v.blk.2.attn_k_norm.weight  F32  [72]
v.blk.2.attn_out.weight  F16  [1152, 1152]
v.blk.2.attn_post_norm.weight  F32  [1152]
v.blk.2.attn_q.weight  F16  [1152, 1152]
v.blk.2.attn_q_norm.weight  F32  [72]
v.blk.2.attn_v.weight  F16  [1152, 1152]
v.blk.2.ffn_down.weight  F16  [4304, 1152]
v.blk.2.ffn_gate.weight  F16  [1152, 4304]
v.blk.2.ffn_post_norm.weight  F32  [1152]
v.blk.2.ffn_up.weight  F16  [1152, 4304]
v.blk.2.ln1.weight  F32  [1152]
v.blk.2.ln2.weight  F32  [1152]
v.blk.3
v.blk.3.attn_k.weight  F16  [1152, 1152]
v.blk.3.attn_k_norm.weight  F32  [72]
v.blk.3.attn_out.weight  F16  [1152, 1152]
v.blk.3.attn_post_norm.weight  F32  [1152]
v.blk.3.attn_q.weight  F16  [1152, 1152]
v.blk.3.attn_q_norm.weight  F32  [72]
v.blk.3.attn_v.weight  F16  [1152, 1152]
v.blk.3.ffn_down.weight  F16  [4304, 1152]
v.blk.3.ffn_gate.weight  F16  [1152, 4304]
v.blk.3.ffn_post_norm.weight  F32  [1152]
v.blk.3.ffn_up.weight  F16  [1152, 4304]
v.blk.3.ln1.weight  F32  [1152]
v.blk.3.ln2.weight  F32  [1152]
v.blk.4
v.blk.4.attn_k.weight  F16  [1152, 1152]
v.blk.4.attn_k_norm.weight  F32  [72]
v.blk.4.attn_out.weight  F16  [1152, 1152]
v.blk.4.attn_post_norm.weight  F32  [1152]
v.blk.4.attn_q.weight  F16  [1152, 1152]
v.blk.4.attn_q_norm.weight  F32  [72]
v.blk.4.attn_v.weight  F16  [1152, 1152]
v.blk.4.ffn_down.weight  F16  [4304, 1152]
v.blk.4.ffn_gate.weight  F16  [1152, 4304]
v.blk.4.ffn_post_norm.weight  F32  [1152]
v.blk.4.ffn_up.weight  F16  [1152, 4304]
v.blk.4.ln1.weight  F32  [1152]
v.blk.4.ln2.weight  F32  [1152]
v.blk.5
v.blk.5.attn_k.weight  F16  [1152, 1152]
v.blk.5.attn_k_norm.weight  F32  [72]
v.blk.5.attn_out.weight  F16  [1152, 1152]
v.blk.5.attn_post_norm.weight  F32  [1152]
v.blk.5.attn_q.weight  F16  [1152, 1152]
v.blk.5.attn_q_norm.weight  F32  [72]
v.blk.5.attn_v.weight  F16  [1152, 1152]
v.blk.5.ffn_down.weight  F16  [4304, 1152]
v.blk.5.ffn_gate.weight  F16  [1152, 4304]
v.blk.5.ffn_post_norm.weight  F32  [1152]
v.blk.5.ffn_up.weight  F16  [1152, 4304]
v.blk.5.ln1.weight  F32  [1152]
v.blk.5.ln2.weight  F32  [1152]
v.blk.6
v.blk.6.attn_k.weight  F16  [1152, 1152]
v.blk.6.attn_k_norm.weight  F32  [72]
v.blk.6.attn_out.weight  F16  [1152, 1152]
v.blk.6.attn_post_norm.weight  F32  [1152]
v.blk.6.attn_q.weight  F16  [1152, 1152]
v.blk.6.attn_q_norm.weight  F32  [72]
v.blk.6.attn_v.weight  F16  [1152, 1152]
v.blk.6.ffn_down.weight  F16  [4304, 1152]
v.blk.6.ffn_gate.weight  F16  [1152, 4304]
v.blk.6.ffn_post_norm.weight  F32  [1152]
v.blk.6.ffn_up.weight  F16  [1152, 4304]
v.blk.6.ln1.weight  F32  [1152]
v.blk.6.ln2.weight  F32  [1152]
v.blk.7
v.blk.7.attn_k.weight  F16  [1152, 1152]
v.blk.7.attn_k_norm.weight  F32  [72]
v.blk.7.attn_out.weight  F16  [1152, 1152]
v.blk.7.attn_post_norm.weight  F32  [1152]
v.blk.7.attn_q.weight  F16  [1152, 1152]
v.blk.7.attn_q_norm.weight  F32  [72]
v.blk.7.attn_v.weight  F16  [1152, 1152]
v.blk.7.ffn_down.weight  F16  [4304, 1152]
v.blk.7.ffn_gate.weight  F16  [1152, 4304]
v.blk.7.ffn_post_norm.weight  F32  [1152]
v.blk.7.ffn_up.weight  F16  [1152, 4304]
v.blk.7.ln1.weight  F32  [1152]
v.blk.7.ln2.weight  F32  [1152]
v.blk.8
v.blk.8.attn_k.weight  F16  [1152, 1152]
v.blk.8.attn_k_norm.weight  F32  [72]
v.blk.8.attn_out.weight  F16  [1152, 1152]
v.blk.8.attn_post_norm.weight  F32  [1152]
v.blk.8.attn_q.weight  F16  [1152, 1152]
v.blk.8.attn_q_norm.weight  F32  [72]
v.blk.8.attn_v.weight  F16  [1152, 1152]
v.blk.8.ffn_down.weight  F16  [4304, 1152]
v.blk.8.ffn_gate.weight  F16  [1152, 4304]
v.blk.8.ffn_post_norm.weight  F32  [1152]
v.blk.8.ffn_up.weight  F16  [1152, 4304]
v.blk.8.ln1.weight  F32  [1152]
v.blk.8.ln2.weight  F32  [1152]
v.blk.9
v.blk.9.attn_k.weight  F16  [1152, 1152]
v.blk.9.attn_k_norm.weight  F32  [72]
v.blk.9.attn_out.weight  F16  [1152, 1152]
v.blk.9.attn_post_norm.weight  F32  [1152]
v.blk.9.attn_q.weight  F16  [1152, 1152]
v.blk.9.attn_q_norm.weight  F32  [72]
v.blk.9.attn_v.weight  F16  [1152, 1152]
v.blk.9.ffn_down.weight  F16  [4304, 1152]
v.blk.9.ffn_gate.weight  F16  [1152, 4304]
v.blk.9.ffn_post_norm.weight  F32  [1152]
v.blk.9.ffn_up.weight  F16  [1152, 4304]
v.blk.9.ln1.weight  F32  [1152]
v.blk.9.ln2.weight  F32  [1152]
v.blk.10
v.blk.10.attn_k.weight  F16  [1152, 1152]
v.blk.10.attn_k_norm.weight  F32  [72]
v.blk.10.attn_out.weight  F16  [1152, 1152]
v.blk.10.attn_post_norm.weight  F32  [1152]
v.blk.10.attn_q.weight  F16  [1152, 1152]
v.blk.10.attn_q_norm.weight  F32  [72]
v.blk.10.attn_v.weight  F16  [1152, 1152]
v.blk.10.ffn_down.weight  F16  [4304, 1152]
v.blk.10.ffn_gate.weight  F16  [1152, 4304]
v.blk.10.ffn_post_norm.weight  F32  [1152]
v.blk.10.ffn_up.weight  F16  [1152, 4304]
v.blk.10.ln1.weight  F32  [1152]
v.blk.10.ln2.weight  F32  [1152]
v.blk.11
v.blk.11.attn_k.weight  F16  [1152, 1152]
v.blk.11.attn_k_norm.weight  F32  [72]
v.blk.11.attn_out.weight  F16  [1152, 1152]
v.blk.11.attn_post_norm.weight  F32  [1152]
v.blk.11.attn_q.weight  F16  [1152, 1152]
v.blk.11.attn_q_norm.weight  F32  [72]
v.blk.11.attn_v.weight  F16  [1152, 1152]
v.blk.11.ffn_down.weight  F16  [4304, 1152]
v.blk.11.ffn_gate.weight  F16  [1152, 4304]
v.blk.11.ffn_post_norm.weight  F32  [1152]
v.blk.11.ffn_up.weight  F16  [1152, 4304]
v.blk.11.ln1.weight  F32  [1152]
v.blk.11.ln2.weight  F32  [1152]
v.blk.12
v.blk.12.attn_k.weight  F16  [1152, 1152]
v.blk.12.attn_k_norm.weight  F32  [72]
v.blk.12.attn_out.weight  F16  [1152, 1152]
v.blk.12.attn_post_norm.weight  F32  [1152]
v.blk.12.attn_q.weight  F16  [1152, 1152]
v.blk.12.attn_q_norm.weight  F32  [72]
v.blk.12.attn_v.weight  F16  [1152, 1152]
v.blk.12.ffn_down.weight  F16  [4304, 1152]
v.blk.12.ffn_gate.weight  F16  [1152, 4304]
v.blk.12.ffn_post_norm.weight  F32  [1152]
v.blk.12.ffn_up.weight  F16  [1152, 4304]
v.blk.12.ln1.weight  F32  [1152]
v.blk.12.ln2.weight  F32  [1152]
v.blk.13
v.blk.13.attn_k.weight  F16  [1152, 1152]
v.blk.13.attn_k_norm.weight  F32  [72]
v.blk.13.attn_out.weight  F16  [1152, 1152]
v.blk.13.attn_post_norm.weight  F32  [1152]
v.blk.13.attn_q.weight  F16  [1152, 1152]
v.blk.13.attn_q_norm.weight  F32  [72]
v.blk.13.attn_v.weight  F16  [1152, 1152]
v.blk.13.ffn_down.weight  F16  [4304, 1152]
v.blk.13.ffn_gate.weight  F16  [1152, 4304]
v.blk.13.ffn_post_norm.weight  F32  [1152]
v.blk.13.ffn_up.weight  F16  [1152, 4304]
v.blk.13.ln1.weight  F32  [1152]
v.blk.13.ln2.weight  F32  [1152]
v.blk.14
v.blk.14.attn_k.weight  F16  [1152, 1152]
v.blk.14.attn_k_norm.weight  F32  [72]
v.blk.14.attn_out.weight  F16  [1152, 1152]
v.blk.14.attn_post_norm.weight  F32  [1152]
v.blk.14.attn_q.weight  F16  [1152, 1152]
v.blk.14.attn_q_norm.weight  F32  [72]
v.blk.14.attn_v.weight  F16  [1152, 1152]
v.blk.14.ffn_down.weight  F16  [4304, 1152]
v.blk.14.ffn_gate.weight  F16  [1152, 4304]
v.blk.14.ffn_post_norm.weight  F32  [1152]
v.blk.14.ffn_up.weight  F16  [1152, 4304]
v.blk.14.ln1.weight  F32  [1152]
v.blk.14.ln2.weight  F32  [1152]
v.blk.15
v.blk.15.attn_k.weight  F16  [1152, 1152]
v.blk.15.attn_k_norm.weight  F32  [72]
v.blk.15.attn_out.weight  F16  [1152, 1152]
v.blk.15.attn_post_norm.weight  F32  [1152]
v.blk.15.attn_q.weight  F16  [1152, 1152]
v.blk.15.attn_q_norm.weight  F32  [72]
v.blk.15.attn_v.weight  F16  [1152, 1152]
v.blk.15.ffn_down.weight  F16  [4304, 1152]
v.blk.15.ffn_gate.weight  F16  [1152, 4304]
v.blk.15.ffn_post_norm.weight  F32  [1152]
v.blk.15.ffn_up.weight  F16  [1152, 4304]
v.blk.15.ln1.weight  F32  [1152]
v.blk.15.ln2.weight  F32  [1152]
v.blk.16
v.blk.16.attn_k.weight  F16  [1152, 1152]
v.blk.16.attn_k_norm.weight  F32  [72]
v.blk.16.attn_out.weight  F16  [1152, 1152]
v.blk.16.attn_post_norm.weight  F32  [1152]
v.blk.16.attn_q.weight  F16  [1152, 1152]
v.blk.16.attn_q_norm.weight  F32  [72]
v.blk.16.attn_v.weight  F16  [1152, 1152]
v.blk.16.ffn_down.weight  F16  [4304, 1152]
v.blk.16.ffn_gate.weight  F16  [1152, 4304]
v.blk.16.ffn_post_norm.weight  F32  [1152]
v.blk.16.ffn_up.weight  F16  [1152, 4304]
v.blk.16.ln1.weight  F32  [1152]
v.blk.16.ln2.weight  F32  [1152]
v.blk.17
v.blk.17.attn_k.weight  F16  [1152, 1152]
v.blk.17.attn_k_norm.weight  F32  [72]
v.blk.17.attn_out.weight  F16  [1152, 1152]
v.blk.17.attn_post_norm.weight  F32  [1152]
v.blk.17.attn_q.weight  F16  [1152, 1152]
v.blk.17.attn_q_norm.weight  F32  [72]
v.blk.17.attn_v.weight  F16  [1152, 1152]
v.blk.17.ffn_down.weight  F16  [4304, 1152]
v.blk.17.ffn_gate.weight  F16  [1152, 4304]
v.blk.17.ffn_post_norm.weight  F32  [1152]
v.blk.17.ffn_up.weight  F16  [1152, 4304]
v.blk.17.ln1.weight  F32  [1152]
v.blk.17.ln2.weight  F32  [1152]
v.blk.18
v.blk.18.attn_k.weight  F16  [1152, 1152]
v.blk.18.attn_k_norm.weight  F32  [72]
v.blk.18.attn_out.weight  F16  [1152, 1152]
v.blk.18.attn_post_norm.weight  F32  [1152]
v.blk.18.attn_q.weight  F16  [1152, 1152]
v.blk.18.attn_q_norm.weight  F32  [72]
v.blk.18.attn_v.weight  F16  [1152, 1152]
v.blk.18.ffn_down.weight  F16  [4304, 1152]
v.blk.18.ffn_gate.weight  F16  [1152, 4304]
v.blk.18.ffn_post_norm.weight  F32  [1152]
v.blk.18.ffn_up.weight  F16  [1152, 4304]
v.blk.18.ln1.weight  F32  [1152]
v.blk.18.ln2.weight  F32  [1152]
v.blk.19
v.blk.19.attn_k.weight  F16  [1152, 1152]
v.blk.19.attn_k_norm.weight  F32  [72]
v.blk.19.attn_out.weight  F16  [1152, 1152]
v.blk.19.attn_post_norm.weight  F32  [1152]
v.blk.19.attn_q.weight  F16  [1152, 1152]
v.blk.19.attn_q_norm.weight  F32  [72]
v.blk.19.attn_v.weight  F16  [1152, 1152]
v.blk.19.ffn_down.weight  F16  [4304, 1152]
v.blk.19.ffn_gate.weight  F16  [1152, 4304]
v.blk.19.ffn_post_norm.weight  F32  [1152]
v.blk.19.ffn_up.weight  F16  [1152, 4304]
v.blk.19.ln1.weight  F32  [1152]
v.blk.19.ln2.weight  F32  [1152]
v.blk.20
v.blk.20.attn_k.weight  F16  [1152, 1152]
v.blk.20.attn_k_norm.weight  F32  [72]
v.blk.20.attn_out.weight  F16  [1152, 1152]
v.blk.20.attn_post_norm.weight  F32  [1152]
v.blk.20.attn_q.weight  F16  [1152, 1152]
v.blk.20.attn_q_norm.weight  F32  [72]
v.blk.20.attn_v.weight  F16  [1152, 1152]
v.blk.20.ffn_down.weight  F16  [4304, 1152]
v.blk.20.ffn_gate.weight  F16  [1152, 4304]
v.blk.20.ffn_post_norm.weight  F32  [1152]
v.blk.20.ffn_up.weight  F16  [1152, 4304]
v.blk.20.ln1.weight  F32  [1152]
v.blk.20.ln2.weight  F32  [1152]
v.blk.21
v.blk.21.attn_k.weight  F16  [1152, 1152]
v.blk.21.attn_k_norm.weight  F32  [72]
v.blk.21.attn_out.weight  F16  [1152, 1152]
v.blk.21.attn_post_norm.weight  F32  [1152]
v.blk.21.attn_q.weight  F16  [1152, 1152]
v.blk.21.attn_q_norm.weight  F32  [72]
v.blk.21.attn_v.weight  F16  [1152, 1152]
v.blk.21.ffn_down.weight  F16  [4304, 1152]
v.blk.21.ffn_gate.weight  F16  [1152, 4304]
v.blk.21.ffn_post_norm.weight  F32  [1152]
v.blk.21.ffn_up.weight  F16  [1152, 4304]
v.blk.21.ln1.weight  F32  [1152]
v.blk.21.ln2.weight  F32  [1152]
v.blk.22
v.blk.22.attn_k.weight  F16  [1152, 1152]
v.blk.22.attn_k_norm.weight  F32  [72]
v.blk.22.attn_out.weight  F16  [1152, 1152]
v.blk.22.attn_post_norm.weight  F32  [1152]
v.blk.22.attn_q.weight  F16  [1152, 1152]
v.blk.22.attn_q_norm.weight  F32  [72]
v.blk.22.attn_v.weight  F16  [1152, 1152]
v.blk.22.ffn_down.weight  F16  [4304, 1152]
v.blk.22.ffn_gate.weight  F16  [1152, 4304]
v.blk.22.ffn_post_norm.weight  F32  [1152]
v.blk.22.ffn_up.weight  F16  [1152, 4304]
v.blk.22.ln1.weight  F32  [1152]
v.blk.22.ln2.weight  F32  [1152]
v.blk.23
v.blk.23.attn_k.weight  F16  [1152, 1152]
v.blk.23.attn_k_norm.weight  F32  [72]
v.blk.23.attn_out.weight  F16  [1152, 1152]
v.blk.23.attn_post_norm.weight  F32  [1152]
v.blk.23.attn_q.weight  F16  [1152, 1152]
v.blk.23.attn_q_norm.weight  F32  [72]
v.blk.23.attn_v.weight  F16  [1152, 1152]
v.blk.23.ffn_down.weight  F16  [4304, 1152]
v.blk.23.ffn_gate.weight  F16  [1152, 4304]
v.blk.23.ffn_post_norm.weight  F32  [1152]
v.blk.23.ffn_up.weight  F16  [1152, 4304]
v.blk.23.ln1.weight  F32  [1152]
v.blk.23.ln2.weight  F32  [1152]
v.blk.24
v.blk.24.attn_k.weight  F16  [1152, 1152]
v.blk.24.attn_k_norm.weight  F32  [72]
v.blk.24.attn_out.weight  F16  [1152, 1152]
v.blk.24.attn_post_norm.weight  F32  [1152]
v.blk.24.attn_q.weight  F16  [1152, 1152]
v.blk.24.attn_q_norm.weight  F32  [72]
v.blk.24.attn_v.weight  F16  [1152, 1152]
v.blk.24.ffn_down.weight  F16  [4304, 1152]
v.blk.24.ffn_gate.weight  F16  [1152, 4304]
v.blk.24.ffn_post_norm.weight  F32  [1152]
v.blk.24.ffn_up.weight  F16  [1152, 4304]
v.blk.24.ln1.weight  F32  [1152]
v.blk.24.ln2.weight  F32  [1152]
v.blk.25
v.blk.25.attn_k.weight  F16  [1152, 1152]
v.blk.25.attn_k_norm.weight  F32  [72]
v.blk.25.attn_out.weight  F16  [1152, 1152]
v.blk.25.attn_post_norm.weight  F32  [1152]
v.blk.25.attn_q.weight  F16  [1152, 1152]
v.blk.25.attn_q_norm.weight  F32  [72]
v.blk.25.attn_v.weight  F16  [1152, 1152]
v.blk.25.ffn_down.weight  F16  [4304, 1152]
v.blk.25.ffn_gate.weight  F16  [1152, 4304]
v.blk.25.ffn_post_norm.weight  F32  [1152]
v.blk.25.ffn_up.weight  F16  [1152, 4304]
v.blk.25.ln1.weight  F32  [1152]
v.blk.25.ln2.weight  F32  [1152]
v.blk.26
v.blk.26.attn_k.weight  F16  [1152, 1152]
v.blk.26.attn_k_norm.weight  F32  [72]
v.blk.26.attn_out.weight  F16  [1152, 1152]
v.blk.26.attn_post_norm.weight  F32  [1152]
v.blk.26.attn_q.weight  F16  [1152, 1152]
v.blk.26.attn_q_norm.weight  F32  [72]
v.blk.26.attn_v.weight  F16  [1152, 1152]
v.blk.26.ffn_down.weight  F16  [4304, 1152]
v.blk.26.ffn_gate.weight  F16  [1152, 4304]
v.blk.26.ffn_post_norm.weight  F32  [1152]
v.blk.26.ffn_up.weight  F16  [1152, 4304]
v.blk.26.ln1.weight  F32  [1152]
v.blk.26.ln2.weight  F32  [1152]
v.patch_embd.weight  F16  [16, 16, 3, 1152]
v.position_embd.weight  F32  [1152, 10240, 2]
v.std_bias  F32  [1152]
v.std_scale  F32  [1152]
output_norm.weight  F32  [2816]