tokenizer.model seems to be empty?

#1
by ggerganov - opened

Just 131 bytes - does not look right

tokenizer.model appears to be a Git LFS pointer that was checked in as a plain text file. This is clearly wrong.

It seems to be the same file as the Llama 2 tokenizer; at least the hash recorded in the pointer matches:
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
https://huggingface.co/meta-llama/Llama-2-7b/blob/main/tokenizer.model
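
A minimal sketch (assuming the file sits in the current directory) to check whether a local tokenizer.model is an LFS pointer and whether its SHA-256 matches the oid above:

```python
import hashlib
from pathlib import Path

EXPECTED_OID = "9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347"
path = Path("tokenizer.model")  # assumed local path to the downloaded file

data = path.read_bytes()

# A Git LFS pointer is a tiny text file that starts with this header
# and lists the oid and size of the real blob it stands in for.
if data.startswith(b"version https://git-lfs.github.com/spec/v1"):
    print("This is an LFS pointer, not the real tokenizer:")
    print(data.decode())
else:
    digest = hashlib.sha256(data).hexdigest()
    print("sha256:", digest)
    print("matches the Llama 2 tokenizer" if digest == EXPECTED_OID else "hash differs")
```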

Update: tested and it works. It is indeed the Llama 2 tokenizer.model.

Edit: Q8_0 wikitext perplexity currently comes out to 354.0950 +/- 2.40774, which sounds about right.
f32: 352.8210 +/- 2.39951
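
(For reference, the perplexity reported here is exp of the mean negative log-likelihood per token over the evaluation text; a tiny illustrative sketch, not the llama.cpp implementation:)

```python
import math

def perplexity(token_nlls):
    # perplexity = exp(mean negative log-likelihood per token)
    return math.exp(sum(token_nlls) / len(token_nlls))

# toy check: a model assigning ~1/354 probability to every token
# lands right around the figure reported above
print(perplexity([math.log(354.0)] * 1000))  # ~354.0
```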

$ bin/main -m ../models/TinyLlama-1.1B-step-50K-105b/ggml-model-f32.gguf -t 10 -p "The meaning of life"
Log start
main: build = 1173 (e4386f4)
main: seed  = 1693823704
llama_model_loader: loaded meta data with 17 key-value pairs and 201 tensors from ../models/TinyLlama-1.1B-step-50K-105b/ggml-model-f32.gguf (version GGUF V2 (latest))
llama_model_loader: - tensor    0:                    output.weight f32      [  2048, 32000,     1,     1 ]
llama_model_loader: - tensor    1:                token_embd.weight f32      [  2048, 32000,     1,     1 ]
llama_model_loader: - tensor    2:           blk.0.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor    3:              blk.0.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor    4:              blk.0.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor    5:              blk.0.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor    6:         blk.0.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor    7:            blk.0.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor    8:            blk.0.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor    9:              blk.0.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   10:            blk.0.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor   11:           blk.1.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   12:              blk.1.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   13:              blk.1.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   14:              blk.1.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   15:         blk.1.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   16:            blk.1.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   17:            blk.1.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   18:              blk.1.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   19:            blk.1.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor   20:           blk.2.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   21:              blk.2.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   22:              blk.2.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   23:              blk.2.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   24:         blk.2.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   25:            blk.2.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   26:            blk.2.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   27:              blk.2.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   28:            blk.2.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor   29:           blk.3.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   30:              blk.3.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   31:              blk.3.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   32:              blk.3.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   33:         blk.3.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   34:            blk.3.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   35:            blk.3.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   36:              blk.3.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   37:            blk.3.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor   38:           blk.4.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   39:              blk.4.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   40:              blk.4.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   41:              blk.4.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   42:         blk.4.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   43:            blk.4.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   44:            blk.4.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   45:              blk.4.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   46:            blk.4.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor   47:           blk.5.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   48:              blk.5.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   49:              blk.5.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   50:              blk.5.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   51:         blk.5.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   52:            blk.5.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   53:            blk.5.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   54:              blk.5.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   55:            blk.5.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor   56:           blk.6.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   57:              blk.6.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   58:              blk.6.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   59:              blk.6.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   60:         blk.6.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   61:            blk.6.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   62:            blk.6.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   63:              blk.6.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   64:            blk.6.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor   65:           blk.7.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   66:              blk.7.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   67:              blk.7.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   68:              blk.7.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   69:         blk.7.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   70:            blk.7.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   71:            blk.7.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   72:              blk.7.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   73:            blk.7.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor   74:           blk.8.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   75:              blk.8.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   76:              blk.8.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   77:              blk.8.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   78:         blk.8.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   79:            blk.8.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   80:            blk.8.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   81:              blk.8.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   82:            blk.8.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor   83:           blk.9.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   84:              blk.9.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   85:              blk.9.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   86:              blk.9.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   87:         blk.9.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   88:            blk.9.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   89:            blk.9.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   90:              blk.9.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   91:            blk.9.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor   92:          blk.10.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   93:             blk.10.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   94:             blk.10.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   95:             blk.10.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor   96:        blk.10.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor   97:           blk.10.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor   98:           blk.10.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor   99:             blk.10.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  100:           blk.10.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor  101:          blk.11.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  102:             blk.11.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  103:             blk.11.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  104:             blk.11.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  105:        blk.11.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  106:           blk.11.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  107:           blk.11.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  108:             blk.11.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  109:           blk.11.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor  110:          blk.12.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  111:             blk.12.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  112:             blk.12.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  113:             blk.12.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  114:        blk.12.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  115:           blk.12.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  116:           blk.12.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  117:             blk.12.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  118:           blk.12.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor  119:          blk.13.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  120:             blk.13.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  121:             blk.13.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  122:             blk.13.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  123:        blk.13.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  124:           blk.13.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  125:           blk.13.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  126:             blk.13.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  127:           blk.13.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor  128:          blk.14.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  129:             blk.14.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  130:             blk.14.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  131:             blk.14.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  132:        blk.14.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  133:           blk.14.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  134:           blk.14.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  135:             blk.14.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  136:           blk.14.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor  137:          blk.15.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  138:             blk.15.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  139:             blk.15.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  140:             blk.15.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  141:        blk.15.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  142:           blk.15.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  143:           blk.15.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  144:             blk.15.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  145:           blk.15.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor  146:          blk.16.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  147:             blk.16.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  148:             blk.16.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  149:             blk.16.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  150:        blk.16.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  151:           blk.16.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  152:           blk.16.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  153:             blk.16.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  154:           blk.16.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor  155:          blk.17.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  156:             blk.17.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  157:             blk.17.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  158:             blk.17.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  159:        blk.17.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  160:           blk.17.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  161:           blk.17.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  162:             blk.17.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  163:           blk.17.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor  164:          blk.18.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  165:             blk.18.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  166:             blk.18.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  167:             blk.18.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  168:        blk.18.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  169:           blk.18.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  170:           blk.18.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  171:             blk.18.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  172:           blk.18.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor  173:          blk.19.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  174:             blk.19.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  175:             blk.19.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  176:             blk.19.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  177:        blk.19.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  178:           blk.19.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  179:           blk.19.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  180:             blk.19.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  181:           blk.19.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor  182:          blk.20.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  183:             blk.20.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  184:             blk.20.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  185:             blk.20.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  186:        blk.20.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  187:           blk.20.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  188:           blk.20.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  189:             blk.20.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  190:           blk.20.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor  191:          blk.21.attn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  192:             blk.21.attn_q.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  193:             blk.21.attn_k.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  194:             blk.21.attn_v.weight f32      [  2048,   256,     1,     1 ]
llama_model_loader: - tensor  195:        blk.21.attn_output.weight f32      [  2048,  2048,     1,     1 ]
llama_model_loader: - tensor  196:           blk.21.ffn_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - tensor  197:           blk.21.ffn_gate.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  198:             blk.21.ffn_up.weight f32      [  2048,  5632,     1,     1 ]
llama_model_loader: - tensor  199:           blk.21.ffn_down.weight f32      [  5632,  2048,     1,     1 ]
llama_model_loader: - tensor  200:               output_norm.weight f32      [  2048,     1,     1,     1 ]
llama_model_loader: - kv   0:                       general.architecture str
llama_model_loader: - kv   1:                               general.name str
llama_model_loader: - kv   2:                       llama.context_length u32
llama_model_loader: - kv   3:                     llama.embedding_length u32
llama_model_loader: - kv   4:                          llama.block_count u32
llama_model_loader: - kv   5:                  llama.feed_forward_length u32
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32
llama_model_loader: - kv   7:                 llama.attention.head_count u32
llama_model_loader: - kv   8:              llama.attention.head_count_kv u32
llama_model_loader: - kv   9:     llama.attention.layer_norm_rms_epsilon f32
llama_model_loader: - kv  10:                       tokenizer.ggml.model str
llama_model_loader: - kv  11:                      tokenizer.ggml.tokens arr
llama_model_loader: - kv  12:                      tokenizer.ggml.scores arr
llama_model_loader: - kv  13:                  tokenizer.ggml.token_type arr
llama_model_loader: - kv  14:                tokenizer.ggml.bos_token_id u32
llama_model_loader: - kv  15:                tokenizer.ggml.eos_token_id u32
llama_model_loader: - kv  16:            tokenizer.ggml.unknown_token_id u32
llama_model_loader: - type  f32:  201 tensors
llm_load_print_meta: format         = GGUF V2 (latest)
llm_load_print_meta: arch           = llama
llm_load_print_meta: vocab type     = SPM
llm_load_print_meta: n_vocab        = 32000
llm_load_print_meta: n_merges       = 0
llm_load_print_meta: n_ctx_train    = 2048
llm_load_print_meta: n_ctx          = 512
llm_load_print_meta: n_embd         = 2048
llm_load_print_meta: n_head         = 32
llm_load_print_meta: n_head_kv      = 4
llm_load_print_meta: n_layer        = 22
llm_load_print_meta: n_rot          = 64
llm_load_print_meta: n_gqa          = 8
llm_load_print_meta: f_norm_eps     = 1,0e-05
llm_load_print_meta: f_norm_rms_eps = 1,0e-05
llm_load_print_meta: n_ff           = 5632
llm_load_print_meta: freq_base      = 10000,0
llm_load_print_meta: freq_scale     = 1
llm_load_print_meta: model type     = ?B
llm_load_print_meta: model ftype    = all F32 (guessed)
llm_load_print_meta: model size     = 1,10 B
llm_load_print_meta: general.name   = models
llm_load_print_meta: BOS token = 1 '<s>'
llm_load_print_meta: EOS token = 2 '</s>'
llm_load_print_meta: UNK token = 0 '<unk>'
llm_load_print_meta: LF token  = 13 '<0x0A>'
llm_load_tensors: ggml ctx size =    0,06 MB
llm_load_tensors: mem required  = 4196,42 MB (+   11,00 MB per state)
...........................................................................................
llama_new_context_with_model: kv self size  =   11,00 MB
llama_new_context_with_model: compute buffer total size =   67,97 MB

system_info: n_threads = 10 / 24 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
sampling: repeat_last_n = 64, repeat_penalty = 1,100000, presence_penalty = 0,000000, frequency_penalty = 0,000000, top_k = 40, tfs_z = 1,000000, top_p = 0,950000, typical_p = 1,000000, temp = 0,800000, mirostat = 0, mirostat_lr = 0,100000, mirostat_ent = 5,000000
generate: n_ctx = 512, n_batch = 512, n_predict = -1, n_keep = 0


 The meaning of life and the meaning to be to be able to exist. I can say to have any knowledge and has the the opportunity to come in the experience a few days, a little time as long as far.
the most probably was to me. The most and the most difficult one of in, you. To have been with them, the next that it is not a lot on the fact so and would think the more than the other is
TinyLlama org

Thanks for spotting this! I guess Hugging Face might substitute the tokenizer with a symlink to save storage. I can successfully load it with AutoModel.from_pretrained(), but it might not work with git clone. Will update the file!
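
(If git clone only fetches the pointer file, running git lfs pull inside the clone, or downloading through huggingface_hub, resolves the real blob. A minimal sketch, assuming the repo id TinyLlama/TinyLlama-1.1B-step-50K-105b:)

```python
from huggingface_hub import hf_hub_download

# hf_hub_download resolves Git LFS objects to their actual content;
# the repo id is assumed from the model directory name in the log above.
path = hf_hub_download(
    repo_id="TinyLlama/TinyLlama-1.1B-step-50K-105b",
    filename="tokenizer.model",
)
print(path)
```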

TinyLlama org

I have uploaded the correct tokenizer and will close this issue. Feel free to reopen it if the problem persists.

PY007 changed discussion status to closed
