Update README.md
README.md
CHANGED
@@ -13,9 +13,9 @@ FLM-2-52B-Instruct utilizes the standard GPT-style decoder-only transformer arch
 * Embedding and language model head untied
 * Input and output multiplier

-| Models
-| -------------
-| FLM-2-52B-Instruct-2407
+| Models | layer<br>number | attention<br>heads | hidden<br>size | ffn hidden<br>size | vocab<br>size | params<br>count |
+| ------------- | :-------------: | :----------------: | :------------: | :----------------: | :-----------: | :--------------: |
+| FLM-2-52B-Instruct-2407 | 64 | 64 | 8,192 | 21,824 | 80,000 | 52.85 B |

 # Training details
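The figures in the restored table row can be sanity-checked against the reported parameter count. The sketch below is a rough estimate, not the model's actual accounting. It assumes standard multi-head attention with four `hidden x hidden` projections, a gated feed-forward block with three weight matrices, no bias terms, and it ignores normalization parameters. None of these assumptions is stated in the README; only the untied embedding and language model head come from the text. The function name `estimate_params` is hypothetical.

```python
# Figures taken from the table row added in this diff.
LAYERS = 64
HIDDEN = 8_192
FFN_HIDDEN = 21_824
VOCAB = 80_000
REPORTED_PARAMS = 52.85e9


def estimate_params(gated_ffn: bool = True) -> int:
    """Rough weight count for a decoder-only transformer.

    Assumptions (not confirmed by the README): no biases, no
    normalization parameters, full multi-head attention.
    """
    # Untied input embedding and language model head (stated in the README).
    embeddings = 2 * VOCAB * HIDDEN
    # Query, key, value, and output projections per layer (assumed).
    attention = 4 * HIDDEN * HIDDEN
    # A gated FFN has three weight matrices, a plain FFN has two (assumed).
    ffn_matrices = 3 if gated_ffn else 2
    ffn = ffn_matrices * HIDDEN * FFN_HIDDEN
    return embeddings + LAYERS * (attention + ffn)


estimate = estimate_params()
relative_gap = abs(estimate - REPORTED_PARAMS) / REPORTED_PARAMS
print(f"estimated: {estimate / 1e9:.2f} B, relative gap: {relative_gap:.2%}")
```

Under these assumptions the estimate falls within 1% of the reported 52.85 B. Calling `estimate_params(gated_ffn=False)` gives a much lower count, which suggests, but does not prove, that the feed-forward block is gated.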