Update README.md
README.md
CHANGED
@@ -5,12 +5,14 @@ language:
 - vi
 - id
 - th
--
+- fil
 - ta
 - ms
 - km
 - lo
 - my
+- jv
+- su
 license: gemma
 ---
 # Gemma2 9B CPT SEA-LIONv3
@@ -30,10 +32,10 @@ The continued pre-training data for Gemma2 9B CPT SEA-LIONv3 base model encompas
 - **Developed by:** Products Pillar, AI Singapore
 - **Funded by:** Singapore NRF
 - **Model type:** Decoder
-- **Languages:** English, Chinese, Vietnamese, Indonesian, Thai,
+- **Languages:** English, Chinese, Vietnamese, Indonesian, Thai, Filipino, Tamil, Malay, Khmer, Lao, Burmese, Javanese, Sundanese
 - **License:** [Gemma Community License](https://ai.google.dev/gemma/terms)
 
-For tokenisation, the model employs the default tokenizer used in Gemma-2-9B.
+For tokenisation, the model employs the default tokenizer used in Gemma-2-9B. The model has a context length of 8192.
 
 ### Benchmark Performance
 We evaluated Gemma2 9B CPT SEA-LIONv3 base model on general language capabilities.
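The 8192-token context length documented in the hunk above has a practical consequence: longer inputs must be windowed to fit the budget. A minimal sketch (the windowing logic is illustrative, not from the model card; only the 8192 figure comes from it):

```python
# Illustrative only: splitting a token-id sequence into windows that
# fit the model's 8192-token context (value from the model card).
MAX_CONTEXT = 8192

def chunk_tokens(token_ids, max_len=MAX_CONTEXT):
    """Split a token-id sequence into context-sized windows."""
    return [token_ids[i:i + max_len] for i in range(0, len(token_ids), max_len)]

windows = chunk_tokens(list(range(20000)))
print([len(w) for w in windows])  # [8192, 8192, 3616]
```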
@@ -42,9 +44,9 @@ We evaluated Gemma2 9B CPT SEA-LIONv3 base model on general language capabilitie
 For the evaluation of general language capabilities, we employed the [SEA HELM (also known as BHASA) evaluation benchmark](https://arxiv.org/abs/2309.06085v2) across a variety of tasks.
 These tasks include Question Answering (QA), Sentiment Analysis (Sentiment), Toxicity Detection (Toxicity), Translation in both directions (Eng>Lang & Lang>Eng), Abstractive Summarization (Summ), Causal Reasoning (Causal) and Natural Language Inference (NLI).
 
-Note: SEA HELM is implemented using prompts
+Note: SEA HELM is implemented using prompts to elicit answers in a strict format. For all tasks, the model is expected to provide an answer tag from which the answer is automatically extracted. For tasks where options are provided, the answer should comprise one of the pre-defined options. The scores for each task is normalised to account for baseline performance due to random chance.
 
-The evaluation was done **five-shot** with native prompts
+The evaluation was done **five-shot** with native prompts on a sample of 100-1000 instances for each dataset.
 
 For more details on Gemma2 9B CPT SEA-LIONv3 base benchmark performance, please refer to the SEA HELM leaderboard, https://leaderboard.sea-lion.ai/
 
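The note in this hunk says scores are normalised to account for random-chance baseline performance. A hypothetical sketch of one such normalisation, where random chance maps to 0 and a perfect score to 100 (the exact SEA HELM formula may differ):

```python
# Hypothetical random-chance normalisation; illustrative, not SEA HELM's code.
def normalise(raw_acc, random_baseline):
    """Rescale accuracy so the random baseline maps to 0 and 1.0 maps to 100."""
    return max(0.0, (raw_acc - random_baseline) / (1.0 - random_baseline)) * 100

# A 4-option multiple-choice task has a 0.25 random baseline:
print(normalise(0.625, 0.25))  # 50.0
```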
@@ -78,8 +80,8 @@ Gemma2 9B CPT SEA-LIONv3 base model was continued pre-trained on 200B tokens of
 | SEA-LION Pile - Indonesian | 20.8 | 1 | 20.8 | 10.40 |
 | Wiki* + News* + WangChanBERTa - Thai | 1.3 | 4 | 5.2 | 2.60 |
 | SEA-LION Pile - Thai | 14.8 | 1 | 14.8 | 7.40 |
-| Wiki* + News -
-| SEA-LION Pile -
+| Wiki* + News - Filipino | 0.2 | 4 | 0.9 | 0.43 |
+| SEA-LION Pile - Filipino | 2.1 | 1 | 2.1 | 1.07 |
 | Wiki* + News - Tamil | 0.1 | 4 | 0.3 | 0.14 |
 | SEA-LION Pile - Tamil | 0.7 | 1 | 0.7 | 0.36 |
 | Wiki* + News - Malay | 0.1 | 4 | 0.6 | 0.29 |
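The table rows in this hunk follow a simple pattern: total tokens (B) = raw tokens (B) × multiplicity, and the percentage is taken against the 200B-token continued pre-training total. A quick check of the Thai rows (values copied from the table above):

```python
# Reproducing the data-mixture arithmetic: total = raw x multiplicity,
# percentage = total / 200B. Rows copied from the Thai table entries.
rows = [("Wiki* + News* + WangChanBERTa - Thai", 1.3, 4),
        ("SEA-LION Pile - Thai", 14.8, 1)]
TOTAL_B = 200.0  # 200B continued pre-training tokens

for name, raw_b, mult in rows:
    total = raw_b * mult
    pct = total / TOTAL_B * 100
    print(f"{name}: {total:.1f}B ({pct:.2f}%)")
```

This reproduces the tabled 5.2B (2.60%) and 14.8B (7.40%) figures.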
@@ -141,11 +143,10 @@ For more info, please contact us using this [SEA-LION Inquiry Form](https://form
 
 ## Disclaimer
 
-This the repository for the base model.
+This is the repository for the base model.
 The model has _not_ been aligned for safety.
 Developers and users should perform their own safety fine-tuning and related security measures.
-In no event shall the authors be held liable for any claim, damages, or other liability
-arising from the use of the released weights and codes.
+In no event shall the authors be held liable for any claim, damages, or other liability arising from the use of the released weights and codes.
 
 
 ## References