Commit 31021ed, committed by ahmetustun and viraat
Parent: 2d84ea7

Add new examples to model card (#3)


- Add hindi and turkish examples for generate (47df30604000b54d174aae0e12154c94e0b105cc)


Co-authored-by: Viraat Aryabumi

Files changed (1)
  1. README.md +19 -12
README.md CHANGED
@@ -115,7 +115,7 @@ metrics:

  <img src="aya-fig1.png" alt="Aya model summary image" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>

- # Model Card for Aya Model

  ## Model Summary

@@ -128,7 +128,7 @@ metrics:
  - **Developed by:** Cohere For AI
  - **Model type:** a Transformer style autoregressive massively multilingual language model.
  - **Paper**: [Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model](arxiv.com)
- - **Point of Contact**: [Ahmet Ustun](mailto:ahmet@cohere.com)
  - **Languages**: Refer to the list of languages in the `language` section of this model card.
  - **License**: Apache-2.0
  - **Model**: [Aya](https://huggingface.co/CohereForAI/aya)
@@ -141,22 +141,31 @@ metrics:
  # pip install -q transformers
  from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

- checkpoint = "CohereForAI/aya_model"

  tokenizer = AutoTokenizer.from_pretrained(checkpoint)
  aya_model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

- inputs = tokenizer.encode("Translate to English: Je t’aime.", return_tensors="pt")
- outputs = aya_model.generate(inputs)
- print(tokenizer.decode(outputs[0]))
  ```

  ## Model Details

- ### Training

  - Architecture: Same as [mt5-xxl](https://huggingface.co/google/mt5-xxl)
- - Number of Finetuning Samples: 25M
  - Batch size: 256
  - Hardware: TPUv4-128
  - Software: T5X, Jax
@@ -190,15 +199,13 @@ We hope that the release of the Aya model will make community-based redteaming e

  ```
  @article{,
- title={},
- author={},
  journal={Preprint},
  year={2024}
  }
  ```

- **APA:**
-
  ## Languages Covered

  Below is the list of languages used in finetuning the Aya Model. We group languages into higher-, mid-, and lower-resourcedness based on a language classification by [Joshi et al., 2020](https://microsoft.github.io/linguisticdiversity/). For further details, refer to our [paper]()
 

  <img src="aya-fig1.png" alt="Aya model summary image" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>

+ # Model Card for Aya 101

  ## Model Summary

 
  - **Developed by:** Cohere For AI
  - **Model type:** a Transformer style autoregressive massively multilingual language model.
  - **Paper**: [Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model](arxiv.com)
+ - **Point of Contact**: Cohere For AI: [cohere.for.ai](cohere.for.ai)
  - **Languages**: Refer to the list of languages in the `language` section of this model card.
  - **License**: Apache-2.0
  - **Model**: [Aya](https://huggingface.co/CohereForAI/aya)
 
  # pip install -q transformers
  from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

+ checkpoint = "CohereForAI/aya-101"

  tokenizer = AutoTokenizer.from_pretrained(checkpoint)
  aya_model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

+ # Turkish to English translation
+ tur_inputs = tokenizer.encode("Translate to English: Aya cok dilli bir dil modelidir.", return_tensors="pt")
+ tur_outputs = aya_model.generate(tur_inputs, max_new_tokens=128)
+ print(tokenizer.decode(tur_outputs[0]))
+ # Aya is a multi-lingual language model
+
+ # Q: Why are there so many languages in India?
+ hin_inputs = tokenizer.encode("भारत में इतनी सारी भाषाएँ क्यों हैं?", return_tensors="pt")
+ hin_outputs = aya_model.generate(hin_inputs, max_new_tokens=128)
+ print(tokenizer.decode(hin_outputs[0]))
+ # Expected output: भारत में कई भाषाएँ हैं और विभिन्न भाषाओं के बोली जाने वाले लोग हैं। यह विभिन्नता भाषाई विविधता और सांस्कृतिक विविधता का परिणाम है। Translates to "India has many languages and people speaking different languages. This diversity is the result of linguistic diversity and cultural diversity."
+
  ```
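
The encode, generate, decode pattern in the example above can be wrapped in a small helper so that the ids passed to `generate` are always the ones produced from the prompt. This is a minimal sketch and not part of the model card: `generate_text` is a hypothetical name, and it relies only on the `encode`, `generate`, and `decode` calls already used in the example, plus the `skip_special_tokens` argument of `decode`.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=128):
    """Encode one prompt, generate, and decode the first returned sequence.

    `generate_text` is a hypothetical helper name used for illustration.
    """
    # Keep the encoded prompt in a local name so generate() cannot
    # accidentally receive ids that belong to a different prompt.
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # skip_special_tokens drops markers such as padding and end-of-sequence.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With `tokenizer` and `aya_model` loaded as in the example, `generate_text(aya_model, tokenizer, "Translate to English: Aya cok dilli bir dil modelidir.")` would return the decoded string instead of printing it.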

  ## Model Details

+ ### Finetuning

  - Architecture: Same as [mt5-xxl](https://huggingface.co/google/mt5-xxl)
+ - Number of Samples seen during Finetuning: 25M
  - Batch size: 256
  - Hardware: TPUv4-128
  - Software: T5X, Jax
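
As a rough reading of the figures in this list, the sample count and batch size imply an approximate number of update steps. The sketch below is a back-of-the-envelope calculation only: it assumes that "25M" means exactly 25,000,000 samples and that every step consumes one full batch of 256. Neither assumption is stated in the model card, and `implied_steps` is a hypothetical helper name.

```python
# Assumptions (not from the model card): "25M" is exactly 25,000,000 samples
# and every update step consumes one full batch.
SAMPLES_SEEN = 25_000_000
BATCH_SIZE = 256


def implied_steps(samples_seen: int, batch_size: int) -> int:
    """Number of full batches that fit in `samples_seen` samples."""
    return samples_seen // batch_size


steps = implied_steps(SAMPLES_SEEN, BATCH_SIZE)
print(steps)  # 97656
```

Under these assumptions the run corresponds to roughly 97.7K steps. The actual step count may differ if the reported sample count is rounded.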
 

  ```
  @article{,
+ title={Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model},
+ author={Ahmet Üstün and Viraat Aryabumi and Zheng-Xin Yong and Wei-Yin Ko and Daniel D'souza and Gbemileke Onilude and Neel Bhandari and Shivalika Singh and Hui-Lee Ooi and Amr Kayid and Freddie Vargus and Phil Blunsom and Shayne Longpre and Niklas Muennighoff and Marzieh Fadaee and Julia Kreutzer and Sara Hooker},
  journal={Preprint},
  year={2024}
  }
  ```

  ## Languages Covered

  Below is the list of languages used in finetuning the Aya Model. We group languages into higher-, mid-, and lower-resourcedness based on a language classification by [Joshi et al., 2020](https://microsoft.github.io/linguisticdiversity/). For further details, refer to our [paper]()