Files changed (1)
  1. README.md +89 -75
README.md CHANGED
@@ -69,7 +69,7 @@ language:
  - my
  - ne
  - nl
- - no
+ - 'no'
  - ny
  - pa
  - pl
@@ -108,65 +108,78 @@ language:
  tags:
  - text2text-generation
  widget:
- - text: "<table>
- <tr>
- <th>Name</th>
- <th>Explanation</th>
- <th>Example models</th>
- </tr>
- <tr>
- <td><a href=https://huggingface.co/datasets/bigscience/xP3>xP3</a></t>
- <td>Mixture of 13 training tasks in 46 languages with English prompts</td>
- <td><a href=https://huggingface.co/bigscience/bloomz>bloomz</a> & <a href=https://huggingface.co/bigscience/mt0-xxl>mt0-xxl</a></td>
- </tr>
- <tr>
- <td><a href=https://huggingface.co/datasets/bigscience/xP3mt>xP3mt</a></t>
- <td>Mixture of 13 training tasks in 46 languages with prompts in 20 languages (machine-translated from English)</td>
- <td><a href=https://huggingface.co/bigscience/bloomz-mt>bloomz-mt</a> & <a href=https://huggingface.co/bigscience/mt0-xxl-mt>mt0-xxl-mt</a></td>
- </tr>
- <tr>
- <td><a href=https://huggingface.co/datasets/bigscience/xP3all>xP3all</a></t>
- <td>xP3 + our evaluation datasets adding an additional 3 tasks for a total of 16 tasks in 46 languages with English prompts</td>
- <td></td>
- </tr>
- <tr>
- <td><a href=https://huggingface.co/datasets/bigscience/xP3megds>xP3megds</a></t>
- <td><a href=https://github.com/bigscience-workshop/Megatron-DeepSpeed>Megatron-DeepSpeed</a> processed version of xP3</td>
- <td><a href=https://huggingface.co/bigscience/bloomz>bloomz</a></td>
- </tr>
- <tr>
- <td><a href=https://huggingface.co/datasets/Muennighoff/P3>P3</a></t>
- <td>Repreprocessed version of the English-only <a href=https://huggingface.co/datasets/bigscience/P3>P3</a> with 8 training tasks</td>
- <td><a href=https://huggingface.co/bigscience/bloomz-p3>bloomz-p3</a> & <a href=https://huggingface.co/bigscience/mt0-xxl-p3>mt0-xxl-p3</a></td>
- </tr>
- </table>
- Which dataset has the most tasks?"
- example_title: "en-en struct-to-text"
- - text: "Life is beautiful! Translate to Mongolian."
- example_title: "mn-en translation"
- - text: "Le mot japonais «憂鬱» veut dire quoi en Odia?"
- example_title: "jp-or-fr translation"
- - text: "Stell mir eine schwierige Quiz Frage bei der es um Astronomie geht. Bitte stell die Frage auf Norwegisch."
- example_title: "de-nb quiz"
- - text: "We present BLOOMZ & mT0, a family of models capable of following human instructions in dozens of languages zero-shot. We finetune BLOOM & mT5 pretrained multilingual language models on our crosslingual task mixture (xP3) and find our resulting models capable of crosslingual generalization to unseen tasks & languages.
- What are the keywords in Chinese?"
- example_title: "zh-en keywords"
- - text: "一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。Would you rate the previous review as positive, neutral or negative?"
- example_title: "zh-en sentiment"
- - text: "一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。你认为这句话的立场是赞扬、中立还是批评?"
- example_title: "zh-zh sentiment"
- - text: "Suggest at least five related search terms to \"Mạng neural nhân tạo\"."
- example_title: "vi-en query"
- - text: "Proposez au moins cinq mots clés concernant «Réseau de neurones artificiels»."
- example_title: "fr-fr query"
- - text: "Explain in a sentence in Telugu what is backpropagation in neural networks."
- example_title: "te-en qa"
- - text: "Why is the sky blue?"
- example_title: "en-en qa"
- - text: "Write a fairy tale about a troll saving a princess from a dangerous dragon. The fairy tale is a masterpiece that has achieved praise worldwide and its moral is \"Heroes Come in All Shapes and Sizes\". Story (in Spanish):"
- example_title: "es-en fable"
- - text: "Write a fable about wood elves living in a forest that is suddenly invaded by ogres. The fable is a masterpiece that has achieved praise worldwide and its moral is \"Violence is the last refuge of the incompetent\". Fable (in Hindi):"
- example_title: "hi-en fable"
+ - text: >-
+ <table> <tr> <th>Name</th> <th>Explanation</th> <th>Example models</th>
+ </tr> <tr> <td><a
+ href=https://huggingface.co/datasets/bigscience/xP3>xP3</a></t> <td>Mixture
+ of 13 training tasks in 46 languages with English prompts</td> <td><a
+ href=https://huggingface.co/bigscience/bloomz>bloomz</a> & <a
+ href=https://huggingface.co/bigscience/mt0-xxl>mt0-xxl</a></td> </tr> <tr>
+ <td><a href=https://huggingface.co/datasets/bigscience/xP3mt>xP3mt</a></t>
+ <td>Mixture of 13 training tasks in 46 languages with prompts in 20
+ languages (machine-translated from English)</td> <td><a
+ href=https://huggingface.co/bigscience/bloomz-mt>bloomz-mt</a> & <a
+ href=https://huggingface.co/bigscience/mt0-xxl-mt>mt0-xxl-mt</a></td> </tr>
+ <tr> <td><a
+ href=https://huggingface.co/datasets/bigscience/xP3all>xP3all</a></t>
+ <td>xP3 + our evaluation datasets adding an additional 3 tasks for a total
+ of 16 tasks in 46 languages with English prompts</td> <td></td> </tr> <tr>
+ <td><a
+ href=https://huggingface.co/datasets/bigscience/xP3megds>xP3megds</a></t>
+ <td><a
+ href=https://github.com/bigscience-workshop/Megatron-DeepSpeed>Megatron-DeepSpeed</a>
+ processed version of xP3</td> <td><a
+ href=https://huggingface.co/bigscience/bloomz>bloomz</a></td> </tr> <tr>
+ <td><a href=https://huggingface.co/datasets/Muennighoff/P3>P3</a></t>
+ <td>Repreprocessed version of the English-only <a
+ href=https://huggingface.co/datasets/bigscience/P3>P3</a> with 8 training
+ tasks</td> <td><a
+ href=https://huggingface.co/bigscience/bloomz-p3>bloomz-p3</a> & <a
+ href=https://huggingface.co/bigscience/mt0-xxl-p3>mt0-xxl-p3</a></td> </tr>
+ </table> Which dataset has the most tasks?
+ example_title: en-en struct-to-text
+ - text: Life is beautiful! Translate to Mongolian.
+ example_title: mn-en translation
+ - text: Le mot japonais «憂鬱» veut dire quoi en Odia?
+ example_title: jp-or-fr translation
+ - text: >-
+ Stell mir eine schwierige Quiz Frage bei der es um Astronomie geht. Bitte
+ stell die Frage auf Norwegisch.
+ example_title: de-nb quiz
+ - text: >-
+ We present BLOOMZ & mT0, a family of models capable of following human
+ instructions in dozens of languages zero-shot. We finetune BLOOM & mT5
+ pretrained multilingual language models on our crosslingual task mixture
+ (xP3) and find our resulting models capable of crosslingual generalization
+ to unseen tasks & languages. What are the keywords in Chinese?
+ example_title: zh-en keywords
+ - text: >-
+ 一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。Would you rate the previous
+ review as positive, neutral or negative?
+ example_title: zh-en sentiment
+ - text: 一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。你认为这句话的立场是赞扬、中立还是批评?
+ example_title: zh-zh sentiment
+ - text: Suggest at least five related search terms to "Mạng neural nhân tạo".
+ example_title: vi-en query
+ - text: >-
+ Proposez au moins cinq mots clés concernant «Réseau de neurones
+ artificiels».
+ example_title: fr-fr query
+ - text: Explain in a sentence in Telugu what is backpropagation in neural networks.
+ example_title: te-en qa
+ - text: Why is the sky blue?
+ example_title: en-en qa
+ - text: >-
+ Write a fairy tale about a troll saving a princess from a dangerous dragon.
+ The fairy tale is a masterpiece that has achieved praise worldwide and its
+ moral is "Heroes Come in All Shapes and Sizes". Story (in Spanish):
+ example_title: es-en fable
+ - text: >-
+ Write a fable about wood elves living in a forest that is suddenly invaded
+ by ogres. The fable is a masterpiece that has achieved praise worldwide and
+ its moral is "Violence is the last refuge of the incompetent". Fable (in
+ Hindi):
+ example_title: hi-en fable
  model-index:
  - name: mt0-xxl
  results:
@@ -268,7 +281,7 @@ model-index:
  revision: 9dbd830a06fea8b1c49d6e5ef2004a08d9f45094
  metrics:
  - type: Accuracy
- value: 43.0
+ value: 43
  - task:
  type: Natural language inference
  dataset:
@@ -345,7 +358,7 @@ model-index:
  revision: a5a45e4ff92d5d3f34de70aaf4b72c3bdf9f7f16
  metrics:
  - type: Accuracy
- value: 59.0
+ value: 59
  - task:
  type: Natural language inference
  dataset:
@@ -472,7 +485,7 @@ model-index:
  dataset:
  type: story_cloze
  name: StoryCloze (2016)
- config: "2016"
+ config: '2016'
  split: validation
  revision: e724c6f8cdf7c7a2fb229d862226e15b023ee4db
  metrics:
@@ -488,7 +501,7 @@ model-index:
  revision: 9e12063561e7e6c79099feb6d5a493142584e9e2
  metrics:
  - type: Accuracy
- value: 93.0
+ value: 93
  - task:
  type: Sentence completion
  dataset:
@@ -499,7 +512,7 @@ model-index:
  revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187
  metrics:
  - type: Accuracy
- value: 79.0
+ value: 79
  - task:
  type: Sentence completion
  dataset:
@@ -510,7 +523,7 @@ model-index:
  revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187
  metrics:
  - type: Accuracy
- value: 81.0
+ value: 81
  - task:
  type: Sentence completion
  dataset:
@@ -521,7 +534,7 @@ model-index:
  revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187
  metrics:
  - type: Accuracy
- value: 92.0
+ value: 92
  - task:
  type: Sentence completion
  dataset:
@@ -532,7 +545,7 @@ model-index:
  revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187
  metrics:
  - type: Accuracy
- value: 90.0
+ value: 90
  - task:
  type: Sentence completion
  dataset:
@@ -543,7 +556,7 @@ model-index:
  revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187
  metrics:
  - type: Accuracy
- value: 59.0
+ value: 59
  - task:
  type: Sentence completion
  dataset:
@@ -554,7 +567,7 @@ model-index:
  revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187
  metrics:
  - type: Accuracy
- value: 79.0
+ value: 79
  - task:
  type: Sentence completion
  dataset:
@@ -565,7 +578,7 @@ model-index:
  revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187
  metrics:
  - type: Accuracy
- value: 84.0
+ value: 84
  - task:
  type: Sentence completion
  dataset:
@@ -576,7 +589,7 @@ model-index:
  revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187
  metrics:
  - type: Accuracy
- value: 77.0
+ value: 77
  - task:
  type: Sentence completion
  dataset:
@@ -587,7 +600,7 @@ model-index:
  revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187
  metrics:
  - type: Accuracy
- value: 79.0
+ value: 79
  - task:
  type: Sentence completion
  dataset:
@@ -598,7 +611,7 @@ model-index:
  revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187
  metrics:
  - type: Accuracy
- value: 88.0
+ value: 88
  - task:
  type: Sentence completion
  dataset:
@@ -609,7 +622,7 @@ model-index:
  revision: 37f73c60fb123111fa5af5f9b705d0b3747fd187
  metrics:
  - type: Accuracy
- value: 89.0
+ value: 89
  - task:
  type: Sentence completion
  dataset:
@@ -720,6 +733,7 @@ model-index:
  metrics:
  - type: Accuracy
  value: 93.85
+ pipeline_tag: text2text-generation
  ---

  ![xmtf](https://github.com/bigscience-workshop/xmtf/blob/master/xmtf_banner.png?raw=true)
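
The metadata edits above mostly quote scalars that a YAML 1.1 parser would otherwise coerce: an unquoted `no` in the `language` list loads as the boolean `false` rather than the Norwegian language code, and an unquoted `2016` loads as an integer rather than a config name. A minimal sketch of that behaviour, assuming PyYAML as the parser (not the Hub's own front-matter loader):

```python
# Minimal sketch, assuming PyYAML; illustrates why 'no' and '2016' are quoted
# in the card metadata above. This is not the Hub's actual loader.
import yaml

front_matter = """
language:
  - nl
  - no            # YAML 1.1: parsed as the boolean False
  - 'no'          # quoted: stays the string "no" (Norwegian)
dataset:
  config: '2016'  # quoted: stays a string instead of the integer 2016
metrics:
  - type: Accuracy
    value: 43     # plain scalar: loads as an int (43.0 would load as a float)
"""

data = yaml.safe_load(front_matter)
print(data["language"])                   # ['nl', False, 'no']
print(type(data["dataset"]["config"]))    # <class 'str'>
print(type(data["metrics"][0]["value"]))  # <class 'int'>
```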