Commit b38dc02 by nicholasKluge · 1 Parent(s): 45c0ff2
Update README.md
README.md CHANGED
@@ -33,7 +33,7 @@ co2_eq_emissions:
 geographical_location: Germany
 hardware_used: NVIDIA A100-SXM4-40GB
 ---
-# TeenyTinyLlama-
+# TeenyTinyLlama-160m
 
 <img src="./logo-round.png" alt="A little llama wearing a mushroom hat and a monocle." height="200">
 
@@ -101,7 +101,7 @@ These are the main arguments used in the training of this model:
 
 ## Intended Uses
 
-The primary intended use of TeenyTinyLlama is to research the behavior, functionality, and limitations of large language models. Checkpoints saved during training are intended to provide a controlled setting for performing scientific experiments. You may also further fine-tune and adapt TeenyTinyLlama-
+The primary intended use of TeenyTinyLlama is to research the behavior, functionality, and limitations of large language models. Checkpoints saved during training are intended to provide a controlled setting for performing scientific experiments. You may also further fine-tune and adapt TeenyTinyLlama-160m for deployment, as long as your use is in accordance with the Apache 2.0 license. If you decide to use pre-trained TeenyTinyLlama-160m as a basis for your fine-tuned model, please conduct your own risk and bias assessment.
 
 ## Basic usage
 
@@ -110,7 +110,7 @@ Using the `pipeline`:
 ```python
 from transformers import pipeline
 
-generator = pipeline("text-generation", model="nicholasKluge/Teeny-tiny-llama-
+generator = pipeline("text-generation", model="nicholasKluge/Teeny-tiny-llama-160m")
 
 completions = generator("Astronomia é a ciência", num_return_sequences=2, max_new_tokens=100)
 
@@ -125,8 +125,8 @@ from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
 
 # Load model and the tokenizer
-tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/Teeny-tiny-llama-
-model = AutoModelForCausalLM.from_pretrained("nicholasKluge/Teeny-tiny-llama-
+tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/Teeny-tiny-llama-160m", revision='main')
+model = AutoModelForCausalLM.from_pretrained("nicholasKluge/Teeny-tiny-llama-160m", revision='main')
 
 # Pass the model to your device
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
@@ -170,7 +170,7 @@ for i, completion in enumerate(completions):
 
 | Models | Average | [ARC](https://arxiv.org/abs/1803.05457) | [Hellaswag](https://arxiv.org/abs/1905.07830) | [MMLU](https://arxiv.org/abs/2009.03300) | [TruthfulQA](https://arxiv.org/abs/2109.07958) |
 |-------------------------------------------------------------------------------------|---------|-----------------------------------------|-----------------------------------------------|------------------------------------------|------------------------------------------------|
-| [TeenyTinyLlama-
+| [TeenyTinyLlama-160m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m) | 31.16 | 26.15 | 29.29 | 28.11 | 41.12 |
 | [Pythia-160m](https://huggingface.co/EleutherAI/pythia-160m-deduped)* | 31.16 | 24.06 | 31.39 | 24.86 | 44.34 |
 | [OPT-125m](https://huggingface.co/facebook/opt-125m)* | 30.80 | 22.87 | 31.47 | 26.02 | 42.87 |
 | [Gpt2-portuguese-small](https://huggingface.co/pierreguillou/gpt2-small-portuguese) | 30.22 | 22.48 | 29.62 | 27.36 | 41.44 |
@@ -184,7 +184,7 @@ for i, completion in enumerate(completions):
 
 | Models | [IMDB](https://huggingface.co/datasets/christykoh/imdb_pt) | [FaQuAD-NLI](https://huggingface.co/datasets/ruanchaves/faquad-nli) | [HateBr](https://huggingface.co/datasets/ruanchaves/hatebr) | [Assin2](https://huggingface.co/datasets/assin2)| [AgNews](https://huggingface.co/datasets/maritaca-ai/ag_news_pt) |
 |--------------------------------------------------------------------------------------------|------------------------------------------------------------|---------------------------------------------------------------------|-------------------------------------------------------------|-------------------------------------------------|------------------------------------------------------------------|
-| [Teeny Tiny Llama
+| [Teeny Tiny Llama 160m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m) | 91.14 | 90.00 | 90.71 | 85.78 | 94.05 |
 | [Bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) | 92.22 | 93.07 | 91.28 | 87.45 | 94.19 |
 | [Bert-large-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased)| 93.58 | 92.26 | 91.57 | 88.97 | 94.11 |
 | [Gpt2-small-portuguese](https://huggingface.co/pierreguillou/gpt2-small-portuguese) | 91.60 | 86.46 | 87.42 | 86.11 | 94.07 |
@@ -195,7 +195,7 @@ for i, completion in enumerate(completions):
 
 @misc{nicholas22llama,
 doi = {10.5281/zenodo.6989727},
-url = {https://huggingface.co/nicholasKluge/TeenyTinyLlama-
+url = {https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m},
 author = {Nicholas Kluge Corrêa},
 title = {TeenyTinyLlama},
 year = {2023},
@@ -211,4 +211,4 @@ This repository was built as part of the RAIES ([Rede de Inteligência Artificia
 
 ## License
 
-TeenyTinyLlama-
+TeenyTinyLlama-160m is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.