nicholasKluge/TeenyTinyLlama-160m

@@ -33,7 +33,7 @@ co2_eq_emissions:
   geographical_location: Germany
   hardware_used: NVIDIA A100-SXM4-40GB
 ---
-# TeenyTinyLlama-162m
 <img src="./logo-round.png" alt="A little llama wearing a mushroom hat and a monocle." height="200">
@@ -101,7 +101,7 @@ These are the main arguments used in the training of this model:
 ## Intended Uses
-The primary intended use of TeenyTinyLlama is to research the behavior, functionality, and limitations of large language models. Checkpoints saved during training are intended to provide a controlled setting for performing scientific experiments. You may also further fine-tune and adapt TeenyTinyLlama-162m for deployment, as long as your use is in accordance with the Apache 2.0 license. If you decide to use pre-trained TeenyTinyLlama-162 as a basis for your fine-tuned model, please conduct your own risk and bias assessment.
 ## Basic usage
@@ -110,7 +110,7 @@ Using the `pipeline`:
 ```python
 from transformers import pipeline
-generator = pipeline("text-generation", model="nicholasKluge/Teeny-tiny-llama-162m")
 completions  = generator("Astronomia é a ciência", num_return_sequences=2, max_new_tokens=100)
@@ -125,8 +125,8 @@ from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
 # Load model and the tokenizer
-tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/Teeny-tiny-llama-162m", revision='main')
-model = AutoModelForCausalLM.from_pretrained("nicholasKluge/Teeny-tiny-llama-162m", revision='main')
 # Pass the model to your device
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
@@ -170,7 +170,7 @@ for i, completion in enumerate(completions):
 | Models                                                                              | Average | [ARC](https://arxiv.org/abs/1803.05457) | [Hellaswag](https://arxiv.org/abs/1905.07830) | [MMLU](https://arxiv.org/abs/2009.03300) | [TruthfulQA](https://arxiv.org/abs/2109.07958) |
 |-------------------------------------------------------------------------------------|---------|-----------------------------------------|-----------------------------------------------|------------------------------------------|------------------------------------------------|
-| [TeenyTinyLlama-162m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-162m)     | 31.16   | 26.15                                   | 29.29                                         | 28.11                                    | 41.12                                          |
 | [Pythia-160m](https://huggingface.co/EleutherAI/pythia-160m-deduped)*               | 31.16   | 24.06                                   | 31.39                                         | 24.86                                    | 44.34                                          |
 | [OPT-125m](https://huggingface.co/facebook/opt-125m)*                               | 30.80   | 22.87                                   | 31.47                                         | 26.02                                    | 42.87                                          |
 | [Gpt2-portuguese-small](https://huggingface.co/pierreguillou/gpt2-small-portuguese) | 30.22   | 22.48                                   | 29.62                                         | 27.36                                    | 41.44                                          |
@@ -184,7 +184,7 @@ for i, completion in enumerate(completions):
 | Models                                                                                     | [IMDB](https://huggingface.co/datasets/christykoh/imdb_pt) | [FaQuAD-NLI](https://huggingface.co/datasets/ruanchaves/faquad-nli) | [HateBr](https://huggingface.co/datasets/ruanchaves/hatebr) | [Assin2](https://huggingface.co/datasets/assin2)| [AgNews](https://huggingface.co/datasets/maritaca-ai/ag_news_pt) |
 |--------------------------------------------------------------------------------------------|------------------------------------------------------------|---------------------------------------------------------------------|-------------------------------------------------------------|-------------------------------------------------|------------------------------------------------------------------|
-| [Teeny Tiny Llama 162m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-162m)          | 91.14                                                      | 90.00                                                               | 90.71                                                       | 85.78                                           | 94.05                                                            |
 | [Bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) | 92.22                                                      | 93.07                                                               | 91.28                                                       | 87.45                                           | 94.19                                                            |
 | [Bert-large-portuguese-cased](https://huggingface.co/neuralmind/bert-large-portuguese-cased)| 93.58                                                      | 92.26                                                               | 91.57                                                       | 88.97                                           | 94.11                                                            |
 | [Gpt2-small-portuguese](https://huggingface.co/pierreguillou/gpt2-small-portuguese)        | 91.60                                                      | 86.46                                                               | 87.42                                                       | 86.11                                           | 94.07                                                            |
@@ -195,7 +195,7 @@ for i, completion in enumerate(completions):
 @misc{nicholas22llama,
   doi = {10.5281/zenodo.6989727},
-  url = {https://huggingface.co/nicholasKluge/TeenyTinyLlama-162m},
   author = {Nicholas Kluge Corrêa},
   title = {TeenyTinyLlama},
   year = {2023},
@@ -211,4 +211,4 @@ This repository was built as part of the RAIES ([Rede de Inteligência Artificia
 ## License
-TeenyTinyLlama-162m is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.

   geographical_location: Germany
   hardware_used: NVIDIA A100-SXM4-40GB
 ---
+# TeenyTinyLlama-160m
 <img src="./logo-round.png" alt="A little llama wearing a mushroom hat and a monocle." height="200">
 ## Intended Uses
+The primary intended use of TeenyTinyLlama is to research the behavior, functionality, and limitations of large language models. Checkpoints saved during training are intended to provide a controlled setting for performing scientific experiments. You may also further fine-tune and adapt TeenyTinyLlama-160m for deployment, as long as your use is in accordance with the Apache 2.0 license. If you decide to use pre-trained TeenyTinyLlama-160m as a basis for your fine-tuned model, please conduct your own risk and bias assessment.
 ## Basic usage
 ```python
 from transformers import pipeline
+generator = pipeline("text-generation", model="nicholasKluge/TeenyTinyLlama-160m")
 completions  = generator("Astronomia é a ciência", num_return_sequences=2, max_new_tokens=100)
 import torch
 # Load model and the tokenizer
+tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/TeenyTinyLlama-160m", revision='main')
+model = AutoModelForCausalLM.from_pretrained("nicholasKluge/TeenyTinyLlama-160m", revision='main')
 # Pass the model to your device
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 | Models                                                                              | Average | [ARC](https://arxiv.org/abs/1803.05457) | [Hellaswag](https://arxiv.org/abs/1905.07830) | [MMLU](https://arxiv.org/abs/2009.03300) | [TruthfulQA](https://arxiv.org/abs/2109.07958) |
 |-------------------------------------------------------------------------------------|---------|-----------------------------------------|-----------------------------------------------|------------------------------------------|------------------------------------------------|
+| [TeenyTinyLlama-160m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m)     | 31.16   | 26.15                                   | 29.29                                         | 28.11                                    | 41.12                                          |
 | [Pythia-160m](https://huggingface.co/EleutherAI/pythia-160m-deduped)*               | 31.16   | 24.06                                   | 31.39                                         | 24.86                                    | 44.34                                          |
 | [OPT-125m](https://huggingface.co/facebook/opt-125m)*                               | 30.80   | 22.87                                   | 31.47                                         | 26.02                                    | 42.87                                          |
 | [Gpt2-portuguese-small](https://huggingface.co/pierreguillou/gpt2-small-portuguese) | 30.22   | 22.48                                   | 29.62                                         | 27.36                                    | 41.44                                          |
 | Models                                                                                     | [IMDB](https://huggingface.co/datasets/christykoh/imdb_pt) | [FaQuAD-NLI](https://huggingface.co/datasets/ruanchaves/faquad-nli) | [HateBr](https://huggingface.co/datasets/ruanchaves/hatebr) | [Assin2](https://huggingface.co/datasets/assin2)| [AgNews](https://huggingface.co/datasets/maritaca-ai/ag_news_pt) |
 |--------------------------------------------------------------------------------------------|------------------------------------------------------------|---------------------------------------------------------------------|-------------------------------------------------------------|-------------------------------------------------|------------------------------------------------------------------|
+| [Teeny Tiny Llama 160m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m)          | 91.14                                                      | 90.00                                                               | 90.71                                                       | 85.78                                           | 94.05                                                            |
 | [Bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) | 92.22                                                      | 93.07                                                               | 91.28                                                       | 87.45                                           | 94.19                                                            |
 | [Bert-large-portuguese-cased](https://huggingface.co/neuralmind/bert-large-portuguese-cased)| 93.58                                                      | 92.26                                                               | 91.57                                                       | 88.97                                           | 94.11                                                            |
 | [Gpt2-small-portuguese](https://huggingface.co/pierreguillou/gpt2-small-portuguese)        | 91.60                                                      | 86.46                                                               | 87.42                                                       | 86.11                                           | 94.07                                                            |
 @misc{nicholas22llama,
   doi = {10.5281/zenodo.6989727},
+  url = {https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m},
   author = {Nicholas Kluge Corrêa},
   title = {TeenyTinyLlama},
   year = {2023},
 ## License
+TeenyTinyLlama-160m is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
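
Because this change touches the same name in several places (heading, usage snippets, tables, citation, license), a small check for leftover references may be useful. The sketch below is not part of the repository: `find_stale_references` and the `STALE` pattern are hypothetical names, and the pattern assumes the old name only appears in the spellings visible in this diff (`TeenyTinyLlama-162m`, `Teeny-tiny-llama-162m`, `Teeny Tiny Llama 162m`, and the truncated `TeenyTinyLlama-162`).

```python
import re

# Hypothetical helper, not part of the repository: flags lines that still
# mention the old model size after the 162m -> 160m rename.
# Assumption: the old name is spelled with optional hyphens or spaces between
# the words, and the trailing "m" may be missing.
STALE = re.compile(
    r"teeny[-\s]?tiny[-\s]?llama[-\s]?162m?\b",
    re.IGNORECASE,
)


def find_stale_references(text: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) pairs that still use the old name."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if STALE.search(line)
    ]


if __name__ == "__main__":
    sample = "\n".join(
        [
            "# TeenyTinyLlama-162m",
            'generator = pipeline("text-generation", model="nicholasKluge/Teeny-tiny-llama-162m")',
            "# TeenyTinyLlama-160m",
        ]
    )
    for number, line in find_stale_references(sample):
        print(f"line {number}: {line}")
```

Running it against the updated README should report nothing if every reference was renamed.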