Pico-OpenLAiNN
Hey there fellow researchers, developers, and AI enthusiasts! Today I'm releasing a new, slightly less smol open LLM. This LLM was trained on the full 32B tokens that the entire Pico-OpenLAiNN family is trained on.
You can find the GGUF quants of this model here.
This specific version of Pico-OpenLAiNN was trained on just 32B tokens of the FineWeb dataset.
To start using these models, you can simply load them via the Hugging Face transformers library:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Swap 250M for 100M or 500M if you prefer those models.
MODEL_NAME = "UUFO-Aigis/Pico-OpenLAiNN-250M"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)


def generate_text(prompt, model, tokenizer, max_length=512, temperature=1, top_k=50, top_p=0.95):
    # Tokenize the prompt into a tensor of input IDs.
    inputs = tokenizer.encode(prompt, return_tensors="pt")
    # Sample a continuation; max_length counts the prompt tokens as well.
    outputs = model.generate(
        inputs,
        max_length=max_length,
        temperature=temperature,
        top_k=top_k,
        top_p=top_p,
        do_sample=True
    )
    # Decode the first (and only) returned sequence back into text.
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return generated_text


def main():
    # Define your prompt
    prompt = "According to all known laws of aviation, there is no way a bee should be able to fly."
    generated_text = generate_text(prompt, model, tokenizer)
    print(generated_text)


if __name__ == "__main__":
    main()
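If you're wondering what the temperature, top_k and top_p arguments in the snippet above actually do, here is a rough, conceptual sketch in plain Python. It is not the transformers implementation; the function names (softmax, filter_top_k_top_p, sample_token) are made up for illustration, and it assumes top-k is applied first, followed by top-p on the renormalized probabilities.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(probs, top_k=50, top_p=0.95):
    """Return {token_id: prob} after top-k, then top-p (nucleus) filtering.

    Illustrative helper, not a library API.
    """
    # Keep the top_k most probable token ids, most probable first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_total = sum(probs[i] for i in ranked)

    # Walk down the ranking until the renormalized mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / k_total
        if cumulative >= top_p:
            break

    # Renormalize the survivors so they sum to 1 again.
    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}


def sample_token(logits, temperature=1.0, top_k=50, top_p=0.95):
    """Sample one token id from the filtered distribution."""
    filtered = filter_top_k_top_p(softmax(logits, temperature), top_k, top_p)
    ids = list(filtered)
    return random.choices(ids, weights=[filtered[i] for i in ids], k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical 4-token vocabulary
    print(sample_token(toy_logits, temperature=1.0, top_k=3, top_p=0.8))
```

Lowering top_k or top_p shrinks the set of candidate tokens, which tends to make the output more predictable; raising temperature flattens the distribution and makes it more varied.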
Tasks | Value | | Stderr |
---|---|---|---|
arc_challenge | 0.1988 | ± | 0.0117 |
arc_easy | 0.4503 | ± | 0.0102 |
boolq | 0.5907 | ± | 0.0086 |
hellaswag | 0.3215 | ± | 0.0047 |
lambada_openai | 0.3280 | ± | 0.0065 |
piqa | 0.6594 | ± | 0.0111 |
winogrande | 0.5028 | ± | 0.0141 |
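To get a feel for how much wiggle room those standard errors imply, here is a small sketch that turns each score in the table above into an approximate 95% interval. It assumes the reported standard errors can be treated as normally distributed (so the interval is value ± 1.96 × stderr); the confidence_interval helper is a made-up name for illustration.

```python
# Scores and standard errors copied from the table above.
RESULTS = {
    "arc_challenge": (0.1988, 0.0117),
    "arc_easy": (0.4503, 0.0102),
    "boolq": (0.5907, 0.0086),
    "hellaswag": (0.3215, 0.0047),
    "lambada_openai": (0.3280, 0.0065),
    "piqa": (0.6594, 0.0111),
    "winogrande": (0.5028, 0.0141),
}


def confidence_interval(value, stderr, z=1.96):
    """Return (low, high) for an approximate normal interval.

    z=1.96 corresponds to roughly 95% coverage under the
    normality assumption stated above.
    """
    margin = z * stderr
    return value - margin, value + margin


if __name__ == "__main__":
    for task, (value, stderr) in RESULTS.items():
        low, high = confidence_interval(value, stderr)
        print(f"{task:15s} {value:.4f}  [{low:.4f}, {high:.4f}]")
```

Intervals that overlap between two models on the same task are a hint that the difference may just be noise.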
If you find these models useful and decide to use them, a link to this repository would be highly appreciated. I am a one-man show running this. Thanks 🤗
If you have questions, please reach out to me at [email protected]