Model Card for seeweb/SeewebLLM-it
The model is a fine-tuned version of LLama-2-7b-chat-hf specialized in speaking Italian.
- Backbone Model: LLama2
- Language(s): Italian
- Finetuned from model: LLama-2-7b-chat-hf
- Contributors: Lorenzo Rocchi @ Seeweb
Bias, Risks, and Limitations
The model does not always produce correct output: generated sentences may contain grammatical or factual errors and should be reviewed before use.
Training script
The following repository contains the scripts and instructions used for fine-tuning and testing:
https://github.com/itsrocchi/finetuning-llama2-ita.git
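The repository above contains the actual training scripts. Purely as orientation, a minimal LoRA-style fine-tuning sketch with transformers, peft and datasets might look as follows; the hyperparameters, the "text" column name and the output directory are illustrative assumptions and do not reflect the real training configuration.

import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model_id = "meta-llama/Llama-2-7b-chat-hf"  # gated: requires accepting Meta's license

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
tokenizer.pad_token = tokenizer.eos_token  # LLama 2 has no pad token by default

# Load the base model in 8-bit (requires bitsandbytes) and prepare it for adapter training.
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    load_in_8bit=True,
)
model = prepare_model_for_kbit_training(model)

# Attach small LoRA adapters instead of updating all 7B parameters
# (illustrative values, not the ones used for SeewebLLM-it).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Hypothetical dataset layout: a single "text" column already formatted as
# "### User:\n...\n\n### Assistant:\n..." conversations.
dataset = load_dataset("seeweb/Seeweb-it-292-forLLM", split="train")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True,
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="seewebllm-it-finetune",  # illustrative output path
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=3,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=10,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("seewebllm-it-finetune")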
Inference and comparison with LLama 2
Here is a short Python snippet to perform inference:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Load the tokenizer and the fine-tuned model from the Hugging Face Hub.
# Optionally, replace the identifiers below with the absolute path of a local
# copy of the model directory.
tokenizer = AutoTokenizer.from_pretrained("itsrocchi/SeewebLLM-it-ver2")
model = AutoModelForCausalLM.from_pretrained(
    "itsrocchi/SeewebLLM-it-ver2",
    device_map="auto",
    torch_dtype=torch.float16,
    load_in_8bit=True,  # requires the bitsandbytes package
    rope_scaling={"type": "dynamic", "factor": 2}
)

# Edit the text between "### User:" and "### Assistant:" to customize the prompt.
prompt = "### User:\nDescrivi cos' è l'intelligenza artificiale\n\n### Assistant:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
# max_new_tokens must be a finite integer; adjust it to the desired response length.
output = model.generate(**inputs, streamer=streamer, use_cache=True, max_new_tokens=512)
output_text = tokenizer.decode(output[0], skip_special_tokens=True)
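For the comparison below, the base chat model can be queried with the same prompt. A minimal sketch, assuming access to the gated meta-llama/Llama-2-7b-chat-hf checkpoint and reusing the prompt variable defined above; the exact settings used to produce the comparison outputs may differ.

# Load the original LLama-2-7b-chat-hf for a side-by-side comparison.
base_tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    device_map="auto",
    torch_dtype=torch.float16,
)

base_inputs = base_tokenizer(prompt, return_tensors="pt").to(base_model.device)
base_output = base_model.generate(**base_inputs, max_new_tokens=512)
print(base_tokenizer.decode(base_output[0], skip_special_tokens=True))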
Output of the fine-tuned model:
L'intelligenza artificiale è un campo di studio che si occupa di creare sistemi che possono imparare, raffinare e prendere decisioni come gli esseri umani. Questi sistemi possono essere utilizzati in molte applicazioni, come la robotica, la visione artificiale, il machine learning e l'elaborazione del linguaggio naturale.
Output of LLama 2:
Certainly, I'd be happy to help you understand what artificial intelligence (AI) is!
Artificial intelligence (AI) refers to the development of computer systems that can perform tasks that typically require human intelligence, such as learning, problem-solving, decision-making, and perception. AI systems use algorithms and machine learning techniques to analyze data, make predictions, and take actions based on that data.
There are several types of AI, including:
- Narrow or weak AI: This type of AI is designed to perform a specific task, such as playing chess or recognizing faces. Narrow AI is the most common form of AI and is used in many applications, such as virtual assistants, language translation, and image recognition.
- General or strong AI: This type of AI is designed to perform any intellectual task that a human can. General AI has the potential to revolutionize many industries, including healthcare, finance, and education.
- Superintelligence: This type of AI is significantly more intelligent than the best human minds. Superintelligence could potentially solve complex problems that are currently unsolvable, but it also raises concerns about safety and control.
Some examples of AI include:
- Self-driving cars: These vehicles use a combination of sensors and machine learning algorithms to navigate roads and avoid obstacles.
- Virtual assistants: These AI-powered tools, such as Amazon's Alexa and Google Assistant, can perform tasks such as scheduling appointments, setting reminders, and controlling smart home devices.
- Language translation: AI systems can translate text and speech in real-time, allowing for more effective communication between people who speak different languages.
It's important to note that AI is not a single entity, but rather a rapidly evolving field that involves many different techniques and technologies. As AI continues to advance, it has the potential to transform many aspects of our lives, from the way we work and interact with each other to the way we make decisions and solve problems.
Conclusion
LLama 2 understands the question and gives the user a more specific and overall better answer than the fine-tuned model. However, the fine-tuned model answers in fluent, correct Italian, which is exactly what this fine-tuning process set out to achieve.
Training Data and Details
The dataset used is seeweb/Seeweb-it-292-forLLM, which contains approximately 300 Italian prompt-answer conversations.
Training was performed on an NVIDIA RTX A6000 within Seeweb's Cloud Server GPU environment.
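As a quick check, the dataset can be inspected directly from the Hub. A minimal sketch, assuming the default train split; the column layout is whatever the dataset defines and is not assumed here.

from datasets import load_dataset

# Download the Italian instruction dataset used for fine-tuning.
dataset = load_dataset("seeweb/Seeweb-it-292-forLLM")

# Print the available splits, column names and number of rows,
# then look at the first training example.
print(dataset)
print(dataset["train"][0])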
What next?
The model must be improved: a much larger dataset needs to be created so that the model can learn many more ways to answer.