|
|
|
# TinyLlama-1.1B Intermediate Step Model |
|
|
|
This repository contains the `TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T` model, fine-tuned (continued pre-training) on the `augmxnt/shisa-pretrain-en-ja-v1` dataset. The model has been trained on 5.5 billion tokens, offering robust performance across a variety of natural language processing (NLP) tasks.
|
|
|
## Model Overview |
|
|
|
- **Base Model**: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T |
|
- **Training Dataset**: augmxnt/shisa-pretrain-en-ja-v1 |
|
- **Training Tokens**: 5.5 billion |
|
|
|
This model is designed for a range of NLP tasks, including but not limited to language translation, text generation, and sentiment analysis. It is particularly effective in handling bilingual content in English and Japanese. |
|
|
|
## Usage |
|
|
|
### Installation |
|
|
|
To use this model, you'll need the `transformers` library from Hugging Face along with PyTorch:
|
|
|
```bash
pip install transformers torch
```
|
|
|
### Loading the Model |
|
|
|
You can load the model using the `transformers` library as follows: |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The TinyLlama base checkpoint; swap in this repository's model ID
# to load the fine-tuned weights instead.
model_name = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
```
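
If a GPU is available, the model can also be loaded in half precision to reduce memory use. This is a minimal sketch, assuming `torch` is installed and a CUDA device is present:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the weights in float16 and move the model to the GPU
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model = model.to("cuda")
```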
|
|
|
### Generating Text |
|
|
|
Here is an example of how to generate text using the loaded model: |
|
|
|
```python
input_text = "Translate the following English text to Japanese: Hello, how are you?"
input_ids = tokenizer.encode(input_text, return_tensors="pt")

# Generate up to 50 tokens (prompt included) and decode the result
outputs = model.generate(input_ids, max_length=50, num_return_sequences=1)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generated_text)
```
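
Greedy decoding as above can produce repetitive text; sampling usually gives more varied output. The snippet below is a sketch with illustrative values for `temperature` and `top_p`:

```python
# Sampling-based generation; temperature and top_p are illustrative values
outputs = model.generate(
    input_ids,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```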
|
|
|
## Model Performance |
|
|
|
This model has been trained on a diverse bilingual dataset to ensure strong performance across various tasks. Qualitatively, it performs as follows:
|
|
|
- **Language Translation**: Achieves high accuracy in translating between English and Japanese. |
|
- **Text Generation**: Produces coherent and contextually relevant text for prompts in both languages. |
|
- **Sentiment Analysis**: Effectively classifies sentiment with a high degree of accuracy (see the prompting sketch below).
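
Because this is a causal language model, classification tasks such as sentiment analysis are typically handled by prompting. The sketch below is purely illustrative; the prompt wording and example review are assumptions, not an evaluated setup:

```python
prompt = (
    "Classify the sentiment of the following review as positive or negative.\n"
    "Review: この映画は素晴らしかった。\n"  # "This movie was wonderful."
    "Sentiment:"
)
inputs = tokenizer(prompt, return_tensors="pt")

# Generate a few tokens and decode only the continuation after the prompt
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```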
|
|
|
## Fine-Tuning |
|
|
|
For users interested in fine-tuning this model on their own datasets, the following code snippet provides a starting point: |
|
|
|
```python
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    save_steps=10_000,
    save_total_limit=2,
)

# my_train_dataset and my_eval_dataset must be tokenized datasets that
# provide input_ids (and labels) suitable for causal language modeling.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=my_train_dataset,
    eval_dataset=my_eval_dataset,
)

trainer.train()
```
|
|
|
Replace `my_train_dataset` and `my_eval_dataset` with your own tokenized dataset objects; one way to prepare them is sketched below.
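
As an example, the sketch below tokenizes a raw-text corpus with the `datasets` library; the file names and `max_length` are illustrative assumptions. The collator should then be passed to the `Trainer` via `data_collator=data_collator` so labels are built from `input_ids` automatically:

```python
from datasets import load_dataset
from transformers import DataCollatorForLanguageModeling

# Illustrative file names; replace with your own raw-text corpus
raw = load_dataset("text", data_files={"train": "train.txt", "validation": "valid.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

# mlm=False makes the collator create causal-LM labels from input_ids
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

my_train_dataset = tokenized["train"]
my_eval_dataset = tokenized["validation"]
```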
|
|
|
## Acknowledgements |
|
|
|
This model was built upon the work of the TinyLlama project and trained using the `augmxnt/shisa-pretrain-en-ja-v1` dataset. We acknowledge their contributions to the NLP community. |
|
|
|
## License |
|
|
|
This model is released under the [MIT License](LICENSE). |
|
|
|
## Contact |
|
|
|
For questions or feedback, please open an issue in this repository or contact us at [[email protected]]. |
|
|