Baby Nandi
Baby Nandi (part of the Nandi series of Telugu LLMs) is a Telugu instruction-tuned version of Gemma 2B, built as part of an effort to develop smaller, more efficient Indic LLMs for practical use. It beats the original gemma-2b overall, but still trails the more recent gemma-1.1-2b-it.
Benchmarks
| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---|---|---|---|---|
| bharadwajswarna/gemma-2b-sft-telugu | 38.99 | 21.53 | 55.56 | 48.33 | 30.56 |
| google/gemma-2b-it | 36.1 | 23.76 | 43.6 | 47.64 | 29.41 |
| google/gemma-2b | 34.26 | 22.7 | 43.35 | 39.96 | 31.03 |
Training Process & Datasets:
- The Gemma 2B base model was further pretrained on part of the AI4Bharat Sangraha dataset (280k Telugu samples).
- SFT on a mix of Telugu Alpaca + Telugu GPTeacher (from Telugu LLM Labs) and English Alpaca.
You can find the pretrained base model here: Gemma-2b-Telugu-Base-Model
Training Duration:
- Pretraining for 6 epochs, nearly 35 hours (this may not have been enough)
- SFT for 3 epochs
Inference Prompt Template:
"""
### Instruction:
{}
### Input:
{}
### Response:
{}
"""
Developer:
Bharadwaj Swarna
You can reach out to me with any questions, suggestions, or collaboration ideas.