
Model Overview

Model Name: Fikri

Model Type: Language Model

Language: Turkish

Model Size: 8B parameters

Base Model: Llama 3.1

Development Hardware: 2× NVIDIA RTX 4090 GPUs

Description:

Fikri means "intellectual" or "of thought" in Turkish. This model is the first and lightest in our lineup, fine-tuned specifically for downstream Turkish tasks.

Influencing Paper: LoRA Learns Less and Forgets Less

Model Architecture

Base Model: Llama 3.1 8B 

Base Model Fine-tuning Data Size: ~1B tokens of high-quality Turkish data 

Instruction Training Data Size: 200k Turkish instructions

Training Information

Fikri was trained with the following statistics and configuration:

  • Training Loss: 0.996
  • Instruction Training Runtime: ~24 hours
  • Epochs: 1.0

LoRA Configuration:

  • r = 128
  • lora_alpha = 32
  • learning_rate = 5e-5
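As a rough illustration of what this configuration implies, the sketch below estimates the trainable parameters a rank-r LoRA adapter adds to a single weight matrix, and the effective update scale alpha/r. The hidden size (4096, as in Llama 3.1 8B) and the choice of target matrix are assumptions for illustration, not details stated in this card.

```python
# Sketch: what the LoRA hyperparameters above imply.
# Assumption (not from the card): hidden size 4096, as in Llama 3.1 8B,
# and that LoRA targets an attention projection matrix.

def lora_params(d_in: int, d_out: int, r: int) -> int:
    """A rank-r LoRA adapter on a (d_out x d_in) weight adds two
    small matrices: A of shape (r, d_in) and B of shape (d_out, r)."""
    return r * (d_in + d_out)

r, lora_alpha = 128, 32        # values from the card
scaling = lora_alpha / r       # LoRA scales the update BA by alpha/r

hidden = 4096                  # assumed Llama 3.1 8B hidden size
per_q_proj = lora_params(hidden, hidden, r)  # one square q_proj matrix

print(scaling)       # 0.25 -- a fairly conservative update scale
print(per_q_proj)    # 1048576 trainable params for that one matrix
```

Note the unusual ratio here: with r = 128 and alpha = 32, the update is scaled down to 0.25, in line with the conservative fine-tuning behavior discussed in the influencing paper.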

Usage

Fikri is primarily designed for tasks requiring understanding and generation of Turkish text. Its light configuration and optimized training data make it suitable for various applications, from conversational AI to text summarization, while maintaining efficiency and relevance to Turkish language nuances.
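A minimal usage sketch follows, assuming Fikri keeps the Llama 3 chat template of its base model (an assumption; verify against the tokenizer config in the repository). The generation call is left behind a flag because it requires downloading the 8B weights.

```python
# Sketch: prompting Fikri. Assumes the model inherits the Llama 3
# chat template from its Llama 3.1 base -- an assumption; check the
# repository's tokenizer config before relying on it.

def format_llama3_prompt(user_message: str) -> str:
    """Build a single-turn Llama 3-style chat prompt by hand."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_prompt("Türkiye'nin başkenti neresidir?")

RUN_MODEL = False  # set to True to actually generate (needs the weights and a GPU)
if RUN_MODEL:
    from transformers import AutoModelForCausalLM, AutoTokenizer
    repo = "BrewInteractive/fikri-3.1-8B-Instruct"
    tok = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="bfloat16")
    out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=128)
    print(tok.decode(out[0], skip_special_tokens=True))
```

In practice, prefer the tokenizer's `apply_chat_template` over a hand-built string, since it encodes whatever template actually ships with the repository.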

Acknowledgments

Fikri is a testament to collaborative innovation, inspired by cutting-edge research and dedicated to advancing the capabilities of artificial intelligence in the Turkish language.

If you have any questions, feedback, or need support, feel free to reach out to our development team.

Brew Interactive/AI Guild https://brewww.com


Repository: BrewInteractive/fikri-3.1-8B-Instruct