XeroCodes
/

xenith-3b

Text Generation

Model card Files Files and versions Community

Edit model card

Xenith-3B

Xenith-3B is a fine-tuned language model based on the microsoft/Phi-3-mini-4k-instruct model. It has been specifically trained on the AlignmentLab-AI/alpaca-cot-collection dataset, which focuses on chain-of-thought reasoning and instruction following.

Model Overview

Model Name: Xenith-3B
Base Model: microsoft/Phi-3-mini-4k-instruct
Fine-Tuned On: AlignmentLab-AI/alpaca-cot-collection
Model Size: 3 Billion parameters
Architecture: Transformer-based LLM

Training Details

Objective: Fine-tune the base model to enhance its performance on tasks requiring complex reasoning and multi-step problem-solving.
Training Duration: 10 epochs
Batch Size: 8
Learning Rate: 3e-5
Optimizer: AdamW
Hardware Used: 2x NVIDIA L4 GPUs

Performance

Xenith-3B excels in tasks that require:

Chain-of-thought reasoning
Instruction following
Contextual understanding
Complex problem-solving
The model has shown significant improvements in these areas compared to the base model.

Downloads last month: 6

Safetensors

Model size

3.82B params

Tensor type

BF16

·

Inference Examples

Text Generation

This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for XeroCodes/xenith-3b

Base model

microsoft/Phi-3-mini-4k-instruct

Adapter

(291)

this model

Dataset used to train XeroCodes/xenith-3b