Edit model card

haijian06/Yi-1.5-6B-Chat-Agent_sft

Overview

The haijian06/Yi-1.5-6B-Chat-Agent_sft model is an advanced conversational agent built upon the Yi-1.5-6B-Chat model. This model has been fine-tuned to enhance its capabilities in handling agent tasks and function calls, making it a versatile tool for a variety of applications.

Features

  • Improved Conversational Abilities: Enhanced dialogue management and natural language understanding.
  • Function Call Capability: Supports complex function call operations, making it suitable for automation and task handling.
  • High Performance: Optimized for speed and accuracy in responses.

Installation

To use this model, you need to have Python and the necessary libraries installed. You can install the required dependencies using the following commands:

pip install torch transformers

Usage

Here is a basic example of how to use the haijian06/Yi-1.5-6B-Chat-Agent_sft model:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "haijian06/Yi-1.5-6B-Chat-Agent_sft"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Generate a response
input_text = "Hello, how can I assist you today?"
input_ids = tokenizer.encode(input_text, return_tensors='pt')

with torch.no_grad():
    output = model.generate(input_ids, max_length=50)

response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)

Fine-Tuning

To fine-tune this model on your own dataset, follow these steps:

  1. Prepare your dataset in a suitable format.
  2. Use the Trainer class from the transformers library for training.

Example training script:

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir='./results',         
    num_train_epochs=3,             
    per_device_train_batch_size=4,  
    per_device_eval_batch_size=4,   
    warmup_steps=500,               
    weight_decay=0.01,              
    logging_dir='./logs',           
)

trainer = Trainer(
    model=model,                        
    args=training_args,                 
    train_dataset=train_dataset,         
    eval_dataset=eval_dataset            
)

trainer.train()

Contributing

Contributions are welcome! Please fork this repository and submit a pull request with your improvements.

License

This work is a derivative of Yi-1.5-6B by 01.AI, used under the Apache 2.0 License.

Acknowledgements

This model is built upon the Yi-1.5-6B-Chat model. Special thanks to the developers and contributors of the original model.


For more information, please visit our GitHub repository.

Downloads last month
5
Safetensors
Model size
6.06B params
Tensor type
BF16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.