
Model Card: hjys_LLM_final (42dot LLM-SFT-1.3B Fine-Tuned Version)

Model Overview

hjys_LLM_final is a fine-tuned version of 42dot LLM-SFT-1.3B, a large language model developed by 42dot that underwent Supervised Fine-Tuning (SFT) to follow natural-language instructions. This model was further fine-tuned on the beomi/KoAlpaca-v1.1a dataset with the goal of improving scores on the ko-CommonGen V2 task.

Dataset

The beomi/KoAlpaca-v1.1a dataset used for fine-tuning is a Korean instruction-following corpus. It offers a rich resource for Korean natural language processing and strengthens the model's Korean language understanding and generation capabilities.
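For reference, the snippet below shows one way to load and inspect the dataset with the Hugging Face datasets library. The field names ("instruction", "output") are assumptions based on the common KoAlpaca format and should be checked against the dataset card.

```python
# Minimal sketch: load and inspect the fine-tuning dataset.
# Field names ("instruction", "output") are assumed from the KoAlpaca
# format; verify them against the dataset card before use.
from datasets import load_dataset

dataset = load_dataset("beomi/KoAlpaca-v1.1a", split="train")
print(dataset)                    # row count and column names
print(dataset[0]["instruction"])  # an example Korean instruction
print(dataset[0]["output"])       # the corresponding reference answer
```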

Goal

The primary goal of this model is to improve scores on ko-CommonGen V2, a task that assesses a model's creativity and language comprehension by requiring it to generate a meaningful sentence from a given set of words. After fine-tuning, the model is equipped to use specific keywords effectively to generate coherent sentences, as the sketch below illustrates.
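The following is a hedged inference sketch in the spirit of ko-CommonGen V2. The model id junga/hjys_LLM_final is taken from this repository; the Korean prompt template and the generation settings are illustrative assumptions, not the exact format used during fine-tuning or evaluation.

```python
# Sketch: generate a Korean sentence that uses all given keywords.
# The prompt template below is an assumption, not the training format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "junga/hjys_LLM_final"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

keywords = ["강아지", "공원", "산책"]  # dog, park, walk
prompt = (
    "다음 단어를 모두 사용하여 자연스러운 문장을 만드세요: "  # "Make a natural sentence using all of the following words:"
    + ", ".join(keywords)
    + "\n문장:"  # "Sentence:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs, max_new_tokens=64, do_sample=True, top_p=0.9, temperature=0.7
)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```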

Fine-Tuning Details

  • Parameters: 1.3B
  • Layers: 24
  • Attention Heads: 32
  • Hidden Size: 2,048
  • FFN Size: 5,632
  • Maximum Length: 4,096 tokens
  • Training Time: 5 GPU hours on an NVIDIA A100 (Google Colab Pro+)
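These figures match the published 42dot LLM-SFT-1.3B architecture. The sketch below is one way to cross-check them against the model's config; the attribute names follow the standard LLaMA-style transformers config and are an assumption about this architecture.

```python
# Sketch: cross-check the published architecture figures against the
# model config. LLaMA-style config attribute names are assumed here.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("junga/hjys_LLM_final")
print(config.num_hidden_layers)        # expected: 24
print(config.num_attention_heads)      # expected: 32
print(config.hidden_size)              # expected: 2048
print(config.intermediate_size)        # expected: 5632 (FFN size)
print(config.max_position_embeddings)  # expected: 4096
```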

Limitations and Ethical Considerations

Like other LLMs, 42dot LLM-SFT-1.3B and fine-tuned derivatives such as this model may produce hallucinated or biased content. Users should be aware of these limitations and verify outputs before relying on them.

Disclaimer

Content generated by this model does not necessarily reflect the views of 42dot Inc. All responsibility lies with the end user, and 42dot assumes no liability.

License

This model is available for non-commercial use only, under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
