gpt2-finetuned-codeparrot

This model is a version of GPT-2 fine-tuned for code generation. Additional training on a dataset of code snippets and documentation adapts it to programming-related text.

Model Description

gpt2-finetuned-codeparrot is a GPT-2 model fine-tuned for code generation and related tasks. It uses the transformer architecture to generate code that is coherent and relevant to the input prompt. This makes it useful for generating code snippets, assisting with code completion, and responding to programming-related prompts.

Key Features:

  • Architecture: Transformer-based language model
  • Base Model: GPT-2
  • Fine-Tuned For: Code generation and programming-related tasks

Intended Uses & Limitations

Intended Uses:

  • Code Generation: Generate code snippets based on input prompts.
  • Code Completion: Assist in completing code segments.
  • Documentation Generation: Produce or improve programming documentation.
  • Programming Assistance: Provide contextually relevant help for programming tasks.

Limitations:

  • Dataset Bias: The model’s performance is dependent on the quality and diversity of the dataset used for fine-tuning. It may exhibit biases or limitations based on the nature of the training data.
  • Code Quality: The generated code may require review and debugging, as the model might not always produce syntactically or semantically correct code.
  • Limited Understanding: The model may not fully understand complex code logic or context, leading to potential inaccuracies in generated code or documentation.
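Because generated code is not guaranteed to be syntactically valid, a cheap first check is to try parsing it before any human review. The following is a minimal sketch, assuming the output is meant to be Python. The helper name `is_valid_python` and both sample strings are hypothetical and are not taken from the model.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python source code, False otherwise."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# Hypothetical model outputs, used only for illustration.
complete_snippet = "def add(a, b):\n    return a + b\n"
truncated_snippet = "def add(a, b):\n    return (a +"

print(is_valid_python(complete_snippet))
print(is_valid_python(truncated_snippet))
```

A successful parse only rules out syntax errors. It says nothing about whether the code is semantically correct, so review and testing are still needed.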

Training and Evaluation Data

Dataset:

The model was fine-tuned on a diverse collection of code snippets and programming-related documents. The specific sources and characteristics of the dataset are not provided.

Evaluation:

Evaluation metrics and results are not provided. Users should conduct their own evaluations to assess the model's performance on specific tasks or datasets.
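One common starting point for such an evaluation is perplexity on held-out code. The sketch below assumes you have already collected per-token negative log-likelihoods (in nats) from the model on your own data. The function name `perplexity` and the sample losses are hypothetical placeholders, not measured results for this model.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Hypothetical per-token losses, for illustration only (not real results).
example_nlls = [2.1, 0.7, 1.3, 0.9]
print(perplexity(example_nlls))
```

Lower perplexity means the model assigns higher probability to the held-out text. It does not directly measure whether generated code runs or is correct, so task-specific checks may still be needed.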

Training Procedure

Training Hyperparameters:

  • Learning Rate: 0.0005
  • Train Batch Size: 32
  • Eval Batch Size: 32
  • Seed: 42
  • Gradient Accumulation Steps: 8
  • Total Train Batch Size: 256
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • LR Scheduler Type: Cosine
  • LR Scheduler Warmup Steps: 1000
  • Number of Epochs: 1
  • Mixed Precision Training: Native AMP
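The relationship between these values can be sketched in a few lines. The total train batch size of 256 equals 32 × 8, which is consistent with a single device. The learning rate follows a linear warmup to the peak and then a cosine decay. This is an illustrative sketch, not the training script: the device count and the total number of optimizer steps are not stated in this card, so `NUM_DEVICES` and `TOTAL_STEPS` below are assumptions, and the function names are hypothetical.

```python
import math

# Values taken from the hyperparameter list above.
PEAK_LR = 5e-4
PER_DEVICE_BATCH = 32
GRAD_ACCUM_STEPS = 8
WARMUP_STEPS = 1000

# Assumptions: neither value is stated in the card.
NUM_DEVICES = 1
TOTAL_STEPS = 5000


def effective_batch_size(per_device, accum_steps, num_devices):
    """Number of examples contributing to each optimizer step."""
    return per_device * accum_steps * num_devices


def cosine_lr(step, peak_lr, warmup_steps, total_steps):
    """Linear warmup to peak_lr, then cosine decay toward zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS, NUM_DEVICES))
for step in (0, 500, 1000, 3000, 5000):
    print(step, cosine_lr(step, PEAK_LR, WARMUP_STEPS, TOTAL_STEPS))
```

With these placeholder values the rate rises linearly over the first 1000 steps, peaks at 0.0005, and decays to roughly zero at the final step.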

Training Results:

Specific training results, such as loss values or performance metrics, are not provided. Users are encouraged to assess the model's performance in their own applications.

Framework Versions

  • Transformers: 4.42.4
  • PyTorch: 2.3.1+cu121
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1

Code Example

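import torch
from transformers import pipeline

# Use the GPU when one is available; otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the fine-tuned model into a text-generation pipeline.
pipe = pipeline(
    "text-generation",
    model="Ashaduzzaman/gpt2-finetuned-codeparrot",
    device=device,
)

# Example usage: complete a function signature.
# max_length counts the prompt tokens plus the generated tokens.
prompt = "def fibonacci(n):"
generated_code = pipe(prompt, max_length=50, num_return_sequences=1)
print(generated_code[0]["generated_text"])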

Model size: 124M params (Safetensors, tensor type F32)