Commit 6db07f3 by JoaoJunior
Parent: 632e106
Create README.md
README.md
ADDED
@@ -0,0 +1,22 @@
---
datasets:
- JoaoJunior/python_java_dataset_APR
tags:
- APR
- AI
---
# Introduction
This model, JoaoJunior/T5_APR_java_python_v4, is a fine-tuned version of the pre-trained CodeT5 model from Salesforce. It is designed to understand and generate code, with a specific focus on fixing bugs in Python and Java.

# Description
The CodeT5 model was introduced in the paper "CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation". CodeT5 leverages the semantics conveyed by developer-assigned identifiers in code, which helps it on both code understanding and code generation tasks.

JoaoJunior/T5_APR_java_python_v4 was trained on the python_java_dataset_APR dataset, which contains pairs of buggy and fixed code in Python and Java. The dataset was built from the coconut_java2006 and coconut_python2010 datasets of the CoCoNuT project.

# Objective
The primary objective of this model is to identify and fix bugs in Python and Java code. Fine-tuning CodeT5 on the python_java_dataset_APR dataset is intended to teach the model the patterns and structures of these languages so that it can detect and correct errors.
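As an illustration of how the model might be used for this purpose, here is a minimal inference sketch with the Hugging Face `transformers` library. It rests on several assumptions that this README does not confirm: that the model repository ships a tokenizer loadable with `AutoTokenizer`, that the model expects raw buggy code as input with no task prefix, and that the length limits and beam size shown are reasonable. The helper names `load_generator` and `repair` are hypothetical.

```python
from typing import Callable, Optional

MODEL_ID = "JoaoJunior/T5_APR_java_python_v4"  # model name taken from this README


def load_generator(model_id: str = MODEL_ID,
                   max_new_tokens: int = 256) -> Callable[[str], str]:
    """Load the model and return a function mapping buggy code to a proposed fix."""
    # Imported lazily so that `repair` can be exercised with a stub generator
    # on machines where transformers is not installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    # Assumption: the repository contains tokenizer files next to the weights.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)

    def generate(buggy_code: str) -> str:
        # Assumption: raw code is the expected input format (no task prefix).
        inputs = tokenizer(buggy_code, return_tensors="pt",
                           truncation=True, max_length=512)
        # Beam search with 5 beams is an illustrative choice, not a documented setting.
        output_ids = model.generate(**inputs,
                                    max_new_tokens=max_new_tokens,
                                    num_beams=5)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)

    return generate


def repair(buggy_code: str,
           generate: Optional[Callable[[str], str]] = None) -> str:
    """Return the generator's proposed fix for a buggy snippet."""
    if generate is None:
        generate = load_generator()  # downloads the model on first use
    return generate(buggy_code.strip()).strip()


# Example (requires network access and the transformers and torch packages):
# print(repair("def add(a, b):\n    return a - b"))
```

Passing the generator as an argument keeps model loading separate from the repair call, so the same loaded model can be reused across many snippets. The output is a suggestion and should be reviewed or tested before use.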
# References
- "CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation" by Yue Wang, Weishi Wang, Shafiq Joty, and Steven C.H. Hoi
- python_java_dataset_APR: a dataset of paired buggy and fixed code in Python and Java, built from the CoCoNuT project's coconut_java2006 and coconut_python2010 datasets
- "CoCoNuT: Combining Context-Aware Neural Translation Models using Ensemble for Program Repair"