JoaoJunior committed on
Commit
6db07f3
1 Parent(s): 632e106

Create README.md

  1. README.md +22 -0
README.md ADDED
@@ -0,0 +1,22 @@
+ ---
+ datasets:
+ - JoaoJunior/python_java_dataset_APR
+ tags:
+ - APR
+ - AI
+ ---
+ # Introduction
+ JoaoJunior/T5_APR_java_python_v4 is a fine-tuned version of Salesforce's pre-trained CodeT5 model. It targets automated program repair (APR): generating fixes for buggy Python and Java code.
+
+ # Description
+ CodeT5 was introduced in the paper "CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation". It is an encoder-decoder model that exploits the semantics conveyed by developer-assigned identifiers, which helps on both code understanding and code generation tasks.
+
+ JoaoJunior/T5_APR_java_python_v4 was trained on the python_java_dataset_APR dataset, which contains pairs of buggy and fixed code in Python and Java. The dataset is built from the coconut_java2006 and coconut_python2010 datasets of the CoCoNuT project.
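
To make the pairing concrete, here is a minimal sketch of how one buggy/fixed pair could be turned into a sequence-to-sequence training example. It is not the author's training code, which the README does not describe: the helper name `encode_pair`, the `max_length` value, the `Salesforce/codet5-base` checkpoint, and the sample pair in the comments are all assumptions for illustration. The tokenizer is only expected to behave like a Hugging Face tokenizer (callable, returning `input_ids` and `attention_mask`).

```python
def encode_pair(buggy_code, fixed_code, tokenizer, max_length=512):
    """Turn one (buggy, fixed) pair into encoder inputs and decoder labels.

    Hypothetical helper: the actual preprocessing used for this model
    is not documented in the README.
    """
    # The buggy snippet is the encoder input.
    source = tokenizer(buggy_code, max_length=max_length, truncation=True)
    # The fixed snippet is what the decoder should learn to produce.
    target = tokenizer(fixed_code, max_length=max_length, truncation=True)
    return {
        "input_ids": source["input_ids"],
        "attention_mask": source["attention_mask"],
        "labels": target["input_ids"],
    }


# Possible usage (needs the `transformers` package and network access;
# the checkpoint name and the pair below are illustrative assumptions):
#
#   from transformers import AutoTokenizer
#   tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-base")
#   example = encode_pair("if (i <= n)", "if (i < n)", tokenizer)
```

Using the same tokenizer for source and target is reasonable here because both sides are code in the same vocabulary; only the role (input versus label) differs.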
+
+ # Objective
+ The primary objective of this model is to identify and fix bugs in Python and Java code. Fine-tuning CodeT5 on the buggy/fixed pairs in python_java_dataset_APR is intended to teach it common repair patterns in both languages, so that it can generate a corrected version of a buggy snippet.
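
As a rough illustration of that objective, the sketch below loads the model from the Hugging Face Hub and asks it for one candidate fix. Several things here are assumptions rather than documented behavior: that the repository ships tokenizer files loadable with `AutoTokenizer`, that the model expects the raw buggy snippet with no task prefix, and the generation settings (`max_new_tokens`, `num_beams`). The helper names `load_repair_model` and `suggest_fix` are hypothetical.

```python
def load_repair_model(model_id="JoaoJunior/T5_APR_java_python_v4"):
    """Load tokenizer and model from the Hugging Face Hub (needs network access)."""
    # Imported lazily so suggest_fix can be reused with any compatible objects.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    return tokenizer, model


def suggest_fix(buggy_code, tokenizer, model, max_new_tokens=128, num_beams=5):
    """Generate one candidate fix for a buggy snippet.

    Assumption: the input is the buggy code as-is, without a task prefix.
    """
    inputs = tokenizer(
        buggy_code, return_tensors="pt", truncation=True, max_length=512
    )
    # Beam search returns the highest-scoring sequence first.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=num_beams
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Possible usage:
#
#   tokenizer, model = load_repair_model()
#   print(suggest_fix("for (int i = 0; i <= arr.length; i++) { sum += arr[i]; }",
#                     tokenizer, model))
```

The output is a suggestion only; it should be reviewed and tested before being applied.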
+
+ # References
+ - CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation, by Yue Wang, Weishi Wang, Shafiq Joty, Steven C.H. Hoi
+ - python_java_dataset_APR: a dataset of buggy and fixed code pairs in Python and Java, built from the CoCoNuT project's coconut_java2006 and coconut_python2010 datasets
+ - CoCoNuT: Combining Context-Aware Neural Translation Models using Ensemble for Program Repair