ljcnju committed on
Commit 23d31af
1 Parent(s): 40af1e6

End of training

Files changed (2):
  1. README.md +31 -12
  2. generation_config.json +1 -1
README.md CHANGED
@@ -13,14 +13,12 @@ should probably proofread and complete it, then remove this comment. -->
  # CodeBertForCodeTrans

  This model is a fine-tuned version of [microsoft/codebert-base](https://huggingface.co/microsoft/codebert-base) on an unknown dataset.

  ## Model description

- This model is fine-tuned on CodeXGLUE codetrans dataset. It can only translate java code to c-sharp code.
- Prompt:
- ```python
- "#translate this java code to c-sharp code:\njava:<Your java code>"
- ```

  ## Intended uses & limitations
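The prompt format documented in the removed lines above can be reproduced with a small helper. This is illustrative only; `build_translation_prompt` is a hypothetical name, not part of this repository.

```python
# Hypothetical helper: formats Java source into the prompt string that the
# earlier README documented for this Java-to-C# translation checkpoint.
def build_translation_prompt(java_code: str) -> str:
    return f"#translate this java code to c-sharp code:\njava:{java_code}"

prompt = build_translation_prompt("int Add(int a, int b) { return a + b; }")
print(prompt)
```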
 
@@ -35,24 +33,45 @@ More information needed
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 1e-05
  - train_batch_size: 16
  - eval_batch_size: 16
  - seed: 42
- - gradient_accumulation_steps: 2
- - total_train_batch_size: 32
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_steps: 12354.0
  - num_epochs: 20
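The removed accumulation settings above are self-consistent: the effective batch size is the per-device batch size times the gradient accumulation steps. A quick illustrative check:

```python
# Illustrative check of the removed hyperparameters:
# total_train_batch_size = train_batch_size * gradient_accumulation_steps
train_batch_size = 16
gradient_accumulation_steps = 2
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 32, matching the card
```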
 
  ### Training results

  ### Framework versions

- - Transformers 4.33.2
- - Pytorch 2.0.1+cu117
- - Datasets 2.16.1
- - Tokenizers 0.13.3
 
  # CodeBertForCodeTrans

  This model is a fine-tuned version of [microsoft/codebert-base](https://huggingface.co/microsoft/codebert-base) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.0006

  ## Model description

+ More information needed

  ## Intended uses & limitations

  ### Training hyperparameters

  The following hyperparameters were used during training:
+ - learning_rate: 5e-05
  - train_batch_size: 16
  - eval_batch_size: 16
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_steps: 12354.0
  - num_epochs: 20
+ - mixed_precision_training: Native AMP
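Note that the 12354 warmup steps cover nearly the entire run (644 optimizer steps per epoch for 20 epochs, 12880 steps total per the results table), so the learning rate ramps up for most of training and decays only over the last few hundred steps. A sketch of that schedule's shape, assuming the conventional linear-warmup/linear-decay form (as in transformers' `get_linear_schedule_with_warmup`); this is illustrative, not the actual training code:

```python
# Sketch of a linear warmup + linear decay schedule with this card's settings.
# Assumes 644 optimizer steps per epoch for 20 epochs (12880 total).
def lr_at_step(step: int, base_lr: float = 5e-05,
               warmup_steps: int = 12354, total_steps: int = 12880) -> float:
    if step < warmup_steps:
        return base_lr * step / warmup_steps  # linear ramp up
    # linear decay from the peak over the remaining steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(lr_at_step(0), lr_at_step(12354), lr_at_step(12880))
```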

  ### Training results

+ | Training Loss | Epoch | Step  | Validation Loss |
+ |:-------------:|:-----:|:-----:|:---------------:|
+ | 5.7169        | 1.0   | 644   | 4.5075          |
+ | 3.0571        | 2.0   | 1288  | 2.1423          |
+ | 0.7391        | 3.0   | 1932  | 0.2866          |
+ | 0.1028        | 4.0   | 2576  | 0.0219          |
+ | 0.0158        | 5.0   | 3220  | 0.0047          |
+ | 0.0065        | 6.0   | 3864  | 0.0024          |
+ | 0.0036        | 7.0   | 4508  | 0.0020          |
+ | 0.0028        | 8.0   | 5152  | 0.0014          |
+ | 0.0018        | 9.0   | 5796  | 0.0010          |
+ | 0.0023        | 10.0  | 6440  | 0.0017          |
+ | 0.002         | 11.0  | 7084  | 0.0009          |
+ | 0.002         | 12.0  | 7728  | 0.0012          |
+ | 0.0015        | 13.0  | 8372  | 0.0020          |
+ | 0.0028        | 14.0  | 9016  | 0.0010          |
+ | 0.0015        | 15.0  | 9660  | 0.0007          |
+ | 0.0027        | 16.0  | 10304 | 0.0015          |
+ | 0.002         | 17.0  | 10948 | 0.0007          |
+ | 0.0011        | 18.0  | 11592 | 0.0009          |
+ | 0.0019        | 19.0  | 12236 | 0.0007          |
+ | 0.0003        | 20.0  | 12880 | 0.0006          |
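As an illustrative consistency check on the table above: the step column advances by a fixed 644 optimizer steps per epoch, reaching 12880 after 20 epochs; the 12354 warmup steps therefore span almost the entire run.

```python
# Illustrative check: step counts in the table grow by 644 per epoch.
steps_per_epoch = 644
epochs = 20
final_step = steps_per_epoch * epochs
print(final_step)  # 12880, matching the last table row
```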

  ### Framework versions

+ - Transformers 4.37.2
+ - Pytorch 2.1.2+cu121
+ - Datasets 2.15.0
+ - Tokenizers 0.15.0
generation_config.json CHANGED
@@ -3,5 +3,5 @@
   "bos_token_id": 0,
   "eos_token_id": 2,
   "pad_token_id": 1,
-  "transformers_version": "4.33.2"
  }

   "bos_token_id": 0,
   "eos_token_id": 2,
   "pad_token_id": 1,
+  "transformers_version": "4.37.2"
  }