Ateeqq committed on
Commit 15da904
1 Parent(s): 3a8cadd

Update README.md

Files changed (1): README.md +4 -11
README.md CHANGED
@@ -27,9 +27,7 @@ datasets:
 
 ## Overview
 
-This repository contains a fine-tuned model for generating high-quality product descriptions. The model is based on `t5-base` and has 223 million parameters. It has been fine-tuned on the Amazon Product Dataset, which contains 10 million examples (0.5 million after cleaning). This is a test version trained on 0.1 million examples; it will be updated to the 0.5 million cleaned examples soon.
-
-Developed by the team at https://exnrt.com
+This repository contains a fine-tuned model for generating high-quality product descriptions. The model is based on `t5-base` and has 223 million parameters. It has been fine-tuned on the Amazon Product Dataset, which contains 10 million examples (0.5 million after cleaning).
 
 ## Usage
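The hunk headers below show that the README's Usage section wraps inference in a `generate_description(title)` helper. The Usage code itself is outside this diff, so the following is only a minimal sketch of what such a helper could look like, assuming the standard `transformers` seq2seq API; the checkpoint id is a hypothetical placeholder, and the 50/300 token limits are taken from the Data Preparation section further down.

```python
# Minimal sketch, not the repository's actual Usage snippet.
from transformers import T5ForConditionalGeneration, T5Tokenizer

MODEL_ID = "your-username/t5-base-product-descriptions"  # hypothetical placeholder

tokenizer = T5Tokenizer.from_pretrained(MODEL_ID)
model = T5ForConditionalGeneration.from_pretrained(MODEL_ID)

def generate_description(title: str) -> str:
    # 50 matches the source max token length used during fine-tuning.
    inputs = tokenizer(title, max_length=50, truncation=True, return_tensors="pt")
    # 300 matches the target max token length used during fine-tuning.
    output_ids = model.generate(
        **inputs, max_length=300, num_beams=4, early_stopping=True
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(generate_description("Stainless Steel Insulated Water Bottle, 1 L"))
```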
 
@@ -91,12 +89,7 @@ generate_description(title)
 ## Features
 
 - **Architecture**: t5-base (223M parameters)
-- **Dataset**: Amazon Product Dataset
-  - **Original**: 10 million examples
-  - **Cleaned**: 0.5 million examples
-- **Training**:
-  - **Current Version**: Trained on 0.1 million cleaned examples
-  - **Upcoming Update**: Will be trained on 0.5 million cleaned examples
+- **Training Dataset**: 0.5 million cleaned examples
 - **Training Time**:
   - **Hardware**: Colab T4 GPU
   - **Speed**: 4.91 iterations/second
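The Features list gives throughput but no wall-clock figure for the **Training Time** item. A back-of-envelope estimate, derived from the numbers above rather than stated anywhere in the diff: with a batch size of 1 (see Data Preparation), one iteration processes one example, so a single epoch over the new 250,000-example training split takes roughly 14 hours on the T4.

```python
# Derived estimate, not a figure from the diff; assumes batch size 1,
# so one iteration == one training example.
examples = 250_000        # training examples after this update
iters_per_sec = 4.91      # reported Colab T4 throughput
hours = examples / iters_per_sec / 3600
print(f"~{hours:.1f} hours per epoch")  # ~14.1 hours
```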
@@ -110,8 +103,8 @@ generate_description(title)
 
 ## Data Preparation
 
-- **Training Data**: First 100,000 examples from `train`
-- **Evaluation Data**: First 10,000 examples from `test`
+- **Training Data**: First 250,000 examples from the `train` split
+- **Validation Data**: First 40,000 examples from the `validation` split
 - **Source Max Token Length**: 50
 - **Target Max Token Length**: 300
 - **Batch Size**: 1
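For concreteness, here is a sketch of the preparation these bullets describe, assuming the Hugging Face `datasets` library; the dataset id and the `title`/`description` column names are hypothetical stand-ins, since neither appears in this diff.

```python
# Sketch only: the dataset id and column names are hypothetical placeholders.
from datasets import load_dataset
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")

ds = load_dataset("amazon-product-dataset")          # hypothetical id
train_data = ds["train"].select(range(250_000))      # first 250,000 examples
val_data = ds["validation"].select(range(40_000))    # first 40,000 examples

def preprocess(batch):
    # Source: product title, capped at 50 tokens; target: description, capped at 300.
    model_inputs = tokenizer(batch["title"], max_length=50, truncation=True)
    labels = tokenizer(text_target=batch["description"], max_length=300, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train_data = train_data.map(preprocess, batched=True)
val_data = val_data.map(preprocess, batched=True)
```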
 