Update README.md
README.md
CHANGED
@@ -27,9 +27,7 @@ datasets:
 
 ## Overview
 
-This repository contains a fine-tuned model for generating high-quality product descriptions. The model is based on the `t5-base` and has 223 million parameters. It has been fine-tuned on the Amazon Product Dataset, which contains 10 million examples, with the cleaned version having 0.5 million examples.
-
-Developed by team at https://exnrt.com
+This repository contains a fine-tuned model for generating high-quality product descriptions. The model is based on the `t5-base` and has 223 million parameters. It has been fine-tuned on the Amazon Product Dataset, which contains 10 million examples, with the cleaned version having 0.5 million examples.
 
 ## Usage
 
@@ -91,12 +89,7 @@ generate_description(title)
 ## Features
 
 - **Architecture**: t5-base (223M parameters)
-- **Dataset**:
-  - **Original**: 10 million examples
-  - **Cleaned**: 0.5 million examples
-- **Training**:
-  - **Current Version**: Trained on 0.1 million cleaned examples
-  - **Upcoming Update**: Will be trained on 0.5 million cleaned examples
+- **Training Dataset**: Trained on 0.5 million cleaned examples
 - **Training Time**:
   - **Hardware**: Colab T4 GPU
   - **Speed**: 4.91 iterations/second
@@ -110,8 +103,8 @@ generate_description(title)
 
 ## Data Preparation
 
-- **Training Data**: First
-- **
+- **Training Data**: First 250,000 examples for `train`
+- **Validation Data**: First 40,000 examples for `validation`
 - **Source Max Token Length**: 50
 - **Target Max Token Length**: 300
 - **Batch Size**: 1
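
The Data Preparation settings in the updated README can be illustrated with a minimal sketch. It is not the repository's actual preprocessing code. It assumes each split is an ordered sequence of (title, description) pairs, reads "first N examples" as a slice from the start of that split, and uses whitespace splitting as a stand-in for the T5 tokenizer. The names `PrepConfig`, `take_first`, `truncate`, `prepare`, and `seconds_per_epoch` are hypothetical.

```python
import math
from dataclasses import dataclass
from itertools import islice
from typing import Iterable, List, Tuple

Example = Tuple[str, str]  # (product title, product description)


@dataclass(frozen=True)
class PrepConfig:
    # Values copied from the README's "Data Preparation" section.
    train_examples: int = 250_000
    validation_examples: int = 40_000
    source_max_tokens: int = 50
    target_max_tokens: int = 300
    batch_size: int = 1


def take_first(examples: Iterable[Example], n: int) -> List[Example]:
    """Keep only the first n examples of a split (assumed reading of 'First N')."""
    return list(islice(examples, n))


def truncate(text: str, max_tokens: int) -> str:
    """Cut text to max_tokens whitespace tokens.

    Assumption: whitespace tokens stand in for T5 subword tokens, so real
    truncation points would differ.
    """
    return " ".join(text.split()[:max_tokens])


def prepare(split: Iterable[Example], limit: int, cfg: PrepConfig) -> List[Example]:
    """Slice a split and truncate source (title) and target (description)."""
    return [
        (truncate(title, cfg.source_max_tokens),
         truncate(description, cfg.target_max_tokens))
        for title, description in take_first(split, limit)
    ]


def seconds_per_epoch(cfg: PrepConfig, iterations_per_second: float = 4.91) -> float:
    """Estimate one pass over the training slice.

    Assumption: one iteration processes exactly one batch, and the README's
    4.91 iterations/second holds for the whole run.
    """
    steps = math.ceil(cfg.train_examples / cfg.batch_size)
    return steps / iterations_per_second
```

Under those assumptions, `seconds_per_epoch(PrepConfig()) / 3600` comes to roughly 14 hours for one pass over the 250,000 training examples at batch size 1. That figure is an estimate from the listed speed, not a measured result.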