# Fine-tuned from the base IAM handwritten TrOCR model (`microsoft/trocr-base-handwritten`)
Datasets used:
- IMGUR5K
- English Handwritten Characters (from Kaggle)
Outliers were removed using a Z-score filter on image width and an IQR filter on image height, reducing the dataset from 210,122 to 208,141 examples.
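The original filtering code is not included in this card; the sketch below shows one way the described Z-score (width) and IQR (height) filters could look. The DataFrame name `df`, the `width`/`height` column names, and the 3-sigma and 1.5×IQR cutoffs are assumptions, not the original pipeline.

```python
import pandas as pd

def remove_outliers(df: pd.DataFrame) -> pd.DataFrame:
    """Drop width outliers by Z-score and height outliers by IQR fences.
    Column names and cutoffs are illustrative assumptions."""
    # Z-score filter on width: keep rows within 3 standard deviations.
    z = (df["width"] - df["width"].mean()) / df["width"].std()
    df = df[z.abs() < 3]

    # IQR filter on height: keep rows within 1.5 * IQR of the quartiles.
    q1, q3 = df["height"].quantile([0.25, 0.75])
    iqr = q3 - q1
    return df[df["height"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]
```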
Note that only 20 percent of the cleaned data was used, sampled randomly with seed 42, and then split into 33,302 training examples and 8,326 validation examples.
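As a hedged sketch (not the original notebook), the subsample and split could be reproduced as below; the 80/20 split ratio is inferred from the reported counts (33,302 + 8,326 = 41,628, which is roughly 20 percent of 208,141), and `cleaned_df` is a placeholder name.

```python
from sklearn.model_selection import train_test_split

# Take 20% of the cleaned data with seed 42. With 208,141 cleaned rows
# this yields 41,628 samples; an 80/20 split then reproduces the
# reported 33,302 / 8,326 train/validation counts.
sampled = cleaned_df.sample(frac=0.20, random_state=42)
train_df, val_df = train_test_split(sampled, test_size=0.20, random_state=42)
```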
I used the training arguments below, based on GPT's suggestion, because running the original configuration on 100 percent of the data would have been too expensive for me.
```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    predict_with_generate=True,
    eval_strategy="epoch",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    fp16=True,
    output_dir="./",
    logging_steps=500,
    save_steps=5000,
    eval_steps=1000,
    num_train_epochs=2,
)
```
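For completeness, these arguments would typically be passed to a `Seq2SeqTrainer`. The sketch below is not taken from the original training script: `model`, `processor`, `train_dataset`, and `eval_dataset` are placeholders for objects assumed to be built earlier in the pipeline.

```python
from transformers import Seq2SeqTrainer, default_data_collator

trainer = Seq2SeqTrainer(
    model=model,                    # VisionEncoderDecoderModel (assumed)
    tokenizer=processor.tokenizer,  # TrOCRProcessor's tokenizer (assumed)
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=default_data_collator,
)
trainer.train()
```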
Results:
- CER: 0.082
- WER: 0.184
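These scores can be computed with the Hugging Face `evaluate` library; `predictions` and `references` below are placeholders for lists of decoded hypothesis and ground-truth strings.

```python
import evaluate

cer_metric = evaluate.load("cer")
wer_metric = evaluate.load("wer")

# predictions / references: lists of decoded strings (placeholders).
cer = cer_metric.compute(predictions=predictions, references=references)
wer = wer_metric.compute(predictions=predictions, references=references)
print(f"CER: {cer:.3f}  WER: {wer:.3f}")
```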
Model card metadata:

---
license: mit
datasets:
  - staghado/IMGUR-dataset
language:
  - en
metrics:
  - cer
base_model:
  - microsoft/trocr-base-handwritten
---