hezarai/crnn-fa-printed-96-long


arxyzan committed on Nov 27, 2023

Commit 820b26f • 1 Parent(s): e614896

Update README.md

Files changed (1)

README.md +8 -1


@@ -7,4 +7,11 @@ tags:
 - hezar
 - image-to-text
 pipeline_tag: image-to-text
----

+---
+A CRNN model for Persian OCR, built on a simple CNN + LSTM architecture inspired by [this paper](https://arxiv.org/abs/1507.05717). It is the successor to
+our previous model, [hezarai/crnn-base-fa-64x256](https://huggingface.co/hezarai/crnn-base-fa-64x256). The training dataset was almost 5 times larger, and the
+maximum supported output length has been increased from 32 to 48 characters. (The model can actually output 96 characters including blanks, but to tackle CTC decoding challenges, no samples
+longer than 48 characters were fed to the model during training.)
+Note that this model is optimized only for printed/scanned documents and supports up to roughly 50 characters. It can, however, be fine-tuned on other domains such as license plates or handwritten text.
+For an end-to-end OCR pipeline, first use a text detector model to extract text boxes, preferably at word level, and then run this model on each box.
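
The gap between 96 output positions and a 48-character limit can be illustrated with a minimal sketch of greedy CTC decoding. This is not the Hezar implementation. The blank index `0` and the helper names `ctc_greedy_decode` and `min_frames_needed` are assumptions made for illustration, and the README does not say that this is the exact reason for the 48-character cap. The sketch only shows that CTC needs a blank frame between two identical neighbouring characters, so 96 frames are always enough for any 48-character string but not for every longer one.

```python
BLANK = 0  # assumed index of the CTC blank token (hypothetical, not taken from the model config)


def ctc_greedy_decode(frame_ids, blank=BLANK):
    """Standard greedy CTC collapse: merge consecutive repeats, then drop blanks."""
    decoded = []
    previous = None
    for idx in frame_ids:
        # Emit a label only when it differs from the previous frame and is not blank.
        if idx != previous and idx != blank:
            decoded.append(idx)
        previous = idx
    return decoded


def min_frames_needed(label_ids):
    """Fewest frames that can encode label_ids under CTC.

    One frame per label, plus one separating blank for every pair of
    equal neighbouring labels.
    """
    repeats = sum(1 for a, b in zip(label_ids, label_ids[1:]) if a == b)
    return len(label_ids) + repeats


if __name__ == "__main__":
    # A blank between the two 1s keeps them as separate characters.
    print(ctc_greedy_decode([1, 1, 0, 1, 2, 2, 0, 0, 3]))
    # Worst case (one character repeated): 48 labels fit in 96 frames, 49 do not.
    print(min_frames_needed([5] * 48), min_frames_needed([5] * 49))
```

Under these assumptions, a string of 48 identical characters needs 95 frames, while 49 identical characters would need 97, which exceeds 96.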