Kansallisarkisto/censusrecords-table-detection

Object Detection

MikkoLipsanen committed 9 days ago

Commit c5f8542 (1 parent: 4469b41)

Create README.md

README.md +93 -0

	@@ -0,0 +1,93 @@

+---
+base_model:
+- Ultralytics/YOLOv8
+---
+## Table column and row line intersection detection in Finnish census records from the 1930s
+The model is trained to find the intersection points of table column and row lines in digitized census record documents
+from the 1930s. The model was trained using YOLOv8x by Ultralytics as the base model.
+## Intended uses & limitations
+<img src='census_intersection_example.jpg' width='500'>
+The model has been trained to detect intersection points in one specific kind of table and is unlikely to generalize well
+to substantially different table layouts.
+## Training data
+The training dataset consisted of 218 digitized and annotated documents containing tables, while the validation
+dataset contained 25 annotated document images.
+## Training procedure
+This model was trained on two NVIDIA RTX A6000 GPUs with the following hyperparameters:
+- image size: 2560
+- initial learning rate (lr0): 0.00098
+- final learning rate (lrf): 0.01285
+- maximum number of detections per image (max_det): 500
+- train batch size: 2
+- epochs: 100
+- patience: 30 epochs
+- warmup_epochs: 3.91327
+- optimizer: AdamW
+- workers: 4
+- momentum: 0.90725
+- warmup_momentum: 0.72051
+- weight_decay: 0.00061
+- box loss weight (box): 9.34214
+- classification loss weight (cls): 0.34133
+- distribution focal loss weight (dfl): 1.83008
+- hue augment (hsv_h): 0.01126
+- saturation augment (hsv_s): 0.84221
+- brightness augment (hsv_v): 0.435
+- translation augment (translate): 0.11692
+- scale augment (scale): 0.45713
+- flip augment (fliplr): 0.38368
+- mosaic augment (mosaic): 0.77082
+Default settings were used for other training hyperparameters (find more information [here](https://docs.ultralytics.com/modes/train/#train-settings)).
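As a rough illustration of how `lr0` and `lrf` interact, the sketch below assumes the default linear schedule in Ultralytics, where `lrf` acts as a fraction of `lr0` rather than as an absolute rate. It ignores the warmup phase, and the helper name `lr_at` is hypothetical.

```python
# Values taken from the hyperparameter list above.
lr0 = 0.00098      # initial learning rate
lrf = 0.01285      # final learning rate factor (assumed to be a fraction of lr0)
epochs = 100


def lr_at(epoch: int) -> float:
    """Learning rate under an assumed linear decay from lr0 to lr0 * lrf (warmup ignored)."""
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor


final_lr = lr0 * lrf
print(f"start: {lr_at(0):.6g}, midpoint: {lr_at(50):.6g}, end: {lr_at(epochs):.6g}")
```

Under these assumptions, the rate at the last epoch is roughly 1.26e-05.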
+Model training was performed using the following code:
+```
+from ultralytics import YOLO
+# Use the pretrained YOLOv8x detection model
+model = YOLO('yolov8x.pt')
+# Path to .yaml file where data location and object classes are defined
+yaml_path = 'intersections.yaml'
+# Start model training with the defined parameters
+model.train(data=yaml_path, name='model_name', epochs=100, imgsz=2560, max_det=500, workers=4, optimizer='AdamW',
+            lr0=0.00098, lrf=0.01285, momentum=0.90725, weight_decay=0.00061, warmup_epochs=3.91327, warmup_momentum=0.72051,
+            box=9.34214, cls=0.34133, dfl=1.83008, hsv_h=0.01126, hsv_s=0.84221, hsv_v=0.435, translate=0.11692,
+            scale=0.45713, fliplr=0.38368, mosaic=0.77082, seed=42, val=True, patience=30, batch=2, device='0,1')
+```
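The contents of `intersections.yaml` are not shown in this README. The sketch below builds a dataset file in the layout that Ultralytics expects (`path`, `train`, `val`, `names`). The directory names and the class name `intersection` are assumptions made for illustration, not the values used for this model.

```python
def build_dataset_yaml(root, train_dir, val_dir, class_names):
    """Return the text of an Ultralytics-style dataset YAML file."""
    lines = [
        f"path: {root}",       # dataset root directory
        f"train: {train_dir}", # training images, relative to root
        f"val: {val_dir}",     # validation images, relative to root
        "names:",
    ]
    # Class ids are assigned in list order, starting from 0
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Hypothetical paths and class name
yaml_text = build_dataset_yaml(
    "datasets/intersections", "images/train", "images/val", ["intersection"]
)
print(yaml_text)
```

Saving `yaml_text` as `intersections.yaml` would give a file that can be passed to `model.train(data=...)`.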
+## Evaluation results
+Evaluation results using the validation dataset are listed below:
+|Class|Images|Class instances|Box precision|Box recall|Box mAP50|Box mAP50-95|Mask precision|Mask recall|Mask mAP50|Mask mAP50-95|
+|-|-|-|-|-|-|-|-|-|-|-|
+|Text line|574|43156|0.912|0.888|0.949|0.701|0.935|0.907|0.954|0.55|
+More information on the performance metrics can be found [here](https://docs.ultralytics.com/guides/yolo-performance-metrics/).
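Precision and recall can be combined into a single F1 score, the harmonic mean of the two. The README does not report F1. The short calculation below derives it from the box precision and recall in the table.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Box precision and recall from the evaluation table
box_f1 = f1_score(0.912, 0.888)
print(f"Box F1: {box_f1:.3f}")
```

With the reported values this gives a box F1 of about 0.900.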
+## Inference
+If the model file `tuomiokirja_lines_05122023.pt` has been downloaded to `\models\tuomiokirja_lines_05122023.pt`
+and the input image path is `\data\image.jpg`, inference can be performed using the following code:
+```
+from ultralytics import YOLO
+# Initialize model
+# Raw strings keep the backslashes in the paths literal (otherwise '\t' would be read as a tab)
+model = YOLO(r'\models\tuomiokirja_lines_05122023.pt')
+prediction_results = model.predict(source=r'\data\image.jpg', save=True)
+```
+More information on the available inference arguments can be found [here](https://docs.ultralytics.com/modes/predict/#inference-arguments).
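The model outputs bounding boxes, so a downstream step is needed to turn each detection into a single intersection point. One simple option, sketched below, is to take the centre of each box. This is an assumption about how the detections might be used, not a documented part of the model, and the helper `box_centers` is hypothetical.

```python
def box_centers(boxes_xyxy):
    """Return (x, y) centre points for boxes given as (x1, y1, x2, y2)."""
    return [((x1 + x2) / 2.0, (y1 + y2) / 2.0) for x1, y1, x2, y2 in boxes_xyxy]


# With the prediction results from the inference snippet above, the boxes
# could be read from the Ultralytics results object like this:
#   boxes = prediction_results[0].boxes.xyxy.tolist()
#   points = box_centers(boxes)

# Stand-alone example with made-up box coordinates
example_boxes = [(10, 20, 30, 40), (100, 100, 110, 120)]
points = box_centers(example_boxes)
print(points)  # [(20.0, 30.0), (105.0, 110.0)]
```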