MikkoLipsanen committed on
Commit
c5f8542
1 Parent(s): 4469b41

Create README.md

  1. README.md +93 -0
README.md ADDED
@@ -0,0 +1,93 @@
1
+ ---
2
+ base_model:
3
+ - Ultralytics/YOLOv8
4
+ ---
5
+
6
+ ## Text column and row line intersection detection from Finnish census records from the 1930s
7
+
8
+ The model is trained to find the intersection points of table column and row lines in digitized census record documents
9
+ from the 1930s. It was trained using YOLOv8x by Ultralytics as the base model.
10
+
11
+
12
+ ## Intended uses & limitations
13
+
14
+ <img src='census_intersection_example.jpg' width='500'>
15
+
16
+ The model has been trained to detect intersection points in one specific type of table and is likely to generalize poorly to
17
+ substantially different table layouts.
18
+
19
+ ## Training data
20
+
21
+ The training dataset consisted of 218 digitized and annotated documents containing tables, while the validation
22
+ dataset contained 25 annotated document images.
23
+
24
+ ## Training procedure
25
+
26
+ This model was trained using 2 NVIDIA RTX A6000 GPUs with the following hyperparameters:
27
+
28
+ - image size: 2560
29
+ - initial learning rate (lr0): 0.00098
30
+ - final learning rate factor (lrf): 0.01285 (final learning rate = lr0 * lrf)
31
+ - maximum number of detections per image (max_det): 500
32
+ - train batch size: 2
33
+ - epochs: 100
34
+ - patience: 30 epochs
35
+ - warmup_epochs: 3.91327
36
+ - optimizer: AdamW
37
+ - workers: 4
38
+ - momentum: 0.90725
39
+ - warmup_momentum: 0.72051
40
+ - weight_decay: 0.00061
41
+ - box loss weight (box): 9.34214
42
+ - classification loss weight (cls): 0.34133
43
+ - distribution focal loss weight (dfl): 1.83008
44
+ - hue augment (hsv_h): 0.01126
45
+ - saturation augment (hsv_s): 0.84221
46
+ - brightness augment (hsv_v): 0.435
47
+ - translation augment (translate): 0.11692
48
+ - scale augment (scale): 0.45713
49
+ - horizontal flip probability (fliplr): 0.38368
50
+ - mosaic augment (mosaic): 0.77082
51
+
52
+ Default settings were used for other training hyperparameters (find more information [here](https://docs.ultralytics.com/modes/train/#train-settings)).
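In Ultralytics, `lrf` is expressed as a fraction of `lr0`, so the learning rate at the end of training is `lr0 * lrf`. The following is a minimal sketch of how the rate would evolve over the 100 epochs, assuming the linear decay schedule that applies when cosine scheduling (`cos_lr`) is not enabled, and ignoring warmup and early stopping. The helper `linear_lr` is a hypothetical name used only for illustration.

```python
# Hyperparameters taken from the list above
lr0 = 0.00098   # initial learning rate
lrf = 0.01285   # final learning rate factor
epochs = 100


def linear_lr(epoch: int) -> float:
    """Learning rate at a given epoch, assuming linear decay and no warmup."""
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor


final_lr = lr0 * lrf
print(f"initial: {linear_lr(0):.3e}, final: {final_lr:.3e}")
```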
53
+
54
+ Model training was performed using the following code:
55
+
56
+ ```python
57
+ from ultralytics import YOLO
58
+
59
+ # Use the pretrained YOLOv8x detection model as the starting point
60
+ model = YOLO('yolov8x.pt')
61
+
62
+ # Path to .yaml file where data location and object classes are defined
63
+ yaml_path = 'intersections.yaml'
64
+
65
+ # Start model training with the defined parameters
66
+ model.train(data=yaml_path, name='model_name', epochs=100, imgsz=2560, max_det=500, workers=4, optimizer='AdamW',
67
+ lr0=0.00098, lrf=0.01285, momentum=0.90725, weight_decay=0.00061, warmup_epochs=3.91327, warmup_momentum=0.72051,
68
+ box=9.34214, cls=0.34133, dfl=1.83008, hsv_h=0.01126, hsv_s=0.84221, hsv_v=0.435, translate=0.11692,
69
+ scale=0.45713, fliplr=0.38368, mosaic=0.77082, seed=42, val=True, patience=30, batch=2, device='0,1')
70
+ ```
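The contents of `intersections.yaml` are not shown in this README. As a rough sketch, an Ultralytics dataset file defines a dataset root, the train and validation image folders, and the class names. All paths and the class name `intersection` below are placeholders chosen for illustration, not the values used by the authors.

```python
from pathlib import Path

# All paths and the class name below are placeholders, not the authors' actual values
dataset_yaml = """\
path: datasets/intersections  # dataset root directory
train: images/train           # training images, relative to 'path'
val: images/val               # validation images, relative to 'path'
names:
  0: intersection             # single object class
"""

# Write the file that is passed to model.train(data=...)
Path("intersections.yaml").write_text(dataset_yaml, encoding="utf-8")
```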
71
+
72
+ ## Evaluation results
73
+
74
+ Evaluation results using the validation dataset are listed below:
75
+ |Class|Images|Class instances|Box precision|Box recall|Box mAP50|Box mAP50-95|Mask precision|Mask recall|Mask mAP50|Mask mAP50-95|
76
+ |-|-|-|-|-|-|-|-|-|-|-|
77
+ |Text line|574|43156|0.912|0.888|0.949|0.701|0.935|0.907|0.954|0.55|
78
+
79
+ More information on the performance metrics can be found [here](https://docs.ultralytics.com/guides/yolo-performance-metrics/).
80
+
81
+ ## Inference
82
+
83
+ If the model file `tuomiokirja_lines_05122023.pt` has been downloaded to `\models\tuomiokirja_lines_05122023.pt`
84
+ and the input image path is `\data\image.jpg`, inference can be performed using the following code:
85
+
86
+ ```python
87
+ from ultralytics import YOLO
88
+
89
+ # Initialize model (raw strings keep the backslashes in the Windows-style paths literal)
90
+ model = YOLO(r'\models\tuomiokirja_lines_05122023.pt')
91
+ prediction_results = model.predict(source=r'\data\image.jpg', save=True)
92
+ ```
93
+ More information on the available inference arguments can be found [here](https://docs.ultralytics.com/modes/predict/#inference-arguments).
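Because the model outputs bounding boxes, one possible way to turn each detection into a single intersection coordinate is to take the center of its box. This is an assumption about post-processing, not something described in this README, and `box_centers` is a hypothetical helper. A minimal sketch:

```python
def box_centers(boxes_xyxy):
    """Return the (x, y) center of each [x1, y1, x2, y2] box."""
    return [((x1 + x2) / 2, (y1 + y2) / 2) for x1, y1, x2, y2 in boxes_xyxy]


# With a real prediction, the boxes could be obtained as:
#   boxes = prediction_results[0].boxes.xyxy.tolist()
boxes = [[10.0, 20.0, 30.0, 40.0], [100.0, 50.0, 120.0, 90.0]]  # made-up example boxes
print(box_centers(boxes))
```

The resulting points are in pixel coordinates of the input image.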