MikkoLipsanen
committed on
Commit
•
c5f8542
1
Parent(s):
4469b41
Create README.md
Browse files
README.md
ADDED
@@ -0,0 +1,93 @@
---
base_model:
- Ultralytics/YOLOv8
---

## Text column and row line intersection detection in Finnish census records from the 1930s

The model is trained to find the intersection points of table column and cell lines in digitized census record documents
from the 1930s. It was trained using yolov8x by Ultralytics as the base model.

## Intended uses & limitations

<img src='census_intersection_example.jpg' width='500'>

The model has been trained to detect intersection points in a specific kind of table, and it probably generalizes poorly to
very different table types.

## Training data

The training dataset consisted of 218 digitized and annotated documents containing tables, while the validation
dataset contained 25 annotated document images.

## Training procedure

This model was trained using 2 NVIDIA RTX A6000 GPUs with the following hyperparameters:

- image size: 2560
- initial learning rate (lr0): 0.00098
- final learning rate as a fraction of lr0 (lrf): 0.01285
- maximum number of detections per image (max_det): 500
- train batch size: 2
- epochs: 100
- patience: 30 epochs
- warmup_epochs: 3.91327
- optimizer: AdamW
- workers: 4
- momentum: 0.90725
- warmup_momentum: 0.72051
- weight_decay: 0.00061
- box loss weight (box): 9.34214
- classification loss weight (cls): 0.34133
- distribution focal loss weight (dfl): 1.83008
- hue augment (hsv_h): 0.01126
- saturation augment (hsv_s): 0.84221
- brightness augment (hsv_v): 0.435
- translation augment (translate): 0.11692
- scale augment (scale): 0.45713
- flip augment (fliplr): 0.38368
- mosaic augment (mosaic): 0.77082

Default settings were used for the other training hyperparameters (more information is available [here](https://docs.ultralytics.com/modes/train/#train-settings)).
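
In Ultralytics, `lrf` is applied as a fraction of `lr0`, so the learning rate at the end of training is `lr0 * lrf` rather than `lrf` itself. The minimal sketch below shows the arithmetic for the values listed above. The linear decay function is an assumption based on the default (non-cosine) schedule of the Ultralytics trainer; it ignores warmup and may differ between library versions.

```python
# Values taken from the hyperparameter list above
lr0 = 0.00098   # initial learning rate
lrf = 0.01285   # final learning rate, as a fraction of lr0
epochs = 100

# Learning rate reached at the end of training
final_lr = lr0 * lrf


def linear_lr(epoch: int) -> float:
    """Assumed linear decay from lr0 to lr0 * lrf (warmup ignored)."""
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor


print(f"final learning rate: {final_lr:.3e}")
print(f"learning rate at epoch 50: {linear_lr(50):.3e}")
```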

Model training was performed using the following code:

```python
from ultralytics import YOLO

# Load the pretrained YOLOv8x detection model
model = YOLO('yolov8x.pt')

# Path to the .yaml file where the data location and object classes are defined
yaml_path = 'intersections.yaml'

# Start model training with the defined parameters
model.train(data=yaml_path, name='model_name', epochs=100, imgsz=2560, max_det=500, workers=4, optimizer='AdamW',
            lr0=0.00098, lrf=0.01285, momentum=0.90725, weight_decay=0.00061, warmup_epochs=3.91327, warmup_momentum=0.72051,
            box=9.34214, cls=0.34133, dfl=1.83008, hsv_h=0.01126, hsv_s=0.84221, hsv_v=0.435, translate=0.11692,
            scale=0.45713, fliplr=0.38368, mosaic=0.77082, seed=42, val=True, patience=30, batch=2, device='0,1')
```
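
The contents of `intersections.yaml` are not part of this README. As a rough illustration, the sketch below writes a file in the Ultralytics detection dataset format. The directory paths and the class name `intersection` are hypothetical placeholders, not the values used for this model.

```python
from pathlib import Path

# Hypothetical dataset layout and class name; adjust to the actual data
dataset_yaml = """\
path: datasets/intersections  # dataset root directory (assumed)
train: images/train           # training images, relative to 'path'
val: images/val               # validation images, relative to 'path'
names:
  0: intersection             # single object class (assumed name)
"""

yaml_file = Path("intersections.yaml")
yaml_file.write_text(dataset_yaml, encoding="utf-8")
print(yaml_file.read_text(encoding="utf-8"))
```

The resulting file path can then be passed to `model.train` through the `data` argument, as in the training code above.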

## Evaluation results

Evaluation results using the validation dataset are listed below:

|Class|Images|Class instances|Box precision|Box recall|Box mAP50|Box mAP50-95|Mask precision|Mask recall|Mask mAP50|Mask mAP50-95|
|-|-|-|-|-|-|-|-|-|-|-|
|Text line|574|43156|0.912|0.888|0.949|0.701|0.935|0.907|0.954|0.55|

More information on the performance metrics can be found [here](https://docs.ultralytics.com/guides/yolo-performance-metrics/).

## Inference

If the model file `tuomiokirja_lines_05122023.pt` has been downloaded to the path `\models\tuomiokirja_lines_05122023.pt`
and the input image path is `\data\image.jpg`, inference can be performed using the following code:

```python
from ultralytics import YOLO

# Initialize the model from the downloaded weights (raw strings keep the backslashes literal)
model = YOLO(r'\models\tuomiokirja_lines_05122023.pt')

# Run inference; save=True writes the annotated image to disk
prediction_results = model.predict(source=r'\data\image.jpg', save=True)
```

More information on the available inference arguments can be found [here](https://docs.ultralytics.com/modes/predict/#inference-arguments).
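
The `predict` call returns a list of `Results` objects, one per input image, and each detection is a bounding box. The README does not say how a single intersection coordinate is derived from a box. The sketch below assumes the box center is a reasonable approximation; the helper `box_centers` and the sample boxes are hypothetical.

```python
from typing import List, Tuple


def box_centers(boxes_xyxy: List[List[float]]) -> List[Tuple[float, float]]:
    """Return the (x, y) center of each [x1, y1, x2, y2] box."""
    return [((x1 + x2) / 2, (y1 + y2) / 2) for x1, y1, x2, y2 in boxes_xyxy]


# With Ultralytics results, the boxes of the first image could be read with:
#   boxes = prediction_results[0].boxes.xyxy.tolist()
# Hypothetical boxes are used here so the example runs on its own.
boxes = [[10.0, 20.0, 30.0, 40.0], [100.0, 50.0, 120.0, 90.0]]
points = box_centers(boxes)
print(points)
```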