schirrmacher committed
Commit: c35fa1f
Parent(s): 31d06fa

Upload folder using huggingface_hub
Browse files

- .DS_Store +0 -0
- README.md +10 -55
- models/.DS_Store +0 -0
- models/ormbg.pth +2 -2
- utils/pth_to_onnx.py +2 -2
.DS_Store
ADDED
Binary file (6.15 kB)
README.md
CHANGED

@@ -15,11 +15,9 @@ datasets:
 
 [>>> DEMO <<<](https://huggingface.co/spaces/schirrmacher/ormbg)
 
-Join our [Research Discord Group](https://discord.gg/YYZ3D66t)!
-
 ![](examples.jpg)
 
-This model is a **fully open-source background remover** optimized for images with humans. It is based on [Highly Accurate Dichotomous Image Segmentation research](https://github.com/xuebinqin/DIS). The model was trained with the synthetic [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans),
+This model is a **fully open-source background remover** optimized for images with humans. It is based on [Highly Accurate Dichotomous Image Segmentation research](https://github.com/xuebinqin/DIS). The model was trained with the synthetic [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans), [P3M-10k](https://paperswithcode.com/dataset/p3m-10k) and [AIM-500](https://paperswithcode.com/dataset/aim-500).
 
 ![](explanation.jpg)
 
@@ -31,62 +29,19 @@ This model is similar to [RMBG-1.4](https://huggingface.co/briaai/RMBG-1.4), but
 python utils/inference.py
 ```
 
-## Training
-
-The model was trained on an NVIDIA GeForce RTX 4090 (10,000 iterations) with the [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans), which was created with [LayerDiffuse](https://github.com/layerdiffusion/LayerDiffuse) and [IC-Light](https://github.com/lllyasviel/IC-Light).
-
-## Want to train your own model?
-
-Check out the _Highly Accurate Dichotomous Image Segmentation_ code:
-
-```
-git clone https://github.com/xuebinqin/DIS.git
-cd DIS
-```
-
-Follow the installation instructions at https://github.com/xuebinqin/DIS?tab=readme-ov-file#1-clone-this-repo.
-Download or create some data ([like this](https://huggingface.co/datasets/schirrmacher/humans)) and place it in the DIS project folder.
-
-I am using the folder structure:
-
-- training/im (images)
-- training/gt (ground truth)
-- validation/im (images)
-- validation/gt (ground truth)
-
-Apply this git patch to set the right paths and remove image normalization:
-
-```
-git apply dis-repo.patch
-```
-
-Start training:
-
-```
-cd IS-Net
-python train_valid_inference_main.py
-```
-
-Export to ONNX (modify paths if needed):
-
-```
-python utils/pth_to_onnx.py
-```
-
 # Research
 
-Currently I am doing research on how to close this gap. The latest research is about creating segmented humans with [LayerDiffuse](https://github.com/layerdiffusion/LayerDiffuse) and then applying [IC-Light](https://github.com/lllyasviel/IC-Light) to create realistic light effects and shadows.
-
-- more various backgrounds needed
+I started training the model with the synthetic [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans).
+
+Synthetic datasets have limitations for achieving great segmentation results, because artificial lighting, occlusion, scale, and backgrounds create a gap between synthetic and real images. A "model trained solely on synthetic data generated with naïve domain randomization struggles to generalize on the real domain"; see [PEOPLESANSPEOPLE: A Synthetic Data Generator for Human-Centric Computer Vision (2022)](https://arxiv.org/pdf/2112.09290).
+
+Latest changes (05/07/2024):
+
+- Added [P3M-10K](https://paperswithcode.com/dataset/p3m-10k) dataset for training and validation
+- Added [AIM-500](https://paperswithcode.com/dataset/aim-500) dataset for training and validation
+- Applied [Grid Dropout](https://albumentations.ai/docs/api_reference/augmentations/dropout/grid_dropout/) to make the model more robust to occlusions (see the sketch below)
+
+Next steps:
+
+- Expand dataset
+- Research on multi-step segmentation by incorporating [ViTMatte](https://github.com/hustvl/ViTMatte)
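The Grid Dropout augmentation referenced above comes from albumentations, the library the README links. A minimal sketch of how it could be wired into a segmentation pipeline, using the training/im and training/gt layout from the README; the ratio, probability, and file names are illustrative assumptions, not the values used to train ormbg:

```python
# Illustrative sketch: grid dropout on a human-segmentation training sample.
# Values (ratio, p) and file names are assumptions, not the ormbg settings.
import albumentations as A
import cv2

transform = A.Compose(
    [
        # Cuts a regular grid of holes out of the image. With
        # mask_fill_value=None the ground-truth mask is left untouched,
        # so the model must infer the occluded regions from context.
        A.GridDropout(ratio=0.3, random_offset=True, mask_fill_value=None, p=0.5),
    ]
)

image = cv2.imread("training/im/0001.png")    # input image (hypothetical name)
mask = cv2.imread("training/gt/0001.png", 0)  # ground-truth mask, grayscale
augmented = transform(image=image, mask=mask)
aug_image, aug_mask = augmented["image"], augmented["mask"]
```

Because the holes are punched only into the image while the mask keeps the full silhouette, the network is pushed to complete partially occluded humans rather than memorize local texture.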
models/.DS_Store
ADDED
Binary file (6.15 kB)
models/ormbg.pth
CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:ba5817f4d73b494e60d077b4fa2c008c90ad1dc1eb5a7234a958fb0a699907c2
+size 176720018
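The pointer file stores only the digest and byte size of the new weights. A small standalone sketch, not part of this repo, that checks a pulled models/ormbg.pth against the pointer values above:

```python
# Verify that downloaded weights match the LFS pointer of this commit.
import hashlib
from pathlib import Path

EXPECTED_OID = "ba5817f4d73b494e60d077b4fa2c008c90ad1dc1eb5a7234a958fb0a699907c2"
EXPECTED_SIZE = 176720018  # bytes, from the pointer file

path = Path("models/ormbg.pth")
# A tiny file here usually means the LFS pointer was checked out, not the weights.
assert path.stat().st_size == EXPECTED_SIZE, "size mismatch: run `git lfs pull`?"

sha = hashlib.sha256()
with path.open("rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
        sha.update(chunk)
assert sha.hexdigest() == EXPECTED_OID, "sha256 mismatch"
print("models/ormbg.pth matches the LFS pointer")
```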
utils/pth_to_onnx.py
CHANGED

@@ -30,7 +30,7 @@ def export_to_onnx(model_path, onnx_path):
         dummy_input,
         onnx_path,
         export_params=True,
-        opset_version=
+        opset_version=11,
         do_constant_folding=True,
         input_names=["input"],
         output_names=["output"],
@@ -50,7 +50,7 @@ if __name__ == "__main__":
     parser.add_argument(
         "--onnx_path",
         type=str,
-        default="./models/
+        default="./models/gpu_itr_28000_traLoss_0.102_traTarLoss_0.0105_valLoss_0.1293_valTarLoss_0.015_maxF1_0.9947_mae_0.0059_time_0.015454.pth",
         help="The path where the ONNX model will be saved.",
     )
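The arguments visible in this diff sit inside a torch.onnx.export call. A self-contained sketch of such an export; the ISNetDIS class, its import path, and the 1024x1024 input size are assumptions based on the DIS codebase, not shown in this diff:

```python
# Sketch of the ONNX export step; model class and input size are assumed.
import torch
from models.isnet import ISNetDIS  # assumed import path (DIS repo layout)

model = ISNetDIS()
# Assuming the checkpoint stores a plain state_dict.
model.load_state_dict(torch.load("models/ormbg.pth", map_location="cpu"))
model.eval()

dummy_input = torch.randn(1, 3, 1024, 1024)  # assumed input resolution
torch.onnx.export(
    model,
    dummy_input,
    "models/ormbg.onnx",
    export_params=True,
    opset_version=11,  # the value this commit fills in
    do_constant_folding=True,
    input_names=["input"],
    output_names=["output"],
)
```

Note that the script's `--onnx_path` default shown above points at a `.pth` checkpoint even though its help text describes an ONNX output path, so in practice an explicit `.onnx` path would be passed on the command line.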
|