schirrmacher committed on
Commit c35fa1f
1 Parent(s): 31d06fa

Upload folder using huggingface_hub

Files changed (5)
  1. .DS_Store +0 -0
  2. README.md +10 -55
  3. models/.DS_Store +0 -0
  4. models/ormbg.pth +2 -2
  5. utils/pth_to_onnx.py +2 -2
.DS_Store ADDED
Binary file (6.15 kB)
 
README.md CHANGED
@@ -15,11 +15,9 @@ datasets:

[>>> DEMO <<<](https://huggingface.co/spaces/schirrmacher/ormbg)

- Join our [Research Discord Group](https://discord.gg/YYZ3D66t)!
-
![](examples.jpg)

- This model is a **fully open-source background remover** optimized for images with humans. It is based on [Highly Accurate Dichotomous Image Segmentation research](https://github.com/xuebinqin/DIS). The model was trained with the synthetic [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans), a dataset crafted with [LayerDiffuse](https://github.com/layerdiffusion/LayerDiffuse) and [IC-Light](https://github.com/lllyasviel/IC-Light).
+ This model is a **fully open-source background remover** optimized for images with humans. It is based on [Highly Accurate Dichotomous Image Segmentation research](https://github.com/xuebinqin/DIS). The model was trained with the synthetic [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans), [P3M-10k](https://paperswithcode.com/dataset/p3m-10k) and [AIM-500](https://paperswithcode.com/dataset/aim-500).

![](explanation.jpg)

@@ -31,62 +29,19 @@ This model is similar to [RMBG-1.4](https://huggingface.co/briaai/RMBG-1.4), but
python utils/inference.py
```

- ## Training
-
- The model was trained on a NVIDIA GeForce RTX 4090 (10.000 iterations) with the [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans) which was created with [LayerDiffuse](https://github.com/layerdiffusion/LayerDiffuse) and [IC-Light](https://github.com/lllyasviel/IC-Light).
-
- ## Want to train your own model?
-
- Checkout _Highly Accurate Dichotomous Image Segmentation_ code:
-
- ```
- git clone https://github.com/xuebinqin/DIS.git
- cd DIS
- ```
-
- Follow the installation instructions on https://github.com/xuebinqin/DIS?tab=readme-ov-file#1-clone-this-repo.
- Download or create some data ([like this](https://huggingface.co/datasets/schirrmacher/humans)) and place it into the DIS project folder.
-
- I am using the folder structure:
-
- - training/im (images)
- - training/gt (ground truth)
- - validation/im (images)
- - validation/gt (ground truth)
-
- Apply this git patch for setting the right paths and remove normalization of images:
-
- ```
- git apply dis-repo.patch
- ```
-
- Start training:
-
- ```
- cd IS-Net
- python train_valid_inference_main.py
- ```
-
- Export to ONNX (modify paths if needed):
-
- ```
- python utils/pth_to_onnx.py
- ```
-
# Research

- Synthetic datasets have limitations for achieving great segmentation results. This is because artificial lighting, occlusion, scale or backgrounds create a gap between synthetic and real images. A "model trained solely on synthetic data generated with naïve domain randomization struggles to generalize on the real domain", see [PEOPLESANSPEOPLE: A Synthetic Data Generator for Human-Centric Computer Vision (2022)](https://arxiv.org/pdf/2112.09290).
-
- Currently I am doing research how to close this gap. Latest research is about creating segmented humans with [LayerDiffuse](https://github.com/layerdiffusion/LayerDiffuse) and then apply [IC-Light](https://github.com/lllyasviel/IC-Light) for creating realistic light effects and shadows.
-
- ## Support
-
- This is the first iteration of the model, so there will be improvements!
-
- If you identify cases were the model fails, <a href='https://huggingface.co/schirrmacher/ormbg/discussions' target='_blank'>upload your examples</a>!
-
- Known issues (work in progress):
-
- - close-ups: from above, from below, profile, from side
- - minor issues with hair segmentation when hair creates loops
- - more various backgrounds needed
+ I started training the model with the synthetic [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans).
+
+ Synthetic datasets have limitations for achieving great segmentation results. This is because artificial lighting, occlusion, scale or backgrounds create a gap between synthetic and real images. A "model trained solely on synthetic data generated with naïve domain randomization struggles to generalize on the real domain", see [PEOPLESANSPEOPLE: A Synthetic Data Generator for Human-Centric Computer Vision (2022)](https://arxiv.org/pdf/2112.09290).
+
+ Latest changes (05/07/2024):
+
+ - Added [P3M-10K](https://paperswithcode.com/dataset/p3m-10k) dataset for training and validation
+ - Added [AIM-500](https://paperswithcode.com/dataset/aim-500) dataset for training and validation
+ - Applied [Grid Dropout](https://albumentations.ai/docs/api_reference/augmentations/dropout/grid_dropout/) to make the model more robust to occlusions
+
+ Next steps:
+
+ - Expand dataset
+ - Research on multi-step segmentation by incorporating [ViTMatte](https://github.com/hustvl/ViTMatte)
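The updated README lists Grid Dropout among the training changes. As a rough, hypothetical sketch only (the actual augmentation pipeline and parameter values are not part of this commit), applying Albumentations' `GridDropout` to an image and its ground-truth matte could look like this:

```python
# Hypothetical sketch: Grid Dropout on an image/mask pair with Albumentations.
# The ratio, probabilities, and file paths are illustrative guesses, not values
# taken from the ormbg training code.
import albumentations as A
import cv2

transform = A.Compose(
    [
        # Cut a regular grid of holes out of the image; by default the mask is
        # left untouched, so the model must recover person regions it cannot see.
        A.GridDropout(ratio=0.3, p=0.5),
        A.HorizontalFlip(p=0.5),
    ]
)

image = cv2.imread("training/im/example.png")                       # input image
mask = cv2.imread("training/gt/example.png", cv2.IMREAD_GRAYSCALE)  # ground-truth matte

augmented = transform(image=image, mask=mask)
aug_image, aug_mask = augmented["image"], augmented["mask"]
```

The example paths follow the training/im and training/gt layout described in the (now removed) training instructions above.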
 
models/.DS_Store ADDED
Binary file (6.15 kB)
 
models/ormbg.pth CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
- oid sha256:ea91a08b901277e040640859312c76048a6505cea56ecdcdd3ce6b1a27cfe8d3
- size 176717548
+ oid sha256:ba5817f4d73b494e60d077b4fa2c008c90ad1dc1eb5a7234a958fb0a699907c2
+ size 176720018
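This change swaps the Git LFS pointer for `models/ormbg.pth`, i.e. the actual weight file behind it. One way to fetch the current weights is through `huggingface_hub` (a sketch, assuming the repository id `schirrmacher/ormbg`):

```python
# Sketch: download the updated checkpoint via huggingface_hub instead of cloning
# the full repository with Git LFS. The repo id "schirrmacher/ormbg" is assumed.
from huggingface_hub import hf_hub_download

weights_path = hf_hub_download(repo_id="schirrmacher/ormbg", filename="models/ormbg.pth")
print(weights_path)  # local cache path of the ~177 MB .pth file
```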
utils/pth_to_onnx.py CHANGED
@@ -30,7 +30,7 @@ def export_to_onnx(model_path, onnx_path):
dummy_input,
onnx_path,
export_params=True,
- opset_version=10,
+ opset_version=11,
do_constant_folding=True,
input_names=["input"],
output_names=["output"],
@@ -50,7 +50,7 @@ if __name__ == "__main__":
parser.add_argument(
"--onnx_path",
type=str,
- default="./models/example.onnx",
+ default="./models/gpu_itr_28000_traLoss_0.102_traTarLoss_0.0105_valLoss_0.1293_valTarLoss_0.015_maxF1_0.9947_mae_0.0059_time_0.015454.pth",
help="The path where the ONNX model will be saved.",
)
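After this change `utils/pth_to_onnx.py` exports with opset 11 and names the tensors `input` and `output`. A minimal sanity check of an exported file with ONNX Runtime might look like the following; the file name and the 1024×1024 resolution are assumptions and should match whatever `dummy_input` the export script actually uses:

```python
# Sketch: run the exported ONNX graph once to confirm it loads and produces output.
# The file name and the 1x3x1024x1024 input shape are assumptions.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("models/example.onnx", providers=["CPUExecutionProvider"])
dummy = np.random.rand(1, 3, 1024, 1024).astype(np.float32)

outputs = session.run(None, {"input": dummy})  # "input" matches input_names in the export call
print([o.shape for o in outputs])              # expect mask-like tensors
```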