---
license: apache-2.0
tags:
- segmentation
- remove background
- background
- background-removal
- pytorch
pretty_name: Open Remove Background Model
datasets:
- schirrmacher/humans
---
# Open Remove Background Model (ormbg)
[>>> DEMO <<<](https://huggingface.co/spaces/schirrmacher/ormbg)
Join our [Research Discord Group](https://discord.gg/YYZ3D66t)!
![](examples/image/image01_no_background.png)
This model is a **fully open-source background remover** optimized for images with humans. It is based on the [Highly Accurate Dichotomous Image Segmentation](https://github.com/xuebinqin/DIS) research. The model was trained on the synthetic [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans) as well as on [P3M-10k](https://paperswithcode.com/dataset/p3m-10k) and [AIM-500](https://paperswithcode.com/dataset/aim-500).
This model is similar to [RMBG-1.4](https://huggingface.co/briaai/RMBG-1.4), but with an open training dataset and training process, and it is free for commercial use.
## Inference
```
python ormbg/inference.py
```
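For orientation, here is a minimal sketch of what DIS-style background-removal inference typically involves: predict a matte, then attach it as an alpha channel. The `remove_background` helper and the assumption that the network returns a single-channel matte (possibly wrapped in a list of side outputs, as IS-Net does) are illustrative, not the repository's actual code:
```python
import numpy as np
import torch
import torch.nn.functional as F
from PIL import Image

@torch.no_grad()
def remove_background(model, image_path, size=1024, device="cpu"):
    image = Image.open(image_path).convert("RGB")
    w, h = image.size
    # Normalize to [0, 1], NCHW, and resize to the model's input resolution.
    x = torch.from_numpy(np.array(image)).permute(2, 0, 1).float() / 255.0
    x = F.interpolate(x.unsqueeze(0), (size, size), mode="bilinear")
    model = model.to(device).eval()
    pred = model(x.to(device))
    while isinstance(pred, (list, tuple)):  # DIS-style nets return side outputs
        pred = pred[0]
    # Rescale the predicted matte back to the original resolution.
    alpha = F.interpolate(pred, (h, w), mode="bilinear").clamp(0, 1)
    matte = Image.fromarray((alpha.squeeze().cpu().numpy() * 255).astype(np.uint8))
    rgba = image.convert("RGBA")
    rgba.putalpha(matte)  # transparent background where the matte is 0
    return rgba
```
With a loaded checkpoint, `remove_background(model, "example.jpg").save("example_no_background.png")` would then produce a transparent PNG.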
## Training
Install dependencies:
```
conda env create -f environment.yaml
conda activate ormbg
```
Replace the dummy dataset with the [training dataset](https://huggingface.co/datasets/schirrmacher/humans).
```
python3 ormbg/train_model.py
```
# Research
I initially trained the model on synthetic images from the [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans), created with [LayerDiffuse](https://github.com/layerdiffusion/LayerDiffuse). However, the model struggled to perform well on real images.
Synthetic datasets alone are not enough for strong segmentation results, because differences in lighting, occlusion, scale, and backgrounds create a gap between synthetic and real images. A "model trained solely on synthetic data generated with naïve domain randomization struggles to generalize on the real domain", see [PEOPLESANSPEOPLE: A Synthetic Data Generator for Human-Centric Computer Vision (2022)](https://arxiv.org/pdf/2112.09290).
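One way to narrow this gap, reflected in the changes listed below, is to mix real annotated images into training. Here is a minimal PyTorch sketch of combining multiple sources into one training set; the `ImageMaskDataset` class and the directory layout are assumptions for illustration, not the repository's actual data pipeline:
```python
import os
from PIL import Image
from torch.utils.data import ConcatDataset, DataLoader, Dataset

class ImageMaskDataset(Dataset):
    """Generic (image, mask) dataset; the directory layout is an assumption."""
    def __init__(self, root):
        self.root = root
        self.names = sorted(os.listdir(os.path.join(root, "images")))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, i):
        name = self.names[i]
        image = Image.open(os.path.join(self.root, "images", name)).convert("RGB")
        mask = Image.open(os.path.join(self.root, "masks", name)).convert("L")
        return image, mask

# Drawing batches from synthetic and real sources alike narrows the domain gap.
train_set = ConcatDataset([
    ImageMaskDataset("data/humans"),   # synthetic (LayerDiffuse)
    ImageMaskDataset("data/p3m-10k"),  # real portraits
    ImageMaskDataset("data/aim-500"),  # real natural images
])
loader = DataLoader(train_set, batch_size=8, shuffle=True, num_workers=4)
```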
Latest changes (05/07/2024):
- Added [P3M-10K](https://paperswithcode.com/dataset/p3m-10k) dataset for training and validation
- Added [AIM-500](https://paperswithcode.com/dataset/aim-500) dataset for training and validation
- Applied [Grid Dropout](https://albumentations.ai/docs/api_reference/augmentations/dropout/grid_dropout/) to make the model more robust to occlusions (see the sketch below)
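Grid Dropout removes a regular grid of square regions from the input image. Dropping cells from the image while leaving the mask intact forces the model to infer occluded regions instead of relying on local cues. A minimal sketch with [Albumentations](https://albumentations.ai/); the file names and parameters here are illustrative, not the actual training configuration:
```python
import albumentations as A
import cv2

# GridDropout leaves the mask unchanged by default (mask_fill_value=None),
# so the model must still predict the full matte for the occluded regions.
augment = A.Compose([
    A.GridDropout(ratio=0.3, random_offset=True, p=0.5),
    A.HorizontalFlip(p=0.5),
])

image = cv2.cvtColor(cv2.imread("person.jpg"), cv2.COLOR_BGR2RGB)
mask = cv2.imread("person_mask.png", cv2.IMREAD_GRAYSCALE)
augmented = augment(image=image, mask=mask)
aug_image, aug_mask = augmented["image"], augmented["mask"]
```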
Next steps:
- Expand the dataset with additional synthetic and real images
- Research multi-step segmentation/matting by incorporating [ViTMatte](https://github.com/hustvl/ViTMatte)