---
license: apache-2.0
tags:
  - segmentation
  - remove background
  - background
  - background-removal
  - Pytorch
pretty_name: Open Remove Background Model
datasets:
  - schirrmacher/humans
---

# Open Remove Background Model (ormbg)

[>>> DEMO <<<](https://huggingface.co/spaces/schirrmacher/ormbg)

Join our [Research Discord Group](https://discord.gg/YYZ3D66t)!

![](examples/image/image01_no_background.png)

This model is a **fully open-source background remover** optimized for images with humans. It is based on [Highly Accurate Dichotomous Image Segmentation research](https://github.com/xuebinqin/DIS). The model was trained with the synthetic [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans), [P3M-10k](https://paperswithcode.com/dataset/p3m-10k) and [AIM-500](https://paperswithcode.com/dataset/aim-500).

This model is similar to [RMBG-1.4](https://huggingface.co/briaai/RMBG-1.4), but its training data and training process are open, and it is free for commercial use.

## Inference

```shell
python ormbg/inference.py
```
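
The script above is the repository's inference entry point. For context, here is a self-contained numpy sketch of the post-processing step a background remover like this typically performs: attaching the predicted alpha matte to the input image so background pixels become transparent. The `apply_alpha_matte` helper is illustrative and not part of this repository.

```python
import numpy as np

def apply_alpha_matte(image: np.ndarray, matte: np.ndarray) -> np.ndarray:
    """Attach a single-channel alpha matte (values in [0, 1]) to an RGB image,
    producing an RGBA array whose background pixels are transparent.
    Illustrative helper, not part of the ormbg codebase."""
    alpha = (np.clip(matte, 0.0, 1.0) * 255).astype(np.uint8)
    return np.dstack([image, alpha])

# Toy example: a 4x4 gray image whose left half is "foreground".
image = np.full((4, 4, 3), 200, dtype=np.uint8)
matte = np.zeros((4, 4), dtype=np.float32)
matte[:, :2] = 1.0

rgba = apply_alpha_matte(image, matte)
print(rgba.shape)     # (4, 4, 4)
print(rgba[0, 0, 3])  # 255 -> opaque foreground
print(rgba[0, 3, 3])  # 0   -> transparent background
```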

## Training

Install dependencies:

```shell
conda env create -f environment.yaml
conda activate ormbg
```

Replace the dummy dataset with the [training dataset](https://huggingface.co/datasets/schirrmacher/humans).

```shell
python3 ormbg/train_model.py
```

## Research

I started training the model with synthetic images of the [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans) crafted with [LayerDiffuse](https://github.com/layerdiffusion/LayerDiffuse). However, I noticed that the model struggles to perform well on real images.

Synthetic datasets have inherent limitations for segmentation: artificial lighting, occlusion, scale, and backgrounds create a domain gap between synthetic and real images. A "model trained solely on synthetic data generated with naïve domain randomization struggles to generalize on the real domain", see [PEOPLESANSPEOPLE: A Synthetic Data Generator for Human-Centric Computer Vision (2022)](https://arxiv.org/pdf/2112.09290).

Latest changes (05/07/2024):

- Added [P3M-10K](https://paperswithcode.com/dataset/p3m-10k) dataset for training and validation
- Added [AIM-500](https://paperswithcode.com/dataset/aim-500) dataset for training and validation
- Applied [Grid Dropout](https://albumentations.ai/docs/api_reference/augmentations/dropout/grid_dropout/) to improve robustness to occlusions
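
Grid dropout zeroes out a regular grid of rectangular regions so the model cannot rely on any single image area. A minimal numpy sketch of the idea (a simplified stand-in for the albumentations implementation; the `grid_dropout` helper and its parameters are illustrative):

```python
import numpy as np

def grid_dropout(image: np.ndarray, cell: int = 8, hole: int = 4) -> np.ndarray:
    """Zero out a hole x hole square in the top-left corner of every
    cell x cell tile. Simplified illustration of grid-dropout augmentation."""
    out = image.copy()
    h, w = out.shape[:2]
    for y in range(0, h, cell):
        for x in range(0, w, cell):
            out[y:y + hole, x:x + hole] = 0
    return out

rng = np.random.default_rng(0)
img = rng.integers(1, 255, size=(32, 32, 3), dtype=np.uint8)  # no zeros in input
dropped = grid_dropout(img)

# With cell=8 and hole=4, exactly a quarter of each tile is zeroed.
print((dropped == 0).mean())  # 0.25
```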

Next steps:

- Expand dataset with synthetic and real images
- Research on multi-step segmentation/matting by incorporating [ViTMatte](https://github.com/hustvl/ViTMatte)