title: Open Remove Background Model (ormbg)
license: apache-2.0
tags:
- segmentation
- remove background
- background
- background-removal
- Pytorch
pretty_name: Open Remove Background Model
models:
- schirrmacher/ormbg
datasets:
- schirrmacher/humans
emoji: 💻
colorFrom: red
colorTo: red
sdk: gradio
sdk_version: 4.29.0
app_file: hf_space/app.py
pinned: false
Open Remove Background Model (ormbg)
Join our Research Discord Group!
This model is a fully open-source background remover optimized for images of humans. It builds on the research from Highly Accurate Dichotomous Image Segmentation. The model was trained on the synthetic Human Segmentation Dataset, P3M-10k, PPM-100 and AIM-500.
This model is similar to RMBG-1.4, but its training data and training process are open, and it is free for commercial use.
Inference
python ormbg/inference.py
Training
Install dependencies:
conda env create -f environment.yaml
conda activate ormbg
Replace the dummy dataset with your training dataset, then start training:
python3 ormbg/train_model.py
Research
I initially trained the model on synthetic images from the Human Segmentation Dataset, crafted with LayerDiffuse. However, the model struggled to perform well on real images.
Synthetic data alone limits segmentation quality, because artificial lighting, occlusion, scale and backgrounds create a gap between synthetic and real images. A "model trained solely on synthetic data generated with naïve domain randomization struggles to generalize on the real domain", see PEOPLESANSPEOPLE: A Synthetic Data Generator for Human-Centric Computer Vision (2022).
Next steps:
- Expand the dataset with more synthetic and real images
- Research state-of-the-art loss functions
Latest changes (26/07/2024):
- Created synthetic dataset with 10k images, crafted with BlenderProc
- Removed training data created with LayerDiffuse, since it lacks the accuracy needed
- Improved model performance (after 100k iterations):
  - F1: 0.9888 -> 0.9932
  - MAE: 0.0113 -> 0.008
  - Scores based on this validation dataset
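For readers unfamiliar with the two scores, here is a minimal sketch of how MAE and F1 can be computed for a predicted mask against a ground-truth mask. It is not taken from this repository: the function names, the fixed 0.5 threshold and the toy masks are assumptions for illustration, and the project's own evaluation may binarize or average differently.

```python
import numpy as np


def mae(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute error between two masks with values in [0, 1]."""
    diff = pred.astype(np.float64) - target.astype(np.float64)
    return float(np.mean(np.abs(diff)))


def f1_score(pred: np.ndarray, target: np.ndarray, threshold: float = 0.5) -> float:
    """F1 of the foreground class after binarizing both masks.

    The 0.5 threshold is an assumption, not a value from the project.
    """
    p = pred >= threshold
    t = target >= threshold
    tp = np.logical_and(p, t).sum()    # foreground predicted as foreground
    fp = np.logical_and(p, ~t).sum()   # background predicted as foreground
    fn = np.logical_and(~p, t).sum()   # foreground predicted as background
    denom = 2 * tp + fp + fn
    # Convention chosen here: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else float(2 * tp / denom)


# Toy 2x2 masks (illustrative values, not from the validation dataset).
target = np.array([[1.0, 1.0], [0.0, 0.0]])
pred = np.array([[0.9, 0.4], [0.1, 0.0]])
print(mae(pred, target), f1_score(pred, target))
```

Lower MAE and higher F1 are better, which matches the direction of the changes listed above.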
Earlier changes (05/07/2024):
- Added P3M-10K dataset for training and validation
- Added AIM-500 dataset for training and validation
- Added PPM-100 dataset for training and validation
- Applied Grid Dropout augmentation during training to make the model more robust
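Grid Dropout masks out a regular grid of patches in each training image, so the model cannot rely on any single local region. The source does not say which implementation or parameters were used, so the following is only a minimal NumPy sketch: the function name `grid_dropout`, the parameters `unit_size`, `ratio` and `fill_value`, and their defaults are all hypothetical.

```python
import numpy as np


def grid_dropout(
    image: np.ndarray,
    unit_size: int = 32,
    ratio: float = 0.5,
    fill_value: float = 0.0,
) -> np.ndarray:
    """Return a copy of `image` with a square hole in every grid cell.

    The image is split into cells of `unit_size` x `unit_size` pixels and
    the top-left `unit_size * ratio` pixels of each cell are overwritten
    with `fill_value`. All parameter names and defaults are assumptions.
    """
    hole = int(unit_size * ratio)
    out = image.copy()  # leave the input untouched
    height, width = image.shape[:2]
    for top in range(0, height, unit_size):
        for left in range(0, width, unit_size):
            # Slices that run past the border are clipped by NumPy.
            out[top:top + hole, left:left + hole] = fill_value
    return out


# Toy example: a 4x4 image of ones, 2x2 cells, 1x1 hole per cell.
image = np.ones((4, 4))
augmented = grid_dropout(image, unit_size=2, ratio=0.5)
print(augmented)
```

In this sketch only the input image is altered. Whether the ground-truth mask should be dropped in the same places is a separate design choice that the source does not describe.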