Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models
The paper is available as an arXiv preprint.
The code is available on GitHub.
Our robust unlearning framework, AdvUnlearn, enhances the safety of diffusion models by robustly erasing unwanted concepts through adversarial training, striking a balance between robust concept erasure and image generation quality.
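At a high level, the training alternates between an inner prompt attack that tries to re-elicit the target concept and an outer unlearning update that erases the concept under that attack while preserving utility on benign prompts. The snippet below is a minimal sketch of this bilevel recipe, not the released implementation: the toy `embed`, `encoder`, and `denoiser` modules, the embedding-space PGD surrogate for the paper's discrete prompt attack, the simplified ESD-style erasure target, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch of adversarial-training-based concept erasure; NOT the released
# AdvUnlearn implementation. All modules, shapes, and hyperparameters below are
# illustrative placeholders, and the discrete prompt attack used in the paper is
# replaced by an embedding-space PGD surrogate for brevity.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
vocab, dim = 100, 32

embed   = nn.Embedding(vocab, dim)      # toy token embedding (kept fixed)
encoder = nn.Linear(dim, dim)           # toy trainable text-encoder body
encoder_orig = copy.deepcopy(encoder)   # frozen copy, used for utility retention
denoiser = nn.Linear(2 * dim, dim)      # toy frozen noise-predictor stub
for m in (embed, encoder_orig, denoiser):
    for p in m.parameters():
        p.requires_grad_(False)

def eps_pred(x_t, cond):
    """Toy noise prediction conditioned on mean-pooled text features."""
    return denoiser(torch.cat([x_t, cond.mean(dim=1)], dim=-1))

erase_ids  = torch.randint(0, vocab, (1, 8))       # prompt naming the concept to erase
retain_ids = torch.randint(0, vocab, (4, 8))       # benign prompts for retention
empty_ids  = torch.zeros(1, 8, dtype=torch.long)   # stand-in for the empty prompt

opt, gamma = torch.optim.Adam(encoder.parameters(), lr=1e-4), 1.0

for step in range(100):
    x_t, noise = torch.randn(1, dim), torch.randn(1, dim)
    base = embed(erase_ids)

    # Inner step (attack): find a small prompt-embedding perturbation that still
    # elicits the concept, approximated here by minimizing a denoising loss.
    delta = torch.zeros_like(base, requires_grad=True)
    for _ in range(5):
        adv_loss = F.mse_loss(eps_pred(x_t, encoder(base + delta)), noise)
        g, = torch.autograd.grad(adv_loss, delta)
        delta = (delta - 1e-2 * g.sign()).clamp(-0.1, 0.1).detach().requires_grad_(True)

    # Outer step (unlearning): under the adversarial prompt, steer the prediction
    # toward the original model's unconditional output (simplified ESD-style
    # target), while a retention term keeps benign prompts unchanged.
    target = eps_pred(x_t, encoder_orig(embed(empty_ids))).detach()
    erase_loss  = F.mse_loss(eps_pred(x_t, encoder(base + delta.detach())), target)
    retain_loss = F.mse_loss(encoder(embed(retain_ids)),
                             encoder_orig(embed(retain_ids)))
    opt.zero_grad()
    (erase_loss + gamma * retain_loss).backward()
    opt.step()
```

In practice, the prompts, the attacked module, and the retain set would come from the actual Stable Diffusion pipeline and training data rather than the toy tensors used here; the sketch only mirrors the bilevel attack-then-unlearn structure.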
Baselines
| DM Unlearning Methods | Nudity | Van Gogh | Objects |
|---|---|---|---|
| ESD (Erased Stable Diffusion) | ✓ | ✓ | ✓ |
| FMN (Forget-Me-Not) | ✓ | ✓ | ✓ |
| AC (Ablating Concepts) | ✓ | ✓ | ✓ |
| UCE (Unified Concept Editing) | ✓ | ✓ | ✓ |
| SalUn (Saliency Unlearning) | ✓ | ✓ | ✓ |
| SH (ScissorHands) | ✓ | ✓ | ✓ |
| ED (EraseDiff) | ✓ | ✓ | ✓ |
| SPM (concept-SemiPermeable Membrane) | ✓ | ✓ | ✓ |
| AdvUnlearn (Ours) | ✓ | ✓ | ✓ |
Cite Our Work
The preprint can be cited as follows:
@misc{zhang2024defensive,
title={Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models},
author={Yimeng Zhang and Xin Chen and Jinghan Jia and Yihua Zhang and Chongyu Fan and Jiancheng Liu and Mingyi Hong and Ke Ding and Sijia Liu},
year={2024},
eprint={2405.15234},
archivePrefix={arXiv},
primaryClass={cs.CV}
}