InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation
Abstract
In personalized image generation, the ability to create images that preserve given concepts has improved significantly. However, naturally integrating multiple concepts into a cohesive, visually appealing composition remains challenging. This paper introduces "InstantFamily," an approach that employs a novel masked cross-attention mechanism and a multimodal embedding stack to achieve zero-shot multi-ID image generation. Our method preserves identity (ID) effectively by using global and local features from a pre-trained face recognition model, integrated with text conditions. In addition, the masked cross-attention mechanism enables precise control over multiple IDs and their composition in the generated images. Experiments show that InstantFamily excels at generating multi-ID images while resolving well-known multi-ID generation problems. Our model achieves state-of-the-art performance in both single-ID and multi-ID preservation. It also scales well, preserving a greater number of IDs than it was originally trained with.
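The abstract does not spell out the masking rule, so the following is only a minimal sketch of the general idea behind masked cross-attention, not the authors' implementation. It assumes that each spatial position may attend to all text tokens but only to the ID tokens whose face region covers that position. The names `masked_cross_attention`, `build_mask`, `region_of_position`, and `owner_of_token` are hypothetical, and plain Python lists are used only to keep the example self-contained; a real model would use a tensor library.

```python
import math


def masked_cross_attention(queries, keys, values, allowed):
    """Scaled dot-product cross-attention with a boolean mask.

    queries: N query vectors, one per spatial position.
    keys, values: M key / value vectors, one per condition token.
    allowed[i][j]: True if position i may attend to token j.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs = []
    for q, row_mask in zip(queries, allowed):
        # Disallowed tokens get a logit of -inf, so softmax assigns them weight 0.
        logits = [
            scale * sum(a * b for a, b in zip(q, k)) if ok else -math.inf
            for k, ok in zip(keys, row_mask)
        ]
        peak = max(logits)
        if peak == -math.inf:
            # No token is visible from this position: contribute nothing.
            outputs.append([0.0] * len(values[0]))
            continue
        weights = [math.exp(logit - peak) for logit in logits]
        total = sum(weights)
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values)) / total
            for c in range(len(values[0]))
        ])
    return outputs


def build_mask(region_of_position, owner_of_token):
    """Assumed masking rule (not taken from the paper): text tokens
    (owner None) are visible everywhere; ID tokens are visible only
    inside the region assigned to the same ID."""
    return [
        [owner is None or owner == region for owner in owner_of_token]
        for region in region_of_position
    ]


# Toy setup: two positions inside face 0, one inside face 1, one background.
region_of_position = [0, 0, 1, None]
# One text token followed by one token for each of two IDs.
owner_of_token = [None, 0, 1]
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# One-hot values make each output row equal to its attention weights.
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

mask = build_mask(region_of_position, owner_of_token)
attended = masked_cross_attention(queries, keys, values, mask)
```

Because the values are one-hot, each row of `attended` can be read directly as attention weights: positions in face 0 place zero weight on the token of ID 1, and the background position attends to the text token only. Under this assumed rule, identity features cannot leak from one face region into another.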
Community
Weights?
Typo in Figure 2: "gobal projection" -> "global projection"
Is the code going to be released?
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models (2024)
- Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm (2024)
- From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation (2024)
- ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving (2024)
- ID-Animator: Zero-Shot Identity-Preserving Human Video Generation (2024)
waiting for code and pretrained checkpoints.
code or it did not happen :D