	---
	license: apple-ascl
	tags:
	- mdm
	---

	# Matryoshka Diffusion Models

	Matryoshka Diffusion Models (MDM) was introduced in [the paper of the same name](https://huggingface.co/papers/2310.15111) by Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Josh Susskind, and Navdeep Jaitly.

	This repository contains the Flickr 64 checkpoint.

	![Generation Examples from the MDM repository](samples.png)

	### Highlights

	* This checkpoint was trained on a dataset of 50M text-image pairs collected from Flickr.
	* This model was trained using a single UNet (not nested), and generates images with a resolution of 64 × 64.
	* Despite being trained on relatively small datasets, MDMs show strong zero-shot capabilities for generating high-resolution images and videos.

	## Checkpoints

	| Model | Dataset | Resolution | Nested UNets |
	|---------------------------------------------------------|------------|-------------|--------------|
	| [mdm-flickr-64](https://hf.co/pcuenq/mdm-flickr-64) | Flickr 50M | 64 × 64 | ❎ |
	| [mdm-flickr-256](https://hf.co/pcuenq/mdm-flickr-256) | Flickr 50M | 256 × 256 | ✅ |
	| [mdm-flickr-1024](https://hf.co/pcuenq/mdm-flickr-1024) | Flickr 50M | 1024 × 1024 | ✅ |

	## How to Use

	Please refer to the [original repository](https://github.com/apple/ml-mdm) for training and inference instructions.
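
Before following those instructions, the checkpoint files need to be available locally. The snippet below is a minimal sketch of one way to fetch them with `huggingface_hub`, assuming the package is installed. The repository ids come from the Checkpoints table; the helper names `repo_id_for` and `download_checkpoint` are hypothetical and are not part of the MDM codebase. It only downloads files and does not run inference.

```python
# Repo ids as listed in the Checkpoints table, keyed by output resolution.
CHECKPOINT_REPOS = {
    64: "pcuenq/mdm-flickr-64",
    256: "pcuenq/mdm-flickr-256",
    1024: "pcuenq/mdm-flickr-1024",
}


def repo_id_for(resolution: int) -> str:
    """Return the Hub repo id for a given resolution (hypothetical helper)."""
    try:
        return CHECKPOINT_REPOS[resolution]
    except KeyError:
        known = sorted(CHECKPOINT_REPOS)
        raise ValueError(
            f"No checkpoint for resolution {resolution}; expected one of {known}"
        ) from None


def download_checkpoint(resolution: int = 64) -> str:
    """Download every file of the chosen checkpoint repo; return the local path."""
    # Imported lazily so the lookup helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id_for(resolution))


# Example (requires network access):
# local_dir = download_checkpoint(64)
```

The returned directory can then be passed to the tooling in the original repository, whose expected file layout and command-line options are documented there.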

	## Citation

	```bibtex
	@misc{gu2023matryoshkadiffusionmodels,
	title={Matryoshka Diffusion Models},
	author={Jiatao Gu and Shuangfei Zhai and Yizhe Zhang and Josh Susskind and Navdeep Jaitly},
	year={2023},
	eprint={2310.15111},
	archivePrefix={arXiv},
	primaryClass={cs.CV},
	url={https://arxiv.org/abs/2310.15111},
	}
	```