Reading/Expts - a NagaSaiAbhinay Collection

updated Sep 12

A collection of papers that I'm excited to experiment with. Once an experiment is done, I'll probably move the paper to another collection.

Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

Paper • 2408.07666 • Published Aug 14 • 2

Note Looking to apply this to DiTs. Sayak already did a simple weight merge for Flux dev and schnell. Can we extend that and see what results can be derived from Flux? For AuraFlow, v0.2 has better prompt adherence than v0.3 but lower aesthetic quality. Can we find a merge that retains v0.2's prompt adherence while also improving quality? Update: this doesn't work as is, because v0.2 and v0.3 have different positional embedding dimensions.
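
A minimal sketch of the kind of simple weight merge mentioned above, and of why a shape mismatch blocks it. It is framework-free on purpose: each "tensor" is a flat list of floats, and the names `merge_weights`, `ckpt_a`, `ckpt_b`, `block.weight` and `pos_embed` are hypothetical, not taken from Flux or AuraFlow. With real checkpoints the same check would compare tensor shapes from the two state dicts.

```python
from typing import Dict, List, Tuple

Weights = Dict[str, List[float]]  # hypothetical stand-in for a state dict


def merge_weights(a: Weights, b: Weights, alpha: float = 0.5) -> Tuple[Weights, List[str]]:
    """Linearly interpolate matching entries; report entries that cannot be merged."""
    merged: Weights = {}
    skipped: List[str] = []
    for name, wa in a.items():
        wb = b.get(name)
        if wb is None or len(wa) != len(wb):
            # Missing or differently sized entry (e.g. positional embeddings):
            # keep A's copy unchanged and record the name.
            merged[name] = list(wa)
            skipped.append(name)
            continue
        merged[name] = [(1.0 - alpha) * x + alpha * y for x, y in zip(wa, wb)]
    return merged, skipped


# Toy checkpoints; the parameter names are made up for illustration.
ckpt_a = {"block.weight": [0.0, 2.0], "pos_embed": [1.0, 1.0, 1.0]}
ckpt_b = {"block.weight": [2.0, 4.0], "pos_embed": [1.0, 1.0, 1.0, 1.0]}
merged, skipped = merge_weights(ckpt_a, ckpt_b, alpha=0.5)
print(merged["block.weight"], skipped)
```

Entries that land in `skipped` are the ones that would need separate handling (for example resizing or picking one side) before the two checkpoints can be fully merged.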
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models

Paper • 2407.15886 • Published Jul 21 • 1

Note The proposed UNet architecture is small and lightweight for a conditional diffusion model. Can we extend this to other kinds of conditioning, such as Canny edges? Would we have to sacrifice text conditioning, or can we keep it?
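
A rough sketch of what concatenation-based conditioning with a Canny edge map could look like, assuming the condition is joined to the noisy target along a spatial axis and the target half is cropped back out afterwards. That layout is an assumption for illustration, not a confirmed detail of the paper, and `concat_width` and `split_width` are hypothetical helpers operating on nested lists rather than real latents.

```python
from typing import List

Image = List[List[float]]  # rows of values; a stand-in for a latent


def concat_width(target: Image, condition: Image) -> Image:
    """Join two same-height images side by side along the width axis."""
    if len(target) != len(condition):
        raise ValueError("target and condition must have the same height")
    return [t_row + c_row for t_row, c_row in zip(target, condition)]


def split_width(joined: Image, target_width: int) -> Image:
    """Keep only the target part of a joined image."""
    return [row[:target_width] for row in joined]


noisy_target = [[0.1, 0.2], [0.3, 0.4]]
canny_edges = [[0.0, 1.0], [1.0, 0.0]]

joined = concat_width(noisy_target, canny_edges)   # what the denoiser would see
recovered = split_width(joined, target_width=2)    # what would be kept afterwards
print(joined)
```

Because the condition enters through the input rather than through extra layers, the denoiser itself needs no new parameters, which is what makes the idea attractive for other condition types.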
RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control

Paper • 2405.17401 • Published May 27 • 5

Note Based on two ideas:
1. SOC: Stochastic Optimal Control, which is very similar to moving the point on the manifold closer to a target, as in inverse problems, except that it uses a style descriptor.
2. AFA: Attention Feature Aggregation.
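
A toy illustration of the "move the point closer using a descriptor" intuition behind the SOC idea. This is not the paper's controller: the descriptor here is simply the mean of a vector (an assumption chosen so the gradient is easy to write by hand), and `descriptor` and `guidance_step` are hypothetical names.

```python
from typing import List


def descriptor(x: List[float]) -> float:
    """Toy 'style descriptor': the mean of the vector (illustrative assumption)."""
    return sum(x) / len(x)


def guidance_step(x: List[float], ref: float, step: float) -> List[float]:
    """One gradient step on (descriptor(x) - ref)^2 with respect to x."""
    n = len(x)
    err = descriptor(x) - ref
    # d/dx_i (mean(x) - ref)^2 = 2 * err / n, the same for every coordinate.
    grad = 2.0 * err / n
    return [xi - step * grad for xi in x]


x = [0.0, 0.0, 4.0, 4.0]  # descriptor starts at 2.0
ref = 1.0                 # descriptor of the reference "style"
before = abs(descriptor(x) - ref)
for _ in range(10):
    x = guidance_step(x, ref, step=0.5)
after = abs(descriptor(x) - ref)
print(before, after)
```

Each step shrinks the descriptor gap by a constant factor, so the point drifts toward the reference descriptor while the differences between its coordinates are preserved.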