Flash Attention Support

#41
by rameshch - opened

I see that Flash Attention is not currently supported with the Llama-3.2-11B-Vision-Instruct model. Any assistance here?

Any update on Flash Attention support? Specifically, I am running into the error `MllamaForConditionalGeneration does not support Flash Attention 2.0 yet`.
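Until Mllama gains Flash Attention 2 support in transformers, one workaround is to fall back to PyTorch's SDPA attention backend when loading the model. Below is a minimal sketch (the generation settings and dtype are illustrative, and it assumes a transformers version recent enough to include Mllama with SDPA support):

```python
import torch
from transformers import MllamaForConditionalGeneration, AutoProcessor

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

# Load with PyTorch's scaled-dot-product attention ("sdpa") instead of
# "flash_attention_2", which Mllama does not accept yet.
model = MllamaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="sdpa",  # or "eager" if sdpa is unavailable
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)
```

SDPA routes through `torch.nn.functional.scaled_dot_product_attention`, which can dispatch to fused flash kernels internally on supported GPUs, so much of the speed benefit is retained even without the explicit Flash Attention 2 integration.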

Facing the same issue
