Enable Flash Attention

#1
by tsdocode - opened

Hi there, I saw you mentioned in Spaces that applying Flash Attention can make the model 2x to 4x faster. Have you implemented it yet?
I've tried implementing Flash Attention based on MusicGen's FlashAttention, but it does not give any speedup!
