GGUF for ARM inference?

#4
by AaronFeng753 - opened

Hello, could you release these GGUF quants for ARM inference? I want to run this model on Android. Thank you so much!

Q4_0_8_8
Q4_0_4_8
Q4_0_4_4

@AaronFeng753 added :)

No idea why, but so far these ARM GGUFs have always crashed the apps I'm using. I tried Q4_0_4_8 and Q4_0_4_4 on a Snapdragon 8s Gen 3. Apps: ChatterUI and Layla Lite.

Q4_0_4_4 is the only one that works for me on the Snapdragon 8 Gen 2, even though Q4_0_4_8 should allegedly work too. It's very odd.

I have no idea which instruction sets the iPhone 15 Pro Max supports, but only Q4_0_4_4 works for me as well; the others make my app crash.
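
For anyone wanting to check which instruction sets their device reports, here is a rough sketch for Android or Linux (it will not help on iOS, which has no /proc/cpuinfo). It reads the "Features" line and lists the variants whose required flag is present. The flag mapping is an assumption based on how these variants are usually described (Q4_0_4_4 needing NEON, Q4_0_4_8 needing i8mm, Q4_0_8_8 needing SVE), not something confirmed in this thread, and the helper names are made up for illustration.

```python
from pathlib import Path

# ASSUMPTION: variant -> cpuinfo flag it is commonly said to require.
# "asimd" is how 64-bit ARM Linux kernels report NEON.
REQUIRED_FLAG = {
    "Q4_0_4_4": "asimd",
    "Q4_0_4_8": "i8mm",
    "Q4_0_8_8": "sve",
}


def cpu_flags(cpuinfo_text):
    """Collect every flag listed on the 'Features' lines of cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            flags.update(value.split())
    return flags


def candidate_quants(cpuinfo_text):
    """Return the variants whose assumed required flag is reported."""
    flags = cpu_flags(cpuinfo_text)
    return [name for name, flag in REQUIRED_FLAG.items() if flag in flags]


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        print(candidate_quants(path.read_text()))
    else:
        print("no /proc/cpuinfo here (not Linux/Android)")
```

On a phone this can be run from Termux, for example. A reported flag only means the variant is a candidate; an app can still crash for other reasons, such as how its bundled inference library was built.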

AaronFeng753 changed discussion status to closed
