New vision language model with 9x fewer image tokens, super efficient
Aligned with DPO to reduce hallucinations
Apache 2.0 license
Demo: hf.co/spaces/NexaAIDev/omnivlm-dpo-demo
Model: NexaAIDev/omnivision-968M
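The announcement says the model was aligned with DPO. As background, here is a minimal sketch of the standard DPO (Direct Preference Optimization) loss on one preference pair; the function name, arguments, and `beta=0.1` default are illustrative choices, not details from the model card.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for a single preference pair.

    Arguments are summed log-probabilities of the chosen/rejected
    responses under the policy (pi_*) and the frozen reference (ref_*).
    beta scales how strongly the policy is pushed away from the reference.
    """
    # Difference of policy-vs-reference log-ratios for the two responses
    logits = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # Logistic (Bradley-Terry) loss: low when the chosen response is favored
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy favors the chosen response more than the reference does, `logits` is positive and the loss drops below `log(2)`; at indifference the loss is exactly `log(2)`.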
How long did it take to reply, and what are your context window limits? What model type?
It takes 3-5 seconds to reply when the prompt is longer than 30-50 words on average, and latency increases linearly with the number of tokens in the prompt. The one in the picture is Llama 3 1B, but the one I'm using right now is Arco 2, which is a Llama-based model and can't retain any kind of general knowledge. I noticed with Qwen 2 (and later confirmed with Meta's models) that you don't need a lot of parameters to get general knowledge; you just need tons of data.
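The latency behavior described above (3-5 s for 30-50 word prompts, growing linearly with prompt tokens) can be sketched as a simple linear model. The constants below are hypothetical values chosen only so the output falls in the reported range; they are not measurements from the poster's setup.

```python
def estimate_latency(n_tokens, base_s=1.0, per_token_s=0.08):
    """Toy linear latency model: fixed overhead plus per-token cost.

    base_s and per_token_s are illustrative constants picked to land in
    the 3-5 s range the reply mentions for ~30-50 word prompts.
    """
    return base_s + per_token_s * n_tokens
```

With these assumed constants, a ~40-token prompt comes out around 4.2 s, consistent with the 3-5 s figure, and doubling the prompt length adds a fixed increment, which is what "increases linearly" implies.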
As a model tweaker, it's such a huge relief to know we'll have HF for years to come.
@karpathy , @teknium and @KnutJaegersberg are cool guys to follow, but seriously, what happened to @TheBloke ?
congrats on this achievement!
Sorry, forgot to add it. It's added now as an Apache 2.0 license.