Support added to openedai-vision
Great model! It's really smart and picks up fine details well. It does way more than just captions now.
Just FYI, and I hope it was ok (I took some liberties with your Spaces code): I added support for llama-joycaption-alpha-two (from the demo space) to openedai-vision, along with multi-image support. See: https://github.com/matatonic/openedai-vision/blob/main/backend/joy-caption-latest.py
I'll add this repo once it's done. Thanks again!
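For reference, since openedai-vision exposes an OpenAI-compatible API, a multi-image request against the joycaption backend looks roughly like the sketch below. The port, model name, and API key are placeholders/assumptions, not part of the actual setup docs; check your own server config.

```python
# Rough sketch: querying an openedai-vision server with two images via the
# standard OpenAI client. Base URL, port, model name, and API key are assumed.
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:5006/v1", api_key="skip")

def to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL."""
    with open(path, "rb") as f:
        return "data:image/jpeg;base64," + base64.b64encode(f.read()).decode()

# Multi-image request: both images go into the same user message.
response = client.chat.completions.create(
    model="llama-joycaption-alpha-two",  # assumed model name exposed by the server
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": to_data_url("first.jpg")}},
            {"type": "image_url", "image_url": {"url": to_data_url("second.jpg")}},
            {"type": "text", "text": "Write a descriptive caption comparing these two images."},
        ],
    }],
    max_tokens=300,
)
print(response.choices[0].message.content)
```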
Very cool, thank you.
> I hope it was ok
Of course, the model is free for you and everyone to use!
Also, I recommend checking out the example code on the GitHub repo now: https://github.com/fpgaminer/joycaption?tab=readme-ov-file#example-usage
I've simplified the usage now that the model is packaged in HF's Llava class. (There's no Processor yet, but at least this eliminates the custom ImageAdapter class and the rest of the manual setup.)
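Roughly, the simplified usage looks like the sketch below; the README linked above is the canonical version. The repo id here is an assumption, and so is the packaged Processor (as noted, it isn't there yet, so you may still need the README's manual text/image preprocessing).

```python
# Rough sketch of Llava-class usage, following the pattern in the repo README.
# Assumptions: repo id, and that an AutoProcessor is available for this packaging.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

MODEL_ID = "fancyfeast/llama-joycaption-alpha-two-hf-llava"  # assumed repo id

processor = AutoProcessor.from_pretrained(MODEL_ID)  # assumed to exist
model = LlavaForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

image = Image.open("example.jpg")
convo = [
    {"role": "system", "content": "You are a helpful image captioner."},
    {"role": "user", "content": "Write a descriptive caption for this image."},
]

# Render the chat template (expected to insert the image placeholder),
# then tokenize the prompt and image together.
prompt = processor.apply_chat_template(convo, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)
inputs["pixel_values"] = inputs["pixel_values"].to(torch.bfloat16)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=300, do_sample=False)

# Decode only the newly generated tokens (everything after the prompt).
caption = processor.batch_decode(
    output[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(caption.strip())
```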