Kudos, impressive model
#5 by fblgit - opened
Yo, amazing stuff... speechless.
This is much simpler than people think. The community definitely needs to dig further into this and work harder. I'll be releasing some stuff using these LLaVA models.
I still need some time for experimentation, but I wonder whether the vision part could be reversed to instead project an embedding of a latent, or merely a noise/de-noise mask? Have you tried anything like this?
I will definitely try to port some of UNA into this. I still need to figure out the code a bit more and get familiar with the LLaVA architecture.
fblgit changed discussion status to closed
How do you run inference? Do you have a sample script? I haven't used LLaVA before.