Libra Vision Tokenizer

Libra: Building Decoupled Vision System on Large Language Models

This repo provides the pretrained weights of the Libra vision tokenizer, trained with lookup-free quantization.

!!! NOTE !!!

Please merge these weights into the llama-2-7b-chat-hf-libra directory (the Hugging Face version of LLaMA2-7B-Chat).
Please also download the pretrained CLIP model from Hugging Face and place it in the same directory. The CLIP model can be downloaded here.

The files should be organized as:

llama-2-7b-chat-hf-libra/
│
│   # original llama files
│
├── ...
│   
│   # newly added vision tokenizer
│   
├── vision_tokenizer_config.yaml
├── vqgan.ckpt
│
│   # CLIP model
│
└── openai-clip-vit-large-patch14-336/
    └── ...
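
The layout above can be checked before loading the model. The following is a minimal sketch, not part of the Libra codebase: the function name `missing_entries` is hypothetical, and it only checks the three entries named in the tree (it does not check the original LLaMA files, which the tree leaves unspecified).

```python
from pathlib import Path

# Entries taken from the directory tree above.
REQUIRED_FILES = ("vision_tokenizer_config.yaml", "vqgan.ckpt")
REQUIRED_DIRS = ("openai-clip-vit-large-patch14-336",)


def missing_entries(root):
    """Return the expected files/directories that are absent under `root`.

    Hypothetical helper for illustration; directories are reported
    with a trailing slash so they are easy to tell apart from files.
    """
    root = Path(root)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing
```

For example, `missing_entries("llama-2-7b-chat-hf-libra")` returns an empty list once the vision tokenizer files and the CLIP directory are in place.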

This model is part of the Libra collection, the official repo for the ICML 2024 paper: Libra: Building Decoupled Vision System on Large Language Models.