---
library_name: transformers
tags:
- llama-factory
- yi-vl
- llava
license: other
language:
- zh
- en
pipeline_tag: visual-question-answering
---
This is the Hugging Face `transformers` version of the [Yi-VL-6B](https://huggingface.co/01-ai/Yi-VL-6B) model.
You may use this model for fine-tuning on downstream tasks; we recommend using our efficient fine-tuning toolkit, [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) (see the sketch after the list below).
- **Developed by:** [01-AI](https://www.01.ai/).
- **Language(s) (NLP):** Chinese/English
- **License:** [Yi Series Model License](https://huggingface.co/01-ai/Yi-VL-34B/blob/main/LICENSE)
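As a rough illustration, a LoRA fine-tuning run with LLaMA-Factory might look like the following. This is a minimal sketch, not a tuned recipe: the `mllm_demo` dataset shipped with LLaMA-Factory and all hyperparameters are placeholders to replace with your own task, and flag names follow the LLaMA-Factory CLI at the time of writing.

```bash
# Minimal LoRA fine-tuning sketch with LLaMA-Factory.
# Dataset and hyperparameters are placeholders; adjust to your task and hardware.
llamafactory-cli train \
    --stage sft \
    --do_train \
    --model_name_or_path BUAADreamer/Yi-VL-6B-hf \
    --template yivl \
    --visual_inputs \
    --dataset mllm_demo \
    --finetuning_type lora \
    --output_dir saves/yi-vl-6b-lora \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --learning_rate 1e-4 \
    --num_train_epochs 3.0 \
    --fp16
```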
Usage:
```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "BUAADreamer/Yi-VL-6B-hf"
messages = [
    {"role": "user", "content": "What's in the picture?"}
]

model = AutoModelForVision2Seq.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
).to(0)
processor = AutoProcessor.from_pretrained(model_id)

# Build the prompt from the chat template and fetch an example image
# (replace the URL with your own image).
text = [processor.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False)]
image_file = "http://images.cocodataset.org/val2017/000000039769.jpg"
images = [Image.open(requests.get(image_file, stream=True).raw)]
inputs = processor(text=text, images=images, return_tensors="pt").to(0, torch.float16)

output = model.generate(**inputs, max_new_tokens=200)
# batch_decode returns a list; the decoded text contains the full
# conversation, so keep only the assistant reply.
output = processor.batch_decode(output, skip_special_tokens=True)[0]
print(output.split("Assistant:")[-1].strip())
```
Alternatively, you can launch a Web demo with the CLI command from [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory):
```bash
llamafactory-cli webchat \
--model_name_or_path BUAADreamer/Yi-VL-6B-hf \
--template yivl \
--visual_inputs
```
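If you prefer the terminal, LLaMA-Factory also exposes a `chat` subcommand; the sketch below simply mirrors the webchat arguments above and assumes the same LLaMA-Factory version.

```bash
# Interactive terminal chat instead of the Web UI; arguments mirror
# the webchat command above.
llamafactory-cli chat \
    --model_name_or_path BUAADreamer/Yi-VL-6B-hf \
    --template yivl \
    --visual_inputs
```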
# [lmms-eval Evaluation Results](https://github.com/EvolvingLMMs-Lab/lmms-eval)
| Metric     | Value |
|------------|------:|
| MMMU_val   |  36.8 |
| CMMMU_val  |  32.2 |