tvl-mini
Description
This is a LoRA fine-tune of Qwen2-VL-2B for the Russian language.
Data
Dataset contains:
- GrandMaster-PRO-MAX dataset (68k samples)
- Visual Reasoning (36k samples) #Training in progress
- Captioning (34k samples) #Training in progress
- Knowledgeable VQA (35k samples) #Training in progress
- VQA (80k samples) #Training in progress
- Classification (21k samples) #Training in progress
- Conversations (11k samples) #Training in progress
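To get a feel for how the mixture is balanced, the sample counts listed above can be turned into shares. This is only a rough sketch: it assumes the listed counts are rounded thousands and that all subsets end up in the final mix, and the variable names are illustrative.

```python
# Sample counts in thousands, copied from the list above (rounded values).
samples_k = {
    "GrandMaster-PRO-MAX": 68,
    "Visual Reasoning": 36,
    "Captioning": 34,
    "Knowledgeable VQA": 35,
    "VQA": 80,
    "Classification": 21,
    "Conversations": 11,
}

total_k = sum(samples_k.values())

# Percentage of the full mixture contributed by each subset.
shares = {name: round(100 * n / total_k, 1) for name, n in samples_k.items()}

for name, share in shares.items():
    print(f"{name}: {samples_k[name]}k samples ({share}%)")
print(f"Total: {total_k}k samples")
```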
Benchmarks
TODO
Quickstart
You can simply run this notebook or run the code below.
First, install qwen-vl-utils and a dev version of transformers:
pip install qwen-vl-utils
pip install --no-cache-dir git+https://github.com/huggingface/transformers@19e6e80e10118f855137b90740936c0b11ac397f
And then run:
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info
import torch

# Load the fine-tuned weights; device_map="auto" places them on the available device(s).
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "2Vasabi/tvl-mini-0.1", torch_dtype=torch.bfloat16, device_map="auto"
)
# The processor (tokenizer + image preprocessing) comes from the base model.
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "https://i.ibb.co/d0QL8s6/images.jpg",
            },
            # Prompt (Russian): "Briefly describe what you see in the image"
            {"type": "text", "text": "Кратко опиши что ты видишь на изображении"},
        ],
    }
]

# Build the chat-formatted prompt and collect the image/video inputs.
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
)
# Move the inputs to the device the model was placed on.
inputs = inputs.to(model.device)

# Generate, then strip the prompt tokens so only the new tokens are decoded.
generated_ids = model.generate(**inputs, max_new_tokens=1000)
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)
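The trimming step above is needed because, for decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones. Here is a minimal sketch of the same slicing on plain Python lists. The token IDs are made up, and the helper name `trim_prompt_tokens` is hypothetical, not part of transformers or qwen-vl-utils.

```python
def trim_prompt_tokens(input_ids, generated_ids):
    """Drop the echoed prompt from the start of each generated sequence."""
    return [out[len(inp):] for inp, out in zip(input_ids, generated_ids)]


# Made-up token IDs for illustration only: each generated row starts with its prompt.
prompt_batch = [[101, 7, 8], [101, 9, 10]]
generated_batch = [[101, 7, 8, 42, 43], [101, 9, 10, 44, 45]]

trimmed = trim_prompt_tokens(prompt_batch, generated_batch)
print(trimmed)
```

Without this step the decoded output would also contain the chat-template prompt text.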
Model tree for 2Vasabi/tvl-mini-0.1
Base model
Qwen/Qwen2-VL-2B-Instruct