THAI-BLIP-2
A BLIP-2 model fine-tuned from blip2-opt-2.7b-coco for image captioning, using MSCOCO 2017 with Thai captions.
How to use:
from transformers import Blip2ForConditionalGeneration, Blip2Processor
from PIL import Image
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
processor = Blip2Processor.from_pretrained("kkatiz/THAI-BLIP-2")
model = Blip2ForConditionalGeneration.from_pretrained("kkatiz/THAI-BLIP-2", device_map=device, torch_dtype=torch.bfloat16)
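# Replace the placeholder with the path to your image
img = Image.open("Your image...")
inputs = processor(images=img, return_tensors="pt").to(device, torch.bfloat16)
# `max_length` caps the length of the generated caption in tokens; raise it for longer captions
generated_ids = model.generate(**inputs, max_length=20)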
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(generated_text)
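To caption several images at once, the same processor and model calls can be wrapped in a small batching helper. The following is a minimal sketch, not part of this model's published code: the names `chunked` and `caption_images` and the `batch_size` default are hypothetical, and it assumes `processor` and `model` have already been loaded as shown above. It uses `max_new_tokens`, a standard `generate` argument, in place of `max_length`.

```python
from typing import Any, Iterable, List


def chunked(items: List[Any], size: int) -> Iterable[List[Any]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def caption_images(images, processor, model, device, dtype,
                   batch_size=4, max_new_tokens=20):
    """Caption a sequence of PIL images in batches (hypothetical helper).

    `processor` and `model` are the already-loaded Blip2Processor and
    Blip2ForConditionalGeneration objects; `device` and `dtype` should
    match the values used when the model was loaded.
    """
    captions = []
    for batch in chunked(list(images), batch_size):
        # The processor accepts a list of images and stacks them into one batch.
        inputs = processor(images=batch, return_tensors="pt").to(device, dtype)
        generated_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        texts = processor.batch_decode(generated_ids, skip_special_tokens=True)
        # Decoded captions may carry surrounding whitespace; strip it.
        captions.extend(text.strip() for text in texts)
    return captions
```

With the objects from the snippet above, a call might look like `caption_images([img], processor, model, device, torch.bfloat16)`. A smaller `batch_size` reduces peak memory use at the cost of more `generate` calls.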
Model tree for kkatiz/THAI-BLIP-2
Base model
Salesforce/blip2-opt-2.7b-coco