ko-deplot

ko-deplot is a Korean Visual-QA model based on Google's Pix2Struct architecture. It was fine-tuned from DePlot using Korean chart image-text pairs.

  • Developed by: NUUA
  • Model type: Visual Question Answering
  • License: apache-2.0
  • Finetuned from model: google/deplot

Model Usage

You can run a prediction by passing an input image together with a text prompt, as follows:

μ•„λž˜μ˜ μ½”λ“œλ₯Ό μ΄μš©ν•˜μ—¬ λͺ¨λΈ 좔둠을 ν•  수 μžˆμŠ΅λ‹ˆλ‹€:

from transformers import Pix2StructProcessor, Pix2StructForConditionalGeneration
from PIL import Image

# Load the ko-deplot processor and model from the Hugging Face Hub.
processor = Pix2StructProcessor.from_pretrained('nuua/ko-deplot')
model = Pix2StructForConditionalGeneration.from_pretrained('nuua/ko-deplot')

IMAGE_PATH = "LOCAL_PATH_TO_IMAGE"
image = Image.open(IMAGE_PATH)

# The DePlot-style prompt asks the model to linearize the chart into a data table.
inputs = processor(images=image, text="Generate underlying data table of the figure below:", return_tensors="pt")
predictions = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(predictions[0], skip_special_tokens=True))
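
The decoded prediction is a single linearized table string. As a small post-processing sketch (assuming the output keeps the base DePlot convention of "|" between cells and "<0x0A>" between rows), it can be split back into rows:

# Continues from the snippet above; the separators are the assumed DePlot defaults.
table_text = processor.decode(predictions[0], skip_special_tokens=True)
rows = [[cell.strip() for cell in row.split("|")] for row in table_text.split("<0x0A>")]
print(rows)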

Tokenizer Details

The model's tokenizer vocabulary was extended from 50,344 to 65,536 tokens before training.

λͺ¨λΈμ˜ tokenizer vocab을 50344κ°œμ—μ„œ 65536개둜 μ•„λž˜λ₯Ό μ΄μš©ν•˜μ—¬ ν™•μž₯μ‹œν‚¨ ν›„ ν•™μŠ΅μ„ μ§„ν–‰ν•˜μ˜€μŠ΅λ‹ˆλ‹€:

Training Details

Training Data

Synthetic chart data generated with three libraries was used for training.

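The names of the three libraries are not given here; the sketch below, using matplotlib purely as an assumed example, shows the general idea of pairing a rendered chart with its linearized table:

import matplotlib.pyplot as plt

labels = ["μ„œμšΈ", "λΆ€μ‚°", "λŒ€κ΅¬"]   # placeholder categories (a Korean font must be configured for rendering)
values = [120, 85, 60]

plt.bar(labels, values)
plt.title("지역별 νŒλ§€λŸ‰")
plt.savefig("synthetic_chart.png")
plt.close()

# Matching linearized-table target in the DePlot output format.
target = "지역 | νŒλ§€λŸ‰ <0x0A> " + " <0x0A> ".join(f"{l} | {v}" for l, v in zip(labels, values))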

Training Procedure

Following the original paper, the model first went through a short warmup stage on Korean text and was then trained on the chart data for 50,000 steps.

ν•™μŠ΅μ„ μœ„ν•΄ 처음 짧은 "warmup" 단계λ₯Ό 거쳐 ν•œκΈ€μ„ ν•™μŠ΅μ‹œν‚¨ ν›„ 50,000 μŠ€ν… λ™μ•ˆ 차트 데이터λ₯Ό ν•™μŠ΅μ‹œμΌ°μŠ΅λ‹ˆλ‹€.

Technical Specifications

Hardware

ko-deplot was trained on an NVIDIA A100 80GB GPU.

Contact

For any questions or suggestions, please use the discussion tab. To contact us directly, email [email protected].
