google/matcha-chart2text-statista

Visual Question Answering

image-text-to-text


nielsr (HF staff) committed on Jul 22, 2023

Commit 3f270f4 • 1 Parent(s): 2176166

Update README.md

Files changed (1)

README.md +22 -5


@@ -8,6 +8,8 @@ language:
 inference: false
 pipeline_tag: visual-question-answering
 license: apache-2.0
 ---
 # Model card for MatCha - fine-tuned on Chart2text-statista
@@ -30,7 +32,26 @@ The abstract of the paper states that:
 # Using the model
-## Converting from T5x to huggingface
 You can use the [`convert_pix2struct_original_pytorch_to_hf.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/models/pix2struct/convert_pix2struct_original_pytorch_to_hf.py) script as follows:
 ```bash
@@ -51,10 +72,6 @@ model.push_to_hub("USERNAME/MODEL_NAME")
 processor.push_to_hub("USERNAME/MODEL_NAME")
 ```
-## Run predictions
-To run predictions, refer to the [instructions presented in the `matcha-chartqa` model card](https://huggingface.co/ybelkada/matcha-chartqa#get-predictions-from-the-model).
 # Contribution
 This model was originally contributed by Fangyu Liu, Francesco Piccinno et al. and added to the Hugging Face ecosystem by [Younes Belkada](https://huggingface.co/ybelkada).

 inference: false
 pipeline_tag: visual-question-answering
 license: apache-2.0
+tags:
+- matcha
 ---
 # Model card for MatCha - fine-tuned on Chart2text-statista
 # Using the model
+Ask the model specific questions to get consistent generations. Here we ask the model whether the sum of the values in a chart is greater than the largest value.
+```python
+from transformers import Pix2StructProcessor, Pix2StructForConditionalGeneration
+import requests
+from PIL import Image
+processor = Pix2StructProcessor.from_pretrained('google/matcha-chart2text-statista')
+model = Pix2StructForConditionalGeneration.from_pretrained('google/matcha-chart2text-statista')
+url = "https://raw.githubusercontent.com/vis-nlp/ChartQA/main/ChartQA%20Dataset/val/png/20294671002019.png"
+image = Image.open(requests.get(url, stream=True).raw)
+inputs = processor(images=image, text="Is the sum of all 4 places greater than Laos?", return_tensors="pt")
+predictions = model.generate(**inputs, max_new_tokens=512)
+print(processor.decode(predictions[0], skip_special_tokens=True))
+# Output: No
+```
+# Converting from T5x to huggingface
 You can use the [`convert_pix2struct_original_pytorch_to_hf.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/models/pix2struct/convert_pix2struct_original_pytorch_to_hf.py) script as follows:
 ```bash
 processor.push_to_hub("USERNAME/MODEL_NAME")
 ```
 # Contribution
 This model was originally contributed by Fangyu Liu, Francesco Piccinno et al. and added to the Hugging Face ecosystem by [Younes Belkada](https://huggingface.co/ybelkada).