---
base_model: google/siglip-base-patch16-512
library_name: transformers.js
---

https://huggingface.co/google/siglip-base-patch16-512 with ONNX weights to be compatible with Transformers.js.

## Usage (Transformers.js)

If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@xenova/transformers) using:
```bash
npm i @xenova/transformers
```

**Example:** Zero-shot image classification with `Xenova/siglip-base-patch16-512`.
```js
import { pipeline } from '@xenova/transformers';

const classifier = await pipeline('zero-shot-image-classification', 'Xenova/siglip-base-patch16-512');
const url = 'http://images.cocodataset.org/val2017/000000039769.jpg';
const output = await classifier(url, ['2 cats', '2 dogs'], {
    hypothesis_template: 'a photo of {}',
});
console.log(output);
// [
//   { score: 0.29906779527664185, label: '2 cats' },
//   { score: 0.00009295559721067548, label: '2 dogs' }
// ]
```

**Example:** Compute text embeddings with `SiglipTextModel`.
```javascript
import { AutoTokenizer, SiglipTextModel } from '@xenova/transformers';

// Load tokenizer and text model
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/siglip-base-patch16-512');
const text_model = await SiglipTextModel.from_pretrained('Xenova/siglip-base-patch16-512');

// Run tokenization
const texts = ['a photo of 2 cats', 'a photo of 2 dogs'];
const text_inputs = tokenizer(texts, { padding: 'max_length', truncation: true });

// Compute embeddings
const { pooler_output } = await text_model(text_inputs);
// Tensor {
//   dims: [ 2, 768 ],
//   type: 'float32',
//   data: Float32Array(1536) [ ... ],
//   size: 1536
// }
```

**Example:** Compute vision embeddings with `SiglipVisionModel`.
```javascript
import { AutoProcessor, SiglipVisionModel, RawImage } from '@xenova/transformers';

// Load processor and vision model
const processor = await AutoProcessor.from_pretrained('Xenova/siglip-base-patch16-512');
const vision_model = await SiglipVisionModel.from_pretrained('Xenova/siglip-base-patch16-512');

// Read image and run processor
const image = await RawImage.read('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
const image_inputs = await processor(image);

// Compute embeddings
const { pooler_output } = await vision_model(image_inputs);
// Tensor {
//   dims: [ 1, 768 ],
//   type: 'float32',
//   data: Float32Array(768) [ ... ],
//   size: 768
// }
```

---

Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
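
**Example:** Compare two embeddings with cosine similarity. This is a minimal sketch and not part of the Transformers.js API: `cosineSimilarity` is a hypothetical helper, and the toy vectors stand in for rows taken from the `pooler_output.data` arrays produced by the text and vision examples above.

```javascript
// Hypothetical helper (not a Transformers.js function): cosine similarity
// between two equal-length vectors (plain arrays or Float32Array).
function cosineSimilarity(a, b) {
    if (a.length !== b.length) {
        throw new Error(`Length mismatch: ${a.length} vs ${b.length}`);
    }
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; ++i) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors used purely for illustration; real embeddings have 768 values.
const text_embedding = new Float32Array([0.5, 0.5, 0.0]);
const image_embedding = new Float32Array([0.0, 0.5, 0.5]);
console.log(cosineSimilarity(text_embedding, image_embedding));
```

With real outputs, each row of a `[n, 768]` tensor occupies 768 consecutive entries of `data`, so row `i` could be taken as `pooler_output.data.slice(i * 768, (i + 1) * 768)`. The scores returned by the zero-shot pipeline are not raw cosine similarities, so values from this helper are best used for ranking rather than compared against them directly.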