do you have ONNX inference?
hi!
saw the ONNX models presented. do you have inference code for them?
This should do the trick:
import onnxruntime as ort
import numpy as np
from PIL import Image
import requests

image_size = (1024, 1024)

def transform_image(image):
    # Resize to the model's expected input size
    image = image.resize(image_size)
    # Convert to a NumPy array and scale to [0, 1]
    image_array = np.asarray(image, dtype=np.float32) / 255.0
    # Normalize with the ImageNet mean and std
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    normalized_image = (image_array - mean) / std
    # Rearrange dimensions to channels-first (C, H, W) and add a batch dimension
    transformed_image = np.transpose(normalized_image, (2, 0, 1))
    return np.expand_dims(transformed_image, axis=0)

# Load image from URL (convert to RGB in case the source has an alpha channel)
url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/ryan-gosling.jpg'
image = Image.open(requests.get(url, stream=True).raw).convert('RGB')
pixel_values = transform_image(image)

# wget https://huggingface.co/briaai/RMBG-2.0/resolve/main/onnx/model.onnx
session = ort.InferenceSession('model.onnx')
outputs = session.run(['alphas'], {'pixel_values': pixel_values})

# Turn the predicted alpha matte into a mask and apply it to the original image
mask = Image.fromarray((outputs[0].squeeze() * 255).astype(np.uint8))
image.putalpha(mask.resize(image.size))
image
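By the way, if you're ever unsure of a model's input/output names (I'm assuming 'pixel_values' and 'alphas' here), you can quickly check what the graph actually exposes on the session:

# List the input and output names defined in the ONNX graph
print([inp.name for inp in session.get_inputs()])
print([out.name for out in session.get_outputs()])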
2024-11-20 14:32:32.5115801 [W:onnxruntime:, execution_frame.cc:651 onnxruntime::ExecutionFrame::AllocateMLValueTensorPreAllocateBuffer] Shape mismatch attempting to re-use buffer. {1,24,24,1536} != {1,256,1536}. Validate usage of dim_value (values should be > 0) and dim_param (all values with the same string should equate to the same size) in shapes in the model.
thanks, that works, but...
the inference time is the same! i compared model_q4f16.onnx and model_uint8.onnx vs briaai/RMBG-2.0.
The ONNX models take the same time, or even longer!
20 sec on CPU (standard model) vs 37 sec (ONNX).
Moreover, I get this error with onnxruntime==1.18.0 and 1.19.2 (memory leak?):
"[W:onnxruntime:, execution_frame.cc:660 AllocateMLValueTensorPreAllocateBuffer] Shape mismatch attempting to re-use buffer. {1,36,36,768} != {1,1024,768}. Validate usage of dim_value (values should be > 0) and dim_param (all values with the same string should equate to the same size) in shapes in the model."
Hmm, I didn't see those warnings for the fp32 model - do you also get them with that one?
As for performance, q4 is typically slower than fp32 on CPU, while q8 should be faster.
On GPU it's the other way around: q4 should be faster and q8 slower.
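If the repo doesn't ship a q8 export, one option is to produce one yourself from the fp32 model with onnxruntime's dynamic quantization (just a sketch; the output file name is made up):

from onnxruntime.quantization import quantize_dynamic, QuantType

# Quantize the fp32 weights of model.onnx to uint8; 'model_q8.onnx' is an arbitrary output name
quantize_dynamic('model.onnx', 'model_q8.onnx', weight_type=QuantType.QUInt8)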
Would you like to run additional benchmarks?
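Something along these lines should give comparable numbers (a rough sketch: the file names are whichever exports you have locally, and the 'pixel_values' input name is taken from the snippet above):

import time
import numpy as np
import onnxruntime as ort

dummy = np.random.rand(1, 3, 1024, 1024).astype(np.float32)
for path in ['model.onnx', 'model_uint8.onnx', 'model_q4f16.onnx']:
    session = ort.InferenceSession(path, providers=['CPUExecutionProvider'])
    session.run(None, {'pixel_values': dummy})  # warm-up run, excluded from timing
    start = time.perf_counter()
    for _ in range(3):
        session.run(None, {'pixel_values': dummy})
    print(f'{path}: {(time.perf_counter() - start) / 3:.2f} sec/run')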
sure. but where is the q8 model?