OOM

#3
by hysts HF staff - opened

Though you are specifying imgsz here, OOM still occurs when a large image is input, so you may want to resize the input image before feeding it to the model.

I fixed the queue bug; now it should be fine to use an imgsz of 512~1024.

@An-619 I don't think the OOM issue has been fixed yet. I duplicated your Space and input a 3641x5097 image, but I got the following error:

Traceback (most recent call last):
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/gradio/routes.py", line 437, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/gradio/blocks.py", line 1352, in process_api
    result = await self.call_function(
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/gradio/blocks.py", line 1077, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/home/user/app/app.py", line 162, in predict
    results = model(input, device=device, retina_masks=True, iou=0.7, conf=0.25, imgsz=input_size)
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/ultralytics/yolo/engine/model.py", line 111, in __call__
    return self.predict(source, stream, **kwargs)
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/ultralytics/yolo/engine/model.py", line 255, in predict
    return self.predictor.predict_cli(source=source) if is_cli else self.predictor(source=source, stream=stream)
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/ultralytics/yolo/engine/predictor.py", line 188, in __call__
    return list(self.stream_inference(source, model))  # merge list of Result into one
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/ultralytics/yolo/engine/predictor.py", line 248, in stream_inference
    self.results = self.postprocess(preds, im, im0s)
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/ultralytics/yolo/v8/segment/predict.py", line 37, in postprocess
    masks = ops.process_mask_native(proto[i], pred[:, 6:], pred[:, :4], orig_img.shape[:2])  # HWC
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/ultralytics/yolo/utils/ops.py", line 634, in process_mask_native
    masks = F.interpolate(masks[None], shape, mode='bilinear', align_corners=False)[0]  # CHW
  File "/home/user/.pyenv/versions/3.10.12/lib/python3.10/site-packages/torch/nn/functional.py", line 3959, in interpolate
    return torch._C._nn.upsample_bilinear2d(input, output_size, align_corners, scale_factors)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 19.01 GiB (GPU 0; 14.76 GiB total capacity; 1.63 GiB already allocated; 12.06 GiB free; 1.76 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
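For scale, the 19.01 GiB figure is consistent with upsampling every predicted mask to the full 3641x5097 resolution at once, which is what process_mask_native does before thresholding. A back-of-the-envelope check (the float32 dtype and the ~275-mask count are assumptions, not values read from the traceback):

```python
def mask_memory_gib(n_masks: int, height: int, width: int, bytes_per_elem: int = 4) -> float:
    """Memory in GiB needed to hold n_masks full-resolution float32 masks."""
    return n_masks * height * width * bytes_per_elem / 2**30

# One float32 mask at the input image's 3641x5097 resolution is already ~0.07 GiB...
per_mask = mask_memory_gib(1, 5097, 3641)
# ...so around 275 masks would need the ~19 GiB the traceback reports.
total = mask_memory_gib(275, 5097, 3641)
print(f"{per_mask:.3f} GiB per mask, {total:.2f} GiB for 275 masks")
```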

But OOM didn't occur when I added the following lines to predict.

    # Downscale so the longer side matches input_size, preserving aspect ratio
    w, h = input.size
    scale = input_size / max(w, h)
    new_w = int(w * scale)
    new_h = int(h * scale)
    input = input.resize((new_w, new_h))

As your model is based on YOLO, I assumed this resizing was done when imgsz was passed, but apparently it isn't.

Anyway, this may not be the best solution, but I think you need to do something similar to avoid OOM.
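As a sketch, the resizing above could be packaged as a small helper that only ever downscales, never upscales (fit_within is a hypothetical name, not part of the Space's code):

```python
def fit_within(size: tuple[int, int], max_side: int) -> tuple[int, int]:
    """Return a new (w, h) whose longer side is at most max_side, aspect ratio kept."""
    w, h = size
    scale = max_side / max(w, h)
    if scale >= 1.0:
        return (w, h)  # image already fits; avoid blurry upscaling of small inputs
    return (round(w * scale), round(h * scale))

# The 3641x5097 image from the thread, with input_size=1024:
print(fit_within((3641, 5097), 1024))  # (731, 1024)
# then, with a PIL image: input = input.resize(fit_within(input.size, input_size))
```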

You're right! Now I understand where the real problem is. I've modified the code based on your suggestion. It should be fine now (I hope so).

Awesome! Thanks for the update.

hysts changed discussion status to closed
