
Extremely high memory usage causing OOMs #164

@heerambavi1998

Description

System Info

In self-hosted mode, on a machine with 16 GB of CPU RAM (and a 64 GB GPU), parsing a 300-page PDF with the layout detector enabled drives the pipeline OOM.

The layout detector holds three copies of every page image in memory
simultaneously for the entire duration of process():

  1. image_batch - numpy arrays of ALL images, created upfront
  2. pil_images - PIL copies reconverted from numpy for ALL images
  3. The input images list, still in scope

For a 300-page PDF at 200 DPI (~11 MB/page), this consumes ~10 GB
before any model inference begins. Peak image memory is currently not bounded by the page_queue maxsize.
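As a back-of-the-envelope check, the figures above (300 pages, ~11 MB per page, three simultaneous copies) work out to roughly the reported peak:

```python
# Rough peak image memory for the eager pipeline, using the figures
# reported above (assumptions, not measured values).
pages = 300
mb_per_page = 11   # ~11 MB per page image at 200 DPI
copies = 3         # numpy batch + PIL reconversion + original input list

peak_mb = pages * mb_per_page * copies
print(f"~{peak_mb / 1024:.1f} GB before any inference begins")
```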

Recommendation

  • Defer images_dict population from the loading thread to the layout
    detection thread. Currently, images are stored in the unbounded
    dict as fast as they can be loaded, bypassing the queue's
    backpressure (maxsize=100).
  • Free page images from images_dict immediately after region crops
    are extracted in _stream_process_layout_batch.
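A minimal sketch of the suggested backpressure fix. The names (page_queue, images_dict, loader, layout_worker, extract_crops) are illustrative stand-ins for the real pipeline, not its actual API:

```python
import queue
import threading

# Bounded queue: the loader blocks once 100 pages are in flight,
# so peak image memory is capped by maxsize instead of PDF length.
page_queue = queue.Queue(maxsize=100)
images_dict = {}

def loader(pages):
    """Loading thread: only enqueues pages; no longer populates images_dict."""
    for page_id, image in pages:
        page_queue.put((page_id, image))  # blocks when the queue is full
    page_queue.put(None)  # sentinel: no more pages

def layout_worker(extract_crops):
    """Layout thread: populates images_dict and frees each entry
    immediately after its region crops are extracted."""
    while (item := page_queue.get()) is not None:
        page_id, image = item
        images_dict[page_id] = image      # deferred population happens here
        crops = extract_crops(image)
        del images_dict[page_id]          # free the page image right away
        yield page_id, crops
```

With this split, the dict never holds more pages than the queue admits, so memory stays bounded regardless of document size.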

Layout detector changes:

  • Remove eager all-images-upfront materialization (image_batch and
    pil_images lists). Instead, slice input images per-chunk.
  • Use input PIL images directly with conditional RGB conversion
    (no-op when already RGB) instead of PIL→numpy→PIL round-trip.
  • Move visualization and result extraction into the per-chunk loop
    so chunk images can be freed immediately via del.
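The per-chunk loop described above might look roughly like this sketch, operating on PIL images (detect_layout, run_model, and chunk_size are hypothetical names, not the project's API):

```python
def detect_layout(images, run_model, chunk_size=8):
    """Sketch: slice PIL input images per chunk instead of
    materializing numpy + PIL copies of every page upfront."""
    results = []
    for start in range(0, len(images), chunk_size):
        chunk = images[start:start + chunk_size]
        # Conditional RGB conversion: a no-op when the page is already
        # RGB, avoiding the PIL -> numpy -> PIL round-trip.
        chunk = [im if im.mode == "RGB" else im.convert("RGB") for im in chunk]
        # Inference, visualization, and result extraction run per chunk...
        results.extend(run_model(chunk))
        del chunk  # ...so the chunk's images can be freed immediately
    return results
```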

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts and tasks

Reproduction

  1. Deploy the self-hosted pipeline with the default config and layout detection enabled
  2. Parse a large PDF (300 pages)

Expected behavior

Controlled CPU memory usage irrespective of PDF size.
