System Info / 系統信息
In self-hosted mode on a 16 GB CPU (64 GB GPU) machine with a 300-page PDF, the pipeline goes OOM when the layout detector is enabled.
The layout detector holds 3 copies of every page image in memory
simultaneously for the entire duration of process():
- image_batch: numpy arrays of ALL images, created upfront
- pil_images: PIL copies reconverted from numpy for ALL images
- the input images list, still in scope

For a 300-page PDF at 200 DPI (~11 MB/page), this consumes ~10 GB before any model inference begins. Peak image memory is currently not bounded by the page_queue maxsize.
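A back-of-envelope check of that figure (pages × per-page size × simultaneous copies, using the ~11 MB/page assumption above):

```python
# Rough peak-memory estimate for the three simultaneous copies.
pages = 300
mb_per_page = 11          # ~11 MB per page image at 200 DPI
copies = 3                # image_batch + pil_images + input images list

peak_mb = pages * mb_per_page * copies
print(f"~{peak_mb / 1000:.1f} GB held before inference starts")
```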
Recommendation
- Defer images_dict population from the loading thread to the layout
  detection thread. Currently, images are stored in the unbounded
  dict as fast as they can be loaded, bypassing the queue's
  backpressure (maxsize=100).
- Free page images from images_dict immediately after region crops
are extracted in _stream_process_layout_batch.
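The two recommendations above can be sketched as follows. This is a minimal, hypothetical illustration (the function and variable names other than images_dict, page_queue, and _stream_process_layout_batch are assumptions, not the actual API): the loader only feeds a bounded queue, and the layout thread both populates and frees images_dict, so at most maxsize pages plus the current batch stay resident.

```python
import queue
import threading

page_queue: "queue.Queue" = queue.Queue(maxsize=100)  # bounded -> backpressure
images_dict: dict = {}

def load_pages(pages):
    """Loading thread: only enqueues; no longer writes images_dict."""
    for page_id, image in pages:
        # put() blocks when the queue is full -- the backpressure that
        # unbounded images_dict writes used to bypass.
        page_queue.put((page_id, image))
    page_queue.put(None)  # sentinel: no more pages

def layout_thread(extract_region_crops):
    """Layout thread: deferred images_dict population, immediate free."""
    while (item := page_queue.get()) is not None:
        page_id, image = item
        images_dict[page_id] = image          # populated here, not at load time
        crops = extract_region_crops(image)   # the _stream_process_layout_batch step
        del images_dict[page_id]              # free as soon as crops are extracted
        yield page_id, crops
```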
Layout detector changes:
- Remove eager all-images-upfront materialization (image_batch and
pil_images lists). Instead, slice input images per-chunk.
- Use input PIL images directly with conditional RGB conversion
(no-op when already RGB) instead of PIL→numpy→PIL round-trip.
- Move visualization and result extraction into the per-chunk loop
so chunk images can be freed immediately via del.
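A minimal sketch of the revised detector loop (the function name and model callable are assumptions for illustration, not the real interface): input images are sliced per chunk rather than materialized upfront, converted to RGB only when needed, and freed as soon as the chunk's results are extracted.

```python
def run_layout_chunks(images, model, chunk_size=8):
    """Per-chunk layout inference without upfront image_batch/pil_images lists."""
    results = []
    for start in range(0, len(images), chunk_size):
        # Slice the input PIL images per chunk; conditional RGB conversion
        # is a no-op for images already in RGB mode, replacing the
        # PIL->numpy->PIL round-trip.
        chunk = [im if im.mode == "RGB" else im.convert("RGB")
                 for im in images[start:start + chunk_size]]
        results.extend(model(chunk))  # extract results inside the loop...
        del chunk                     # ...so chunk images can be freed immediately
    return results
```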
Who can help? / 谁可以帮助到您?
No response
Information / 问题信息
Reproduction / 复现过程
- Deploy the self-hosted pipeline with the default config and the layout detector enabled
- Parse a large PDF (300 pages)
Expected behavior / 期待表现
Controlled CPU memory usage irrespective of PDF size.