perf: Use OpenCV over PIL for PNG encoding in ImageRef.from_pil#562
maxdswain wants to merge 3 commits into
|
✅ DCO Check Passed Thanks @maxdswain, all your commits are properly signed off. 🎉 |
Merge Protections: Your pull request matches the following merge protections and will not be merged until they are valid. 🟢 Enforce conventional commit: Wonderful, this rule succeeded. Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
|
|
@maxdswain it would definitely be welcome to have this performance bottleneck addressed. However, opencv-python adds some intricacies, since it exists in both flavours: In order to support this cleanly, and knowing that other, optional third-party dependencies such as OCR engines in docling favour partially one and partially the other flavour, we would have to:
Even so, it would not yet be a complete solution, since every dependent of
Would you like to take these adjustments into account? |
Thank you @cau-git for this valuable feedback, I've made the adjustments you've suggested. |
|
@maxdswain thanks for the updates! can you please rebase this branch on current |
dolfim-ibm left a comment
I think we are better without the extra conflicting dependencies.
I propose we remove the public dependencies and add a dev dependency on the headless version.
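A hypothetical sketch of how OpenCV could stay optional at runtime under this proposal: a guarded import that falls back to Pillow when cv2 is not installed. This is not the code in this PR; the helper name encode_png and the set of modes routed to OpenCV are assumptions made for illustration.

```python
import io

from PIL import Image

try:
    # Either opencv-python or opencv-python-headless provides the cv2 module.
    import cv2
    import numpy as np

    _HAS_CV2 = True
except ImportError:
    _HAS_CV2 = False


def encode_png(image: Image.Image) -> bytes:
    """Encode a PIL image as PNG bytes (hypothetical helper).

    Uses OpenCV when it is importable and the mode is one of the simple
    8-bit modes handled below; otherwise falls back to Pillow.
    """
    if _HAS_CV2 and image.mode in ("L", "RGB", "RGBA"):
        arr = np.array(image)
        # OpenCV expects BGR(A) channel order, Pillow provides RGB(A).
        if image.mode == "RGB":
            arr = cv2.cvtColor(arr, cv2.COLOR_RGB2BGR)
        elif image.mode == "RGBA":
            arr = cv2.cvtColor(arr, cv2.COLOR_RGBA2BGRA)
        ok, encoded = cv2.imencode(".png", arr)
        if ok:
            return encoded.tobytes()
    # Fallback path: plain Pillow encoding.
    out = io.BytesIO()
    image.save(out, format="PNG")
    return out.getvalue()
```

With this shape, the package itself would not need to declare either OpenCV flavour; whichever one the environment already provides is picked up. The two paths may produce different bytes for the same image, but the decoded pixels should match.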
3bc36ed to a86a06e
|
Thanks for the feedback @dolfim-ibm, I've made the changes you've suggested. |
Signed-off-by: Max Swain <89113255+maxdswain@users.noreply.github.com>
a86a06e to 8f9349c
|
@cau-git @dolfim-ibm I should have addressed all of your concerns. Is this ready to be merged? |
|
@maxdswain I still need to test this in a setup with docling to see if any of this may cause differences in the test outputs there. I already confirmed it speeds up things as expected. |
Overview
The ImageRef.from_pil class method is used widely in docling's codebase. It is often called several times per page when parsing documents in the docling_parse.pdf_parser.PdfDocument._to_bitmap_resources_from_decoder method. From my profiling, I found that it took up ~45% of processing time when doing a DocumentConverter conversion with all AI models disabled. This led me to look into how its performance can be improved.
The function uses Pillow to encode the image to a PNG, which is notoriously slow. So I swapped it out for OpenCV, improving the performance of this function by ~55% for this simple test case:
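The original test case is not preserved on this page. As a rough stand-in, the sketch below shows one way such a comparison could be set up; it is not the author's benchmark. The helper names, the 640x480 random-noise image, and the repeat count are all assumptions, and noise compresses very differently from rendered document pages, so timings from it should not be compared with the figures quoted here.

```python
import io
import time

import numpy as np
from PIL import Image


def make_test_image(width=640, height=480, seed=0):
    """Build a random RGB image (hypothetical stand-in for a page bitmap)."""
    rng = np.random.default_rng(seed)
    pixels = rng.integers(0, 256, size=(height, width, 3), dtype=np.uint8)
    return Image.fromarray(pixels)


def encode_with_pil(image):
    """Encode to PNG bytes with Pillow."""
    buf = io.BytesIO()
    image.save(buf, format="PNG")
    return buf.getvalue()


def encode_with_cv2(image):
    """Encode to PNG bytes with OpenCV (expects an RGB image)."""
    import cv2  # needs opencv-python or opencv-python-headless

    # OpenCV uses BGR channel order, so convert before encoding.
    bgr = cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)
    ok, encoded = cv2.imencode(".png", bgr)
    if not ok:
        raise RuntimeError("cv2.imencode failed")
    return encoded.tobytes()


def time_encoder(encoder, image, repeats=5):
    """Return the mean wall-clock seconds per call of encoder(image)."""
    start = time.perf_counter()
    for _ in range(repeats):
        encoder(image)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    image = make_test_image()
    print(f"PIL: {time_encoder(encode_with_pil, image) * 1000:.1f} ms")
    print(f"cv2: {time_encoder(encode_with_cv2, image) * 1000:.1f} ms")
```

Part of any measured difference may come from the two libraries' default compression settings rather than from the encoders themselves, so output sizes are worth comparing alongside the timings.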
With these changes applied in the main docling repo, the conversion time dropped from 14.2 to 9.21 (~35%) with all AI models disabled.
One caveat is that I did add an extra dependency, opencv-python-headless; however, this is already a dependency in the main docling repo's uv.lock.