Extract Lang attribute for marked contents #20407
const pdfDoc = await loadingTask.promise;
const pdfPage = await pdfDoc.getPage(1);

pdfDoc.annotationStorage.setValue("30R", { value: "test" });
There are no annotations in the PDF, so I don't see the point of setting these values.
Sorry, that was a bad copy/paste from a previous test. Removed.

Could you rebase your patch?

/botio test

From: Bot.io (Windows)
Received
Command cmd_test from @calixteman received. Current queue size: 0
Live output at: http://54.193.163.58:8877/d64eb63a5678ba9/output.txt

From: Bot.io (Linux m4)
Received
Command cmd_test from @calixteman received. Current queue size: 0
Live output at: http://54.241.84.105:8877/0d18703028e4f9f/output.txt

From: Bot.io (Linux m4)
Failed
Full output at http://54.241.84.105:8877/0d18703028e4f9f/output.txt
Total script time: 60.00 mins

From: Bot.io (Windows)
Failed
Full output at http://54.193.163.58:8877/d64eb63a5678ba9/output.txt
Total script time: 75.18 mins
Image differences available at: http://54.193.163.58:8877/d64eb63a5678ba9/reftest-analyzer.html#web=eq.log

@edoardocavazza Can you check the failure in issue12909?

If some of the elements in the text layer are going to have a different language, we need to make sure to get the measurements with the appropriate canvas context (this might be what's causing the horizontal scaling difference in

9f576be to cc9f9cf

Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #20407      +/-   ##
==========================================
+ Coverage   89.41%   89.43%   +0.01%
==========================================
  Files         262      262
  Lines       66738    66761      +23
==========================================
+ Hits        59675    59705      +30
+ Misses       7063     7056       -7

cc9f9cf to 52e7a3c

const opList = await pdfPage.getOperatorList();
expect(opList.fnArray[0]).toEqual(OPS.beginMarkedContentProps);
Nit: Improve readability by adding a blank line before the expect statements.
Suggested change:
- const opList = await pdfPage.getOperatorList();
- expect(opList.fnArray[0]).toEqual(OPS.beginMarkedContentProps);
+ const opList = await pdfPage.getOperatorList();
+
+ expect(opList.fnArray[0]).toEqual(OPS.beginMarkedContentProps);

  expect(opList.argsArray[10][2]?.lang).toEqual("es-ES");
});
This test doesn't clean up after itself.
Suggested change:
-   expect(opList.argsArray[10][2]?.lang).toEqual("es-ES");
- });
+   expect(opList.argsArray[10][2]?.lang).toEqual("es-ES");
+   await loadingTask.destroy();
+ });

  expect(span.textContent).toEqual("Esto es español");
});
This test doesn't clean up after itself.
Suggested change:
-   expect(span.textContent).toEqual("Esto es español");
- });
+   expect(span.textContent).toEqual("Esto es español");
+   await loadingTask.destroy();
+ });

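As a variation on the cleanup suggested in the two comments above, the destroy call can be placed in a `finally` block so it also runs when an expectation throws. The following is only a sketch: `createFakeLoadingTask` and `withLoadingTask` are hypothetical names introduced for illustration, and the fake task is a stand-in that mimics just the `promise` and `destroy()` members used in the tests.

```javascript
// Hypothetical stand-in for a loading task; it only models the two members
// the sketch relies on: a `promise` for the document and a `destroy()` method.
function createFakeLoadingTask() {
  const task = {
    destroyed: false,
    promise: Promise.resolve({ numPages: 1 }),
    async destroy() {
      task.destroyed = true;
    },
  };
  return task;
}

// Hypothetical helper: runs `body` against the loaded document and always
// destroys the task afterwards, whether `body` resolves or throws.
async function withLoadingTask(loadingTask, body) {
  try {
    const pdfDoc = await loadingTask.promise;
    return await body(pdfDoc);
  } finally {
    await loadingTask.destroy();
  }
}
```

In a test, the body passed to `withLoadingTask` would hold the `expect` calls, so a failing expectation would not skip the cleanup.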
let langAscentCache = this.#ascentCache.get((lang ||= ""));
if (langAscentCache) {
  const cachedAscent = langAscentCache.get(fontFamily);
  if (cachedAscent) {
    return cachedAscent;
  }
} else {
  langAscentCache = new Map();
  this.#ascentCache.set(lang, langAscentCache);
This can probably be shortened?
Suggested change:
- let langAscentCache = this.#ascentCache.get((lang ||= ""));
- if (langAscentCache) {
-   const cachedAscent = langAscentCache.get(fontFamily);
-   if (cachedAscent) {
-     return cachedAscent;
-   }
- } else {
-   langAscentCache = new Map();
-   this.#ascentCache.set(lang, langAscentCache);
+ const langAscentCache = this.#ascentCache.getOrInsertComputed(
+   (lang ||= ""),
+   makeMap
+ );
+ const cachedAscent = langAscentCache.get(fontFamily);
+ if (cachedAscent) {
+   return cachedAscent;
+ }

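To make the shortened lookup easier to follow in isolation, here is a minimal sketch of the two-level cache (language, then font family). It rests on several assumptions: `Map.prototype.getOrInsertComputed` may not exist in every runtime, so a standalone helper with the same intent is used instead; `makeMap` is assumed to simply return a new `Map`; and the `AscentCache` class and `measure` callback are hypothetical names, not the actual text layer code.

```javascript
// Stand-in for Map.prototype.getOrInsertComputed, written as a plain function
// so the sketch runs in runtimes that do not ship the method.
function getOrInsertComputed(map, key, compute) {
  if (!map.has(key)) {
    map.set(key, compute(key));
  }
  return map.get(key);
}

// Assumed behavior of makeMap: return a fresh, empty Map.
const makeMap = () => new Map();

// Hypothetical wrapper class used only for this illustration.
class AscentCache {
  // lang -> (fontFamily -> ascent)
  #ascentCache = new Map();

  // `measure` is a hypothetical callback that computes the ascent on a miss.
  getAscent(fontFamily, lang, measure) {
    const langAscentCache = getOrInsertComputed(
      this.#ascentCache,
      (lang ||= ""),
      makeMap
    );
    // A cached ascent of 0 counts as a miss here, mirroring the truthiness
    // check in the suggested change.
    const cachedAscent = langAscentCache.get(fontFamily);
    if (cachedAscent) {
      return cachedAscent;
    }
    const ascent = measure(fontFamily, lang);
    langAscentCache.set(fontFamily, ascent);
    return ascent;
  }
}
```

With this shape, a missing or empty language shares one inner map (keyed by the empty string), while each explicit language gets its own.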
let props = null;
if (args[1] instanceof Dict) {
  const lang = args[1].get("Lang");
  if (typeof lang === "string") {
    props = Object.create(null);
    props.lang = stringToPDFString(lang);
  }
}
Wait, why are we adding unused data here (outside of a single unit-test)!?
This means adding effectively dead code in the Firefox PDF Viewer, and it'll also be (ever so slightly) less efficient overall, since this data first needs to be parsed and then sent to the main thread.
Some marked contents have a Lang attribute defined in the props passed to the beginMarkedContentProps operator. With this PR, the evaluator extracts this information. I chose to pass the props object as the third argument in order to maintain backwards compatibility, but I'm open to changes.
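
The extraction described here can be tried outside the evaluator with a small sketch. Everything around the core `if` block is an assumption made for illustration: `Dict` is a minimal stand-in for the PDF dictionary type, `decodePDFString` is a placeholder that returns its input unchanged (the PR uses stringToPDFString at that point), and `extractMarkedContentProps` is a hypothetical function name.

```javascript
// Minimal stand-in for the PDF dictionary type checked by the evaluator.
class Dict {
  #map = new Map();

  set(key, value) {
    this.#map.set(key, value);
  }

  get(key) {
    return this.#map.get(key);
  }
}

// Placeholder for PDF string decoding; this sketch returns the input as-is.
const decodePDFString = str => str;

// Hypothetical helper: returns `{ lang }` when the second operand is a
// dictionary with a string Lang entry, and null otherwise.
function extractMarkedContentProps(args) {
  let props = null;
  if (args[1] instanceof Dict) {
    const lang = args[1].get("Lang");
    if (typeof lang === "string") {
      props = Object.create(null);
      props.lang = decodePDFString(lang);
    }
  }
  return props;
}

// Usage: the operand values below are made up for the example.
const dict = new Dict();
dict.set("Lang", "es-ES");
const props = extractMarkedContentProps(["Span", dict]);

// Appending props as a third entry leaves the first two untouched, so code
// that only reads those entries is unaffected.
const opArgs = ["Span", null, props];
console.log(opArgs[2]?.lang);
```

Returning null when no Lang entry is present keeps the optional-chaining read on the consumer side (`args[2]?.lang`) safe for marked content without a language.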