Switch back to original xmlquery package#305
Merged
Merged
Conversation
There was a problem hiding this comment.
Pull request overview
This PR migrates the EPUB parser stack back to the upstream github.com/antchfx/xmlquery (and antchfx/xpath) now that namespace-aware XPath compilation is available, removing the previously used fork and prefix-injection approach.
Changes:
- Replace
github.com/readium/xmlqueryusage with upstreamgithub.com/antchfx/xmlqueryand namespace-aware, precompiled XPath expressions. - Simplify XML loading by removing the
prefixesparameter fromfetcher.ReadResourceAsXMLand updating all call sites. - Update EPUB parsing modules (package doc, NCX, navdoc, SMIL, encryption, metadata) and related tests to use
xmlquery.QuerySelector*with compiled XPath.
Reviewed changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/parser/epub/utils.go | Introduces shared namespace map + XPath compilation helper; updates container.xml parsing call site. |
| pkg/parser/epub/parser.go | Updates XML loading calls to new ReadResourceAsXML signature. |
| pkg/parser/epub/parser_smil.go | Switches SMIL parsing to compiled namespace-aware XPath selectors. |
| pkg/parser/epub/parser_smil_test.go | Updates test XML loading to new ReadResourceAsXML signature. |
| pkg/parser/epub/parser_packagedoc.go | Switches OPF package document parsing to compiled namespace-aware XPath selectors. |
| pkg/parser/epub/parser_packagedoc_test.go | Updates test XML loading to new ReadResourceAsXML signature. |
| pkg/parser/epub/parser_ncx.go | Switches NCX parsing to compiled namespace-aware XPath selectors. |
| pkg/parser/epub/parser_ncx_test.go | Updates test XML loading to new ReadResourceAsXML signature. |
| pkg/parser/epub/parser_navdoc.go | Switches navdoc parsing to compiled namespace-aware XPath selectors. |
| pkg/parser/epub/parser_navdoc_test.go | Updates test XML loading to new ReadResourceAsXML signature. |
| pkg/parser/epub/parser_encryption.go | Switches encryption.xml parsing to compiled namespace-aware XPath selectors. |
| pkg/parser/epub/parser_encryption_test.go | Updates test XML loading to new ReadResourceAsXML signature. |
| pkg/parser/epub/metadata.go | Switches metadata lookup to compiled namespace-aware XPath selectors; minor slice type update. |
| pkg/parser/epub/metadata_test.go | Updates test XML loading to new ReadResourceAsXML signature. |
| pkg/parser/epub/media_overlay_service.go | Updates XML loading to new ReadResourceAsXML signature. |
| pkg/fetcher/resource.go | Changes ReadResourceAsXML signature and parsing implementation to upstream xmlquery. |
| go.mod | Adds direct requirements for antchfx/xmlquery and antchfx/xpath; removes readium/xmlquery. |
| go.sum | Updates checksums for dependency changes. |
Comments suppressed due to low confidence (2)
pkg/parser/epub/utils.go:50
container.xmluses a default namespace (NamespaceOPC). The unprefixed XPath/container/rootfiles/rootfilewill not match namespaced elements in standard XPath, sonmay be nil for valid EPUBs. Consider adding the container namespace toxmlNSand querying with a prefixed XPath (or usinglocal-name()selectors).
xml, err := ftchr.ReadResourceAsXML(ctx, res)
if err != nil {
return nil, errors.Wrap(err, "failed loading container.xml")
}
n := xml.SelectElement("/container/rootfiles/rootfile")
if n == nil {
return nil, errors.New("rootfile not found in container")
pkg/fetcher/resource.go:105
ReadResourceAsXMLconverts[]bytetostringand then back to anio.Reader, which adds an extra allocation/copy for every XML parse. Prefer using a byte reader (e.g.,bytes.NewReader(bytes)) to avoid the conversion overhead.
func ReadResourceAsXML(ctx context.Context, r Resource) (*xmlquery.Node, *ResourceError) {
bytes, ex := r.Read(ctx, 0, 0)
if ex != nil {
return nil, ex
}
node, err := xmlquery.ParseWithOptions(strings.NewReader(string(bytes)), xmlquery.ParserOptions{
Decoder: &xmlquery.DecoderOptions{
Strict: true,
Entity: xml.HTMLEntity,
},
})
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
We go back to using the original xmlquery package now that the underlying xpath package supports easily using namespaces during querying