modify usfm for chapter-level drafting to avoid import issues; move remarks to chapters #285
Draft
mshannon-sil wants to merge 1 commit into main
ddaspit (Contributor) reviewed on Mar 30, 2026 and left a comment:
@ddaspit reviewed 2 files and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on Enkidu93 and mshannon-sil).
machine/corpora/update_usfm_parser_handler.py line 345 at r1 (raw file):
tokens = list(self._tokens)
if chapters is not None:
    tokens = self._get_incremental_draft_tokens(tokens, chapters)
I think we can do something similar, but before we parse instead of after. Instead of calling parse_usfm in update_usfm, we can do something like this:
tokenizer = UsfmTokenizer(self._settings.stylesheet)
tokens = tokenizer.tokenize(usfm)
tokens = filter_tokens_by_chapter(tokens, chapters)
parser = UsfmParser(tokens, handler, self._settings.stylesheet, self._settings.versification)
parser.process_tokens()
This would avoid updating the whole book.
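For illustration only, here is a minimal sketch of what a helper like filter_tokens_by_chapter might look like. The helper is not part of the library; its name comes from the snippet above. The Token and TokenType classes below are simplified stand-ins so the example runs on its own, and they assume that a chapter token carries its number as a string in a data field. Keeping everything before the first chapter marker (the \id line and other book headers) is also an assumption, made so that the parser would still see the book identification.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Iterable, List, Optional


class TokenType(Enum):
    """Stand-in for the library's token type enum (assumption)."""
    BOOK = auto()
    CHAPTER = auto()
    VERSE = auto()
    TEXT = auto()


@dataclass
class Token:
    """Stand-in for a USFM token; field names are assumptions."""
    type: TokenType
    marker: Optional[str] = None
    data: Optional[str] = None
    text: Optional[str] = None


def filter_tokens_by_chapter(
    tokens: Iterable[Token], chapters: Optional[Iterable[int]]
) -> List[Token]:
    """Hypothetical helper: keep book headers plus the requested chapters."""
    if chapters is None:
        # No chapter restriction: pass everything through unchanged.
        return list(tokens)

    wanted = set(chapters)
    kept: List[Token] = []
    keep = True  # tokens before the first \c marker are always kept
    for token in tokens:
        if token.type is TokenType.CHAPTER:
            try:
                number = int(token.data)
            except (TypeError, ValueError):
                number = None  # malformed chapter number: treat as not requested
            keep = number in wanted
        if keep:
            kept.append(token)
    return kept


# Small usage example with a three-chapter book.
sample = [
    Token(TokenType.BOOK, "id", data="MAT"),
    Token(TokenType.CHAPTER, "c", data="1"),
    Token(TokenType.VERSE, "v", data="1"),
    Token(TokenType.TEXT, text="one"),
    Token(TokenType.CHAPTER, "c", data="2"),
    Token(TokenType.VERSE, "v", data="1"),
    Token(TokenType.TEXT, text="two"),
    Token(TokenType.CHAPTER, "c", data="3"),
    Token(TokenType.TEXT, text="three"),
]
only_chapter_two = filter_tokens_by_chapter(sample, [2])
print([t.marker or t.text for t in only_chapter_two])
```

With this shape, the handler would only ever see tokens from the requested chapters, so the untouched chapters would have to be merged back in from the original text by the caller. Whether that merge is simpler than the post-processing approach is the open question in this thread.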
This PR addresses issue #284.
Mostly looking for high-level feedback about the approach at the moment. As we were discussing, is the right place for this functionality in the get_usfm() method, as essentially a post-processing step? Or should we look to implement this feature in process_tokens() (and maybe move the remark logic here as well)?
Some initial thoughts:
Pros for putting it in get_usfm():
… process_token().
Pros for putting it in process_token():