Skip to content

Add product and satellites list support and HTML map caching to pipeline#53

Open
cmspeed wants to merge 4 commits into
OPERA-Cal-Val:mainfrom
cmspeed:update-run-nextpass
Open

Add product and satellites list support and HTML map caching to pipeline#53
cmspeed wants to merge 4 commits into
OPERA-Cal-Val:mainfrom
cmspeed:update-run-nextpass

Conversation

@cmspeed

@cmspeed cmspeed commented May 26, 2026

Copy link
Copy Markdown
Collaborator

This PR updates the PipelineConfig and core execution functions in pipeline.py to accept lists for the product and satellites parameters. It also introduces a caching mechanism to port pre-generated footprint maps into individual output folders. To support the new unified workflow in the disaster-alerts dashboard, the pipeline must handle a two-phase architecture:

(1) pipeline.py must be able to accept lists of products and satellites to pass down to run_next_pass so it can generate a single master footprints map.

(2) When looping through individual products to mosaic them, run_pipeline must be able to detect the master cached folder and pull that unified HTML map into the active output directory without triggering redundant searches or overwriting the map.

IMPORTANT NOTE: This PR requires the -p and -s parameter updates in the run_next_pass function within the next_pass module to function. Therefore, prior to merging this PR, next_pass PR #51 should first be merged.

@ehavazli ehavazli left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update here. I found two blocking issues before this is safe to merge:

  1. The new product: list[str] support is still only partial inside pipeline.py. In run_pipeline, folder_name = config.mode if config.mode else config.product can now become a list, and config.output_dir / folder_name will fail when mode is unset. Downstream, generate_products(..., product=config.product) still treats product as a scalar and does checks like "DSWX" in product, so the advertised multi-product workflow is not end-to-end safe yet.

  2. The new cached/cloud-search inputs are not wired through this repo's own entrypoints. PipelineConfig now has search_dir and satellites, but the CLI still constructs PipelineConfig without those fields, and the standalone search / download commands still only expose scalar product input. That means the new behavior is effectively inaccessible from this repo's public interface unless a caller builds PipelineConfig manually.

@cmspeed cmspeed requested a review from ehavazli June 4, 2026 00:38
@cmspeed

cmspeed commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator Author

Code has been updated to address points 1 and 2 above. Also tests/test_pipeline.py has been added for generic testing of pipeline functionality.

@ehavazli

ehavazli commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator

Manual review on 0b2b829 found three blocking issues:

  1. src/disasters/pipeline.py:194-195 still builds mode_dir from config.product before handling the new list type. When product is a list[str] and mode is unset, config.output_dir / folder_name raises TypeError, so the advertised multi-product path never starts.

  2. src/disasters/pipeline.py:291-306 still passes config.product straight into generate_products, but src/disasters/pipeline.py:1077-1090 still only handles scalar strings. With a list input, checks like "DSWX" in product test list membership, not substring matching, so the code falls into the unsupported branch and exits without producing outputs. Even after fixing mode_dir, multi-product execution is still not end-to-end.

  3. The committed PR does not wire the new interface through the public CLI. src/disasters/cli.py:73-77, src/disasters/cli.py:239-257, src/disasters/cli.py:309-313, and src/disasters/cli.py:406-410 still expose only scalar --product, and there is no --search-dir or --satellites wiring into PipelineConfig or the standalone search/download entrypoints. The PR description and follow-up comment claim those paths were updated, but the actual mergeable diff at 0b2b829 does not contain those changes.

CodeRabbit raised 0 issues, but the manual review still found the blockers above.

@ehavazli ehavazli left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review finding:

Critical: src/disasters/pipeline.py line 509 passes products=[np_prod], which wraps the already-built product list in another list. For multi-product downloads this sends a nested list like [['DSWX-HLS_V1', 'RTC-S1_V1']] to next_pass.run_next_pass.

next_pass.run_next_pass expects products as str | list[str] and extends that list into CLI args, so the nested list can produce malformed product arguments.

Suggested fix:

products=np_prod

instead of:

products=[np_prod] if np_prod else None

@ehavazli ehavazli self-requested a review June 9, 2026 18:47

@ehavazli ehavazli left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cmspeed I made a small change. Please check and merge after confirming.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants