add intervals for mutect2 PON #78

Merged
maxulysse merged 6 commits into nf-core:dev from maxulysse:intervals_mutect2 on May 19, 2026
Conversation

@maxulysse
Member

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/createpanelrefs branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Comment thread nextflow_schema.json Outdated

Copilot AI left a comment

Pull request overview

This PR adds support for running Mutect2 Panel-of-Normals generation with a scatter/gather strategy over genomic intervals, controlled by a new mutect2_intervals_num parameter. It wires interval splitting and downstream merging into the Mutect2 PoN subworkflow, and updates related nf-core utility subworkflows and tests.

Changes:

  • Add mutect2_intervals_num pipeline parameter and plumb it through PREPARE_GENOME → CREATEPANELREFS → BAM_CREATE_SOM_PON_GATK.
  • Implement interval-based scatter/gather for Mutect2 PoN using new/updated GATK4 modules (SplitIntervals, MergeVcfs, MergeMutectStats) and adjust Mutect2 output naming.
  • Update nf-test scenarios/snapshots and refresh some module/utility internals (e.g., params JSON dumping, mosdepth env/container).
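
The scatter/gather strategy summarized above can be sketched in plain Python. This is an illustration of the general pattern only, not the pipeline's Nextflow code: `split_intervals`, `call_variants`, and `gather` are hypothetical stand-ins for the split, per-shard calling, and merge steps, the round-robin assignment is chosen for brevity, and the `(contig, start, end)` tuples are made-up data.

```python
from typing import Callable

# (contig, start, end); illustrative data shape, not a real interval_list record
Interval = tuple[str, int, int]


def split_intervals(intervals: list[Interval], num_groups: int) -> list[list[Interval]]:
    """Scatter: assign intervals round-robin to at most num_groups non-empty groups."""
    if num_groups < 1:
        raise ValueError("num_groups must be >= 1")
    groups: list[list[Interval]] = [[] for _ in range(min(num_groups, len(intervals)))]
    for i, interval in enumerate(intervals):
        groups[i % len(groups)].append(interval)
    return groups


def call_variants(group: list[Interval]) -> list[str]:
    """Hypothetical per-shard worker: emits one placeholder record per interval."""
    return [f"{contig}:{start}-{end}" for contig, start, end in group]


def gather(shards: list[list[str]]) -> list[str]:
    """Gather: concatenate the per-shard results and sort them into one list."""
    return sorted(record for shard in shards for record in shard)


def scatter_gather(
    intervals: list[Interval],
    num_groups: int,
    worker: Callable[[list[Interval]], list[str]] = call_variants,
) -> list[str]:
    """Run the worker once per interval group, then merge the shard outputs."""
    shards = [worker(group) for group in split_intervals(intervals, num_groups)]
    return gather(shards)


if __name__ == "__main__":
    demo = [("chr1", 1, 100), ("chr1", 101, 200), ("chr2", 1, 50)]
    print(scatter_gather(demo, num_groups=2))
```

The property worth checking in any such setup is that the gathered result does not depend on the number of groups: `scatter_gather(demo, 1)` and `scatter_gather(demo, 2)` should return the same records.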

Reviewed changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| workflows/createpanelrefs.nf | Pass interval scatter metadata into the Mutect2 PoN subworkflow. |
| tests/mutect2.nf.test | Add a new test scenario for --mutect2_intervals_num. |
| tests/mutect2.nf.test.snap | Update snapshots to cover the new Mutect2 intervals scenario. |
| subworkflows/nf-core/utils_nfschema_plugin/meta.yml | Extend schema-plugin subworkflow inputs (help + cli typecasting). |
| subworkflows/nf-core/utils_nfschema_plugin/main.nf | Wire cliTypecast and adjust help parameter handling. |
| subworkflows/nf-core/utils_nextflow_pipeline/main.nf | Revise params JSON dump logic and JSON serialization handling. |
| subworkflows/nf-core/bam_create_som_pon_gatk/meta.yml | Update subworkflow interface to accept separate interval channels and new merge steps. |
| subworkflows/nf-core/bam_create_som_pon_gatk/main.nf | Implement Mutect2 scatter/gather branching and merging before GenomicsDBImport. |
| subworkflows/local/utils_nfcore_createpanelrefs_pipeline/main.nf | Update schema plugin invocation to include the new cli_typecast input. |
| subworkflows/local/prepare_genome/main.nf | Add interval splitting via GATK4_SPLITINTERVALS and emit interval scatter tuples. |
| nextflow.config | Add default mutect2_intervals_num. |
| nextflow_schema.json | Add mutect2_intervals_num to the pipeline schema. |
| modules/nf-core/multiqc/.conda-lock/linux_arm64-bd-40bf3b435e89dc22_1.txt | Remove MultiQC conda lock file (arm64). |
| modules/nf-core/multiqc/.conda-lock/linux_amd64-bd-c1f4a7982b743963_1.txt | Remove MultiQC conda lock file (amd64). |
| modules/nf-core/mosdepth/main.nf | Update mosdepth container reference. |
| modules/nf-core/mosdepth/environment.yml | Add explicit htslib dependency and annotate deps for renovate. |
| modules/nf-core/gatk4/splitintervals/meta.yml | Add module metadata for SplitIntervals. |
| modules/nf-core/gatk4/splitintervals/main.nf | Add the SplitIntervals process module implementation. |
| modules/nf-core/gatk4/splitintervals/environment.yml | Add conda environment for SplitIntervals. |
| modules/nf-core/gatk4/mutect2/meta.yml | Align Mutect2 output filenames to use ${prefix}. |
| modules/nf-core/gatk4/mutect2/main.nf | Update Mutect2 outputs to ${prefix}-based naming. |
| modules/nf-core/gatk4/mergevcfs/meta.yml | Add module metadata for MergeVcfs. |
| modules/nf-core/gatk4/mergevcfs/main.nf | Add the MergeVcfs process module implementation. |
| modules/nf-core/gatk4/mergevcfs/environment.yml | Add conda environment for MergeVcfs. |
| modules/nf-core/gatk4/mergemutectstats/meta.yml | Add module metadata for MergeMutectStats. |
| modules/nf-core/gatk4/mergemutectstats/main.nf | Add the MergeMutectStats process module implementation. |
| modules/nf-core/gatk4/mergemutectstats/environment.yml | Add conda environment for MergeMutectStats. |
| modules/nf-core/gatk4/createsomaticpanelofnormals/meta.yml | Align CreateSomaticPanelOfNormals output patterns to ${prefix}. |
| modules/nf-core/gatk4/createsomaticpanelofnormals/main.nf | Update CreateSomaticPanelOfNormals outputs/stub to ${prefix} naming. |
| modules.json | Record updated/new module and subworkflow SHAs. |
| main.nf | Plumb mutect2_intervals_num into PREPARE_GENOME and NFCORE_CREATEPANELREFS. |
| conf/test.config | Set test defaults for mutect2_intervals_num. |
| conf/modules/mutect2.config | Add scatter/gather-aware tagging/prefixing/publishing for Mutect2 and merge steps. |
| conf/modules/cnvkit.config | Reorder/restore SAMTOOLS_VIEW module configuration block. |

    tuple val(meta4), path(dict)

    output:
    tuple val(meta), path("**.interval_list"), emit: split_intervals
Comment on lines +64 to +69
          pattern: "*.interval_list"
      - "**.interval_list":
          type: map
          description: |
            Groovy Map containing sample information
            e.g. [ id:'test' ]
Comment on lines +38 to +62
output:
  vcf:
    - - meta:
          type: file
          description: merged vcf file
          pattern: "*.vcf.gz"
          ontologies:
            - edam: http://edamontology.org/format_3989 # GZIP format
      - "*.vcf.gz":
          type: file
          description: merged vcf file
          pattern: "*.vcf.gz"
          ontologies:
            - edam: http://edamontology.org/format_3989 # GZIP format
  tbi:
    - - meta:
          type: file
          description: merged vcf file
          pattern: "*.vcf.gz"
          ontologies:
            - edam: http://edamontology.org/format_3989 # GZIP format
      - "*.tbi":
          type: file
          description: index files for the merged vcf files
          pattern: "*.tbi"
Comment on lines +70 to +78
        .map { meta, vcf -> [meta - meta.subMap('num_intervals'), vcf] }

    ch_tbi = tbi_branch.no_intervals
        .mix(GATK4_MERGEVCFS.out.tbi)
        .map { meta, tbi -> [meta - meta.subMap('num_intervals'), tbi] }

    ch_stats = stats_branch.no_intervals
        .mix(GATK4_MERGEMUTECTSTATS.out.stats)
        .map { meta, stats -> [meta - meta.subMap('num_intervals'), stats] }
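
Each `.map` call in the snippet above removes the `num_intervals` key from the meta map before the channel is passed on, presumably so that merged and unscattered outputs carry the same metadata keys. A plain-Python sketch of that idea follows. `strip_keys` and the sample values are hypothetical, and the snippet itself relies on Groovy map subtraction rather than anything shown here.

```python
def strip_keys(meta: dict, keys: tuple[str, ...] = ("num_intervals",)) -> dict:
    """Return a copy of meta without the given keys; the input dict is not modified."""
    return {k: v for k, v in meta.items() if k not in keys}


# Made-up channel contents: (meta, file name) pairs as they might look after merging.
channel = [
    ({"id": "sample1", "num_intervals": 4}, "sample1.vcf.gz"),
    ({"id": "sample2", "num_intervals": 1}, "sample2.vcf.gz"),
]

# Equivalent in spirit to: .map { meta, vcf -> [meta - meta.subMap('num_intervals'), vcf] }
cleaned = [(strip_keys(meta), vcf) for meta, vcf in channel]

if __name__ == "__main__":
    for meta, vcf in cleaned:
        print(meta, vcf)
```

Building a new dict instead of deleting the key in place mirrors the Groovy expression, which also produces a new map and leaves the original untouched.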
Comment on lines +76 to +83
    def temp_pf = workflow.launchDir.resolve(".${filename}")
    def jsonGenerator = new groovy.json.JsonGenerator.Options()
        .excludeNulls()
        .addConverter(Path) { Path path -> path.toUriString() }
        .addConverter(Duration) { Duration duration -> duration.toMillis() }
        .addConverter(MemoryUnit) { MemoryUnit memory -> memory.toBytes() }
        .addConverter(nextflow.script.types.VersionNumber) { nextflow.script.types.VersionNumber version -> version.toString() }
        .build()
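
The generator in the snippet above combines two ideas: null values are skipped, and types that have no native JSON form are converted first (paths to URI strings, durations to milliseconds). A rough Python analogue using only the standard library is sketched below. `to_jsonable` and `dump_params` are hypothetical names, the parameter values are made up, and unlike the Groovy option this sketch only drops nulls at the top level.

```python
import json
from datetime import timedelta
from pathlib import PurePosixPath


def to_jsonable(value):
    """Fallback converter for values json.dumps cannot serialize natively."""
    if isinstance(value, PurePosixPath):
        return value.as_uri()  # requires an absolute path
    if isinstance(value, timedelta):
        return int(value.total_seconds() * 1000)  # milliseconds
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def dump_params(params: dict) -> str:
    """Serialize params to JSON, skipping top-level keys whose value is None."""
    non_null = {k: v for k, v in params.items() if v is not None}
    return json.dumps(non_null, default=to_jsonable, sort_keys=True)


if __name__ == "__main__":
    example = {
        "outdir": PurePosixPath("/work/out"),
        "max_time": timedelta(seconds=2),
        "email": None,
    }
    print(dump_params(example))
```

Registering one converter per type keeps the serialization rules in a single place, which is the same design the `addConverter` chain expresses.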
@maxulysse maxulysse merged commit 8e82d2b into nf-core:dev May 19, 2026
21 checks passed
@maxulysse maxulysse deleted the intervals_mutect2 branch May 19, 2026 15:15
3 participants