Snow

❄️ SNOW ❄️

Second-pass de Novo variant Offspring Workflow
Developed by Tychele N. Turner, Ph.D.


SNOW takes de novo variants from HAT-FLEX or other DNV callers and performs second-pass cleaning, characterization, and annotation. Outputs can be used directly for QC with acorn or any downstream analysis.

Overview

Use SNOW when you:

  • Have de novo variant calls (from HAT-FLEX or another caller) in a trio.
  • Want clean, merged, parent-of-origin annotated de novos instead of fragmented, ambiguous calls.
  • Work with short-read, long-read, or mixed sequencing technologies.

Note

SNOW tools operate on VCF + BAM/CRAM files.
If your DNV calls are not from HAT-FLEX, you can use snow table2snow to build a properly formatted VCF from a simple DNV table.

Note

SNOW supports de novo VCFs from highly accurate short-read and long-read sequencing.
BAM/CRAM inputs can likewise be short-read or long-read.


Features

🧭 Command overview

Command               What it does
snow consolidate      Merge nearby de novos on the same read
snow parentcheck      Re-check parent/child read support for de novos
snow phaser           Phase de novos to infer parent-of-origin
snow pangenomecheck   Compare de novos to pangenome graph
snow table2snow       Table → SNOW-compatible VCF
snow snow2table       SNOW VCF → table
snow snow2bed         SNOW VCF → BED
snow snow2acorn       SNOW VCF → acorn-ready QC table
snow snowfall         Fun mode: make it snow

✨ Cleaner, better-annotated de novos

Merge nearby calls into combined variants

  • snow consolidate merges nearby de novo variants on the same read into combined variants using:
    • CLUSTER / CLUSTER_INFO from HAT-FLEX or generated by snow table2snow.
    • Short genomic distance and haplotype-aware read information.
  • Prevents over-fragmentation of de novo calls.
  • Tracks which original VCF records were merged so you can trace back to the input.
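The distance part of this merging can be illustrated with a small sketch. This is not SNOW's implementation: it only groups calls whose consecutive gap is within a window, and it ignores the same-read and CLUSTER / CLUSTER_INFO evidence that `snow consolidate` uses. The function name `cluster_by_window` and the tuple layout are hypothetical.

```python
from typing import List, Tuple

Variant = Tuple[str, int]  # (chromosome, 1-based position); hypothetical layout


def cluster_by_window(variants: List[Variant], window: int = 100) -> List[List[Variant]]:
    """Group calls on the same chromosome whose consecutive gap is <= window bp."""
    clusters: List[List[Variant]] = []
    for chrom, pos in sorted(variants):
        last = clusters[-1][-1] if clusters else None
        if last is not None and last[0] == chrom and pos - last[1] <= window:
            clusters[-1].append((chrom, pos))  # close enough: extend the current cluster
        else:
            clusters.append([(chrom, pos)])    # otherwise start a new cluster
    return clusters


calls = [("chr1", 1000), ("chr1", 1003), ("chr1", 5000), ("chr2", 1001)]
clusters = cluster_by_window(calls, window=100)
```

Here the two calls 3 bp apart on chr1 end up in one cluster, while the distant chr1 call and the chr2 call stay separate. A real merge would additionally require that the grouped alleles are carried by the same reads.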

Check alternate-allele support in the reads

  • snow parentcheck re-examines raw alignments around each candidate de novo in all trio members.
  • Counts alternate-allele reads in each parent and in the child.
  • Flags sites where the “de novo” allele is actually present (or weakly present) in a parent:
    • Helps filter out missed inherited variants.
    • Highlights likely technical artifacts / mapping issues.
  • Also flags sites where the child has no reads carrying the de novo allele (useful when the calls and the alignments come from different technologies or sequencing runs).
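The flagging rules above can be sketched as a small decision function over per-sample alternate-allele read counts. This is an illustration only: the flag names, the `AltCounts` class, and the `min_parent_alt` threshold are assumptions, not SNOW's actual INFO fields or defaults.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AltCounts:
    """Reads carrying the candidate de novo allele in each trio member (hypothetical)."""
    child: int
    father: int
    mother: int


def parentcheck_flags(counts: AltCounts, min_parent_alt: int = 2) -> List[str]:
    """Return illustrative flags for one candidate site."""
    flags: List[str] = []
    for parent in ("father", "mother"):
        n = getattr(counts, parent)
        if n >= min_parent_alt:
            flags.append(f"{parent}_alt_present")  # likely a missed inherited variant
        elif n > 0:
            flags.append(f"{parent}_alt_weak")     # weak support: possible artifact or mosaicism
    if counts.child == 0:
        flags.append("child_alt_missing")          # no child read carries the allele
    return flags


clean_site = parentcheck_flags(AltCounts(child=12, father=0, mother=0))
suspect_site = parentcheck_flags(AltCounts(child=10, father=3, mother=1))
```

A site with no flags has alternate-allele reads in the child only, which is the pattern expected for a true de novo.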

🧬 Parent-of-origin phasing

Phase de novos to infer parent-of-origin (POO)

  • snow phaser supports both short-read and long-read data.
  • Uses:
    • A full trio VCF (for phase-by-transmission SNPs).
    • A de novo–only VCF from HAT-FLEX or generated with snow table2snow.
  • Outputs:
    • Parent-of-origin assignment (maternal / paternal / unknown).
    • Read support and counts from proband and parents.
    • Confidence and PHRED-scaled quality scores for POO.
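To make the idea of a PHRED-scaled parent-of-origin score concrete, here is a minimal sketch under a deliberately simple model, which is an assumption and not SNOW's scoring: each read that links the de novo allele to a parental haplotype is treated as independent and wrong with probability `per_read_error`, both parents are equally likely a priori, and the score is -10·log10 of the posterior probability that the chosen parent is wrong. The function and parameter names are hypothetical.

```python
import math
from typing import Tuple


def assign_poo(paternal_reads: int, maternal_reads: int,
               per_read_error: float = 0.01,
               max_phred: float = 99.0) -> Tuple[str, float]:
    """Assign parent-of-origin from linked-read counts and return a capped PHRED score."""
    if paternal_reads == maternal_reads:
        return "unknown", 0.0  # no evidence, or evenly split evidence
    # log10 likelihood ratio in favour of the majority parent
    log_odds = abs(paternal_reads - maternal_reads) * math.log10(
        (1.0 - per_read_error) / per_read_error
    )
    # Posterior probability that the majority call is wrong (equal priors);
    # the exponent is clamped to avoid float overflow for very deep coverage.
    p_wrong = 1.0 / (1.0 + 10 ** min(log_odds, 30.0))
    phred = min(max_phred, -10.0 * math.log10(p_wrong))
    label = "paternal" if paternal_reads > maternal_reads else "maternal"
    return label, round(phred, 1)


label, quality = assign_poo(paternal_reads=3, maternal_reads=1)
```

Under this toy model the score depends only on the difference between the two counts, so a 3-to-1 split scores the same as 2-to-0; a production scorer would normally also account for base and mapping quality.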

Sex-aware handling of sex chromosomes

  • Special logic for male X/Y:
    • PAR regions treated as diploid.
    • Non-PAR X/Y treated as hemizygous where appropriate.
  • PAR regions can be provided via --par-bed; otherwise, built-in GRCh37/GRCh38 PARs are used.
  • Helps avoid spurious “de novo” calls driven by incorrect ploidy assumptions.
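The ploidy logic described above can be sketched as a lookup against PAR intervals. The coordinates below are the commonly cited GRCh38 PAR1/PAR2 boundaries (1-based, inclusive) and are included as an assumption; SNOW's built-in intervals and its internal function names may differ, and `expected_ploidy` is hypothetical.

```python
from typing import Dict, List, Tuple

# Commonly cited GRCh38 pseudoautosomal regions, 1-based inclusive (assumed values).
GRCH38_PAR: Dict[str, List[Tuple[int, int]]] = {
    "chrX": [(10_001, 2_781_479), (155_701_383, 156_030_895)],
    "chrY": [(10_001, 2_781_479), (56_887_903, 57_217_415)],
}


def expected_ploidy(chrom: str, pos: int, sex: str,
                    par: Dict[str, List[Tuple[int, int]]] = GRCH38_PAR) -> int:
    """Return the expected copy number at a site for a proband of the given sex."""
    if chrom not in ("chrX", "chrY"):
        return 2                              # autosomes are diploid
    if sex == "female":
        return 2 if chrom == "chrX" else 0    # no chrY expected in females
    in_par = any(start <= pos <= end for start, end in par[chrom])
    return 2 if in_par else 1                 # male: PAR diploid, non-PAR hemizygous


male_par = expected_ploidy("chrX", 1_000_000, "male")
male_non_par = expected_ploidy("chrX", 50_000_000, "male")
```

A caller that assumes ploidy 2 at a male non-PAR X site expects heterozygous genotypes that cannot exist there, which is one way spurious de novo calls arise.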

🏷️ Rich annotations & format conversions

Rich INFO annotations

SNOW adds structured VCF INFO fields for:

  • Parent-of-origin.
  • Allelic depths and strand balance.
  • Nearby supporting markers used for phasing.

VCF ↔ table ↔ BED

  • snow table2snow: convert a simple de novo (DNV) table into a SNOW-compatible VCF.
  • snow snow2table: convert a SNOW VCF back into a table.
  • snow snow2bed: convert a SNOW VCF into a BED file.
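When moving between VCF and BED, the main pitfall is the coordinate system: VCF positions are 1-based, while BED intervals are 0-based and half-open. The sketch below shows that conversion for a single record; the function name and the `REF>ALT` label in the fourth column are illustrative assumptions, not the exact columns `snow snow2bed` writes.

```python
from typing import Tuple


def vcf_record_to_bed(chrom: str, pos: int, ref: str, alt: str) -> Tuple[str, int, int, str]:
    """Convert a 1-based VCF record to a 0-based, half-open BED interval."""
    start = pos - 1          # BED start is 0-based
    end = start + len(ref)   # the interval spans the reference allele
    return chrom, start, end, f"{ref}>{alt}"


snv = vcf_record_to_bed("chr1", 100, "A", "T")
deletion = vcf_record_to_bed("chr1", 100, "ACG", "A")
```

An SNV at VCF position 100 becomes the BED interval 99 to 100, and a deletion with a three-base REF allele spans 99 to 102.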

VCF → acorn format for QC

  • snow snow2acorn converts SNOW-annotated VCFs into an acorn-ready table:
    • Designed for de novo QC and visualization with acorn.

☃️ Fun mode

  • snow snowfall to… make it snow!

Installation

Important

SNOW requires Python 3.8+ and access to BAM/CRAM + VCF files from your trio.

git clone https://github.com/tycheleturner/snow
cd snow/snow_cli_py/
pip3 install .

Help for snow

usage: snow [-h] [--version] {consolidate,parentcheck,phaser,pangenomecheck,table2snow,snow2table,snow2bed,snow2acorn,snowfall} ...

SNOW (Second-pass de Novo variant Offspring Workflow) CLI. SNOW takes de novo variants from HAT-FLEX or other DNV callers and performs second-pass cleaning, characterization, and annotation. Outputs can be used directly for QC with acorn or other downstream analyses.

positional arguments:
  {consolidate,parentcheck,phaser,pangenomecheck,table2snow,snow2table,snow2bed,snow2acorn,snowfall}
                        Subcommands
    consolidate         Merge nearby de novos on the same read
    parentcheck         Re-check parent/child read support for de novos
    phaser              Phase de novos to infer parent-of-origin
    pangenomecheck      Check the DNV file for presence of site and/or specific alternate allele in a decomposed pangenome graph VCF
    table2snow          Table to SNOW-compatible VCF
    snow2table          SNOW VCF to table
    snow2bed            SNOW VCF to BED
    snow2acorn          SNOW VCF to acorn-ready QC table
    snowfall            Make it snow!

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

Example Run Integrating consolidate, parentcheck, and phaser

snow consolidate \
  --vcf-in child.hatflex.dnvs.final.vcf.gz \
  --bam child.cram \
  --ref GRCh38_full_analysis_set_plus_decoy_hla.fa \
  --child child \
  --window 100 \
  --vcf-out child_sconsolidate.vcf.gz \
  --cluster-field CLUSTER \
  --verbose \
  --min-alt-reads 2

snow parentcheck \
  --vcf-in child_sconsolidate.vcf.gz \
  --vcf-out child_sconsolidate_sparentcheck.vcf.gz \
  --father-bam father.cram \
  --mother-bam mother.cram \
  --child-bam child.cram \
  --ref GRCh38_full_analysis_set_plus_decoy_hla.fa \
  --verbose \
  --min-k-alt-reads 2

snow phaser \
  --full-vcf family_whole_genome_joint_calls.vcf.gz \
  --denovo-vcf child_sconsolidate_sparentcheck.vcf.gz \
  --child-bam child.cram \
  --child-sample child \
  --father-sample father \
  --mother-sample mother \
  --window 40000 \
  --min-gq 20 \
  --denovo-require-flag \
  --annotated-vcf child_snow.vcf.gz \
  -v \
  -vv \
  --proband-sex female
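
If you prefer to drive the three steps from Python, the sketch below builds the same commands as argument lists and prints them in dry-run mode. It uses only flags that appear in the example above (the verbosity flags are omitted); the helper names `build_commands` and `run_pipeline` and the file-naming pattern are assumptions for illustration.

```python
import shlex
import subprocess
from typing import List


def build_commands(child: str = "child", father: str = "father", mother: str = "mother",
                   ref: str = "GRCh38_full_analysis_set_plus_decoy_hla.fa",
                   full_vcf: str = "family_whole_genome_joint_calls.vcf.gz") -> List[List[str]]:
    """Build the consolidate -> parentcheck -> phaser commands, chaining outputs to inputs."""
    merged = f"{child}_sconsolidate.vcf.gz"
    checked = f"{child}_sconsolidate_sparentcheck.vcf.gz"
    return [
        ["snow", "consolidate", "--vcf-in", f"{child}.hatflex.dnvs.final.vcf.gz",
         "--bam", f"{child}.cram", "--ref", ref, "--child", child, "--window", "100",
         "--vcf-out", merged, "--cluster-field", "CLUSTER", "--min-alt-reads", "2"],
        ["snow", "parentcheck", "--vcf-in", merged, "--vcf-out", checked,
         "--father-bam", f"{father}.cram", "--mother-bam", f"{mother}.cram",
         "--child-bam", f"{child}.cram", "--ref", ref, "--min-k-alt-reads", "2"],
        ["snow", "phaser", "--full-vcf", full_vcf, "--denovo-vcf", checked,
         "--child-bam", f"{child}.cram", "--child-sample", child,
         "--father-sample", father, "--mother-sample", mother,
         "--window", "40000", "--min-gq", "20", "--denovo-require-flag",
         "--annotated-vcf", f"{child}_snow.vcf.gz", "--proband-sex", "female"],
    ]


def run_pipeline(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


commands = build_commands()
run_pipeline(commands, dry_run=True)
```

Passing `dry_run=False` runs the steps in order and raises on the first non-zero exit code, so a failed consolidate step does not feed an incomplete VCF into parentcheck.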

Make it snow

snow snowfall --fancy

Citation

If you use SNOW in your work, please cite: https://www.medrxiv.org/content/10.64898/2026.01.26.26344889v1
