This workflow is a best-practice workflow for <detailed description>.
The workflow is built using Snakemake and consists of the following steps:
- Download genome reference from NCBI
- Validate downloaded genome (Python script)
- Simulate short read sequencing data on the fly (dwgsim)
- Check quality of input read data (FastQC)
- Collect statistics from tool output (MultiQC)
This template workflow creates artificial sequencing data in *.fastq.gz format.
It does not contain actual input data.
The simulated input files are nevertheless generated from a mandatory sample sheet, whose path is set in the config.yaml file (default: .test/samples.tsv).
The sample sheet has the following layout:
| sample | condition | replicate | read1 | read2 |
|---|---|---|---|---|
| sample1 | wild_type | 1 | sample1.bwa.read1.fastq.gz | sample1.bwa.read2.fastq.gz |
| sample2 | wild_type | 2 | sample2.bwa.read1.fastq.gz | sample2.bwa.read2.fastq.gz |
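
The layout above can be checked before running the workflow. The following is a minimal sketch, not part of the workflow itself: it assumes the sample sheet is tab-separated (as the `.tsv` extension suggests), and the function name `load_sample_sheet` and the uniqueness check on sample names are hypothetical additions for illustration.

```python
import csv
import io

# Column names taken from the sample sheet layout shown above.
REQUIRED_COLUMNS = ["sample", "condition", "replicate", "read1", "read2"]


def load_sample_sheet(handle):
    """Parse a tab-separated sample sheet and return a list of row dicts.

    Hypothetical helper: raises ValueError if a required column is missing
    or if a sample name occurs more than once (the latter is an assumption,
    not a documented requirement of the workflow).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"sample sheet is missing columns: {missing}")
    rows = list(reader)
    names = [row["sample"] for row in rows]
    if len(names) != len(set(names)):
        raise ValueError("sample names must be unique")
    return rows


# In-memory copy of the example table, so the sketch runs without any files.
EXAMPLE = (
    "sample\tcondition\treplicate\tread1\tread2\n"
    "sample1\twild_type\t1\tsample1.bwa.read1.fastq.gz\tsample1.bwa.read2.fastq.gz\n"
    "sample2\twild_type\t2\tsample2.bwa.read1.fastq.gz\tsample2.bwa.read2.fastq.gz\n"
)

samples = load_sample_sheet(io.StringIO(EXAMPLE))
for row in samples:
    print(row["sample"], row["read1"], row["read2"])
```

To check a real file instead, the same function can be called on an open file handle, for example `load_sample_sheet(open(".test/samples.tsv", newline=""))`.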