This README provides instructions for running the sRNA-seq pipeline. The pipeline processes raw FASTQ files, performs quality filtering, adapter trimming, read mapping, and quantification using various bioinformatics tools. Below are the steps required to execute the pipeline successfully.
- Raw FASTQ Files:
  - Place the raw FASTQ files into the `in` folder.
  - Ensure that the files are in gzip-compressed format (`*.fastq.gz`).
  - Name the files according to the following pattern: `<sample_name>_R1_001.fastq.gz` for read 1 and `<sample_name>_R2_001.fastq.gz` for read 2.
- Define Samples:
  - Edit `samples.csv` with one sample and corresponding condition per line.
  - This file specifies the sample names used in the pipeline.
- Run Pipeline Script:
  - Execute the `pipeline.sh` script in the terminal.
  - Alternatively, right-click the `pipeline.sh` file and select "Run as Program" from the context menu.
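
Before starting a run, it can help to confirm that the input files and `samples.csv` agree with each other. The following is a minimal sketch of such a check, not part of the pipeline itself. It assumes that `samples.csv` has two comma-separated columns (sample name first, condition second) and no header row; the function names `parse_fastq_name`, `read_samples`, and `missing_samples` are hypothetical helpers introduced only for illustration.

```python
import csv
import re

# Naming rule from the steps above: <sample_name>_R1_001.fastq.gz or <sample_name>_R2_001.fastq.gz
FASTQ_NAME = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])_001\.fastq\.gz$")


def parse_fastq_name(filename):
    """Return (sample_name, read_number), or None if the name does not match."""
    match = FASTQ_NAME.match(filename)
    if match is None:
        return None
    return match.group("sample"), int(match.group("read"))


def read_samples(lines):
    """Read (sample, condition) pairs from an iterable of text lines.

    Assumption: two comma-separated columns, sample first, no header row.
    """
    pairs = []
    for row in csv.reader(lines):
        if not row or not row[0].strip():
            continue  # skip blank lines
        sample = row[0].strip()
        condition = row[1].strip() if len(row) > 1 else ""
        pairs.append((sample, condition))
    return pairs


def missing_samples(samples, filenames):
    """List samples that have no correctly named FASTQ file."""
    found = set()
    for name in filenames:
        parsed = parse_fastq_name(name)
        if parsed is not None:
            found.add(parsed[0])
    return [sample for sample, _ in samples if sample not in found]
```

In practice, `read_samples` could be given `open("samples.csv", newline="")` and `missing_samples` the result of `os.listdir("in")`; an empty list would then suggest that every listed sample has at least one matching input file.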
The pipeline.sh script automates the setup and execution of the sRNA-seq pipeline. It performs the following tasks:
- Check Dependencies:
  - Verifies if Conda is installed. If not, installs it.
  - Checks for the required Conda environment. If not present, creates it.
- Create Conda Environment:
  - Sets up a Conda environment.
- Execute Snakemake Pipeline:
  - Runs the Snakemake pipeline within the created Conda environment.
  - Utilizes all available CPU cores for efficient processing.
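
The tasks above can be sketched in Python to make the order of operations concrete. This is an illustration under stated assumptions, not the contents of `pipeline.sh`: the environment name `srna-seq` and the file `environment.yml` are hypothetical, and installing Conda itself is left out. The commands used (`conda env list --json`, `conda env create`, `conda run`, and `snakemake --cores all`) are standard Conda and Snakemake invocations.

```python
import json
import shutil
import subprocess

ENV_NAME = "srna-seq"         # hypothetical environment name
ENV_FILE = "environment.yml"  # hypothetical environment definition file


def conda_available():
    """True if a `conda` executable is on PATH."""
    return shutil.which("conda") is not None


def env_exists(env_name, conda_env_json):
    """Check the JSON printed by `conda env list --json` for an environment name."""
    env_paths = json.loads(conda_env_json).get("envs", [])
    names = [p.replace("\\", "/").rstrip("/").split("/")[-1] for p in env_paths]
    return env_name in names


def plan_commands(env_name, have_env):
    """Return the commands to run, in order, as argument lists."""
    commands = []
    if not have_env:
        # Create the environment only when it is missing.
        commands.append(["conda", "env", "create", "--name", env_name, "--file", ENV_FILE])
    # `--cores all` lets Snakemake use every available CPU core.
    commands.append(["conda", "run", "--name", env_name, "snakemake", "--cores", "all"])
    return commands


def run_pipeline():
    """Execute the plan; this sketch stops if Conda is missing instead of installing it."""
    if not conda_available():
        raise RuntimeError("conda not found on PATH")
    listing = subprocess.run(
        ["conda", "env", "list", "--json"],
        check=True, capture_output=True, text=True,
    ).stdout
    for command in plan_commands(ENV_NAME, env_exists(ENV_NAME, listing)):
        subprocess.run(command, check=True)
```

Separating `plan_commands` from `run_pipeline` keeps the decision logic inspectable without touching the system: calling `plan_commands("srna-seq", False)` shows the two commands that would run on a machine where the environment does not exist yet.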