Snakemake workflow: MPRACaptureFlow

Workflow for processing and analyzing Capture-C MPRA data, following the methodology described in the preprint "Capture-C MPRA: A high-throughput method to simultaneously characterize promoter interactions and regulatory activity" by Arnould, Keukeleire, et al. (2025), https://doi.org/10.1101/2025.06.11.658967.

Developers

Pia Keukeleire (@pi-zz-a), Institute of Human Genetics, UKSH / University of Lübeck

Usage

If you use this workflow in a paper, please credit the authors by citing the URL of this (original) repository as well as the preprint (see above).

Cloning this repository and the required tools (described in Step 2) takes about a minute on a normal machine. Snakemake installs all required packages into conda environments automatically, which can take anywhere from a few minutes to an hour.

Running the workflow on the provided sample data should take around 5 minutes on a normal machine.

Step 1: Obtain a copy of this workflow

Create a new GitHub repository using this workflow as a template.
Clone the newly created repository to your local system, into the place where you want to perform the data analysis.

Step 2: Download required tools

This workflow relies on CHiCAGO (Freire-Pritchett et al., 2021) for identifying significant cHi-C loops, on MPRAsnakeflow for creating count tables from the MPRA sequencing data and on BCalm (Keukeleire et al., 2025) for quantifying MPRA activity.

Clone chicagoTools from https://github.com/dovetail-genomics/chicago/ and link to your local chicagoTools directory in your configuration file.
Clone MPRAsnakeflow from https://github.com/kircherlab/MPRAsnakeflow and link to the MPRAsnakeflow directory in your configuration file.
Install BCalm according to the instructions in https://github.com/kircherlab/BCalm.
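
The two clone steps above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the tools/ target directory and the function names clone_if_missing and fetch_tools are hypothetical, and the location of chicagoTools as a top-level directory inside the cloned chicago repository is assumed rather than confirmed here. The names of the configuration keys that take these paths are not shown, so check config/config.yaml for them.

```shell
# Hypothetical parent directory for the external tools; any location works.
TOOLS_DIR="${TOOLS_DIR:-$PWD/tools}"

# clone_if_missing URL DEST: clone only when DEST does not exist yet,
# so the helper can be re-run safely.
clone_if_missing() {
    if [ -d "$2" ]; then
        echo "already present: $2"
    else
        git clone "$1" "$2"
    fi
}

# fetch_tools: clone both repositories (URLs as listed above) and print
# the paths to copy into the configuration file.
fetch_tools() {
    mkdir -p "$TOOLS_DIR"
    clone_if_missing https://github.com/dovetail-genomics/chicago "$TOOLS_DIR/chicago"
    clone_if_missing https://github.com/kircherlab/MPRAsnakeflow "$TOOLS_DIR/MPRAsnakeflow"
    # Assumption: chicagoTools is a top-level directory of the chicago repository.
    echo "chicagoTools:  $TOOLS_DIR/chicago/chicagoTools"
    echo "MPRAsnakeflow: $TOOLS_DIR/MPRAsnakeflow"
}
```

Calling fetch_tools once performs both clones; calling it again only reports the existing directories and prints the paths.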

Step 3: Configure workflow

Configure the workflow to your needs by editing the files in the config/ folder. Adjust config.yaml to configure the workflow execution and samples.tsv to specify your sample setup. To run the workflow on the small example dataset (data/small_test.fastq.gz), use config/example_config.yaml and config/example_samples.tsv.

Step 4: Install Snakemake

Install Snakemake version >= 7.15.2 using conda:

conda create -c bioconda -c conda-forge -n snakemake snakemake

For installation details, see the instructions in the Snakemake documentation.

Step 5: Execute workflow

Activate the conda environment:

conda activate snakemake
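
With the environment active, the minimum version from Step 4 (7.15.2) can be checked before running anything. The following is a minimal sketch: the helper name version_ge is hypothetical, and it assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
# Minimum version stated in Step 4.
REQUIRED="7.15.2"

# version_ge A B: succeeds if version A >= version B.
# Sorts both versions ascending; A >= B exactly when B comes first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v snakemake >/dev/null 2>&1; then
    INSTALLED="$(snakemake --version)"
    if version_ge "$INSTALLED" "$REQUIRED"; then
        echo "Snakemake $INSTALLED is recent enough."
    else
        echo "Snakemake $INSTALLED is older than $REQUIRED; please update."
    fi
else
    echo "snakemake not found; activate the conda environment first."
fi
```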

Test your configuration by performing a dry run:

snakemake --use-conda --configfile config/example_config.yaml -n

The workflow needs to be run twice: first to generate the input files for MPRAsnakeflow and then, after MPRAsnakeflow has been run, a second time to produce the BCalm quantification output. For the first run, comment out the third rule in the Snakefile. For the second run, re-enable it.
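
The order of the two runs can be sketched as follows. The wrapper name run_pass and the core count are assumptions for illustration; the config path is the example configuration from Step 3. The MPRAsnakeflow invocation itself is not shown because it depends on the MPRAsnakeflow version you cloned.

```shell
CONFIG="config/example_config.yaml"
CORES=4  # assumption: adjust to your machine

# run_pass: one invocation of this workflow; extra arguments are
# passed through to snakemake (for example -n for a dry run).
run_pass() {
    snakemake --use-conda --configfile "$CONFIG" --cores "$CORES" "$@"
}

# Intended order (edit the Snakefile between the two passes):
#   run_pass -n   # dry run to check the configuration
#   run_pass      # pass 1: third rule commented out, creates MPRAsnakeflow inputs
#   ...run MPRAsnakeflow on those inputs (see its repository)...
#   run_pass      # pass 2: third rule re-enabled, produces the BCalm output
```

Keeping both passes behind one function makes it harder to accidentally run them with different configuration files.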

For the example data, I included an example output count matrix, which you can find in data/mprasnakeflow_example_counts.tsv.gz.

To run MPRAsnakeflow, you can use the following configuration (for more detailed instructions, see the MPRAsnakeflow repository):
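
To get a first look at the example count matrix without unpacking it, something like the following may help. The helper names peek_counts and count_rows are hypothetical, and no particular column layout is assumed; the sketch only prints the first lines and the line count.

```shell
COUNTS="data/mprasnakeflow_example_counts.tsv.gz"

# peek_counts FILE [N]: print the first N lines (default 5) of a gzipped table.
peek_counts() {
    gzip -cd "$1" | head -n "${2:-5}"
}

# count_rows FILE: number of lines in the table, including any header line.
count_rows() {
    gzip -cd "$1" | wc -l
}

# Only inspect the file when run from the repository root, where it exists.
if [ -f "$COUNTS" ]; then
    peek_counts "$COUNTS"
    count_rows "$COUNTS"
fi
```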

---
experiments:
  example:
    bc_length: 15
    umi_length: 16
    data_folder: data/ # folder containing your MPRA sequencing files
    experiment_file: # file describing the sequencing files
    demultiplex: false
    assignments:
      fromFile:
        type: file
        assignment_file: mpra_capture_flow/results/example_project/mprasf/assignment_barcodes.sorted.tsv.gz
    design_file: ../../mpra_capture_flow/results/example_project/mprasf/mprasnakeflow_design.fa
    configs:
      minimal:
        filter:
          bc_threshold: 1
          min_dna_counts: 1
          min_rna_counts: 1

For further analysis and for reproducing the manuscript figures, see the repository containing my analysis notebooks: https://github.com/kircherlab/CMPRA_figures.

See the Snakemake documentation for further details.
