PSC-scDRS links single-variant association scores to polygenic disease risk at single-cell resolution, using single-cell RNA sequencing (scRNA-seq) data. It identifies cells with excess expression of disease-associated genes detected by whole-exome sequencing (WES).
The Quick Start section explains how to install the requirements and run the sample analysis. For details on the individual functions, see the Flowcharts section, which also describes the data types and formats that PSC-scDRS expects as input.
PSC-scDRS requires significant computational resources. Ensure your system meets the following minimum requirements.
System requirements for the sample dataset:
- CPU: ≥ 16 cores
- Memory: ≥ 32 GB RAM (scDRS may require up to 360 GB, depending on dataset size)
- Storage: ≥ 50 GB free disk space
- Network: a stable, high-speed internet connection for downloading approximately 22 GB of data
** The pipeline was tested with the sample files on a Dell XPS 15 9530 workstation running Ubuntu 22.04.5 LTS, equipped with 32 GB of RAM and a 13th-generation Intel® Core™ i9-13900H processor (20 logical cores).
Note: For large datasets, run the pipeline on a high-performance computing (HPC) system, because the memory that scDRS needs grows with the size of the input files.
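The minimum requirements listed above can be checked from a terminal before installing anything. The following is a minimal sketch, not part of PSC-scDRS: it assumes a Linux system that exposes `/proc/meminfo`, and `check_min` is a hypothetical helper introduced only for this example. Memory is rounded up to the next whole GB, because the kernel reports slightly less than the nominal installed RAM.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare a measured value against a required minimum.
# Prints a status line; returns 0 if the minimum is met, 1 otherwise.
check_min() {
    local label="$1" actual="$2" required="$3"
    if [ "$actual" -ge "$required" ]; then
        echo "OK    $label: $actual (minimum $required)"
    else
        echo "WARN  $label: $actual (minimum $required)"
        return 1
    fi
}

# Logical CPU count as seen by the current process.
cores="$(nproc)"

# Total RAM in GB, rounded up (MemTotal is reported in KiB).
mem_gb="$(awk '/^MemTotal:/ {gb = $2 / 1024 / 1024; print (gb == int(gb)) ? gb : int(gb) + 1}' /proc/meminfo)"

# Free space in GB on the filesystem that holds the current directory.
disk_gb="$(df -Pk . | awk 'NR == 2 {print int($4 / 1024 / 1024)}')"

# Thresholds are taken from the requirements list for the sample dataset.
check_min "CPU cores" "$cores" 16 || true
check_min "Memory (GB)" "$mem_gb" 32 || true
check_min "Free disk (GB)" "$disk_gb" 50 || true
```

Run it from the directory where you plan to clone the repository, so that the disk check refers to the right filesystem. A `WARN` line does not stop anything; it only flags a requirement that the current machine does not meet.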
sudo apt update
sudo apt install -y \
git \
build-essential \
wget curl unzip \
python3.12 python3.12-venv
All code and files needed for the sample run are downloaded in this step.
cd ~
git clone https://github.com/ikmb/PSC-scDRS.git
cd PSC-scDRS
mkdir -p data
wget -O "data/HumanLiverHealthyscRNAseqData.zip" \
"https://github.com/seirana/PSC-scDRS/raw/main/data/HumanLiverHealthyscRNAseqData.zip"
PSC-scDRS needs some extra software to run:
The pipeline installs each required tool automatically. If an installation fails, check that tool's own page.
cd ~
REPO_DIR="$(find . -maxdepth 5 -type f -name setup_dependencies.sh -path '*/PSC-scDRS/*' -print -quit | xargs -r dirname)"
echo "$REPO_DIR"
cd "$REPO_DIR"
bash ./setup_dependencies.sh
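If the repository was cloned somewhere other than the home directory, or deeper than five levels below it, the lookup above returns an empty `REPO_DIR` and the later commands fail with a less obvious error. The sketch below wraps the same `find` call in a function that reports the problem explicitly. `find_repo_dir` is a hypothetical name used only for this example, and `-print -quit` assumes GNU `find`, as shipped with Ubuntu.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the directory that contains setup_dependencies.sh
# inside a PSC-scDRS checkout below the given search root.
# Returns 1 (with a message on stderr) if no checkout is found.
find_repo_dir() {
    local root="${1:-${HOME:-.}}" script
    script="$(find "$root" -maxdepth 5 -type f -name setup_dependencies.sh \
        -path '*/PSC-scDRS/*' -print -quit 2>/dev/null)"
    if [ -z "$script" ]; then
        echo "PSC-scDRS checkout not found under $root" >&2
        return 1
    fi
    dirname "$script"
}

# Only proceed when the lookup succeeded.
if REPO_DIR="$(find_repo_dir "${HOME:-.}")"; then
    echo "Found repository at $REPO_DIR"
    # Uncomment to run the setup step from the located checkout:
    # cd "$REPO_DIR" && bash ./setup_dependencies.sh
else
    echo "Clone the repository first, then re-run this step."
fi
```

Pass a different search root as the first argument, for example `find_repo_dir /data/projects`, if the clone lives outside the home directory.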
Summary statistics for the SAIGE single-marker test on PSC whole-exome sequencing data are available in the sampleWES.zip file.
The sample single-cell RNA sequencing dataset comes from the healthy human liver study by Andrews, T.S. et al. (PMID: 38199298). It has been preprocessed as required by the scDRS method (PMID: 36050550) and is provided in the HumanLiverHealthyscRNAseqData.h5ad file.
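Because the sample inputs are distributed as zip archives and the download is large, an interrupted transfer can leave a truncated file behind. The following sketch tests an archive with `unzip -t` before the pipeline is started. `verify_zip` is a hypothetical helper, and the two paths in the loop are assumptions about where the archives sit relative to the repository root; adjust them to your checkout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if the file exists, is non-empty, and passes
# unzip's integrity test; print a diagnostic on stderr and return 1 otherwise.
verify_zip() {
    local archive="$1"
    if [ ! -s "$archive" ]; then
        echo "Missing or empty: $archive" >&2
        return 1
    fi
    if ! unzip -tq "$archive" >/dev/null 2>&1; then
        echo "Corrupt or incomplete archive: $archive" >&2
        return 1
    fi
    echo "OK: $archive"
}

# Assumed locations (not confirmed by this README); run from the repository root.
for f in data/HumanLiverHealthyscRNAseqData.zip data/sampleWES.zip; do
    verify_zip "$f" || true
done
```

If an archive is reported as corrupt, downloading it again is usually the simplest fix.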
The following commands run the pipeline on the sample data:
cd ~
REPO_DIR="$(find . -maxdepth 5 -type f -name setup_dependencies.sh -path '*/PSC-scDRS/*' -print -quit | xargs -r dirname)"
echo "$REPO_DIR"
cd "$REPO_DIR"
bash ./PSC_scDRS_run.sh
** The final results are stored as "./PSC-scDRS/output/PSC cell association with Liver.csv"
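The result file name contains spaces, so it has to be quoted on the command line. As a quick check that the run produced output, the sketch below prints the line count and the first few rows of the file. `preview_csv` is a hypothetical helper, and the relative path assumes the command is run from the repository root; the columns of the file are not described here, so the example makes no assumption about them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the line count and the first N lines (default 5)
# of a CSV file. Returns 1 if the file does not exist.
preview_csv() {
    local file="$1" n="${2:-5}"
    if [ ! -f "$file" ]; then
        echo "Result file not found: $file" >&2
        return 1
    fi
    echo "Lines: $(wc -l < "$file" | tr -d ' ')"
    head -n "$n" "$file"
}

# The path must stay quoted because the file name contains spaces.
RESULT="output/PSC cell association with Liver.csv"
preview_csv "$RESULT" || true
```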
