Hariharan-M-2/Neurodevelopmental-Disorder-RNAseq-Analysis

RNA-seq Analysis of NAA15 Mutation Associated with Neurodevelopmental Disorders 🧬



🔖 Overview

This repository contains the downstream RNA-seq analysis pipeline for transcriptomic data from heterozygous and homozygous NAA15 mutant samples. The analysis covers differential expression, functional enrichment (GO/KEGG), and targeted screening for overlap with curated neurodevelopmental disorder (NDD) gene sets, with the aim of placing the observed transcriptional changes in the context of human NDDs.

  • 🔬 Differential Expression: DESeq2 with apeglm log fold change shrinkage identifies significant DEGs in the Homo vs WT and Hetero vs WT contrasts, visualized with volcano plots, MA plots, and heatmaps.
  • 📊 Exploratory Data Analysis: variance stabilizing transformation (VST) followed by PCA to assess sample quality and genotype-driven variation.
  • 🧩 Functional Enrichment: over-representation analysis of GO Biological Process terms and KEGG pathways using clusterProfiler.
  • 🧠 NDD Gene Contextualization: systematic screening of DEGs against a curated NDD gene list, with the overlap tested by Fisher's exact test and by GSEA (fgsea).
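The overlap test behind the NDD screening step can be illustrated with a minimal base R sketch. It assumes a one-sided Fisher's exact test on a 2x2 table of DEG status against NDD membership. The gene universe, the DEG list, and the NDD set below are toy values invented for illustration, and the repository's own script may build its table differently.

```r
# Toy gene universe and gene sets (all hypothetical, for illustration only)
universe <- paste0("gene", 1:1000)        # all genes tested for differential expression
degs     <- universe[1:100]               # genes called differentially expressed
ndd_set  <- universe[c(1:20, 501:560)]    # curated NDD genes found in the universe

# Flag each gene in the universe by DEG status and NDD membership
in_deg <- universe %in% degs
in_ndd <- universe %in% ndd_set

# 2x2 contingency table; TRUE is listed first so cell [1, 1] is the overlap
tab <- table(DEG = factor(in_deg, levels = c(TRUE, FALSE)),
             NDD = factor(in_ndd, levels = c(TRUE, FALSE)))

# One-sided test: are NDD genes over-represented among the DEGs?
ft <- fisher.test(tab, alternative = "greater")
overlap <- sum(in_deg & in_ndd)

print(tab)
cat("Overlap:", overlap, " p-value:", signif(ft$p.value, 3), "\n")
```

Restricting the NDD list to genes that were actually tested matters here, because genes outside the universe could never appear as DEGs and would otherwise distort the table.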

💠 Getting Started

Prerequisites

  • R ≥ 4.5.0
  • renv for reproducible dependency management

1. Clone the Repository

git clone https://github.com/Hariharan-M-2/Neurodevelopmental-Disorder-RNAseq-Analysis.git
cd Neurodevelopmental-Disorder-RNAseq-Analysis

2. Install Dependencies

All package versions are locked in renv.lock. Restore the environment in R:

if (!requireNamespace("renv", quietly = TRUE)) install.packages("renv")
renv::restore()

3. Run the Analysis Pipeline

Open the project in RStudio (or set the working directory to the project root) and run:

source("analysis/00_run_pipeline.r")

This runs all analysis scripts in sequence, writing results to output/ and plots to plots/.
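For readers curious how a sequential runner of this kind can work, here is a hypothetical sketch in base R. The function name `run_pipeline`, its arguments, and the file name pattern are assumptions made for illustration; the actual contents of analysis/00_run_pipeline.r are not shown in this README and may differ.

```r
# Hypothetical sketch of a sequential runner; the real 00_run_pipeline.r may differ.
run_pipeline <- function(script_dir = "analysis",
                         out_dirs = c("output", "plots")) {
  # Create the output directories if they do not exist yet
  for (d in out_dirs) dir.create(d, showWarnings = FALSE, recursive = TRUE)

  # Collect numbered scripts (01_..., 02_..., ...) in sorted order
  scripts <- sort(list.files(script_dir,
                             pattern = "^[0-9]{2}_.*\\.[rR]$",
                             full.names = TRUE))

  # Skip the 00_ runner itself so it does not source itself recursively
  scripts <- scripts[!grepl("^00_", basename(scripts))]

  for (s in scripts) {
    message("Running ", basename(s))
    source(s, local = FALSE)  # evaluate in the global environment
  }
  invisible(scripts)
}

# Usage, from the project root:
# run_pipeline()
```

Sourcing into the global environment lets later scripts reuse objects created by earlier ones, at the cost of shared state between steps.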


📂 Repository Structure

Neurodevelopmental-Disorder-RNAseq-Analysis/
│
├── analysis/
│   ├── 00_run_pipeline.r                      # Runs the full workflow
│   ├── 01_gene_annotation.r                   # Converts Ensembl IDs to gene symbols
│   ├── 02_deseq_dataset_creation.r            # DESeq2 object creation & filtering
│   ├── 03_exploratory_data_analysis.r         # Perform VST and PCA
│   ├── 04_differential_expression_analysis.r  # Run DESeq2 for DE analysis
│   ├── 05_functional_enrichment_analysis.r    # GO & KEGG enrichment analysis
│   └── 06_ndd_gene_presence_analysis.r        # NDD gene screening & GSEA
│
├── data/                                      # Input data (counts, metadata, NDD list)
├── docs/                                      # Biological interpretation & figures
├── case_studies/                              # Extended NDD case study analysis
├── renv.lock                                  # Locked dependency versions
└── README.md

📃 Documentation

| Document | Description |
| --- | --- |
| docs/experimental-design.md | Study objective, sample design (13 SRA samples across 5 genotypes), and detailed methodology for each analysis step |
| docs/biological-interpretation.md | Combined results and discussion with figures, GO/KEGG enrichment tables, and NDD gene contextualization |
| case_studies/genotype_ndd_analysis.md | In-depth case study on NAA15 biology: NatA complex function, genotype–phenotype correlations, and molecular mechanisms linking NAA15 mutations to NDD |


📊 Analysis Pipeline

  Raw Count Matrix + Metadata
        ↓
  Gene Annotation
        ↓
  DESeq2 Dataset Creation
        ↓
  Exploratory Data Analysis
        ↓
  Differential Expression (Homo vs WT, Hetero vs WT)
        ↓
  GO & KEGG Functional Enrichment
        ↓
  NDD Gene Screening & GSEA
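The Exploratory Data Analysis step in the flow above can be sketched in base R on simulated counts. This is only an approximation under stated assumptions: `log2(x + 1)` stands in for DESeq2's VST, and the count matrix, sample names, and genotype effect are made up for the example and are not the study data.

```r
set.seed(1)  # reproducible toy data

# Toy count matrix: 200 genes x 6 samples (hypothetical, not the study data)
genotype <- factor(rep(c("WT", "Homo"), each = 3))
counts <- matrix(rnbinom(200 * 6, mu = 100, size = 10), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))

# Spike a genotype effect into the first 40 genes of the Homo samples
counts[1:40, genotype == "Homo"] <- counts[1:40, genotype == "Homo"] * 8

# log2(x + 1) is a simple stand-in here for DESeq2's VST
log_counts <- log2(counts + 1)

# PCA on samples: transpose so rows are samples and columns are genes
pca <- prcomp(t(log_counts), center = TRUE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# PC1 score per sample, labelled by genotype, to check group separation
pc1 <- setNames(pca$x[, "PC1"], as.character(genotype))
print(round(pc1, 2))
cat("Variance explained by PC1:", round(100 * var_explained[1], 1), "%\n")
```

If genotype drives most of the variation, samples of the same genotype cluster together along PC1; a sample that sits with the wrong group is a candidate for a quality check.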

✍🏻 Behind The Code

🧑🏻‍💻 Hariharan M - This work was completed during an internship at the Computational and Genomics Lab, CRIC — CDFD, Hyderabad under the supervision of Dr. Akaash Ranjan.



The goal is to turn data into information, and information into insight. 🦋

About

Investigating the transcriptomic architecture and functional landscape of NAA15-associated Neurodevelopmental Disorders (NDD) through comparative gene expression profiling.
