This repository provides an R toolkit for microbiome data analysis built on the phyloseq package. It covers alpha and beta diversity, taxonomic analysis, statistical testing, and publication-ready visualization, including volcano plots and bar plots. The sections below describe each function module with a usage example.
install.packages(c("BiocManager", "ggplot2", "dplyr", "ggtext", "ggprism", "ggrepel", "patchwork", "broom", "vegan", "glue", "ggpubr", "tidyr", "purrr", "rvg", "officer", "combinat", "RColorBrewer", "GGally"))
BiocManager::install("phyloseq")  # phyloseq is distributed through Bioconductor, not CRAN
source("./250429_microbiome_analysis_functions.R")
- Calculates alpha-diversity metrics: Observed, Shannon, Chao1, Inverse Simpson.
- Optional: Faith's PD (requires tree).
alpha_df <- create_alpha_diversity_df(physeq_obj, tree = TRUE)
- Plots violin + boxplot + significance tests (Kruskal, Wilcoxon).
alpha_plot(alpha_df, x_ = "Group", y_ = "Shannon")
- Outputs CI & p-values for each index using Kruskal-Wallis and Wilcoxon.
res <- alpha_df(df = alpha_df, group = "Group")
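For readers who want to see what these indices and tests amount to, here is a minimal base-R sketch. It does not reproduce the toolkit's own implementation: the toy counts, the helper `alpha_indices()`, and the `toy_alpha` table are hypothetical, and Chao1 and Faith's PD are left out.

```r
# Toy count vectors, three samples per group (hypothetical data).
counts <- list(
  A1 = c(10, 10, 10, 10), A2 = c(12, 8, 10, 10), A3 = c(9, 11, 10, 10),
  B1 = c(37, 1, 1, 1),    B2 = c(35, 3, 1, 1),   B3 = c(30, 5, 4, 1)
)
group <- c(A1 = "A", A2 = "A", A3 = "A", B1 = "B", B2 = "B", B3 = "B")

# Hypothetical helper: alpha-diversity indices from a vector of taxon counts.
alpha_indices <- function(x) {
  p <- x[x > 0] / sum(x)               # relative abundance of observed taxa
  c(Observed   = length(p),            # number of taxa with non-zero counts
    Shannon    = -sum(p * log(p)),     # Shannon index (natural log)
    InvSimpson = 1 / sum(p^2))         # inverse Simpson index
}

# One row per sample, one column per index.
toy_alpha <- as.data.frame(t(sapply(counts, alpha_indices)))
toy_alpha$Group <- group[rownames(toy_alpha)]

# Two groups: Wilcoxon rank-sum test. kruskal.test() also covers 3+ groups.
wt <- wilcox.test(Shannon ~ Group, data = toy_alpha)
kt <- kruskal.test(Shannon ~ Group, data = toy_alpha)

print(toy_alpha)
print(c(wilcoxon_p = wt$p.value, kruskal_p = kt$p.value))
```

With only three samples per group the tests have little power; the example is meant to show the mechanics, not a realistic result.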
- Computes PERMANOVA, ANOSIM, PERMDISP.
- Visualizes PCoA with ellipse.
beta_out <- beta_plot(phyloseq, type = "Group", type_col = c("red", "blue"))
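As a rough illustration of the ordination step, the sketch below computes Bray-Curtis dissimilarities by hand and runs PCoA with base R's `cmdscale()`. Bray-Curtis is an assumption here, since this README does not say which distance `beta_plot()` uses. The toy matrix and the helper `bray_curtis()` are hypothetical. In vegan, the three tests listed above are usually run with `adonis2()`, `anosim()`, and `betadisper()`.

```r
# Toy abundance matrix: rows are samples, columns are taxa (hypothetical data).
abund <- rbind(
  S1 = c(10, 0, 5),
  S2 = c(8, 2, 6),
  S3 = c(0, 10, 3),
  S4 = c(1, 9, 8)
)

# Hypothetical helper: Bray-Curtis dissimilarity, sum|x - y| / sum(x + y).
bray_curtis <- function(m) {
  n <- nrow(m)
  d <- matrix(0, n, n, dimnames = list(rownames(m), rownames(m)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- sum(abs(m[i, ] - m[j, ])) / sum(m[i, ] + m[j, ])
    }
  }
  as.dist(d)
}

bc <- bray_curtis(abund)

# Classical multidimensional scaling on a distance matrix is PCoA.
pcoa   <- cmdscale(bc, k = 2, eig = TRUE)
coords <- pcoa$points   # sample coordinates on the leading axes

print(round(as.matrix(bc), 3))
print(round(coords, 3))
```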
- Extension of beta_plot() with vegan::envfit() support and global PERMANOVA.
beta_env <- beta_plot.envfit(phyloseq, type = "Group", envfit = TRUE, formula = "Age + BMI", PERMANOVA = TRUE)
- Computes mean, sd, se, CI per taxon and group.
Abund_cal(ps.glom, tax_level = "Genus", group = "Group", path = "./results/")
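The per-group summary statistics can be sketched in base R as follows. The README does not say how `Abund_cal()` builds its confidence interval, so the t-based 95% interval is an assumption. The toy table and the helper `summarise_abund()` are hypothetical.

```r
# Toy table for a single taxon: one row per sample (hypothetical data).
toy <- data.frame(
  Group     = rep(c("A", "B"), each = 4),
  Abundance = c(0.10, 0.12, 0.08, 0.10, 0.30, 0.34, 0.26, 0.30)
)

# Hypothetical helper: mean, SD, SE and a t-based confidence interval.
summarise_abund <- function(x, level = 0.95) {
  n    <- length(x)
  se   <- sd(x) / sqrt(n)                          # standard error of the mean
  half <- qt(1 - (1 - level) / 2, df = n - 1) * se # CI half-width (assumed t-based)
  c(n = n, mean = mean(x), sd = sd(x), se = se,
    ci_low = mean(x) - half, ci_high = mean(x) + half)
}

# Apply the helper within each group and stack the results row-wise.
abund_stats <- do.call(
  rbind,
  lapply(split(toy$Abundance, toy$Group), summarise_abund)
)
print(round(abund_stats, 4))
```

For several taxa, the same helper could be applied to each taxon-by-group subset in turn.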
- Visualizes top taxa at any level or within a genus.
- Automatically color-coded by phylum.
tax_plot_out <- taxa_plot(melt, taxa = "Genus", tax_otu = otu_list, x_axis = "SampleID")
- Generates comparative plots: bar, log2FC, p-value, heatmap.
- Returns a patchwork plot and statistics.
res <- mean_plot(melt, tax_level = "Genus", tax_otu = top_genera, type = "Group", palet = c("#E31A1C", "#1F78B4"))
- Computes Wilcoxon p-values, log2FC, and highlights significant taxa.
vol_out <- volcano_plot(ph_glom, tax_rank = "Species", comparision = "Group", compar1 = "A", compar2 = "B")
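The statistics behind a volcano plot can be sketched in base R as below. Several choices are assumptions and not documented behavior of `volcano_plot()`: the fold-change direction (second group over first), the pseudocount, the Benjamini-Hochberg adjustment, and the significance cut-offs. The toy data and the helper `volcano_stats()` are hypothetical.

```r
# Toy relative abundances: rows are taxa, columns are samples (hypothetical data).
abund <- rbind(
  Taxon1 = c(0.10, 0.12, 0.11, 0.13, 0.09, 0.40, 0.44, 0.42, 0.46, 0.48),
  Taxon2 = c(0.20, 0.22, 0.21, 0.19, 0.23, 0.21, 0.20, 0.22, 0.19, 0.23)
)
group <- rep(c("A", "B"), each = 5)

pseudo <- 1e-6  # assumed pseudocount so that log2(0) never occurs

# Hypothetical helper: log2 fold change (b over a) and Wilcoxon p-value.
volcano_stats <- function(x, g, a, b) {
  xa <- x[g == a]
  xb <- x[g == b]
  c(log2FC = log2((mean(xb) + pseudo) / (mean(xa) + pseudo)),
    p      = wilcox.test(xa, xb, exact = FALSE)$p.value)
}

# One row per taxon.
res <- as.data.frame(
  t(apply(abund, 1, volcano_stats, g = group, a = "A", b = "B"))
)
res$p_adj       <- p.adjust(res$p, method = "BH")           # assumed correction
res$significant <- res$p_adj < 0.05 & abs(res$log2FC) > 1   # assumed cut-offs

print(res)
```

Plotting `log2FC` against `-log10(p_adj)` and coloring by `significant` gives the usual volcano layout.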
- Extracts Genus and Species names from the BLAST sscinames output.
blast_table <- blast_arrange(blast_output, ASV = "Query")
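A minimal sketch of splitting `sscinames` into genus and species, assuming the field holds space-separated scientific names with the genus first. The toy table and its values are hypothetical, and `blast_arrange()` may handle edge cases differently.

```r
# Toy BLAST hits; the sscinames values are hypothetical examples.
blast_toy <- data.frame(
  Query     = c("ASV1", "ASV2", "ASV3"),
  sscinames = c("Bacteroides fragilis", "Escherichia coli K-12", "Lactobacillus"),
  stringsAsFactors = FALSE
)

# Split each name on spaces: first word is the genus,
# first two words together form the species binomial.
words <- strsplit(blast_toy$sscinames, " ", fixed = TRUE)
blast_toy$Genus <- vapply(words, function(w) w[1], character(1))
blast_toy$Species <- vapply(
  words,
  function(w) if (length(w) >= 2) paste(w[1], w[2]) else NA_character_,
  character(1)
)

print(blast_toy)
```

Names with only a genus get `NA` for the species, and strain suffixes such as "K-12" are dropped.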
- Filters OTUs based on prevalence and average abundance thresholds.
ps_filtered <- filter_phyloseq_by_prevalence_abundance(ps, 0.001, 0.1)
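The filtering logic can be sketched on a plain count matrix. The README does not say which positional argument is the abundance threshold and which is the prevalence threshold, so the named variables `min_abundance` and `min_prevalence`, their values, and the toy counts are all hypothetical.

```r
# Toy count matrix: rows are taxa, columns are samples (hypothetical data).
counts <- rbind(
  OTU1 = c(500, 400, 300, 600),
  OTU2 = c(0,   0,   1,   0),
  OTU3 = c(100, 0,   200, 50),
  OTU4 = c(400, 600, 499, 350)
)

# Convert counts to per-sample relative abundances.
rel <- sweep(counts, 2, colSums(counts), "/")

min_abundance  <- 0.001  # assumed: minimum mean relative abundance
min_prevalence <- 0.5    # assumed: minimum fraction of samples with the taxon

prevalence <- rowMeans(counts > 0)  # fraction of samples where the taxon is present
mean_abund <- rowMeans(rel)         # mean relative abundance across samples

keep     <- prevalence >= min_prevalence & mean_abund >= min_abundance
filtered <- counts[keep, , drop = FALSE]

print(data.frame(prevalence, mean_abund, keep))
```

On a phyloseq object, a logical vector like `keep` (named by taxon) can be passed to `phyloseq::prune_taxa()`.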
- save_ggplot_to_pptx(): Saves ggplot as editable PowerPoint.
- ps_tax_update(): Updates ASV ID based on abundance and reassigns taxonomy table.
- seq_to_fas(): Converts sequence list to FASTA.
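As an illustration of the sequence-to-FASTA conversion, the base-R sketch below writes a named character vector as FASTA records. The input shape (names as record IDs), the toy sequences, and the output path are assumptions. `seq_to_fas()` may expect a different input.

```r
# Toy sequences: names are used as FASTA record IDs (hypothetical data).
seqs <- c(ASV1 = "ACGTACGTAC", ASV2 = "TTGACCGTAA")

# Interleave header lines (">ID") with sequence lines.
fasta_lines <- as.vector(rbind(paste0(">", names(seqs)), unname(seqs)))

# Write to a temporary file; replace with a real path as needed.
out_file <- tempfile(fileext = ".fasta")
writeLines(fasta_lines, out_file)

cat(readLines(out_file), sep = "\n")
```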
This script was developed by Soyeon Kim for microbiome analysis pipelines based on the phyloseq R package.
It is part of a broader effort to support microbial ecology, statistical testing, and reproducible bioinformatics workflows.
📧 Contact: kim.soyeon.bio@gmail.com
🔗 Related blog post: https://bio-kcs.tistory.com/
This repository is licensed under the MIT License.
You are free to use, modify, and distribute the code with attribution.