Because most of the code I've written remains proprietary, I created this page to showcase my experience indirectly via the output data visualizations.
Presented here: Data visualizations from a flow cytometry analysis pipeline written in R
Additional demonstration: Cut-and-Run Seq Pipeline
Technical Overview:
- Investigation of chimeric antigen receptor NK cells (i.e. CAR-NK cells)
- Purpose: analyze the efficiency of CAR-NK cell creation
- Actual R code is proprietary (sorry)
| Library | Function |
|---|---|
| flowCore | Creates the flowSet data type |
| flowAI | Automated QC |
| flowAssist | FlowFrame converter |
| flowWorkspace | gatingSets, cytoSets |
| ggcyto | Autoplot function |
| ggplot2 | Data visualization |
| openCyto | Automated clustering |
| ggpointdensity | Graphing |
| uwot | UMAP dimensionality reduction |
| cytofkit2 | Clustering |
| ggridges | Fluorescence density plots |
| flowStats | Fluorescence normalization (gaussNorm) |
- QC: Raw fluorescence batch comparison
- QC: Fluorescence normalization (gaussNorm)
Gating:
- Non-debris gating
- Singlet gating
- Live cell gating
- Marker gating (CD45+ > ...)
- With multiple samples (one per row), it becomes increasingly important to normalize fluorescence distributions before gating, so that the same gate boundaries can be applied to every sample
- In this comparison, gaussNorm performed best
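
The idea behind this kind of normalization can be sketched in a few lines. This is not gaussNorm and not the pipeline's R code (which stays proprietary); it is a simplified, plain-Python illustration. It assumes each sample has a single landmark, uses the median as a stand-in for that landmark (gaussNorm aligns high-density peaks), and uses made-up sample names and values.

```python
from statistics import median

def align_to_reference(samples, reference_landmark=None):
    """Shift each sample so its landmark (here: the median) sits at a common reference.

    `samples` maps a sample name to a list of fluorescence values for one channel.
    Using the median as the landmark is a simplifying assumption for this sketch.
    """
    landmarks = {name: median(values) for name, values in samples.items()}
    if reference_landmark is None:
        # Default reference: the median of the per-sample landmarks
        reference_landmark = median(landmarks.values())
    return {
        name: [v - landmarks[name] + reference_landmark for v in values]
        for name, values in samples.items()
    }

# Hypothetical values: batch_b has the same shape as batch_a but is offset by 1.0
samples = {
    "batch_a": [1.0, 2.0, 3.0],
    "batch_b": [2.0, 3.0, 4.0],
}
normalized = align_to_reference(samples)
```

After alignment, both hypothetical batches share the same landmark, so one gate threshold means the same thing in each of them.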

- Standard removal of non-cell datapoints
- In this example, I kept the smaller events that would normally be gated out as debris, because later analysis indicated they were actual cells.

- CD45+ gating for lymphocytes is a typical next step after singlet gating
- Shown: an example two-color gating step with gates for multiple quadrants
- This is an example of multiple gates applied to the same set of cells
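
A quadrant gate amounts to two thresholds evaluated on the same parent population. The sketch below is a plain-Python illustration under that assumption, not the pipeline's R code; the cutoffs, the axis labels `x` and `y`, and the event values are all hypothetical.

```python
from collections import Counter

def quadrant(x, y, x_cut, y_cut):
    """Label one event by which side of each threshold it falls on."""
    x_sign = "+" if x >= x_cut else "-"
    y_sign = "+" if y >= y_cut else "-"
    return f"x{x_sign}y{y_sign}"

def quadrant_counts(events, x_cut, y_cut):
    """Count how many events of the same parent population land in each quadrant."""
    return Counter(quadrant(x, y, x_cut, y_cut) for x, y in events)

# Hypothetical two-color events (already transformed and scaled to 0..1)
events = [(0.2, 0.1), (0.9, 0.2), (0.8, 0.7), (0.1, 0.9), (0.95, 0.85)]
counts = quadrant_counts(events, x_cut=0.5, y_cut=0.5)
```

Because every event receives exactly one label, the four quadrant counts always sum to the size of the parent population, which is a quick sanity check on a gating step like this.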

- I tried dimensionality reduction with two different algorithms, UMAP and t-SNE, to see which performed better in terms of computation time and separation of cell clusters (each run and timed separately). UMAP performed better than t-SNE, consistent with published reports.

- To help identify and verify groups of cells, I colored the UMAP projections according to the level of various cell surface markers.

- Clustering was performed using cytofkit2
- Clustered heat maps can help show the similarities and differences between clusters
- This plot is labeled by cell surface marker, though those labels would normally be replaced by the corresponding gene names
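
One common way to build the matrix behind a heat map like this is to summarize each cluster by its median expression per marker. The sketch below illustrates that in plain Python; it is an assumption about the general approach, not a description of cytofkit2 internals or of the pipeline's R code, and the marker names, values, and cluster IDs are hypothetical.

```python
from collections import defaultdict
from statistics import median

def cluster_medians(events, cluster_ids, markers):
    """Return {cluster_id: {marker: median expression}} for the given events.

    `events` is a list of dicts (one per cell); `cluster_ids` gives each cell's cluster.
    """
    grouped = defaultdict(list)
    for event, cid in zip(events, cluster_ids):
        grouped[cid].append(event)
    return {
        cid: {m: median(e[m] for e in members) for m in markers}
        for cid, members in sorted(grouped.items())
    }

# Hypothetical per-cell marker intensities and cluster assignments
events = [
    {"CD45": 5.0, "CD56": 1.0},
    {"CD45": 6.0, "CD56": 2.0},
    {"CD45": 7.0, "CD56": 3.0},
    {"CD45": 1.0, "CD56": 8.0},
    {"CD45": 2.0, "CD56": 9.0},
]
cluster_ids = [0, 0, 0, 1, 1]
matrix = cluster_medians(events, cluster_ids, ["CD45", "CD56"])
```

Each row of `matrix` is one cluster and each column one marker, which is the shape a clustered heat map expects as input.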

- The cells we were examining were immunologically naive. Accordingly, there were few highly differentiated cells, necessitating a "flat" gating scheme.

- One of the general concepts in a flow cytometry analysis is to combine dimensionality reduction, clustering, and the user's experimental design (the chosen cell surface markers) to assign cell identity. Combining the three lets users cross-reference a cell's UMAP location with its cluster identity. When the two match, they provide two independent assessments of the cell grouping, giving assurance that the user-assigned cell identities defined by cell surface markers are accurate.
- One way to extract insights is to look at gene expression in different cell types. For example, one could examine the fraction of T-cells that express CCR5/CXCR3 and CCR3/CCR4 to subset CD4+ T-cells, providing information on the relative abundance of Th1- vs Th2-type cells.
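
The Th1 vs Th2 comparison above reduces to counting gated cells. The sketch below is a plain-Python illustration, not the pipeline's R code. It assumes the cells are already gated as CD4+ T-cells, that each marker has already been called positive or negative, and that "CCR5/CXCR3" means positive for both markers (likewise for CCR3/CCR4). The example cells are made up.

```python
def fraction_positive(cells, markers):
    """Fraction of cells that are positive for every marker in `markers`."""
    if not cells:
        return 0.0
    hits = sum(all(cell[m] for m in markers) for cell in cells)
    return hits / len(cells)

# Hypothetical, already-gated CD4+ T-cells; True means the event fell in the positive gate
cd4_t_cells = [
    {"CCR5": True, "CXCR3": True, "CCR3": False, "CCR4": False},
    {"CCR5": True, "CXCR3": True, "CCR3": False, "CCR4": False},
    {"CCR5": False, "CXCR3": False, "CCR3": True, "CCR4": True},
    {"CCR5": False, "CXCR3": False, "CCR3": False, "CCR4": False},
]

th1_like = fraction_positive(cd4_t_cells, ["CCR5", "CXCR3"])
th2_like = fraction_positive(cd4_t_cells, ["CCR3", "CCR4"])
```

The two fractions need not sum to one, since some cells fall in neither subset; comparing them gives the relative Th1-like vs Th2-like balance for a sample.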



