@@ -16,26 +16,43 @@ vignette: >
 library(echoAI)
 ```

+```{r, echo=FALSE}
+## Several examples query remote Zenodo/GitHub resources.
+## Gate on internet access so R CMD check works offline.
+has_internet <- function() {
+  z <- try(suppressWarnings(
+    readLines("https://github.com", n = 1L, warn = FALSE)
+  ), silent = TRUE)
+  !inherits(z, "try-error")
+}
+run_online <- has_internet()
+```
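
A minimal sketch of how a flag like `run_online` could gate remote examples, using base R only. The `test_url` and `timeout_sec` arguments are illustrative and not part of echoAI; in an R Markdown chunk the same flag could instead be passed to knitr's `eval` chunk option (for example `eval=run_online`).

```r
## Hypothetical variant of the connectivity check with a bounded wait.
## `test_url` and `timeout_sec` are illustrative arguments, not echoAI API.
has_internet <- function(test_url = "https://github.com", timeout_sec = 5) {
  old_opts <- options(timeout = timeout_sec)
  on.exit(options(old_opts), add = TRUE)
  tryCatch({
    suppressWarnings(readLines(test_url, n = 1L, warn = FALSE))
    TRUE
  }, error = function(e) FALSE)
}

run_online <- has_internet()

## Gate any remote work on the flag so offline runs skip it cleanly.
if (run_online) {
  message("Online: remote queries can run.")
} else {
  message("Offline: skipping remote queries.")
}
```

Using `tryCatch()` with an explicit timeout keeps the check from stalling a build for long when the network is unreachable.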
+
 # Introduction

 `echoAI` provides API access to variant-level AI/ML predictions,
-currently centred on
-[IMPACT](https://github.com/immunogenomics/IMPACT)
-(Inference and Modeling of Phenotype-related ACtive Transcription).
+currently centred on three tools:
+
+- **IMPACT** (Inference and Modeling of Phenotype-related ACtive Transcription)
+  -- predicts transcription factor (TF) binding at motif sites by learning
+  epigenomic profiles, primarily from [ENCODE](https://www.encodeproject.org/).
+  The 707 annotations cover a wide range of immune and non-immune cell types,
+  making IMPACT scores especially useful for prioritising causal variants in
+  immune-mediated diseases. All IMPACT data are aligned to **hg19**.
+  Tabix-indexed versions are hosted on Zenodo
+  ([doi:10.5281/zenodo.7062238](https://doi.org/10.5281/zenodo.7062238))
+  for rapid remote querying.

-IMPACT predicts transcription factor (TF) binding at a motif site by
-learning the epigenomic profiles at those sites
-(primarily [ENCODE](https://www.encodeproject.org/)).
-The 707 annotations cover a wide range of immune and non-immune cell types,
-making IMPACT scores especially useful for prioritising causal variants
-in immune-mediated diseases.
+- **SpliceAI** -- predicts the probability that a variant alters mRNA splicing.
+  Results can be obtained via a local VCF/TSV or the Broad Institute API.

-All IMPACT data are aligned to the **hg19** genome build.
-Tabix-indexed versions are hosted on Zenodo
-([doi:10.5281/zenodo.7062238](https://doi.org/10.5281/zenodo.7062238))
-for rapid remote querying.
+- **Deep learning annotations** (Basenji, DeepSEA) -- query precomputed
+  variant-level scores from deep learning models of chromatin accessibility
+  and gene expression.

-# Query IMPACT annotations
+# IMPACT
+
+## Query IMPACT annotations

 The primary entry point is `IMPACT_query()`, which queries tabix-indexed
 IMPACT annotation and LD-score files hosted on Zenodo.
@@ -70,7 +87,7 @@ annot_long <- IMPACT_query(
 head(annot_long)
 ```

-# Annotation key
+## Annotation key

 The annotation key maps each of the 707 IMPACT annotation IDs to its
 source study, tissue, cell type/line, and transcription factor.
@@ -80,7 +97,7 @@ annot_key <- IMPACT_get_annotation_key(save_key = FALSE)
 head(annot_key)
 ```

-# Download full annotation matrices
+## Download full annotation matrices

 For larger analyses (e.g. genome-wide or multi-locus), you can download
 the full per-chromosome annotation matrices directly from the IMPACT
@@ -103,7 +120,7 @@ merged_DT <- echodata::get_Nalls2019_merged()
 ANNOT_MELT <- IMPACT_iterate_get_annotations(merged_DT = merged_DT)
 ```

-# Post-processing
+## Post-processing

 `IMPACT_postprocess_annotations()` converts the annotation table to
 long format, identifies the top consensus SNP per locus, and adds a
@@ -113,7 +130,7 @@ combined cell-type label.
 ANNOT_MELT <- IMPACT_postprocess_annotations(ANNOT_MELT)
 ```

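The wide-to-long step can be illustrated on a toy table. This is a base R sketch, not the implementation of `IMPACT_postprocess_annotations()`; the column names (`SNP`, `Locus`, `annot_1`, `annot_2`, `annotation`, `IMPACT_score`) and all values are invented for illustration.

```r
## Toy wide table: one row per SNP, one column per annotation.
## All names and values here are hypothetical.
wide <- data.frame(
  SNP = c("snp_1", "snp_2", "snp_3"),
  Locus = c("locus_A", "locus_A", "locus_B"),
  annot_1 = c(0.10, 0.80, 0.30),
  annot_2 = c(0.05, 0.60, 0.90),
  stringsAsFactors = FALSE
)
annot_cols <- c("annot_1", "annot_2")

## Long format: one row per SNP-annotation pair.
## unlist() stacks the annotation columns one after another,
## so the identifier columns are repeated once per annotation.
long <- data.frame(
  SNP = rep(wide$SNP, times = length(annot_cols)),
  Locus = rep(wide$Locus, times = length(annot_cols)),
  annotation = rep(annot_cols, each = nrow(wide)),
  IMPACT_score = unlist(wide[annot_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)
long
```

The long layout makes per-locus and per-annotation summaries straightforward, since every score sits in a single column.
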
-# Enrichment analysis
+## Enrichment analysis

 Enrichment is computed as the ratio of IMPACT signal in a given SNP group
 (e.g. consensus, credible set, lead GWAS) to the proportion of SNPs in
@@ -132,9 +149,9 @@ head(enrich)
 ENRICH <- IMPACT_iterate_enrichment(ANNOT_MELT = ANNOT_MELT)
 ```

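As a rough illustration of an enrichment ratio of this kind, the sketch below divides a SNP group's share of the total IMPACT signal by its share of the SNPs. Only part of the definition is visible here, so this reading is an assumption, and the code is not the implementation of `IMPACT_iterate_enrichment()`; the data and the names `impact`, `snp_group`, and `enrichment_ratio` are made up.

```r
## Toy data (hypothetical values): IMPACT score and SNP group for five SNPs.
dat <- data.frame(
  impact = c(0.9, 0.8, 0.1, 0.1, 0.1),
  snp_group = c("consensus", "consensus", "other", "other", "other"),
  stringsAsFactors = FALSE
)

## Assumed definition:
## (group share of IMPACT signal) / (group share of SNPs).
enrichment_ratio <- function(dat, group) {
  in_group <- dat$snp_group == group
  signal_share <- sum(dat$impact[in_group]) / sum(dat$impact)
  snp_share <- sum(in_group) / nrow(dat)
  signal_share / snp_share
}

enrichment_ratio(dat, "consensus")
```

Under this reading, a value above 1 means the group carries more IMPACT signal than its size alone would suggest.
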
-# Visualisation
+## Visualisation

-## SNP group box plot
+### SNP group box plot

 Compare IMPACT score distributions across SNP groups with
 `IMPACT_snp_group_boxplot()`:
@@ -147,15 +164,15 @@ bp <- IMPACT_snp_group_boxplot(
 )
 ```

-## Enrichment plots
+### Enrichment plots

 Visualise enrichment results with bar and violin plots:

 ```{r enrichment-plot, eval=FALSE}
 plots <- IMPACT_plot_enrichment(ENRICH = ENRICH)
 ```

-## Locus plot
+### Locus plot

 Create a multi-panel locus plot showing GWAS results, fine-mapping
 posterior probabilities, and per-tissue IMPACT scores:
@@ -164,14 +181,50 @@ posterior probabilities, and per-tissue IMPACT scores:
 impact_plot <- IMPACT_plot_impact_score(annot_melt = annot_melt)
 ```

-## Heatmap
+### Heatmap

 Generate a ComplexHeatmap of mean IMPACT scores across loci and SNP groups:

 ```{r heatmap, eval=FALSE}
 mat_meta <- IMPACT_heatmap(ANNOT_MELT = ANNOT_MELT)
 ```

+# SpliceAI
+
+## Run SpliceAI
+
+`SPLICEAI_run()` is the main entry point. It dispatches to the
+appropriate backend (API, local TSV, or VCF) depending on your input.
+
+```{r spliceai-run, eval=FALSE}
+query_dat <- echodata::BST1[1:50,]
+res <- SPLICEAI_run(query_dat = query_dat)
+```
+
+## Visualise splice probabilities
+
+```{r spliceai-plot, eval=FALSE}
+plt <- SPLICEAI_plot(query_dat = res)
+```
+
+# Deep learning annotations
+
+## Query deep learning scores
+
+`DEEPLEARNING_query()` retrieves precomputed variant-level scores from
+deep learning models (e.g. Basenji, DeepSEA) via tabix-indexed files.
+
+```{r dl-query, eval=FALSE}
+query_dat <- echodata::BST1[1:50,]
+dl_res <- DEEPLEARNING_query(query_dat = query_dat)
+```
+
+## Visualise scores
+
+```{r dl-plot, eval=FALSE}
+plt <- DEEPLEARNING_plot(dl_res)
+```
+

 # Session Info
