as separate workflows that execute only one or a few rules. When using TES, it
is recommended to use additional remote storage to store input and output
files. By default, Snakemake TES tasks are executed using the official
Snakemake container image in the same version as the original Snakemake call.
To use specific tools, conda environments should be appended to the rules. A
demo workflow is available
[here][elixir-cloud-demo-smk].

### CWL-tes

A demo workflow is available [here][elixir-cloud-demo-cwl].

### Nextflow

You can find an article about Nextflow with GA4GH TES [here](https://techcommunity.microsoft.com/blog/healthcareandlifesciencesblog/introducing-nextflow-with-ga4gh-tes-a-new-era-of-scalable-data-processing-on-azu/4253160).

To use TES in your Nextflow config, use the plugin `nf-ga4gh`:

```
plugins {
  id 'nf-ga4gh'
}
```

## Workflow Execution Service (WES)

The GA4GH [WES][ga4gh-wes] is a standard API specification for executing
and monitoring bioinformatics workflows. It allows researchers to
execute and manage complex analysis pipelines across multiple computing
platforms and institutions. The WES specification provides a unified API for
describing workflow inputs and outputs, monitoring job status and progress, and
managing data transfers. With this specification, users can build scalable,
reproducible, and interoperable genomics workflows, enabling collaboration
across institutions and improving data sharing. Two use cases for the GA4GH WES
specification are:

- Scenario 1: A researcher wants to analyze a large dataset of genomic data
  using a specific analysis pipeline. With the WES specification, the
  researcher can define the inputs and parameters for the pipeline,
  select a computing platform that meets their requirements, and submit the job
  for execution. They can then monitor the progress of the job and receive
  notifications when the job is complete. This allows the researcher to focus
  on analyzing the results rather than managing the underlying infrastructure.

- Scenario 2: A clinical laboratory needs to process patient samples for
  genetic testing. The laboratory can use the WES specification to define the
  analysis pipeline and integrate it with its LIMS. This allows the laboratory
  to automate the processing of samples, reducing errors and turnaround time.
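
The submit-and-monitor flow from Scenario 1 can be sketched in Python. This is a minimal sketch under stated assumptions, not a definitive client: the form field names, the `/ga4gh/wes/v1/runs` path, and the run states are taken from the GA4GH WES 1.0 specification as we understand it, while the server URL, the workflow URL, and the helper names `build_run_request` and `is_terminal` are hypothetical. No network call is made.

```python
import json

# Run states that, per our reading of WES 1.0, mean a run will not progress further.
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def build_run_request(base_url, workflow_url, workflow_type,
                      workflow_type_version, params):
    """Return the endpoint and form fields for a WES `POST /runs` call.

    Nothing is sent here; a caller would submit `fields` as multipart
    form data with the HTTP client of their choice.
    """
    endpoint = base_url.rstrip("/") + "/ga4gh/wes/v1/runs"
    fields = {
        "workflow_url": workflow_url,
        "workflow_type": workflow_type,
        "workflow_type_version": workflow_type_version,
        # WES expects the workflow parameters as a JSON-encoded string.
        "workflow_params": json.dumps(params),
    }
    return endpoint, fields


def is_terminal(state):
    """Tell a polling loop whether it can stop checking `GET /runs/{run_id}/status`."""
    return state in TERMINAL_STATES


# Hypothetical server and workflow, for illustration only.
endpoint, fields = build_run_request(
    "https://wes.example.org/",
    "https://example.org/workflows/variant-calling.cwl",
    "CWL",
    "v1.0",
    {"reads": "sample.fastq"},
)
```

A polling loop would repeatedly read the `state` field of the status response and stop once `is_terminal(state)` is true.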

## Data Repository Service (DRS)

The GA4GH [DRS][ga4gh-drs] API provides a standard set of data retrieval methods
to access genomic and related health data across different repositories.
It allows researchers to simplify and standardize data retrieval in cloud-based
environments. Key features include standardized data access, which offers a consistent
API for retrieving datasets, and cloud-agnostic operation, meaning it works across different
cloud infrastructures. Two use cases for the GA4GH DRS:

- Scenario 1: A researcher wants to run an analysis pipeline on a dataset without
  worrying about where the data physically resides. The researcher uses a DRS ID
  to request the dataset. DRS resolves the ID to the actual storage location and
  provides signed URLs or access tokens, and the pipeline retrieves the data
  seamlessly, regardless of the underlying cloud or storage system.

- Scenario 2: A pharmaceutical company is collaborating with hospitals to analyze
  patient genomic data. Due to privacy regulations, raw data cannot be moved outside
  the hospital’s secure environment. The hospital can expose its datasets via DRS
  endpoints, and the pharmaceutical company's workflow engine queries DRS to get metadata.
  The analysis is then performed without violating data residency rules.
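
The resolution step in Scenario 1 can be illustrated with a short Python sketch. It assumes the hostname-based URI form `drs://<hostname>/<id>`, which the DRS specification maps to `https://<hostname>/ga4gh/drs/v1/objects/<id>`; compact-identifier URIs are not handled. The host, the object ID, and the helper names `drs_uri_to_url` and `select_access_method` are hypothetical, and no network call is made.

```python
from urllib.parse import quote, urlparse


def drs_uri_to_url(drs_uri):
    """Map a hostname-based DRS URI to the HTTPS URL of its DrsObject."""
    parsed = urlparse(drs_uri)
    object_id = parsed.path.lstrip("/")
    if parsed.scheme != "drs" or not parsed.netloc or not object_id:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")
    return (
        f"https://{parsed.netloc}/ga4gh/drs/v1/objects/"
        f"{quote(object_id, safe='')}"
    )


def select_access_method(drs_object, preferred_types=("https", "s3", "gs")):
    """Pick the first access method whose type matches the preference order.

    Returns None when the object offers none of the preferred types.
    """
    methods = drs_object.get("access_methods", [])
    for wanted in preferred_types:
        for method in methods:
            if method.get("type") == wanted:
                return method
    return None


# Hypothetical DrsObject, trimmed to the fields used here.
example_object = {
    "id": "314159",
    "access_methods": [
        {"type": "s3", "access_id": "s3-copy"},
        {"type": "https",
         "access_url": {"url": "https://data.example.org/314159.bam"}},
    ],
}

object_url = drs_uri_to_url("drs://drs.example.org/314159")
chosen = select_access_method(example_object)
```

When the chosen method carries only an `access_id` rather than an `access_url`, a client would make a further request to the object's access endpoint to obtain a usable (for example, signed) URL.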

## Tool Registry Service (TRS)

The GA4GH [TRS][ga4gh-trs] API provides a standard mechanism to list, search and
register tools and workflows across different platforms and cloud environments.
It supports workflows written in CWL, WDL, Nextflow, Galaxy, and Snakemake.
Here are examples of two use cases:

- Scenario 1: A bioinformatics researcher develops a workflow for variant calling
  using WDL and Docker containers. They want to share it with collaborators who use
  different platforms. TRS can help: the researcher registers the workflow in a
  TRS-compliant registry like Dockstore. The collaborators can discover the workflow
  via the TRS API and run it on their platform.
  TRS will ensure that metadata, versioning, and containers are standardized and
  accessible.

- Scenario 2: A hospital’s genomics lab uses an automated pipeline to analyze patient
  exome data for rare disease diagnosis. The pipeline queries a TRS registry to find
  the latest version of tools (like VEP or GATK) and retrieves the workflow descriptor
  and container images. Finally, the pipeline executes the tools in a secure,