[SYNPY-1786] Added examples for getting data into a grid #1335
base: develop
Changes from 1 commit
3175175
2199893
3620e39
163e24b
fa74a18
fc02a36
dbddb31
81dee71
15f2bfd
649837c
1ba8b79
eaa7147
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -272,7 +272,114 @@ else: | |
| print("No validation results available. The Grid session must be exported to generate validation results.") | ||
| ``` | ||
|
|
||
| ### Example: Complete validation workflow for animal study metadata | ||
| ### Example2: Getting data into a Grid for a file-based workflow | ||
|
|
||
| The following example is for file-based curation. | ||
| It assumes your data is in a CSV file where each column is a property. | ||
|
|
||
|
|
||
| ```python | ||
| import pandas as pd | ||
| from synapseclient import Synapse | ||
| from synapseclient.models import CurationTask, Folder, File | ||
| from synapseclient.core.utils import make_bogus_data_file | ||
| from synapseclient.extensions.curator import create_file_based_metadata_task | ||
|
|
||
| # 1. Replace all these values with your own information | ||
| PROJECT_ID = "syn68175188" | ||
| FOLDER_NAME = "Patient Curation Folder" | ||
| CSV_PATH = "patient.csv" | ||
|
|
||
| JSON_SCHEMA_URI = "dpetest-test.schematic.Patient" | ||
| CURATION_TASK_NAME = "File-based curation task for patients" | ||
| INSTRUCTIONS = "Please curate the patient information." | ||
|
|
||
| # 2. Log in to Synapse | ||
| syn = Synapse() | ||
| syn.login() | ||
|
|
||
| # 3. Get annotations from CSV file | ||
| annotations = pd.read_csv(CSV_PATH).to_dict(orient="records")[0] | ||
|
|
||
|
|
||
| # 4. Create a folder to store the file that will be used for curation | ||
|
|
||
| folder = Folder(name=FOLDER_NAME, parent_id=PROJECT_ID) | ||
| folder = folder.store(synapse_client=syn) | ||
|
|
||
| # 5. Create a file, annotate it, and store it in Synapse folder | ||
| path_to_file = make_bogus_data_file(n=5) | ||
| file = File(path=path_to_file, parent_id=folder.id, annotations=annotations) | ||
| file = file.store(synapse_client=syn) | ||
|
|
||
| # 6. Create EntityView and CurationTask | ||
| view_id, task_id = create_file_based_metadata_task( | ||
|
|
||
| folder_id=folder.id, | ||
| curation_task_name=CURATION_TASK_NAME, | ||
| instructions=INSTRUCTIONS, | ||
| schema_uri=JSON_SCHEMA_URI, | ||
| synapse_client=syn, | ||
| ) | ||
|
|
||
|
|
||
| # 7. Clean up all Synapse entities created | ||
| folder.delete(synapse_client=syn) | ||
| CurationTask(task_id=task_id).delete(synapse_client=syn, delete_source=True) | ||
| ``` | ||
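Step 3 of the file-based example reads the CSV and keeps only the first row as the annotations dictionary. The following is a minimal sketch of that step in isolation, with no Synapse calls. The CSV content is made up for illustration: only `PatientID` appears elsewhere in these examples (as an upsert key), while `Sex` and `Diagnosis` are hypothetical column names.

```python
import io

import pandas as pd

# Hypothetical CSV content where each column is a property.
# Only "PatientID" comes from the examples; the other columns are assumptions.
csv_text = "PatientID,Sex,Diagnosis\nP001,F,Asthma\nP002,M,Diabetes\n"

# Each row becomes one dict keyed by column name.
records = pd.read_csv(io.StringIO(csv_text)).to_dict(orient="records")

# The file-based example uses only the first row as the file's annotations.
annotations = records[0]
print(annotations)
```

If the CSV has more than one row, the remaining rows are ignored by this step, so a one-row CSV per file is probably the intended input.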
|
|
||
| ### Example2: Getting data into a Grid for a record-based workflow | ||
|
|
||
| The following example is for record-based curation. | ||
| It assumes your data is in a CSV file where each column is a property. | ||
|
|
||
| ```python | ||
| import pandas as pd | ||
|
|
||
| from synapseclient import Synapse | ||
| from synapseclient.models import Folder | ||
| from synapseclient.extensions.curator import create_record_based_metadata_task | ||
|
|
||
|
|
||
| # 1. Replace all these values with your own information | ||
| PROJECT_ID = "syn68175188" | ||
| FOLDER_NAME = "Patient Curation Folder" | ||
| CSV_PATH = "patient.csv" | ||
| JSON_SCHEMA_URI = "dpetest-test.schematic.Patient" | ||
| CURATION_TASK_NAME = "Record-based curation task for patients" | ||
| INSTRUCTIONS = "Please curate the patient information." | ||
| RECORD_SET_NAME = "Patient Record Set" | ||
| RECORD_SET_DESCRIPTION = "A record set for patients created for a record-based curation task example." | ||
|
|
||
| UPSERT_KEYS = ["PatientID"] | ||
|
|
||
| # 2. Log in to Synapse | ||
| syn = Synapse() | ||
| syn.login() | ||
|
|
||
| # 3. Create a folder to store the RecordSet in | ||
| folder = Folder(name=FOLDER_NAME, parent_id=PROJECT_ID) | ||
|
|
||
| folder = folder.store(synapse_client=syn) | ||
|
|
||
| # 4. Create RecordSet, CurationTask, and Grid | ||
| record_set, task, grid = create_record_based_metadata_task( | ||
| folder_id=folder.id, | ||
| record_set_name=RECORD_SET_NAME, | ||
| record_set_description=RECORD_SET_DESCRIPTION, | ||
| curation_task_name=CURATION_TASK_NAME, | ||
| upsert_keys=UPSERT_KEYS, | ||
| instructions=INSTRUCTIONS, | ||
| schema_uri=JSON_SCHEMA_URI, | ||
| synapse_client=syn, | ||
| ) | ||
|
|
||
| # 5. Store the record set with the path to the CSV file as an annotation. | ||
| # TODO: Add functionality to store data in a RecordSet programmatically | ||
| # This doesn't store the data in the RecordSet | ||
|
Member
What does this comment mean?
Contributor
Author
I clarified it, but essentially it's a placeholder for SYNPY-1781
Member
Got it. Technically you can already do this, but we would have to merge the CSV client side and upload a new complete copy of the data. We should wait, though, to use the import feature. |
||
| record_set.get(synapse_client=syn) | ||
| record_set.path = CSV_PATH | ||
| record_set = record_set.store(synapse_client=syn) | ||
|
|
||
| # 6. Clean up all Synapse entities created | ||
| folder.delete(synapse_client=syn) | ||
| record_set.delete(synapse_client=syn, delete_source=True) | ||
|
|
||
| grid.delete(synapse_client=syn) | ||
|
|
||
| ``` | ||
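The review thread on step 5 mentions that data could already be loaded by merging the CSV client side and uploading a new complete copy. The following is a rough sketch of what such a client-side merge might look like with pandas alone, keyed on the same `UPSERT_KEYS` as the example. It is not part of the Synapse client: `upsert_rows` is a hypothetical helper, the `Diagnosis` column and all values are invented, and uploading the result is left out.

```python
import pandas as pd

UPSERT_KEYS = ["PatientID"]


def upsert_rows(existing: pd.DataFrame, incoming: pd.DataFrame, keys: list) -> pd.DataFrame:
    """Hypothetical helper: rows in `incoming` replace rows in `existing` that share the same keys."""
    combined = pd.concat([existing, incoming], ignore_index=True)
    # keep="last" makes the incoming row win when a key appears in both frames.
    return combined.drop_duplicates(subset=keys, keep="last").reset_index(drop=True)


# Invented data standing in for the current record set and a new CSV.
existing = pd.DataFrame(
    {"PatientID": ["P001", "P002"], "Diagnosis": ["Asthma", "Diabetes"]}
)
incoming = pd.DataFrame(
    {"PatientID": ["P002", "P003"], "Diagnosis": ["Type 2 diabetes", "Migraine"]}
)

merged = upsert_rows(existing, incoming, UPSERT_KEYS)
print(merged)
```

The merged frame would then need to be written back out as a complete CSV before being stored, which is the extra copy the reviewer suggests avoiding by waiting for the import feature.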
|
|
||
| ### Example3: Complete validation workflow for animal study metadata | ||
|
|
||
| This example demonstrates the full workflow from creating a curation task through validating the submitted metadata: | ||
|
|
||
|
|
||