Commit 184718b

[SYNPY-1726] Add bind-jsonschema CLI command (#1317)
* Add schema management wrapper functions using OOP models
  - Create schema_management.py with register_jsonschema and bind_jsonschema
  - Export functions from curator extension __init__.py
* Update CLI commands to use OOP model wrapper functions
  - Update register_json_schema to use register_jsonschema wrapper
  - Update bind_json_schema to use bind_jsonschema wrapper
  - Add register-json-schema CLI command with argparse configuration
* Add register-json-schema command to CLI documentation
  - Add register-json-schema to table of contents
  - Add complete command documentation with usage and parameters
  - Document schema registration workflow
* Add register and bind examples to Python tutorial
  - Update tutorial to cover complete JSON Schema workflow
  - Add section 9: Register JSON Schema to Synapse
  - Add section 10: Bind JSON Schema to entities
  - Update tutorial script with register_jsonschema and bind_jsonschema examples
  - Update imports and line references
* [test] Add unit tests for bind-json-schema CLI command
  - Add test for argument parsing
  - Add test for --enable-derived-annotations flag
  - Add test for API call invocation
* [test] Fix bind-json-schema test to match new wrapper implementation
  Update test to mock bind_jsonschema wrapper instead of asyncio.run, and verify correct arguments are passed to the wrapper function.
* Use JSONSchema.uri instead of manually building URI
* Log success message in register_jsonschema instead of returning it
  - Update register_jsonschema to log message and return only schema_uri
  - Update CLI function to match new signature
  - Update docstring and examples
* Return JSONSchema object instead of just URI string
  - Update return type from str to JSONSchema
  - Update docstring and examples to show accessing json_schema.uri
  - Update tutorial script accordingly
* Remove unnecessary fallback logic in bind_jsonschema
* Add integration tests for schema management functions
  Add comprehensive integration tests for register_jsonschema and bind_jsonschema wrapper functions. Tests cover:
  - Registering schemas with and without version
  - Binding schemas to folders
  - Enabling derived annotations
  - Complete register + bind workflow
* Use new operations API instead of deprecated syn.get()
* Simplify return statement in bind_jsonschema
* [test] Add integration tests for register-json-schema and bind-json-schema CLI commands in test_command_line_client.py.
  Tests cover:
  - Registering a schema via CLI
  - Binding a schema to an entity via CLI
  - Complete workflow of register and bind operations
* Fix return type annotation for bind_jsonschema
* Remove comments from schema_management.py
* Remove verbose binding details from CLI output
* Simplify initial setup section in schema operations tutorial
* Update main.py returning result
* [test] fixing import
* [test] removing assert fix
  - previously tried to test that the binding result contains entity information (like entityId or entity_id)
  - result is a JSONSchemaBinding object (a Python class instance), not a dictionary
  - just checking that the result is not none should be sufficient
* [test] changing clean up to finalizer
* [test] removing try/ except blocks
  Now if cleanup fails, the test will fail
* [test] add schema unbinding before folder deletion
  Add explicit schema unbinding before folder deletion to ensure schemas can be deleted (schemas cannot be deleted while bound - errors out)
* [test] refactor to reuse existing test_state fixture
  - Remove SchemaState container class in favor of separate fixtures for schema_organization and schema_file.
  - aligns with the existing test pattern where test_state provides the core Synapse client, project, and parser, while additional fixtures provide test-specific resources.
* [test] refactor use class-scoped fixture, fix UUID naming, use addfinalizer
* [docs] moving hardcoded file paths to the top
* [test] syn.delete > project.delete
* Add Python 3.14 compatible async versions for schema management functions
  - Add register_jsonschema_async and bind_jsonschema_async functions
  - Refactor sync functions to use wrap_async_to_sync pattern
  - Export async functions from curator __init__.py
* Apply suggestion from @thomasyu888
* Apply suggestions from code review

---------

Co-authored-by: Thomas Yu <thomas.yu@sagebase.org>
1 parent c83a3c4 commit 184718b

9 files changed

Lines changed: 928 additions & 30 deletions

docs/tutorials/command_line_client.md

Lines changed: 31 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -82,6 +82,8 @@ synapse [-h] [--version] [-u SYNAPSEUSER] [-p SYNAPSE_AUTH_TOKEN] [-c CONFIGPATH
8282
- [get-sts-token](#get-sts-token): Get an STS token for access to AWS S3 storage underlying Synapse
8383
- [migrate](#migrate): Migrate Synapse entities to a different storage location
8484
- [generate-json-schema](#generate-json-schema): Generate JSON Schema(s) from a data model
85+
- [register-json-schema](#register-json-schema): Register a JSON Schema to a Synapse organization
86+
- [bind-json-schema](#bind-json-schema): Bind a JSON Schema to a Synapse entity
8587

8688
### `get`
8789

@@ -558,3 +560,32 @@ synapse generate-json-schema [-h] [--data-types data_type1, data_type2] [--outpu
558560
| `--data-types` | Named | Optional list of data types to create JSON Schema for |
559561
| `--output` | Named | Optional. Either a file path ending in '.json', or a directory path |
560562
| `--data-model-labels` | Named | Either 'class_label', or 'display_label' |
563+
564+
### `register-json-schema`
565+
566+
Register a JSON Schema to a Synapse organization for later binding to entities.
567+
568+
```bash
569+
synapse register-json-schema [-h] [--schema-version VERSION] schema_path organization_name schema_name
570+
```
571+
572+
| Name | Type | Description | Default |
573+
|-----------------------|------------|-------------------------------------------------------------------------------------|---------|
574+
| `schema_path` | Positional | Path to the JSON schema file to register | |
575+
| `organization_name` | Positional | Name of the organization to register the schema under | |
576+
| `schema_name` | Positional | The name of the JSON schema | |
577+
| `--schema-version` | Named | Version of the schema to register (e.g., '0.0.1'). If not specified, auto-generated | None |
578+
579+
### `bind-json-schema`
580+
581+
Bind a registered JSON Schema to a Synapse entity for metadata validation.
582+
583+
```bash
584+
synapse bind-json-schema [-h] [--enable-derived-annotations] id json_schema_uri
585+
```
586+
587+
| Name | Type | Description | Default |
588+
|-------------------------------|------------|------------------------------------------------------------------------------------|---------|
589+
| `id` | Positional | The Synapse ID of the entity to bind the schema to (e.g., syn12345678) | |
590+
| `json_schema_uri` | Positional | The URI of the JSON Schema to bind (e.g., 'my.org-schema.name-1.0.0') | |
591+
| `--enable-derived-annotations`| Named | Enable derived annotations to auto-populate annotations from schema | False |
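
Reviewer sketch (not part of the commit): the two usage strings and parameter tables above can be exercised with a small stand-alone `argparse` parser. `build_schema_parser` is a hypothetical helper that covers only these two subcommands; the argument names, `dest`, and defaults are copied from the documented tables, and everything else is an assumption made for illustration.

```python
import argparse


def build_schema_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in parser covering only the two schema subcommands."""
    parser = argparse.ArgumentParser(prog="synapse")
    subparsers = parser.add_subparsers(dest="subcommand")

    # register-json-schema: three positionals plus an optional version
    register = subparsers.add_parser("register-json-schema")
    register.add_argument("schema_path", type=str)
    register.add_argument("organization_name", type=str)
    register.add_argument("schema_name", type=str)
    register.add_argument(
        "--schema-version", dest="schema_version", type=str, default=None
    )

    # bind-json-schema: two positionals plus a boolean flag
    bind = subparsers.add_parser("bind-json-schema")
    bind.add_argument("id", type=str)
    bind.add_argument("json_schema_uri", type=str)
    bind.add_argument(
        "--enable-derived-annotations", action="store_true", default=False
    )
    return parser


parser = build_schema_parser()
register_args = parser.parse_args(
    ["register-json-schema", "temp/Patient.json", "my.organization", "patient.schema"]
)
bind_args = parser.parse_args(
    [
        "bind-json-schema",
        "syn12345678",
        "my.org-schema.name-1.0.0",
        "--enable-derived-annotations",
    ]
)

print(register_args.schema_version)  # None: no version was passed
print(bind_args.enable_derived_annotations)  # True: the flag was present
```

Omitting `--schema-version` leaves `schema_version` as `None`, which matches the "auto-generated" default described in the table.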

docs/tutorials/python/schema_operations.md

Lines changed: 49 additions & 28 deletions
@@ -2,27 +2,26 @@ JSON Schema is a tool used to validate data. In Synapse, JSON Schemas can be use
22

33
Synapse supports a subset of features from [json-schema-draft-07](https://json-schema.org/draft-07). To see the list of features currently supported, see the [JSON Schema object definition](https://rest-docs.synapse.org/rest/org/sagebionetworks/repo/model/schema/JsonSchema.html) from Synapse's REST API Documentation.
44

5-
In this tutorial, you will learn how to create these JSON Schema using an existing data model.
5+
In this tutorial, you will learn how to create, register, and bind JSON Schemas using an existing data model.
66

77
## Tutorial Purpose
88

9-
You will create a JSON schema from your data model using the Python client as a library. To use the CLI tool, see the [documentation](../command_line_client.md).
9+
You will learn the complete JSON Schema workflow:
10+
1. **Generate** JSON schemas from your data model
11+
2. **Register** schemas to a Synapse organization
12+
3. **Bind** schemas to Synapse entities for metadata validation
13+
14+
This tutorial uses the Python client as a library. To use the CLI tool, see the [command line documentation](../command_line_client.md).
1015

1116
## Prerequisites
1217

1318
* You have a working [installation](../installation.md) of the Synapse Python Client.
1419
* You have a data model, see this [data model_documentation](../../explanations/curator_data_model.md).
1520

16-
## 1. Imports
17-
18-
```python
19-
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=1-2}
20-
```
21-
22-
## 2. Set up your variables
21+
## 1. Initial set up
2322

2423
```python
25-
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=4-11}
24+
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=1-18}
2625
```
2726

2827
To create a JSON Schema you need a data model, and the data types you want to create.
@@ -31,74 +30,96 @@ The data model must be in either CSV or JSON-LD form. The data model may be a lo
3130

3231
The data types must exist in your data model. This can be a list of data types, or `None` to create all data types in the data model.
3332

34-
## 3. Log into Synapse
35-
36-
```python
37-
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=13-14}
38-
```
39-
40-
## 4. Create a JSON Schema
33+
## 2. Create a JSON Schema
4134

4235
Create a JSON Schema
4336

4437
```python
45-
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=16-23}
38+
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=20-27}
4639
```
4740

4841
You should see the first JSON Schema for the datatype you selected printed.
4942
It will look like [this schema](https://repo-prod.prod.sagebase.org/repo/v1/schema/type/registered/dpetest-test.schematic.Patient).
5043
By setting the `output` parameter as path to a "temp" directory, the file will be created as "temp/Patient.json".
5144

52-
## 5. Create multiple JSON Schema
45+
## 3. Create multiple JSON Schema
5346

5447
Create multiple JSON Schema
5548

5649
```python
57-
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=26-32}
50+
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=30-36}
5851
```
5952

6053
The `data_types` parameter is a list and can have multiple data types.
6154

62-
## 6. Create every JSON Schema
55+
## 4. Create every JSON Schema
6356

6457
Create every JSON Schema
6558

6659
```python
67-
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=34-39}
60+
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=38-43}
6861
```
6962

7063
If you don't set a `data_types` parameter a JSON Schema will be created for every data type in the data model.
7164

72-
## 7. Create a JSON Schema with a certain path
65+
## 5. Create a JSON Schema with a certain path
7366

7467
Create a JSON Schema
7568

7669
```python
77-
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=41-47}
70+
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=45-51}
7871
```
7972

8073
If you have only one data type and set the `output` parameter to a file path(ending in.json), the JSON Schema file will have that path.
8174

82-
## 8. Create a JSON Schema in the current working directory
75+
## 6. Create a JSON Schema in the current working directory
8376

8477
Create a JSON Schema
8578

8679
```python
87-
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=49-54}
80+
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=53-58}
8881
```
8982

9083
If you don't set `output` parameter the JSON Schema file will be created in the current working directory.
9184

92-
## 9. Create a JSON Schema using display names
85+
## 7. Create a JSON Schema using display names
9386

9487
Create a JSON Schema
9588

9689
```python
97-
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=56-62}
90+
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=60-66}
9891
```
9992

10093
You can have Curator format the property names and valid values in the JSON Schema. This will remove whitespace and special characters.
10194

95+
## 8. Register a JSON Schema to Synapse
96+
97+
Once you've created a JSON Schema file, you can register it to a Synapse organization.
98+
99+
```python
100+
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=78-86}
101+
```
102+
103+
The `register_jsonschema` function:
104+
- Takes a path to your generated JSON Schema file
105+
- Registers it with the specified organization in Synapse
106+
- Returns a `JSONSchema` object whose `uri` attribute holds the schema URI (the success message is logged, not returned)
107+
- You can optionally specify a version (e.g., "0.0.1") or let it auto-generate
108+
109+
## 9. Bind a JSON Schema to a Synapse Entity
110+
111+
After registering a schema, you can bind it to Synapse entities (files, folders, etc.) for metadata validation.
112+
113+
```python
114+
{!docs/tutorials/python/tutorial_scripts/schema_operations.py!lines=88-95}
115+
```
116+
117+
The `bind_jsonschema` function:
118+
- Takes a Synapse entity ID (e.g., "syn12345678")
119+
- Binds the registered schema URI to that entity
120+
- Optionally enables derived annotations to auto-populate metadata
121+
- Returns binding details
122+
102123
## Source Code for this Tutorial
103124

104125
<details class="quote">
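
Reviewer sketch (not part of the commit): the URI passed to `bind_jsonschema` appears to follow an `organization-name-version` pattern, judging by the CLI example `'my.org-schema.name-1.0.0'`. The helper below is hypothetical and only illustrates that pattern. The commit itself reads the URI from `JSONSchema.uri` instead of building it by hand, and the unversioned form shown here is an assumption.

```python
from typing import Optional


def build_schema_uri(
    organization_name: str, schema_name: str, version: Optional[str] = None
) -> str:
    """Hypothetical helper: join organization, schema name, and version with hyphens."""
    uri = f"{organization_name}-{schema_name}"
    if version is not None:
        # Assumed format: the version is appended as a third hyphen-separated part
        uri = f"{uri}-{version}"
    return uri


print(build_schema_uri("my.org", "schema.name", "1.0.0"))  # my.org-schema.name-1.0.0
```

In real code, prefer the `uri` attribute of the object returned by `register_jsonschema`, as the tutorial script does.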

docs/tutorials/python/tutorial_scripts/schema_operations.py

Lines changed: 35 additions & 2 deletions
@@ -1,5 +1,9 @@
11
from synapseclient import Synapse
2-
from synapseclient.extensions.curator import generate_jsonschema
2+
from synapseclient.extensions.curator import (
3+
    bind_jsonschema,
4+
    generate_jsonschema,
5+
    register_jsonschema,
6+
)
37

48
# Path or URL to your data model (CSV or JSONLD format)
59
# Example: "path/to/my_data_model.csv" or "https://raw.githubusercontent.com/example.csv"
@@ -9,6 +13,16 @@
913
DATA_TYPE = ["Patient"]
1014
# Directory where JSON Schema files will be saved
1115
OUTPUT_DIRECTORY = "temp"
16+
# Path to a generated JSON Schema file for registration
17+
SCHEMA_PATH = "temp/Patient.json"
18+
# Your Synapse organization name for schema registration
19+
ORGANIZATION_NAME = "my.organization"
20+
# Name for the schema
21+
SCHEMA_NAME = "patient.schema"
22+
# Version number for the schema
23+
SCHEMA_VERSION = "0.0.1"
24+
# Synapse entity ID to bind the schema to (file, folder, etc.)
25+
ENTITY_ID = "syn12345678"
1226

1327
syn = Synapse()
1428
syn.login()
@@ -53,10 +67,29 @@
5367
    synapse_client=syn,
5468
)
5569

56-
# Create JSON Schema in using display names for both properties names and valid values
70+
# Create JSON Schema using display names for both properties names and valid values
5771
schemas, file_paths = generate_jsonschema(
5872
    data_model_source=DATA_MODEL_SOURCE,
5973
    data_types=DATA_TYPE,
6074
    data_model_labels="display_label",
6175
    synapse_client=syn,
6276
)
77+
78+
# Register a JSON Schema to Synapse
79+
json_schema = register_jsonschema(
80+
    schema_path=SCHEMA_PATH,
81+
    organization_name=ORGANIZATION_NAME,
82+
    schema_name=SCHEMA_NAME,
83+
    schema_version=SCHEMA_VERSION,
84+
    synapse_client=syn,
85+
)
86+
print(f"Registered schema URI: {json_schema.uri}")
87+
88+
# Bind a JSON Schema to a Synapse entity
89+
result = bind_jsonschema(
90+
    entity_id=ENTITY_ID,
91+
    json_schema_uri=json_schema.uri,
92+
    enable_derived_annotations=True,
93+
    synapse_client=syn,
94+
)
95+
print(f"Successfully bound schema to entity: {result}")
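
Reviewer sketch (not part of the commit): `SCHEMA_PATH` is hardcoded as `"temp/Patient.json"`, which matches the tutorial's statement that an `output` directory of `"temp"` yields `temp/Patient.json`. The hypothetical helper below derives the same path from `OUTPUT_DIRECTORY` and `DATA_TYPE`. The `<data type>.json` naming is inferred from that one example; the `file_paths` value returned by `generate_jsonschema` remains the authoritative source.

```python
from pathlib import Path
from typing import List


def expected_schema_paths(output_directory: str, data_types: List[str]) -> List[Path]:
    """Hypothetical helper: one '<data type>.json' path per data type (assumed naming)."""
    return [Path(output_directory) / f"{data_type}.json" for data_type in data_types]


paths = expected_schema_paths("temp", ["Patient"])
print(paths[0].as_posix())  # temp/Patient.json
```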

synapseclient/__main__.py

Lines changed: 77 additions & 0 deletions
@@ -36,6 +36,10 @@
3636
    SynapseNoCredentialsError,
3737
)
3838
from synapseclient.extensions.curator.schema_generation import generate_jsonschema
39+
from synapseclient.extensions.curator.schema_management import (
40+
    bind_jsonschema,
41+
    register_jsonschema,
42+
)
3943
from synapseclient.wiki import Wiki
4044

4145
tracer = trace.get_tracer("synapseclient")
@@ -814,6 +818,31 @@ def generate_json_schema(args, syn):
814818
logging.info(f"Created JSON Schema files: [{paths}]")
815819

816820

821+
def register_json_schema(args, syn):
822+
    """Register a JSON schema to a Synapse organization."""
823+
    register_jsonschema(
824+
        schema_path=args.schema_path,
825+
        organization_name=args.organization_name,
826+
        schema_name=args.schema_name,
827+
        schema_version=args.schema_version,
828+
        synapse_client=syn,
829+
    )
830+
831+
832+
def bind_json_schema(args, syn):
833+
    """Bind a JSON schema to a Synapse entity."""
834+
    result = bind_jsonschema(
835+
        entity_id=args.id,
836+
        json_schema_uri=args.json_schema_uri,
837+
        enable_derived_annotations=args.enable_derived_annotations,
838+
        synapse_client=syn,
839+
    )
840+
    syn.logger.info(
841+
        f"Successfully bound schema '{args.json_schema_uri}' to entity '{args.id}'"
842+
    )
843+
    return result
844+
845+
817846
def build_parser():
818847
    """Builds the argument parser and returns the result."""
819848

@@ -1845,6 +1874,54 @@ def build_parser():
18451874
    )
18461875
    parser_generate_json_schema.set_defaults(func=generate_json_schema)
18471876

1877+
    parser_register_json_schema = subparsers.add_parser(
1878+
        "register-json-schema", help="Register a JSON Schema to a Synapse organization."
1879+
    )
1880+
    parser_register_json_schema.add_argument(
1881+
        "schema_path",
1882+
        type=str,
1883+
        help="Path to the JSON schema file to register",
1884+
    )
1885+
    parser_register_json_schema.add_argument(
1886+
        "organization_name",
1887+
        type=str,
1888+
        help="Name of the organization to register the schema under",
1889+
    )
1890+
    parser_register_json_schema.add_argument(
1891+
        "schema_name",
1892+
        type=str,
1893+
        help="The name of the JSON schema",
1894+
    )
1895+
    parser_register_json_schema.add_argument(
1896+
        "--schema-version",
1897+
        dest="schema_version",
1898+
        type=str,
1899+
        default=None,
1900+
        help="Version of the schema to register (e.g., '0.0.1'). If not specified, a version will be auto-generated.",
1901+
    )
1902+
    parser_register_json_schema.set_defaults(func=register_json_schema)
1903+
1904+
    parser_bind_json_schema = subparsers.add_parser(
1905+
        "bind-json-schema", help="Bind a JSON Schema to a Synapse entity."
1906+
    )
1907+
    parser_bind_json_schema.add_argument(
1908+
        "id",
1909+
        type=str,
1910+
        help="The Synapse ID of the entity to bind the schema to (e.g., syn12345678).",
1911+
    )
1912+
    parser_bind_json_schema.add_argument(
1913+
        "json_schema_uri",
1914+
        type=str,
1915+
        help="The URI of the JSON Schema to bind (e.g., 'my.org-schema.name-1.0.0').",
1916+
    )
1917+
    parser_bind_json_schema.add_argument(
1918+
        "--enable-derived-annotations",
1919+
        action="store_true",
1920+
        default=False,
1921+
        help="Enable derived annotations to auto-populate annotations from schema. Defaults to False.",
1922+
    )
1923+
    parser_bind_json_schema.set_defaults(func=bind_json_schema)
1924+
18481925
    parser_migrate.set_defaults(func=migrate)
18491926

18501927
    return parser
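
Reviewer sketch (not part of the commit): the commit message says the unit test mocks the `bind_jsonschema` wrapper and checks the arguments forwarded to it. The snippet below illustrates that idea in isolation. To stay self-contained it redefines the handler with the wrapper injected as a hypothetical `bind_fn` parameter, whereas the real handler imports `bind_jsonschema` directly; `fake_syn` and `fake_bind` are stand-ins.

```python
from argparse import Namespace
from unittest.mock import MagicMock


def bind_json_schema(args, syn, bind_fn):
    """Same shape as the handler in the diff, with the wrapper injected as `bind_fn`."""
    result = bind_fn(
        entity_id=args.id,
        json_schema_uri=args.json_schema_uri,
        enable_derived_annotations=args.enable_derived_annotations,
        synapse_client=syn,
    )
    syn.logger.info(
        f"Successfully bound schema '{args.json_schema_uri}' to entity '{args.id}'"
    )
    return result


# Stand-ins for the Synapse client and the wrapper function
fake_syn = MagicMock()
fake_bind = MagicMock(return_value="binding-placeholder")
args = Namespace(
    id="syn12345678",
    json_schema_uri="my.org-schema.name-1.0.0",
    enable_derived_annotations=True,
)

result = bind_json_schema(args, fake_syn, fake_bind)

# The handler should forward the parsed CLI values unchanged
fake_bind.assert_called_once_with(
    entity_id="syn12345678",
    json_schema_uri="my.org-schema.name-1.0.0",
    enable_derived_annotations=True,
    synapse_client=fake_syn,
)
```

In the real test suite the same check would more likely use `unittest.mock.patch` on the imported wrapper name rather than an injected parameter.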

synapseclient/extensions/curator/__init__.py

Lines changed: 10 additions & 0 deletions
@@ -7,6 +7,12 @@
77
from .file_based_metadata_task import create_file_based_metadata_task
88
from .record_based_metadata_task import create_record_based_metadata_task
99
from .schema_generation import generate_jsonld, generate_jsonschema
10+
from .schema_management import (
11+
    bind_jsonschema,
12+
    bind_jsonschema_async,
13+
    register_jsonschema,
14+
    register_jsonschema_async,
15+
)
1016
from .schema_registry import query_schema_registry
1117

1218
__all__ = [
@@ -15,4 +21,8 @@
1521
    "query_schema_registry",
1622
    "generate_jsonld",
1723
    "generate_jsonschema",
24+
    "register_jsonschema",
25+
    "register_jsonschema_async",
26+
    "bind_jsonschema",
27+
    "bind_jsonschema_async",
1828
]
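
Reviewer sketch (not part of the commit): the package now exports both sync and `_async` variants, and the commit message says the sync functions wrap the async ones. The snippet below shows the general pattern with plain `asyncio`. It is not the library's `wrap_async_to_sync` helper, and every function here (`register_async`, `register`, `bind_async`, `bind_many`) is a hypothetical stand-in that performs no network I/O.

```python
import asyncio
from typing import List


async def register_async(schema_name: str) -> str:
    """Hypothetical stand-in for an async registration call."""
    await asyncio.sleep(0)  # placeholder for real network I/O
    return f"registered:{schema_name}"


def register(schema_name: str) -> str:
    """Synchronous facade: run the coroutine to completion on a fresh event loop."""
    return asyncio.run(register_async(schema_name))


async def bind_async(entity_id: str, json_schema_uri: str) -> str:
    """Hypothetical stand-in for an async bind call."""
    await asyncio.sleep(0)  # placeholder for real network I/O
    return f"{entity_id}<-{json_schema_uri}"


async def bind_many(entity_ids: List[str], json_schema_uri: str) -> List[str]:
    """Bind one schema URI to several entities concurrently."""
    return await asyncio.gather(
        *(bind_async(entity_id, json_schema_uri) for entity_id in entity_ids)
    )


print(register("patient.schema"))  # registered:patient.schema
```

One reason to expose the async variants directly is that `asyncio.run` raises an error when called from inside an already running event loop, so callers in async code need a coroutine they can `await`.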
