Add initial CLAUDE.md for AI-assisted development #1342
Merged
8 commits
532295e Add initial CLAUDE.md for AI-assisted development (BryanFauble)
eb74b2b Add module-level CLAUDE.md files for models, api, core, and tests (BryanFauble)
f54c099 Rewrite and expand CLAUDE.md coverage to 16 files (BryanFauble)
826cca2 Fix inaccuracies and add missing context to CLAUDE.md files (BryanFauble)
a319562 Address PR review feedback (BryanFauble)
a538eb7 Merge branch 'develop' into add-claude-md (BryanFauble)
486c506 Add integration test fixture best practices to tests/CLAUDE.md (BryanFauble)
8ad0553 Add integration test fixture best practices to tests/CLAUDE.md (BryanFauble)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,117 @@ | ||
| <!-- Last reviewed: 2026-03 --> | ||
|
|
||
| ## Project | ||
|
|
||
| Synapse Python Client — official Python SDK and CLI for Synapse (synapse.org), a collaborative science platform by Sage Bionetworks. Provides programmatic access to entities (projects, files, folders, tables, views), metadata, permissions, evaluations, and data curation workflows. Published to PyPI as `synapseclient`. | ||
|
|
||
| ## Stack | ||
|
|
||
| - Python 3.10–3.14 (`setup.cfg`: `python_requires = >=3.10, <3.15`) | ||
| - HTTP: httpx (async), requests (sync/legacy) | ||
| - Models: stdlib dataclasses (NOT Pydantic) | ||
| - Tests: pytest 8.2, pytest-asyncio, pytest-socket, pytest-xdist | ||
| - Docs: MkDocs with Material theme, mkdocstrings | ||
| - Linting: ruff, black (line-length 88), isort (profile=black), bandit | ||
| - CI: GitHub Actions → SonarCloud, PyPI deploy on release | ||
| - Docker: `Dockerfile` at repo root, published to `ghcr.io/sage-bionetworks/synapsepythonclient` | ||
|
|
||
| ## Commands | ||
|
|
||
| ```bash | ||
| # Install for development | ||
| pip install -e ".[boto3,pandas,pysftp,tests,curator,dev]" | ||
|
|
||
| # Unit tests | ||
| pytest -sv tests/unit | ||
|
|
||
| # Integration tests (requires Synapse credentials, runs in parallel) | ||
| pytest -sv --reruns 3 tests/integration -n 8 --dist loadscope | ||
|
|
||
| # Pre-commit checks (ruff, black, isort, bandit) | ||
| pre-commit run --all-files | ||
|
|
||
| # Build docs locally | ||
| pip install -e ".[docs]" && mkdocs serve | ||
| ``` | ||
|
|
||
| ## Conventions | ||
|
|
||
| ### Async-first with generated sync wrappers | ||
| All new methods must be async with `_async` suffix. The `@async_to_sync` class decorator (`core/async_utils.py`) auto-generates sync counterparts at class definition time. Never write sync methods manually on model classes — the decorator handles it. | ||
|
|
||
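As a rough illustration of the idea (not the actual implementation in `core/async_utils.py`), a class decorator can walk the class namespace and attach a blocking twin for every coroutine method whose name ends in `_async`. The names `async_to_sync_sketch` and `Example` are hypothetical, and the sketch assumes no event loop is already running in the calling thread.

```python
import asyncio
import functools
import inspect


def async_to_sync_sketch(cls):
    """Hypothetical stand-in: add sync twins for every *_async coroutine method."""
    for name, member in list(vars(cls).items()):
        if not (name.endswith("_async") and inspect.iscoroutinefunction(member)):
            continue
        sync_name = name[: -len("_async")]
        if sync_name in vars(cls):
            continue  # never overwrite a method the class defines itself

        def make_sync(async_method):
            @functools.wraps(async_method)
            def sync_method(self, *args, **kwargs):
                # Assumption: no event loop is running in this thread.
                return asyncio.run(async_method(self, *args, **kwargs))

            return sync_method

        wrapper = make_sync(member)
        wrapper.__name__ = sync_name
        setattr(cls, sync_name, wrapper)
    return cls


@async_to_sync_sketch
class Example:
    async def get_async(self, value: int) -> int:
        await asyncio.sleep(0)
        return value * 2
```

With this in place, `Example().get(21)` and `await Example().get_async(21)` run the same coroutine body, which is why only the `_async` variant needs to be written by hand.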
| ### `wrap_async_to_sync()` for standalone functions | ||
| Use `wrap_async_to_sync()` (not `@async_to_sync`) for free-standing async functions outside of classes — see `operations/` layer for the pattern. The class decorator only works on classes. | ||
|
|
||
| ### Protocol classes for sync type hints | ||
| Each model in `models/` has a corresponding protocol in `models/protocols/` defining the sync method signatures. When adding a new async method to a model, add its sync signature to the protocol class so IDE type hints work. | ||
|
|
||
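A minimal sketch of how such a protocol might look, assuming the model class inherits from it so that type checkers see the sync signature. `ProjectSketch` and `ProjectSynchronousProtocolSketch` are hypothetical names, and in the real code the sync method body would be supplied by the decorator rather than by the placeholder shown here.

```python
from typing import Optional, Protocol


class ProjectSynchronousProtocolSketch(Protocol):
    """Hypothetical protocol: declares sync signatures for IDEs and type checkers."""

    def store(self, *, synapse_client: Optional[object] = None) -> "ProjectSketch":
        """Sync counterpart of `store_async`."""
        return self  # placeholder body so the sketch stays runnable


class ProjectSketch(ProjectSynchronousProtocolSketch):
    """Hypothetical model that only writes the async method by hand."""

    async def store_async(
        self, *, synapse_client: Optional[object] = None
    ) -> "ProjectSketch":
        return self
```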
| ### Dataclass models with `fill_from_dict()` | ||
| Models are `@dataclass` classes, NOT Pydantic. REST responses are deserialized via `fill_from_dict()` methods on each model. New models must follow this pattern. | ||
|
|
||
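The pattern can be sketched as follows. `FolderSketch` and its three fields are illustrative assumptions, not the real model; the point is that the method mutates the instance from a REST response dictionary and returns `self`.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FolderSketch:
    """Hypothetical model; field names are illustrative only."""

    id: Optional[str] = None
    name: Optional[str] = None
    parent_id: Optional[str] = None

    def fill_from_dict(self, synapse_response: dict[str, Any]) -> "FolderSketch":
        # Map camelCase REST keys onto snake_case attributes.
        # Keys missing from the response leave the attribute as None.
        self.id = synapse_response.get("id")
        self.name = synapse_response.get("name")
        self.parent_id = synapse_response.get("parentId")
        return self
```

Returning `self` lets callers write `FolderSketch().fill_from_dict(response)` in one expression.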
| ### Concrete types are Java class names | ||
| `core/constants/concrete_types.py` maps Java class names (e.g., `org.sagebionetworks.repo.model.FileEntity`) for polymorphic entity deserialization. When adding new entity types, register the concrete type string here AND in `api/entity_factory.py` AND in `models/mixins/asynchronous_job.py` if it's an async job type. | ||
|
|
||
| ### Options dataclass pattern | ||
| The `operations/` layer uses dataclass option objects (`StoreFileOptions`, `FileOptions`, `TableOptions`, etc.) to bundle type-specific configuration for CRUD operations. Follow this pattern for new entity-type-specific options. | ||
|
|
||
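A hedged sketch of the options-object idea: type-specific settings travel in a small dataclass instead of widening the signature of the shared operation. `StoreFileOptionsSketch`, its fields, and `store_sketch` are hypothetical; the real `StoreFileOptions` fields may differ.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StoreFileOptionsSketch:
    """Hypothetical option bundle for file-specific settings."""

    synapse_store: Optional[bool] = None
    content_type: Optional[str] = None


def store_sketch(
    entity_type: str,
    *,
    file_options: Optional[StoreFileOptionsSketch] = None,
) -> dict[str, Any]:
    """Collect only the options that apply to the entity type being stored."""
    request: dict[str, Any] = {"type": entity_type}
    if entity_type == "file" and file_options is not None:
        if file_options.synapse_store is not None:
            request["synapse_store"] = file_options.synapse_store
        if file_options.content_type is not None:
            request["content_type"] = file_options.content_type
    return request
```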
| ### Mixin composition for shared behavior | ||
| Shared functionality lives in `models/mixins/` (AccessControllable, StorableContainer, AsynchronousJob, etc.). Prefer adding to existing mixins over duplicating logic across models. | ||
|
|
||
| ### `synapse_client` parameter pattern | ||
| Most functions accept an optional `synapse_client` parameter. If omitted, `Synapse.get_client()` returns the cached singleton. Never pass `None` explicitly — omit the argument instead. | ||
|
|
||
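To make the lookup order concrete, here is a small sketch with a hypothetical `ClientSketch` class standing in for `Synapse`. It models only the caching behavior described above: an explicit client wins, otherwise the last cached instance is used.

```python
from typing import Optional


class ClientSketch:
    """Hypothetical stand-in for `Synapse`; only client caching is modeled."""

    _last_created: Optional["ClientSketch"] = None

    def __init__(self, cache_client: bool = True) -> None:
        if cache_client:
            ClientSketch._last_created = self

    @classmethod
    def get_client(
        cls, synapse_client: Optional["ClientSketch"] = None
    ) -> "ClientSketch":
        if synapse_client is not None:
            return synapse_client
        if cls._last_created is None:
            raise RuntimeError("No cached client is available; create one first.")
        return cls._last_created


def describe(*, synapse_client: Optional[ClientSketch] = None) -> ClientSketch:
    """Hypothetical function following the optional keyword-only parameter pattern."""
    return ClientSketch.get_client(synapse_client=synapse_client)
```

Callers that want the shared client write `describe()`; callers that manage their own client write `describe(synapse_client=my_client)`.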
| ### Branch naming | ||
| Use `SYNPY-{issue_number}` or `synpy-{issue_number}` prefix for feature branches. PR titles follow `[SYNPY-XXXX] Description` format. | ||
|
|
||
| ## Architecture | ||
|
|
||
| ``` | ||
| synapseclient/ | ||
| ├── client.py # Synapse class — public entry point, REST methods, auth (9600+ lines) | ||
| ├── api/ # REST API layer — one file per resource type (21 files) | ||
| │ └── entity_factory.py # Polymorphic entity deserialization via concrete type dispatch | ||
| ├── models/ # Dataclass entities (Project, File, Table, etc.) (28 files) | ||
| │ ├── protocols/ # Sync method type signatures for IDE hints (18 files) | ||
| │ ├── mixins/ # Shared behavior (ACL, containers, async jobs, tables) (7 files) | ||
| │ └── services/ # Model-level business logic (storable_entity, search) | ||
| ├── operations/ # High-level CRUD: get(), store(), delete() — factory dispatch | ||
| ├── core/ # Infrastructure: upload/download, retry, cache, creds, OTel | ||
| │ ├── upload/ # Multipart upload (sync + async) | ||
| │ ├── download/ # File download (sync + async) | ||
| │ ├── credentials/ # Auth chain (PAT, env var, config file, AWS SSM) | ||
| │ ├── constants/ # Concrete types, config keys, limits, method flags | ||
| │ ├── models/ # ACL, Permission, DictObject, custom JSON serialization | ||
| │ └── multithread_download/ # Threaded download manager | ||
| ├── extensions/ | ||
| │ └── curator/ # Schema curation (pandas, networkx, rdflib) — optional | ||
| ├── services/ # JSON schema validation services | ||
| └── entity.py, table.py, ... # Legacy classes (pre-OOP rewrite, read-only) | ||
|
|
||
| synapseutils/ # Legacy bulk utilities (copy, sync, migrate, walk) — sync-only | ||
| ``` | ||
|
|
||
| Data flow: User → `operations/` factory → model async methods → `api/` service functions → `client.py` REST calls → Synapse API. Responses deserialized via `fill_from_dict()` on model instances. | ||
|
|
||
| ## Constraints | ||
|
|
||
| - Do not use Pydantic for models — the codebase uses stdlib dataclasses with custom serialization. Mixing would break the `@async_to_sync` decorator and `fill_from_dict()` pattern. | ||
| - For new tests, prefer async test modules. Existing synchronous unit tests under `tests/unit/` are retained and maintained; the `@async_to_sync` decorator is covered by a dedicated smoke test, so avoid adding duplicate sync/async test coverage. | ||
| - On non-Windows platforms, unit tests must not make external network calls — `pytest-socket` blocks internet-facing sockets while allowing Unix domain sockets. Socket blocking is skipped on Windows. Use `pytest-mock` for HTTP mocking. | ||
| - `develop` is the default/main branch, not `main` or `master`. PRs target `develop`. | ||
| - Legacy classes in root `synapseclient/` (entity.py, table.py, etc.) are kept for backwards compatibility. New features go in `models/` using the dataclass pattern. | ||
| - Avoid adding new methods to `client.py` (9600+ lines) — prefer the `api/` + `models/` layered pattern. | ||
| - `synapseutils/` is legacy sync-only (uses `requests`, NOT `httpx`). Do not add async methods there — new async equivalents go in `models/` or `operations/`. | ||
|
|
||
| ## Testing | ||
|
|
||
| - `asyncio_mode = auto` in pytest.ini — no need for `@pytest.mark.asyncio` | ||
| - `asyncio_default_fixture_loop_scope = session` — all async tests share one event loop | ||
| - Unit test client fixture: session-scoped, `skip_checks=True`, `cache_client=False` | ||
| - Integration tests use `--reruns 3` for flaky retries and `-n 8 --dist loadscope` for parallelism | ||
| - Integration fixtures create per-worker Synapse projects; use `schedule_for_cleanup()` for teardown | ||
| - Auth env vars: `SYNAPSE_AUTH_TOKEN` (bearer token), `SYNAPSE_PROFILE` (config file profile, default: `"default"`), `SYNAPSE_TOKEN_AWS_SSM_PARAMETER_NAME` (AWS SSM path) | ||
| - CI runs integration tests only on Python 3.10 and 3.14 (oldest + newest) to limit Synapse server load | ||
|
|
||
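A sketch of what a unit test can look like under these settings. The function under test, `fetch_name`, is hypothetical, and `unittest.mock.AsyncMock` is used directly here to keep the example self-contained (the `mocker` fixture from `pytest-mock` exposes the same mock classes).

```python
import asyncio
from unittest.mock import AsyncMock


async def fetch_name(client, entity_id: str) -> str:
    """Hypothetical function under test: one awaited REST call, one field read."""
    response = await client.rest_get_async(uri=f"/entity/{entity_id}")
    return response["name"]


async def test_fetch_name_uses_mocked_client() -> None:
    # With asyncio_mode = auto, pytest collects this coroutine without a marker.
    client = AsyncMock()
    client.rest_get_async.return_value = {"name": "My Project"}

    assert await fetch_name(client, "syn123") == "My Project"
    # No socket is opened: the only "network" call is the awaited mock.
    client.rest_get_async.assert_awaited_once_with(uri="/entity/syn123")
```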
| ## Maintenance | ||
|
|
||
| Each CLAUDE.md file has a `<!-- Last reviewed: YYYY-MM -->` header. Update this when the file is reviewed or modified. If a code change invalidates guidance in a CLAUDE.md file, update the guidance in the same PR. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,61 @@ | ||
| <!-- Last reviewed: 2026-03 --> | ||
|
|
||
| ## Project | ||
|
|
||
| User-facing documentation for the Synapse Python Client. Built with MkDocs + Material theme, deployed via Read the Docs. Follows the Diataxis documentation framework with four content types: tutorials, guides, reference, and explanations. | ||
|
|
||
| ## Stack | ||
|
|
||
| MkDocs with Material theme, mkdocstrings (Google-style docstrings), termynal (CLI animations), markdown-include (file embedding). | ||
|
|
||
| ### Python style | ||
| - Use built-in generics (`list`, `dict`, `tuple`, `set`) instead of `typing.List`, `typing.Dict`, etc. (Python 3.9+) | ||
|
|
||
| ## Conventions | ||
|
|
||
| ### Content types (Diataxis framework) | ||
| - **tutorials/** — Step-by-step learning (competence-building). Themed around a biomedical researcher working with Alzheimer's Disease data. Progressive build-up: Project → Folder → File → Annotations → etc. | ||
| - **guides/** — How-to guides for specific use cases (problem-solution oriented). Includes extension-specific guides (curator). | ||
| - **reference/** — API reference auto-generated from docstrings via mkdocstrings. Split into `experimental/sync/` and `experimental/async/` for new OOP API. | ||
| - **explanations/** — Deep conceptual content ("why" not just "how"). Design decisions, internal machinery. | ||
|
|
||
| ### File inclusion pattern (markdown-include) | ||
| Tutorial code lives in `tutorials/python/tutorial_scripts/*.py` and is embedded in markdown via line-range includes: | ||
| ```markdown | ||
| {!docs/tutorials/python/tutorial_scripts/annotation.py!lines=5-23} | ||
| ``` | ||
| Single source of truth — edit the `.py` file, not the markdown. Changing line numbers in scripts requires updating the line ranges in the corresponding `.md` files. | ||
|
|
||
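Because line ranges drift silently when a script is edited, a small checker can help. The sketch below assumes the directive syntax shown above (`{!path!lines=START-END}`) and takes a mapping of script path to line count so it stays free of filesystem access. `find_stale_includes` is a hypothetical helper, not part of the repository, and it only catches ranges that fall outside the file, not ranges that now point at the wrong lines.

```python
import re

# Matches directives such as {!docs/path/script.py!lines=5-23}
INCLUDE_PATTERN = re.compile(
    r"\{!(?P<path>[^!{}]+)!lines=(?P<start>\d+)-(?P<end>\d+)\}"
)


def find_stale_includes(markdown_text: str, line_counts: dict[str, int]) -> list[str]:
    """Return one message per include whose range no longer fits its script."""
    problems = []
    for match in INCLUDE_PATTERN.finditer(markdown_text):
        path = match["path"]
        start, end = int(match["start"]), int(match["end"])
        total = line_counts.get(path)
        if total is None:
            problems.append(f"{path}: script not found")
        elif start < 1 or start > end or end > total:
            problems.append(f"{path}: lines={start}-{end} does not fit {total} lines")
    return problems
```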
| ### mkdocstrings reference generation | ||
| Reference markdown files use `::: synapseclient.ClassName` syntax to trigger auto-generation from docstrings. Key configuration: | ||
| - `docstring_style: google` — parse Google-style docstrings | ||
| - `members_order: source` — preserve source code order | ||
| - `filters: ["!^_", "!to_synapse_request", "!fill_from_dict"]` — private members, `to_synapse_request()`, and `fill_from_dict()` are excluded from docs | ||
| - `inherited_members: true` — shows mixin methods on inheriting classes | ||
| - Member lists are explicit — each reference page specifies which methods to document | ||
|
|
||
| ### Anchor links for cross-referencing | ||
| Pattern: `[](){ #reference-anchor }` in reference pages. Tutorials link to reference via `[API Reference][project-reference-sync]`. Explicit links to API objects use: `[syn.login][synapseclient.Synapse.login]`. | ||
|
|
||
| ### termynal CLI animations | ||
| Terminal animation blocks marked with `<!-- termynal -->` HTML comment. Prompts configured as `$` or `>`. Used in authentication.md and installation docs. | ||
|
|
||
| ### Custom CSS (`css/custom.css`) | ||
| - API reference indentation: `doc-contents` has 25px left padding with border | ||
| - Smaller table font (0.7rem) for API docs | ||
| - Wide layout: `max-width: 1700px` for complex content | ||
|
|
||
| ### Navigation structure | ||
| Defined in `mkdocs.yml` nav section. 6 main sections: Home, Tutorials, How-To Guides, API Reference, Further Reading, News. API Reference has ~85 markdown files (~40 legacy, ~45 experimental). | ||
|
|
||
| ## Constraints | ||
|
|
||
| - Do not edit tutorial code inline in markdown — edit the `.py` script file in `tutorial_scripts/` and update line ranges if needed. | ||
| - Reference docs auto-generate from source docstrings — to change method documentation, edit the docstring in the Python source, not the markdown. | ||
| - `mkdocs.yml` is at the repo root, not in `docs/` — it configures the entire doc build. | ||
| - Docs deploy to Read the Docs (configured via `.readthedocs.yaml` at repo root). | ||
| - Local build output goes to `docs_site/` (via `site_dir` in `mkdocs.yml`) — gitignored. | ||
| - Cross-referencing uses the `autorefs` plugin: `[display text][synapseclient.ClassName.method]` auto-resolves to mkdocstrings anchors. | ||
|
|
||
| ### news.md | ||
| Release notes live in `docs/news.md`. Each release gets a heading with the version number and date, followed by bullet points describing changes. Group entries by category (Features, Bug Fixes, etc.). Reference Jira ticket numbers (SYNPY-XXXX) in each entry. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,83 @@ | ||
| <!-- Last reviewed: 2026-03 --> | ||
|
|
||
| ## Project | ||
|
|
||
| REST API service layer — thin async functions that map to Synapse REST endpoints. One file per resource type. Called by model layer, never by end users directly. | ||
|
|
||
| ## Conventions | ||
|
|
||
| ### Function signature pattern | ||
| ```python | ||
| async def verb_resource( | ||
| required_param: str, | ||
| optional_param: Optional[str] = None, | ||
| *, | ||
| synapse_client: Optional["Synapse"] = None, | ||
| ) -> Dict[str, Any]: | ||
| ``` | ||
| - All functions are `async def` | ||
| - `synapse_client` is **always** `Optional["Synapse"] = None` — never make it required. Callers omit it to use the cached singleton returned by `Synapse.get_client()`. | ||
| - `synapse_client` is always the last parameter, keyword-only (after `*`) | ||
| - Use `Synapse.get_client(synapse_client=synapse_client)` to get the client instance | ||
| - Use `TYPE_CHECKING` guard for `Synapse` import — avoids circular dependencies between `api/` and `client.py` | ||
| - Construct a `query_params` dictionary for non-null optional args, and pass it to the `params` arg of the REST call. See `entity_services.py` for the pattern. | ||
|
|
||
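The bullets above can be sketched end to end as follows. `FakeClient`, `get_resource_sketch`, the `/resource/...` URI, and the query parameter names are all hypothetical; the real code obtains the client through `Synapse.get_client(synapse_client=synapse_client)` rather than constructing one.

```python
import asyncio
from typing import Any, Optional


class FakeClient:
    """Hypothetical stand-in for the real client; echoes the request back."""

    async def rest_get_async(
        self, uri: str, params: Optional[dict[str, Any]] = None
    ) -> dict[str, Any]:
        return {"uri": uri, "params": params or {}}


async def get_resource_sketch(
    resource_id: str,
    version_number: Optional[int] = None,
    include_details: Optional[bool] = None,
    *,
    synapse_client: Optional[FakeClient] = None,
) -> dict[str, Any]:
    # Assumption: a fresh fake client replaces the cached-singleton lookup.
    client = synapse_client or FakeClient()

    # Only non-null optional arguments become query parameters.
    query_params: dict[str, Any] = {}
    if version_number is not None:
        query_params["versionNumber"] = version_number
    if include_details is not None:
        query_params["includeDetails"] = include_details

    response = await client.rest_get_async(
        uri=f"/resource/{resource_id}", params=query_params
    )
    return response
```

Checking `is not None` rather than truthiness matters here: `include_details=False` is a real value and must still reach the server.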
| ### Docstring conventions | ||
| Module-level — every file opens with boilerplate linking to the Synapse REST controller: | ||
| ```python | ||
| """This module is responsible for exposing the services defined at: | ||
| <https://rest-docs.synapse.org/rest/#org.sagebionetworks.repo.web.controller.XController> | ||
| """ | ||
| ``` | ||
| Function-level (Google style): | ||
| ```python | ||
| """ | ||
| One-line summary. | ||
|
|
||
| <https://rest-docs.synapse.org/rest/POST/endpoint.html> | ||
|
|
||
| Arguments: | ||
| param: Description. | ||
| synapse_client: If not passed in and caching was not disabled by | ||
| `Synapse.allow_client_caching(False)` this will use the last created | ||
| instance from the Synapse class constructor. | ||
|
|
||
| Returns: | ||
| Description of return value. | ||
| """ | ||
| ``` | ||
| - The `synapse_client` argument description is boilerplate — always copy it verbatim, not paraphrased. | ||
| - The REST endpoint URL uses `<link>` format (angle brackets), not markdown `[text](url)`. | ||
| - Parameter descriptions in `Arguments:` must be copied verbatim from the Synapse REST API docs for that endpoint — do not paraphrase or infer. | ||
|
|
||
| ### REST call pattern | ||
| ```python | ||
| client = Synapse.get_client(synapse_client=synapse_client) | ||
| response = await client.rest_post_async(uri="/endpoint", body=json.dumps(request)) | ||
| return response | ||
| ``` | ||
| Available methods: `rest_get_async`, `rest_post_async`, `rest_put_async`, `rest_delete_async`. Pass `endpoint=client.fileHandleEndpoint` for file handle operations; omit for the default repository endpoint. Use `json.dumps()` for request bodies — not raw dicts. Always assign the response to a named `response` variable before returning or extracting attributes from it. | ||
|
|
||
| ### Return values | ||
| - Most functions return raw `Dict[str, Any]` — transformation happens in the model layer via `fill_from_dict()` | ||
| - Some return typed dataclass instances (e.g., `EntityHeader` from `entity_services.py`) when the data is only used internally | ||
| - Delete operations return `None` | ||
|
|
||
| ### Pagination | ||
| Use async pagination helpers when the API endpoint returns a list of results. For single-object responses, a simple `return` is sufficient. | ||
|
|
||
| Helpers from `api_client.py`: | ||
| - `rest_get_paginated_async()` — for GET endpoints with limit/offset. Expects `results` or `children` key in response. | ||
| - `rest_post_paginated_async()` — for POST endpoints with `nextPageToken`. Expects `page` array in response. | ||
| Both are async generators yielding individual items. Reference `entity_services.py`, `table_services.py`, or `evaluation_services.py` for pagination patterns. | ||
|
|
||
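To show the two pagination styles side by side, here is a self-contained sketch. It does not call the real helpers in `api_client.py`; `paginate_offset`, `paginate_token`, and the fake fetch functions are hypothetical, and the stopping rule for offset paging (a short page ends the loop) is an assumption.

```python
import asyncio
from typing import Any, AsyncGenerator, Awaitable, Callable, Optional

DATA = list(range(5))  # pretend server-side result set


async def paginate_offset(
    fetch: Callable[[int, int], Awaitable[dict[str, Any]]], limit: int = 2
) -> AsyncGenerator[Any, None]:
    """Yield items from a limit/offset endpoint that returns `results` or `children`."""
    offset = 0
    while True:
        page = await fetch(limit, offset)
        items = page.get("results") or page.get("children") or []
        for item in items:
            yield item
        if len(items) < limit:  # assumption: a short page means no more data
            break
        offset += limit


async def paginate_token(
    fetch: Callable[[Optional[str]], Awaitable[dict[str, Any]]]
) -> AsyncGenerator[Any, None]:
    """Yield items from an endpoint that returns `page` plus `nextPageToken`."""
    token: Optional[str] = None
    while True:
        page = await fetch(token)
        for item in page.get("page", []):
            yield item
        token = page.get("nextPageToken")
        if not token:
            break


async def fake_offset_fetch(limit: int, offset: int) -> dict[str, Any]:
    return {"results": DATA[offset : offset + limit]}


async def fake_token_fetch(token: Optional[str]) -> dict[str, Any]:
    start = int(token) if token else 0
    end = start + 2
    page: dict[str, Any] = {"page": DATA[start:end]}
    if end < len(DATA):
        page["nextPageToken"] = str(end)
    return page


async def collect(items: AsyncGenerator[Any, None]) -> list[Any]:
    return [item async for item in items]
```

Both generators hide the paging mechanics, so a caller simply writes `async for item in paginate_token(fetch): ...`.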
| ### Entity factory (`entity_factory.py`) | ||
| Polymorphic entity deserialization via concrete type dispatch. Maps Java class names from `core/constants/concrete_types.py` to model classes. When adding a new entity type, register the type mapping here. | ||
|
|
||
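Concrete type dispatch can be sketched as a registry lookup keyed on the `concreteType` field of the response. The model classes and `entity_from_dict` below are hypothetical; the real factory and its registration mechanism may be structured differently.

```python
from dataclasses import dataclass
from typing import Any, Optional

FILE_ENTITY = "org.sagebionetworks.repo.model.FileEntity"
FOLDER_ENTITY = "org.sagebionetworks.repo.model.Folder"


@dataclass
class FileSketch:
    """Hypothetical model class."""

    id: Optional[str] = None

    def fill_from_dict(self, data: dict[str, Any]) -> "FileSketch":
        self.id = data.get("id")
        return self


@dataclass
class FolderModelSketch:
    """Hypothetical model class."""

    id: Optional[str] = None

    def fill_from_dict(self, data: dict[str, Any]) -> "FolderModelSketch":
        self.id = data.get("id")
        return self


# Adding a new entity type means adding one entry here (plus its constant).
REGISTRY = {FILE_ENTITY: FileSketch, FOLDER_ENTITY: FolderModelSketch}


def entity_from_dict(data: dict[str, Any]):
    """Pick the model class from `concreteType`, then deserialize into it."""
    concrete_type = data.get("concreteType")
    model_cls = REGISTRY.get(concrete_type)
    if model_cls is None:
        raise ValueError(f"Unsupported concrete type: {concrete_type!r}")
    return model_cls().fill_from_dict(data)
```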
| ### When to add a new service file vs. update an existing one | ||
| Add a new file when the Synapse REST controller is different (each file maps to one controller). Update an existing file when adding endpoints under the same controller. | ||
|
|
||
| ### Adding a new service file | ||
| 1. Create `synapseclient/api/new_service.py` | ||
| 2. Add all public functions to `api/__init__.py` imports and `__all__` — every public function must be re-exported | ||
| 3. Use `json.dumps()` for request bodies (not dict) | ||
| 4. Reference `entity_services.py` for CRUD pattern, `table_services.py` or `evaluation_services.py` for pagination pattern | ||