15 changes: 13 additions & 2 deletions docs/source/process_data/version_pipelines.md
@@ -6,8 +6,19 @@ Users need to understand how to interact with computed results produced by data

Core data processing pipelines MUST adopt [semantic versioning](https://semver.org/).
- Major version changes indicate that the structure or interpretation of the data has changed.

- Update to `aind-data-schema` that renames or restructures the metadata.
- Any default parameter changes.
- Changes to the output file structure.

Review comment (Member):

add "any code update that changes output file content, even if that is fixing previously incorrect behavior"

- Minor version changes indicate new, backwards compatible features were added to the pipeline.
- Patch version changes indicate bug fixes.

- Add a new parameter to the input arguments.
- Add a new QC plot.

- Patch version changes indicate backwards compatible bug fixes.

- Critical bug fixes that do not alter the data structure.

Review comment (Member):

I'd say "do not alter output contents". Content vs. structure is a big question here. Writing to NWB 2.1 instead of 2.0 (changing structure in a non-breaking way) might make the most sense as a minor version (essentially bumping a dependency, as in Sean's comment), but any change of content should be a major version (under strict semantic versioning, anyway).
In my view this is perhaps a reason not to use strict semantic versioning, though I'm a bit undecided here.

Review comment (Member):

also clearly an actual patch: bug fix that simply enables a job that was previously failing to generate results (since this isn't changing contents)
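
For reference, the bump rules proposed in this hunk can be sketched in Python. This is a minimal illustration and not part of the PR: the change labels in `CHANGE_LEVELS` and the function names `bump` and `required_bump` are hypothetical, and the mapping mirrors the diff as written, before the open question about content changes in the review comments is settled.

```python
import re

# Strict MAJOR.MINOR.PATCH, no pre-release or build suffixes (a simplification).
SEMVER_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")

# Hypothetical change labels, mapped to the bump level the diff text assigns them.
CHANGE_LEVELS = {
    "schema_rename_or_restructure": "major",
    "default_parameter_change": "major",
    "output_structure_change": "major",
    "new_input_parameter": "minor",
    "new_qc_plot": "minor",
    "bug_fix_same_structure": "patch",
}

LEVEL_ORDER = ["patch", "minor", "major"]


def bump(version: str, level: str) -> str:
    """Return the next version after a major, minor, or patch bump."""
    match = SEMVER_RE.match(version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump level: {level!r}")


def required_bump(changes: list[str]) -> str:
    """Return the highest bump level required by a set of change labels."""
    if not changes:
        raise ValueError("no changes given")
    levels = [CHANGE_LEVELS[change] for change in changes]
    return max(levels, key=LEVEL_ORDER.index)


release_changes = ["new_qc_plot", "bug_fix_same_structure"]
next_version = bump("1.4.2", required_bump(release_changes))
print(next_version)
```

A release that mixes several kinds of change takes the highest level among them, which is why `required_bump` reduces over the whole list rather than looking at one change at a time.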


The pipeline's name and semantic version MUST be stored in aind-data-schema [Processing](https://github.com/AllenNeuralDynamics/aind-data-schema/blob/dev/src/aind_data_schema/core/processing.py#L970) metadata at the top level of the results.

@@ -52,4 +63,4 @@ When querying the metadata database for `Processing.pipeline_version`, users and
- Semantic versions `< 0.2.0` (i.e., `0.1.0`, `0.1.1`)
- Code Ocean versions from before semantic versioning was adopted (i.e., `18.0`)
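
A consumer of `Processing.pipeline_version` may want to tell these two kinds of string apart before comparing them. The sketch below is one hedged way to do that in Python; the helper names `parse_semver` and `is_before` are hypothetical, and it assumes that only strict three-part strings count as semantic versions, so a two-part string such as `18.0` is treated as a legacy Code Ocean version.

```python
import re
from typing import Optional, Tuple

# Assumption: a semantic version is exactly three dot-separated integers.
SEMVER_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def parse_semver(version: str) -> Optional[Tuple[int, int, int]]:
    """Return (major, minor, patch), or None for a non-semantic version string."""
    match = SEMVER_RE.match(version.strip())
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def is_before(version: str, threshold: str) -> Optional[bool]:
    """Compare two semantic versions; None if `version` is not semantic."""
    limit = parse_semver(threshold)
    if limit is None:
        raise ValueError(f"threshold is not a semantic version: {threshold!r}")
    parsed = parse_semver(version)
    if parsed is None:
        return None  # e.g. a legacy Code Ocean version such as "18.0"
    return parsed < limit


for candidate in ["0.1.0", "0.1.1", "0.10.0", "18.0"]:
    print(candidate, is_before(candidate, "0.2.0"))
```

Comparing integer tuples rather than raw strings matters here: as strings, `"0.10.0"` sorts before `"0.2.0"`, while as versions it is later.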

For pipelines that have adopted semantic versioning, users and developers will always be able to find a pipeline's semantic version in the `nextflow.config`.
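
As a rough illustration of reading that version programmatically, the Python sketch below pulls a version string out of the text of a `nextflow.config`. It assumes the version is declared in Nextflow's `manifest` scope, either as a `manifest { version = '...' }` block or as `manifest.version = '...'`; the source only says the version lives in `nextflow.config`, so that location is an assumption. The function name `read_manifest_version` is hypothetical, and the regular expressions are a simplification, not a full Nextflow config parser.

```python
import re
from typing import Optional

# Assumption: the version is set in the `manifest` scope of nextflow.config.
DOTTED_RE = re.compile(r"""\bmanifest\.version\s*=\s*['"]([^'"]+)['"]""")
BLOCK_RE = re.compile(r"\bmanifest\s*\{(?P<body>[^}]*)\}", re.DOTALL)
# Case-sensitive on purpose, so `nextflowVersion` is not mistaken for `version`.
VERSION_RE = re.compile(r"""\bversion\s*=\s*['"]([^'"]+)['"]""")


def read_manifest_version(config_text: str) -> Optional[str]:
    """Return the manifest version found in the config text, or None."""
    dotted = DOTTED_RE.search(config_text)
    if dotted is not None:
        return dotted.group(1)
    block = BLOCK_RE.search(config_text)
    if block is not None:
        found = VERSION_RE.search(block.group("body"))
        if found is not None:
            return found.group(1)
    return None


# Made-up config contents, for illustration only.
sample_config = """
manifest {
    name = 'example-pipeline'
    nextflowVersion = '>=23.04'
    version = '1.2.3'
}
"""
print(read_manifest_version(sample_config))
```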