
Commit 22a3762

Docs: instrument metadata documentation (#16)

* First pass instrument metadata documentation
* Added details on instrument in the database
* Added notes on fetching/saving always being latest versions
* Reordered sections
* Added link to merging section
* docs: simplify writing for link to merge
* docs: update w/ David's language
* docs: fix header levels
* docs: simplify write language
* docs: simplify fetch section
* docs: tweak some language in details sections
* Added note about ID conflict
* Added merging rule link
* Explain date conflict and replace arg
* Clarify 2.0 only support for database storage
* Replaced 'rig' with 'instrument' when describing ID

Co-authored-by: Dan Birman <danbirman@gmail.com>

1 parent 3a8af48 commit 22a3762

1 file changed

Lines changed: 94 additions & 5 deletions

docs/source/acquire_upload/prepare_before_acquisition.md
## Instrument

[Instrument](https://aind-data-schema.readthedocs.io/en/latest/instrument.html) metadata should be prepared in advance of data acquisition.

### ID

The `instrument_id` for AIND should be the SIPE ID for an instrument. If an instrument is not tracked by SIPE, any string will be accepted.

### Other details

#### Multiple instruments

Multiple `instrument.json` files can be provided when two separate instruments are used simultaneously to acquire a data asset. See [metadata merging rules](upload.md#metadata-merging-rules) for information about how metadata files are merged during data upload.

#### Upload options

Users have two options for providing instrument metadata files:

1) Files can be provided at upload time in the data folder. In this case, it is up to users to ensure that the instrument file(s) are in the data folder when upload is triggered. Users are free to set this up however they choose. Two patterns that have been used are:

   * A static instrument metadata file is saved somewhere on the data acquisition machine and is copied into the data folder prior to upload.
   * A script is run that dynamically generates an instrument metadata file before upload.

2) A static version of the instrument metadata is uploaded to a database in advance (see details below). In this case, users must specify the `instrument_id` as part of the job parameters in the `gather_preliminary_metadata` job type settings as follows:

```
{
    "skip_task": false,
    "job_settings": {
        "instrument_settings": {
            "instrument_id": INSTRUMENT_ID  # a string containing a valid instrument ID
        }
    },
    ...
}
```

The data transfer service will then pull the instrument metadata from the database during upload.
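As an illustration, the settings fragment above could be assembled by a script rather than written by hand. In this sketch, `"rig-123"` is a hypothetical instrument ID, and any fields of the full job schema outside the fragment shown above are omitted:

```python
import json

# Sketch: assemble the gather_preliminary_metadata settings shown above.
# "rig-123" is a hypothetical instrument ID; the real job schema has more
# fields than this fragment.
instrument_id = "rig-123"

settings = {
    "skip_task": False,
    "job_settings": {
        "instrument_settings": {
            "instrument_id": instrument_id,
        }
    },
}

settings_json = json.dumps(settings, indent=4)
```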

Note that it is possible to combine these methods. For example, a user could pass the instrument JSON for the behavior instrument in the data directory (named something like `instrument_behavior.json`) and also specify a physiology rig by instrument ID in the `gather_preliminary_metadata` job type settings. The two instrument files would be merged by the data transfer service. See [metadata merging rules](upload.md#metadata-merging-rules).

Also note that we currently require all instruments in the database to have a unique `instrument_id`. It is therefore not possible to store two distinct modality-specific `instrument.json` files that share an `instrument_id` in the database.

### Maintenance responsibility

While it is ultimately the responsibility of the scientist collecting data to ensure that all metadata is correct, it is the responsibility of the person who modifies an instrument to update the instrument metadata to reflect the changes they made.

### How to

The following sections describe common use cases for saving, fetching, editing, and creating instrument metadata files.

#### I want to write an instrument.json

Instrument JSON files should be created by a Python script using models from the [`aind-data-schema`](https://github.com/AllenNeuralDynamics/aind-data-schema) library to ensure the output file is valid according to the schema (as opposed to writing JSON directly). There are multiple examples of Python scripts for generating instrument JSON files in the [data schema examples folder](https://github.com/AllenNeuralDynamics/aind-data-schema/tree/dev/examples).
We recommend that basic maintenance changes, e.g. replacing a device with an identical one but with a different serial number, be done by modifying the Python script and updating the `Instrument.modification_date`.
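The regenerate-and-bump-the-date pattern can be sketched with the standard library alone. The record fields below are placeholders for illustration; a real `instrument.json` is built from `aind-data-schema` models and has many more required fields:

```python
import json
from datetime import date

def write_instrument(path, instrument):
    """Stamp today's date into modification_date and write the file."""
    record = {**instrument, "modification_date": date.today().isoformat()}
    with open(path, "w") as f:
        json.dump(record, f, indent=2)
    return record

# Hypothetical minimal contents for illustration only; the real schema
# requires many more fields.
record = write_instrument(
    "instrument.json",
    {"instrument_id": "rig-123", "components": []},
)
```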

#### I'm ready to upload my instrument JSON file to the database

If you want to store your instrument metadata file in the Scientific Computing managed database (only 2.0 schema instrument files are supported), follow these steps to post your instrument JSON file to the database.

Note that you currently must have the `release-v1.0.0` branch of `aind-metadata-mapper` installed:

```bash
git clone https://github.com/AllenNeuralDynamics/aind-metadata-mapper.git
cd aind-metadata-mapper
git checkout release-v1.0.0
conda create -n instrument_uploader  # or whatever you want your env to be called
conda activate instrument_uploader
pip install -e .
```

Then run the following in Python:

```python
from aind_metadata_mapper import utils
from aind_data_schema.core.instrument import Instrument

instrument_path = "instrument.json"  # path to your instrument file

# Load the JSON as an Instrument object
with open(instrument_path, "r") as f:
    instrument_object = Instrument.model_validate_json(f.read())

# Save the instrument to the database
utils.save_instrument(instrument_object)
```

The `modification_date` field will be automatically updated to the current date when the instrument file is uploaded. There is currently a check on uniqueness by date, so uploading more than one `instrument.json` per day (for example, if you make a mistake and try to upload a second time) will result in an error. If you do need to upload a second time in a day, you will need to overwrite the previous instrument by passing the `replace=True` argument to `utils.save_instrument()`.
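The date-conflict behavior can be modeled with a small sketch. This is an illustration of the rule, not the service's actual implementation:

```python
from datetime import date

class DuplicateDateError(Exception):
    """Raised when an instrument is uploaded twice on the same date."""

def save_instrument(store, instrument, replace=False):
    # store maps (instrument_id, date) -> record; at most one upload per
    # instrument per day unless replace=True overwrites the existing entry.
    # This mimics the service's uniqueness-by-date check for illustration.
    key = (instrument["instrument_id"], date.today().isoformat())
    if key in store and not replace:
        raise DuplicateDateError(f"{key} already exists; pass replace=True")
    store[key] = instrument
```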

#### I want to get an instrument from the database

During data upload, your `instrument.json` can be fetched automatically by the GatherMetadataJob. If you need to see the file you uploaded locally, you can fetch the most recent `instrument.json`, sorted by `Instrument.modification_date`:

```python
from aind_metadata_mapper import utils

# Fetch the instrument, where INSTRUMENT_ID is a string containing the instrument ID
instrument_data = utils.get_instrument(INSTRUMENT_ID)
```

If you need access to an older version of an instrument metadata file from the database, please reach out to someone in Scientific Computing for assistance.
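The "most recent by `modification_date`" selection is easy to reproduce locally if you ever have several copies on disk: ISO-8601 date strings sort lexicographically, so `max` on the string works. The records below are made up for illustration:

```python
# Hypothetical local records; ISO-8601 date strings sort lexicographically,
# so max() on the string picks the most recent modification_date.
records = [
    {"instrument_id": "rig-123", "modification_date": "2024-01-05"},
    {"instrument_id": "rig-123", "modification_date": "2024-03-12"},
]
latest = max(records, key=lambda r: r["modification_date"])
```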

## Procedures
