The TemplateFlow Archive is built on DataLad. It is therefore possible (and recommended for those who want to leverage the full power of DataLad) to access the Archive using DataLad alone.
Tip
Prefer the Python client for routine access
The high-level templateflow.client.TemplateFlowClient class can be
configured with use_datalad=True (or via templateflow config set
TEMPLATEFLOW_USE_DATALAD 1) to transparently manage the cache with
DataLad. This is a good default when you want TemplateFlow to perform
datalad get and datalad update commands automatically while still
interacting with the archive through the familiar client API.
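As a rough illustration of the tip above, the sketch below enables DataLad-managed caching through the TEMPLATEFLOW_USE_DATALAD environment variable and then queries the archive with templateflow.api.get. It assumes that the variable is read when templateflow is first imported, which is why the import is deferred. The helper name fetch_t1w and its default arguments are hypothetical and chosen only for this example.

```python
import os

# Assumption: TemplateFlow reads this variable when it is first imported,
# so it has to be set before any `import templateflow` statement runs.
os.environ["TEMPLATEFLOW_USE_DATALAD"] = "1"


def fetch_t1w(template="MNI152Lin", resolution=1):
    """Return the local path(s) of a template's T1w image.

    Hypothetical helper: with DataLad enabled, TemplateFlow is expected to
    fetch missing content on demand. Requires templateflow, DataLad and
    network access the first time a file is requested.
    """
    # Deferred import so that the environment variable above takes effect.
    from templateflow import api

    return api.get(template, resolution=resolution, suffix="T1w")
```

Calling fetch_t1w() should return a path inside the TemplateFlow cache, or a list of paths if several files match the query.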
Note
Drop down to raw DataLad when you need full control
Advanced workflows—such as pinning remotes, creating custom siblings, or
operating entirely offline—are better served by running DataLad commands
directly. In those scenarios leave use_datalad disabled in the client
and use the instructions below to operate on the repository yourself.
The archive is indexed by a superdataset, which can be installed with:
$ datalad install -r https://github.com/templateflow/templateflow.git
or just:
$ datalad install -r ///templateflow
Please note the -r (--recursive) flag, which also installs all the
subdatasets automatically.
Here, the subdatasets (sub-folders) are the individual templates,
identified by the tpl- prefix.
If the operation finishes successfully, you should be able to change into the
templateflow directory and see something like:
$ cd templateflow/
$ ls -lh
total 76K
-rw-rw-r--  1 oesteban oesteban  122 Sep  8 10:42 dataset_description.json
drwxrwxr-x  4 oesteban oesteban 4.0K Sep  8 10:43 tpl-fsaverage
drwxrwxr-x  5 oesteban oesteban 4.0K Sep  8 10:43 tpl-fsLR
drwxrwxr-x  5 oesteban oesteban 4.0K Sep  8 10:42 tpl-MNI152Lin
drwxrwxr-x  5 oesteban oesteban  16K Sep  8 10:42 tpl-MNI152NLin2009cAsym
drwxrwxr-x  5 oesteban oesteban 4.0K Sep  8 10:42 tpl-MNI152NLin2009cSym
drwxrwxr-x  5 oesteban oesteban  12K Sep  8 10:42 tpl-MNI152NLin6Asym
drwxrwxr-x  5 oesteban oesteban 4.0K Sep  8 10:42 tpl-MNI152NLin6Sym
drwxrwxr-x 16 oesteban oesteban 4.0K Sep  8 10:43 tpl-MNIInfant
drwxrwxr-x 11 oesteban oesteban 4.0K Sep  8 10:43 tpl-MNIPediatricAsym
drwxrwxr-x  5 oesteban oesteban 4.0K Sep  8 10:43 tpl-NKI
drwxrwxr-x  4 oesteban oesteban 4.0K Sep  8 10:43 tpl-OASIS30ANTs
drwxrwxr-x  4 oesteban oesteban 4.0K Sep  8 10:43 tpl-PNC
drwxrwxr-x  5 oesteban oesteban 4.0K Sep  8 10:43 tpl-WHS
Important
The DataLad install operation DOES NOT download the data. Please see how to get the data below.
Before going ahead, make sure you understand how DataLad works. Once the TemplateFlow superdataset has been installed, as well as all or some of the subdatasets, it is possible to access data. For example, pulling down all T1-weighted NIfTI images of all datasets would look like:
$ find . -name "*_T1w.nii.gz" -exec datalad get {} +
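The find command above can be mirrored in Python when the selection logic needs to live in a script. The following is a minimal sketch, not part of TemplateFlow or DataLad: the function names find_annexed and get_files are hypothetical, and get_files assumes the datalad executable is available on the PATH.

```python
import os
import subprocess
from fnmatch import fnmatchcase


def find_annexed(root, pattern="*_T1w.nii.gz"):
    """Return sorted paths (relative to root) whose file name matches pattern."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Git internals, where annexed objects are stored.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            # os.walk lists symlinks to files (including broken ones) here,
            # so annexed files are found even before their content is fetched.
            if fnmatchcase(name, pattern):
                matches.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(matches)


def get_files(root, pattern="*_T1w.nii.gz"):
    """Fetch all matching files with a single `datalad get` call."""
    paths = find_annexed(root, pattern)
    if paths:
        # Runs from within the superdataset, like the shell command above.
        subprocess.run(["datalad", "get", *paths], check=True, cwd=root)
    return paths
```

For instance, get_files("templateflow") would request every T1-weighted image in the installed superdataset, while find_annexed alone only lists the candidates without downloading anything.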
Let's unpack what happened.
DataLad (or, more precisely, the git-annex tool working under the hood)
replaces large files with symbolic links. Each link points to a path
under .git/annex/objects named after a key, which git-annex uses to
locate and retrieve the actual content.
This technique ("annexing" to Git) keeps the actual files outside the
version control system, which (unless set up with a special extension
such as LFS) is not well suited to tracking large data files.
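To make the role of the key more concrete, the sketch below splits the last component of an annex link target (such as the ones printed by tree in this section) into its parts. It relies on the general git-annex key layout, in which the backend name comes first, an optional -s field carries the size in bytes, and the key name follows a double hyphen. The function parse_annex_key is a hypothetical helper written for this example, not a git-annex or DataLad API.

```python
from pathlib import PurePosixPath


def parse_annex_key(link_target):
    """Split an annex symlink target into (backend, size_in_bytes, name)."""
    # The key is the final path component of the symlink target.
    key = PurePosixPath(link_target).name
    # Everything before the first "--" holds the backend and optional fields.
    head, _, name = key.partition("--")
    backend, *fields = head.split("-")
    # The lowercase "s" field, when present, is the content size in bytes.
    size = next((int(f[1:]) for f in fields if f.startswith("s")), None)
    return backend, size, name
```

Applied to the target of tpl-MNI152Lin_res-01_desc-brain_mask.nii.gz, this yields the URL backend and a size of 131839 bytes, which shows that the link itself records where the content comes from and how large it is, without containing the data.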
Because annexed files remain visible in the file tree (as symbolic links),
it is possible to search for them with tools like find or tree:
$ tree tpl-MNI152Lin
tpl-MNI152Lin
├── CHANGES
├── LICENSE
├── scripts
│   ├── headmask.py
│   ├── normalize.py
│   └── sanitize.py
├── template_description.json
├── tpl-MNI152Lin_res-01_desc-brain_mask.nii.gz -> .git/annex/objects/J4/J9/URL-s131839--https&c%%files.osf.io%v1%resourc-4a92beb360af57cc397642c99e4f34ee/URL-s131839--https&c%%files.osf.io%v1%resourc-4a92beb360af57cc397642c99e4f34ee
├── tpl-MNI152Lin_res-01_desc-head_mask.nii.gz -> .git/annex/objects/j3/Jw/URL-s168509--https&c%%files.osf.io%v1%resourc-2e366aff039e485ce73875dd1fc912fd/URL-s168509--https&c%%files.osf.io%v1%resourc-2e366aff039e485ce73875dd1fc912fd
├── tpl-MNI152Lin_res-01_PD.nii.gz -> .git/annex/objects/5m/4z/URL-s10250635--https&c%%files.osf.io%v1%resourc-d38cc6938c26e9389a1a9acf03f5a4b6/URL-s10250635--https&c%%files.osf.io%v1%resourc-d38cc6938c26e9389a1a9acf03f5a4b6
├── tpl-MNI152Lin_res-01_T1w.nii.gz -> .git/annex/objects/pM/Fm/URL-s10669511--https&c%%files.osf.io%v1%resourc-2e59511114a1686f937e0127af887b83/URL-s10669511--https&c%%files.osf.io%v1%resourc-2e59511114a1686f937e0127af887b83
├── tpl-MNI152Lin_res-01_T2w.nii.gz -> .git/annex/objects/63/jK/URL-s10096230--https&c%%files.osf.io%v1%resourc-7ee9c493542a55d96d28d55d57a3ee52/URL-s10096230--https&c%%files.osf.io%v1%resourc-7ee9c493542a55d96d28d55d57a3ee52
├── tpl-MNI152Lin_res-02_desc-brain_mask.nii.gz -> .git/annex/objects/vj/pW/URL-s25649--https&c%%files.osf.io%v1%resourc-ebe0f869bd33c9dd7d983a73f7704326/URL-s25649--https&c%%files.osf.io%v1%resourc-ebe0f869bd33c9dd7d983a73f7704326
├── tpl-MNI152Lin_res-02_desc-head_mask.nii.gz -> .git/annex/objects/7q/gF/URL-s32857--https&c%%files.osf.io%v1%resourc-4c79972ef82dfaa9070522b558a8411c/URL-s32857--https&c%%files.osf.io%v1%resourc-4c79972ef82dfaa9070522b558a8411c
├── tpl-MNI152Lin_res-02_PD.nii.gz -> .git/annex/objects/1m/jq/URL-s1411464--https&c%%files.osf.io%v1%resourc-95c7dabef32603e9f1d4f3f9cb92b800/URL-s1411464--https&c%%files.osf.io%v1%resourc-95c7dabef32603e9f1d4f3f9cb92b800
├── tpl-MNI152Lin_res-02_T1w.nii.gz -> .git/annex/objects/Wf/Fx/URL-s1448817--https&c%%files.osf.io%v1%resourc-2ba5a81206dff8bbf84fb319ed1d7201/URL-s1448817--https&c%%files.osf.io%v1%resourc-2ba5a81206dff8bbf84fb319ed1d7201
└── tpl-MNI152Lin_res-02_T2w.nii.gz -> .git/annex/objects/X8/Fv/URL-s1375781--https&c%%files.osf.io%v1%resourc-6f1f3ad0441ef1200307a70b32b4f303/URL-s1375781--https&c%%files.osf.io%v1%resourc-6f1f3ad0441ef1200307a70b32b4f303

1 directory, 16 files
If your terminal supports colored output, you will also see that only the two
links ending with _T1w.nii.gz are not "broken".
This is because we ran datalad get on both of them in the previous step.
DataLad only pulls the actual file objects when they are requested.
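When colored terminal output is not available, the same distinction between fetched and not-yet-fetched files can be checked programmatically. The sketch below assumes the dataset uses symbolic links for annexed files, as in the listing above; datasets in an unlocked or adjusted mode would need a different check. The function name annex_status is hypothetical.

```python
from pathlib import Path


def annex_status(dataset_dir):
    """Map each symlinked file name in dataset_dir to whether its content is present."""
    status = {}
    for path in sorted(Path(dataset_dir).iterdir()):
        if path.is_symlink():
            # Path.exists() follows the link, so it is False for a "broken"
            # link, that is, an annexed file whose content was never fetched.
            status[path.name] = path.exists()
    return status
```

Running annex_status("tpl-MNI152Lin") after the datalad get step should report True only for the two _T1w.nii.gz entries and False for the remaining annexed images.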