| 1 | +<Breadcrumbs |
| 2 | + breadcrumbs={[ |
| 3 | + { path: "/datasets", text: "AnVIL Data Explorer" }, |
| 4 | + { path: "/guides", text: "Guides" }, |
| 5 | + { path: "", text: "TSV File Manifest Download" }, |
| 6 | + ]} |
| 7 | +/> |
| 8 | + |
| 9 | +# TSV File Manifest Download |
| 10 | + |
| 11 | +Manifest downloads are available for all of the datasets listed in the AnVIL Data Explorer, including both open-access and managed-access datasets. A tab-separated-value file (.tsv) is generated based on the data selected. |
| 12 | + |
| 13 | +The downloaded manifest contains many columns. Which ones you need depends on how the data will be accessed and used; some of the key columns are: |
| 14 | + |
| 15 | +- **datasets.title**, which contains the name of the dataset that the file belongs to. |
| 16 | + - A manifest can contain files from multiple datasets, depending on how the manifest is generated. |
| 17 | +- **datasets.consent_group** and **datasets.data_use_permission**, which contain the dataset's consent and use codes. |
| 18 | +- **files.file_size**, which contains the file size in bytes. |
| 19 | +- **files.name**, which contains the file name. |
| 20 | +- **files.drs_url**, which contains the DRS URL for use within the Terra environment. |
| 21 | +- **files.azul_url**, which is a URL that allows HTTP access to the individual file. |
| 22 | + - Files in open-access datasets are available via this link. |
| 23 | + - At this time, AnVIL requires requester-pays for managed-access datasets, so the files are not accessible through this URL. |
| 24 | +- **files.azul_mirror_url**, which contains the Amazon Web Services (AWS) S3 URI for that file. |
| 25 | + - The file name in the bucket is a hash rather than the original file name, so that duplicate files are stored only once. |
| 26 | + - This field is blank if the file is not available through the AWS Open Data Sponsorship Program. |
| 27 | + |
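As a rough illustration of how these columns can be used together, the following Python sketch totals `files.file_size` per dataset title. It relies only on the standard library. The sample rows are hypothetical, and the title column is looked up under either `datasets.title` or `dataset.title`, since the exact header spelling is an assumption that should be checked against your own manifest.

```python
import csv
import io
from collections import defaultdict

# Assumption: the title column may use either spelling; check your manifest's header.
TITLE_COLUMNS = ("datasets.title", "dataset.title")


def size_by_dataset(manifest_text):
    """Return {dataset title: total bytes} for a manifest given as a string."""
    reader = csv.DictReader(io.StringIO(manifest_text), delimiter="\t")
    header = reader.fieldnames or []
    title_col = next((c for c in TITLE_COLUMNS if c in header), None)
    if title_col is None or "files.file_size" not in header:
        raise ValueError("manifest is missing the expected title or size column")
    totals = defaultdict(int)
    for row in reader:
        size = row["files.file_size"]
        if size:  # skip rows where the size field is empty
            totals[row[title_col]] += int(size)
    return dict(totals)


# Hypothetical two-dataset manifest, for illustration only.
sample = (
    "datasets.title\tfiles.name\tfiles.file_size\n"
    "Dataset A\ta.vcf.gz\t100\n"
    "Dataset A\tb.vcf.gz\t250\n"
    "Dataset B\tc.cram\t4000\n"
)
print(size_by_dataset(sample))  # {'Dataset A': 350, 'Dataset B': 4000}
```

For a downloaded manifest, the file contents can be read into a string first and passed to `size_by_dataset` in the same way.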
| 28 | +## Example |
| 29 | + |
| 30 | +### Downloading The Manifest For A Single Dataset |
| 31 | + |
| 32 | +1. Visit the dataset of interest by clicking on the dataset name in the Data Explorer. |
| 33 | + |
| 34 | +<Figure |
| 35 | + alt="Visit the dataset of interest" |
| 36 | + src="/guides/dataset-manifest-download/single-dataset-download-01.webp" |
| 37 | + width="100%" |
| 38 | +/> |
| 39 | + |
| 40 | +2. On the dataset description page, click on the "Export" button in the upper right-hand corner of that page. |
| 41 | + |
| 42 | +<Figure |
| 43 | + alt="Click the Export button" |
| 44 | + src="/guides/dataset-manifest-download/single-dataset-download-02.webp" |
| 45 | + width="100%" |
| 46 | +/> |
| 47 | + |
| 48 | +3. Then click on "Download TSV Manifest" in the "Download" section near the bottom of the page. |
| 49 | + |
| 50 | +<Figure |
| 51 | + alt="Click Download TSV Manifest" |
| 52 | + src="/guides/dataset-manifest-download/single-dataset-download-03.webp" |
| 53 | + width="100%" |
| 54 | +/> |
| 55 | + |
| 56 | +4. This will display a screen to request the generation of the manifest. Click on the "Request Link" button. |
| 57 | + |
| 58 | +<Figure |
| 59 | + alt="Click the Request Link button" |
| 60 | + src="/guides/dataset-manifest-download/single-dataset-download-04.webp" |
| 61 | + width="100%" |
| 62 | +/> |
| 63 | + |
| 64 | +5. Once the manifest is generated, you can either download it directly by clicking the download icon or copy its URL by clicking the copy icon. |
| 65 | + |
| 66 | +<Figure |
| 67 | + alt="Download or copy the manifest link" |
| 68 | + src="/guides/dataset-manifest-download/single-dataset-download-05.webp" |
| 69 | + width="100%" |
| 70 | +/> |
| 71 | + |
| 72 | +The manifest can be viewed with any utility that can import tab-separated-value files, such as a spreadsheet application. It can also be processed with scripts as needed. |
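One possible script, sketched here with the Python standard library, separates the files that have an S3 mirror location from those whose `files.azul_mirror_url` field is blank. The function name `split_by_mirror` and the sample rows (including the bucket name and hash) are hypothetical and serve only as illustration.

```python
import csv
import io


def split_by_mirror(manifest_text):
    """Split manifest rows into (mirrored, unmirrored) lists.

    mirrored holds (file name, S3 URI) pairs; unmirrored holds the names of
    files whose files.azul_mirror_url field is blank or absent.
    """
    reader = csv.DictReader(io.StringIO(manifest_text), delimiter="\t")
    mirrored, unmirrored = [], []
    for row in reader:
        uri = (row.get("files.azul_mirror_url") or "").strip()
        if uri:
            mirrored.append((row["files.name"], uri))
        else:
            unmirrored.append(row["files.name"])
    return mirrored, unmirrored


# Hypothetical rows; the bucket name and hash are made up.
sample = (
    "files.name\tfiles.azul_mirror_url\n"
    "a.vcf.gz\ts3://example-bucket/0123abcd\n"
    "b.cram\t\n"
)
mirrored, unmirrored = split_by_mirror(sample)
print(mirrored)    # [('a.vcf.gz', 's3://example-bucket/0123abcd')]
print(unmirrored)  # ['b.cram']
```

Keeping the original file name next to each URI is useful because the object name in the bucket is a hash and does not reveal which file it holds.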
| 73 | + |
| 74 | +### Downloading A Manifest For Multiple Datasets |
| 75 | + |
| 76 | +Downloading a manifest for multiple datasets works the same way as for a single dataset, except for how you select the datasets. |
| 77 | + |
| 78 | +In this case, on the Data Explorer's main page, use the faceted search in the right-hand column to select the datasets of interest, then click the "Export" button at the top right of the page. |
| 79 | + |
| 80 | +<Figure |
| 81 | + alt="Select datasets and click Export" |
| 82 | + src="/guides/dataset-manifest-download/multiple-datasets-download-01.webp" |
| 83 | + width="100%" |
| 84 | +/> |
| 85 | + |
| 86 | +From this point on, the interface is the same as for the single-dataset download. Continue with Step 3 of the single-dataset instructions above. |