_episodes/01-introduction.md: 9 additions & 6 deletions
@@ -14,7 +14,7 @@ keypoints:
- "Rucio is the primary way to browse and access simulated EIC/ePIC data"
---

-##Simulation Campaigns
+# Simulation Campaigns

Simulations of a range of physics processes in the ePIC detector are typically run on a monthly basis by the Production Working Group. Information on simulation campaigns can be found on the [Production Working Group pages](https://eic.github.io/epic-prod/). This includes details of files produced in previous campaigns.
@@ -32,7 +32,7 @@ These are linked to specific software releases following the same format.
Various types of files are produced as part of the simulation campaign as we will discuss in the next section. The files you may wish to access will differ depending upon your use case. In this tutorial, we will explore a few different common use cases and the types of files you may want in each.

-###Submitting a New Simulation Request
+## Submitting a New Simulation Request

If you would like to submit a new request to a future campaign for a dataset that is not in production, please use the following process:
@@ -42,7 +42,7 @@ If you would like to submit a new request to a future campaign for a dataset tha
3. Once your input files are ready, submit a [simulation request form](https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.google.com_forms_d_e_1FAIpQLScDqiEaHayAcwBDGAWa4W6k-2D6yUFzS-2DXiWuhLolpy64mLk5FA_viewform&d=DwMFAg&c=CJqEzB1piLOyyvZjb8YUQw&r=1bclzxVlhTV419LkWWxwLTl3ztSqyuA_Q_Vnypx1RD4&m=q9b8IbAHm_MLsvy4XkI2Px2QKzFNqjpf0qc4nctB9ZHyf-uL5bZuiegs5-hwb-Ec&s=TFdsmJL2wPtUD-CCXVdkWIF5lxB1QYbz5MKGhB6nroA&e=).
- If your input is not pre-processed following the [pre-processing guidelines](https://github.com/eic/epic-prod/blob/main/docs/_documentation/input_preprocessing.md), it will not be simulated. Please review these carefully.

-##Simulation Files Organisation
+# Simulation Files Organisation

Within a simulation campaign, there are three broad classes of files that are produced:
- EVGEN: The input hepmc3 datasets
@@ -54,13 +54,16 @@ Within a simulation campaign, there are three broad classes of files that are pr
Most users and use cases will interact with RECO files, the output of the full simulation and reconstruction chain. We will explore some use cases and how to find the relevant files in each case.

-##How can I Browse the Simulation Campaign Output and Access Files?
+# How can I Browse the Simulation Campaign Output and Access Files?

To browse the campaign output and find the files we want, we can use [Rucio](https://rucio.cern.ch/). *Rucio* is an open source scientific data management system. It is utilised in other large physics experiments such as ATLAS.

-###Wait, I read I should use XrootD to find and access files?
+## Wait, I read I should use XrootD to find and access files?

-You may find reference to or instructions on using Xrootd to browse and access files.These may still work and indeed, we will use some of these commands later in this tutorial. However, Rucio is now the preferred method for the cases we will examine.
+You may find reference to or instructions on using [XrootD]({{ page.root }}{% link _extras/xrootd.md %}) to browse and access files. These may still work and indeed, we will use some of these commands later in this tutorial. However, Rucio is now the preferred method for the cases we will examine. **The recommended workflow is now:**
+
+1. Find file location with Rucio
+2. Stream or download with XrootD

Why? This change isn't just to make everybody learn something new; it is also a consequence of the growing volume of ePIC data now available. Previously (before 2026), all simulated data was stored on Jefferson Lab servers. However, data is now spread between multiple sites, which makes finding and accessing it using XrootD more complicated. Rucio deals with this "issue" in a straightforward way.
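
Because files can now live at more than one site, it helps to see which server a given file location points to. The following is a minimal Python sketch, not part of the lesson itself, that splits an XRootD URL into its server and file path. The helper name `split_xrootd_url` is hypothetical, and the URL is the placeholder form used in the XrootD extras page (`path-to-file` is not a real file).

```python
from urllib.parse import urlsplit


def split_xrootd_url(url: str) -> tuple[str, str]:
    """Split an XRootD URL into (server, absolute file path).

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if parts.scheme != "root":
        raise ValueError(f"not an XRootD URL: {url!r}")
    # XRootD URLs use a double slash after the host; collapse it to one.
    path = "/" + parts.path.lstrip("/")
    return parts.netloc, path


# Placeholder URL: "path-to-file" stands in for a real file name.
server, path = split_xrootd_url(
    "root://dtn-eic.jlab.org//volatile/eic/EPIC/RECO/path-to-file"
)
print(server)  # dtn-eic.jlab.org
print(path)    # /volatile/eic/EPIC/RECO/path-to-file
```

The server part is what changes when a file is hosted at a different site, which is why looking up the location with Rucio first is the safer habit.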
_episodes/02-rucio_usage.md: 6 additions & 6 deletions
@@ -15,7 +15,7 @@ keypoints:
- "Once you find the file location with Rucio, you can use xrootd to download or stream it too"
---

-##Getting Started
+# Getting Started

We can access and run the Rucio client from within eic-shell. From wherever you have eic-shell:
@@ -40,7 +40,7 @@ rucio -h
To use Rucio further, we will need to briefly look at how Rucio organises data.

-##Datasets and DIDs
+# Datasets and DIDs

Typically, we want to analyse data contained within specific files. Files can be grouped together into datasets, which can themselves be grouped into containers. All three refer to "data". As such, the term "data identifier" or **DID** is used in Rucio. A DID is just the name of a single file, dataset or container.
@@ -86,7 +86,7 @@ The `name` here - `/RECO/26.02.0/epic_craterlake/EXCLUSIVE/DEMP/DEMPgen-1.2.4/10
Other names may not necessarily contain all of the same information, but as a bare minimum, are likely to tell us something about the physics process simulated and beam conditions, as well as which software release was used. This is reflected in the metadata tags assigned as we will see later.
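
To make the structure of such a name concrete, here is a small Python sketch that splits a name into its slash-separated components. Only the leading part of the example name is used, since the rest is not shown in full here. The helper name `split_did_name` is hypothetical. Reading the first component as the file class and the second as the software release is an assumption based on the surrounding text, not a documented naming rule.

```python
def split_did_name(name: str) -> list[str]:
    """Return the non-empty, slash-separated components of a DID name.

    Hypothetical helper for illustration only.
    """
    return [part for part in name.split("/") if part]


# Only the leading part of the example name; the full name is longer.
prefix = "/RECO/26.02.0/epic_craterlake/EXCLUSIVE/DEMP/DEMPgen-1.2.4"
parts = split_did_name(prefix)

# Assumed reading: first component = file class, second = software release.
print(parts[0])  # RECO
print(parts[1])  # 26.02.0
print(parts)
```

Names that carry less information simply yield fewer components, so any script relying on fixed positions should check the length first.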

-##Finding DIDs
+# Finding DIDs

Now that we know what a DID looks like, how can we find the DID corresponding to the file or dataset that we're interested in?
@@ -182,7 +182,7 @@ The `root://dtn-eic.jlab.org` at the start of the output tells us that this part
So, we can find DIDs and check what they are and what they contain. To get to this point, though, we needed some prior knowledge of what the DID looked like, which isn't much help when searching for something new. A much easier approach to finding what we need is to use the metadata tags that are assigned to all DIDs from March 2026 onwards.

-## Metadata Tags
+# Metadata Tags

The following tags are available as of March 2026:
@@ -278,7 +278,7 @@ which will return only datasets with 10x250 collisions (10 GeV electrons on 250
> **Hint** - Check the example name we looked at when introducing DIDs in a previous section.
{: .challenge}
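
Beam-condition labels such as `10x250` are easy to handle in a script. The sketch below, offered only as an illustration, splits a label into two integers. The helper name `parse_beam_label` is hypothetical. It assumes the label always has the form `<number>x<number>`, with the electron energy first as in the `10x250` example.

```python
def parse_beam_label(label: str) -> tuple[int, int]:
    """Split a label such as '10x250' into two integer beam energies.

    Hypothetical helper; assumes the form '<number>x<number>'.
    """
    first, sep, second = label.partition("x")
    if not sep or not first.isdecimal() or not second.isdecimal():
        raise ValueError(f"unexpected beam label: {label!r}")
    return int(first), int(second)


print(parse_beam_label("10x250"))  # (10, 250)
```

Raising an error on anything unexpected keeps a typo in a label from silently selecting the wrong datasets.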

-## Using DIDs - Downloading or Processing Files
+# Using DIDs - Downloading or Processing Files

So far we've seen how we can find DIDs and check some basic info such as what type of data they point to and where that data is stored. We generally want to do a bit more than that though. Typically we want to find data to *use* it in some way. For our simulation data, this is usually to analyse it!
_episodes/03-use_cases.md: 4 additions & 4 deletions
@@ -15,7 +15,7 @@ keypoints:
In this episode, we will explore a few common use cases and how users may want to interact with simulation campaign output in each case. Examples of carrying out some common tasks associated with each use case will be included.

-##Physics Analyser - Novice
+# Physics Analyser - Novice

This use case considers a user who is new to analysing ePIC data and wants to look at a specific physics process. They will likely want to find and identify a specific physics process to pass through their analysis code. Their requirements are likely to include:
@@ -76,7 +76,7 @@ where `FILEPATH` is the path to one specific file from the output of one of the
> - Download **one** file from this dataset of your choice
{: .challenge}

-##Physics Analyser - Experienced
+# Physics Analyser - Experienced

In this use case, we consider an experienced physics analyser who has a well-developed analysis script that they want to run on a large number of files, possibly even a full dataset, for a specific physics process they're interested in. Their requirements are likely to include:
@@ -180,7 +180,7 @@ Note that we have restricted these examples to only print out the first five fil
> 3. Stream **five** of the files in this dataset in a script, check the total number of events contained in all five files.
{: .challenge}

-##Detector Designer/Optimiser, Algorithm/Reconstruction Development
+# Detector Designer/Optimiser, Algorithm/Reconstruction Development

In this use case, someone updating the design of a detector in DD4hep, or adjusting a reconstruction algorithm for a detector, may not want fully reconstructed data. Instead, they may want more raw, hit-level information. They may also want a specific detector configuration for comparison. In terms of physics process, they may not be looking at an actual reaction at all, but a particle gun simulation. To summarise, they may want:
@@ -212,7 +212,7 @@ Some tags they might use to find their data include:
> - Do non-reconstructed files exist for this/these dataset(s)?
{: .challenge}

-##Conclusion and Comments
+# Conclusion and Comments

That wraps up our introduction to using Rucio and some example use cases and scenarios.
_extras/xrootd.md: 1 addition & 1 deletion
@@ -62,7 +62,7 @@ In our earlier episode, we used this command to copy a file we found using Rucio
It is also possible to open a file directly in ROOT if you have XrootD installed too. Note that the following command should be executed after opening root and `TFile::Open()` should be used:
-
```bash
+
```c++
auto f = TFile::Open("root://dtn-eic.jlab.org//volatile/eic/EPIC/RECO/path-to-file")