Commit b83542d

Versioning and documentation
1 parent 20da277 commit b83542d

2 files changed

Lines changed: 11 additions & 5 deletions

README.md

Lines changed: 10 additions & 4 deletions
@@ -83,6 +83,8 @@ log:

 run_one_path: /usr/bin/run-one

+metadata_archive: /path/to/metadata/archive
+
 transfer_details:
 user: username
 host: remote.host.com
@@ -102,8 +104,11 @@ sequencers:
 - RunParameters.xml
 ignore_folders:
 - nosync
-rsync_options:
+remote_rsync_options:
 - --chmod=Dg+s,g+rw
+metadata_rsync_options:
+- "--exclude='*'"
+- "--include=InterOp"
 # ... additional sequencer configurations
 ```
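
To make the renamed option list in this hunk concrete, here is a minimal sketch of how a list such as `remote_rsync_options` might be turned into an rsync argument list. The helper name `build_rsync_command`, the example paths, and the argument order are assumptions for illustration only; the actual `dataflow_transfer` code may assemble its commands differently.

```python
from __future__ import annotations

from typing import Sequence


def build_rsync_command(
    source: str, destination: str, options: Sequence[str]
) -> list[str]:
    """Hypothetical helper: return an rsync argument list.

    Options are placed before the source and destination, which is the
    conventional rsync invocation order.
    """
    return ["rsync", *options, source, destination]


# Example with the option shown in the config above; both paths are made up.
remote_cmd = build_rsync_command(
    "/data/run_dir/",
    "username@remote.host.com:/destination/",
    ["--chmod=Dg+s,g+rw"],
)
```

Keeping the remote and metadata option lists separate in the config means each transfer can be built by the same kind of helper while using different flags.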

@@ -113,7 +118,7 @@ sequencers:
 2. **Validation**: Confirms run ID matches expected format for the sequencer type
 3. **Transfer Phases**:
 - **Sequencing Phase**: Starts continuous background rsync transfer while sequencing is ongoing (when the final sequencing file doesn't exist). Uploads status and metadata files (specified for each sequencer type in the config with `metadata_for_statusdb`) to database.
-- **Final Transfer**: After sequencing completes (final sequencing file appears), initiates final rsync transfer and captures exit code.
+- **Final Transfer**: After sequencing completes (final sequencing file appears), syncs specified metadata file to archive location, initiates final rsync transfer and captures exit codes.
 - **Completion**: Updates database when transfer was successful.

 ### Status Tracking
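
The phase selection described in this hunk depends on whether the sequencer's final file exists. A minimal sketch of that check follows. The function name `transfer_phase` and its return values are hypothetical and not taken from the project.

```python
from pathlib import Path


def transfer_phase(run_dir: Path, final_file: str) -> str:
    """Hypothetical helper: pick the transfer phase for a run directory.

    Returns "final" once the sequencer's final file (for example
    RTAComplete.txt for Illumina) exists, otherwise "sequencing".
    """
    return "final" if (run_dir / final_file).exists() else "sequencing"
```

Under this reading, the metadata sync to the archive location and the final rsync would both be started only after `transfer_phase` reports `"final"`.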
@@ -145,14 +150,15 @@ Run status is tracked in CouchDB with events including:
 - Final completion is indicated by the presence of a sequencer-specific final file (e.g., `RTAComplete.txt` for Illumina)
 - Remote storage is accessible via rsync over SSH
 - CouchDB is accessible and the database exists
-- Metadata files (e.g., RunInfo.xml) are present in run directories for status database updates
+- Metadata files (e.g., RunInfo.xml) are present in run directories for status database updates and sync to metadata archive location

 ### Status Files

 The logic of the script relies on the following status files:

 - `run.final_file` - The final file written by each sequencing machine. Used to indicate when the sequencing has completed.
-- `final_rsync_exitcode` - Used to indicate when the final rsync is done, so that the final rsync can be run in the background. This is especially useful for restarts after long pauses of the cronjob.
+- `.final_rsync_exitcode` - Used to indicate when the final rsync is done, so that the final rsync can be run in the background. This is especially useful for restarts after long pauses of the cronjob.
+- `.metadata_rsync_exitcode` - Used to indicate when rsync of metadata to the metadata archive is done, so that the rsync can be run in the background. This is useful when there are I/O issue with the disks.

 ## Development
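
The exit-code status files listed in this hunk let a later cron invocation find out whether a background rsync has finished. The sketch below shows one way such a file might be read. It assumes the file holds the rsync exit code as plain integer text, which the README does not state. The names `read_exit_code` and `sync_succeeded` are hypothetical.

```python
from pathlib import Path
from typing import Optional


def read_exit_code(run_dir: Path, status_file: str) -> Optional[int]:
    """Hypothetical helper: return the recorded exit code.

    Returns None if the status file is missing (rsync still running or
    never started) or does not contain an integer.
    """
    path = run_dir / status_file
    if not path.exists():
        return None
    try:
        return int(path.read_text().strip())
    except ValueError:
        return None


def sync_succeeded(run_dir: Path, status_file: str) -> bool:
    """True only when the status file exists and records exit code 0."""
    return read_exit_code(run_dir, status_file) == 0
```

A caller could check `sync_succeeded(run_dir, ".metadata_rsync_exitcode")` and `sync_succeeded(run_dir, ".final_rsync_exitcode")` independently, so a slow metadata sync would not block detection of a finished final transfer.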

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ ignore = [

 [project]
 name = "dataflow_transfer"
-version = "1.0.5"
+version = "1.1.0"
 description = "Script for transferring sequencing data from sequencers to storage"
 authors = [
 { name = "Sara Sjunnebo", email = "sara.sjunnebo@scilifelab.se" },
