Commit a317545

fix: split_by_worker should happen at file level
Fix the performance issue caused by untarring files before assigning them to worker subprocesses.
1 parent 70dab84 commit a317545

1 file changed

Lines changed: 1 addition & 1 deletion

seqchromloader/loader.py

@@ -59,8 +59,8 @@ def __iter__(self):
         worker_info = torch.utils.data.get_worker_info()
         pipeline = [
             wds.SimpleShardList(self.wds),
-            wds.tarfile_to_samples(),
             wds.split_by_worker,
+            wds.tarfile_to_samples(),
             wds.decode(),
             wds.rename(seq="seq.npy",
                        chrom="chrom.npy",
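
Why the stage order matters can be illustrated with a small pure-Python stand-in for the pipeline. This is a sketch, not the webdataset implementation: the three generator functions below are hypothetical imitations of the stages named in the diff, and modelling split_by_worker as round-robin selection is an assumption made for illustration. The `opened` list records which shards a single worker has to read.

```python
from itertools import islice


def shard_list(urls):
    """Stand-in for a shard list stage: yields shard names one by one."""
    yield from urls


def split_by_worker(items, worker_id, num_workers):
    """Stand-in splitter (assumed round-robin): keep every num_workers-th item."""
    yield from islice(items, worker_id, None, num_workers)


def tarfile_to_samples(shards, samples_per_shard, opened):
    """Stand-in expander: 'opens' each shard and yields its samples."""
    for shard in shards:
        opened.append(shard)  # record that this worker had to read the shard
        for i in range(samples_per_shard):
            yield f"{shard}/sample{i}"


def run_old_order(urls, worker_id, num_workers, samples_per_shard=3):
    """Expand shards first, then split: the split happens at sample level."""
    opened = []
    samples = tarfile_to_samples(shard_list(urls), samples_per_shard, opened)
    kept = list(split_by_worker(samples, worker_id, num_workers))
    return kept, opened


def run_new_order(urls, worker_id, num_workers, samples_per_shard=3):
    """Split first, then expand: the split happens at file level."""
    opened = []
    shards = split_by_worker(shard_list(urls), worker_id, num_workers)
    kept = list(tarfile_to_samples(shards, samples_per_shard, opened))
    return kept, opened


urls = [f"shard{i}.tar" for i in range(4)]
old_samples, old_opened = run_old_order(urls, worker_id=0, num_workers=2)
new_samples, new_opened = run_new_order(urls, worker_id=0, num_workers=2)
print("old order, shards read by worker 0:", old_opened)
print("new order, shards read by worker 0:", new_opened)
```

In this toy model each worker keeps the same number of samples either way, but with the old order every worker reads every shard and discards most of what it extracted, while with the new order each worker reads only the shards assigned to it.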
