Commit f870c50

Revert "Can parse a single file path"
This reverts commit ce1fe93, and includes a documented decision for this change.
1 parent e9703e5 commit f870c50

18 files changed

Lines changed: 258 additions & 325 deletions

doc/source/api.rst

Lines changed: 4 additions & 0 deletions
@@ -80,6 +80,10 @@ API Reference
8080
:show-inheritance:
8181
:members:
8282

83+
.. autoexception:: NotASequenceError
84+
:show-inheritance:
85+
:members:
86+
8387
.. autoexception:: ParseError
8488
:show-inheritance:
8589
:members:

doc/source/decisions/002-directory-sequence-support.rst

Lines changed: 1 addition & 8 deletions
@@ -3,7 +3,7 @@
33
[ADR-002] Directory Sequence Support
44
====================================
55

6-
:bdg-success:`Accepted`
6+
:bdg-success:`Rejected`
77

88
Context and Problem Statement
99
-----------------------------
@@ -23,13 +23,6 @@ Supporting sequences of directories will complicate the API in the following way
2323
with directory sequences are used.
2424

2525

26-
Considered Options
27-
------------------
28-
29-
* We will support directory sequences
30-
* We will not support directory sequences.
31-
32-
3326
Decision Outcome
3427
----------------
3528

Lines changed: 144 additions & 0 deletions
@@ -0,0 +1,144 @@
1+
.. _adr-004:
2+
3+
[ADR-004] Supporting Single Path Sequences
4+
==========================================
5+
6+
:bdg-danger:`Rejected`
7+
8+
Context and Problem Statement
9+
-----------------------------
10+
11+
This proposal is to make the ranges in a sequence optional,
12+
such that a sequence with no ranges (i.e. a single file) is still considered a valid sequence.
13+
14+
Use cases for this include:
15+
16+
* Retrieving a sequence string from an unknown source,
17+
and not knowing whether it will be a single file or a sequence.
18+
For example, a user may have a sequence string stored in a database, and want to
19+
use pathseq to loop over the files in that sequence. If the sequence string is for
20+
a single file, they would still want to be able to use pathseq to loop over it.
21+
22+
Where this type of pattern has been seen before,
23+
users have typically run the equivalent of ``.with_existing_files``
24+
immediately after creating the sequence.
25+
26+
Currently, a sequence with only empty ranges is considered to be empty.
27+
A single path sequence would have no ranges, and would be considered as having one file.
28+
These two definitions are somewhat in conflict,
29+
and so introducing single path sequences would erode the concept of a sequence.
30+
31+
PathSeq defines a "stem" slightly differently to pathlib.
32+
In pathlib, the stem of a path is the final path component without its suffix.
33+
In PathSeq, the stem of a path is the final path component without the ranges and any suffixes.
34+
This difference is achievable because the ranges are an additional component
35+
that creates a clear separation between the stem and the suffixes.
36+
In single path sequences, there is no clear separation between the stem and suffixes,
37+
hence why pathlib behaves the way it does.
38+
pathlib puts the burden on users to parse the stem and suffixes themselves,
39+
and PathSeq would ideally do the same,
40+
else risk users reporting unintuitive/inconsistent parsing of suffixes
41+
(e.g. "file.tar.gz" having a stem of "file.tar" and suffixes of ".gz"
42+
instead of a stem of "file" and suffixes of ".tar.gz").
43+
44+
.. note::
45+
46+
This is already an issue for loose path sequences
47+
where the ranges exist at the start or end of the sequence string,
48+
and therefore there is no separation between the stem and suffixes.
49+
The loose format already warns users that ambiguity exists throughout
50+
the API of ``LoosePathSequence``,
51+
so the effect on loose path sequences is not considered significant.
52+
53+
Supporting single path sequences does not significantly complicate the implementation.
54+
Wherever we support sequences of an unknown number of ranges
55+
we already support sequences with no ranges.
56+
57+
58+
Considered Options
59+
------------------
60+
61+
* Change the return type of ``.with_existing_files`` from ``PathSequence`` to ``PathSequence | None``.
62+
63+
Supporting single path sequences would complicate the API in the following ways:
64+
65+
* Users would have to check the type of the return value before using it.
66+
This applies even for those users that are always using a sequence with ranges.
67+
Essentially, users end up needing to check whether the sequence has any ranges
68+
or not before using ``.with_existing_files``.
69+
So users may as well check this upon creation of the sequence,
70+
and not have to worry about it for the rest of the sequence's lifetime.
71+
72+
* Proper use of ``.with_existing_files`` can be type checked.
73+
74+
* For users that aren't using type checking,
75+
improper use of ``.with_existing_files`` could go unnoticed until
76+
it is called on a single path sequence for which the file does not exist.
77+
78+
* The common use case would be written as:
79+
80+
.. code-block:: python
81+
82+
def do_something_with_sequence(seq: str):
83+
files: Iterable[Path] = PathSequence(seq).with_existing_files() or [Path(seq)]
84+
for file in files:
85+
# do something with the file
86+
...
87+
88+
* ``.with_existing_files`` will raise an error if it is called on a single file sequence,
89+
for which the file does not exist.
90+
91+
* Proper use of ``.with_existing_files`` cannot be type checked.
92+
93+
* For users that aren't using type checking,
94+
improper use of ``.with_existing_files`` could go unnoticed until
95+
it is called on a single path sequence for which the file does not exist.
96+
97+
* The common use case would be written as:
98+
99+
.. code-block:: python
100+
101+
def do_something_with_sequence(seq: str):
102+
files: Iterable[Path]
103+
try:
104+
files = PathSequence(seq).with_existing_files()
105+
except FileNotFoundError:
106+
files = [Path(seq)]
107+
108+
for file in files:
109+
# do something with the file
110+
...
111+
112+
* We will not support single path sequences,
113+
and instead raise an error if a PathSequence is constructed with a single file sequence.
114+
115+
* Users will not have to worry about whether ``.with_existing_files`` can be used safely.
116+
Checking is done upon creation of the sequence.
117+
118+
* The methods that construct an instance of ``BasePurePathSequence`` will
119+
need to raise an error if the sequence string is for a single file.
120+
Users already need to be aware of a ``ParseError`` being raised in these methods,
121+
so this is not a significant change to the API.
122+
123+
* The common use case would be written as:
124+
125+
.. code-block:: python
126+
127+
def do_something_with_sequence(seq: str):
128+
files: Iterable[Path]
129+
try:
130+
files = PathSequence(seq).with_existing_files()
131+
except NotASequenceError:
132+
files = [Path(seq)]
133+
134+
for file in files:
135+
# do something with the file
136+
...
137+
138+
139+
Decision Outcome
140+
----------------
141+
142+
We will not support single path sequences.
143+
A single file and a path sequence occasionally need to be treated differently,
144+
and it's best for users to be aware of this distinction when they create the sequence.
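
The stem and suffix behaviour that ADR-004 attributes to pathlib can be checked with the standard library alone. The sketch below prints what pathlib reports for ``file.tar.gz`` and then shows a hypothetical helper, ``split_all_suffixes``, that strips every suffix the way the ADR describes a PathSeq stem. The helper is an illustration only and is not part of pathseq.

```python
from pathlib import PurePosixPath

path = PurePosixPath("/directory/file.tar.gz")

# pathlib only removes the final suffix when computing the stem.
print(path.stem)      # file.tar
print(path.suffix)    # .gz
print(path.suffixes)  # ['.tar', '.gz']


def split_all_suffixes(name: str) -> tuple[str, str]:
    """Hypothetical helper: split a file name into (stem, all suffixes).

    This mirrors the "stem without any suffixes" idea from the ADR,
    but it is not how pathseq is implemented.
    """
    pure = PurePosixPath(name)
    suffixes = "".join(pure.suffixes)
    stem = pure.name[: len(pure.name) - len(suffixes)] if suffixes else pure.name
    return stem, suffixes


print(split_all_suffixes("file.tar.gz"))  # ('file', '.tar.gz')
```

Without a range separating the two parts, a helper like this has to guess where the stem ends, which is the ambiguity the ADR cites as a reason to reject single path sequences.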

doc/source/user/format.rst

Lines changed: 21 additions & 7 deletions
@@ -79,13 +79,13 @@ an inter-range separator.
7979

8080
.. code-block:: text
8181
82-
/directory/ file . udim1001-1002<UDIM> _ 1001-1010# .tar.gz
83-
┌────┴────┼─┼──────────────────┼─┼──────────┼───────┴┐
84-
│ ┌────┘ │ └────┐ ┌───┘ └─── │ │
85-
│stem│prefix│pre-range|range│pre-range│ range │suffixes│
86-
└────┴──────┼─────────┴─────┴─────────┴───────┼────────┘
87-
ranges
88-
└─────────────────────────────────
82+
/directory/ file . 1001-1002<UDIM> _ 1001-1010# .tar.gz
83+
┌────┴────┼─┼───────────────┼─┼──────────┼───────┴┐
84+
│ ┌────┘ │ ┌───┘ └────┐ │ │
85+
│stem│prefix│ range │inter-range│range│suffixes│
86+
└────┴──────┼─────────┴───────────┴─────┼────────┘
87+
│ ranges │
88+
└────────────────────────────┘
8989
9090
9191
.. _format-simple-stem:
@@ -463,6 +463,20 @@ a stem may or may not be present in the name of a loose path sequence.
463463
>>> LoosePathSequence('.tar.gz1-5#').stem
464464
'.tar'
465465
466+
.. note::
467+
468+
For ranges that start or end the name of the sequence,
469+
there is ambiguity in how to interpret the stem and suffixes.
470+
Unlike :attr:`pathlib.PurePath.stem`, this will never contain a suffix
471+
if the paths have multiple suffixes:
472+
473+
.. code-block:: pycon
474+
475+
>>> LoosePathSequence('file.tar.gz.1-5#').stem
476+
'file'
477+
>>> LoosePathSequence('1-5#file.tar.gz').stem
478+
'file'
479+
466480
467481
.. _format-loose-prefix:
468482

src/pathseq/__init__.py

Lines changed: 2 additions & 1 deletion
@@ -11,7 +11,7 @@
1111
RangesStartName,
1212
)
1313
from ._base import BasePathSequence, BasePurePathSequence, PathT_co, PurePathT_co
14-
from ._error import IncompleteDimensionError, ParseError
14+
from ._error import IncompleteDimensionError, NotASequenceError, ParseError
1515
from ._file_num_seq import FileNumSequence, FileNumT
1616
from ._loose_path_sequence import LoosePathSequence
1717
from ._loose_pure_path_sequence import LoosePurePathSequence
@@ -29,6 +29,7 @@
2929
"IncompleteDimensionError",
3030
"LoosePathSequence",
3131
"LoosePurePathSequence",
32+
"NotASequenceError",
3233
"PaddedRange",
3334
"ParsedLooseSequence",
3435
"ParseError",

src/pathseq/_ast/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,7 @@
77
)
88
from ._ranges import PaddedRange, Ranges
99
from ._type import ParsedSequence
10+
from ._util import non_recursive_asdict
1011

1112
__all__ = [
1213
"Formatter",
@@ -17,4 +18,5 @@
1718
"RangesStartName",
1819
"RangesInName",
1920
"RangesEndName",
21+
"non_recursive_asdict",
2022
]

src/pathseq/_ast/_formatter.py

Lines changed: 0 additions & 3 deletions
@@ -41,9 +41,6 @@ def _splice_strings_onto_ranges(
4141
except StopIteration:
4242
return result
4343

44-
if not result:
45-
return result
46-
4744
raise TypeError(
4845
"The number of inter-range strings given does not match"
4946
" the number of range strings given minus one."

src/pathseq/_ast/_ranges.py

Lines changed: 6 additions & 13 deletions
@@ -55,22 +55,15 @@ class Ranges:
5555
inter_ranges: tuple[str, ...]
5656
"""The inter-range separators between each range in the sequence.
5757
58-
If there are ranges then
59-
the number of inter-range separators is guaranteed to be ``1 - len(self.ranges)``.
58+
The number of inter-range separators is guaranteed to be ``len(self.ranges) - 1``.
6059
"""
6160

6261
def __post_init__(self) -> None:
63-
if self.ranges:
64-
if len(self.inter_ranges) != len(self.ranges) - 1:
65-
raise ValueError(
66-
"The number of inter-range strings given does not match"
67-
" the number of range strings minus one."
68-
)
69-
else:
70-
if self.inter_ranges:
71-
raise ValueError(
72-
"There are no ranges but there are inter-range strings."
73-
)
62+
if len(self.inter_ranges) != len(self.ranges) - 1:
63+
raise ValueError(
64+
"The number of inter-range strings given does not match"
65+
" the number of range strings minus one."
66+
)
7467

7568
def __str__(self) -> str:
7669
return Formatter().ranges(self)
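
The effect of the simplified ``__post_init__`` check can be seen in a stand-alone sketch. ``RangesSketch`` below is a hypothetical stand-in that stores plain strings instead of ``PaddedRange`` objects. It only reproduces the length invariant from the diff and is not the real ``Ranges`` class. ``is_valid`` is likewise a helper invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangesSketch:
    """Hypothetical stand-in for Ranges that holds plain strings."""

    ranges: tuple[str, ...]
    inter_ranges: tuple[str, ...]

    def __post_init__(self) -> None:
        # Same invariant as the reverted code: one separator between each
        # pair of neighbouring ranges, so N ranges need N - 1 separators.
        if len(self.inter_ranges) != len(self.ranges) - 1:
            raise ValueError(
                "The number of inter-range strings given does not match"
                " the number of range strings minus one."
            )


def is_valid(ranges: tuple[str, ...], inter_ranges: tuple[str, ...]) -> bool:
    """Return True when the pair satisfies the length invariant."""
    try:
        RangesSketch(ranges, inter_ranges)
    except ValueError:
        return False
    return True


print(is_valid(("1-5#",), ()))                                # True
print(is_valid(("1001-1002<UDIM>", "1001-1010#"), ("_",)))    # True
print(is_valid((), ()))                                       # False
```

With the special case for empty ranges removed, zero ranges can never satisfy the check, because no tuple has a length of -1. This matches the decision to reject sequences that have no ranges.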

src/pathseq/_base.py

Lines changed: 18 additions & 2 deletions
@@ -14,7 +14,7 @@
1414
Self, # PY311
1515
)
1616

17-
from ._ast import ParsedLooseSequence, ParsedSequence
17+
from ._ast import PaddedRange, ParsedLooseSequence, ParsedSequence, Ranges, non_recursive_asdict
1818
from ._error import ParseError
1919
from ._file_num_seq import FileNumSequence
2020
from ._from_disk import find_on_disk
@@ -32,6 +32,7 @@ class BasePurePathSequence(Sequence[PurePathT_co], metaclass=abc.ABCMeta):
3232
"""A generic class that represents a path sequence.
3333
3434
Raises:
35+
NotASequenceError: When the given path does not represent a sequence.
3536
ParseError: When the given path is not a valid path sequence.
3637
"""
3738

@@ -274,6 +275,7 @@ def with_name(self, name: str) -> Self:
274275
Raises:
275276
ValueError: When the given name is empty.
276277
Use :attr:`~.BasePurePathSequence.parent` instead.
278+
NotASequenceError: When the given name does not represent a sequence.
277279
ParseError: When the resulting path is not a valid path sequence.
278280
"""
279281
return self.with_segments(self._path.with_name(name))
@@ -293,7 +295,6 @@ def with_stem(self, stem: str) -> Self:
293295
parsed = self._parsed.with_stem(stem)
294296
return self.with_segments(self._path.parent, str(parsed))
295297

296-
@abc.abstractmethod
297298
def with_file_num_seqs(
298299
self, *seqs: FileNumSequence[int] | FileNumSequence[Decimal]
299300
) -> Self:
@@ -303,6 +304,19 @@ def with_file_num_seqs(
303304
TypeError: If the given number of file number sequences does not match
304305
the sequence's number of file number sequences.
305306
"""
307+
if len(seqs) != len(self._parsed.ranges.ranges):
308+
raise TypeError(
309+
f"Need {len(self._parsed.ranges.ranges)} sequences, but got {len(seqs)}"
310+
)
311+
312+
new_ranges = tuple(
313+
PaddedRange(seq, range_.pad_format)
314+
for seq, range_ in zip(seqs, self._parsed.ranges.ranges)
315+
)
316+
kwargs = non_recursive_asdict(self._parsed)
317+
kwargs["ranges"] = Ranges(new_ranges, self._parsed.ranges.inter_ranges)
318+
new = self._parsed.__class__(**kwargs)
319+
return self.with_segments(self._path.parent, str(new))
306320

307321
def path_with_file_nums(self, *numbers: int | Decimal) -> PurePathT_co:
308322
"""Return a path for the given file number(s) in the sequence.
@@ -457,6 +471,8 @@ class BasePathSequence(BasePurePathSequence[PathT_co], metaclass=abc.ABCMeta):
457471
"""A sequence of Path objects.
458472
459473
Raises:
474+
NotASequenceError: When the given path does not represent a sequence,
475+
but a regular path.
460476
ParseError: When the given path is not a valid path sequence.
461477
"""
462478

src/pathseq/_error.py

Lines changed: 11 additions & 0 deletions
@@ -37,6 +37,17 @@ def __init__(
3737
super().__init__(message)
3838

3939

40+
class NotASequenceError(ParseError):
41+
"""Raised when parsing a string that does not represent a sequence, but a regular path.
42+
43+
In other words, the given sequence string has no :ref:`range <format-simple-range>`
44+
present.
45+
"""
46+
47+
def __init__(self, seq: str) -> None:
48+
super().__init__(seq, 0, len(seq) - 1, reason="No range string is present")
49+
50+
4051
class IncompleteDimensionError(Exception):
4152
"""A multi-dimension sequence does not contain a consistent number of files across a dimension.
4253
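
Because ``NotASequenceError`` subclasses ``ParseError``, existing ``except ParseError`` handlers keep working, and callers that want the single-file fallback can catch the narrower error first. The sketch below uses hypothetical stand-in classes (``ParseErrorSketch``, ``NotASequenceErrorSketch``) and a toy ``parse_sketch`` function that treats ``#`` as the only range marker. The constructor arguments are inferred from the ``super().__init__`` call in the diff, and the ``Exception`` base class is an assumption.

```python
class ParseErrorSketch(Exception):
    """Hypothetical stand-in for pathseq's ParseError."""

    def __init__(self, seq: str, start: int, end: int, reason: str) -> None:
        self.seq = seq
        self.start = start
        self.end = end
        self.reason = reason
        super().__init__(f"{reason}: {seq!r}")


class NotASequenceErrorSketch(ParseErrorSketch):
    """Hypothetical stand-in for NotASequenceError."""

    def __init__(self, seq: str) -> None:
        # Mirrors the diff: the error spans the whole string.
        super().__init__(seq, 0, len(seq) - 1, reason="No range string is present")


def parse_sketch(seq: str) -> str:
    """Toy parser (assumption): a string is a sequence only if it contains '#'."""
    if "#" not in seq:
        raise NotASequenceErrorSketch(seq)
    return seq


def describe(seq: str) -> str:
    """Catch the narrow error first, then the general one."""
    try:
        parse_sketch(seq)
    except NotASequenceErrorSketch:
        return "single file"
    except ParseErrorSketch:
        return "invalid sequence"
    return "sequence"


print(describe("file.1-5#.exr"))  # sequence
print(describe("file.exr"))       # single file
```

Ordering the handlers from most to least specific matters here. If ``except ParseErrorSketch`` came first, it would also catch the single-file case.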
