feat: add tuple preservation to AutoSerializer and 'pythonic' alias#121
feat: add tuple preservation to AutoSerializer and 'pythonic' alias#12127Bslash6 wants to merge 3 commits into
Conversation
AutoSerializer now preserves tuples through serialization roundtrips
via type markers, matching the existing set/frozenset/datetime/UUID
preservation. Add 'pythonic' as a registry alias for 'auto' to better
communicate intent.
Implementation: tuples are pre-processed into {"__tuple__": True,
"value": [...]} markers before msgpack encoding (since msgpack natively
flattens tuples to arrays). The object_hook restores them on decode.
Closes #78
📝 WalkthroughWalkthroughAutoSerializer now preserves tuples by wrapping them as explicit tuple markers before MessagePack packing and reconstructs them on unpacking. The serializer registry gains a "pythonic" alias mapped to AutoSerializer and instantiated with integrity checking like other core serializers. ChangesAutoSerializer Tuple Preservation & Pythonic Alias
🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
10 tests covering: simple/nested/empty tuple roundtrips, tuples in dicts and lists, tuples with other special types (sets, datetimes), list-stays-list verification, malformed marker error paths, and pythonic registry alias.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/unit/test_auto_serializer_new_types.py (1)
564-646: 💤 Low valueConsider adding single-element tuple test.
Single-element tuples like
(1,)are a common Python edge case. While the current implementation likely handles them correctly, adding a test would provide explicit coverage for this pattern.📝 Optional test to add
def test_single_element_tuple_roundtrip(self): """Single-element tuples preserve the trailing comma semantics.""" serializer = AutoSerializer() data = (1,) serialized, metadata = serializer.serialize(data) result = serializer.deserialize(serialized, metadata) assert isinstance(result, tuple) assert len(result) == 1 assert result == (1,)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/test_auto_serializer_new_types.py` around lines 564 - 646, Add a new unit test method test_single_element_tuple_roundtrip to the TestAutoSerializerTuple class that constructs serializer = AutoSerializer(), sets data = (1,), calls serializer.serialize(data) and serializer.deserialize(serialized, metadata), and asserts the result is a tuple, has length 1, and equals (1,); this provides explicit coverage for single-element tuple semantics alongside the existing tuple tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/unit/test_auto_serializer_new_types.py`:
- Around line 629-645: Update the two tests
test_malformed_tuple_marker_missing_value and
test_malformed_tuple_marker_wrong_value_type to remove the unnecessary
enable_integrity_checking=False argument when instantiating AutoSerializer; the
tuple validation runs in _auto_object_hook during msgpack deserialization and is
independent of ByteStorage integrity/compression, so change "serializer =
AutoSerializer(enable_integrity_checking=False)" to "serializer =
AutoSerializer()" in both tests to match the pattern used by other
corruption/roundtrip tests.
---
Nitpick comments:
In `@tests/unit/test_auto_serializer_new_types.py`:
- Around line 564-646: Add a new unit test method
test_single_element_tuple_roundtrip to the TestAutoSerializerTuple class that
constructs serializer = AutoSerializer(), sets data = (1,), calls
serializer.serialize(data) and serializer.deserialize(serialized, metadata), and
asserts the result is a tuple, has length 1, and equals (1,); this provides
explicit coverage for single-element tuple semantics alongside the existing
tuple tests.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 769641c7-1a80-44b6-8b25-77931f52380b
📒 Files selected for processing (2)
src/cachekit/serializers/auto_serializer.pytests/unit/test_auto_serializer_new_types.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/cachekit/serializers/auto_serializer.py
| def test_malformed_tuple_marker_missing_value(self): | ||
| """Malformed __tuple__ marker raises SerializationError.""" | ||
| import msgpack | ||
|
|
||
| serializer = AutoSerializer(enable_integrity_checking=False) | ||
| bad_data = msgpack.packb({"__tuple__": True}) | ||
| with pytest.raises(SerializationError, match="missing 'value' field"): | ||
| serializer.deserialize(bad_data) | ||
|
|
||
| def test_malformed_tuple_marker_wrong_value_type(self): | ||
| """__tuple__ marker with non-list value raises SerializationError.""" | ||
| import msgpack | ||
|
|
||
| serializer = AutoSerializer(enable_integrity_checking=False) | ||
| bad_data = msgpack.packb({"__tuple__": True, "value": "not a list"}) | ||
| with pytest.raises(SerializationError, match="expected list"): | ||
| serializer.deserialize(bad_data) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Check if enable_integrity_checking=False is necessary in tuple corruption tests
# Search for all corruption tests with enable_integrity_checking
rg -n "enable_integrity_checking" tests/unit/test_auto_serializer_new_types.py -A 2 -B 2
# Search for corruption test patterns to compare tuple vs other types
rg -n "def test_.*_missing_value_field" tests/unit/test_auto_serializer_new_types.py -A 5Repository: cachekit-io/cachekit-py
Length of output: 2364
🏁 Script executed:
# Find AutoSerializer class definition
fd AutoSerializer src/ --type f | head -20Repository: cachekit-io/cachekit-py
Length of output: 49
🏁 Script executed:
# Search for enable_integrity_checking usage and documentation
rg -n "enable_integrity_checking" --type py | head -40Repository: cachekit-io/cachekit-py
Length of output: 4796
🏁 Script executed:
# Search for tuple deserialization logic
rg -n "__tuple__" --type py -A 3 -B 3 | head -60Repository: cachekit-io/cachekit-py
Length of output: 3924
🏁 Script executed:
# Check ByteStorage code and its interaction with deserialization
rg -n "ByteStorage" src/cachekit/serializers/auto_serializer.py -B 2 -A 5 | head -50Repository: cachekit-io/cachekit-py
Length of output: 1615
🏁 Script executed:
# Check if there's any special handling of integrity checking in deserialization flow
rg -n "_auto_object_hook|__tuple__" src/cachekit/serializers/auto_serializer.py -B 3 -A 8 | head -80Repository: cachekit-io/cachekit-py
Length of output: 2636
🏁 Script executed:
# Check if any other test file tests tuple with or without the flag
rg -n "test_.*tuple" tests/unit/ -A 2 -B 1 | grep -E "(def test|enable_integrity)" | head -20Repository: cachekit-io/cachekit-py
Length of output: 1296
Remove unnecessary enable_integrity_checking=False flag for consistency.
Both tuple corruption tests use enable_integrity_checking=False, but this flag controls ByteStorage compression/integrity wrapper behavior—not tuple marker validation. The tuple validation (checking for missing 'value' field and correct value type) occurs in _auto_object_hook during msgpack deserialization, independent of this flag. All other corruption tests (UUID, datetime, date, time, set, frozenset) and all other tuple roundtrip tests omit this flag. Remove it from both test methods to align with the pattern used throughout the file.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/unit/test_auto_serializer_new_types.py` around lines 629 - 645, Update
the two tests test_malformed_tuple_marker_missing_value and
test_malformed_tuple_marker_wrong_value_type to remove the unnecessary
enable_integrity_checking=False argument when instantiating AutoSerializer; the
tuple validation runs in _auto_object_hook during msgpack deserialization and is
independent of ByteStorage integrity/compression, so change "serializer =
AutoSerializer(enable_integrity_checking=False)" to "serializer =
AutoSerializer()" in both tests to match the pattern used by other
corruption/roundtrip tests.
Summary
serializer='pythonic'as a registry alias for'auto'Why
AutoSerializerpreserved sets and frozensets via type markers but not tuples — inconsistent. The tricky part: msgpack natively serializes tuples as arrays (so thedefaultcallback is never called). Fixed by pre-processing tuples into{"__tuple__": True, "value": [...]}markers before encoding.Test plan
get_serializer('pythonic')returnsAutoSerializerCloses #78
Summary by CodeRabbit
New Features
Bug Fixes
Tests / Documentation