Apache Iceberg Rust version
None
Describe the bug
Two complementary bugs in JSON serialization for Fixed and Binary primitive types:
1. Literal::try_from_json panics (literal.rs#L502-L503):
    (PrimitiveType::Fixed(_), JsonValue::String(_)) => todo!(),
    (PrimitiveType::Binary, JsonValue::String(_)) => todo!(),
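Per the JSON single-value serialization rules in the Iceberg spec, fixed and binary values are stored as hexadecimal strings, so these arms should parse the string as hex instead of panicking.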
2. Literal::try_into_json drops leading zeros (literal.rs#L662-L668):
format!("{x:x}") applies no zero-padding, so byte 0x00 is encoded as "0" and 0x05 as "5". As a result, vec![0x00, 0xff, 0x05] serializes to "0ff5" instead of "00ff05". The format specifier should be {x:02x}.
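The padding difference can be reproduced outside the crate with the standard library alone. The sketch below is illustrative only: `hex_unpadded` and `hex_padded` are hypothetical helper names, not functions from iceberg-rust, and they only mirror the two format specifiers discussed above.

```rust
/// Hex-encode each byte with `{x:x}` (no padding).
/// Hypothetical helper that mirrors the buggy format specifier.
fn hex_unpadded(bytes: &[u8]) -> String {
    bytes.iter().map(|x| format!("{x:x}")).collect()
}

/// Hex-encode each byte with `{x:02x}` (always two digits per byte).
/// Hypothetical helper that mirrors the proposed format specifier.
fn hex_padded(bytes: &[u8]) -> String {
    bytes.iter().map(|x| format!("{x:02x}")).collect()
}

fn main() {
    let bytes: [u8; 3] = [0x00, 0xff, 0x05];
    println!("{}", hex_unpadded(&bytes)); // prints 0ff5
    println!("{}", hex_padded(&bytes)); // prints 00ff05
}
```

The unpadded form is lossy: once the leading zeros are gone, a reader can no longer tell where one byte ends and the next begins.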
Neither path is exercised by the check_json_serde round-trip helper at tests.rs#L37 (called for 13 types but never for Fixed or Binary). Present on main (0.9.x).
To Reproduce
Add the following two #[test] functions to crates/iceberg/src/spec/values/tests.rs and run cargo test -p iceberg --lib spec::values::tests::_repro -- --nocapture:
    #[test]
    fn _repro_bug1_panic() {
        let result = std::panic::catch_unwind(|| {
            Literal::try_from_json(
                JsonValue::String("deadbeef".into()),
                &Type::Primitive(PrimitiveType::Binary),
            )
        });
        let err = result.expect_err("expected panic");
        let msg = err
            .downcast_ref::<&str>()
            .map(|s| (*s).to_string())
            .or_else(|| err.downcast_ref::<String>().cloned())
            .unwrap_or_else(|| "<non-string panic payload>".to_string());
        eprintln!("BUG1_PANIC_MESSAGE={msg}");
    }

    #[test]
    fn _repro_bug2_encoding() {
        let lit = Literal::Primitive(PrimitiveLiteral::Binary(vec![0x00, 0xff, 0x05]));
        let json = lit
            .try_into_json(&Type::Primitive(PrimitiveType::Binary))
            .unwrap();
        eprintln!("BUG2_ACTUAL={json:?}");
        eprintln!("BUG2_EXPECTED=String(\"00ff05\")");
    }
Actual output (on main @ 0.9.0):
thread 'spec::values::tests::_repro_bug1_panic' panicked at crates/iceberg/src/spec/values/literal.rs:503:66:
not yet implemented
BUG1_PANIC_MESSAGE=not yet implemented
BUG2_ACTUAL=String("0ff5")
BUG2_EXPECTED=String("00ff05")
Expected behavior
try_from_json parses hex strings into PrimitiveLiteral::Binary(Vec<u8>); for Fixed(n), it also validates that bytes.len() == n and returns an error otherwise.
try_into_json zero-pads each byte ({x:02x}) so that vec![0x00, 0xff, 0x05] encodes to "00ff05".
check_json_serde round-trip coverage extended to Fixed and Binary.
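One possible shape for the parsing side is sketched below using only the standard library. This is not the crate's implementation: `hex_digit`, `decode_hex`, and `decode_fixed` are hypothetical names, the `String` error type stands in for whatever error type the real fix would use, and rejecting odd-length input is an assumption about the desired behavior.

```rust
/// Convert one ASCII hex digit (either case) to its value.
/// Hypothetical helper, not part of iceberg-rust.
fn hex_digit(b: u8) -> Result<u8, String> {
    (b as char)
        .to_digit(16)
        .map(|d| d as u8)
        .ok_or_else(|| format!("invalid hex digit: {:?}", b as char))
}

/// Decode a hex string into bytes, two digits per byte.
/// Assumption: odd-length input is rejected rather than implicitly padded.
fn decode_hex(s: &str) -> Result<Vec<u8>, String> {
    if s.len() % 2 != 0 {
        return Err(format!("hex string has odd length {}", s.len()));
    }
    s.as_bytes()
        .chunks(2)
        .map(|pair| -> Result<u8, String> {
            Ok(hex_digit(pair[0])? << 4 | hex_digit(pair[1])?)
        })
        .collect()
}

/// Decode a hex string and check it against the declared length of a fixed type.
fn decode_fixed(s: &str, expected_len: usize) -> Result<Vec<u8>, String> {
    let bytes = decode_hex(s)?;
    if bytes.len() != expected_len {
        return Err(format!(
            "fixed({expected_len}) expects {expected_len} bytes, got {}",
            bytes.len()
        ));
    }
    Ok(bytes)
}

fn main() {
    assert_eq!(decode_hex("deadbeef"), Ok(vec![0xde, 0xad, 0xbe, 0xef]));
    assert_eq!(decode_fixed("00ff05", 3), Ok(vec![0x00, 0xff, 0x05]));
    assert!(decode_fixed("00ff05", 4).is_err());
    // The unpadded output from bug 2 decodes to different bytes entirely.
    assert_eq!(decode_hex("0ff5"), Ok(vec![0x0f, 0xf5]));
}
```

The last assertion shows why the two bugs need to be fixed together: a decoder that follows the two-digits-per-byte convention reads the unpadded "0ff5" as two bytes, not the original three, so a round-trip test would fail until the encoder pads each byte.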
Willingness to contribute
I can contribute a fix for this bug independently