bug(spec): JSON literal serialization for Fixed/Binary is broken in both directions #2430

@YuangGao

Apache Iceberg Rust version

None

Describe the bug

Two complementary bugs in JSON serialization for Fixed and Binary primitive types:

1. Literal::try_from_json panics (literal.rs#L502-L503):

(PrimitiveType::Fixed(_), JsonValue::String(_)) => todo!(),
(PrimitiveType::Binary, JsonValue::String(_)) => todo!(),

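Per the JSON single-value serialization rules in the Iceberg spec, Fixed and Binary values are stored as hex strings, so both arms should parse the string as hex instead of panicking.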

2. Literal::try_into_json drops leading zeros (literal.rs#L662-L668):

format!("{x:x}") (no zero-padding) encodes byte 0x00 as "0" and 0x05 as "5", so vec![0x00, 0xff, 0x05] is serialized as "0ff5" instead of "00ff05". The output is ambiguous and cannot be decoded back to the original bytes. The format specifier should be {x:02x}.
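The difference between the two format specifiers can be reproduced with the standard library alone. The following is a minimal sketch, not the code in literal.rs; the helper names encode_unpadded and encode_padded are hypothetical and exist only for illustration.

```rust
/// Hypothetical helper: hex-encodes each byte with `{x:x}`, which omits
/// the leading zero for any byte below 0x10.
fn encode_unpadded(bytes: &[u8]) -> String {
    bytes.iter().map(|x| format!("{x:x}")).collect()
}

/// Hypothetical helper: hex-encodes each byte with `{x:02x}`, which always
/// emits exactly two lowercase hex digits per byte.
fn encode_padded(bytes: &[u8]) -> String {
    bytes.iter().map(|x| format!("{x:02x}")).collect()
}
```

For the input [0x00, 0xff, 0x05], encode_unpadded yields "0ff5" and encode_padded yields "00ff05". Only the padded form has a fixed width per byte, which is what makes it decodable.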

Neither path is exercised by the check_json_serde round-trip helper at tests.rs#L37 (called for 13 types but never for Fixed or Binary). Both bugs are present on main (0.9.x).

To Reproduce

Add the following two #[test] functions to crates/iceberg/src/spec/values/tests.rs and run cargo test -p iceberg --lib spec::values::tests::_repro -- --nocapture:

#[test]
fn _repro_bug1_panic() {
    // Catch the panic raised by the todo!() arm so its message can be printed.
    let result = std::panic::catch_unwind(|| {
        Literal::try_from_json(
            JsonValue::String("deadbeef".into()),
            &Type::Primitive(PrimitiveType::Binary),
        )
    });
    let err = result.expect_err("expected panic");
    // Panic payloads are usually &str or String; try both before giving up.
    let msg = err
        .downcast_ref::<&str>()
        .map(|s| (*s).to_string())
        .or_else(|| err.downcast_ref::<String>().cloned())
        .unwrap_or_else(|| "<non-string panic payload>".to_string());
    eprintln!("BUG1_PANIC_MESSAGE={msg}");
}

#[test]
fn _repro_bug2_encoding() {
    // 0x00 and 0x05 each need a leading zero in the hex output.
    let lit = Literal::Primitive(PrimitiveLiteral::Binary(vec![0x00, 0xff, 0x05]));
    let json = lit
        .try_into_json(&Type::Primitive(PrimitiveType::Binary))
        .unwrap();
    eprintln!("BUG2_ACTUAL={json:?}");
    eprintln!("BUG2_EXPECTED=String(\"00ff05\")");
}

Actual output (on main @ 0.9.0):

thread 'spec::values::tests::_repro_bug1_panic' panicked at crates/iceberg/src/spec/values/literal.rs:503:66:
not yet implemented
BUG1_PANIC_MESSAGE=not yet implemented

BUG2_ACTUAL=String("0ff5")
BUG2_EXPECTED=String("00ff05")

Expected behavior

  • try_from_json parses hex strings into PrimitiveLiteral::Binary(Vec<u8>); for Fixed(n) validates that bytes.len() == n and errors otherwise.
  • try_into_json zero-pads each byte ({x:02x}) so that vec![0x00, 0xff, 0x05] encodes to "00ff05".
  • check_json_serde round-trip coverage extended to Fixed and Binary.
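The parsing behavior described in the first bullet could look roughly like the sketch below. It uses only the standard library and is not tied to the iceberg crate: decode_hex is a hypothetical helper, the String error type is a stand-in for the crate's own error type, and expected_len is assumed to be Some(n) for Fixed(n) and None for Binary.

```rust
/// Hypothetical helper: decodes a hex string into bytes.
/// `expected_len` is `Some(n)` for a Fixed(n)-like type, `None` for Binary.
/// The `String` error is a placeholder for a real error type.
fn decode_hex(s: &str, expected_len: Option<usize>) -> Result<Vec<u8>, String> {
    // Reject anything that is not a hex digit up front. This also guarantees
    // the input is ASCII, so slicing by byte offset below cannot split a char,
    // and it rules out the leading '+' that from_str_radix would accept.
    if !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(format!("not a hex string: {s:?}"));
    }
    // Every byte is exactly two hex digits.
    if s.len() % 2 != 0 {
        return Err(format!("hex string has odd length {}", s.len()));
    }
    let bytes = (0..s.len())
        .step_by(2)
        .map(|i| {
            u8::from_str_radix(&s[i..i + 2], 16)
                .map_err(|e| format!("invalid hex at offset {i}: {e}"))
        })
        .collect::<Result<Vec<u8>, String>>()?;
    // For fixed-length types, the decoded length must match exactly.
    if let Some(n) = expected_len {
        if bytes.len() != n {
            return Err(format!("expected {n} bytes, got {}", bytes.len()));
        }
    }
    Ok(bytes)
}
```

With this sketch, decode_hex("00ff05", None) returns the bytes [0x00, 0xff, 0x05], while decode_hex("deadbeef", Some(3)) returns an error because the string decodes to 4 bytes. Returning an error rather than panicking keeps malformed JSON input recoverable for the caller.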

Willingness to contribute

I can contribute a fix for this bug independently
