Skip to content

Add protobuf support for lambdas#22362

Open
gstvg wants to merge 4 commits into
apache:mainfrom
gstvg:lambda_protobuf_new
Open

Add protobuf support for lambdas#22362
gstvg wants to merge 4 commits into
apache:mainfrom
gstvg:lambda_protobuf_new

Conversation

@gstvg
Copy link
Copy Markdown
Contributor

@gstvg gstvg commented May 19, 2026

Which issue does this PR close?

Part of #21172

Rationale for this change

Protobuf support wasn't implemented in main lambda PR to not make it even bigger

What changes are included in this PR?

Protobuf encoding and decoding (~1000 LOC in generated files, ~210 impl, ~400 tests)

Are these changes tested?

Unit tests, similar to the existing ones for scalar functions

Are there any user-facing changes?

None

@github-actions github-actions Bot added the proto Related to proto crate label May 19, 2026
return Err(Error::General(
"Proto serialization error: Lambda not implemented".to_string(),
));
Expr::HigherOrderFunction(HigherOrderFunction { func, args }) => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: instead of appending this, can we move this next to the rest of functions? (Scalar, Aggregate etc)

Comment on lines +434 to +436
HigherOrderUDFExprNode higher_order_udf_expr = 37;
Lambda lambda = 38;
LambdaVariable lambda_variable = 39;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit as well: Can we move this next to the other UDFs?

expr_id,
expr_type: Some(protobuf::physical_expr_node::ExprType::LambdaVariable(
PhysicalLambdaVariableExprNode {
index: var.index() as u32,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for now we don't use this so it is always 0 no?

Comment on lines +586 to +590
protobuf::PhysicalHigherOrderUdfNode {
name: expr.name().to_string(),
args: serialize_physical_exprs(expr.args(), codec, proto_converter)?,
fun_definition: (!buf.is_empty()).then_some(buf),
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we encode the return_field instead of resolving it from the schema when decoding (like ScalarUDF)?
Not really sure what would be the exact disadvantage of keeping it at is, but wondering if it would be cleaner to make the return type explicit.

let serialize_name = extract_function_name(&expr);
let deserialize_name = extract_function_name(&deserialize);

assert_eq!(serialize_name, deserialize_name);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we assert directly on expr and deserialize instead? since the Hofs implement PartialEq I think it should work
assert_eq!(expr, deserialized_expr);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

proto Related to proto crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants