GH-3277: Support passing write configurations to footer optionally #3278
ArnavBalyan wants to merge 1 commit into
ArnavBalyan commented on Aug 24, 2025
- Write configurations are often needed for debugging and for posterity, but application logs are typically lost within a few days.
- This change adds an optional flag which, when enabled, passes the write configurations to the file footer.
- The flag defaults to false; users can enable it to store this additional metadata in their Parquet files.
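
As a rough illustration of the behavior described above, here is a minimal sketch of flag-gated footer metadata. It is not the code from this PR: the class name `WriteConfigFooter`, the flag key `example.write.config.in.footer`, and the `writer.config.` prefix are all hypothetical, and the sketch uses plain `java.util` maps rather than any Parquet API.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: turns write configs into footer key-value pairs when a flag is set. */
public final class WriteConfigFooter {
    // Assumed names for illustration only; the PR description does not name the flag.
    public static final String FLAG = "example.write.config.in.footer";
    public static final String PREFIX = "writer.config.";

    private WriteConfigFooter() {}

    /**
     * Returns the extra footer metadata derived from the write configuration.
     * The result is empty unless the flag is present and set to "true".
     */
    public static Map<String, String> footerMetadata(Map<String, String> writeConf) {
        boolean enabled = Boolean.parseBoolean(writeConf.getOrDefault(FLAG, "false"));
        if (!enabled) {
            return Collections.emptyMap(); // default: nothing is added to the footer
        }
        // Sorted map so the emitted keys have a stable order.
        Map<String, String> out = new TreeMap<>();
        for (Map.Entry<String, String> e : writeConf.entrySet()) {
            if (!e.getKey().equals(FLAG)) {
                // Prefix each key to keep it apart from other footer metadata.
                out.put(PREFIX + e.getKey(), e.getValue());
            }
        }
        return Collections.unmodifiableMap(out);
    }
}
```

The returned map would then be handed to whatever mechanism the writer already uses to attach key-value metadata to the file footer.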
cc @shangxinli @wgtmac, could you please take a look? Thanks!
IMHO, this is merely custom logic that specific applications can handle well on their own. We don't want to take on the maintenance overhead.
Thanks for the review! This is behind a feature flag, and it is up to users to enable it. The logic is minimal and gives a high degree of clarity and debuggability to end users and applications, which then don't have to rewrite this logic themselves. Maybe we could keep it off by default and let users enable it on demand, wdyt? @wgtmac @shangxinli
Defaulting to off does not justify it as a valid feature of the Parquet library. If users want fine-grained control over which subset of configs is written, do we want to support that? If users have built a custom record writer on top of ParquetFileWriter (as Iceberg did), how would we know? How does ParquetRewriter handle conflicting configs when merging several Parquet files? To me this is pure application logic that users can handle well on their side. We don't want to pay for the complexity within the library.
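
To make the merge question above concrete, here is a small sketch of one possible policy for combining footer configs from several input files: keep a key only when every file that sets it uses the same value, and report the rest as conflicts. This is purely illustrative. `FooterConfigMerge` and its `Result` type are hypothetical, the sketch assumes non-null values, and nothing here reflects how ParquetRewriter actually treats key-value metadata.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical merge policy for per-file write configs stored as footer key-value pairs. */
public final class FooterConfigMerge {

    /** Keys whose values agree across the inputs that set them, plus the keys that conflict. */
    public static final class Result {
        public final Map<String, String> agreed;
        public final Set<String> conflicting;

        Result(Map<String, String> agreed, Set<String> conflicting) {
            this.agreed = agreed;
            this.conflicting = conflicting;
        }
    }

    private FooterConfigMerge() {}

    /** Merges the given per-file config maps; values are assumed to be non-null. */
    public static Result merge(List<Map<String, String>> footers) {
        Map<String, String> agreed = new TreeMap<>();
        Set<String> conflicting = new TreeSet<>();
        for (Map<String, String> footer : footers) {
            for (Map.Entry<String, String> e : footer.entrySet()) {
                String key = e.getKey();
                if (conflicting.contains(key)) {
                    continue; // already known to differ between files
                }
                String previous = agreed.putIfAbsent(key, e.getValue());
                if (previous != null && !previous.equals(e.getValue())) {
                    // Two files disagree: drop the key from the merged footer.
                    agreed.remove(key);
                    conflicting.add(key);
                }
            }
        }
        return new Result(agreed, conflicting);
    }
}
```

Even this simple policy forces a decision about what the merged file should claim for a conflicting key, which is the kind of complexity the comment above is pointing at.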
Sure, sounds good! I will close this PR. I think some of the above should be easy to solve, but it definitely requires more discussion 👍
Closing this PR as suggested |