GH-3282: Add encryption info CLI support for Parquet file encryption metadata#3281

Closed
ArnavBalyan wants to merge 7 commits into
apache:masterfrom
ArnavBalyan:cli-encryption

Conversation

@ArnavBalyan

@ArnavBalyan ArnavBalyan commented Aug 25, 2025

Member
  • Since Parquet 1.12, encryption has been a first-class feature, with support for footer-level and column-level encryption.
  • However, users have no clear way to inspect a file's encryption metadata or mode, or to tell whether the footer or the file is encrypted.
  • This PR adds a simple, dedicated CLI command: parquet-cli encryption-info <file>
  • The command reports the following:
    • File-level encryption type: PLAINTEXT_FOOTER or ENCRYPTED_FOOTER.
    • A summary of column encryption, with per-column details and each column's encryption status.
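
As a rough illustration of the report described above, the following self-contained sketch renders a file-level mode, a column summary, and one line per column. It does not use the parquet-java API: `EncryptionInfoSketch`, `FooterMode`, `ColumnStatus`, and the output layout are all hypothetical names and choices made for this example, not the PR's actual implementation.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; none of these types come from parquet-cli or parquet-java. */
final class EncryptionInfoSketch {

  /** Hypothetical footer modes, named after the two values in the PR description. */
  enum FooterMode { PLAINTEXT_FOOTER, ENCRYPTED_FOOTER }

  /** Hypothetical per-column view: a column path and whether its data is encrypted. */
  static final class ColumnStatus {
    final String path;
    final boolean encrypted;

    ColumnStatus(String path, boolean encrypted) {
      this.path = path;
      this.encrypted = encrypted;
    }
  }

  /** Renders the file-level mode, a column summary, and one line per column. */
  static String render(FooterMode mode, List<ColumnStatus> columns) {
    int encryptedCount = 0;
    for (ColumnStatus column : columns) {
      if (column.encrypted) {
        encryptedCount++;
      }
    }
    StringBuilder out = new StringBuilder();
    out.append("Encryption type: ").append(mode).append('\n');
    out.append("Encrypted columns: ").append(encryptedCount)
        .append(" of ").append(columns.size()).append('\n');
    for (ColumnStatus column : columns) {
      out.append("  ").append(column.path).append(": ")
          .append(column.encrypted ? "ENCRYPTED" : "PLAINTEXT").append('\n');
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // Made-up column names, used only to exercise the formatter.
    List<ColumnStatus> columns = Arrays.asList(
        new ColumnStatus("id", false),
        new ColumnStatus("ssn", true));
    System.out.print(render(FooterMode.PLAINTEXT_FOOTER, columns));
  }
}
```

In a real command, the mode and per-column flags would come from the file footer rather than from hand-built objects.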

@ArnavBalyan ArnavBalyan changed the title Add encryption-info CLI support for Parquet file encryption metadata GH-3282: Add encryption info CLI support for Parquet file encryption metadata Aug 25, 2025
@ArnavBalyan

Member Author

cc @shangxinli @gszadovszky, could you please take a look? Thanks!

ParquetMetadata footer =
ParquetFileReader.readFooter(getConf(), qualifiedPath(source), ParquetMetadataConverter.NO_FILTER);

FileMetaData meta = footer.getFileMetaData();

Contributor

I think it would be nice to also print out details about the encryption algorithm, wouldn't it?

Member Author

Yeah, that's a great point. I'll add support for that.

@wgtmac

wgtmac commented Aug 28, 2025

Member

cc @ggershinsky @shangxinli as experts on encryption

@ggershinsky

Contributor

Some other details worth printing -

  • is a column encrypted with the footer key or with a column-specific key?
  • if all columns are encrypted with the footer key, then the file is in "uniform encryption" mode; can print this (so the user knows one key only is used in a file and can open every column)
  • explicit info on the footer encryption mode - encrypted or plaintext
  • optional (via a flag) printing of the key metadata of the footer key and (if available) of the column keys - can be useful for debugging key retrieval. This is binary, but maybe something similar to "hexdump -C" can be performed where some effort is made to find/print ASCII text chunks (often, key metadata has text/json parts)
  • advanced debugging: print the AAD-related fields
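
The "hexdump -C"-style output suggested for key metadata could look roughly like the sketch below: an offset column, sixteen hex bytes per row, and a gutter that shows printable ASCII and replaces everything else with a dot, so text or JSON fragments stay readable. `KeyMetadataHexDump` and its `format` method are hypothetical names for this example; nothing here is taken from parquet-cli, and the sample bytes are made up.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of parquet-cli: formats bytes similarly to `hexdump -C`. */
final class KeyMetadataHexDump {
  private static final int BYTES_PER_LINE = 16;

  /** Returns one row per 16 bytes: offset, hex bytes, then an ASCII gutter. */
  static String format(byte[] data) {
    StringBuilder out = new StringBuilder();
    for (int offset = 0; offset < data.length; offset += BYTES_PER_LINE) {
      int end = Math.min(offset + BYTES_PER_LINE, data.length);
      out.append(String.format("%08x  ", offset));
      StringBuilder ascii = new StringBuilder();
      for (int i = offset; i < offset + BYTES_PER_LINE; i++) {
        if (i < end) {
          int b = data[i] & 0xff; // treat the byte as unsigned
          out.append(String.format("%02x ", b));
          // Printable ASCII is shown as-is; anything else becomes '.'.
          ascii.append(b >= 0x20 && b < 0x7f ? (char) b : '.');
        } else {
          out.append("   "); // pad a short final row so the gutter lines up
        }
        if (i - offset == 7) {
          out.append(' '); // extra gap between the two groups of eight bytes
        }
      }
      out.append(" |").append(ascii).append("|\n");
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // Made-up key metadata with a JSON-like text part, for illustration only.
    byte[] sample = "{\"key\":\"k1\"}".getBytes(StandardCharsets.US_ASCII);
    System.out.print(format(sample));
  }
}
```

Because key metadata is opaque to Parquet itself, a dump like this makes no attempt to parse the bytes; it only makes any embedded text easy to spot.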

@ArnavBalyan

Member Author

Some other details worth printing -

  • is a column encrypted with the footer key or with a column-specific key?
  • if all columns are encrypted with the footer key, then the file is in "uniform encryption" mode; can print this (so the user knows one key only is used in a file and can open every column)
  • explicit info on the footer encryption mode - encrypted or plaintext
  • optional (via a flag) printing of the key metadata of the footer key and (if available) of the column keys - can be useful for debugging key retrieval. This is binary, but maybe something similar to "hexdump -C" can be performed where some effort is made to find/print ASCII text chunks (often, key metadata has text/json parts)
  • advanced debugging: print the AAD-related fields

Thanks, this is great feedback. I'll iterate and update this shortly.
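
The first two quoted suggestions (footer key versus column-specific key, and detecting "uniform encryption") reduce to a small check, sketched here under stated assumptions. `UniformEncryptionCheck` and `KeyKind` are hypothetical names; the sketch assumes each column has already been classified, and it treats a file with no columns as not uniform, which is a choice made for this example.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; these types are not part of parquet-cli or parquet-java. */
final class UniformEncryptionCheck {

  /** Hypothetical classification of the key protecting a column. */
  enum KeyKind { NONE, FOOTER_KEY, COLUMN_KEY }

  /**
   * Returns true only if every column is encrypted with the footer key,
   * which is the "uniform encryption" condition described in the review comment.
   */
  static boolean isUniform(List<KeyKind> columnKeys) {
    if (columnKeys.isEmpty()) {
      return false; // assumption: nothing to report for a file without columns
    }
    for (KeyKind kind : columnKeys) {
      if (kind != KeyKind.FOOTER_KEY) {
        return false; // a plaintext column or a column-specific key breaks uniformity
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<KeyKind> keys = Arrays.asList(KeyKind.FOOTER_KEY, KeyKind.COLUMN_KEY);
    System.out.println("uniform encryption: " + isUniform(keys));
  }
}
```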

@github-actions


This pull request has been automatically marked as stale because it has had no activity for at least 2 months. If you are still working on this change or plan to move it forward, please leave a comment or push a new commit so we know to keep it open. Otherwise, this PR will be closed automatically in about one month. Thank you for your contribution to Apache Parquet!

@github-actions github-actions Bot added the stale label Apr 23, 2026
@github-actions


Closing this pull request due to at least 3 months of inactivity. If you would like to continue the work, please feel free to reopen this pull request or open a new one. Thank you for your contribution to Apache Parquet!

@github-actions github-actions Bot closed this May 23, 2026
4 participants