
GH-2914: Add documentation for the Java library#3275

Closed
ArnavBalyan wants to merge 4 commits into apache:master from ArnavBalyan:arnavb/add-exmaples

@ArnavBalyan

Member Author

cc @wgtmac @shangxinli Could you please review? Thanks!

@wgtmac wgtmac left a comment

Member

Sorry for approving it by accident. I need to leave a comment to retract it.

private static void writeSalesData(String filename, MessageType schema) throws IOException {
    Path file = new Path(filename);

    try (ParquetWriter<Group> writer = ExampleParquetWriter.builder(file)

@wgtmac wgtmac Aug 23, 2025

Member

I am hesitant to use ExampleParquetWriter in examples, since it is not intended for production use. Adding an example module also increases the maintenance burden, so I don't think this is a good idea, TBH.

Member Author

I see. We can remove the sub-module and keep this as a reference-only example. That should also resolve the documentation concerns raised in the issue, wdyt?

Member Author

Removed the dependency on ExampleParquetWriter and removed the pom to eliminate the maintenance overhead.

@ArnavBalyan

Member Author

Gentle reminder, cc @wgtmac @ggershinsky. Thanks!

@wgtmac

wgtmac commented Aug 28, 2025

Member

TBH, I don't think adding some random examples would really help users, because they are pretty similar to what's already in the unit tests. What I have in mind is something like https://arrow.apache.org/cookbook/, which requires a lot of effort to craft examples and keep them in sync. Today, LLMs are smart enough to produce code like this (I believe this PR is doing exactly that, right?).

@ArnavBalyan

Member Author

Thanks, a cookbook is a great idea. I would like to implement it for Parquet Java; let me add it in a separate change. I came up with the examples in this PR to help beginners understand basic usage; I faced issues myself a while back when onboarding to Parquet.
I think the change should be harmless and can only help users with more guidance when onboarding to the project, wdyt? (The examples are structured in 3 stages, from basic to advanced usage, to address the issues raised in #2914.)
Thanks so much for the review!

@ArnavBalyan

ArnavBalyan commented Aug 28, 2025

Member Author

cc @wgtmac @Fokko @gszadovszky @shangxinli I just wanted to get a sense of the community's thoughts on a cookbook as a follow-up to this PR. I think better documentation for Parquet will help users adopt the project faster and would in general be a good addition to the ecosystem. If you are open to this, I'd like to add it and maintain it in the future. Thanks for the suggestion, @wgtmac.

@ArnavBalyan

Member Author

I have created an issue to track this: #3284. It would be great if folks could review it and add suggestions or feedback. Thanks!

@ArnavBalyan ArnavBalyan requested a review from wgtmac August 28, 2025 13:26
@github-actions


This pull request has been automatically marked as stale because it has had no activity for at least 2 months. If you are still working on this change or plan to move it forward, please leave a comment or push a new commit so we know to keep it open. Otherwise, this PR will be closed automatically in about one month. Thank you for your contribution to Apache Parquet!

@github-actions github-actions Bot added the stale label Apr 23, 2026
@github-actions


Closing this pull request due to at least 3 months of inactivity. If you would like to continue the work, please feel free to reopen this pull request or open a new one. Thank you for your contribution to Apache Parquet!

@github-actions github-actions Bot closed this May 23, 2026