Skip to content

test: add count-min C++ serialization compatibility tests#92

Merged
tisonkun merged 2 commits into
apache:mainfrom
PsiACE:countmin-xlang-serde-tests
Jun 25, 2026
Merged

test: add count-min C++ serialization compatibility tests#92
tisonkun merged 2 commits into
apache:mainfrom
PsiACE:countmin-xlang-serde-tests

Conversation

@PsiACE

@PsiACE PsiACE commented Feb 13, 2026

Copy link
Copy Markdown
Member

No description provided.

@tisonkun

This comment was marked as outdated.

@tisonkun

Copy link
Copy Markdown
Member

Failed on Windows. Perhaps related to line separator?

@tisonkun

Copy link
Copy Markdown
Member

Looks like when generating the cpp snapshot, it prints CRLF on windows.

Should work with their people like @proost

@leerho

leerho commented Jun 18, 2026

Copy link
Copy Markdown
Member

WRT Windows and CRLF issues. I noticed that there is no .gitattributes file in ds-cpp. I ran into similar issues recently when refactoring ds-memory. It turns out all the major modern OSs can correctly process text files with just LF "\n" line endings. So structuring the .gitattributes file to force LF endings for all text type files fixed the problem and it works for Ubuntu, MacOS, and Windows. See this .gitattributes.

@tisonkun

Copy link
Copy Markdown
Member

@PsiACE the corresponding cpp patch has been merged.

You may rebase this patch and pin the cpp checkout to the latest commit and retry.

@PsiACE PsiACE force-pushed the countmin-xlang-serde-tests branch from cb2c7f2 to 837bda2 Compare June 24, 2026 18:18
@PsiACE

PsiACE commented Jun 24, 2026

Copy link
Copy Markdown
Member Author

Rebased this branch onto current origin/main and pinned the C++ serialization data generator to apache/datasketches-cpp@0bab259, which includes the merged .gitattributes update.

@PsiACE PsiACE force-pushed the countmin-xlang-serde-tests branch from 837bda2 to bf64c87 Compare June 24, 2026 18:22
@PsiACE PsiACE changed the title tests: add count-min C++ serialization compatibility tests test: add count-min C++ serialization compatibility tests Jun 24, 2026
@PsiACE PsiACE force-pushed the countmin-xlang-serde-tests branch from bf64c87 to 76ffa56 Compare June 24, 2026 18:25
@tisonkun

Copy link
Copy Markdown
Member

Seems C++ snapshot generation issue, not gitattribute.

Let me prepare a fix.

@PsiACE

PsiACE commented Jun 24, 2026

Copy link
Copy Markdown
Member Author

Seems C++ snapshot generation issue, not gitattribute.

Let me prepare a fix.

*.bin -> *.sk , and with std::ios::binary , emmmm, it looks like this should fix the problem.

@tisonkun

Copy link
Copy Markdown
Member

See apache/datasketches-cpp#505.

@tisonkun

Copy link
Copy Markdown
Member

You may test with checkout to this patch's commit so we use it as an e2e test.

Comment thread tools/generate_serialization_test_data.py Outdated
Comment on lines 137 to +150
@@ -142,6 +146,8 @@ def generate_cpp_files(workspace_dir, project_root):
repo_url,
str(temp_dir)
])
run_command(["git", "fetch", "--depth", "1", "origin", fetch_ref], cwd=temp_dir)
run_command(["git", "checkout", "--detach", commit], cwd=temp_dir)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some temp code here; needs to be cleaned once PR505 merged into c++ impl

@tisonkun tisonkun marked this pull request as ready for review June 25, 2026 03:26

@tisonkun tisonkun left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - merging now.

We can drop python scripts trick once datasketches-cpp updated.

@tisonkun tisonkun merged commit b941327 into apache:main Jun 25, 2026
10 checks passed
@PsiACE PsiACE deleted the countmin-xlang-serde-tests branch June 25, 2026 03:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants