
Fix: Resolved deterministic memory leak and dangling pointer in SQLParser::tokenize#262

Open
RageLiu wants to merge 7 commits into hyrise:main from RageLiu:fix/memleak-and-dangling-pointer

RageLiu commented Mar 31, 2026

Problem

The current implementation of the SQLParser::tokenize loop contains a logic error regarding memory management:

  1. Overwrite Loss (Memory Leak): The loop retrieves the next token immediately after entering the while block. This causes the pointer to the first token (if it is a SQL_IDENTIFIER or SQL_STRING) to be overwritten and lost before it can be checked or freed.
  2. Dangling Pointer: After calling free(yylval.sval), the pointer is not set to nullptr. This stale address remains in the reused yylval structure, leading to potential Double-Free or Use-After-Free (UAF) risks in subsequent lexer calls.

Solution

This PR applies a minimal-change fix by reordering the operations within the tokenize loop:

  1. Reordered Execution: The hsql_lex call is moved to the end of the loop. This ensures that the current token (including the first one) is fully processed and its memory is safely released before the next token is fetched.
  2. Pointer Nullification: Added yylval.sval = nullptr; immediately after free() to eliminate dangling pointers.
  3. Preserved Structure: Maintained the original while (token != 0) structure to keep the diff as clean as possible.
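
To illustrate the reordering, here is a minimal, self-contained sketch. It does not use the real lexer: `MockLexer`, `MockValue`, `MockInput`, the `k...` token codes, and both function names are hypothetical stand-ins invented for this example, and the "fetch first" loop is reconstructed from the Problem description rather than copied from the source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token codes; the real parser uses generated constants.
constexpr int kEof = 0;
constexpr int kIdentifier = 1;
constexpr int kString = 2;
constexpr int kOther = 3;

// Hypothetical stand-in for the lexer's semantic value.
union MockValue {
  char* sval;
  long ival;
};

using MockInput = std::vector<std::pair<int, std::string>>;

// Mock lexer: hands out a heap copy of the text for identifier and string
// tokens and remembers which copies have not been released yet.
class MockLexer {
 public:
  explicit MockLexer(MockInput input) : input_(std::move(input)) {}
  MockLexer(const MockLexer&) = delete;
  MockLexer& operator=(const MockLexer&) = delete;
  ~MockLexer() {
    // Free whatever the caller forgot, so the demo itself does not leak.
    for (char* text : outstanding_) std::free(text);
  }

  int lex(MockValue* out) {
    if (pos_ >= input_.size()) return kEof;
    const std::pair<int, std::string>& entry = input_[pos_++];
    if (entry.first == kIdentifier || entry.first == kString) {
      const std::size_t size = entry.second.size() + 1;
      out->sval = static_cast<char*>(std::malloc(size));
      assert(out->sval != nullptr);
      std::memcpy(out->sval, entry.second.c_str(), size);
      outstanding_.push_back(out->sval);
    }
    return entry.first;
  }

  void release(char* text) {
    auto it = std::find(outstanding_.begin(), outstanding_.end(), text);
    assert(it != outstanding_.end());
    outstanding_.erase(it);
    std::free(text);
  }

  std::size_t unreleased() const { return outstanding_.size(); }

 private:
  MockInput input_;
  std::size_t pos_ = 0;
  std::vector<char*> outstanding_;
};

// Order described under Problem: the next token is fetched before the
// current one is checked, so a first identifier/string is never released.
std::size_t tokenize_fetch_first(MockLexer& lexer, std::vector<int>* tokens) {
  MockValue value{};
  int token = lexer.lex(&value);
  while (token != kEof) {
    tokens->push_back(token);
    token = lexer.lex(&value);  // may overwrite value.sval
    if (token == kIdentifier || token == kString) lexer.release(value.sval);
  }
  return lexer.unreleased();
}

// Order described under Solution: release the current token's string,
// clear the pointer, and only then fetch the next token.
std::size_t tokenize_process_first(MockLexer& lexer, std::vector<int>* tokens) {
  MockValue value{};
  int token = lexer.lex(&value);
  while (token != kEof) {
    tokens->push_back(token);
    if (token == kIdentifier || token == kString) {
      lexer.release(value.sval);
      value.sval = nullptr;
    }
    token = lexer.lex(&value);
  }
  return lexer.unreleased();
}
```

For a mock input of two identifiers followed by one other token, `tokenize_fetch_first` returns 1 (the first identifier's string is never released) and `tokenize_process_first` returns 0, while both collect the same three token codes.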

Verification

LeakSanitizer (LSan): Confirmed that the previously detected 11-byte leak per SQL statement is now fully resolved.
AddressSanitizer (ASan): No memory corruption or illegal access detected during stress testing with consecutive identifier tokens.


    if (token == SQL_IDENTIFIER || token == SQL_STRING) {
      free(yylval.sval);
      yylval.sval = nullptr;

Collaborator

Do we need to care about the dangling pointer when we overwrite sval anyways in hsql_lex?

Member

Can you also add a test that would fail with sanitizers and without the patch?

Author

> Do we need to care about the dangling pointer when we overwrite sval anyways in hsql_lex?

That’s a fair point. hsql_lex does overwrite yylval for strings and identifiers, but explicitly nullifying the pointer is still worthwhile, for two reasons. First, yylval is a union, and when the lexer returns a token that carries no string value, such as a semicolon or an operator, it typically does not write to sval, so the stale, freed address stays in yylval. Second, because the same yylval is reused throughout the loop, that dangling pointer risks a double free if later logic or a future code change releases sval again while it still holds the old address.
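
As a small illustration of why clearing the pointer helps, the sketch below frees a string and resets the pointer in one helper. `SemanticValue`, `release_sval`, and `double_release_is_safe` are hypothetical names made up for this example and are not part of the parser. It relies only on the standard guarantee that `std::free(nullptr)` does nothing.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the lexer's semantic value.
union SemanticValue {
  char* sval;
  long ival;
};

// Frees the string held in `value` and clears the pointer, so the reused
// value no longer carries a stale address.
void release_sval(SemanticValue* value) {
  std::free(value->sval);  // std::free(nullptr) is a no-op
  value->sval = nullptr;
}

// Releases the same value twice in a row and reports whether the pointer
// ended up null. Without the reset, the second call would be a double free.
bool double_release_is_safe() {
  SemanticValue value{};
  value.sval = static_cast<char*>(std::malloc(4));
  assert(value.sval != nullptr);
  std::memcpy(value.sval, "abc", 4);
  release_sval(&value);
  release_sval(&value);  // sees nullptr and does nothing
  return value.sval == nullptr;
}
```

The reset costs one store per freed token and turns an accidental second release into a harmless call.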

Author

> Can you also add a test that would fail with sanitizers and without the patch?

Done! I have added the regression test to test/sql_parser.cpp.

The test uses a sequence of consecutive identifiers to ensure that the memory is correctly managed and that yylval.sval is properly nullified after being freed. I have verified locally that this test fails with a LeakSanitizer error without my patch and passes successfully with the fix applied.

Please let me know if there are any other adjustments needed!

Member

Thank you! I noticed we do not run sanitizer builds in the CI. To get such warnings automatically and to verify that the PR works as intended, would you please add sanitizer builds with clang (on Ubuntu and macOS) to the CI workflow that run the tests?

Collaborator

Bouncner left a comment

Thanks for the pull request!


    steps:
      - name: Checkout
        uses: actions/checkout@v4

Collaborator

Not yours, but can you please update the action to version 6? There are several deprecation warnings.

Author

> Not yours, but can you please update the action to version 6? There are several deprecation warnings.

Sure! I've updated actions/checkout to v6 and temporarily removed the fix as requested to verify the sanitizer. I will restore the fix once we see the CI failing.

Author

> Not yours, but can you please update the action to version 6? There are several deprecation warnings.

My apologies: the previous CI run failed due to a missing newline at the end of src/SQLParser.cpp, which triggered a compiler error (-Wnewline-eof).

I have fixed the formatting while keeping the logic fix removed as you requested. Could you please approve the workflow run again? This should now correctly show the Sanitizer findings.


}

token = hsql_lex(&yylval, &yylloc, scanner);

Collaborator

The last sanitizer run was all good. Can you, just temporarily, remove your fix to check whether the leak is correctly caught by the sanitizer?

Author

> The last sanitizer run was all good. Can you, just temporarily, remove your fix to check whether the leak is correctly caught by the sanitizer?

Done! I have temporarily removed the fix as requested to verify the sanitizer. The workflow is now awaiting approval to run. Please approve the CI whenever you're ready.

Author

> The last sanitizer run was all good. Can you, just temporarily, remove your fix to check whether the leak is correctly caught by the sanitizer?

The leaks were correctly caught by the Ubuntu Sanitizers/Valgrind (as expected, since LSan is more robust on Linux), confirming the bug's presence.

Author

RageLiu commented Apr 1, 2026

The experiment was a success! As shown in the latest CI run, the regression test correctly triggered a memory leak detection (9 bytes definitely lost) in all Ubuntu environments when the fix was removed.

This confirms that the CI and the new test cases are effectively guarding against this bug. I have now re-applied the fix.


3 participants