Fix: Resolved deterministic memory leak and dangling pointer in SQLParser::tokenize by RageLiu · Pull Request #262 · hyrise/sql-parser

RageLiu · 2026-03-31T02:30:10Z

Problem

The current implementation of the SQLParser::tokenize loop contains two memory-management errors:

Overwrite Loss (Memory Leak): The loop retrieves the next token immediately after entering the while block. This causes the pointer to the first token (if it is a SQL_IDENTIFIER or SQL_STRING) to be overwritten and lost before it can be checked or freed.
Dangling Pointer: After calling free(yylval.sval), the pointer is not set to nullptr. This stale address remains in the reused yylval structure, leading to potential Double-Free or Use-After-Free (UAF) risks in subsequent lexer calls.

Solution

This PR applies a minimal-change fix by reordering the operations within the tokenize loop:

Reordered Execution: The hsql_lex call is moved to the end of the loop. This ensures that the current token (including the first one) is fully processed and its memory is safely released before the next token is fetched.
Pointer Nullification: Added yylval.sval = nullptr; immediately after free() to eliminate dangling pointers.
Preserved Structure: Maintained the original while (token != 0) structure to keep the diff as clean as possible.
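
To illustrate the reordering, here is a minimal, self-contained sketch that runs both loop orders against a fake lexer and counts the strings that were never released. It is not the hsql code: `FakeLexer`, `SketchToken`, `SketchValue`, `LoopResult`, and `run_loop` are hypothetical names, and the position of `push_back` is an assumption, since the description only states where the lexer call sits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the generated lexer types; not the hsql API.
enum SketchToken { SKETCH_END = 0, SKETCH_IDENTIFIER = 1, SKETCH_OTHER = 2 };

union SketchValue {
  char* sval;
  long ival;
};

// Fake lexer: heap-allocates a copy of every identifier it returns and
// remembers which copies the caller has not released yet.
class FakeLexer {
 public:
  explicit FakeLexer(std::vector<std::string> words) : words_(std::move(words)) {}
  ~FakeLexer() {
    for (char* p : live_) std::free(p);  // keeps the sketch itself leak-free
  }
  FakeLexer(const FakeLexer&) = delete;
  FakeLexer& operator=(const FakeLexer&) = delete;

  int next(SketchValue* out) {
    if (pos_ == words_.size()) return SKETCH_END;
    const std::string& word = words_[pos_++];
    if (word == ";") return SKETCH_OTHER;  // leaves *out untouched
    char* copy = static_cast<char*>(std::malloc(word.size() + 1));
    assert(copy != nullptr);
    std::memcpy(copy, word.c_str(), word.size() + 1);
    live_.push_back(copy);
    out->sval = copy;
    return SKETCH_IDENTIFIER;
  }

  // Caller-side release, mirroring free(yylval.sval); yylval.sval = nullptr;
  void release(SketchValue* value) {
    auto it = std::find(live_.begin(), live_.end(), value->sval);
    assert(it != live_.end());
    live_.erase(it);
    std::free(value->sval);
    value->sval = nullptr;
  }

  std::size_t unreleased() const { return live_.size(); }

 private:
  std::vector<std::string> words_;
  std::size_t pos_ = 0;
  std::vector<char*> live_;
};

struct LoopResult {
  std::vector<int> tokens;
  std::size_t unreleased;
};

// lex_at_end == false: fetch the next token before releasing (old order).
// lex_at_end == true:  release the current token, then fetch (new order).
LoopResult run_loop(const std::vector<std::string>& words, bool lex_at_end) {
  FakeLexer lexer(words);
  SketchValue value;
  value.sval = nullptr;
  LoopResult result{};
  int token = lexer.next(&value);
  while (token != SKETCH_END) {
    result.tokens.push_back(token);
    if (!lex_at_end) token = lexer.next(&value);
    if (token == SKETCH_IDENTIFIER) lexer.release(&value);
    if (lex_at_end) token = lexer.next(&value);
  }
  result.unreleased = lexer.unreleased();
  return result;
}
```

With the input `{"a", "b", ";"}`, `run_loop(words, false)` should leave one string unreleased (the first identifier), while `run_loop(words, true)` should leave none. Both orders collect the same token sequence.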

Verification

LeakSanitizer (LSan): Confirmed that the previously detected 11-byte leak per SQL statement is now fully resolved.
AddressSanitizer (ASan): No memory corruption or illegal access detected during stress testing with consecutive identifier tokens.

Bouncner · 2026-03-31T07:35:37Z

src/SQLParser.cpp

+
    if (token == SQL_IDENTIFIER || token == SQL_STRING) {
      free(yylval.sval);
+      yylval.sval = nullptr;


Do we need to care about the dangling pointer when we overwrite sval anyways in hsql_lex?

Can you also add a test that would fail with sanitizers and without the patch?

Do we need to care about the dangling pointer when we overwrite sval anyways in hsql_lex?

That’s a fair point. hsql_lex does overwrite yylval.sval when it returns a string or an identifier, but explicitly nullifying the pointer remains important for two reasons. First, yylval is a union, and the sval member is typically not modified when the lexer returns a token that carries no string value, such as a semicolon or an operator, so the stale, freed address stays in yylval. Second, because yylval is reused throughout the loop, that dangling pointer creates a risk of a double free if subsequent logic or a future code change releases sval again while it still holds the old address.
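
A small sketch of that argument, under stated assumptions: `demo_lex`, `DemoValue`, `DemoToken`, `release_string`, and `sval_is_null_after_semicolon` are hypothetical stand-ins, not the generated hsql lexer. It models a lexer that writes `sval` only for identifiers, and relies on the fact that `free(nullptr)` is a no-op, so a release that also nulls the pointer can safely run a second time.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-ins; the real types come from the generated parser.
enum DemoToken { DEMO_END = 0, DEMO_IDENTIFIER = 1, DEMO_SEMICOLON = 2 };

union DemoValue {
  char* sval;
  long ival;
};

// Lexes a single word. Only identifiers write to out->sval; a semicolon
// returns without touching *out, so whatever was stored there stays.
int demo_lex(const char* word, DemoValue* out) {
  if (std::strcmp(word, ";") == 0) return DEMO_SEMICOLON;
  const std::size_t size = std::strlen(word) + 1;
  char* copy = static_cast<char*>(std::malloc(size));
  if (copy == nullptr) std::abort();
  std::memcpy(copy, word, size);
  out->sval = copy;
  return DEMO_IDENTIFIER;
}

// Release that also clears the pointer. Calling it again is harmless,
// because std::free(nullptr) does nothing.
void release_string(DemoValue* value) {
  std::free(value->sval);
  value->sval = nullptr;
}

// Lexes an identifier, releases it, then lexes a semicolon and releases
// again. The second release would be a double free without the nulling.
bool sval_is_null_after_semicolon() {
  DemoValue value;
  value.sval = nullptr;
  int token = demo_lex("foo", &value);
  if (token == DEMO_IDENTIFIER) release_string(&value);
  token = demo_lex(";", &value);  // does not write to value
  release_string(&value);         // safe only because sval is nullptr here
  return token == DEMO_SEMICOLON && value.sval == nullptr;
}
```

The second `release_string` call stands in for the "future code change" case. If `release_string` omitted the assignment to `nullptr`, that call would free the same address twice.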

Can you also add a test that would fail with sanitizers and without the patch?

Done! I have added the regression test to test/sql_parser.cpp.

The test uses a sequence of consecutive identifiers to ensure that the memory is correctly managed and that yylval.sval is properly nullified after being freed. I have verified locally that this test fails with a LeakSanitizer error without my patch and passes successfully with the fix applied.

Please let me know if there are any other adjustments needed!

Thank you! I noticed we do not run sanitizer builds in the CI. To get such warnings automatically and to verify that the PR works as intended, could you please add sanitizer builds with clang (on Ubuntu and macOS) to the CI workflow that run the tests?

Bouncner

Thanks for the pull request!

Bouncner · 2026-03-31T15:25:30Z

.github/workflows/ci.yml

+
    steps:
      - name: Checkout
        uses: actions/checkout@v4


Not yours, but can you please update the action to version 6? There are several deprecation warnings.

Not yours, but can you please update the action to version 6? There are several deprecation warnings.

Sure! I've updated actions/checkout to v6 and temporarily removed the fix as requested to verify the sanitizer. I will restore the fix once we see the CI failing.

Not yours, but can you please update the action to version 6? There are several deprecation warnings.

My apologies: the previous CI run failed due to a missing newline at the end of src/SQLParser.cpp, which triggered a compiler error (-Wnewline-eof).

I have fixed the formatting while keeping the logic fix removed as you requested. Could you please approve the workflow run again? This should now correctly show the Sanitizer findings.

Bouncner · 2026-03-31T15:26:13Z

src/SQLParser.cpp

+
    }
+
+    token = hsql_lex(&yylval, &yylloc, scanner);


The last sanitizer run was all good. Can you, just temporarily, remove your fix to check whether it is correctly caught by the sanitizer?

The last sanitizer run was all good. Can you, just temporarily, remove your fix to check whether it is correctly caught by the sanitizer?

Done! I have temporarily removed the fix as requested to verify the sanitizer. The workflow is now awaiting approval to run. Please approve the CI whenever you're ready.

The last sanitizer run was all good. Can you, just temporarily, remove your fix to check whether it is correctly caught by the sanitizer?

The leaks were correctly caught by the Ubuntu Sanitizers/Valgrind (as expected, since LSan is more robust on Linux), confirming the bug's presence.

RageLiu · 2026-04-01T07:35:27Z

The experiment was a success! As shown in the latest CI run, the regression test correctly triggered a memory leak detection (9 bytes definitely lost) in all Ubuntu environments when the fix was removed.

This confirms that the CI and the new test cases are effectively guarding against this bug. I have now re-applied the fix.

Fix deterministic memory leak and dangling pointer in SQLParser::toke…

dd347ac

…nize

RageLiu mentioned this pull request Mar 31, 2026

[Security] Deterministic Memory Leak and Dangling Pointer in SQLParser::tokenize #261

Open

Bouncner reviewed Mar 31, 2026

RageLiu added 3 commits March 31, 2026 16:20

Add regression test for tokenize memory leak

2b254f3

update CI with Clang sanitizer builds

571dc22

Fix CI OS detection

14c5621

Bouncner reviewed Mar 31, 2026

RageLiu added 3 commits April 1, 2026 09:22

Temporarily remove fix to verify sanitizer failure

23e894d

Fix newline EOF and keep fix removed for verification

20603d2

Re-apply the fix: Tests now pass and memory leaks are resolved

79f3c33
