Skip to content

SQLite-Based Binary Logging #10633

Open
alex-maiorov wants to merge 6 commits into
The-OpenROAD-Project:masterfrom
alex-maiorov:binary_logging_signed
Open

SQLite-Based Binary Logging #10633
alex-maiorov wants to merge 6 commits into
The-OpenROAD-Project:masterfrom
alex-maiorov:binary_logging_signed

Conversation

@alex-maiorov

Copy link
Copy Markdown

Summary

I added a way to dump large amounts of data from various components into an SQLite database. The logger works by calling a logging function analogous to the existing logger functions, and then the data is processed by a background thread for actual writing. The actual database has separate tables for each numeric log(logToDb or logToDbBulk), and those methods are designed for low latency. There is also logToDbMetadata, which is a simple and non-optimized text to text key-value logging mechanism that goes into a single table.

Type of Change

New Feature: SQLite Binary Logging.

Impact

If turned on and called, qualitatively I have observed the performance impact to be fairly minimal for what I was doing. Given the use-case of large dumps, I would like suggestions for writing reasonable performance and correctness tests.

At runtime, this is an entirely opt-in feature. However, this does still require sqlite3 as a compile time dependency, though SQLite is almost universally available across different platforms. I also modified find_messages.py so that it enforces the uniqueness of the tool id/message id for those methods.

Note on Incompleteness

I am aware that this code is not quite ready for merging. I originally wrote it as a debugging and data analysis tool on the side for a research project I was working on, however it ended up being very useful so I would like to get it to a merge-ready state eventually.

Verification

  • I have verified that the local build succeeds (./etc/Build.sh).
  • I have run the relevant tests and they pass.
  • My code follows the repository's formatting guidelines.
  • I have included tests to prevent regressions.
  • I have signed my commits (DCO).

Related Issues

None that I know of, but @povik was interested in this functionality.

Signed-off-by: Alex Maiorov <alexander.mmaiorov@gmail.com>
Signed-off-by: Alex Maiorov <alexander.mmaiorov@gmail.com>
… ones

Signed-off-by: Alex Maiorov <alexander.mmaiorov@gmail.com>
Signed-off-by: Alex Maiorov <alexander.mmaiorov@gmail.com>
Signed-off-by: Alex Maiorov <alexander.mmaiorov@gmail.com>
Signed-off-by: Alex Maiorov <alexander.mmaiorov@gmail.com>
@alex-maiorov alex-maiorov requested review from a team as code owners June 10, 2026 21:15
@alex-maiorov alex-maiorov requested a review from maliberty June 10, 2026 21:15

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welcome to OpenROAD! Thanks for opening your first PR.
Before we review:

Please ensure:

  • CI passes
  • Code is properly formatted
  • Tests are included where applicable
    A maintainer will review shortly!

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a parallel logging mechanism to stream large quantities of numeric data into an SQLite database, featuring a background thread, type-erased queues, memory pressure management, and corresponding TCL/C++ APIs. The code review identifies several critical issues and improvement opportunities, including an uncommitted transaction bug when draining empty queues, a database connection leak during shutdown due to retained shared pointers in the schema registry, a SQL injection risk from unvalidated table names, the use of deprecated atomic shared pointer functions, a potential race condition in concurrent calls to stopLogDb, and a regex limitation in the message finder script for multi-line template headers.

Comment on lines +235 to +274
size_t drain_to_db(sqlite3* db, size_t max_records) override
{
// The latter means the statement could not be built, so fail the whole thing. Otherwise use cached stmt.
if (!stmt_ && !build_insert_stmt(db)) {
return 0;
}

sqlite3_exec(db, "BEGIN", nullptr, nullptr, nullptr);

size_t count = 0;
bool ok = true;

while (count < max_records) {
row_type row;
{
std::lock_guard<std::mutex> lock(mutex_);
if (queue_.empty()) {
break;
}
row = std::move(queue_.front());
queue_.pop_front();
item_count_.fetch_sub(1, std::memory_order_release);
}

// C++17 template magic to pass the column name strings as column keys for sqlite
bind_row(stmt_, row, std::index_sequence_for<Args...>{});
int rc = sqlite3_step(stmt_);
sqlite3_reset(stmt_);
if (rc != SQLITE_DONE) {
ok = false;
break;
}
++count;
}

// Rollback if there was a failure.
if (count > 0) {
sqlite3_exec(db, ok ? "COMMIT" : "ROLLBACK",
nullptr, nullptr, nullptr);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

If drain_to_db is called when the queue is empty, sqlite3_exec(db, "BEGIN", ...) is executed, but the loop terminates immediately with count == 0. Consequently, the matching COMMIT or ROLLBACK is never executed because of the if (count > 0) guard. This leaves an uncommitted transaction open on the database connection, which will cause subsequent database operations to fail or behave unexpectedly.

We should only start the transaction (BEGIN) when we are about to process the first item in the loop.

  size_t drain_to_db(sqlite3* db, size_t max_records) override
  {
    // The latter means the statement could not be built, so fail the whole thing. Otherwise use cached stmt.
    if (!stmt_ && !build_insert_stmt(db)) {
      return 0;
    }

    size_t count = 0;
    bool ok = true;

    while (count < max_records) {
      row_type row;
      {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) {
          break;
        }
        row = std::move(queue_.front());
        queue_.pop_front();
        item_count_.fetch_sub(1, std::memory_order_release);
      }

      if (count == 0) {
        sqlite3_exec(db, "BEGIN", nullptr, nullptr, nullptr);
      }

      // C++17 template magic to pass the column name strings as column keys for sqlite
      bind_row(stmt_, row, std::index_sequence_for<Args...>{});
      int rc = sqlite3_step(stmt_);
      sqlite3_reset(stmt_);
      if (rc != SQLITE_DONE) {
        ok = false;
        break;
      }
      ++count;
    }

    // Rollback if there was a failure.
    if (count > 0) {
      sqlite3_exec(db, ok ? "COMMIT" : "ROLLBACK",
                   nullptr, nullptr, nullptr);
    }

    last_flush_ms_.store(
        std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::steady_clock::now().time_since_epoch())
            .count(),
        std::memory_order_release);
    return count;
  }

Comment thread src/utl/src/Logger.cpp
Comment on lines +647 to +649
// Destroy queues before closing so TypedQueue destructors can finalize
// their prepared statements.
local_registry.clear();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

During shutdown, local_registry.clear() is called to destroy the local queues and trigger their destructors (which finalize the prepared statements). However, schema_registry_ still holds std::shared_ptr references to these same queues. Because of this, the reference count of the queues does not drop to zero, their destructors are not called, and the prepared statements remain active. As a result, sqlite3_close(db_) will fail and return SQLITE_BUSY, leading to a leaked database connection.

To fix this, we must clear schema_registry_ as well before clearing local_registry.

Suggested change
// Destroy queues before closing so TypedQueue destructors can finalize
// their prepared statements.
local_registry.clear();
// Destroy queues before closing so TypedQueue destructors can finalize
// their prepared statements.
schema_registry_.clear();
local_registry.clear();

Comment thread src/utl/src/Logger.cpp
Comment on lines +236 to +242
SchemaInfo Logger::logToDbBuildSchemaInfo(sqlite3* db,
SchemaKey key,
const std::string& table_name,
std::string_view header,
const std::vector<SQLiteType>& types)
{
SchemaInfo info;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The table_name parameter passed to logToDb is a runtime string that is directly concatenated into SQL statements (e.g., CREATE TABLE IF NOT EXISTS " + info.table_name and INSERT INTO " + info_.table_name). If a developer passes a table name containing spaces, hyphens, or SQL keywords, it will cause SQL syntax errors. If the table name is dynamically constructed from user/design inputs, it could also present a SQL injection risk.

We should validate that table_name contains only alphanumeric characters and underscores. If validation fails, we can report a terminal error using this->error which will terminate the program.

SchemaInfo Logger::logToDbBuildSchemaInfo(sqlite3* db,
                                          SchemaKey key,
                                          const std::string& table_name,
                                          std::string_view header,
                                          const std::vector<SQLiteType>& types)
{
  for (char c : table_name) {
    if (!std::isalnum(c) && c != '_') {
      this->error(UTL, 111, "Invalid database table name '{}'. Only alphanumeric characters and underscores are allowed.", table_name);
    }
  }

  SchemaInfo info;
References
  1. Calls to logger_->error are expected to terminate the program, so additional error handling like returning an error code or throwing an exception is not necessary.

Comment on lines +346 to +392
class SchemaRegistry {
public:
using RegistryMap
= std::unordered_map<SchemaKey, std::shared_ptr<AbstractQueue>, SchemaKeyHasher>;

SchemaRegistry() : registry_ptr_(std::make_shared<const RegistryMap>()) {}

std::shared_ptr<const RegistryMap> get_map() const {
return std::atomic_load(&registry_ptr_);
}

std::shared_ptr<AbstractQueue> register_schema(
SchemaKey key,
std::shared_ptr<AbstractQueue> queue)
{
std::lock_guard<std::mutex> lock(registration_mutex_);
auto current_map = get_map();
auto it = current_map->find(key);
if (it != current_map->end()) {
return it->second;
}
auto new_map = std::make_shared<RegistryMap>(*current_map);
(*new_map)[key] = queue;
std::atomic_store(
&registry_ptr_,
std::const_pointer_cast<const RegistryMap>(new_map));
return queue;
}

void remove_schema(SchemaKey key) {
std::lock_guard<std::mutex> lock(registration_mutex_);
auto current_map = get_map();
auto it = current_map->find(key);
if (it == current_map->end()) {
return;
}
auto new_map = std::make_shared<RegistryMap>(*current_map);
new_map->erase(key);
std::atomic_store(
&registry_ptr_,
std::const_pointer_cast<const RegistryMap>(new_map));
}

private:
std::shared_ptr<const RegistryMap> registry_ptr_;
std::mutex registration_mutex_;
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In C++20, std::atomic_load and std::atomic_store for std::shared_ptr are deprecated. We should modernize SchemaRegistry to use std::atomic<std::shared_ptr<const RegistryMap>> which is standard in C++20. Additionally, we need to add a clear() method to allow releasing all registered queues during database shutdown to prevent SQLITE_BUSY on close.

// unordered map with some helpers
class SchemaRegistry {
public:
  using RegistryMap
      = std::unordered_map<SchemaKey, std::shared_ptr<AbstractQueue>, SchemaKeyHasher>;

  SchemaRegistry() : registry_ptr_(std::make_shared<const RegistryMap>()) {}

  std::shared_ptr<const RegistryMap> get_map() const {
    return registry_ptr_.load();
  }

  std::shared_ptr<AbstractQueue> register_schema(
      SchemaKey key,
      std::shared_ptr<AbstractQueue> queue)
  {
    std::lock_guard<std::mutex> lock(registration_mutex_);
    auto current_map = get_map();
    auto it = current_map->find(key);
    if (it != current_map->end()) {
      return it->second;
    }
    auto new_map = std::make_shared<RegistryMap>(*current_map);
    (*new_map)[key] = queue;
    registry_ptr_.store(std::const_pointer_cast<const RegistryMap>(new_map));
    return queue;
  }

  void remove_schema(SchemaKey key) {
    std::lock_guard<std::mutex> lock(registration_mutex_);
    auto current_map = get_map();
    auto it = current_map->find(key);
    if (it == current_map->end()) {
      return;
    }
    auto new_map = std::make_shared<RegistryMap>(*current_map);
    new_map->erase(key);
    registry_ptr_.store(std::const_pointer_cast<const RegistryMap>(new_map));
  }

  void clear() {
    std::lock_guard<std::mutex> lock(registration_mutex_);
    registry_ptr_.store(std::make_shared<const RegistryMap>());
  }

private:
  std::atomic<std::shared_ptr<const RegistryMap>> registry_ptr_;
  std::mutex registration_mutex_;
};

Comment thread src/utl/src/Logger.cpp
Comment on lines +137 to +145
void Logger::stopLogDb()
{
if (!log_db_running_) {
return;
}
log_db_running_ = false;
if (log_db_thread_.joinable()) {
log_db_thread_.join();
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If stopLogDb is called concurrently by multiple threads (or if a user-triggered stop races with the destructor), both threads might attempt to call log_db_thread_.join(). Calling join() concurrently on the same std::thread object is undefined behavior and typically throws std::system_error.

We can make this completely thread-safe by using compare_exchange_strong on log_db_running_ to ensure only one thread performs the shutdown and join.

Suggested change
void Logger::stopLogDb()
{
if (!log_db_running_) {
return;
}
log_db_running_ = false;
if (log_db_thread_.joinable()) {
log_db_thread_.join();
}
void Logger::stopLogDb()
{
bool expected = true;
if (!log_db_running_.compare_exchange_strong(expected, false)) {
return;
}
if (log_db_thread_.joinable()) {
log_db_thread_.join();
}

Comment thread etc/find_messages.py
Comment on lines +64 to +75
logtodb_regexp_c = re.compile(
r"""
(?:->|\.) # deref
logToDb(?:Bulk)? # logToDb or logToDbBulk
<.+?> # template header
\s*\(\s* # (
(?:utl::)?(?P<tool>[A-Z]{3}) # tool
\s*,\s* # ,
(?P<id>\d+) # id
""",
re.VERBOSE | re.MULTILINE,
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The regex logtodb_regexp_c uses .+? to match the template header. Since the script does not compile the regex with re.DOTALL, it will fail to match any logToDb or logToDbBulk calls where the template header spans multiple lines (e.g., if a developer formats the column names list with newlines).

Using [\s\S]+? instead of .+? is a robust way to match any character including newlines without needing to change the global regex flags.

Suggested change
logtodb_regexp_c = re.compile(
r"""
(?:->|\.) # deref
logToDb(?:Bulk)? # logToDb or logToDbBulk
<.+?> # template header
\s*\(\s* # (
(?:utl::)?(?P<tool>[A-Z]{3}) # tool
\s*,\s* # ,
(?P<id>\d+) # id
""",
re.VERBOSE | re.MULTILINE,
)
logtodb_regexp_c = re.compile(
r"""
(?:->|\\.) # deref
logToDb(?:Bulk)? # logToDb or logToDbBulk
<[\\s\\S]+?> # template header
\\s*\\(\\s* # (
(?:utl::)?(?P<tool>[A-Z]{3}) # tool
\\s*,\\s* # ,
(?P<id>\\d+) # id
""",
re.VERBOSE | re.MULTILINE,
)

@povik

povik commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

@alex-maiorov Thanks. It's worth opening the PR for so it's preserved if anyone needs it. I would hold off on adding tests or polishing unless @maliberty signals interest in merging.

Can you post example of use from the C++ side? I.e. what a method call emitting into the database looks like.

@maliberty

Copy link
Copy Markdown
Member

Please say more about what this feature is useful for that is not provided by text (or even more structured text) logging.

@alex-maiorov

Copy link
Copy Markdown
Author

Please say more about what this feature is useful for that is not provided by text (or even more structured text) logging.

There are a few reasons, first I wanted to avoid any possible latency costs of writing data especially if I was doing it very often, avoiding both serialization overhead and possible disk flushes at the point where the log functions are called by buffering it(though the same method can be used for text logging). For my usecase the main benefit is that I got a lot of functionality for free from SQLite in post-analysis which made it much more efficient, such as having one file with an easily readable schema, performant cross indexing with joins and other SQL features, and the ability to build indeces over tables for fast lookups, and also to be able to build derived data tables if necessary and store them easily.

There are also some disk size and bandwidth reductions, though SQLite does not optimally pack everything. To be honest, for pure speed and space efficiency there might be better solutions, but SQLite is nice for being easy to analyze later without needing to provide separate schema files for something like Protobufs or similar. Besides lower overhead, the benefit over more advanced text logging is we avoid the tradeoff of file clutter and performance. With text logging we would have to either create many files for all data sources(of which there might be many, even within one tool), or we would need to somehow collect all relevant information in a large batch. Largely, text logging becomes constrained by the requirement to keep all files in append-only mode as storing everything in memory is impractical. I will upload some example C++ code tomorrow that will hopefully more clearly demonstrate this point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants