TSEdge is a small embedded C11 time-series storage library for Linux edge
devices. Applications include tsedge.h, link with libtsedge, and call the
API directly. It is not a server and does not implement SQL, networking, MQTT,
replication, access control, or lossy compression.
The documentation site is available on GitHub Pages:
https://liminfinity.github.io/tsedge/
mkdir build
cd build
cmake ..
cmake --build .
ctest
More build details are in docs/build.md.
The default compressed block size is 16384 points. It can be overridden at
build time with -DTSEDGE_BLOCK_MAX_POINTS=<N>.
The build produces:
- libtsedge.so
- libtsedge.a
- tsedge_demo
- tsedge_bench
- tsedge_tests
Block-size tuning can be reproduced with:
./bench/run_block_size_benchmarks.sh
The project is checked on Ubuntu with GCC and Clang, on macOS with Clang, and with AddressSanitizer/UndefinedBehaviorSanitizer in Debug builds.
TSEdge can be packaged as a .tar.gz archive for GitHub Releases:
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
cmake --build .
ctest --output-on-failure
cpack
The archive is written into the build directory and contains include/, lib/,
bin/, docs/, README.md, INSTALL.md, and LICENSE.
Example use after unpacking:
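tar -xzf tsedge-0.1.0-*.tar.gz
# Resolve the unpacked directory once: the shell will not expand * inside -I/-L flags or in a variable assignment.
TSEDGE_DIR=$(ls -d tsedge-0.1.0-*/ | head -n 1)
cc app.c -I"${TSEDGE_DIR}include" -L"${TSEDGE_DIR}lib" -ltsedge -o app
LD_LIBRARY_PATH="${TSEDGE_DIR}lib" ./app
On macOS, use DYLD_LIBRARY_PATH for the dynamic library path. More release
details are in docs/release.md.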
The public API lives in include/tsedge.h and exposes an opaque tsedge_db.
The first version supports:
- open/close database directory
- create series
- delete series
- set a soft disk quota for old segment cleanup
- list existing series
- append (int64 timestamp, double value) points
- append batches of tsedge_point values
- inspect lightweight per-series statistics
- read points by inclusive time range
- aggregate min/max/sum/avg/count by range
- aggregate by time windows for graph downsampling
- export a range to CSV
Function-level API notes are in docs/api.md.
Applications can list known series without scanning database files manually:
tsedge_series_info* series = NULL;
size_t count = 0;
int rc = tsedge_list_series(db, &series, &count);
if (rc == TSEDGE_OK) {
    for (size_t i = 0; i < count; ++i) {
        printf("%s\n", series[i].name);
    }
    tsedge_free_series_list(series);
}
The function returns a copied array owned by the caller. Empty databases return
count = 0 and series = NULL.
Batch append avoids repeating public validation and series lookup for every point:
tsedge_point points[3] = {
    {1710000000000LL, 72.4},
    {1710000001000LL, 72.5},
    {1710000002000LL, 72.6},
};
tsedge_append_batch(db, "motor.temperature", points, 3);
If a batch append fails partway through, points accepted before the error may remain stored. TSEdge does not provide all-or-nothing batch transactions.
Buffered points can be forced to disk before export or shutdown:
tsedge_append(db, "motor.temperature", timestamp, value);
tsedge_flush(db, "motor.temperature");
tsedge_export_csv(db, "motor.temperature", from, to, "temperature.csv");
tsedge_flush writes one series buffer into a segment file.
tsedge_flush_all does the same for every series. Empty buffers are not an
error.
Series can be deleted with their metadata and segment files:
int rc = tsedge_delete_series(db, "motor.temperature");
if (rc != TSEDGE_OK) {
    fprintf(stderr, "failed to delete series\n");
}
Before deletion, TSEdge flushes buffers through the normal WAL path so pending WAL entries cannot restore the deleted series after reopen.
Database size can be limited with a soft runtime quota:
tsedge_set_disk_quota(db, 128ull * 1024ull * 1024ull);
int rc = tsedge_enforce_disk_quota(db);
if (rc == TSEDGE_ERR_QUOTA_EXCEEDED) {
    fprintf(stderr, "quota is too small for safe cleanup\n");
}
Quota cleanup removes only old sealed segment_*.tse files. It does not delete
active segments, the last segment of any series, WAL, metadata, or arbitrary
files in the database directory.
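The cleanup rules above can be sketched as a small eligibility check. This is only an illustration under stated assumptions: `segment_view`, `quota_can_remove` and `quota_plan` are hypothetical names, not TSEdge types or functions, and the real cleanup order and accounting may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one segment file; not a TSEdge type. */
typedef struct {
    uint32_t segment_id;
    bool     is_active;          /* still receiving new blocks */
    bool     is_last_in_series;  /* newest segment of its series */
    uint64_t size_bytes;
} segment_view;

/* Only sealed segments that are not the last one of a series qualify. */
static bool quota_can_remove(const segment_view *s)
{
    return !s->is_active && !s->is_last_in_series;
}

/* Walks segments oldest-first (assumed order) and frees eligible ones
 * until usage fits the quota. Returns true if the quota can be met. */
static bool quota_plan(const segment_view *segs, size_t n,
                       uint64_t used_bytes, uint64_t quota_bytes,
                       uint64_t *freed_out)
{
    assert(freed_out != NULL && (segs != NULL || n == 0));
    uint64_t remaining = used_bytes;
    for (size_t i = 0; i < n && remaining > quota_bytes; ++i) {
        if (!quota_can_remove(&segs[i])) continue;
        uint64_t sz = segs[i].size_bytes < remaining ? segs[i].size_bytes
                                                     : remaining;
        remaining -= sz;
    }
    *freed_out = used_bytes - remaining;
    return remaining <= quota_bytes;
}
```

A `false` result corresponds to the situation the README reports as TSEDGE_ERR_QUOTA_EXCEEDED: everything that may safely be removed has been counted and usage is still above the quota.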
Database files can be checked without modifying them:
tsedge_verify_report report;
int rc = tsedge_verify("demo_db", &report);
if (rc != TSEDGE_OK) {
    printf("corrupt database: %s\n", report.first_error_message);
}
tsedge_verify checks the database directory, manifest, series metadata,
segment files, block headers, payload bounds and WAL entries. It reports the
first problem found but does not repair files.
Series statistics can be read without decoding segment payloads:
tsedge_series_stats stats;
if (tsedge_get_series_stats(db, "motor.temperature", &stats) == TSEDGE_OK) {
    printf("segments=%zu active=%u blocks=%zu buffered=%zu indexed=%zu bytes=%llu ratio=%.2fx\n",
           stats.segment_count,
           stats.active_segment_id,
           stats.block_count,
           stats.buffered_points,
           stats.total_indexed_points,
           (unsigned long long)stats.total_segment_size_bytes,
           stats.compression_ratio);
}
The statistics are collected from the in-memory block index, the current buffer and all segment file sizes. They also include compression stats: estimated raw size, bytes stored in segment files, compression ratio and average bytes per point.
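As a rough illustration of how the compression figures relate to each other, the sketch below derives them from a point count and a stored byte count. It assumes a raw point costs 16 bytes (an int64 timestamp plus a double, which matches raw_size_bytes=16000000 for 1000000 points in the benchmark output). `compression_summary` and `summarize_compression` are hypothetical names, and which points TSEdge itself counts (indexed only, or buffered too) is not specified here.

```c
#include <assert.h>
#include <stdint.h>

/* int64 timestamp + double value */
#define RAW_POINT_BYTES 16u

/* Hypothetical result type; not part of tsedge.h. */
typedef struct {
    uint64_t raw_size_bytes;
    double   compression_ratio;    /* raw bytes / stored bytes */
    double   avg_bytes_per_point;  /* stored bytes / points */
} compression_summary;

static compression_summary summarize_compression(uint64_t points,
                                                 uint64_t stored_bytes)
{
    /* Guard against overflow of the raw size estimate. */
    assert(points <= UINT64_MAX / RAW_POINT_BYTES);

    compression_summary s;
    s.raw_size_bytes = points * RAW_POINT_BYTES;
    s.compression_ratio = stored_bytes
        ? (double)s.raw_size_bytes / (double)stored_bytes : 0.0;
    s.avg_bytes_per_point = points
        ? (double)stored_bytes / (double)points : 0.0;
    return s;
}
```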
Window aggregation returns compact buckets for charts without reading every raw point:
tsedge_window_aggregate* windows = NULL;
size_t window_count = 0;
tsedge_aggregate_windowed(
    db,
    "motor.temperature",
    1710000000000LL,
    1710003600000LL,
    60000,
    &windows,
    &window_count
);
for (size_t i = 0; i < window_count; ++i) {
    printf("%lld..%lld count=%llu avg=%f min=%f max=%f\n",
           (long long)windows[i].window_start,
           (long long)windows[i].window_end,
           (unsigned long long)windows[i].count,
           windows[i].avg,
           windows[i].min,
           windows[i].max);
}
tsedge_free_window_aggregates(windows);
Windows are half-open ([start, end)) and empty windows are omitted. This is
used by dashboards to downsample large raw ranges into a small number of
display buckets.
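The half-open bucketing can be sketched over a plain array of points. This is not the library's implementation: `sample`, `bucket` and `bucketize` are hypothetical, the input is assumed to be sorted by timestamp, windows are assumed to be aligned to the `from` argument, and `to` is treated as inclusive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { int64_t timestamp; double value; } sample;  /* hypothetical */

typedef struct {
    int64_t  window_start;  /* inclusive */
    int64_t  window_end;    /* exclusive */
    uint64_t count;
    double   min, max, sum; /* avg is sum / count */
} bucket;

/* Groups time-ordered samples into fixed-width windows aligned to `from`.
 * Windows without samples are never emitted. Returns buckets written. */
static size_t bucketize(const sample *pts, size_t n, int64_t from, int64_t to,
                        int64_t width, bucket *out, size_t cap)
{
    assert(width > 0 && from <= to);
    size_t used = 0;
    for (size_t i = 0; i < n; ++i) {
        int64_t t = pts[i].timestamp;
        if (t < from || t > to) continue;
        int64_t start = from + ((t - from) / width) * width;
        if (used == 0 || out[used - 1].window_start != start) {
            if (used == cap) break;  /* output array is full */
            out[used] = (bucket){ start, start + width, 0,
                                  pts[i].value, pts[i].value, 0.0 };
            ++used;
        }
        bucket *b = &out[used - 1];
        if (pts[i].value < b->min) b->min = pts[i].value;
        if (pts[i].value > b->max) b->max = pts[i].value;
        b->sum += pts[i].value;
        b->count++;
    }
    return used;
}
```

A point at exactly `window_end` lands in the next window, which is what makes the interval half-open.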
Old data can be removed at segment-file granularity:
tsedge_delete_before(db, "motor.temperature", 1710001000000LL);
The function deletes only segment_*.tse files whose maximum timestamp is older
than the threshold. If a segment contains both old and new points, it is kept
unchanged because this prototype does not implement compaction or point-level
deletion.
TSEdge stores data on the local filesystem:
database_dir/
  manifest.txt
  wal.log
  series/
    motor.temperature/
      metadata.txt
      segment_000001.tse
      segment_000002.tse
      segment_000003.tse
manifest.txt lists known series. Each series stores immutable compressed
blocks in append-only segment_%06u.tse files. The active segment rotates at a
block boundary when it reaches the internal size limit, which is 64 MiB by
default. wal.log stores not-yet-flushed points for crash recovery. After a
successful block flush, the WAL is rewritten from current in-memory buffers so
already persisted blocks are not replayed twice.
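The naming and rotation rule can be illustrated with two small helpers. They are a sketch only: `segment_file_name` and `segment_for_next_block` are hypothetical, and the exact comparison TSEdge uses against the size limit is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Default limit quoted in the text: 64 MiB. */
#define SEGMENT_MAX_BYTES (64ull * 1024ull * 1024ull)

/* Formats "segment_%06u.tse" into buf; false if the name does not fit. */
static bool segment_file_name(uint32_t id, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    int n = snprintf(buf, cap, "segment_%06u.tse", (unsigned)id);
    return n > 0 && (size_t)n < cap;
}

/* The check runs between blocks, so a block is never split across files:
 * once the active segment has reached the limit, the next block goes to a
 * new segment with the following id. */
static uint32_t segment_for_next_block(uint32_t active_id,
                                       uint64_t active_size,
                                       uint64_t limit)
{
    return active_size >= limit ? active_id + 1 : active_id;
}
```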
Segment files contain a sequence of blocks. Multi-byte integers are encoded explicitly as little-endian values.
Each block starts with a version 2 header:
u32 magic "TSEB" as 0x42455354
u32 version 2
u32 point_count
u32 compression_type 1 for delta timestamps + Gorilla-inspired XOR values
i64 min_timestamp
i64 max_timestamp
u32 compressed_timestamp_size
u32 compressed_value_size
u32 payload_size
f64 min_value
f64 max_value
f64 sum_value
u32 reserved 0
The header is followed by compressed timestamp bytes and compressed value bytes. The min/max timestamp metadata allows range queries to skip blocks that do not intersect the requested interval. The value statistics allow aggregate queries to use fully covered blocks without decompressing them.
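To make the layout concrete, the sketch below serializes the listed fields in order as little-endian bytes. It assumes the fields are packed without padding (72 bytes in total, the sum of the listed widths) and that f64 fields are written as little-endian IEEE-754 bit patterns. `block_header` and `encode_block_header` are hypothetical names; docs/storage_format.md remains the authoritative description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_HEADER_SIZE 72u  /* 8 x u32 + 2 x i64 + 3 x f64 */

/* Hypothetical in-memory form of the header fields. */
typedef struct {
    uint32_t point_count, compression_type;
    int64_t  min_timestamp, max_timestamp;
    uint32_t compressed_timestamp_size, compressed_value_size, payload_size;
    double   min_value, max_value, sum_value;
} block_header;

static uint8_t *put_u32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; ++i) *p++ = (uint8_t)(v >> (8 * i));
    return p;
}

static uint8_t *put_u64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; ++i) *p++ = (uint8_t)(v >> (8 * i));
    return p;
}

static uint8_t *put_f64(uint8_t *p, double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);  /* copy the bit pattern, no aliasing */
    return put_u64(p, bits);
}

static void encode_block_header(const block_header *h,
                                uint8_t out[BLOCK_HEADER_SIZE])
{
    uint8_t *p = out;
    p = put_u32(p, 0x42455354u);  /* reads as "TSEB" byte by byte */
    p = put_u32(p, 2u);           /* header version */
    p = put_u32(p, h->point_count);
    p = put_u32(p, h->compression_type);
    p = put_u64(p, (uint64_t)h->min_timestamp);
    p = put_u64(p, (uint64_t)h->max_timestamp);
    p = put_u32(p, h->compressed_timestamp_size);
    p = put_u32(p, h->compressed_value_size);
    p = put_u32(p, h->payload_size);
    p = put_f64(p, h->min_value);
    p = put_f64(p, h->max_value);
    p = put_f64(p, h->sum_value);
    p = put_u32(p, 0u);           /* reserved */
    assert((size_t)(p - out) == BLOCK_HEADER_SIZE);
}
```

Writing each byte explicitly keeps the file format independent of the host's endianness and struct padding.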
When a database is opened, TSEdge discovers all segment_*.tse files for every
series and rebuilds one in-memory block index. Each index entry stores the
segment id and offset of a block, so range reads and aggregates can cross segment
file boundaries without changing the public API.
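The way a query can use the per-block bounds is sketched below for an inclusive range. `block_action` and `classify_block` are hypothetical names used only for illustration; the library's internal index types are not shown here.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    BLOCK_SKIP,       /* no overlap with the query range */
    BLOCK_USE_STATS,  /* fully covered: header min/max/sum/count suffice */
    BLOCK_DECODE      /* partial overlap: points must be decompressed */
} block_action;

/* Classifies a block with timestamp bounds [bmin, bmax] against the
 * inclusive query range [from, to]. */
static block_action classify_block(int64_t bmin, int64_t bmax,
                                   int64_t from, int64_t to)
{
    assert(bmin <= bmax && from <= to);
    if (bmax < from || bmin > to) return BLOCK_SKIP;
    if (bmin >= from && bmax <= to) return BLOCK_USE_STATS;
    return BLOCK_DECODE;
}
```

A plain range read would treat BLOCK_USE_STATS like BLOCK_DECODE, since it needs the points themselves; only aggregates can take the shortcut.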
More storage format details are in docs/storage_format.md.
TSEdge is split into small internal layers:
- src/api contains the public facade: it validates public arguments, finds a series, and delegates to internal modules.
- src/core owns database and series coordination, including block index rebuild, range/aggregate queries, lightweight stats and retention.
- src/storage owns segment files, segment rotation, block metadata and WAL.
- src/compression owns timestamp/value compression and primitive byte encoding.
- src/export owns CSV export.
The demo program shows the main API flow: it creates three series, writes data
with both tsedge_append and tsedge_append_batch, reads a range, computes
aggregates, prints stats, applies segment-level retention, exports CSV, closes
the database, reopens it, and checks that the rebuilt segment index still
exposes the remaining data.
The crash recovery demo uses two small programs. The first writes points and
exits without tsedge_close; the second opens the database and checks that WAL
replay restored the points.
rm -rf crash_recovery_demo_db
./tsedge_crash_writer crash_recovery_demo_db || true
./tsedge_recover_check crash_recovery_demo_db
The non-zero exit code from tsedge_crash_writer is expected: it intentionally
simulates a crashed process.
TSEdge includes tests for corrupted segment files, invalid block headers, damaged WAL entries and deterministic random byte inputs. These tests check that malformed storage files return errors instead of crashing the process.
Sanitizer build:
cmake -S . -B build-sanitize -DCMAKE_BUILD_TYPE=Debug -DTSEDGE_ENABLE_SANITIZERS=ON -DTSEDGE_SEGMENT_MAX_BYTES=8192
cmake --build build-sanitize
ctest --test-dir build-sanitize --output-on-failure
The interactive demo shows a standalone environmental monitoring station:
C-agent -> TSEdge -> live_state.json -> Next.js simulator
The C agent writes real points to TSEdge through the public API. The website
does not connect to the database and is not a TSEdge server. It reads state
files produced by the C program and writes commands to command.json.
The demo agent code lives in examples/ecopost/ and is split by
responsibility: configuration, state, sensor generation, commands, TSEdge
operations, live_state.json output and filesystem helper functions.
Build and run the agent:
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release -DTSEDGE_SEGMENT_MAX_BYTES=8192 ..
cmake --build .
ctest --output-on-failure
./tsedge_ecopost_agent --live --interval-ms 1000
Run the website in another terminal:
cd demo/system-simulator
npm install
TSEDGE_LIVE_OUTPUT=../../build/ecopost_live_output npm run dev
Open http://localhost:3000. This is the end-user weather station dashboard:
temperature, humidity, pressure, wind, PM2.5, battery and connection status.
Engineering diagnostics are available at http://localhost:3000/diagnostics.
This page shows the WAL, buffer, compressed blocks, segment files, old-data
cleanup and CSV export.
The diagnostics page can export CSV for a selected series. The UI writes the
series name to command.json, for example:
{"command":"export_csv","series":"pm25.concentration"}
The C agent calls tsedge_export_csv and saves the file in the output
directory.
tsedge_delete_before implements a simple retention policy on top of segment
rotation. Before deleting files, TSEdge flushes the current in-memory buffer so
the WAL and segment files describe the same accepted points. It then computes
the timestamp range of each segment from the in-memory block index and removes
only segments whose segment_max_timestamp < older_than_timestamp.
Partially overlapping segments are preserved completely. Remaining segment files
are not renamed, so after deleting segment_000001.tse and
segment_000002.tse, later files such as segment_000003.tse keep their names
and future rotation continues from the highest existing segment id. After
deletion, the block index is rebuilt from the segment files that are still on
disk. Precise deletion inside a segment would require compaction and is not
implemented in this prototype.
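The selection rule and the id continuation can be sketched as follows. `segment_range`, `select_expired` and `next_segment_id` are hypothetical helpers, and starting at id 1 for a series with no segments is an assumption based on the layout shown earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-segment summary derived from the block index. */
typedef struct {
    uint32_t segment_id;
    int64_t  max_timestamp;
} segment_range;

/* Collects ids of segments that lie entirely before older_than.
 * A segment whose newest point equals older_than is kept. */
static size_t select_expired(const segment_range *segs, size_t n,
                             int64_t older_than, uint32_t *ids_out)
{
    assert(ids_out != NULL && (segs != NULL || n == 0));
    size_t k = 0;
    for (size_t i = 0; i < n; ++i) {
        if (segs[i].max_timestamp < older_than) {
            ids_out[k++] = segs[i].segment_id;
        }
    }
    return k;
}

/* Ids are never reused: the next file continues after the highest id
 * that is still on disk. */
static uint32_t next_segment_id(const segment_range *remaining, size_t n)
{
    uint32_t highest = 0;
    for (size_t i = 0; i < n; ++i) {
        if (remaining[i].segment_id > highest) {
            highest = remaining[i].segment_id;
        }
    }
    return highest + 1;
}
```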
Compression is lossless.
Timestamps are encoded as:
- first timestamp as raw int64
- first delta as raw int64
- subsequent delta-of-delta values as zigzag varints
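The three steps above can be sketched as a small encoder. The varint flavor is an assumption (7 data bits per byte with a continuation bit, LEB128 style), the raw int64 values are written little-endian, and `encode_timestamps` is a hypothetical name, so the bytes produced here are not guaranteed to match TSEdge's on-disk format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Maps signed to unsigned so small magnitudes stay small:
 * 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ... */
static uint64_t zigzag(int64_t v)
{
    uint64_t u = (uint64_t)v << 1;
    return v < 0 ? ~u : u;
}

/* LEB128-style varint: 7 payload bits per byte, high bit = "more". */
static size_t put_varint(uint8_t *out, uint64_t v)
{
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (uint8_t)(v | 0x80);
        v >>= 7;
    }
    out[n++] = (uint8_t)v;
    return n;
}

static size_t put_i64_le(uint8_t *out, int64_t v)
{
    for (int i = 0; i < 8; ++i) out[i] = (uint8_t)((uint64_t)v >> (8 * i));
    return 8;
}

/* Encodes n timestamps; out needs room for 16 + 10 * n bytes at most.
 * Assumes deltas do not overflow int64. Returns bytes written. */
static size_t encode_timestamps(const int64_t *ts, size_t n, uint8_t *out)
{
    size_t pos = 0;
    if (n == 0) return 0;
    assert(ts != NULL && out != NULL);
    pos += put_i64_le(out + pos, ts[0]);           /* first timestamp */
    if (n == 1) return pos;
    int64_t prev_delta = ts[1] - ts[0];
    pos += put_i64_le(out + pos, prev_delta);      /* first delta */
    for (size_t i = 2; i < n; ++i) {
        int64_t delta = ts[i] - ts[i - 1];
        pos += put_varint(out + pos, zigzag(delta - prev_delta));
        prev_delta = delta;
    }
    return pos;
}
```

With a regular sampling interval every delta-of-delta is zero, so each point after the second costs a single byte.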
Double values are encoded as:
- first value as raw IEEE-754 64-bit bits
- each next value as a marker:
  - 0 when XOR with previous value is zero
  - 1 followed by a byte-aligned significant XOR window when it saves space
  - 2 followed by the raw 64-bit XOR as a worst-case fallback
This is a simplified Gorilla-inspired XOR stream. It favors correctness and explainability over maximum compression ratio.
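A byte-aligned XOR stream along these lines might look like the sketch below. The concrete layout is an assumption made for illustration: each marker occupies one byte, and marker 1 is followed by one descriptor byte (high nibble = index of the lowest non-zero byte, low nibble = window length in bytes) and then the window bytes. TSEdge's real encoding of the window may differ. The decoder trusts its input and performs no bounds checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint64_t f64_bits(double d)
{
    uint64_t b;
    memcpy(&b, &d, sizeof b);
    return b;
}

/* out needs room for 8 + 9 * (n - 1) bytes at most. Returns bytes written. */
static size_t encode_values(const double *v, size_t n, uint8_t *out)
{
    size_t pos = 0;
    if (n == 0) return 0;
    assert(v != NULL && out != NULL);
    uint64_t prev = f64_bits(v[0]);
    for (int i = 0; i < 8; ++i) out[pos++] = (uint8_t)(prev >> (8 * i));
    for (size_t k = 1; k < n; ++k) {
        uint64_t cur = f64_bits(v[k]);
        uint64_t x = cur ^ prev;
        prev = cur;
        if (x == 0) { out[pos++] = 0; continue; }
        int lo = 0, hi = 7;
        while (((x >> (8 * lo)) & 0xFF) == 0) ++lo;
        while (((x >> (8 * hi)) & 0xFF) == 0) --hi;
        int len = hi - lo + 1;
        if (len <= 6) {  /* 2 + len bytes beats the 9-byte fallback */
            out[pos++] = 1;
            out[pos++] = (uint8_t)((lo << 4) | len);
            for (int i = lo; i <= hi; ++i) out[pos++] = (uint8_t)(x >> (8 * i));
        } else {
            out[pos++] = 2;
            for (int i = 0; i < 8; ++i) out[pos++] = (uint8_t)(x >> (8 * i));
        }
    }
    return pos;
}

/* Inverse of encode_values. Returns bytes consumed. */
static size_t decode_values(const uint8_t *in, size_t n, double *v)
{
    size_t pos = 0;
    if (n == 0) return 0;
    uint64_t prev = 0;
    for (int i = 0; i < 8; ++i) prev |= (uint64_t)in[pos++] << (8 * i);
    memcpy(&v[0], &prev, sizeof prev);
    for (size_t k = 1; k < n; ++k) {
        uint64_t x = 0;
        uint8_t marker = in[pos++];
        if (marker == 1) {
            int lo = in[pos] >> 4, len = in[pos] & 0x0F;
            ++pos;
            for (int i = 0; i < len; ++i)
                x |= (uint64_t)in[pos++] << (8 * (lo + i));
        } else if (marker == 2) {
            for (int i = 0; i < 8; ++i) x |= (uint64_t)in[pos++] << (8 * i);
        }
        prev ^= x;
        memcpy(&v[k], &prev, sizeof prev);
    }
    return pos;
}
```

Because only the XOR bit pattern is stored, decoding reproduces every double bit for bit, which is what keeps the scheme lossless.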
- Block skipping by timestamp bounds during range reads.
- WAL recovery for not-yet-flushed points.
- WAL v2 entry validation with magic, version, entry size and checksum.
- Optional WAL fsync mode through TSEDGE_WAL_FSYNC.
- Block-level aggregate statistics for fully covered blocks.
- In-memory block index rebuilt from segment headers at open.
- Segment rotation at block boundaries with segment_%06u.tse files.
- Segment-level retention with tsedge_delete_before.
- Byte-aligned Gorilla-inspired value XOR encoding with raw fallback.
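The kind of WAL entry validation listed above (magic, version, entry size, checksum) can be sketched as follows. Everything concrete here is an assumption for illustration: the 16-byte header layout, the WAL_MAGIC value, and the use of 32-bit FNV-1a as the checksum are placeholders, not TSEdge's actual WAL format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WAL_MAGIC   0x4C415754u  /* placeholder value, not TSEdge's magic */
#define WAL_VERSION 2u
#define WAL_HEADER  16u          /* assumed: magic, version, size, checksum */

static uint32_t get_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* 32-bit FNV-1a, used here only as a stand-in checksum. */
static uint32_t fnv1a32(const uint8_t *p, size_t n)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; ++i) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Returns the total entry size if the entry at buf is well-formed,
 * or 0 if any check fails. Never reads past buf + avail. */
static size_t wal_entry_check(const uint8_t *buf, size_t avail)
{
    assert(buf != NULL);
    if (avail < WAL_HEADER) return 0;                  /* truncated header */
    if (get_u32(buf) != WAL_MAGIC) return 0;
    if (get_u32(buf + 4) != WAL_VERSION) return 0;
    uint32_t size = get_u32(buf + 8);
    if (size < WAL_HEADER || size > avail) return 0;   /* bogus or truncated */
    uint32_t expected = get_u32(buf + 12);
    if (fnv1a32(buf + WAL_HEADER, size - WAL_HEADER) != expected) return 0;
    return size;
}
```

Checking the size against the bytes actually available before touching the payload is what lets a damaged or truncated log produce an error instead of an out-of-bounds read.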
Optimizations that are too large for a safe incremental change are listed as limitations below.
- No SQL parser or query language.
- No network server, HTTP, MQTT, sockets, or replication.
- No concurrent writer support.
- No full ACID transaction system.
- No production-grade WAL checkpointing.
- No disk-based B+Tree index.
- No point-level deletion, compaction, or retention inside partially overlapping segment files.
- The in-memory block index is rebuilt on open and is not persisted separately.
- The value compression remains a simplified Gorilla-inspired XOR stream, not full Gorilla bit-packing.
- The prototype assumes mostly append-oriented time-series workloads.
./tsedge_bench 1000000
Benchmark methodology is described in docs/benchmarking.md.
Read-path benchmarks can be run with bench/run_read_benchmarks.sh.
Block-size tuning can be run with bench/run_block_size_benchmarks.sh.
The benchmark emits copy-friendly lines for smooth, noisy, step,
constant, and irregular_timestamps datasets. For TSEdge it includes both
single-point writes (write_mode=append, batch_size=1) and batch writes
(write_mode=append_batch, with batch sizes such as 100, 1000, 4096):
dataset=smooth
write_mode=append_batch
batch_size=1000
points=1000000
write_points_per_sec=...
read_points_per_sec=...
aggregate_seconds=...
db_size_bytes=...
raw_size_bytes=16000000
compression_ratio=...
segment_count=...
block_count=...
total_segment_size_bytes=...