Compatibility And Protocol Extensions for Apache Fluss
Transform Apache Fluss into a multi-model database by adding HBase, Redis, Kafka, and PostgreSQL protocol compatibility.
Fluss CAPE is an external compatibility layer that enables applications to interact with Apache Fluss using familiar HBase, Redis, Kafka, and PostgreSQL protocols, without modifying Fluss itself.
Fluss CAPE operates as a stateless proxy layer that performs real-time protocol translation:
- Decode: Receives requests via standard protocol handlers (HBase RPC, Redis RESP, Kafka Wire, PG Wire).
- Translate: Maps protocol-specific operations (e.g., `HSET`, `Put`, `Produce`) to Fluss native table operations.
- Execute: Uses the high-performance Fluss Client to interact with the underlying distributed storage.
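A minimal sketch of this pipeline is shown below. The names (`ProtocolHandler`, `TableOperation`, `FlussExecutor`) are purely illustrative, not CAPE's actual classes:

```java
// Hypothetical sketch of the decode -> translate -> execute pipeline;
// all names are illustrative, not CAPE's actual API.
enum Kind { UPSERT, LOOKUP, APPEND, DELETE }        // Fluss-native table verbs

final class TableOperation {
    final String table;
    final byte[] key;
    final byte[] value;
    final Kind kind;

    TableOperation(String table, byte[] key, byte[] value, Kind kind) {
        this.table = table; this.key = key; this.value = value; this.kind = kind;
    }
}

interface ProtocolHandler {
    // Decode + Translate: parse one wire-format request (RESP, HBase RPC,
    // Kafka Wire, PG Wire) into a Fluss-native table operation.
    TableOperation decode(byte[] rawRequest);
}

interface FlussExecutor {
    // Execute: apply the translated operation through the Fluss client
    // and return an encoded protocol response.
    byte[] execute(TableOperation op);
}
```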
Fluss provides two primary table types, both of which are leveraged by CAPE to provide multi-model capabilities:
- KV Tables (Primary Key Tables): Used primarily for HBase and Redis. These tables are optimized for low-latency point lookups and range scans. In HBase, the `row_key` maps to the Fluss primary key; in Redis, a composite key `(redis_key, sub_key)` is used to support complex data structures like Hashes and Sets (see the sketch after this list).
- Log Tables: Used primarily for the Kafka protocol. These are append-only tables optimized for high-throughput ingestion and sequential consumption.
- Hybrid Support: The PostgreSQL protocol primarily operates on Primary Key Tables, but leverages both the KV component (for snapshots/lookups) and Log component (for real-time changelog replay) to provide consistent SQL query results. Every write to a Primary Key table also generates a changelog in a Log table, enabling unified streaming access.
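To make the composite-key mapping concrete, here is a toy model of how an `HSET` might land in a table keyed by `(redis_key, sub_key)`. The in-memory map and the key encoding are illustrative stand-ins for a Fluss KV table, not CAPE internals:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: models the (redis_key, sub_key) composite primary key
// described above, using an in-memory map in place of a Fluss KV table.
public class CompositeKeyDemo {
    static final Map<String, String> kvTable = new HashMap<>();

    static String rowKey(String redisKey, String subKey) {
        return redisKey + "|" + subKey;   // illustrative composite-key encoding
    }

    public static void main(String[] args) {
        // HSET user:1 name Alice age 30  ->  two rows in the KV table
        kvTable.put(rowKey("user:1", "name"), "Alice");
        kvTable.put(rowKey("user:1", "age"), "30");
        // HGET user:1 name  ->  a point lookup on the composite key
        System.out.println(kvTable.get(rowKey("user:1", "name"))); // Alice
    }
}
```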
Fluss CAPE fully inherits Fluss's "Lake-Stream Integration" architecture, providing a unified view of data:
- Unified Interface: Write data via a "streaming" protocol (Kafka) and immediately query it via a "database" protocol (PostgreSQL/HBase).
- Changelog as Stream: All mutations in HBase or Redis tables are automatically captured as Fluss changelogs, which can be consumed via the Kafka protocol for downstream real-time processing.
- Snapshot + Incremental: The PostgreSQL protocol uses a hybrid scan strategy that combines KV snapshots with recent changelogs to ensure data consistency and freshness (see the sketch below).
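The following toy model illustrates the snapshot-plus-changelog merge described above; the data structures are illustrative, not CAPE internals:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the hybrid scan: start from a KV snapshot, then replay
// changelog entries recorded after the snapshot offset.
public class HybridScanDemo {
    public static void main(String[] args) {
        Map<String, String> snapshot = new LinkedHashMap<>();
        snapshot.put("id=1", "age=30");           // state as of the snapshot
        snapshot.put("id=2", "age=25");

        // Changelog entries after the snapshot offset (null value = delete).
        Map<String, String> changelog = new LinkedHashMap<>();
        changelog.put("id=1", "age=31");          // UPDATE replayed on top
        changelog.put("id=2", null);              // DELETE replayed on top

        Map<String, String> view = new LinkedHashMap<>(snapshot);
        changelog.forEach((k, v) -> {
            if (v == null) view.remove(k); else view.put(k, v);
        });
        System.out.println(view);                 // {id=1=age=31}
    }
}
```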
### HBase Protocol

- Full Protocol Support: Get, Put, Scan, Delete, Multi-row operations
- Dynamic Tables: Create/drop tables on-the-fly via HBase shell or client API
- Service Discovery: Automatic RegionServer discovery via ZooKeeper
- Admin Operations: Table management, enable/disable tables
- Authentication: SASL/GSSAPI support
- Standard API: Works with HBase Client 2.x, 3.x and tools like Spark (see the Java example below)
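For example, assuming a CAPE instance registered in ZooKeeper at `localhost:2181` (as in the Quick Start below), a standard HBase 2.x Java client connects with no CAPE-specific code:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Point the client at the ZooKeeper quorum CAPE registers with.
        conf.set("hbase.zookeeper.quorum", "localhost");
        conf.set("hbase.zookeeper.property.clientPort", "2181");

        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("users"))) {
            // Put a cell, then read it back.
            Put put = new Put(Bytes.toBytes("user1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes("Alice"));
            table.put(put);

            Result result = table.get(new Get(Bytes.toBytes("user1")));
            System.out.println(Bytes.toString(
                result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("name"))));
        }
    }
}
```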
### Redis Protocol

- 150+ Commands: Full RESP protocol support across 14 data types
- Automatic Sharding: Redis Cluster-style CRC16 hashing (16384 slots)
- Data Types: Strings, Hashes, Sets, Lists, Sorted Sets, Streams, Geo, HyperLogLog, Pub/Sub
- Client Compatible: Works with redis-cli, Python, Node.js, Java, and all standard Redis clients
- Persistent Storage: All data durably stored in Fluss (see the Java example below)
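For instance, the same operations shown later with `redis-cli` work from Java via a standard client such as Jedis, assuming CAPE is listening on the default port 6379:

```java
import redis.clients.jedis.Jedis;

public class RedisExample {
    public static void main(String[] args) {
        // Connect to CAPE's Redis endpoint like any Redis server.
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // String operations
            jedis.set("user:1:name", "Alice");
            System.out.println(jedis.get("user:1:name"));   // Alice

            // Hash operations
            jedis.hset("user:1", "age", "30");
            System.out.println(jedis.hgetAll("user:1"));    // {age=30}

            // Sorted set operations
            jedis.zadd("leaderboard", 100, "Alice");
            jedis.zadd("leaderboard", 200, "Bob");
        }
    }
}
```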
### Kafka Protocol

- Wire Protocol Support: Connect using any standard Kafka client (kafka-console-producer, kafka-python, librdkafka, etc.)
- Unified Messaging: Seamlessly bridge Fluss changelogs and Kafka topics
- High Performance: Low-latency message production and consumption
- Ecosystem Ready: Integration with Kafka Connect, Schema Registry, and stream processing engines (Flink, Spark Streaming); see the Java example below
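As a sketch, a stock Kafka Java producer pointed at CAPE's endpoint (default `localhost:9092`) needs no special configuration:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KafkaExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        // CAPE's Kafka wire-protocol endpoint.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my_topic", "greeting", "Hello Fluss!"));
            producer.flush();   // ensure the record is sent before exiting
        }
    }
}
```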
### PostgreSQL Protocol

- Wire Protocol Support: Connect using any standard PostgreSQL client (psql, DBeaver, etc.)
- SQL Interface: Query Fluss tables using standard SQL syntax
- Hybrid Scan Strategy: Reliable data retrieval combining snapshots and changelogs
- Information Schema: Full support for metadata discovery and database introspection (see the JDBC example below)
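For example, the standard PostgreSQL JDBC driver can query Fluss tables through CAPE, assuming the default port and the `fluss` user shown in the Quick Start below:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PostgresExample {
    public static void main(String[] args) throws Exception {
        // Standard PostgreSQL JDBC driver pointed at CAPE's PG port
        // (requires the org.postgresql driver on the classpath).
        String url = "jdbc:postgresql://localhost:5432/default";
        try (Connection conn = DriverManager.getConnection(url, "fluss", "");
             Statement stmt = conn.createStatement();
             // Equality predicate, per the WHERE-clause limitation noted below.
             ResultSet rs = stmt.executeQuery(
                 "SELECT * FROM default.employees WHERE id = 1")) {
            while (rs.next()) {
                System.out.println(rs.getInt("id") + " " + rs.getString("name"));
            }
        }
    }
}
```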
For details, see ARCHITECTURE.md for a design overview and docs/ for comprehensive guides.
- Java 11+
- Apache Fluss cluster running
- ZooKeeper (shared with or separate from Fluss)
Note: Fluss CAPE currently depends on Apache Fluss 0.9-SNAPSHOT. Please ensure you have built and installed Fluss 0.9-SNAPSHOT locally before building CAPE.
Steps to install Fluss locally:
```bash
git clone https://github.com/apache/fluss.git
cd fluss
./mvnw clean install -DskipTests
```

Build CAPE:
```bash
# Clone repository
git clone https://github.com/gnuhpc/fluss-cape.git
cd fluss-cape

# Build JAR (required for Docker)
mvn clean package -DskipTests

# Build Docker image
docker build -t fluss-cape:1.0.0 .

# Run single instance (host networking exposes the protocol ports
# directly, so no -p mappings are needed)
docker run -d \
  --name fluss-cape \
  --network host \
  -e FLUSS_BOOTSTRAP=localhost:9123 \
  -e ZK_QUORUM=localhost:2181 \
  -e BIND_PORT=16020 \
  -e HEALTH_PORT=8080 \
  fluss-cape:1.0.0

# Check status
curl http://localhost:8080/health
```

All protocols are served from the same container: HBase on 16020, Redis on 6379, PostgreSQL on 5432, health checks on 8080, and Kafka on 9092 (the default `KAFKA_BIND_PORT`, enabled unless you disable Kafka explicitly).
### Use HBase Protocol
```bash
# Connect with HBase shell
hbase shell
# Create table
hbase> create 'users', 'cf'
# Insert data
hbase> put 'users', 'user1', 'cf:name', 'Alice'
hbase> put 'users', 'user1', 'cf:age', '30'
# Retrieve data
hbase> get 'users', 'user1'
# Scan data
hbase> scan 'users'
```

### Use Redis Protocol

```bash
# Connect with redis-cli
redis-cli -p 6379
# String operations
> SET user:1:name "Alice"
OK
> GET user:1:name
"Alice"
# Hash operations
> HSET user:1 name "Alice" age 30
(integer) 2
> HGETALL user:1
1) "name"
2) "Alice"
3) "age"
4) "30"
# Sorted sets
> ZADD leaderboard 100 "Alice" 200 "Bob"
(integer) 2
> ZRANGE leaderboard 0 -1 WITHSCORES
1) "Alice"
2) "100"
3) "Bob"
4) "200"# Produce messages
kafka-console-producer.sh --bootstrap-server localhost:9092 --topic my_topic
> Hello Fluss!
> This is a Kafka message.
# Consume messages
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic my_topic --from-beginning
Hello Fluss!
This is a Kafka message.
```

### Use PostgreSQL Protocol

```bash
# Connect with psql
psql -h localhost -p 5432 -U fluss -d default
# Create table (note: table name must include database prefix)
CREATE TABLE default.employees (
id INTEGER PRIMARY KEY,
name VARCHAR(100),
age INTEGER,
email VARCHAR(100)
);
# Insert data
INSERT INTO default.employees (id, name, age, email) VALUES (1, 'Alice', 30, 'alice@example.com');
INSERT INTO default.employees (id, name, age, email) VALUES (2, 'Bob', 25, 'bob@example.com');
# Query all data
SELECT * FROM default.employees;
# Query with WHERE clause (only equality predicates supported)
SELECT * FROM default.employees WHERE id = 1;
# Update data
UPDATE default.employees SET age = 31 WHERE id = 1;
# Delete data
DELETE FROM default.employees WHERE id = 2;
# Drop table
DROP TABLE default.employees;
```

- Architecture - System design and component architecture
- Getting Started - Detailed installation and setup guide
- HBase Guide - Complete HBase usage with examples
- Redis Guide - Redis commands and client examples
- PostgreSQL Guide - PostgreSQL wire protocol usage and SQL examples
- Configuration - All configuration parameters
- Performance Benchmarks - YCSB benchmarks and tuning
- Functional Tests - Automated testing suite for single and multi-instance deployments
Fluss CAPE includes a comprehensive test suite for validating Redis/Valkey and HBase protocol compatibility:
```bash
# Run all tests (both single and multi-instance)
cd tests
./run-tests.sh

# Run specific tests
./run-tests.sh -m single           # Single instance only
./run-tests.sh -t redis            # Redis protocol only
./run-tests.sh -m multi -t hbase   # Multi-instance HBase only
./run-tests.sh -v                  # Verbose output

# Generate HTML report
./generate-html-report.sh test-reports/test_report_*.log
```

Test Coverage:
- ✅ Redis: String, Hash, List, Set, Sorted Set operations
- ✅ HBase: Table management, Put/Get, Scan, Delete operations
- ✅ Single-instance and multi-instance deployment modes
- ✅ Load balancing and service discovery validation
See tests/README.md for detailed documentation.
Migrate existing HBase applications to Fluss without code changes:
- Spring Data HBase applications
- Apache Phoenix SQL queries
- Spark HBase Connector jobs
Create and modify tables on-the-fly:
- Development and testing environments
- Rapid prototyping and experimentation
- Ad-hoc data analysis workflows
Access the same data through different interfaces:
- Write via Redis, read via HBase
- Batch processing (HBase) + Real-time access (Redis)
- SQL queries (PostgreSQL) + KV operations (Redis)
- Analytics (PostgreSQL) + Low-latency reads (HBase)
Use Redis protocol with Fluss's durable storage:
- Session storage with replay capability
- Real-time leaderboards with historical data
- Message queues with persistence
Query Fluss tables using standard SQL:
- Business intelligence tools (DBeaver, pgAdmin, Tableau)
- PostgreSQL-compatible ORMs (SQLAlchemy, Hibernate)
- Ad-hoc analysis with familiar SQL syntax
- Integration with PostgreSQL ecosystem
Bridge your event-driven architecture with Fluss:
- Ingest real-time events from existing Kafka producers
- Consume Fluss changelogs using standard Kafka consumers
- Seamless integration with the Kafka ecosystem (Connect, KSQL, etc.)
- Unified storage for both streaming events and relational data
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
Apache License 2.0. See LICENSE for details.
- Apache Fluss: https://github.com/apache/fluss
- HBase Documentation: https://hbase.apache.org/
- Redis Documentation: https://redis.io/
- GitHub Issues: Report Issues
Transform Apache Fluss into a multi-model database!
Star us on GitHub • Read the Docs • Architecture

