
CASSSIDECAR-373: Define storage provider interface and Cassandra implementation to support durable job tracking #339

Open
andresbeckruiz wants to merge 10 commits into apache:trunk from andresbeckruiz:CASSSIDECAR-373


andresbeckruiz commented Apr 23, 2026

CASSSIDECAR-373

Implements CassandraStorageProvider, the default Cassandra-backed implementation of the StorageProvider interface for durable operational job state.

Changes

  • StorageProvider interface, a provider-agnostic abstraction for persisting, querying, and coordinating operational jobs

  • OperationalJobRecord is the data transfer object representing persisted job state

  • OperationalJobConfiguration provides a configurable table TTL and can later support connecting to a remote Cassandra cluster

  • Three Cassandra table schemas:

    • cluster_ops — persists and tracks operational jobs (restarts, upgrades, etc.)
    • cluster_ops_node_state — tracks per-node status within an operation
    • active_cluster_ops — provides mutual exclusion of concurrent operations via LWT
  • Database accessors: ClusterOpsDatabaseAccessor, ClusterOpsNodeStateDatabaseAccessor, and ActiveClusterOpsDatabaseAccessor

  • CassandraStorageProvider composes the three accessors and acts as a thin delegation layer

  • Integration tests for all three database accessors
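As a rough illustration of the "thin delegation layer" shape described above, here is a minimal in-memory sketch. Every class and method name in it is a hypothetical stand-in (the PR's actual StorageProvider and accessor signatures are not shown in this conversation), and plain maps replace the three Cassandra tables, so it only demonstrates the composition pattern, not the real implementation.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// All names below are hypothetical stand-ins, not the PR's actual signatures.

/** Stand-in for an accessor over cluster_ops: operation id -> status. */
final class OpsAccessor {
    private final Map<UUID, String> statusByOperation = new ConcurrentHashMap<>();

    void upsert(UUID operationId, String status) {
        statusByOperation.put(operationId, status);
    }

    String status(UUID operationId) {
        return statusByOperation.get(operationId);
    }
}

/** Stand-in for an accessor over cluster_ops_node_state: (operation id, node) -> status. */
final class NodeStateAccessor {
    private final Map<String, String> statusByNode = new ConcurrentHashMap<>();

    void upsert(UUID operationId, String node, String status) {
        statusByNode.put(operationId + "/" + node, status);
    }

    String status(UUID operationId, String node) {
        return statusByNode.get(operationId + "/" + node);
    }
}

/** Stand-in for an accessor over active_cluster_ops: one active operation per key. */
final class ActiveOpsAccessor {
    private final Map<String, UUID> activeByKey = new ConcurrentHashMap<>();

    /** Mimics an insert-if-not-exists: true only for the first claimant of the key. */
    boolean trySetActive(String clusterName, String operationType, UUID operationId) {
        return activeByKey.putIfAbsent(clusterName + "/" + operationType, operationId) == null;
    }

    /** Clears the slot only if the given operation still holds it. */
    boolean clearActive(String clusterName, String operationType, UUID operationId) {
        return activeByKey.remove(clusterName + "/" + operationType, operationId);
    }
}

/** Thin delegation layer: no logic of its own, it only forwards to the accessors. */
final class SketchStorageProvider {
    private final OpsAccessor ops = new OpsAccessor();
    private final NodeStateAccessor nodeStates = new NodeStateAccessor();
    private final ActiveOpsAccessor activeOps = new ActiveOpsAccessor();

    void persistJob(UUID operationId, String status) {
        ops.upsert(operationId, status);
    }

    String jobStatus(UUID operationId) {
        return ops.status(operationId);
    }

    void persistNodeState(UUID operationId, String node, String status) {
        nodeStates.upsert(operationId, node, status);
    }

    String nodeStatus(UUID operationId, String node) {
        return nodeStates.status(operationId, node);
    }

    boolean trySetActive(String clusterName, String operationType, UUID operationId) {
        return activeOps.trySetActive(clusterName, operationType, operationId);
    }

    boolean clearActive(String clusterName, String operationType, UUID operationId) {
        return activeOps.clearActive(clusterName, operationType, operationId);
    }
}
```

Keeping the provider free of logic means a different backend would only need to swap the accessors, which is presumably what makes the abstraction provider-agnostic.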

Future Work

  • Remote Cassandra cluster storage support
  • Future CEP-53 work:
    • CASSSIDECAR-374: Implement durable operational job tracker
    • CASSSIDECAR-375: Add storage provider to enable JVM Distributed tests for cluster-wide operational jobs
    • CASSSIDECAR-377: Implement job coordination for cluster-wide operations

Comment thread server/src/main/java/org/apache/cassandra/sidecar/db/schema/ClusterOpsSchema.java Outdated

andresbeckruiz commented Apr 29, 2026


isaacreath left a comment


+1 (nb)

{
BoundStatement statement = tableSchema.trySetActive().bind(clusterName, operationType, operationId);
statement.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
statement.setSerialConsistencyLevel(ConsistencyLevel.LOCAL_SERIAL);
Contributor


Shouldn't we be using global SERIAL to ensure the active operation will be consistent across all DCs?

Contributor Author


We are initially defaulting to LOCAL_SERIAL because global SERIAL would likely time out on globally distributed clusters (e.g. us-east, us-west, eu). LOCAL_SERIAL can still guarantee that only one job is ongoing per datacenter, assuming the submitted nodes are restricted to that DC.

Once configuration is added for the operational job framework, we should allow the consistency level to be set as needed.
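To make the trade-off concrete, here is a toy model of the two scopes. It is only a sketch under stated assumptions: an in-memory map stands in for the Paxos round, the class and method names are hypothetical, and the datacenter is treated as part of the lock key when the scope is LOCAL_SERIAL. It does not touch the driver or the real schema.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model only: a map stands in for Paxos, and all names are hypothetical. */
final class SerialScopeModel {
    enum Scope { LOCAL_SERIAL, SERIAL }

    private final Scope scope;
    private final Map<String, UUID> active = new ConcurrentHashMap<>();

    SerialScopeModel(Scope scope) {
        this.scope = scope;
    }

    /**
     * Returns true if the caller became the active operation.
     * Under LOCAL_SERIAL the datacenter is part of the key, so each DC
     * arbitrates on its own; under SERIAL there is one slot for the whole cluster.
     */
    boolean trySetActive(String datacenter, String clusterName, String operationType, UUID operationId) {
        String base = clusterName + "/" + operationType;
        String key = scope == Scope.LOCAL_SERIAL ? datacenter + "/" + base : base;
        return active.putIfAbsent(key, operationId) == null;
    }
}
```

In this model, LOCAL_SERIAL admits one restart in us-east and another in eu at the same time, while SERIAL admits only one across both. That matches the caveat above: the per-DC guarantee only helps if each job's nodes stay within a single DC.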

Contributor


If we're going for per-DC jobs, then it may make sense to add this into the schema as well so that we can get the active job by DC.


andresbeckruiz commented May 18, 2026


Added datacenter to ActiveClusterOps schema in 55bdd43

andresbeckruiz force-pushed the CASSSIDECAR-373 branch 3 times, most recently from 6c99923 to e5a64df on May 15, 2026 19:36

3 participants