
CASSSIDECAR-419 : Add Docker compose setup for CDC #330

Open
jyothsnakonisa wants to merge 2 commits into apache:trunk from jyothsnakonisa:cdc-docker-compose

Conversation

@jyothsnakonisa
Contributor

No description provided.

Comment thread docker/cdc-demo/README.md

## Architecture

```
Contributor

Looks nice, easy to understand for new users

Contributor

@skoppu22 skoppu22 left a comment

LGTM


YAML="/etc/cassandra/cassandra.yaml"

patch_yaml() {

Would it be worth adding a "didn't find it, couldn't patch" style warning or error here?
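
One possible shape for such a check, as a minimal sketch. The body of `patch_yaml` is not shown in this thread, so the helper name `patch_yaml_checked`, its `file`/`key`/`value` arguments, and the `sed`-based replacement are all assumptions for illustration, not the script's actual implementation.

```shell
#!/bin/sh
# Hypothetical helper: replace a top-level "key: value" line in a YAML file
# and fail loudly when the key is missing instead of silently doing nothing.
# Usage: patch_yaml_checked <file> <key> <value>
# Assumes the key has no regex metacharacters and the value has no '|' or '&'.
patch_yaml_checked() {
    pyc_file=$1
    pyc_key=$2
    pyc_value=$3

    # Match the key at the start of a line, optionally commented out ("# key:").
    if ! grep -Eq "^#? *${pyc_key}:" "$pyc_file"; then
        echo "ERROR: '${pyc_key}' not found in ${pyc_file}; could not patch" >&2
        return 1
    fi

    # Write to a temp file first so a failed sed never truncates the original.
    pyc_tmp=$(mktemp) || return 1
    if sed -E "s|^#? *${pyc_key}:.*|${pyc_key}: ${pyc_value}|" "$pyc_file" > "$pyc_tmp"; then
        cat "$pyc_tmp" > "$pyc_file"   # overwrite in place, keeping permissions
        rm -f "$pyc_tmp"
    else
        rm -f "$pyc_tmp"
        return 1
    fi
}
```

A caller could then decide whether a missing key is fatal, for example `patch_yaml_checked "$YAML" cdc_enabled true || exit 1`.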

Comment thread docker/cdc-demo/Dockerfile.sidecar Outdated
Comment thread build.gradle Outdated
@jyothsnakonisa jyothsnakonisa force-pushed the cdc-docker-compose branch 3 times, most recently from 06c2451 to 1cd0b8d Compare April 6, 2026 20:00
Comment thread docker/cdc-demo/Dockerfile.sidecar Outdated

WORKDIR /build

# Build descriptors first — keeps the dependency-download layer cached separately.
Contributor

@smiklosovic smiklosovic Apr 8, 2026

This is a bit confusing, as the user has to run "gradle jar" at least before this is executed. Without running "jar", what you copy will not contain "sidecar.version"; the build then passes, but the sidecar fails to start like this:

sidecar-1         | INFO  [main] 2026-04-08 16:14:45,333 SidecarSchemaModule.java:59 - Registering table schema: sidecar_internal.restore_range_v1
sidecar-1         | INFO  [main] 2026-04-08 16:14:45,333 SidecarSchemaModule.java:59 - Registering table schema: sidecar_internal.restore_slice_v3
sidecar-1         | Exception in thread "main" com.google.inject.ProvisionException: Unable to provision, see the following errors:
sidecar-1         | 
sidecar-1         | 1) [Guice/ErrorInCustomProvider]: IllegalStateException: Failed to retrieve Sidecar version from resource /sidecar.version
sidecar-1         |   at UtilitiesModule.sidecarVersionProvider(UtilitiesModule.java:88)
sidecar-1         |   at ConfigurationModule.instancesMetadata(ConfigurationModule.java:175)
sidecar-1         |       \_ for 4th parameter
sidecar-1         |   at ConfigurationModule.instancesMetadata(ConfigurationModule.java:175)
sidecar-1         |   at InstanceMetadataFetcher.<init>(InstanceMetadataFetcher.java:51)
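
A fail-fast guard could catch this before the image is built rather than at sidecar startup. The following is a minimal sketch only: `require_build_output` is a hypothetical helper, and the location of the generated `sidecar.version` file is not stated in this thread, so the caller has to supply the real path.

```shell
#!/bin/sh
# Hypothetical pre-flight check: abort early when an expected build output
# is missing or empty, and tell the user how to produce it.
# Usage: require_build_output <path> <hint>
require_build_output() {
    if [ ! -s "$1" ]; then
        echo "ERROR: missing build output: $1" >&2
        echo "Hint: $2" >&2
        return 1
    fi
}
```

For example, `require_build_output "$VERSION_FILE" 'run the Gradle "jar" task first' || exit 1`, where `VERSION_FILE` is an assumed variable pointing at wherever the build writes `sidecar.version`.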

Comment thread docker/cdc-demo/docker-compose.yml Outdated
Comment thread docker/cdc-demo/scripts/start.sh
@smiklosovic
Contributor

smiklosovic commented Apr 8, 2026

I suggest also including this to make it more robust. I like to be explicit about creating the topic. If somebody starts the stack and goes to kafka-ui, there will not be any topic for them to see messages in, which might be confusing.

7eb33fc

@smiklosovic smiklosovic self-requested a review May 21, 2026 06:17
@jyothsnakonisa
Contributor Author

@smiklosovic Thank you, will add 7eb33fc as a separate PR


@jmckenzie-dev jmckenzie-dev left a comment

Still working on testing it; my immediate naive attempt to run things went like:

  1. Try and build the dockerfile locally in the directory (failed; pathing needs to be REPO_ROOT)
  2. Look for scripts in REPO_ROOT/scripts for running the docker test (nothing)
  3. Then actually read the documentation in the docker directory to see what I'm actually supposed to be doing 😀
  4. Run into an error with docker compose on the Dockerfile; try start.sh with --help and get no help there

So there's some devx "intuitive explorability" rough edges that I think we could do some work to smooth over. Will update more when I figure out why I'm getting these errors on building the docker file and determine if it's a local env problem or with the patch.

Comment thread server/build.gradle
implementation "com.esotericsoftware:kryo-shaded:${kryoVersion}"

// Confluent Avro serializer — used when value.serializer=KafkaAvroSerializer (confluent mode)
implementation 'io.confluent:kafka-avro-serializer:7.6.0'

Is this only needed for this demo? Or do we need this for execution in the sidecar broadly for the avro serializer support in CDC? If the former, I'm a little on the fence on including this non-trivial dependency in the broader project vs. having a separate gradle subproject for the demo so it doesn't pollute the primary project namespaces w/stuff we only need for the demo.

Contributor Author

KafkaAvroSerializer is a popular schema-store-integrated serializer. Anyone who wants to use it without this dependency will run into class-not-found issues.

Stefan was trying this serializer and ran into a class-not-found error, hence I added this dependency: it is needed both for the demo and for other users who want to use this serializer.

export CASSANDRA_USE_JDK11=true
# Trunk (5.1-SNAPSHOT) compiles 2700+ source files; without an explicit heap
# limit ant hits the JVM default (~2GB) and gets OOM-killed on large executors.
export ANT_OPTS="${ANT_OPTS:-} -Xmx4g"

Is this related to this CDC dockerfile demo setup out of curiosity? I'm fine either way, just wondered how they connected.

Comment thread docker/cdc-demo/README.md
| `cassandra` | `cassandra:5.0` | CDC-enabled Cassandra node |
| `cassandra-init` | `cassandra:5.0` | One-shot: seeds sidecar schema + configs |
| `sidecar` | `cassandra-sidecar:dev` | Reads commit logs, publishes to Kafka |
| `kafka-ui` | `ghcr.io/kafbat/kafka-ui:v1.6.1` | Browse topics + decoded Avro messages |

The file has 1.5.0 and we have 1.6.1 here - update this or that?

Contributor Author

I tried version 1.6.1 earlier and it worked, but after a while that version was no longer available. I then changed it to 1.5.0 but missed updating it in the README. Will update it!

'docker compose logs -f sidecar 2>&1 | grep -m 1 "CDC iterators started successfully"' \
> /dev/null || echo "Warning: timed out waiting for CDC iterators — check: docker compose logs sidecar"
else
docker compose logs -f sidecar 2>&1 | grep -m 1 "CDC iterators started successfully" > /dev/null || true

Do we have any other options on macOS instead of just "follow forever"? :)
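
One option that needs no `timeout` binary is to poll a log snapshot instead of following the stream. The sketch below assumes nothing beyond POSIX `sh`, `date +%s` and `sleep`. `wait_for_log_line` and the `POLL_INTERVAL` variable are hypothetical names, not part of the script under review.

```shell
#!/bin/sh
# Poll a log snapshot until a fixed string appears or a deadline passes.
# Usage: wait_for_log_line <timeout_seconds> <text> <command> [args...]
# <command> must print the logs so far and exit (no -f / follow mode).
wait_for_log_line() {
    wfl_timeout=$1
    wfl_text=$2
    shift 2
    wfl_deadline=$(( $(date +%s) + $wfl_timeout ))

    while :; do
        wfl_logs=$("$@" 2>&1)
        # Plain substring match; avoids a pipe, so "set -o pipefail" is harmless.
        case $wfl_logs in
            *"$wfl_text"*) return 0 ;;
        esac
        if [ "$(date +%s)" -ge "$wfl_deadline" ]; then
            return 1
        fi
        sleep "${POLL_INTERVAL:-2}"
    done
}
```

On any platform this could be called as `wait_for_log_line 300 "CDC iterators started successfully" docker compose logs sidecar || echo "Warning: timed out waiting for CDC iterators"`. The trade-off is that each poll re-reads the whole log, which is acceptable for a short-lived demo stack.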

# ── Wait for sidecar ─────────────────────────────────────────────────────────
echo ""
echo "Waiting for sidecar to be ready (follow progress: docker compose logs -f cassandra-init sidecar)..."
until curl -sf http://localhost:9043/api/v1/__health > /dev/null 2>&1; do

This can loop forever in the error scenario; a timeout would be friendly. 5 minutes or something - whatever makes sense.
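
A bounded version of the health-check loop might look like the sketch below. `wait_for_http`, the 300-second default, and `POLL_INTERVAL` are assumptions chosen for illustration; `curl`'s `-s`, `-f` and `--max-time` options are standard.

```shell
#!/bin/sh
# Poll an HTTP endpoint until it answers successfully or a deadline passes.
# Usage: wait_for_http <url> [timeout_seconds]   (default: 300, i.e. 5 minutes)
wait_for_http() {
    wfh_url=$1
    wfh_timeout=${2:-300}
    wfh_start=$(date +%s)

    # --max-time bounds each attempt so one hung connection cannot stall the loop.
    until curl -sf --max-time 5 "$wfh_url" > /dev/null 2>&1; do
        wfh_elapsed=$(( $(date +%s) - $wfh_start ))
        if [ "$wfh_elapsed" -ge "$wfh_timeout" ]; then
            echo "ERROR: $wfh_url not ready after ${wfh_timeout}s" >&2
            return 1
        fi
        sleep "${POLL_INTERVAL:-3}"
    done
}
```

In the start script this could replace the open-ended `until` loop, for example `wait_for_http http://localhost:9043/api/v1/__health 300 || { docker compose logs sidecar; exit 1; }`, so a failed startup ends with the sidecar logs on screen instead of a hang.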

SKIP_BUILD=false
SERIALIZER_MODE=confluent

for arg in "$@"; do

No --help - could we add that?
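
A sketch of what that could look like, built around the two variables visible in the diff. The flag names `--skip-build` and `--serializer=MODE` are guesses for illustration, since the script's real options are not shown in this thread. `usage` and `parse_args` are hypothetical function names.

```shell
#!/bin/sh
# Defaults mirror the two variables visible in the diff.
SKIP_BUILD=false
SERIALIZER_MODE=confluent

usage() {
    cat <<'EOF'
Usage: start.sh [OPTIONS]

Options (names are illustrative, not taken from the actual script):
  --skip-build          Reuse existing images instead of rebuilding
  --serializer=MODE     Serializer mode (default: confluent)
  -h, --help            Show this help and exit
EOF
}

parse_args() {
    for arg in "$@"; do
        case $arg in
            -h|--help)
                usage
                exit 0
                ;;
            --skip-build)
                SKIP_BUILD=true
                ;;
            --serializer=*)
                # Strip the "--serializer=" prefix, keeping only the value.
                SERIALIZER_MODE=${arg#--serializer=}
                ;;
            *)
                # Unknown flags fail loudly and point the user at the help text.
                echo "Unknown option: $arg" >&2
                usage >&2
                exit 1
                ;;
        esac
    done
}
```

Calling `parse_args "$@"` before any Docker work means `start.sh --help` returns immediately, and a mistyped flag is rejected instead of being silently ignored.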
