Add 2026-06 RTM-on-SDP flight tracker blog example #92

Open
apingledbx wants to merge 3 commits into databricks-solutions:main from apingledbx:2026-06-rtm-on-sdp-flight-tracker
Conversation

@apingledbx

Spark Declarative Pipeline in Real-Time Mode: a stateless enrichment flow and a stateful zone-congestion flow (transformWithState), both writing to Lakebase.

Co-authored-by: Isaac
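
The difference between the two flows in the description can be illustrated in plain Python. This is only a sketch of the stateless versus stateful distinction: it uses no Spark or `transformWithState` API, it is not the PR's implementation, and every name in it (`enrich`, `ZoneCongestion`, the `airport` field) is hypothetical.

```python
from collections import defaultdict


def enrich(event: dict, airports: dict) -> dict:
    """Stateless step: the output depends only on this event and a static lookup."""
    return {**event, "airport_name": airports.get(event["airport"], "unknown")}


class ZoneCongestion:
    """Stateful step: remembers which flights are currently in each zone."""

    def __init__(self) -> None:
        self._flights_by_zone = defaultdict(set)
        self._zone_by_flight = {}

    def update(self, flight_id: str, zone: str) -> int:
        """Record a position report and return the flight count for that zone."""
        previous = self._zone_by_flight.get(flight_id)
        if previous is not None and previous != zone:
            # The flight left its old zone, so stop counting it there.
            self._flights_by_zone[previous].discard(flight_id)
        self._flights_by_zone[zone].add(flight_id)
        self._zone_by_flight[flight_id] = zone
        return len(self._flights_by_zone[zone])
```

The stateless function can process each event independently, while the congestion count needs state that survives between events, which is why the description pairs that flow with `transformWithState`.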

@anupkalburgi anupkalburgi left a comment

Collaborator

LGTM

@anupkalburgi anupkalburgi left a comment

Collaborator

LGTM

@anupkalburgi anupkalburgi self-requested a review June 25, 2026 22:41
spark_conf={
"pipelines.trigger": "RealTime", # turn RTM on
# checkpoint cadence for state + offsets, NOT a micro-batch size
"pipelines.trigger.interval": "1 minute",

I would suggest 5 minutes; that is the recommended default.

Author

fixed
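
The setting agreed in this thread can be sketched as a small helper. The config keys and the `RealTime` and `5 minutes` values are taken from the diff and the review comment; the function name `rtm_spark_conf` is hypothetical, and this is not claimed to be the full set of settings the pipeline needs.

```python
from __future__ import annotations


def rtm_spark_conf(checkpoint_interval: str = "5 minutes") -> dict[str, str]:
    """Build the flow's spark_conf (hypothetical helper, not part of the PR)."""
    return {
        # Turn Real-Time Mode on.
        "pipelines.trigger": "RealTime",
        # Checkpoint cadence for state + offsets, NOT a micro-batch size.
        "pipelines.trigger.interval": checkpoint_interval,
    }
```

Keeping the interval as a defaulted argument makes the reviewer's recommended value the default while still allowing an explicit override per flow.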

# COMMAND ----------

# Native Lakebase external sink: write straight to the operational store the
# app reads, instead of landing results in a table for analytics.

Please indicate that this is a Private Preview (PrPr) experience.

Author

Added a note

# checkpoint cadence for state + offsets, NOT a micro-batch size
"pipelines.trigger.interval": "5 minutes",
"spark.sql.shuffle.partitions": "4",
"spark.sql.streaming.jdbc.enabled": "true",

Same here

Author

done

```
spark.databricks.streaming.realTimeMode.enabled = true
pipelines.externalSink.enabled = true
spark.sql.streaming.jdbc.enabled = true
```

same here

Author

done


# Native Lakebase external sink: write results straight to the operational store
# the app reads, instead of landing them in a table for analytics.
dp.create_sink(

same here

Author

Added

# checkpoint cadence for state + offsets, NOT a micro-batch size
"pipelines.trigger.interval": "1 minute",
"spark.sql.shuffle.partitions": "4",
"spark.sql.streaming.jdbc.enabled": "true",

same here

Author

done

…s on Lakebase sink

- congestion flow trigger.interval 1 minute -> 5 minutes (recommended RTM default)
- note the jdbcStreaming Lakebase sink is Private Preview at each sink/jdbc.enabled spot (01, 02, README)

Co-authored-by: Isaac
3 participants