RavenDB-26046 - Add CDC Sink documentation #2387
Open
ayende wants to merge 15 commits into ravendb:main
Adds full CDC Sink ongoing task documentation in Docusaurus MDX format:
- 16 core pages: overview, how-it-works, schema-design, embedded-tables, linked-tables, column-mapping, patching, delete-strategies, property-retention, attachment-handling, configuration-reference, api-reference, monitoring, failover-and-consistency, troubleshooting, server-configuration
- 9 PostgreSQL pages: prerequisites-checklist, wal-configuration, permissions-and-roles, initial-setup, replica-identity, replica-identity-manual-setup, cleanup-and-maintenance, monitoring-postgres, studio-ui
- 4 PostgreSQL examples: simple-migration, denormalization, event-sourcing, complex-nesting
- 1 SQL Server stub: overview
- 4 _category_.json navigation files
…al features
- Replace ColumnsMapping (Dictionary) + AttachmentNameMapping (Dictionary) with
unified Columns list of CdcColumnMapping { Column, Name, Type } across all files
- Add CdcColumnType enum documentation (Default, Json, Attachment)
- Add REST API endpoints table to configuration-reference
- Add CdcSink.PollIntervalInSec to server-configuration
- Add error handling details to monitoring (threshold, fallback, exponential backoff)
- Add ALTER PUBLICATION auto-fix note to postgres/initial-setup
- Fix how-it-works: sequential scan description, Child Before Parent section
- Fix Startup and Verification: split into per-database subsections
- Update all prose references from ColumnsMapping to Columns list
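The move from two dictionaries to one unified Columns list can be pictured with a small model. This is a sketch only, written in Python rather than the client's own language. The names `CdcColumnMapping`, `CdcColumnType`, `Column`, `Name`, and `Type` come from the commit message above. The `unify` helper, the sample column names, and the reading of `Column` as the source column and `Name` as the target property or attachment name are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class CdcColumnType(Enum):
    # The three values named in the commit message.
    DEFAULT = "Default"
    JSON = "Json"
    ATTACHMENT = "Attachment"


@dataclass(frozen=True)
class CdcColumnMapping:
    column: str  # assumed: the source column name
    name: str    # assumed: the target property or attachment name
    type: CdcColumnType = CdcColumnType.DEFAULT


def unify(columns_mapping, attachment_name_mapping):
    """Hypothetical helper: merge the two legacy dictionaries into one list."""
    unified = [CdcColumnMapping(col, name) for col, name in columns_mapping.items()]
    unified += [
        CdcColumnMapping(col, name, CdcColumnType.ATTACHMENT)
        for col, name in attachment_name_mapping.items()
    ]
    return unified


# Sample data is invented for the example.
columns = unify({"id": "Id", "title": "Title"}, {"cover_image": "cover.png"})
```

With a single list, each entry carries its own type, so a reader no longer has to check two separate dictionaries to learn how a column is handled.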
…chment handling
- Replace all new() shorthand with new CdcColumnMapping() across all files
- attachment-handling: clarify that text columns (text, nvarchar, etc.) as well as binary columns can use Type = CdcColumnType.Attachment
… uses application/octet-stream
…$old documentation
- Add postgres/type-mapping.mdx: full reference table of PostgreSQL column types and their JavaScript/CLR equivalents (scalars, arrays, json/jsonb, bytea, pgvector)
- Add patching.mdx "$row and $old: Names and Types"
- Fix cleanup-and-maintenance.mdx: replace obsolete "Configuration Changes That Rename Slots" section (described hash-based naming, no longer accurate) with correct "Slot and Publication Names Are Immutable" section reflecting enforced immutability
- Name → CollectionName on CdcSinkTableConfig across all files
- Remove Type from CdcSinkLinkedTableConfig (linked tables have no relation type)
- Remove Disabled from CdcSinkEmbeddedTableConfig, add LinkedTables
- Add FactoryName table (Npgsql, SqlClient, MySql) to configuration-reference
- Add CdcColumnMapping and CdcColumnType reference sections
- Add put(id, document) and del(id) to patch capabilities
- Add JSON Columns section to column-mapping
- Remove non-existent Array References section from linked-tables
- Remove non-existent Disabling an Embedded Table section from embedded-tables
- Update server-configuration descriptions (MaxBatchSize, MaxFallbackTimeInSec, PollIntervalInSec applies to SQL Server only)
- Fix licensing link in overview
- Add SQL Server and MySQL/MariaDB as supported source databases
- postgres/overview.mdx: connection string, logical replication explanation, prerequisites summary, section index
- sql-server/overview.mdx: expand from stub to full page with connection string, CDC prerequisites, polling behavior, SourceTableSchema default
- mysql/overview.mdx + _category_.json: connection string, binlog prerequisites, streaming behavior, required privileges
…roubleshooting sections
- Source Schema Changes: how each database engine handles DDL changes on source tables while CDC Sink is running (adding/removing/renaming columns, SQL Server capture instance limitations)
- Partial Export/Import and State Loss: @cdc-states collection, recovery guidance, SkipInitialLoad workaround, LSN editing risks
ayende commented Apr 7, 2026
…update behavior
Broken links fixed:
- api-reference: send-multiple-operations → what-are-operations
- attachment-handling: what-are-attachments → attachments/overview
- column-mapping, patching: postgres/type-mapping.mdx (nonexistent) → cross-reference to patching.mdx#row-column-types
PR comment fixes:
- MySQL overview: rename MyMySqlConnection → MySqlConnection
- Postgres overview: slot/publication names are GUID-based on first use, not deterministic hash-based
New content:
- how-it-works: "Updating the Task Configuration" section explaining that config changes only apply to new CDC events going forward. Existing documents are not retroactively re-processed. To apply changes to all documents, delete and recreate the task.
Move schema change documentation from troubleshooting into its own page with per-engine detail:
- PostgreSQL: auto-detects via RelationMessage, most resilient
- MySQL: detects via TableMapEvent column types, auto-recovers
- SQL Server: requires explicit capture instance procedure (create new instance, drain old, then drop)
- Quick reference table, SQL examples, recovery mechanism
- Troubleshooting retains a short summary with link to the new page
MySQL CDC detects changes by column position. Compound ALTER TABLE statements (add + drop, ADD COLUMN ... AFTER ...) cause positional shifts that are hard to resolve. Apply one change at a time and let CDC Sink catch up between each.
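The positional-shift problem can be shown with a toy model. This is a sketch, not CDC Sink's actual detection logic: the `diff_by_position` function and the sample column types are hypothetical, and the only assumption carried over from the note above is that a reader sees columns by position and type.

```python
def diff_by_position(old_cols, new_cols):
    """Compare two column-type lists slot by slot.

    Returns a (position, old_type, new_type) tuple for every slot that differs.
    None marks a slot that exists in only one of the two lists.
    """
    changes = []
    for pos in range(max(len(old_cols), len(new_cols))):
        old = old_cols[pos] if pos < len(old_cols) else None
        new = new_cols[pos] if pos < len(new_cols) else None
        if old != new:
            changes.append((pos, old, new))
    return changes


before = ["int", "varchar", "datetime"]

# One change at a time: a column appended at the end touches a single slot.
single = diff_by_position(before, ["int", "varchar", "datetime", "int"])

# Compound change: drop the varchar column and add a text column in one statement.
# By position this looks like two unrelated type changes.
compound = diff_by_position(before, ["int", "datetime", "text"])
```

In this model `single` has one entry, a new slot at position 3, while `compound` reports slots 1 and 2 as changed with nothing to say that one column was dropped and another added. That ambiguity is why applying one change at a time is the safer route.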
Summary
Adds full documentation for the new CDC Sink ongoing task (RavenDB 7.2, RavenDB-26046).
Core pages (16): overview, how-it-works, schema-design, embedded-tables, linked-tables, column-mapping, patching, delete-strategies, property-retention, attachment-handling, configuration-reference, api-reference, monitoring, failover-and-consistency, troubleshooting, server-configuration
PostgreSQL pages (9): prerequisites-checklist, wal-configuration, permissions-and-roles, initial-setup, replica-identity, replica-identity-manual-setup, cleanup-and-maintenance, monitoring-postgres, studio-ui
PostgreSQL examples (4): simple-migration, denormalization, event-sourcing, complex-nesting
SQL Server (1): overview stub
Key topics covered:
CdcColumnMapping with Column, Name, and CdcColumnType (Default, Json, Attachment)
$row, $old, load(), OnDelete strategies
Test plan
Open /server/ongoing-tasks/cdc-sink/overview and verify sidebar navigation