Document dependency tracking and cascade re-indexing for the AEM connector. Adds a configuration reference entry for dumont.dependencies.enabled (default false) and a new AEM connector section that explains how dependencies are discovered (collects string values starting with /content from .infinity.json), when cascade re-indexing runs (standalone/incremental only, not Index All), configuration/usage examples, and performance considerations. Includes sequence flow for event -> reindex -> dependent reindex behavior and notes about needing a Reindex All to populate dependencies for existing records.
docs-dumont/configuration-reference.md
+6 lines changed: 6 additions & 0 deletions
@@ -151,6 +151,12 @@ logging:

| `dumont.aem.querybuilder` | `false` | Enable QueryBuilder-based content discovery instead of tree traversal during full indexing |
| `dumont.aem.querybuilder.parallelism` | `10` | Number of parallel threads for processing discovered paths |

### Dependency Tracking

| Property | Default | Description |
|---|---|---|
| `dumont.dependencies.enabled` | `false` | Persist `/content/*` references extracted from each indexed page and cascade re-index any dependent document on standalone updates. Currently populated only by the AEM connector; see [AEM Connector → Dependency Tracking](./connectors/aem.md#dependency-tracking-and-cascade-re-indexing) |

docs-dumont/connectors/aem.md
+71 lines changed: 71 additions & 0 deletions
@@ -154,6 +154,77 @@ For each page, the connector:

<div className="page-break" />

## Dependency Tracking and Cascade Re-Indexing

When a page references other content (experience fragments, content fragments, shared components, linked pages), the AEM connector can automatically **re-index every page that depends on an updated path**. This prevents stale content from surviving in the index when a shared resource changes.

### How Dependencies Are Discovered

As each node's `.infinity.json` is fetched, the connector walks the JSON recursively and collects **every string value that starts with `/content`**. Those paths become the document's dependency set. The extraction is completely automatic; no configuration on the AEM side is required.

The dependency set is then attached to the `DumJobItemWithSession` and persisted alongside the indexing record (`dum_connector_dependency` table) whenever the record is saved or updated.
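
The recursive walk described above can be sketched as follows. This is an illustrative Python sketch only, not the connector's actual implementation: the function name `collect_dependencies`, the sample JSON, and every path in it are hypothetical.

```python
import json
from typing import Any, Optional, Set

CONTENT_PREFIX = "/content"  # prefix named in the section above


def collect_dependencies(node: Any, found: Optional[Set[str]] = None) -> Set[str]:
    """Recursively collect every string value that starts with /content.

    Hypothetical helper for illustration; only values are inspected, never keys.
    """
    if found is None:
        found = set()
    if isinstance(node, str):
        if node.startswith(CONTENT_PREFIX):
            found.add(node)
    elif isinstance(node, dict):
        for value in node.values():
            collect_dependencies(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_dependencies(item, found)
    # Numbers, booleans and nulls carry no paths and are ignored.
    return found


# Made-up stand-in for a page's .infinity.json payload.
sample = json.loads("""
{
  "jcr:primaryType": "cq:Page",
  "jcr:content": {
    "jcr:title": "Home",
    "header": {"fragmentVariationPath": "/content/experience-fragments/example/header/master"},
    "links": ["/content/example/en/about", "https://example.com/external"],
    "teaser": {"fileReference": "/content/dam/example/hero.jpg", "width": 1200}
  }
}
""")

deps = collect_dependencies(sample)
```

In this sample, `deps` holds the three `/content` paths; the title, the external URL, and the numeric width are skipped.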

### When the Cascade Fires

Dependency processing runs **only on standalone (incremental) indexing**; it is **not** triggered by `Index All`:

```mermaid
sequenceDiagram
    participant EVT as AEM Event Listener<br/>(or Manual API)
    participant API as Dumont Connector
    participant DB as Indexing Store
    participant SE as Turing ES

    EVT->>API: POST /api/v2/aem/index/{source}<br/>{paths: ["/content/wknd/.../header"]}
    API->>SE: Index the updated path(s)
    API->>DB: findObjectIdsByDependencies(paths)
    DB-->>API: IDs of pages that reference those paths
    API->>SE: Re-index the dependent pages
```
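
The first arrow in the diagram, the manual trigger, might be built like this. It is a hedged sketch: the endpoint and the `paths` body come from the diagram, while the base URL, source name, example path, and JSON `Content-Type` header are assumptions. Authentication is not covered, and the request is only constructed, never sent.

```python
import json
import urllib.request
from typing import List


def build_index_request(base_url: str, source: str, paths: List[str]) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/v2/aem/index/{source}.

    Hypothetical helper; the JSON content type is an assumption.
    """
    body = json.dumps({"paths": paths}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v2/aem/index/{source}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, source name and content path.
request = build_index_request(
    "http://dumont.example.com",
    "my-source",
    ["/content/example/shared/header"],
)
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out here because the host above is a placeholder.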

1. A page is indexed via event listener, manual API call, or the Indexing Manager.
2. The main indexing command runs first (the `Job Item` is produced and sent).
3. `DependencyHandler` queries the indexing store for every document whose stored dependency list contains one of the updated paths.
4. A second `IndexPaths` command re-indexes those dependents, which in turn also have their own dependencies refreshed.
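
The steps above can be modelled with a small in-memory sketch. Everything here is a hypothetical stand-in for the real components: `InMemoryIndexingStore` plays the indexing store, `index_with_cascade` plays the two indexing passes, object IDs are assumed to be content paths, and skipping dependents that were already in the first pass is an assumption of this sketch.

```python
from typing import Callable, Dict, Iterable, List, Set


class InMemoryIndexingStore:
    """Hypothetical stand-in for the indexing store and its dependency table."""

    def __init__(self) -> None:
        self._deps: Dict[str, Set[str]] = {}

    def save(self, object_id: str, dependencies: Iterable[str]) -> None:
        self._deps[object_id] = set(dependencies)

    def find_object_ids_by_dependencies(self, paths: Iterable[str]) -> Set[str]:
        wanted = set(paths)
        return {oid for oid, deps in self._deps.items() if deps & wanted}


def index_with_cascade(
    paths: List[str],
    store: InMemoryIndexingStore,
    extract: Callable[[str], Set[str]],
    enabled: bool = True,
) -> List[str]:
    """Return the paths sent to the search engine, in order."""
    sent: List[str] = []
    # Step 2: the main indexing pass runs first.
    for path in paths:
        sent.append(path)
        if enabled:
            store.save(path, extract(path))
    if not enabled:
        return sent  # flag off: no lookup and no second pass
    # Step 3: find documents whose stored dependencies mention an updated path.
    dependents = sorted(store.find_object_ids_by_dependencies(paths) - set(paths))
    # Step 4: second pass re-indexes the dependents and refreshes their dependency sets.
    for path in dependents:
        sent.append(path)
        store.save(path, extract(path))
    return sent


HEADER = "/content/example/shared/header"
PAGES: Dict[str, Set[str]] = {  # made-up content tree: page -> referenced paths
    HEADER: set(),
    "/content/example/en/home": {HEADER},
    "/content/example/en/about": {HEADER, "/content/example/en/home"},
    "/content/example/en/contact": set(),
}

store = InMemoryIndexingStore()
for page, page_deps in PAGES.items():  # as if a full reindex had populated the store
    store.save(page, page_deps)

sent = index_with_cascade([HEADER], store, PAGES.__getitem__)
sent_disabled = index_with_cascade([HEADER], store, PAGES.__getitem__, enabled=False)
```

Updating the header sends the header first, then the `about` and `home` pages that reference it; the `contact` page is untouched. With `enabled=False`, only the header is sent.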

### Configuration

A single property controls both the persistence of dependency links **and** the cascade behavior:

| Property | Default | Description |
|---|---|---|
| `dumont.dependencies.enabled` | `false` *(shipped in `application.yaml`)* | Persist `/content/*` dependencies on each indexing record and trigger cascade re-indexing on standalone operations |
- New/updated indexing records are saved **without** a dependency set (the join table stays empty for those rows)
- `DependencyHandler.processDependencies()` returns early and no cascade re-indexing happens

After turning the flag on, previously indexed content still has no stored dependencies. Run a **Reindex All** to populate the dependency table for existing records.
:::

:::tip Performance impact
Every standalone index operation performs an extra lookup plus a second indexing pass for any dependents. On sources with heavily shared components (templates, headers/footers, fragments), a single page update can fan out into many re-index operations. Budget accordingly, or leave the flag off and rely on scheduled full reindexing.
:::
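
One rough way to gauge that fan-out is to count how many documents reference each path. The sketch below assumes the stored dependency sets have already been loaded into a plain dictionary; the helper name `fan_out_by_path` and all sample paths are hypothetical, and the connector itself is not known to expose such a report.

```python
from collections import Counter
from typing import Dict, Set


def fan_out_by_path(dependencies: Dict[str, Set[str]]) -> Counter:
    """Count, for each referenced path, how many documents depend on it."""
    counts: Counter = Counter()
    for deps in dependencies.values():
        counts.update(set(deps))
    return counts


# Made-up dependency sets: document -> paths it references.
stored = {
    "/content/example/en/home": {
        "/content/example/shared/header",
        "/content/example/shared/footer",
    },
    "/content/example/en/about": {"/content/example/shared/header"},
    "/content/example/en/contact": {
        "/content/example/shared/header",
        "/content/example/en/about",
    },
}

fan_out = fan_out_by_path(stored)
hottest = fan_out.most_common(1)[0]
```

Here an update to the shared header would trigger three dependent re-index operations, while the footer would trigger one.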

---

<div className="page-break" />

## AEM Server-Side Bundle (Event Listeners)

The `aem-server` module is an **OSGi bundle installed inside AEM**. It provides event listeners that automatically notify the Dumont connector when content changes.