Antalya 26.1: Remote initiator improvements #1577

Open

ianton-ru wants to merge 10 commits into antalya-26.1 from feature/antalya-26.1/remote_initiator_improvements

Conversation

@ianton-ru

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Various improvements for the remote initiator

Documentation entry for user-facing changes

With the remote initiator feature, queries like

SELECT * FROM iceberg(...) SETTINGS object_storage_cluster='swarm', object_storage_remote_initiator=1

are rewritten as

SELECT * FROM remote('remote_host', icebergCluster('swarm', ...))

where 'remote_host' is a random host from the 'swarm' cluster.
See #756

This PR introduces the following improvements:

  1. Partially fixes object_storage_remote_initiator auth works incorrectly #1570: the username and password are used when access to the cluster requires them. An exception is thrown if the cluster uses a shared secret; this should be addressed in future PRs.
  2. Fixes object_storage_remote_initiator with different cluster name #1571: the new setting object_storage_remote_initiator_cluster allows choosing remote_host from a different cluster, not only from the swarm cluster.
  3. The remote query did not work with additional settings inside the function, e.g. remote('remote_host', iceberg(..., SETTINGS iceberg_metadata_file_path='path/to/metadata.json')). It now works correctly.
  4. In a query like remote('remote_host', icebergCluster('remote_cluster', ...)), the cluster remote_cluster may be defined only on remote_host and be unknown on the current initiator host. The early-stage cluster check has been removed, which allows executing such queries.
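As an illustrative sketch of improvement 2 (the cluster name 'initiators' here is hypothetical, not from the PR), the remote initiator can now be picked from a cluster other than the one executing the object storage work:

```sql
-- Pick the remote initiator from the hypothetical 'initiators' cluster,
-- while the object storage scan still runs on the 'swarm' cluster.
SELECT * FROM iceberg(...)
SETTINGS
    object_storage_cluster = 'swarm',
    object_storage_remote_initiator = 1,
    object_storage_remote_initiator_cluster = 'initiators'
```

Here the query is forwarded to a random host of 'initiators', which then runs icebergCluster('swarm', ...) across the swarm.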

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

@github-actions

github-actions bot commented Mar 25, 2026

Workflow [PR], commit [a5eee1d]

@ianton-ru
Author

@codex review

@ianton-ru
Author

Audit: PR #1577 — Antalya 26.1: Remote initiator improvements

Source: Altinity/ClickHouse#1577
Base: antalya-26.1
Reviewed revision: branch including commit d7c4beebbdb (Remove unused header) and prior feature commits.


Confirmed defects

None. No confirmed defects in reviewed scope.

(Earlier revision briefly introduced an unused #include <Common/logger_useful.h> in QueryAnalyzer.cpp; commit d7c4beebbdb removes it.)


Coverage summary

  • Scope reviewed: FunctionNode SETTINGS handling (QueryTreeBuilder, FunctionNode hash / clone / equal / toASTImpl), QueryAnalyzer::resolveTableFunctionTableFunctionNode bridge, StorageDistributed::buildQueryTreeDistributed, IStorageCluster::{read, convertToRemote} (initiator cluster, secret guard, URI decode, user/password remote), ITableFunctionCluster cluster-presence check removal, Settings / SettingsChangesHistory, integration tests under tests/integration/test_s3_cluster/.
  • Categories failed: None.
  • Categories passed: SETTINGS propagation for nested table functions; initiator cluster selection (object_storage_remote_initiator_cluster); host decode and auth branches; fail-closed on cluster secret (NOT_IMPLEMENTED); deferred cluster resolution for remote-only definitions; recursion guard via object_storage_remote_initiator = false on copied context; SettingsChangesHistory updates; integration scenarios (secret error, split initiator vs object-storage cluster).
  • Assumptions / limits: Pre-existing static pcg64 RNG in convertToRemote may be a concurrency concern under parallel queries; not introduced by this PR. Poco::URI::decode on full Address::toString() not exhaustively validated for every exotic address encoding; aligned with escapeForFileName / dotted-host tests.

Audit metadata

  • Method: Static review per workspace audit-review standard (call graph, transitions, fault categories, concurrency and C++ hazard pass).
  • Output rule: Confirmed defects only; hypotheticals folded into assumptions/limits.

AI-generated audit note.


@chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a0d1972000


Comment on lines +354 to +358
auto remote_initiator_cluster_name = settings[Setting::object_storage_remote_initiator_cluster].value;
if (remote_initiator_cluster_name.empty())
remote_initiator_cluster_name = cluster_name_from_settings;
auto remote_initiator_cluster = getClusterImpl(context, remote_initiator_cluster_name);
auto storage_and_context = convertToRemote(remote_initiator_cluster, context, remote_initiator_cluster_name, query_to_send);

P2 Badge Strip initiator-only cluster setting before forwarding query

When object_storage_remote_initiator_cluster is set, the query forwarded via remote(...) still carries that setting even though convertToRemote only clears object_storage_remote_initiator. This makes remote execution depend on remote nodes understanding a setting that is only needed on the initiator; in mixed-version/rolling-upgrade clusters, older remote hosts can fail with unknown-setting errors before execution. The forwarded AST settings should drop object_storage_remote_initiator_cluster together with object_storage_remote_initiator.


Collaborator


@ianton-ru does it make sense? it looks like it does

Author


`object_storage_remote_initiator_cluster` does nothing without `object_storage_remote_initiator`. It makes sense to remove it anyway, just to keep less garbage in the sub-query.

Prewhere filter
Prewhere filter column: less(multiply(2, b), 100)
Filter column: and(indexHint(greater(plus(i, 40), 0)), equals(a, 0)) (removed)
Filter column: and(equals(a, 0), indexHint(greater(plus(i, 40), 0))) (removed)
Author


Argument order depends on the hash, and the hash was changed (see FunctionNode::updateTreeHashImpl).

@svb-alt requested a review from arthurpassos March 26, 2026 12:33
if (settings[Setting::object_storage_remote_initiator])
{
auto storage_and_context = convertToRemote(cluster, context, cluster_name_from_settings, query_to_send);
auto remote_initiator_cluster_name = settings[Setting::object_storage_remote_initiator_cluster].value;
Collaborator


Please add a comment explaining what this code block does. It took me a while to understand it by just reading the code.

I suggest something like:

/// In case the current node is not supposed to initiate the clustered query
/// Sends this query to a remote initiator using the `remote` table function
if (settings[Setting::object_storage_remote_initiator])
{
      /// Re-writes queries in the form of:
      /// Input: SELECT * FROM iceberg(...) SETTINGS object_storage_cluster='swarm', object_storage_remote_initiator=1
     /// Output: SELECT * FROM remote('remote_host', icebergCluster('swarm', ...)
     /// Where `remote_host` is a random host from the cluster which will execute the query
     /// This means the initiator node belongs to the same cluster that will execute the query
     /// In case remote_initiator_cluster_name is set, the initiator might be set to a different cluster
}

if (shard_addresses.size() != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Size of shard {} in cluster {} is not equal 1", shard_num, cluster_name_from_settings);
auto host_name = shard_addresses[0].toString();
std::string host_name;
Collaborator


why?

Author


The address here is in encoded format, foo%2Ebar instead of foo.bar. This wasn't caught in tests before I added a host with a dot in its name.

auto remote_query = makeASTFunction(remote_function_name, make_intrusive<ASTLiteral>(host_name), table_expression->table_function);
boost::intrusive_ptr<ASTFunction> remote_query;

if (shard_addresses[0].user_specified)
Collaborator


A comment please

throw Exception(ErrorCodes::CLUSTER_DOESNT_EXIST, "Requested cluster '{}' not found", cluster_name);
/// Remove check cluster existing here
/// In query like
/// remote('remote_host', xxxCluster('remote_cluster', ...))
Collaborator


What if the query is not remote? Can't we check for that?

/// If cluster not exists, query falls later

Where and with which exception? It would be good to avoid any network calls before failing

Author


The cluster name is not a network node name, it's an internal ClickHouse name. The query fails later, when it tries to get hosts from the cluster. Network calls can't be made without hosts.

But it's hard to tell here whether the cluster function is inside 'remote' or not.


@arthurpassos
Collaborator

The changes look ok, but I think it needs more documentation. I also wonder if we can keep the 'cluster exists' check by wrapping it in an `if (!is_remote)`.

arthurpassos previously approved these changes Mar 27, 2026
Collaborator

@arthurpassos left a comment


LGTM

@alsugiliazova
Member

Audit Report: PR #1577 — Remote initiator improvements

Scope: Altinity/ClickHouse PR #1577

AI audit note: This review comment was generated by AI (gpt-5.3-codex).

Confirmed defects

Medium: Forwarded query does not strip initiator-only setting

  • Impact: In mixed-version/rolling-upgrade clusters, the rewritten remote query can fail on the remote initiator with an unknown-setting error before execution, breaking object_storage_remote_initiator flow.
  • Anchor: src/Storages/IStorageCluster.cpp / IStorageCluster::convertToRemote
  • Trigger: Query uses object_storage_remote_initiator=1 and object_storage_remote_initiator_cluster='...', and selected remote initiator runs a version that does not know object_storage_remote_initiator_cluster.
  • Why defect: convertToRemote removes object_storage_remote_initiator from AST query settings, but does not remove object_storage_remote_initiator_cluster; this setting is initiator-only and is still forwarded to remote SQL.
  • Fix direction (short): Remove object_storage_remote_initiator_cluster from settings_ast.changes together with object_storage_remote_initiator, then drop SETTINGS clause if empty.
  • Regression test direction (short): Add test asserting rewritten forwarded SQL does not contain either initiator-only setting after convertToRemote.

Coverage summary

  • Scope reviewed: IStorageCluster remote initiator rewrite path; analyzer propagation of table-function SETTINGS (FunctionNode, QueryTreeBuilder, QueryAnalyzer, StorageDistributed); ITableFunctionCluster cluster existence deferral; new settings/history wiring; integration tests under tests/integration/test_s3_cluster.
  • Categories failed: Setting-forwarding compatibility contract (initiator-only setting leak to remote SQL).
  • Categories passed: Initiator host selection and URI decode path; user/password remote auth propagation; fail-closed behavior for secret-based clusters (NOT_IMPLEMENTED); nested table-function SETTINGS preservation in analyzer/query-tree conversion; deferred cluster lookup behavior; settings metadata/history registration; added integration scenarios for auth/secret/initiator-cluster split.
  • Assumptions/limits: Static audit only; no runtime mixed-version cluster execution performed in this review.

@alsugiliazova
Member

PR #1577 CI Verification Report

CI Results Overview

  • Success: ~55
  • Failure: 8 (see analysis below)
  • Skipped: ~39 (excluded sanitizer suites)

PR's New Test Validation

The PR adds new integration tests for test_object_storage_remote_initiator in test_s3_cluster/test.py (+123 lines). These tests initially failed on the Mar 27 CI run but were fixed by subsequent commits (Fix test df2595a, Fix setting cleanup a5eee1d).

Latest run (Mar 30):

  • Integration tests (amd_binary, 5/5): test_object_storage_remote_initiator: OK
  • Integration tests (amd_asan, db disk, old analyzer, 2/6): test_object_storage_remote_initiator: OK
  • Integration tests (arm_binary, distributed plan, 2/4): test_object_storage_remote_initiator: OK
  • Integration tests (amd_asan, targeted): test_object_storage_remote_initiator[1-10] through [10-10] (10 parametrized): all OK

All 13 test executions passed on the latest CI run across amd_binary, amd_asan, and arm_binary configurations.

CI Failures

1. test_object_storage_remote_initiator (Mar 27 run) — Fixed by PR Commits

Jobs: Integration tests (amd_binary 2/5, amd_asan 4/6, arm_binary 2/4)

Failed on Mar 27 run, passed on all subsequent runs after fix commits df2595a and a5eee1d.

Related to PR: Yes — Development-stage failures, resolved in final commits

2. 01625_constraints_index_append (Fast test, Mar 25) — Fixed by PR Commits

Job: Fast test (initial run only)

The PR modifies the reference file for this test. Failed once on the earliest commit (Mar 25), then passed consistently in all 24+ subsequent runs across all stateless test configurations.

Related to PR: Yes — Reference file update, resolved in subsequent commits

3. BuzzHouse (amd_debug, arm_asan) — Known Flaky Fuzzer

Server crash during random SQL fuzzing. BuzzHouse is a known flaky fuzzer across the CI.

Related to PR: No — Known flaky fuzzer unrelated to remote initiator changes

4. test_storage_hudi (3 tests) — Pre-existing Branch Failure

Job: Integration tests (amd_binary, 4/5)

test_single_hudi_file, test_multiple_hudi_files, test_types — All 3 Hudi tests fail. Database analysis shows these fail on multiple PRs (#1577, #1581, #1594, #1568) and master (PR=0), confirming pre-existing breakage.

Related to PR: No — Pre-existing Hudi test failure across the branch

5. test_backup_to_s3_different_credentials[...-non_native_multipart] — Flaky

Job: Integration tests (amd_binary, 5/5)

1 failure out of 72 total runs for this test on PR #1577. Passed in all other configurations (amd_asan, arm_binary) and in prior runs.

Related to PR: No — Intermittent flaky test

6. 01171_mv_select_insert_isolation_long — Known Flaky

Job: Stateless tests (arm_asan, targeted) — 3 failures

Failed on all 3 targeted reruns. This is a long-running MV isolation test known to be unstable on arm_asan.

Related to PR: No — Pre-existing flaky test

7. test_move_after_processing[another_bucket-AzureQueue] — Unrelated

Job: Integration tests (arm_binary, distributed plan, 3/4)

Azure Queue storage processing test, completely unrelated to remote initiator or Iceberg changes.

Related to PR: No — Azure Queue test

8. Stateless tests (ParallelReplicas, s3 storage) — Intermittent Flaky

01038_dictionary_lifetime_min_zero_sec and 04003_cast_nullable_read_in_order_explain — 1 failure each in ParallelReplicas mode.

Related to PR: No — Intermittent ParallelReplicas flakiness

9. GrypeScan (-alpine) — CVE in Base Image

CVE in Alpine base image (altinityinfra/clickhouse-server:1577-26.1.6.20001.altinityantalya-alpine). Non-alpine image scan passed.

Related to PR: No — Base image vulnerability

Regression Test Results (PR's Internal CI)

  • Iceberg (1): x86_64: Fail (3h30m timeout); aarch64: Pass
  • Iceberg (2): x86_64: Pass; aarch64: Pass
  • Parquet: x86_64: Pass; aarch64: Pass
  • Parquet (aws_s3): x86_64: Pass; aarch64: Pass
  • Parquet (minio): x86_64: Pass; aarch64: Pass
  • S3 Export (part): x86_64: Pass; aarch64: Pass
  • S3 Export (partition): x86_64: Pass; aarch64: Pass
  • Swarms: x86_64: Fail; aarch64: Fail

Regression Failure: Iceberg (1) x86_64 — Pre-existing Timeout

The Iceberg 1 suite hit the 3h30m job timeout on x86_64. Database analysis (30-day window) shows this is a pre-existing issue:

  • 26.1.6.20001.altinityantalya x86_64: 2 Fail / 11 OK (~15% fail rate)
  • 26.1.4.20001.altinityantalya x86_64: 7 Fail / 51 OK (~12% fail rate)
  • 26.1.3.20001.altinityantalya x86_64: 66 Fail / 18 OK (~79% fail rate)
  • Iceberg 1 aarch64 passed. Iceberg 2 passed on both architectures.

Assessment: Flaky (pre-existing) — Intermittent timeout affecting the Iceberg 1 suite, not specific to this PR.

Regression Failure: Swarms — Pre-existing Node Failure Instability

Two failing scenarios on both x86_64 and aarch64:

/swarms/feature/node failure/initiator out of disk space (Fail)

Database check for: /swarms/feature/node failure/initiator out of disk space

- History: 52/113 fails in last 7 days, 202/423 fails in last 30 days
- Last fail: 2026-03-30
- Last pass: 2026-03-30
- Concentration: All versions affected — 26.1.6 (73%), 26.1.4 (28%), 26.1.3 (78%), 25.8.16 (5%), 25.8.14 (57%)
- Error signature: Consistent — UNKNOWN_DATABASE exception (Code: 81)

Assessment: Flaky (pre-existing)
Recommendation: Known unstable test, no action needed for PR verification

/swarms/feature/node failure/check restart clickhouse on swarm node (Error — 600s timeout)

Database check for: /swarms/feature/node failure/check restart clickhouse on swarm node

- History: 266 Fail + 60 Error / 97 OK in last 30 days (~77% failure rate)
- Last fail: 2026-03-26
- Last pass: 2026-03-30
- Concentration: All 26.1.x versions show 0% pass rate; 25.8.x has partial passes
- Error signature: Consistent — ExpectTimeoutError 600s

Assessment: Flaky (pre-existing)
Recommendation: Known unstable test, no action needed for PR verification

Both Swarms failures are confirmed as pre-existing instability by previous verification reports (PR #1575, PR #1583).

Verdict: Ready to merge after audit review — No unresolved PR-related failures.
