Backport mixed-version test infrastructure to release-13.2#8496

Merged
eaydingol merged 9 commits into release-13.2 from eaydingol/release-13-tests on Mar 9, 2026
@eaydingol
Collaborator

Backport mixed-version test infrastructure to release-13.2

This PR cherry-picks a series of commits from main that introduce mixed-version testing support and related fixes to the release-13.2 branch. These changes enable running regression tests against different Citus versions (N/N-1 scenarios), which is essential for validating backward compatibility during phased upgrades.

Cherry-picked commits

  1. Relax version check (#8356) c125b9bec563
    Relaxes the Citus version compatibility check to allow the installed extension to differ by at most one minor version from the loaded library. This enables replicas to lag by a single minor version during phased upgrades without triggering a version mismatch for distributed queries.

  2. Workflow refactor (#8361) b8ab8aac554c
    Refactors .github/workflows/build_and_test.yml to use a shared reusable workflow (run_tests.yml) for the main, failure, and CDC test suites. Introduces the CITUSVERSION environment variable so tests can create the Citus extension at a specific version (e.g., CITUSVERSION=13.2-1 make check).

  3. Test refactor for columnar separation (#8420) b7a5f698ab77
    Minor test adjustments to prepare for columnar module separation, adding necessary setup/teardown steps in several test files.

  4. Alternative lib directory for tests (#8390) 2e64eb020268
    Adds support for loading citus.so and columnar.so from an alternative directory during tests via the CITUSLIBDIR environment variable (e.g., CITUSLIBDIR=/path/to/v13.2.0 make check-minimal).

  5. Store/restore exit status (#8436) 1c5b304feeef
    Fixes regression test exit status being lost after test cleanup steps, ensuring the correct exit code is propagated.

  6. Mixed version tests for citus (#8431) 08a27ddc0579
    The core mixed-version testing infrastructure. Extends pg_regress_multi.pl to support:

    • Creating the Citus extension at a specific version for coordinator and/or workers
    • Swapping the Citus shared library for specific nodes
    • Running N/N-1 compatibility tests in multiple configurations (SQL N-1, Lib N-1, Worker N-1, Coordinator N-1)

    Separates tests that drop/create the Citus extension into a new multi_1_create_citus_schedule to allow safe reuse of remaining multi-node tests in N-1 scenarios.

  7. Fix IsTenantSchema when version checks are disabled (#8480) 546f20661a3b
    Fixes IsTenantSchema() incorrectly returning false when citus.enable_version_checks was off. The previous guard completely disabled schema-based sharding during mixed-version testing, causing 5 test failures. The fix replaces the guard with a CheckCitusVersion() call, which returns true unconditionally when version checks are disabled.

Conflicts resolved

  • build_and_test.yml (commits 2 and 6): Resolved by accepting the incoming reusable workflow pattern and the new check-multi-1-create-citus / check-tap make targets.
  • Makefile (commit 6): Resolved by accepting the new check-multi-1-create-citus and check-tap targets in check-full.

Files changed

29 files changed, +699 / -210 lines across CI workflows, test infrastructure (pg_regress_multi.pl, Makefile), core version checking logic (metadata_cache.c, columnar_tableam.c), tenant schema handling, and regression test SQL/expected output files.

Relaxes the Citus version compatibility check for asynchronous replica
upgrades.

Previously, Citus required the loaded shared library and the installed
extension to have identical major and minor versions.
This change keeps the library vs. control-file version check the same
(a restart is still needed after upgrading the library), but allows the
installed extension to differ by at most one minor version from the
loaded library.

With this relaxation in place, replicas can lag by a single minor
version during phased upgrades without triggering a version mismatch for
distributed queries.

The change was tested on the release-13.2 branch because main already
points to the next major version. The test uses the versioned test
capability from #8361.

See the test branch
https://github.com/citusdata/citus/tree/eag/release-13.2/relax-test ,
where the .so and SQL versions are 13.2-1.

```
# Create extension fails
CITUSVERSION=13.0-1 citus_tests/run_test.py adaptive_executor
....
CREATE DATABASE
ERROR:  specified version incompatible with loaded Citus library
DETAIL:  Loaded library requires 13.2, but 13.0-1 was specified.
HINT:  If a newer library is present, restart the database and try the command again.
....

# Test proceeds as expected
CITUSVERSION=13.1-1 citus_tests/run_test.py adaptive_executor
CREATE DATABASE
CREATE EXTENSION
CREATE FUNCTION

# Test proceeds as expected
CITUSVERSION=13.2-1 citus_tests/run_test.py adaptive_executor

CREATE DATABASE
CREATE EXTENSION
CREATE FUNCTION
```
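
The relaxed rule and the three runs in the transcript above can be modeled in a few lines. This is a minimal Python sketch, not the C implementation: the helper names `parse_major_minor` and `extension_compatible` are hypothetical, and treating the one-minor-version tolerance as symmetric (extension older or newer) is an assumption based on the wording "differ by at most one minor version".

```python
import re


def parse_major_minor(version: str) -> tuple:
    """Extract (major, minor) from strings such as '13.2' or '13.2-1'."""
    match = re.match(r"^(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def extension_compatible(library_version: str, extension_version: str) -> bool:
    """Assumed rule: same major version, minor versions at most one apart."""
    lib_major, lib_minor = parse_major_minor(library_version)
    ext_major, ext_minor = parse_major_minor(extension_version)
    return lib_major == ext_major and abs(lib_minor - ext_minor) <= 1


if __name__ == "__main__":
    # The loaded library in the transcript reports version 13.2.
    for requested in ("13.0-1", "13.1-1", "13.2-1"):
        print(requested, extension_compatible("13.2", requested))
```

Against a 13.2 library, the sketch rejects 13.0-1 and accepts 13.1-1 and 13.2-1, which matches the three runs shown in the transcript.
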
This PR refactors the workflow runs and adds support for creating the
Citus extension at a given version in the test suite.

- Refactor `.github/workflows/build_and_test.yml` to run the shared
`run_tests.yml` workflow for the main, failure, and CDC tests.
- Introduce the `CITUSVERSION` variable in the Makefile so tests can
create Citus at a specific version. When a version is specified, Citus
version checks are disabled. For example, run
`CITUSVERSION=13.2-1 make check` or
`CITUSVERSION=13.2-1 citus_tests/run_test.py citus_local_dist_joins`
locally.
- See 6ce8cec for a task that runs tests against version 13.2-1.
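
How a harness might turn `CITUSVERSION` into setup statements can be sketched as follows. This is an illustration only: the function `extension_setup_sql` is hypothetical, and disabling checks with a `SET` statement is an assumption (the real test runner may configure `citus.enable_version_checks` differently). `CREATE EXTENSION ... VERSION` is standard PostgreSQL syntax.

```python
import os


def extension_setup_sql(env=None):
    """Return the SQL a test harness could run to create the citus extension.

    With CITUSVERSION unset, the default version is created. With it set,
    version checks are turned off first (assumed mechanism) and the
    requested version is created explicitly.
    """
    env = os.environ if env is None else env
    version = env.get("CITUSVERSION", "").strip()
    if not version:
        return ["CREATE EXTENSION citus;"]
    if "'" in version:
        # Refuse values that would break out of the SQL string literal.
        raise ValueError(f"invalid CITUSVERSION: {version!r}")
    return [
        "SET citus.enable_version_checks TO off;",
        f"CREATE EXTENSION citus VERSION '{version}';",
    ]


if __name__ == "__main__":
    for statement in extension_setup_sql({"CITUSVERSION": "13.2-1"}):
        print(statement)
```
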
Test refactor for columnar separation
Adds support for loading the citus and columnar .so files from a
different directory during tests.

Usage:
- `CITUSLIBDIR=/tmp/opt/citus_versions/v13.2.1 make check-minimal`
- `CITUSLIBDIR=/tmp/opt/citus_versions/v13.2.1 citus_tests/run_test.py columnar_create`
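
Before pointing `CITUSLIBDIR` at a directory, it can help to confirm that both libraries are actually there. The following is a small pre-flight sketch, assuming the directory must contain files named `citus.so` and `columnar.so`; the helpers `missing_libraries` and `env_with_lib_dir` are hypothetical and are not part of the test runner.

```python
import os
from pathlib import Path

# Library file names taken from the PR description.
REQUIRED_LIBRARIES = ("citus.so", "columnar.so")


def missing_libraries(lib_dir):
    """Return the required library files that are absent from lib_dir."""
    directory = Path(lib_dir).expanduser()
    return [name for name in REQUIRED_LIBRARIES if not (directory / name).is_file()]


def env_with_lib_dir(lib_dir, base_env=None):
    """Copy the environment and set CITUSLIBDIR, failing early if files are missing."""
    missing = missing_libraries(lib_dir)
    if missing:
        raise FileNotFoundError(f"{lib_dir} is missing: {', '.join(missing)}")
    env = dict(os.environ if base_env is None else base_env)
    env["CITUSLIBDIR"] = str(Path(lib_dir).expanduser())
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run(["make", "check-minimal"], ...)` so the check happens before a long test run starts.
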


The sample job removed from build_and_test.yml in the last commit illustrates the use case. Similar tests will be added in follow-up PRs
(using a dev image from
citusdata/the-process#180).
The sample generates the following test job; see its logs in the Checks tab:
[Build & Test / Test Citus Lib N-1 / PG17 - check-minimal - lib-v13.2.0
(pull_request)]
Restore the regression test exit status after the test clean-up.
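
The commit message does not show how the status is preserved, so the following is only a sketch of the general store/restore pattern, not the actual change. The function `run_with_cleanup` and both command arguments are hypothetical.

```python
import subprocess
import sys


def run_with_cleanup(test_cmd, cleanup_cmd):
    """Run the tests, then the clean-up, and return the tests' exit status."""
    # Store the status of the test run before anything else can overwrite it.
    test_status = subprocess.run(test_cmd).returncode
    # The clean-up's own exit status is deliberately ignored here.
    subprocess.run(cleanup_cmd)
    # Restore: report the status of the tests, not of the clean-up.
    return test_status


if __name__ == "__main__":
    failing_tests = [sys.executable, "-c", "import sys; sys.exit(3)"]
    cleanup = [sys.executable, "-c", "pass"]
    sys.exit(run_with_cleanup(failing_tests, cleanup))
```

Without the stored value, a successful clean-up step would mask a failing test run, which is the problem the commit describes.
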
Adds support for running regression tests in mixed‑version scenarios

This PR updates the regression test runner script, test schedules, and
some regression tests to support running Citus in mixed‑version setups.
With these changes, regression test schedules can be executed in **N/N‑1
and mixed‑version scenarios**, including:

- SQL version is N‑1
- Library version is N‑1
- A worker is N‑1
- Coordinator is N‑1

For potential usage, see the workflow jobs removed in commit 0642595
and the corresponding test runs at
https://github.com/citusdata/citus/actions/runs/20746079664.

**Details**

**Schedule refactoring**
To enable N‑1 testing for the remaining multi‑node tests, this PR
moves tests that drop/create the Citus extension out of the existing
multi_1_schedule and into a new schedule, `multi_1_create_citus_schedule`.

This allows the remaining multi‑node tests to be reused safely in N‑1
scenarios.
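
The split described above can be pictured as partitioning the `test:` lines of a pg_regress schedule file. This is a minimal sketch, assuming the usual `test: name1 name2` line format; the function `split_schedule` and the test names in the example are hypothetical, and the PR performed the split by editing the schedule files directly.

```python
def split_schedule(lines, extension_tests):
    """Partition schedule lines into (kept, moved).

    Tests named in extension_tests (those that drop/create the extension)
    go to the new schedule; everything else stays in the original one.
    """
    kept, moved = [], []
    for line in lines:
        stripped = line.strip()
        if not stripped.startswith("test:"):
            kept.append(line)  # comments and blank lines stay where they are
            continue
        names = stripped[len("test:"):].split()
        staying = [name for name in names if name not in extension_tests]
        moving = [name for name in names if name in extension_tests]
        if staying:
            kept.append("test: " + " ".join(staying))
        if moving:
            moved.append("test: " + " ".join(moving))
    return kept, moved


if __name__ == "__main__":
    schedule = ["# hypothetical schedule", "test: plain_a drops_ext plain_b"]
    kept, moved = split_schedule(schedule, {"drops_ext"})
    print(kept)
    print(moved)
```
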

**Test changes**
- Cleanup steps after tests.
- Minor adjustments to account for the updated schedule structure.

**Perl script changes**
This PR extends src/test/regress/pg_regress_multi.pl to support:

- Creating the Citus extension at a specific version for the coordinator
(the original PR that introduced versioned extension creation, #8361,
updated only the worker logic and missed the coordinator)
- Changing the Citus library (citus.so) for specific nodes (worker,
coordinator, or all)
- Creating the Citus extension at a given version for specific nodes
(worker, coordinator, or all)

These changes enable running N/N‑1 compatibility tests in multiple
configurations. Sample scenarios are listed below.

**Sample scenarios**
To demonstrate potential usage, sample jobs were added to the GitHub
workflow but are excluded from this PR (see the commit and test runs
referenced earlier).

- **Test Citus Lib N‑1**: All nodes load citus.so from version 13.2.
- **Test Citus SQL N‑1**: All nodes create the Citus extension at version 13.2‑1.
- **Test Citus Worker N‑1**: Only worker 1 loads citus.so from 13.2 and creates the extension at 13.2‑1.
- **Test Citus Coordinator N‑1**: Only the coordinator loads citus.so from 13.2 and creates the extension at 13.2‑1.

Note that the following schedules from “Test Citus” are not included in
N‑1 scenarios:

- `check-multi-1-create-citus`, `check-multi-mx`: Some tests in these
schedules drop and recreate the Citus extension using the default
version, which is incompatible with N‑1 setups.
- `check-vanilla`: The test preparation steps have not yet been adapted
for N‑1 workflows.

**Local testing**
Local testing can be performed using sql/versions.sql and
test_versions.sh (removed in the previous commit).
Steps:

- Install Citus 13.2.
- Copy the citus.so files from 13.2.0 to ~/citus-libs/17/v13.2.0.
- Install the version from head.
- Run ./test_versions.sh.

See version_test_results.txt for example results.
Sample test runs:
- `CITUSLIBDIR=~/citus-libs/17/v13.2.0 CITUSVERSION=13.2-1 N1MODE=workeronly EXTRA_TESTS=versions make check-minimal`
- `CITUSVERSION=13.2-1 N1MODE=all EXTRA_TESTS=versions make check-minimal`
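
The sample runs above combine up to four environment variables with a make target. As a rough sketch of how a local driver script could assemble them, the hypothetical helper `n1_invocation` below builds the environment overrides and the command. Only the `N1MODE` values `workeronly` and `all` appear in this PR text, so no other values are assumed.

```python
def n1_invocation(target, lib_dir=None, version=None, n1_mode=None, extra_tests=None):
    """Return (env_overrides, argv) for one mixed-version test run.

    Variables left as None are omitted, so the test runner falls back to
    its defaults for them.
    """
    settings = {
        "CITUSLIBDIR": lib_dir,
        "CITUSVERSION": version,
        "N1MODE": n1_mode,
        "EXTRA_TESTS": extra_tests,
    }
    env_overrides = {key: value for key, value in settings.items() if value is not None}
    return env_overrides, ["make", target]


if __name__ == "__main__":
    # The two sample runs listed in the PR description.
    runs = [
        n1_invocation("check-minimal", lib_dir="~/citus-libs/17/v13.2.0",
                      version="13.2-1", n1_mode="workeronly", extra_tests="versions"),
        n1_invocation("check-minimal", version="13.2-1", n1_mode="all",
                      extra_tests="versions"),
    ]
    for env_overrides, argv in runs:
        prefix = " ".join(f"{key}={value}" for key, value in env_overrides.items())
        print(prefix, " ".join(argv))
```
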
…8480)

IsTenantSchema checks pg_dist_schema when the version check is disabled.

Previously, IsTenantSchema() had a guard that returned false whenever
citus.enable_version_checks was off. This was added to protect against
accessing pg_dist_schema in multi_extension tests that install old Citus
versions (pre-12.0) where the table doesn't exist.

However, the guard had an unintended side effect: it completely disabled
schema-based sharding during mixed-version testing
(CITUSVERSION/N1MODE), even when no actual version mismatch existed.
This caused 5 tests to fail (single_node, schema_based_sharding,
citus_schema_distribute_undistribute, citus_schema_move,
local_shard_utility_command_execution) because tables created in tenant
schemas were not being distributed.

The root cause was an asymmetry: CREATE SCHEMA correctly registered the
tenant schema in pg_dist_schema (ShouldUseSchemaBasedSharding has no
version check guard), but CREATE TABLE in that schema silently fell
through to a local table because IsTenantSchema returned false.

The fix replaces the guard with a call to CheckCitusVersion(). When version checks are disabled,
CheckCitusVersion() returns true unconditionally, which is safe because:
- pg_dist_schema exists in all supported Citus versions (>= 12.0)
- mixed-version tests never install versions old enough to lack it
- multi_extension (which installs pre-12.0 versions) is not run in
mixed-version scenarios
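
The behavioral difference between the old guard and the fix can be modeled outside of C. This is a simplified Python sketch, not the actual code: the set `tenant_schemas` stands in for a lookup in pg_dist_schema, and all three function names and their parameters are hypothetical.

```python
def check_citus_version(enable_version_checks, versions_match=True):
    """Model of CheckCitusVersion(): always True when checks are disabled."""
    if not enable_version_checks:
        return True
    return versions_match


def is_tenant_schema_old(schema, tenant_schemas, enable_version_checks):
    """Old behavior: the guard returns False whenever checks are off."""
    if not enable_version_checks:
        return False
    return schema in tenant_schemas


def is_tenant_schema_new(schema, tenant_schemas, enable_version_checks, versions_match=True):
    """Fixed behavior: bail out only if the version check itself fails."""
    if not check_citus_version(enable_version_checks, versions_match):
        return False
    return schema in tenant_schemas


if __name__ == "__main__":
    registered = {"tenant_1"}  # CREATE SCHEMA already registered the schema
    # With version checks off, the old guard hides the registered schema.
    print(is_tenant_schema_old("tenant_1", registered, enable_version_checks=False))
    print(is_tenant_schema_new("tenant_1", registered, enable_version_checks=False))
```

In the model, a schema that is already registered is reported as non-tenant by the old guard when checks are off, which mirrors the asymmetry that made CREATE TABLE fall through to a local table.
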
@codecov

codecov Bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 87.17949% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.90%. Comparing base (111a9ea) to head (8d78ca6).
⚠️ Report is 1 commits behind head on release-13.2.

Additional details and impacted files
@@               Coverage Diff                @@
##           release-13.2    #8496      +/-   ##
================================================
- Coverage         88.93%   88.90%   -0.03%     
================================================
  Files               287      287              
  Lines             63186    63220      +34     
  Branches           7950     7956       +6     
================================================
+ Hits              56194    56208      +14     
- Misses             4675     4683       +8     
- Partials           2317     2329      +12     

@eaydingol eaydingol merged commit 0bea31f into release-13.2 Mar 9, 2026
124 of 156 checks passed
@eaydingol eaydingol deleted the eaydingol/release-13-tests branch March 9, 2026 10:21