-
Notifications
You must be signed in to change notification settings - Fork 796
MAINT: Production schema migration guard + add deliberate migration script for release #2028
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
jsong468
wants to merge
5
commits into
microsoft:main
Choose a base branch
from
jsong468:azure_sql_migration_guard
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+523
−4
Open
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
1f09463
azuresql prod migration guard
jsong468 62aa3a3
warn instead of raise on schema mismatch
jsong468 9f72384
pr refactor and other changes
jsong468 5389ba9
Merge branch 'main' into azure_sql_migration_guard
jsong468 1f5f9b2
clarify release instructions
jsong468 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,169 @@ | ||
| # Copyright (c) Microsoft Corporation. | ||
| # Licensed under the MIT license. | ||
|
|
||
| """ | ||
| Deliberate schema migration tool for production databases. | ||
|
|
||
| This script is the ONLY sanctioned way to apply Alembic migrations to a production | ||
| database. It is intended to be run during the release process (see | ||
| doc/contributing/10_release_process.md) or by a CD pipeline — never by normal | ||
| application startup. | ||
|
|
||
| It constructs an AzureSQLMemory with skip_schema_migration=True (bypassing the | ||
| runtime guard), then explicitly calls _run_schema_migration to upgrade to head. | ||
| The environment checks ensure this only runs from a release branch. | ||
|
|
||
| Safety rails: | ||
| - Validates the environment (release branch, clean working tree, no .dev version). | ||
| - Interactive confirmation when running in a terminal. | ||
| - Exits non-zero on any failure. | ||
|
|
||
| Usage: | ||
| python build_scripts/migrate_prod_memory_schema.py | ||
|
|
||
| The script reads the production connection string from | ||
| AZURE_SQL_DB_CONNECTION_STRING_PROD (loaded from ~/.pyrit/.env). | ||
| """ | ||
|
|
||
| import subprocess | ||
| import sys | ||
|
|
||
| import dotenv | ||
|
|
||
| from pyrit.common.path import CONFIGURATION_DIRECTORY_PATH | ||
|
|
||
| # Load .env files from ~/.pyrit/ (same files that initialize_pyrit_async loads) | ||
| # Use override=False so explicitly-set env vars take precedence over .env values | ||
| for _env_file in [CONFIGURATION_DIRECTORY_PATH / ".env", CONFIGURATION_DIRECTORY_PATH / ".env.local"]: | ||
| if _env_file.exists(): | ||
| dotenv.load_dotenv(_env_file, override=False, interpolate=True) | ||
|
|
||
| # ANSI color codes | ||
| _GREEN = "\033[92m" | ||
| _RED = "\033[91m" | ||
| _YELLOW = "\033[93m" | ||
| _RESET = "\033[0m" | ||
|
|
||
|
|
||
| def _print_error(message: str) -> None: | ||
| """Print an error message in red to stderr.""" | ||
| print(f"{_RED}ERROR: {message}{_RESET}", file=sys.stderr) | ||
|
|
||
|
|
||
| def _print_success(message: str) -> None: | ||
| """Print a success message in green.""" | ||
| print(f"{_GREEN}{message}{_RESET}") | ||
|
|
||
|
|
||
| def _check_release_environment() -> list[str]: | ||
| """ | ||
| Validate that the script is running in a proper release environment. | ||
|
|
||
| Returns a list of warning/error messages. Empty list means all checks pass. | ||
| """ | ||
| issues: list[str] = [] | ||
|
|
||
| try: | ||
| branch = subprocess.check_output( | ||
| ["git", "rev-parse", "--abbrev-ref", "HEAD"], | ||
| text=True, | ||
| stderr=subprocess.DEVNULL, | ||
| ).strip() | ||
| if not branch.startswith("releases/"): | ||
| issues.append( | ||
| f"Not on a release branch (current: '{branch}'). " | ||
| "Production migrations should run from 'releases/vX.Y.Z'." | ||
| ) | ||
| except (subprocess.CalledProcessError, FileNotFoundError): | ||
| issues.append("Could not determine current Git branch.") | ||
|
|
||
| try: | ||
| dirty_files = subprocess.check_output( | ||
| ["git", "status", "--porcelain", "--", "pyrit/memory/"], | ||
| text=True, | ||
| stderr=subprocess.DEVNULL, | ||
| ).strip() | ||
| if dirty_files: | ||
| issues.append( | ||
| "Uncommitted changes detected in pyrit/memory/:\n" | ||
| f" {dirty_files}\n" | ||
| " Commit or stash changes before migrating production." | ||
| ) | ||
| except (subprocess.CalledProcessError, FileNotFoundError): | ||
| issues.append("Could not check Git working tree status.") | ||
|
|
||
| try: | ||
| from pyrit import __version__ | ||
|
|
||
| if ".dev" in __version__: | ||
| issues.append( | ||
| f"PyRIT version is '{__version__}' (development). " | ||
| "Production migrations should use a release version (no .dev suffix)." | ||
| ) | ||
| except ImportError: | ||
| issues.append("Could not determine PyRIT version.") | ||
|
|
||
| return issues | ||
|
|
||
|
|
||
| def main() -> int: | ||
| """Entry point for production schema migration.""" | ||
| import argparse | ||
|
|
||
| parser = argparse.ArgumentParser( | ||
| description="Apply Alembic schema migrations to the production database.", | ||
| ) | ||
| parser.add_argument( | ||
| "--skip-environment-check", | ||
| action="store_true", | ||
| help="Skip release environment checks (branch, clean tree, version). Use only in CI with caution.", | ||
| ) | ||
| args = parser.parse_args() | ||
|
|
||
| # Safety rail: Verify release environment | ||
| if not args.skip_environment_check: | ||
| issues = _check_release_environment() | ||
| if issues: | ||
| _print_error("Release environment checks failed:") | ||
| for issue in issues: | ||
| _print_error(f" - {issue}") | ||
| _print_error("Fix the above issues or pass --skip-environment-check (CI only).") | ||
| return 1 | ||
| else: | ||
| print(f"{_YELLOW}WARNING: Skipping release environment checks.{_RESET}") | ||
|
|
||
| # Interactive confirmation | ||
| if sys.stdin.isatty(): | ||
| print("About to migrate production database schema to head.") | ||
| response = input("Type 'yes' to proceed: ") | ||
| if response.strip().lower() != "yes": | ||
| print("Aborted.") | ||
| return 1 | ||
|
|
||
| # Construct AzureSQLMemory with skip_schema_migration=True to bypass the runtime guard, | ||
| # then explicitly run migration. | ||
| import os | ||
|
|
||
| from pyrit.memory import AzureSQLMemory | ||
|
|
||
| prod_conn = os.environ.get(AzureSQLMemory.AZURE_SQL_DB_CONNECTION_STRING_PROD) | ||
| if not prod_conn: | ||
| _print_error(f"Environment variable '{AzureSQLMemory.AZURE_SQL_DB_CONNECTION_STRING_PROD}' is not set.") | ||
| return 1 | ||
|
|
||
| try: | ||
| memory = AzureSQLMemory( | ||
| connection_string=prod_conn, | ||
| skip_schema_migration=True, | ||
| ) | ||
| print("Running schema migration...") | ||
| memory._run_schema_migration() | ||
| _print_success("Production schema migration completed and verified successfully.") | ||
| return 0 | ||
| except Exception as e: | ||
| _print_error(f"Migration failed: {e}") | ||
| return 1 | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| raise SystemExit(main()) | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if adding a 300 line script is a bit of overhead, with a lot of overlap with other migration code (i.e. the run_migrations functions there which upgrades a db given its engine.
I think this should be a thin wrapper that constructs an AzureSQLMemory with
skip=trueand the prod connection string, then calls its _run_schema_migration ... no?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, slimmed it down!