test(spanner): handle AlreadyExists idempotently on restore test retries#17174
Conversation
88b7fe6 to
19cb8be
Compare
There was a problem hiding this comment.
Code Review
This pull request enhances the Spanner backup and restore sample tests by adding handling for 'AlreadyExists' exceptions, allowing tests to verify existing resources instead of failing. A new helper function, '_verify_restored_database', was introduced to validate the state and encryption configuration of restored databases. Feedback from the review identifies a potential resource leak in 'test_create_database_with_retention_period' where the database is not dropped if successfully created, and suggests improving the helper function to verify all KMS keys instead of just the first one.
I am having trouble creating individual review comments. Click here to see my feedback.
packages/google-cloud-spanner/samples/samples/backup_sample_test.py (332-337)
The database.drop() call is currently nested within the except AlreadyExists block. This means the database will not be cleaned up if the creation succeeds on the first attempt, leading to a resource leak in the test environment. The cleanup should be moved outside the except block to ensure it runs in both the success and AlreadyExists scenarios.
except AlreadyExists:
# Verify version retention period options match
database.reload()
assert database.version_retention_period is not None
# Clean up
database.drop()
packages/google-cloud-spanner/samples/samples/backup_sample_test.py (62-63)
The current implementation only verifies the first KMS key when multiple keys are provided. This reduces test coverage compared to the original assertions which checked all keys. It is better to verify that all expected keys are present in the database's encryption configuration.
if kms_key_names:
assert len(db.encryption_config.kms_key_names) == len(kms_key_names)
for expected, actual in zip(kms_key_names, db.encryption_config.kms_key_names):
assert expected in actual
adccc36 to
dcc0af4
Compare
When a restore database test times out on the client-side, Spanner continues the restore operation in the background. When the test runner retries the test, the call would fail with AlreadyExists. This change catches the AlreadyExists exception, retrieves the database metadata, and asserts that it is restoring from the correct source backup and achieves a READY or READY_OPTIMIZING state. This allows the test to execute idempotently without namespace collisions.
dcc0af4 to
e959f9e
Compare
|
I don't quite understand this comment:
When would the test runner retry the test? Or do you mean when a new test run of all the integration tests are executed? If it is the latter, isn't this rather an indication that there is a problem with this test re-using the same database name when restoring a database? And that the actual fix would be to use a dynamically generated database name? (Note: I would also be open to just remove/skip these backup tests. I don't think they add much value, as the client code that is being tested is minimal.) |
|
For backup tests, we do test retry errors like below
when we take a |
When a restore database test times out on the client-side, Spanner continues the restore operation in the background. When the test runner retries the test, the call would fail with AlreadyExists.
This change catches the AlreadyExists exception, retrieves the database metadata, and asserts that it is restoring from the correct source backup and achieves a READY or READY_OPTIMIZING state. This allows the test to execute idempotently without namespace collisions.
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes #<issue_number_goes_here> 🦕