Skip to content

Onboard arcade-validation to OpenTelemetry Audit (OTel)#5500

Draft
haruna99 wants to merge 1 commit into
mainfrom
haruna/otel-migration
Draft

Onboard arcade-validation to OpenTelemetry Audit (OTel)#5500
haruna99 wants to merge 1 commit into
mainfrom
haruna/otel-migration

Conversation

@haruna99
Copy link
Copy Markdown

@haruna99 haruna99 commented Apr 9, 2026

Work Item: https://dev.azure.com/dnceng/internal/_workitems/edit/10352

Add OTel Audit SDK instrumentation for privileged operations per S360 Service Layer Auditing requirements

Changes:

  • Add OpenTelemetry.Audit.Geneva SDK (v2.5.2) with AzureGenevaMonitoring NuGet feed
  • Create AuditHelper.cs with LogControlPlane/LogDataPlane wrappers, auto-detected dual Service Tree IDs (GitHub + AzDO),
    GetLocalIpAddress helper, and Golden Schema field population
  • Create eng/validation/audit-logging.ps1 PowerShell module with typed helpers for channel promotion, branch ops, build invocation, and build retention
  • Instrument validation scripts: update-channel, test-publishing, build-arcadewithrepo, remove-oldbranches, validation-functions, create-baridtag
  • Instrument C# test code: signing certificate selection/override, strong name signing config, process execution, directory operations

Add OTel Audit SDK instrumentation for privileged operations per S360 Service Layer Auditing requirements
@haruna99 haruna99 self-assigned this Apr 9, 2026
Comment thread eng/validation/update-channel.ps1
@missymessa
Copy link
Copy Markdown
Member

I'd run this branch through https://dnceng.visualstudio.com/internal/_build?definitionId=282 to make sure it does what you expect it to.

string round0FilePath = Path.Combine(builder.TestRepoRoot, "artifacts", "tmp", "Release", "Signing", "Round0-Sign.proj");
string round0ProjectText = File.ReadAllText(round0FilePath);
string expectedCert = useDotNetCert.GetValueOrDefault() ? DotNetCertificate : MicrosoftCertificate;
AuditHelper.LogControlPlane(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to initialize AuditHelper?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we need to initialize AuditHelper. I will fix that

@haruna99 haruna99 marked this pull request as draft April 10, 2026 18:27
@davfost
Copy link
Copy Markdown

davfost commented Apr 14, 2026

File: eng/validation/update-channel.ps1 (line ~10)

There appears to be a wrong path for dot-sourcing the audit module. The file is loaded as:

   . $PSScriptRoot\..\common\audit-logging.ps1

But audit-logging.ps1 lives in eng/validation/, not eng/common/. Since update-channel.ps1 is also in
eng/validation/, shouldn't this be:

   . $PSScriptRoot\audit-logging.ps1

I think this would cause a script error at runtime when update-channel.ps1 runs, since the path
eng/common/audit-logging.ps1 doesn't exist.

@davfost
Copy link
Copy Markdown

davfost commented Apr 14, 2026

File: eng/validation/build-arcadewithrepo.ps1 (multiple Write-AuditLog-* call sites)

Several audit calls in the happy path are hardcoded as -Result "Success" without checking whether the preceding
operation actually succeeded. For example, Write-AuditLog-ChannelDeletion and Write-AuditLog-BranchOperation are
called unconditionally after & $darc delete-default-channel and Git-Command ... push origin --delete. If those
commands fail (and $ErrorActionPreference = 'Stop' prevents reaching the audit line), they won't log at all — but
if they fail silently, the audit record would be incorrect.

Is there an intent to add failure-path audit logging here in the same way test-publishing.ps1 does, or is the
assumption that $ErrorActionPreference = 'Stop' is sufficient to prevent false-success logs?

@davfost
Copy link
Copy Markdown

davfost commented Apr 14, 2026

File: src/Validation/src/AuditHelper.cs (Initialize() method, lines ~609–615)

The _initialized check-then-set pattern is not thread-safe. If Initialize() is called concurrently from
multiple threads (e.g., from parallel test execution), both threads could see _initialized == false, call
AuditLogger.Initialize() twice, and potentially double-initialize the SDK.

Unless AuditLogger.Initialize() is already idempotent and thread-safe internally, would it make sense to guard
this with a lock statement or use Lazy<T> / Interlocked to ensure single initialization?

@haruna99
Copy link
Copy Markdown
Author

File: eng/validation/update-channel.ps1 (line ~10)

There appears to be a wrong path for dot-sourcing the audit module. The file is loaded as:

   . $PSScriptRoot\..\common\audit-logging.ps1

But audit-logging.ps1 lives in eng/validation/, not eng/common/. Since update-channel.ps1 is also in eng/validation/, shouldn't this be:

   . $PSScriptRoot\audit-logging.ps1

I think this would cause a script error at runtime when update-channel.ps1 runs, since the path eng/common/audit-logging.ps1 doesn't exist.

@davfost thanks for pointing this out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants