Skip to content

feat(audience-session): session lifecycle (SDK-145)#699

Merged
ImmutableJeffrey merged 19 commits into
mainfrom
feat/sdk-145-session
Apr 23, 2026
Merged

feat(audience-session): session lifecycle (SDK-145)#699
ImmutableJeffrey merged 19 commits into
mainfrom
feat/sdk-145-session

Conversation

@ImmutableJeffrey

@ImmutableJeffrey ImmutableJeffrey commented Apr 23, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Adds Runtime/Core/Session.cs — pure C# session lifecycle emitting session_start on Init (or consent upgrade from None), session_heartbeat every 60s while foreground, and session_end on Shutdown or Resume after a pause longer than 30s. durationSec is engagement time only (excludes pause).
  • Wires ImmutableAudience.Init / SetConsent / Shutdown / ResetState to create, swap, and dispose the active Session under _initLock. Routes OnPause / OnResume from the Unity lifecycle bridge to the session. Reset ends the old session silently and starts a new one, matching Web SDK reset().
  • Adds internal TrackDelegate for the Session's _track callback.
  • Hardens Log.Emit with try/catch so a throwing Writer cannot escape SafeTrack and kill the Timer thread.
  • Renames wire field durationdurationSec on session_heartbeat / session_end to match the Unity SDK Event Reference v1.
  • Grows the test suite from 180 to 220 — heartbeat quiescence while paused, extended-pause rollover, clock-rewind clamps, exception containment on _track and PerformanceSnapshotProvider, double-Start timer drain, concurrent consent-upgrade-from-None stress, concurrent Init+SetConsent consistency, Reset-starts-new-session.

Linear: SDK-145


Note

Medium Risk
Adds a new session lifecycle with background heartbeats and refactors Init/SetConsent/Reset/Shutdown locking and teardown ordering; mistakes here could cause event leaks, timer/thread issues, or missed/duplicated session events under concurrency.

Overview
Adds a new Session component that emits session_start, periodic session_heartbeat, and session_end, with engagement-time durationSec (pause-adjusted), heartbeat quiescence while paused, and rollover to a new session after extended pauses.

Wires session creation/teardown into ImmutableAudience so sessions start on Init (or consent upgrade from None), end on Shutdown, and Reset() swaps to a fresh session while purging the queued session_end for Web-SDK parity; also adds internal OnPause/OnResume routing and restructures Shutdown/SetConsent/ResetState to release _initLock before blocking teardown work.

Hardens logging (Log.Emit) and session tracking/snapshot callbacks to swallow exceptions (especially on timer threads), and expands tests substantially to cover lifecycle ordering, pause/clock edge cases, exception containment, and concurrency stress around consent/session state.

Reviewed by Cursor Bugbot for commit ebe26c8. Bugbot is set up for automated code reviews on this repo. Configure here.

@ImmutableJeffrey ImmutableJeffrey requested review from a team as code owners April 23, 2026 04:53
@ImmutableJeffrey ImmutableJeffrey force-pushed the feat/sdk-145-session branch 4 times, most recently from 8715768 to cf7d0ca Compare April 23, 2026 05:22
Comment thread src/Packages/Audience/Runtime/ImmutableAudience.cs Outdated
ImmutableJeffrey added a commit that referenced this pull request Apr 23, 2026
Shutdown and Reset held _initLock across every blocking step — session
heartbeat drain (1s), send-timer drain (2s), queue drain-thread join
(5s), and the final send (2s by default). Worst-case ~10s. Any caller
racing on the lock (SetConsent from a UI thread, a follow-up Init)
waited for the full budget.

Split both methods into two phases:

- Phase 1 under _initLock: flip _initialized, capture disposables into
  locals, null out the statics. Nanoseconds.
- Phase 2 outside _initLock: dispose session, drain timer, shut down
  queue, flush transport, cancel cts, dispose everything. Same order
  as before, same idempotency on re-entry — callers arriving after
  the flag flip see _initialized == false and return cleanly.

ResetState used to wrap Shutdown in its own _initLock (relying on
Monitor recursion). With Shutdown now releasing the lock before its
teardown, the outer wrap would have re-stranded waiters for the full
Phase 2 budget. Dropped to Shutdown() + a brief post-lock for the
defensive _consent/_session reset.

Addresses #699 (comment)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ImmutableJeffrey added a commit that referenced this pull request Apr 23, 2026
… before Phase 2

Locks a Shutdown inside its Phase 2 final flush via a BlockingHandler
that never returns from SendAsync, then calls Reset on the main
thread with a stopwatch. Pre-refactor, Reset would wait on _initLock
for the full ShutdownFlushTimeoutMs (up to 10s in this test's setup).
Post-refactor, Reset acquires the lock immediately, sees
!_initialized, and returns in microseconds.

Asserts Reset completes in under 500ms — generous margin to avoid CI
flakes while still catching a regression that re-introduces a
seconds-scale hold.

Addresses the reviewer's concern from PR #699 and guards against
future changes that move blocking work back inside the lock.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread src/Packages/Audience/Runtime/ImmutableAudience.cs
Comment thread src/Packages/Audience/Runtime/ImmutableAudience.cs
Comment thread src/Packages/Audience/Runtime/ImmutableAudience.cs Outdated
Comment thread src/Packages/Audience/Runtime/ImmutableAudience.cs

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit c43044a. Configure here.

Comment thread src/Packages/Audience/Runtime/Core/Session.cs
@ImmutableJeffrey ImmutableJeffrey enabled auto-merge (rebase) April 23, 2026 11:31
ImmutableJeffrey and others added 14 commits April 23, 2026 21:32
Unity session lifecycle with engagement-aware duration, crash-safe
heartbeats, and thread-safe integration with ImmutableAudience.

- Session class: fires session_start on Init, session_heartbeat every
  60s while foreground, session_end on Shutdown or Resume after >30s
  pause. duration excludes pause time.
- ImmutableAudience: _session field, serialised under _initLock across
  Init / SetConsent / Shutdown / ResetState. OnPause / OnResume route
  to Session on the Unity lifecycle bridge.
- SetConsent upgrade from None starts a new Session under the lock.
- SetConsent downgrade to None disposes the Session; session_end is
  gated out by CanTrack so no post-revocation event leaks.
- Shutdown drains the heartbeat timer before emitting session_end so
  wire ordering holds.
- Log.Emit swallows Writer exceptions so SafeTrack stays safe on the
  Timer thread.
- EnqueueChecked gate closes the Track-races-SetConsent(None) purge
  window.

Tests cover heartbeat quiescence while paused, extended-pause
rollover, clock rewind clamps, exception containment on _track and
performance-snapshot, double-Start timer drain, concurrent SetConsent
upgrade, concurrent Init+SetConsent consistency.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace narrative comment blocks with terse fact-headers.

- Session.cs: class header 13→3 lines; method headers condensed; inline
  rationale kept only where non-obvious.
- ImmutableAudience.cs: Init/Shutdown/SetConsent/ResetState narratives
  condensed; redundant blocks removed.
- Log.cs: Emit swallow rationale condensed.

No semantic changes. All 218 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace Action<string, Dictionary<string, object>> in the Session
constructor with a named internal TrackDelegate.

Self-documents the calling convention (parameter names visible in IDE)
and makes the Session API easier to skim.

Test mocks convert via method-group → compatible delegate; no test changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Align with the Unity SDK Event Reference (v1), which specifies
session_heartbeat and session_end properties as durationSec. The
explicit unit suffix matches the fpsAvg / memoryUsedMb style on the
same event and makes the units unambiguous on the wire.

Web SDK still emits "duration"; dashboards comparing surfaces must
join on the Unity-specific key.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Start / End / Dispose are not reentrant-safe from multiple threads.
Caller (ImmutableAudience) serialises via _initLock. Pin the contract
in the class-level comment so a future caller doesn't assume the
public-surface thread-safety guarantee extends to Start.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The prior comment named only the live-pause path (End fires while
paused). The test that motivates the guard exercises the Resume path,
where Resume's pauseDuration (not the live pause) shrinks. Mention
both so a future reader can map either test to the rationale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Session.SafeTrack and SafePerformanceSnapshot log warnings when the
studio-supplied _track or performance-snapshot delegate throws. The
exception type is diagnostic; the message is consumer-supplied and
may contain PII (e.g., a track impl that includes player-data in its
exception text).

Log only the exception type name. Studios can attach a debugger for
the full message if needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The double-Pause guard silently returned. A studio seeing unexpected
engagement numbers had no way to confirm a spurious lifecycle-bridge
Pause was the cause.

Log.Debug (gated by config.Debug) surfaces the event during integration
without spamming production logs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Prior Reset cleared userId and purged the queue but kept the old
Session running. Subsequent Track events carried the old sessionId
with a new anonymousId — a confusing "session straddles two users"
wire state.

Match Web SDK reset() behaviour:
- Dispose old session (session_end is enqueued then wiped by PurgeAll,
  matching Web SDK's silent teardown).
- Reset identity on disk (next GetOrCreate mints a fresh anonymousId).
- Mint a new Session when consent allows tracking; fire session_start
  via Start() outside _initLock.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sion_end

Pin the M2-A1 fix (commit dc286a0e): Reset must end the old session
silently, mint a fresh anonymousId, and fire session_start for the
new session. Matches Web SDK reset().

- Reset_StartsNewSession_DoesNotEmitSessionEnd: new session_start present,
  no session_end on disk, new anonymousId differs from the first.
- Reset_ConsentNone_DoesNotStartSession: at None consent, Reset must not
  spin up a new session (no session_start emitted).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rewrites jargon-heavy comments (off-thread, serialise, wire-ordered,
fence, CAS, Timer thread) into plain language that reads without a
threading background.

No code changes. All 220 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ocals

The tests project does not enable nullable reference types, so
`Exception? initEx = null` triggered CS8632 "The annotation for
nullable reference types should only be used in code within a
'#nullable' annotations context."

Drop the `?` to match the file's and project's existing convention
(other tests write e.g. `string capturedKey = null`). Behaviour is
unchanged — the locals are still null until the catch writes to them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Shutdown and Reset held _initLock across every blocking step — session
heartbeat drain (1s), send-timer drain (2s), queue drain-thread join
(5s), and the final send (2s by default). Worst-case ~10s. Any caller
racing on the lock (SetConsent from a UI thread, a follow-up Init)
waited for the full budget.

Split both methods into two phases:

- Phase 1 under _initLock: flip _initialized, capture disposables into
  locals, null out the statics. Nanoseconds.
- Phase 2 outside _initLock: dispose session, drain timer, shut down
  queue, flush transport, cancel cts, dispose everything. Same order
  as before, same idempotency on re-entry — callers arriving after
  the flag flip see _initialized == false and return cleanly.

ResetState used to wrap Shutdown in its own _initLock (relying on
Monitor recursion). With Shutdown now releasing the lock before its
teardown, the outer wrap would have re-stranded waiters for the full
Phase 2 budget. Dropped to Shutdown() + a brief post-lock for the
defensive _consent/_session reset.

Addresses #699 (comment)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…set's Identity.Reset

Extends acc3bab to the remaining _initLock callers that still held
blocking work:

SetConsent held _initLock through Session.Dispose (heartbeat drain,
up to 1s on Full→None), ConsentStore.Save (disk), PurgeAll (disk I/O
under _drainLock), Identity.Reset (File.Delete), and
ApplyAnonymousDowngrade (synchronous file rewrite). Same two-phase
shape as Shutdown: flip _consent and swap _session / _userId under
the lock; persist, dispose, purge, downgrade, sync, and start outside.

Reset still held Identity.Reset inside the lock. Hoisted for symmetry
with Shutdown / SetConsent. Order in Phase 2 is Dispose → PurgeAll →
Identity.Reset → newSession.Start, matching the in-lock sequence it
replaces (session_end is enqueued then wiped; new session_start runs
against the fresh anonymousId).

Updated the class-level memory-model comment with one paragraph
naming the four lifecycle methods and why their hold time is now
nanoseconds, not seconds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ImmutableJeffrey and others added 5 commits April 23, 2026 21:32
… before Phase 2

Locks a Shutdown inside its Phase 2 final flush via a BlockingHandler
that never returns from SendAsync, then calls Reset on the main
thread with a stopwatch. Pre-refactor, Reset would wait on _initLock
for the full ShutdownFlushTimeoutMs (up to 10s in this test's setup).
Post-refactor, Reset acquires the lock immediately, sees
!_initialized, and returns in microseconds.

Asserts Reset completes in under 500ms — generous margin to avoid CI
flakes while still catching a regression that re-introduces a
seconds-scale hold.

Addresses the reviewer's concern from PR #699 and guards against
future changes that move blocking work back inside the lock.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…in Shutdown

The two-phase refactor broke Shutdown_FiresSessionEnd. Phase 1 flipped
_initialized to false, then Phase 2 called session.Dispose(). Session
fires session_end through ImmutableAudience.Track, which guards on
CanTrack() = _initialized && _consent.CanTrack(). With _initialized
already false, Track dropped the event on the floor.

Added Session.EmitEndAndSeal — an emit-only variant of End() that
fires session_end and resets state but does not drain the heartbeat
timer. Shutdown calls it under _initLock before the flag flip, so
Track's gate still lets session_end through. Heartbeat timer drain
continues to happen in Phase 2 via session.Dispose(); the second
End() inside Dispose no-ops because EmitEndAndSeal already cleared
_sessionId.

Lock hold time is unchanged: EmitEndAndSeal takes Session._lock for
microseconds to compute duration and reset state, then releases
before calling SafeTrack → Track → Enqueue (which takes EventQueue's
_drainLock, also microseconds). Nothing blocking runs under _initLock.

Also drops the column-aligned capture block in Shutdown Phase 1 —
dotnet-format rejects the extra whitespace, so one assignment per line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… inside it

Cursor Bugbot flagged that EmitEndAndSeal's Track call chain (Identity
GetOrCreate → Enqueue → _drainLock) was running under _initLock in
20e5e86, contradicting the "nanoseconds" claim and re-introducing the
exact concern nattb8 raised in the original review.

Move _session?.EmitEndAndSeal() to before the lock acquire. Idempotent
via Session._sessionId reset — safe to call without outer serialisation.
_initLock now covers only the flag flip and reference captures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eal in Shutdown

Two Cursor Bugbot findings on 63b13bb:

SetConsent held _initLock across Identity.GetOrCreate / Identity.Get.
On the None → Anonymous/Full upgrade path, GetOrCreate creates
directories and writes a UUID file — real disk I/O under the lock.
Moved the capture outside _initLock. Identity's own _sync lock
serialises the I/O; TOCTOU on previous is accepted — a racing
SetConsent fires its own PUT and our slightly-stale anonymousId
still identifies the user.

Shutdown's outside-lock EmitEndAndSeal targeted whatever _session
pointed to at call time. A Reset completing fully (including new
session Start()) in the narrow window before Shutdown's lock acquire
would leave the new session's session_start on the wire with no
matching session_end — Shutdown disposes the new session after
flipping _initialized=false, so Track drops the closing event.
Added a second EmitEndAndSeal inside the lock as a race guard.
Idempotent on the same instance (no-op via _sessionId null check);
the slow path only runs when the race actually happened.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cursor Bugbot finding on c43044a.

Resume() clamps pauseDuration ≥ 0 for wall-clock rewinds, but the
mirror calculation in ComputeEngagedSecondsLocked did not. If the
clock rewinds while the session is still paused and End fires (e.g.
Shutdown while backgrounded with NTP pulling time back), livePause
goes negative. Being subtracted, it inflated engagedSeconds past the
wall-clock window. The final engagedSeconds ≥ 0 clamp caught
negatives but not over-credit.

Mirror the Resume-side clamp: livePause < 0 → 0.

Added End_ClockRewindsWhilePaused_DoesNotInflateDuration as a
regression guard. 10s before pause, clock rewinds 5s while paused,
End fires — duration must stay within the 5s wall-clock window.
Without the clamp the test reports 15s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ImmutableJeffrey ImmutableJeffrey merged commit 8a52e2b into main Apr 23, 2026
19 checks passed
ImmutableJeffrey added a commit that referenced this pull request Apr 23, 2026
Shutdown and Reset held _initLock across every blocking step — session
heartbeat drain (1s), send-timer drain (2s), queue drain-thread join
(5s), and the final send (2s by default). Worst-case ~10s. Any caller
racing on the lock (SetConsent from a UI thread, a follow-up Init)
waited for the full budget.

Split both methods into two phases:

- Phase 1 under _initLock: flip _initialized, capture disposables into
  locals, null out the statics. Nanoseconds.
- Phase 2 outside _initLock: dispose session, drain timer, shut down
  queue, flush transport, cancel cts, dispose everything. Same order
  as before, same idempotency on re-entry — callers arriving after
  the flag flip see _initialized == false and return cleanly.

ResetState used to wrap Shutdown in its own _initLock (relying on
Monitor recursion). With Shutdown now releasing the lock before its
teardown, the outer wrap would have re-stranded waiters for the full
Phase 2 budget. Dropped to Shutdown() + a brief post-lock for the
defensive _consent/_session reset.

Addresses #699 (comment)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ImmutableJeffrey ImmutableJeffrey deleted the feat/sdk-145-session branch April 23, 2026 11:38
ImmutableJeffrey added a commit that referenced this pull request Apr 23, 2026
… before Phase 2

Locks a Shutdown inside its Phase 2 final flush via a BlockingHandler
that never returns from SendAsync, then calls Reset on the main
thread with a stopwatch. Pre-refactor, Reset would wait on _initLock
for the full ShutdownFlushTimeoutMs (up to 10s in this test's setup).
Post-refactor, Reset acquires the lock immediately, sees
!_initialized, and returns in microseconds.

Asserts Reset completes in under 500ms — generous margin to avoid CI
flakes while still catching a regression that re-introduces a
seconds-scale hold.

Addresses the reviewer's concern from PR #699 and guards against
future changes that move blocking work back inside the lock.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

2 participants