refactor!: remove the EventEmitter interface from SessionPool #3643

Open
barjin wants to merge 11 commits into v4 from refactor/remove-session-pool-events

@barjin barjin commented May 11, 2026

Removes extends EventEmitter from SessionPool. The event system was originally used only to retire browsers on expiring Sessions. BrowserCrawler now retires the browser at the end of Request processing if the Session turns out to be retired, which leads to a simpler API.

This refactor simplifies the SessionPool interface and allows us to drop the SessionPool references from different parts of the codebase.

Related to (prerequisite of) #3617

barjin added 3 commits May 11, 2026 14:22
…able

Previously, calling retire() bumped errorScore to maxErrorScore but a subsequent markGood() (e.g. the automatic markGood after a successful requestHandler that explicitly retired the session) could decrement the score back below the threshold, making the session usable again. Track retirement in a dedicated _retired flag checked by isUsable() so retire() is a true terminal state.
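
The difference between a score-based retirement and a terminal flag can be sketched roughly as follows. This is a minimal illustration, not the actual Session class: TerminalSession, its default thresholds, and the assumption that retire() adds maxErrorScore to the score are all hypothetical.

```typescript
// Hypothetical, simplified stand-in for Session; not the library's actual class.
class TerminalSession {
    private errorScore = 0;
    private retired = false;

    constructor(
        private readonly maxErrorScore = 3,
        private readonly errorScoreDecrement = 0.5,
    ) {}

    /** Score-based check only: this is all the old behaviour relied on. */
    isBlocked(): boolean {
        return this.errorScore >= this.maxErrorScore;
    }

    /** Usable only while not retired and below the error threshold. */
    isUsable(): boolean {
        return !this.retired && !this.isBlocked();
    }

    /** A success lowers the score but never clears the retired flag. */
    markGood(): void {
        this.errorScore = Math.max(0, this.errorScore - this.errorScoreDecrement);
    }

    /** Terminal: sets the flag in addition to bumping the score (assumed additive). */
    retire(): void {
        this.retired = true;
        this.errorScore += this.maxErrorScore;
    }
}

const session = new TerminalSession();
session.retire();
session.markGood(); // e.g. the automatic markGood() after a successful handler

console.log(session.isBlocked()); // false: the score alone dropped below the threshold
console.log(session.isUsable()); // false: the flag keeps the session retired
```

With only the score check, the session in this sketch would have looked usable again after markGood(); the separate flag is what makes retirement stick.
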
Replace the global EVENT_SESSION_RETIRED listener and the per-controller
browserSessionIds map with a check at the per-request cleanup hook: if
the session ended the request unusable, retire the browser controller.
The previous mechanism tore down browsers eagerly mid-flight; the new
one lets the in-flight request finish on the doomed browser and retires
it once the request is done. Same outcome, no global event subscription
needed.
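
The shape of that end-of-request check might look something like the sketch below. Every name here (SessionLike, ControllerLike, BrowserPoolSketch, cleanupAfterRequest) is a hypothetical stand-in chosen for illustration; the real cleanup hook lives in the crawler's context pipeline and is not reproduced here.

```typescript
// Hypothetical stand-ins for illustration; none of these are the crawler's real types.
interface SessionLike {
    isUsable(): boolean;
}

interface ControllerLike {
    readonly id: string;
}

class BrowserPoolSketch {
    readonly retiredIds: string[] = [];

    // Records which controllers were retired so the effect is observable.
    retireBrowserController(controller: ControllerLike): void {
        this.retiredIds.push(controller.id);
    }
}

// Body of the per-request cleanup step: runs after the request has finished.
function cleanupAfterRequest(
    pool: BrowserPoolSketch,
    controller: ControllerLike,
    session: SessionLike,
): void {
    if (!session.isUsable()) {
        pool.retireBrowserController(controller);
    }
}

const pool = new BrowserPoolSketch();
let usable = true;
const session: SessionLike = { isUsable: () => usable };
const controller: ControllerLike = { id: 'browser-1' };

usable = false; // the session gets retired while the request is still in flight
// ...the request keeps running on the same browser and finishes normally...
cleanupAfterRequest(pool, controller, session);

console.log(pool.retiredIds); // ['browser-1']
```

Because the check reads the session's state rather than reacting to an event, nothing needs to subscribe to the pool or track which session belongs to which controller.
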
SessionPool no longer extends EventEmitter and no longer fires a
sessionRetired event. The Session->SessionPool back-reference, the
sessionPool constructor option on Session, and the EVENT_SESSION_RETIRED
constant are gone with it. The only consumer of that event was the
browser crawler, which now retires browsers via the per-request context
pipeline cleanup. Custom createSessionFunction implementations that
manually constructed Session instances should drop the sessionPool
argument.
@barjin barjin requested a review from Copilot May 11, 2026 13:04
@barjin barjin self-assigned this May 11, 2026
Copilot AI left a comment

Pull request overview

This PR is a breaking refactor that removes the EventEmitter-based event system from SessionPool and updates crawler logic to retire browser instances based on session usability at the end of request processing. This simplifies the SessionPool/Session relationship and reduces cross-package coupling, as described in the PR metadata and #3617 context.

Changes:

  • Remove SessionPool extends EventEmitter and drop EVENT_SESSION_RETIRED / sessionPool back-reference from Session.
  • Update BrowserCrawler to retire the current browser controller in a deferred cleanup when session.isUsable() is false.
  • Update tests and upgrade docs to reflect the new Session / SessionPool construction patterns.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| test/core/session_pool/session.test.ts | Updates Session tests to not require a SessionPool and removes event-emission assertions. |
| test/core/session_pool/session_pool.test.ts | Adjusts SessionPool tests for sessions no longer holding a sessionPool reference. |
| test/core/crawlers/puppeteer_crawler.test.ts | Updates custom createSessionFunction to construct Session without sessionPool. |
| test/core/crawlers/browser_crawler.test.ts | Updates browser-retirement test to rely on crawler end-of-request retirement behavior. |
| packages/core/src/session_pool/session.ts | Removes EventEmitter validation, removes sessionPool option, adds _retired terminal state affecting isUsable(). |
| packages/core/src/session_pool/session_pool.ts | Removes EventEmitter inheritance and updates default session creation + load path accordingly. |
| packages/core/src/session_pool/index.ts | Stops exporting ./events.js. |
| packages/core/src/session_pool/events.ts | Deletes EVENT_SESSION_RETIRED. |
| packages/browser-crawler/src/internals/browser-crawler.ts | Removes session-retired listener mechanism and retires browser controllers during deferred cleanup when session becomes unusable. |
| packages/basic-crawler/src/internals/basic-crawler.ts | Updates default createSessionFunction signature and removes setMaxListeners call on SessionPool. |
| docs/upgrading/upgrading_v4.md | Documents the breaking change and new recommended patterns. |

Comment thread packages/core/src/session_pool/session.ts
Comment thread packages/core/src/session_pool/session.ts Outdated
Comment thread packages/browser-crawler/src/internals/browser-crawler.ts
Comment thread packages/core/src/session_pool/session.ts
barjin added 7 commits May 12, 2026 08:59
Bail early when the session is already retired so subsequent
retire() calls don't keep inflating errorScore and usageCount.
Without this guard, the auto markGood() the crawler invokes after
a successful requestHandler that explicitly retired the session
triggers _maybeSelfRetire() -> retire() and double-bumps both
counters.
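
A rough sketch of the early-return guard, under stated assumptions: IdempotentSession is a hypothetical simplification, and the way retire() and markGood() touch errorScore and usageCount here is inferred from the commit message rather than copied from the real class.

```typescript
// Hypothetical simplification; counter updates are assumptions based on the commit message.
class IdempotentSession {
    errorScore = 0;
    usageCount = 0;
    private retired = false;

    constructor(private readonly maxErrorScore = 3) {}

    isUsable(): boolean {
        return !this.retired && this.errorScore < this.maxErrorScore;
    }

    retire(): void {
        // Guard: a second retire() must not bump the counters again.
        if (this.retired) return;
        this.retired = true;
        this.errorScore += this.maxErrorScore;
        this.usageCount += 1;
    }

    markGood(): void {
        this.usageCount += 1;
        this.errorScore = Math.max(0, this.errorScore - 0.5);
        this.maybeSelfRetire();
    }

    // Retires the session once it is no longer usable; this is the path that
    // re-enters retire() for a session that was already retired explicitly.
    private maybeSelfRetire(): void {
        if (!this.isUsable()) this.retire();
    }
}

const session = new IdempotentSession();
session.retire(); // explicit retire inside the request handler
session.markGood(); // automatic markGood() after the handler succeeds

console.log(session.errorScore); // 2.5 (would be 5.5 without the guard)
console.log(session.usageCount); // 2 (would be 3 without the guard)
```
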
The previous block still referred to the removed sessionRetired
event emission. Rewrite it to describe the current behaviour: a
terminal retire state, idempotent, with markGood/markBad unable
to bring the session back.
The terminal _retired flag was not part of SessionState, so retired
sessions could resurrect after a SessionPool persist/restore cycle:
a previously retired session would be rebuilt without the flag, and
since the auto markGood() following a retire() leaves errorScore
below maxErrorScore, isUsable() would return true again.

Thread retired through SessionState, expose a public Session.retired
getter, accept retired in the constructor (ow shape + destructuring),
and emit it from getState() so the flag survives the round-trip.
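
The round-trip can be illustrated with a small sketch. SessionStateSketch and PersistableSession are hypothetical, trimmed-down shapes, and treating a missing retired field as false on load is an assumption made for the example; a JSON round-trip stands in for the real persist/restore cycle.

```typescript
// Hypothetical, trimmed-down shapes; the real SessionState has more fields.
interface SessionStateSketch {
    id: string;
    errorScore: number;
    retired: boolean;
}

class PersistableSession {
    readonly id: string;
    private errorScore: number;
    private _retired: boolean;
    private readonly maxErrorScore = 3;

    // Assumption: `retired` is optional on input and defaults to false.
    constructor(state: { id: string; errorScore?: number; retired?: boolean }) {
        this.id = state.id;
        this.errorScore = state.errorScore ?? 0;
        this._retired = state.retired ?? false;
    }

    get retired(): boolean {
        return this._retired;
    }

    isUsable(): boolean {
        return !this._retired && this.errorScore < this.maxErrorScore;
    }

    retire(): void {
        this._retired = true;
    }

    // The flag is part of the emitted state, so it survives serialization.
    getState(): SessionStateSketch {
        return { id: this.id, errorScore: this.errorScore, retired: this._retired };
    }
}

const original = new PersistableSession({ id: 'session-1' });
original.retire();

// Simulate persist/restore with a JSON round-trip.
const saved = JSON.stringify(original.getState());
const restored = new PersistableSession(JSON.parse(saved));

console.log(restored.retired); // true
console.log(restored.isUsable()); // false
```
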
The leading SessionPool argument was only useful to pass into the
Session constructor, which no longer keeps a back-reference to the
pool. CreateSession now takes a single options object — same shape
as before, just without the redundant first parameter.
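
For a custom createSessionFunction, the change could look roughly like this. The types below are local, hypothetical stand-ins (SketchSession, SketchSessionOptions, and the sessionOptions key are assumptions for illustration), so the sketch shows only the single-argument shape, not the library's exact typings.

```typescript
// Hypothetical local stand-ins; not imported from the library.
interface SketchSessionOptions {
    id?: string;
    maxUsageCount?: number;
}

class SketchSession {
    readonly id: string;
    readonly maxUsageCount: number;

    // No sessionPool option: the session keeps no back-reference to a pool.
    constructor(options: SketchSessionOptions = {}) {
        this.id = options.id ?? `session_${Math.random().toString(36).slice(2, 10)}`;
        this.maxUsageCount = options.maxUsageCount ?? 50;
    }
}

// Assumed shape of the single options object passed to the function.
interface SketchCreateSessionOptions {
    sessionOptions?: SketchSessionOptions;
}

type SketchCreateSession = (options?: SketchCreateSessionOptions) => SketchSession;

// Before (sketch): (sessionPool, options) => new Session({ ...options.sessionOptions, sessionPool })
// After (sketch): the pool argument and the sessionPool option are both gone.
const createSessionFunction: SketchCreateSession = (options = {}) =>
    new SketchSession({ ...options.sessionOptions, maxUsageCount: 10 });

const created = createSessionFunction({ sessionOptions: { id: 'abc' } });
console.log(created.id, created.maxUsageCount); // abc 10
```
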

Merge the two SessionPool-related sections in the v4 upgrade guide
into a single 'createSessionFunction signature has changed' entry
covering both the merged-options behaviour and the dropped argument.
The JSDoc on the createSessionFunction option still said the function
receives a SessionPool instance — that argument was just removed.
Match the naming of the other Session predicates (isBlocked,
isExpired, isMaxUsageCountReached, isUsable). The persisted
SessionState.retired field stays as-is — it's a noun-style
state field consistent with errorScore, usageCount, etc.
@barjin barjin requested review from B4nan, janbuchar and l2ysho May 12, 2026 09:04
Comment thread packages/browser-crawler/src/internals/browser-crawler.ts
… session

Address PR review: spell out the leak-prevention rationale (non-incognito
controllers carry cookies/storage across pages, so a no-longer-usable
session would taint whichever session lands on the controller next).