[scheduler] Add scheduling mode: E-PVM #2398
No actionable comments were generated in the recent review.
📝 Walkthrough
The PR integrates Redis-backed resource accounting and placement strategy scoring into the scheduler, replacing prior resource accounting services. Facility identifiers become strings throughout the data model. The host cache checkout API changes to accept placement profiles and scoring gates, supporting both saturation (first-fit) and E-PVM (lowest-score) strategies. Cluster feed streaming is refactored with control-flow resilience and round-trip timing. Matcher logic now performs accounting prechecks and constructs placement contexts for strategy-aware host selection.
Changes
Scheduler-managed accounting and placement integration
Sequence Diagram(s)
sequenceDiagram
participant Matcher as Matcher Layer<br/>Processing
participant Redis as Redis<br/>Accounting
participant HostCache as Host Cache<br/>Actor
participant HostDB as Host DB<br/>Lookup
Matcher->>Redis: read job cores_in_use
Redis-->>Matcher: job_cores_in_use (or 0)
Matcher->>Redis: read show/alloc subscription
Redis-->>Matcher: (int_cores, burst)
Matcher->>Matcher: compute profile (burst,<br/>cores_in_use, weights)
Matcher->>Matcher: select Saturation<br/>or Epvm strategy
alt pre-checkout burst limit exceeded
Matcher->>Matcher: emit FeedMessage::Sleep
Matcher->>Redis: increment<br/>accounting_limit_exceeded
Note over Matcher: Early return, no host checkout
else proceed to checkout
Matcher->>HostCache: send CheckOut<br/>(profile, gate, facility_id)
HostCache->>HostDB: fetch candidate hosts<br/>by show/facility/tag
HostDB-->>HostCache: candidate list
alt Saturation strategy
HostCache->>HostCache: first valid host<br/>gate(host, profile)=Some()
else Epvm strategy
HostCache->>HostCache: K-best by lowest score<br/>min(gate(host, profile))
HostCache->>HostCache: CAS removal with retries
Note over HostCache: Record inner-retries metric
end
HostCache-->>Matcher: Host or error
Matcher->>Matcher: dispatch frames,<br/>track consumption
Matcher->>Matcher: update booked counters<br/>for next iteration
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 14
🧹 Nitpick comments (4)
rust/crates/scheduler/src/config/mod.rs (1)
101-105: 💤 Low value
Consider adding authentication support to Redis URL construction.
The `url()` method constructs a basic `redis://host:port/` URL without authentication. If production deployments require Redis AUTH, this method may need enhancement to support username/password or connection string overrides.
rust/crates/scheduler/src/pipeline/matcher.rs (1)
306-312: ⚡ Quick win
Handle `FeedMessage::Sleep` send failures explicitly.
Ignoring the send result hides closed-channel failures; then this over-burst path returns without applying any backoff locally.
Suggested patch
-        let _ = feed_sender
-            .send(FeedMessage::Sleep(
-                (*cluster).clone(),
-                CONFIG.queue.cluster_empty_sleep,
-            ))
-            .await;
+        if let Err(err) = feed_sender
+            .send(FeedMessage::Sleep(
+                (*cluster).clone(),
+                CONFIG.queue.cluster_empty_sleep,
+            ))
+            .await
+        {
+            debug!(
+                "Failed to enqueue cluster sleep for {} (over-burst path): {}",
+                cluster, err
+            );
+            tokio::time::sleep(CONFIG.queue.cluster_empty_sleep).await;
+        }
         return;
rust/crates/scheduler/src/dao/resource_accounting_dao.rs (1)
149-152: 💤 Low value
Redundant `p.pk_show = ANY($1)` filter in point recompute query.
Line 151 applies `p.pk_show = ANY($1)` in the LEFT JOIN condition, but `p.pk_show = pt2.pk_show` is already constrained by the join, and `pt2.pk_show = ANY($1)` at line 152 already filters the point rows. The filter at line 151 is redundant.
Not a correctness issue, but removing it would simplify the query slightly.
pycue/tests/wrappers/test_show.py (1)
191-201: ⚡ Quick win
Cover both toggle directions in the RPC assertion.
`testSetSchedulerManaged` currently validates only `enabled=True`. Adding `enabled=False` in the same test tightens contract coverage for the new toggle RPC with minimal effort.
♻️ Suggested test update
 def testSetSchedulerManaged(self, getStubMock):
     stubMock = mock.Mock()
     stubMock.SetSchedulerManaged.return_value = show_pb2.ShowSetSchedulerManagedResponse()
     getStubMock.return_value = stubMock
     show = opencue.wrappers.show.Show(show_pb2.Show(name=TEST_SHOW_NAME))
-    show.setSchedulerManaged(True)
-
-    stubMock.SetSchedulerManaged.assert_called_with(
-        show_pb2.ShowSetSchedulerManagedRequest(show=show.data, enabled=True),
-        timeout=mock.ANY)
+    show.setSchedulerManaged(True)
+    show.setSchedulerManaged(False)
+
+    stubMock.SetSchedulerManaged.assert_any_call(
+        show_pb2.ShowSetSchedulerManagedRequest(show=show.data, enabled=True),
+        timeout=mock.ANY)
+    stubMock.SetSchedulerManaged.assert_any_call(
+        show_pb2.ShowSetSchedulerManagedRequest(show=show.data, enabled=False),
+        timeout=mock.ANY)
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cuebot/build.gradle`:
- Around line 41-44: Update the Lettuce dependency to a non-vulnerable release
by changing the dependency declaration for io.lettuce:lettuce-core (currently
'6.3.2.RELEASE') to at least '6.5.1.RELEASE' (preferably the latest, e.g.
'6.7.1.RELEASE'); keep the explicit reactor-core override for
io.projectreactor:reactor-core at the 3.6.x line (e.g. '3.6.5') so Lettuce
compatibility is preserved and remove/replace the old version reference wherever
io.lettuce:lettuce-core '6.3.2.RELEASE' is declared.
In
`@cuebot/src/main/resources/conf/ddl/postgres/migrations/V44__scheduler_pending_query_indexes.sql`:
- Line 20: The manual backfill example in
V44__scheduler_pending_query_indexes.sql incorrectly inserts version 40 into
flyway_schema_history; update the VALUES tuple to use 44 to match this migration
(ensure the INSERT INTO flyway_schema_history(... ) VALUES (..., 44, ... )
reflects the V44__scheduler_pending_query_indexes.sql migration) so the
flyway_schema_history row matches the migration version and prevents poisoning
subsequent runs.
In `@docs/_docs/developer-guide/redis-accounting.md`:
- Around line 247-253: The two unlabeled fenced code blocks that contain the
numbered acct booking steps and the cueadmin command examples should be given
explicit languages to satisfy markdownlint MD040: change the fence wrapping the
numbered list (the block starting "1. Read current state of acct:sub /
acct:folder / acct:job / acct:layer / acct:point") to use a text label (e.g.,
```text) and change the fence wrapping the cueadmin examples (the block with
"cueadmin -show <name> -setSchedulerManaged true/false") to use a shell label
(e.g., ```bash); update both occurrences (the one around the acct steps and the
one around the cueadmin commands) accordingly.
- Line 14: The heading "How Cuebot and the Rust scheduler coordinate per-show
resource accounting through Redis" is using ### causing a level jump from # to
### and failing MD001; change that heading token from "###" to "##" so it
follows the top-level "#" and fixes the markdown lint error (update the heading
line where that exact text appears).
In `@docs/_docs/developer-guide/scheduler.md`:
- Around line 648-654: The fenced code block listing the scheduler steps
(starting "1. Read current state of acct:sub / acct:folder / acct:job /
acct:layer / acct:point" through "5. Then transactionally INSERT proc in
Postgres (outside Lua)") is unlabeled and triggers MD040; add a language label
to the opening triple-backtick (e.g., ```text) to satisfy the linter and
preserve formatting so the block is treated as a code/preformatted block.
In `@docs/_docs/getting-started/deploying-scheduler.md`:
- Around line 262-265: The docs contain inconsistent cueadmin usage: keep the
correct form "cueadmin -scheduler-managed <SHOW> ON|OFF" and replace any
incorrect examples using "-show <name> -setSchedulerManaged true|false" (look
for occurrences of the flags -setSchedulerManaged and -show in docs, e.g.,
developer-guide scheduler and redis-accounting sections); update those examples
to use the exact flag and value format "-scheduler-managed <SHOW> ON" or "OFF"
so all docs match the working CLI syntax.
In `@pycue/opencue/wrappers/show.py`:
- Around line 223-235: setSchedulerManaged currently calls
stub.SetSchedulerManaged but doesn't update the wrapper's local state, so
subsequent schedulerManaged() reads can be stale; after calling
SetSchedulerManaged (in setSchedulerManaged), assign the returned value to
self.data.scheduler_managed (or set self.data.scheduler_managed = value if the
RPC returns no body) using the response from stub.SetSchedulerManaged to
synchronize the wrapper, ensuring schedulerManaged() reflects the new state.
In `@rust/crates/scheduler/src/accounting/dao.rs`:
- Around line 146-157: The query_managed_show_ids function currently maps DB
rows using parse_uuid which unwraps and will panic on malformed UUIDs; change
the handling to use a fallible parse instead: update parse_uuid (in
dao/helpers.rs) to return Result<Uuid, Report> or provide a new try_parse_uuid
that calls Uuid::parse_str and returns an Err on failure, then modify
query_managed_show_ids (and other callers) to collect a Result<Vec<Uuid>, _> or
filter/log and skip invalid rows rather than calling unwrap—propagate the error
with wrap_err for diagnostics or log and skip malformed rows so a single corrupt
DB value cannot crash the scheduler.
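The two handling options named in the comment above (propagate the error, or log and skip the bad row) can be sketched as follows. This is a minimal illustration, not the scheduler's code: `try_parse_id` is a hypothetical stand-in for `Uuid::parse_str` so that the example needs no external crate, and `parse_all` / `parse_valid` are assumed names.

```rust
/// Hypothetical stand-in for `Uuid::parse_str`: accepts 32 hex digits with
/// optional hyphens and returns the value as a u128.
pub fn try_parse_id(raw: &str) -> Result<u128, String> {
    let hex: String = raw.chars().filter(|c| *c != '-').collect();
    if hex.len() != 32 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return Err(format!("malformed id {raw:?}: expected 32 hex digits"));
    }
    u128::from_str_radix(&hex, 16).map_err(|e| e.to_string())
}

/// Option 1: propagate. The first malformed row turns the whole result into
/// an error that the caller can wrap with context.
pub fn parse_all(rows: &[&str]) -> Result<Vec<u128>, String> {
    rows.iter().map(|r| try_parse_id(r)).collect()
}

/// Option 2: log and skip, so a single corrupt value cannot stop the rest.
pub fn parse_valid(rows: &[&str]) -> Vec<u128> {
    rows.iter()
        .filter_map(|r| match try_parse_id(r) {
            Ok(id) => Some(id),
            Err(err) => {
                eprintln!("skipping row: {err}");
                None
            }
        })
        .collect()
}
```

Which option fits depends on whether a partial list of managed shows is acceptable to callers; the review comment leaves that choice open.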
In `@rust/crates/scheduler/src/accounting/recompute.rs`:
- Around line 46-52: The current recompute task exits permanently if
ResourceAccountingDao::new().await returns Err, causing a long-lived outage;
change the initialization to retry instead of returning: wrap the call to
ResourceAccountingDao::new().await (the code creating pg_dao/Arc::new(d)) in a
retry loop with exponential backoff (or fixed delay) and error logging, continue
retrying until success, and only proceed to create pg_dao and enter the
recompute loop after a successful Result; ensure any temporary failures do not
cause an early return from the recompute task.
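A retry loop of the kind described above might look like the sketch below. It is synchronous and uses `std::thread::sleep` only to stay self-contained; the actual task is async, so a real fix would presumably await `tokio::time::sleep` instead. The function name `retry_until_ok` and its parameters are assumptions for illustration.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `init` until it succeeds, doubling the delay after each failure up
/// to `max_delay`. Returns the value and the number of failed attempts.
pub fn retry_until_ok<T, E, F>(
    mut init: F,
    initial_delay: Duration,
    max_delay: Duration,
) -> (T, u32)
where
    F: FnMut() -> Result<T, E>,
    E: std::fmt::Display,
{
    let mut delay = initial_delay;
    let mut failures = 0;
    loop {
        match init() {
            Ok(value) => return (value, failures),
            Err(err) => {
                failures += 1;
                // Log and keep going instead of returning from the task.
                eprintln!("init failed (attempt {failures}): {err}; retrying in {delay:?}");
                sleep(delay);
                // Exponential backoff, capped so waits stay bounded.
                delay = (delay * 2).min(max_delay);
            }
        }
    }
}
```

The cap on the delay keeps the task responsive once the database comes back, at the cost of continued polling during a long outage.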
In `@rust/crates/scheduler/src/accounting/redis_client.rs`:
- Around line 142-144: Update the unit wording in the Redis client docs: change
"centicores" to "cores" in the doc comment that begins "Reads the subscription
hash's booked cores + burst in one round-trip..." and the other similar doc
comment around the subsequent block (the occurrences currently at the same doc
block and the one at lines ~161-162). Ensure both doc comments and any inline
mention in redis_client.rs consistently state that Redis counters operate in
cores (not centicores) so the Lua contract and callers are not misled.
- Line 155: The call currently uses conn.hget(&key, &["int_cores",
"burst"]).await? which is a single-field read API but is passing multiple
fields; change this to the multi-field API conn.hmget(&key, &["int_cores",
"burst"]).await? (keeping the expected type Vec<Option<i64>> and the surrounding
error handling) so the Redis client sends HMGET for the fields "int_cores" and
"burst" and maps into values as before.
In `@rust/crates/scheduler/src/cluster.rs`:
- Around line 418-423: The code can panic when current_index_atomic contains an
out-of-range value after self.clusters shrinks; before indexing clusters use the
current clusters length to clamp the index (and handle empty clusters).
Specifically, in the block around current_index_atomic.load, get let len =
clusters.len(); if len == 0 return/skip; compute a safe_index = current_index %
len (or safe_index = min(current_index, len-1)), then read item =
clusters[safe_index].clone(); compute next_index = (safe_index + 1) % len and
store that back into current_index_atomic; this prevents indexing with stale,
too-large indices while preserving wrap-around detection and forward progress.
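The clamping steps above can be sketched in isolation. The function `next_round_robin` and its signature are hypothetical; the real code in `cluster.rs` may hold a lock around `clusters`, which this sketch does not model, and the separate load and store here are not atomic as a pair.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Returns the next item in round-robin order, tolerating a stored index
/// that is stale because `items` shrank since the last call.
pub fn next_round_robin<T: Clone>(items: &[T], cursor: &AtomicUsize) -> Option<T> {
    let len = items.len();
    if len == 0 {
        // Nothing to hand out; leave the cursor untouched.
        return None;
    }
    // Wrap a possibly out-of-range index back into bounds before indexing.
    let safe_index = cursor.load(Ordering::Relaxed) % len;
    let item = items[safe_index].clone();
    // Advance with wrap-around so the next call keeps making progress.
    cursor.store((safe_index + 1) % len, Ordering::Relaxed);
    Some(item)
}
```

Using the modulo rather than `min(current_index, len - 1)` spreads stale indices across the list instead of always landing on the last element.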
In `@rust/crates/scheduler/src/models/core_size.rs`:
- Around line 68-71: The CoreSize conversion functions
(CoreSize::from_multiplied and CoreSize::from_multiplied_cap) perform unchecked
i64 -> i32 narrowing using `as i32`, which can silently wrap; change them to use
a checked conversion (e.g., try_into / i32::try_from) or explicit saturation
with bounds checking against i32::MIN/i32::MAX and handle failures by returning
a Result or clamping to safe limits. Update the functions that reference
CONFIG.queue.core_multiplier to compute the intermediate i64, check for
overflow/underflow before converting, and either return an error (propagate
Result) or clamp to i32 bounds and document the behavior so no silent wrapping
occurs.
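Both policies mentioned above (fail on overflow, or clamp) are short to express with the standard library. The sketch below uses free functions with assumed names and takes the multiplier as a parameter; it does not reproduce the real `CoreSize` methods or `CONFIG.queue.core_multiplier`.

```rust
use std::convert::TryFrom;

/// Multiplies a core count and narrows to i32, returning `None` instead of
/// wrapping when the product does not fit.
pub fn multiplied_checked(cores: i64, multiplier: i64) -> Option<i32> {
    let product = cores.checked_mul(multiplier)?;
    i32::try_from(product).ok()
}

/// Same computation, but clamps to the i32 range (saturating policy).
pub fn multiplied_saturating(cores: i64, multiplier: i64) -> i32 {
    let product = cores.saturating_mul(multiplier);
    // After the clamp the value is guaranteed to fit, so the cast is lossless.
    product.clamp(i64::from(i32::MIN), i64::from(i32::MAX)) as i32
}
```

For comparison, an unchecked `(i64::from(i32::MAX) + 1) as i32` evaluates to `i32::MIN`, which is the silent wrap the comment warns about.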
In `@rust/crates/scheduler/src/pipeline/dispatcher/actor.rs`:
- Around line 482-491: BookingDelta is being built with DispatchLayer.folder_id
and dept_id that may be Uuid::nil(), which would produce incorrect
acct:folder:{folder_id} and acct:point:{dept_id}:{show_id} Redis keys; add an
explicit nil-UUID guard before constructing BookingDelta (best place: inside
dispatch_virtual_proc right before creating BookingDelta or at the callsite in
dispatch_inner/consume_host_virtual_resources) that checks
DispatchLayer.folder_id.is_nil() and DispatchLayer.dept_id.is_nil() and
rejects/returns an Err (or skips booking) with a logged error; ensure the check
references the DispatchLayer fields and prevents creating BookingDelta when
either UUID is nil so no acct:folder or acct:point keys are written for the zero
UUID.
---
Nitpick comments:
In `@pycue/tests/wrappers/test_show.py`:
- Around line 191-201: The test only asserts the RPC for enabling the scheduler;
update testSetSchedulerManaged to exercise both toggle directions by invoking
show.setSchedulerManaged(True) and show.setSchedulerManaged(False) (or call once
and then again) and assert the stubMock.SetSchedulerManaged was called for both
cases using either two assert_called_with checks referencing
show_pb2.ShowSetSchedulerManagedRequest(show=show.data, enabled=True) and
...enabled=False) or a single assert_has_calls with both expected requests;
target symbols: testSetSchedulerManaged, show.setSchedulerManaged,
stubMock.SetSchedulerManaged, and ShowSetSchedulerManagedRequest.
In `@rust/crates/scheduler/src/config/mod.rs`:
- Around line 101-105: The RedisConfig::url() method currently returns
"redis://host:port/" with no auth support; update RedisConfig (struct) to
include optional credentials (e.g., username: Option<String>, password:
Option<String>) or an optional raw_connection_string: Option<String>, and modify
RedisConfig::url() to: if raw_connection_string is Some return it; otherwise
build the URL including "redis://username:password@host:port/" when credentials
are present (omit username if only password needed per Redis URI rules) and
percent-encode credentials as necessary to avoid malformed URLs. Ensure you
reference the RedisConfig struct and the url() method so callers keep backwards
compatibility by keeping fields optional.
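One possible shape for the optional-credentials variant is sketched below. The struct fields and the hand-rolled `percent_encode` helper are assumptions made to keep the example dependency-free; the real `RedisConfig` may differ, and a URL-encoding crate would normally replace the helper. IPv6 hosts and the raw connection string override are not handled.

```rust
/// Hypothetical config shape; field names are assumptions.
pub struct RedisConfig {
    pub host: String,
    pub port: u16,
    pub username: Option<String>,
    pub password: Option<String>,
}

/// Percent-encodes every byte outside the RFC 3986 unreserved set.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{byte:02X}")),
        }
    }
    out
}

impl RedisConfig {
    pub fn url(&self) -> String {
        let auth = match (&self.username, &self.password) {
            (Some(user), Some(pass)) => {
                format!("{}:{}@", percent_encode(user), percent_encode(pass))
            }
            // Password-only form: empty username before the colon.
            (None, Some(pass)) => format!(":{}@", percent_encode(pass)),
            // Without a password this sketch emits no userinfo at all.
            (_, None) => String::new(),
        };
        format!("redis://{}{}:{}/", auth, self.host, self.port)
    }
}
```

With both credential fields left as `None`, the output is the same `redis://host:port/` string as before, which is what keeps existing callers working.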
In `@rust/crates/scheduler/src/dao/resource_accounting_dao.rs`:
- Around line 149-152: Remove the redundant p.pk_show = ANY($1) predicate from
the LEFT JOIN on proc (alias p) in the point recompute SQL: the join already
enforces p.pk_show = pt2.pk_show and the WHERE clause pt2.pk_show = ANY($1)
filters pt2, so delete the p.pk_show = ANY($1) condition in the LEFT JOIN to
simplify the query; update the SQL string in resource_accounting_dao.rs where
the LEFT JOIN proc p ... AND p.pk_show = ANY($1) appears and ensure spacing/AND
chaining remains valid after removal.
In `@rust/crates/scheduler/src/pipeline/matcher.rs`:
- Around line 306-312: The code currently ignores the result of
feed_sender.send(FeedMessage::Sleep((*cluster).clone(),
CONFIG.queue.cluster_empty_sleep)).await; change this to handle the Result
explicitly: if send returns Ok proceed as before, but if Err (channel closed)
log the failure (include context like cluster and that FeedMessage::Sleep
failed) and apply a local backoff before returning (e.g., await
tokio::time::sleep(Duration::from_millis(CONFIG.queue.cluster_empty_sleep)) or
equivalent) so the over-burst path does not immediately return without backoff;
update matcher.rs around the feed_sender send call to match on the Result and
handle Err accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7a7aa165-b9e9-4987-a82f-97ec0dd9c477
📒 Files selected for processing (82)
- .gitignore
- VERSION.in
- cueadmin/cueadmin/common.py
- cueadmin/tests/test_common.py
- cuebot/build.gradle
- cuebot/src/main/java/com/imageworks/spcue/ShowEntity.java
- cuebot/src/main/java/com/imageworks/spcue/VirtualProc.java
- cuebot/src/main/java/com/imageworks/spcue/config/AppConfig.java
- cuebot/src/main/java/com/imageworks/spcue/dao/ShowDao.java
- cuebot/src/main/java/com/imageworks/spcue/dao/postgres/DispatchQuery.java
- cuebot/src/main/java/com/imageworks/spcue/dao/postgres/DispatcherDaoJdbc.java
- cuebot/src/main/java/com/imageworks/spcue/dao/postgres/ProcDaoJdbc.java
- cuebot/src/main/java/com/imageworks/spcue/dao/postgres/ShowDaoJdbc.java
- cuebot/src/main/java/com/imageworks/spcue/dao/postgres/WhiteboardDaoJdbc.java
- cuebot/src/main/java/com/imageworks/spcue/servant/ManageShow.java
- cuebot/src/main/java/com/imageworks/spcue/service/AccountingRedisPublisher.java
- cuebot/src/main/java/com/imageworks/spcue/service/LettuceAccountingRedisPublisher.java
- cuebot/src/main/resources/conf/ddl/postgres/migrations/V44__scheduler_pending_query_indexes.sql
- cuebot/src/main/resources/conf/ddl/postgres/migrations/V45__show_scheduler_managed.sql
- cuebot/src/main/resources/conf/spring/applicationContext-accounting.xml
- cuebot/src/main/resources/opencue.properties
- cuebot/src/test/java/com/imageworks/spcue/config/TestAppConfig.java
- cuebot/src/test/java/com/imageworks/spcue/test/dao/postgres/DispatcherDaoTests.java
- cuebot/src/test/java/com/imageworks/spcue/test/dao/postgres/ProcDaoTests.java
- cuebot/src/test/java/com/imageworks/spcue/test/dao/postgres/ShowDaoTests.java
- cuebot/src/test/java/com/imageworks/spcue/test/servant/ManageShowTests.java
- cuebot/src/test/java/com/imageworks/spcue/test/service/LettuceAccountingRedisPublisherTests.java
- docs/_docs/developer-guide/redis-accounting.md
- docs/_docs/developer-guide/scheduler.md
- docs/_docs/getting-started/deploying-scheduler.md
- docs/news/2025-12-12-distributed-scheduler-release.md
- proto/src/show.proto
- pycue/opencue/wrappers/show.py
- pycue/tests/wrappers/test_show.py
- rust/config/scheduler.yaml
- rust/crates/scheduler/Cargo.toml
- rust/crates/scheduler/src/accounting/booking_delta.rs
- rust/crates/scheduler/src/accounting/bootstrap.rs
- rust/crates/scheduler/src/accounting/dao.rs
- rust/crates/scheduler/src/accounting/error.rs
- rust/crates/scheduler/src/accounting/limit_reseed.rs
- rust/crates/scheduler/src/accounting/lua.rs
- rust/crates/scheduler/src/accounting/managed_shows.rs
- rust/crates/scheduler/src/accounting/mod.rs
- rust/crates/scheduler/src/accounting/recompute.rs
- rust/crates/scheduler/src/accounting/redis_client.rs
- rust/crates/scheduler/src/cluster.rs
- rust/crates/scheduler/src/cluster_key.rs
- rust/crates/scheduler/src/config/mod.rs
- rust/crates/scheduler/src/dao/cluster_dao.rs
- rust/crates/scheduler/src/dao/frame_dao.rs
- rust/crates/scheduler/src/dao/host_dao.rs
- rust/crates/scheduler/src/dao/job_dao.rs
- rust/crates/scheduler/src/dao/layer_dao.rs
- rust/crates/scheduler/src/dao/mod.rs
- rust/crates/scheduler/src/dao/proc_dao.rs
- rust/crates/scheduler/src/dao/resource_accounting_dao.rs
- rust/crates/scheduler/src/host_cache/actor.rs
- rust/crates/scheduler/src/host_cache/cache.rs
- rust/crates/scheduler/src/host_cache/messages.rs
- rust/crates/scheduler/src/host_cache/mod.rs
- rust/crates/scheduler/src/lib.rs
- rust/crates/scheduler/src/main.rs
- rust/crates/scheduler/src/metrics/mod.rs
- rust/crates/scheduler/src/models/core_size.rs
- rust/crates/scheduler/src/models/frame.rs
- rust/crates/scheduler/src/models/layer.rs
- rust/crates/scheduler/src/models/mod.rs
- rust/crates/scheduler/src/models/subscription.rs
- rust/crates/scheduler/src/models/virtual_proc.rs
- rust/crates/scheduler/src/pipeline/dispatcher/actor.rs
- rust/crates/scheduler/src/pipeline/dispatcher/error.rs
- rust/crates/scheduler/src/pipeline/dispatcher/mod.rs
- rust/crates/scheduler/src/pipeline/entrypoint.rs
- rust/crates/scheduler/src/pipeline/layer_permit.rs
- rust/crates/scheduler/src/pipeline/matcher.rs
- rust/crates/scheduler/src/pipeline/mod.rs
- rust/crates/scheduler/src/pipeline/placement.rs
- rust/crates/scheduler/src/resource_accounting.rs
- rust/crates/scheduler/tests/redis_integration.rs
- rust/crates/scheduler/tests/smoke_tests.rs
- rust/crates/scheduler/tests/util.rs
💤 Files with no reviewable changes (5)
- rust/crates/scheduler/src/resource_accounting.rs
- rust/crates/scheduler/src/models/mod.rs
- rust/crates/scheduler/src/models/subscription.rs
- rust/crates/scheduler/src/dao/mod.rs
- cuebot/src/main/java/com/imageworks/spcue/dao/postgres/DispatcherDaoJdbc.java
cuebot/build.gradle, lines 41-44:
implementation group: 'io.lettuce', name: 'lettuce-core', version: '6.3.2.RELEASE'
// Lettuce 6.3.x requires reactor-core 3.6.x (for reactor.core.publisher.Sinks).
// Spring Boot 2.2.1's dependency management would otherwise pin this to 3.3.x.
implementation group: 'io.projectreactor', name: 'reactor-core', version: '3.6.5'
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check for security advisories on the added dependencies
echo "=== Checking Lettuce security advisories ==="
gh api graphql -f query='
{
securityVulnerabilities(first: 5, ecosystem: MAVEN, package: "io.lettuce:lettuce-core") {
nodes {
advisory {
summary
severity
publishedAt
}
vulnerableVersionRange
firstPatchedVersion {
identifier
}
}
}
}'
echo ""
echo "=== Checking Reactor Core security advisories ==="
gh api graphql -f query='
{
securityVulnerabilities(first: 5, ecosystem: MAVEN, package: "io.projectreactor:reactor-core") {
nodes {
advisory {
summary
severity
publishedAt
}
vulnerableVersionRange
firstPatchedVersion {
identifier
}
}
}
}'
echo ""
echo "=== Checking latest Maven Central versions ==="
curl -s "https://search.maven.org/solrsearch/select?q=g:io.lettuce+AND+a:lettuce-core&rows=1&wt=json" | jq '.response.docs[0].latestVersion'
curl -s "https://search.maven.org/solrsearch/select?q=g:io.projectreactor+AND+a:reactor-core&rows=1&wt=json" | jq '.response.docs[0].latestVersion'
Repository: AcademySoftwareFoundation/OpenCue
Length of output: 593
Update Lettuce dependency due to known vulnerability
io.lettuce:lettuce-core:6.3.2.RELEASE falls under a known MODERATE advisory range (< 6.5.1.RELEASE: “Withdrawn Advisory: Netty vulnerability included in redis lettuce”); upgrade to io.lettuce:lettuce-core:6.5.1.RELEASE or newer (latest on Maven Central: 6.7.1.RELEASE) and keep reactor-core on the required 3.6.x line for Lettuce compatibility. io.projectreactor:reactor-core:3.6.5 did not return any advisories in the checked results.
Add an opt-in `Epvm` host-booking strategy alongside the existing `Saturation` strategy (which stays the default). When `Epvm` is enabled, the host cache scores candidate hosts using E-PVM stranding and picks the lowest-scoring host within a configurable scan cap, replacing today's first-fit-by-saturation iteration for the same call.
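The difference between the two strategies can be illustrated with a small sketch. Everything here is an assumption for illustration: the `Host` struct, the `pick_host` function, and the convention that the gate returns `None` for an unusable host and `Some(score)` otherwise. The E-PVM stranding score itself is not specified in this description, so the gate is left as a caller-supplied closure, and the scan cap is interpreted as a limit on candidates inspected.

```rust
/// Hypothetical host summary; the scheduler's real host type is richer.
#[derive(Debug, Clone, PartialEq)]
pub struct Host {
    pub name: String,
    pub idle_cores: u32,
}

/// The two strategies named in the PR description.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Strategy {
    Saturation,
    Epvm,
}

/// Picks the index of a host from `candidates`.
///
/// `gate` returns `None` when a host cannot take the work and `Some(score)`
/// otherwise (lower is better). `scan_cap` bounds how many candidates the
/// `Epvm` branch inspects.
pub fn pick_host<F>(
    candidates: &[Host],
    strategy: Strategy,
    scan_cap: usize,
    gate: F,
) -> Option<usize>
where
    F: Fn(&Host) -> Option<f64>,
{
    match strategy {
        // First fit: stop at the first host the gate accepts.
        Strategy::Saturation => candidates.iter().position(|h| gate(h).is_some()),
        // Lowest score among the first `scan_cap` candidates.
        Strategy::Epvm => candidates
            .iter()
            .enumerate()
            .take(scan_cap)
            .filter_map(|(i, h)| gate(h).map(|score| (i, score)))
            .min_by(|a, b| a.1.total_cmp(&b.1))
            .map(|(i, _)| i),
    }
}
```

With a toy gate that scores a host by its leftover idle cores, `Saturation` returns the first host that fits, while `Epvm` returns the host that would leave the fewest cores stranded within the cap.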
This review was done before the branch had been properly rebased onto master, so it contains many comments unrelated to the actual work.
Signed-off-by: Diego Tavares <dtavares@imageworks.com>
Add a new scheduling mode to the scheduler module that implements E-PVM, inspired by Aghiles' scheduler.
Summary by CodeRabbit
Release Notes
New Features
- `cueadmin -scheduler-managed <show> on|off` command
- `scheduler_managed` property to shows via gRPC API and Python wrappers
Configuration
- (`accounting.redis.enabled`, `accounting.redis.host`, `accounting.redis.port`)
Documentation
Version