diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8a03632..8493171 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 
+- Performance analytics — a new Performance page (`/jobs/performance`) shows per-job-class statistics derived from the history table: run count, average duration, p50, p95, min, and max; rows are sorted by p95 descending so the slowest classes appear first; a period filter (1h / 24h / 7d / All) scopes the dataset; each class name links to the History page pre-filtered to that class; business logic lives in a `JobPerformanceStats` service using a single pluck query with Ruby-side aggregation for DB-agnostic percentile computation
 - Metrics / health endpoint — `GET /jobs/metrics.json` returns a JSON document with job counts (`ready`, `scheduled`, `claimed`, `blocked`, `failed`), throughput (`completed_1h`, `completed_24h`), per-queue depth and pause state, and process health (`total`, `healthy`, `stale`, `by_kind`); when `slow_job_threshold` is configured, a `slow_jobs` count is also included; the endpoint goes through the same authentication and `connects_to` middleware as all other routes
 - Recurring task "Run Now" — a "Run Now" button on the Recurring Tasks page triggers `task.enqueue(at: Time.current)` to enqueue the job immediately without waiting for its next scheduled run; SolidQueue's `RecurringExecution` deduplication prevents double-enqueuing
 - Read replica support — when `connects_to` is set to `{ reading: , writing: }`, the engine automatically routes GET requests to the reading role and mutating requests (POST/DELETE/PATCH) to the writing role via `ActiveRecord::Base.connected_to(role:)`; passing any other hash (e.g. `{ role: :writing }`, `{ shard: :name }`) falls through to `connected_to` directly; defaults to `nil` so single-database setups are unaffected
diff --git a/README.md b/README.md
index 036b33b..1a2f8ae 100644
--- a/README.md
+++ b/README.md
@@ -50,6 +50,7 @@ SolidQueueWeb surfaces all of this in a browser UI available at any route you ch
 - **CSV export** — "Export CSV" button on the jobs, failed jobs, and history pages downloads all records matching the current filters; columns are tailored per view
 - **Slow job detection** — when `slow_job_threshold` is configured, claimed jobs running longer than the threshold are flagged with an orange row, a "slow" badge, and a "Running For" duration column on the Running tab; a "Slow Jobs" warning card appears on the dashboard with a link to the Running tab
 - **Webhook alerts** — set `alert_webhook_url` and `alert_failure_threshold` to receive a POST request whenever the failed job count meets or exceeds the threshold; fires asynchronously so dashboard performance is unaffected; a configurable cooldown (default 1 h) prevents repeated alerts while the count stays elevated
+- **Performance analytics** — per-job-class statistics at `/jobs/performance` showing run count, average, p50, p95, min, and max duration; sorted by p95 descending so the slowest classes surface first; period filter scopes to 1h / 24h / 7d or all time; each class name links to the filtered History view
 - **Metrics / health endpoint** — `GET /jobs/metrics.json` returns a machine-readable JSON document with job counts, throughput, per-queue depth and pause state, and process health summary; suitable for Prometheus scraping, uptime monitors, or external dashboards; `slow_jobs` count included when `slow_job_threshold` is configured
 
 ## Screenshots
@@ -212,7 +213,6 @@ Planned features, roughly ordered by priority:
 - Bulk scheduled job actions — "Run All Now" button on the Scheduled tab, mirroring the "Retry All" pattern on the Failed Jobs page
 
 **Observability**
-- Performance analytics — average and percentile (p50/p95) duration per job class derived from the history table; surfaces slow job types before they become a problem
 - Priority filter — filter and sort the jobs list by Solid Queue job priority
 
 **Notifications**
diff --git a/app/controllers/solid_queue_web/performance_controller.rb b/app/controllers/solid_queue_web/performance_controller.rb
new file mode 100644
index 0000000..7d88eed
--- /dev/null
+++ b/app/controllers/solid_queue_web/performance_controller.rb
@@ -0,0 +1,12 @@
+module SolidQueueWeb
+  class PerformanceController < ApplicationController
+    def index
+      @period = params[:period].presence_in(PERIOD_DURATIONS.keys)
+
+      scope = SolidQueue::Job.where.not(finished_at: nil)
+      scope = scope.where("finished_at >= ?", PERIOD_DURATIONS[@period].ago) if @period.present?
+
+      @rows = JobPerformanceStats.new(scope).rows
+    end
+  end
+end
diff --git a/app/services/solid_queue_web/job_performance_stats.rb b/app/services/solid_queue_web/job_performance_stats.rb
new file mode 100644
index 0000000..b880119
--- /dev/null
+++ b/app/services/solid_queue_web/job_performance_stats.rb
@@ -0,0 +1,38 @@
+module SolidQueueWeb
+  class JobPerformanceStats
+    Row = Struct.new(:class_name, :count, :avg, :p50, :p95, :min, :max, keyword_init: true)
+
+    def initialize(scope)
+      @scope = scope
+    end
+
+    def rows
+      grouped = @scope.pluck(:class_name, :created_at, :finished_at)
+                      .group_by(&:first)
+
+      grouped.map do |class_name, records|
+        durations = records.map { |_, created, finished| (finished - created).to_f }.sort
+        Row.new(
+          class_name: class_name,
+          count: durations.size,
+          avg: mean(durations),
+          p50: percentile(durations, 50),
+          p95: percentile(durations, 95),
+          min: durations.first,
+          max: durations.last
+        )
+      end.sort_by { |r| -r.p95 }
+    end
+
+    private
+
+    def mean(sorted)
+      sorted.sum / sorted.size
+    end
+
+    def percentile(sorted, pct)
+      idx = [(pct / 100.0 * sorted.size).ceil - 1, 0].max
+      sorted[idx]
+    end
+  end
+end
diff --git a/app/views/layouts/solid_queue_web/application.html.erb b/app/views/layouts/solid_queue_web/application.html.erb
index 8e3142b..4c1d881 100644
--- a/app/views/layouts/solid_queue_web/application.html.erb
+++ b/app/views/layouts/solid_queue_web/application.html.erb
@@ -21,6 +21,7 @@
 <%= link_to "Queues", queues_path, class: current_page?(queues_path) ? "active" : "", aria: { current: current_page?(queues_path) ? "page" : nil } %>
 <%= link_to "Jobs", jobs_path, class: current_page?(jobs_path) ? "active" : "", aria: { current: current_page?(jobs_path) ? "page" : nil } %>
 <%= link_to "History", history_path, class: current_page?(history_path) ? "active" : "", aria: { current: current_page?(history_path) ? "page" : nil } %>
+<%= link_to "Performance", performance_path, class: current_page?(performance_path) ? "active" : "", aria: { current: current_page?(performance_path) ? "page" : nil } %>
 <%= link_to "Failed", failed_jobs_path, class: current_page?(failed_jobs_path) ? "active" : "", aria: { current: current_page?(failed_jobs_path) ? "page" : nil } %>
 <%= link_to "Recurring", recurring_tasks_path, class: current_page?(recurring_tasks_path) ? "active" : "", aria: { current: current_page?(recurring_tasks_path) ? "page" : nil } %>
 <%= link_to "Processes", processes_path, class: current_page?(processes_path) ? "active" : "", aria: { current: current_page?(processes_path) ? "page" : nil } %>
diff --git a/app/views/solid_queue_web/performance/index.html.erb b/app/views/solid_queue_web/performance/index.html.erb
new file mode 100644
index 0000000..7a12851
--- /dev/null
+++ b/app/views/solid_queue_web/performance/index.html.erb
@@ -0,0 +1,50 @@
+
    +

    Performance

    +
    + + + +<% if @rows.any? %> +
    + + + + + + + + + + + + + + <% @rows.each do |row| %> + + + + + + + + + + <% end %> + +
    Job Class  Runs  Avg  p50  p95  Min  Max
    + <%= link_to row.class_name, history_path(q: row.class_name, period: @period), + class: "sqd-table-link" %> + <%= row.count %><%= format_duration(row.avg) %><%= format_duration(row.p50) %><%= format_duration(row.p95) %><%= format_duration(row.min) %><%= format_duration(row.max) %>
    +
    +<% else %> +
    +
    No finished jobs found<%= " in the last #{@period}" if @period %>.
    +
+<% end %>
\ No newline at end of file
diff --git a/config/routes.rb b/config/routes.rb
index d3cc83b..6af7982 100644
--- a/config/routes.rb
+++ b/config/routes.rb
@@ -3,8 +3,9 @@
   resource :blocked_jobs, only: [:destroy]
 
   get "metrics", to: "metrics#index", as: :metrics, defaults: { format: :json }
-  get "search",  to: "search#index",  as: :search
-  get "history", to: "history#index", as: :history
+  get "search",      to: "search#index",      as: :search
+  get "history",     to: "history#index",     as: :history
+  get "performance", to: "performance#index", as: :performance
 
   resources :recurring_tasks, only: [:index], param: :key do
     resource :run, only: [:create], controller: "recurring_tasks/runs"
diff --git a/spec/requests/solid_queue_web/performance_spec.rb b/spec/requests/solid_queue_web/performance_spec.rb
new file mode 100644
index 0000000..e1b1e8d
--- /dev/null
+++ b/spec/requests/solid_queue_web/performance_spec.rb
@@ -0,0 +1,96 @@
+require "rails_helper"
+
+RSpec.describe "Performance", type: :request do
+  def finished_job(class_name:, duration_seconds:, finished_ago: 1.hour)
+    finished = finished_ago.ago
+    job = SolidQueue::Job.new(
+      queue_name: "default", class_name: class_name,
+      arguments: {}.to_json, priority: 0, active_job_id: SecureRandom.uuid
+    )
+    job.finished_at = finished
+    job.created_at = finished - duration_seconds
+    job.updated_at = finished
+    job.save!(validate: false)
+    job
+  end
+
+  describe "GET /jobs/performance" do
+    it "returns HTTP success" do
+      get "/jobs/performance"
+      expect(response).to have_http_status(:ok)
+    end
+
+    it "displays the Performance heading" do
+      get "/jobs/performance"
+      expect(response.body).to include("Performance")
+    end
+
+    it "shows an empty state when no finished jobs exist" do
+      get "/jobs/performance"
+      expect(response.body).to include("No finished jobs found")
+    end
+
+    it "renders a row for each distinct job class" do
+      finished_job(class_name: "AlphaJob", duration_seconds: 10)
+      finished_job(class_name: "BetaJob", duration_seconds: 20)
+
+      get "/jobs/performance"
+      expect(response.body).to include("AlphaJob")
+      expect(response.body).to include("BetaJob")
+    end
+
+    it "links each job class to the history page filtered by that class" do
+      finished_job(class_name: "AlphaJob", duration_seconds: 10)
+
+      get "/jobs/performance"
+      expect(response.body).to include("/jobs/history?q=AlphaJob")
+    end
+
+    it "renders period filter pills" do
+      get "/jobs/performance"
+      expect(response.body).to include("1h")
+      expect(response.body).to include("24h")
+      expect(response.body).to include("7d")
+    end
+
+    it "filters results to the selected period" do
+      finished_job(class_name: "RecentJob", duration_seconds: 5, finished_ago: 30.minutes)
+      finished_job(class_name: "OldJob", duration_seconds: 5, finished_ago: 48.hours)
+
+      get "/jobs/performance", params: { period: "24h" }
+      expect(response.body).to include("RecentJob")
+      expect(response.body).not_to include("OldJob")
+    end
+
+    it "shows empty state message with period when no jobs match the filter" do
+      get "/jobs/performance", params: { period: "1h" }
+      expect(response.body).to include("No finished jobs found in the last 1h")
+    end
+
+    it "sorts rows by p95 descending (slowest class first)" do
+      finished_job(class_name: "FastJob", duration_seconds: 2)
+      finished_job(class_name: "SlowJob", duration_seconds: 120)
+
+      get "/jobs/performance"
+      slow_pos = response.body.index("SlowJob")
+      fast_pos = response.body.index("FastJob")
+      expect(slow_pos).to be < fast_pos
+    end
+
+    describe "authentication" do
+      after { SolidQueueWeb.instance_variable_set(:@authenticate, nil) }
+
+      it "allows access when auth block returns truthy" do
+        SolidQueueWeb.authenticate { true }
+        get "/jobs/performance"
+        expect(response).to have_http_status(:ok)
+      end
+
+      it "returns 401 when auth block returns falsy" do
+        SolidQueueWeb.authenticate { false }
+        get "/jobs/performance"
+        expect(response).to have_http_status(:unauthorized)
+      end
+    end
+  end
+end
diff --git a/spec/services/solid_queue_web/job_performance_stats_spec.rb b/spec/services/solid_queue_web/job_performance_stats_spec.rb
new file mode 100644
index 0000000..299116e
--- /dev/null
+++ b/spec/services/solid_queue_web/job_performance_stats_spec.rb
@@ -0,0 +1,83 @@
+require "rails_helper"
+
+RSpec.describe SolidQueueWeb::JobPerformanceStats do
+  def finished_job(class_name:, duration_seconds:, finished_ago: 1.hour)
+    finished = finished_ago.ago
+    job = SolidQueue::Job.new(
+      queue_name: "default", class_name: class_name,
+      arguments: {}.to_json, priority: 0, active_job_id: SecureRandom.uuid
+    )
+    job.finished_at = finished
+    job.created_at = finished - duration_seconds
+    job.updated_at = finished
+    job.save!(validate: false)
+    job
+  end
+
+  let(:scope) { SolidQueue::Job.where.not(finished_at: nil) }
+
+  describe "#rows" do
+    it "returns an empty array when no finished jobs exist" do
+      expect(described_class.new(scope).rows).to be_empty
+    end
+
+    it "returns one row per distinct job class" do
+      finished_job(class_name: "AlphaJob", duration_seconds: 10)
+      finished_job(class_name: "BetaJob", duration_seconds: 20)
+
+      rows = described_class.new(scope).rows
+      expect(rows.map(&:class_name)).to match_array(%w[AlphaJob BetaJob])
+    end
+
+    it "computes count correctly" do
+      3.times { finished_job(class_name: "RepeatedJob", duration_seconds: 10) }
+
+      row = described_class.new(scope).rows.find { |r| r.class_name == "RepeatedJob" }
+      expect(row.count).to eq(3)
+    end
+
+    it "computes avg, min, and max correctly" do
+      finished_job(class_name: "MathJob", duration_seconds: 10)
+      finished_job(class_name: "MathJob", duration_seconds: 20)
+      finished_job(class_name: "MathJob", duration_seconds: 30)
+
+      row = described_class.new(scope).rows.find { |r| r.class_name == "MathJob" }
+      expect(row.avg).to be_within(0.5).of(20)
+      expect(row.min).to be_within(0.5).of(10)
+      expect(row.max).to be_within(0.5).of(30)
+    end
+
+    it "computes p50 as the median" do
+      [10, 20, 30, 40, 50].each { |d| finished_job(class_name: "P50Job", duration_seconds: d) }
+
+      row = described_class.new(scope).rows.find { |r| r.class_name == "P50Job" }
+      expect(row.p50).to be_within(0.5).of(30)
+    end
+
+    it "computes p95 near the high end of the distribution" do
+      20.times { |i| finished_job(class_name: "P95Job", duration_seconds: i + 1) }
+
+      row = described_class.new(scope).rows.find { |r| r.class_name == "P95Job" }
+      expect(row.p95).to be_within(1).of(19)
+    end
+
+    it "sorts rows by p95 descending" do
+      finished_job(class_name: "FastJob", duration_seconds: 2)
+      finished_job(class_name: "SlowJob", duration_seconds: 120)
+
+      rows = described_class.new(scope).rows
+      expect(rows.first.class_name).to eq("SlowJob")
+      expect(rows.last.class_name).to eq("FastJob")
+    end
+
+    it "respects a pre-filtered scope" do
+      finished_job(class_name: "InScopeJob", duration_seconds: 10, finished_ago: 30.minutes)
+      finished_job(class_name: "OutScopeJob", duration_seconds: 10, finished_ago: 48.hours)
+
+      filtered = scope.where("finished_at >= ?", 1.hour.ago)
+      rows = described_class.new(filtered).rows
+      expect(rows.map(&:class_name)).to include("InScopeJob")
+      expect(rows.map(&:class_name)).not_to include("OutScopeJob")
+    end
+  end
+end
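
Note on the `percentile` helper in `JobPerformanceStats`: as written it appears to follow the nearest-rank method, meaning it returns an actual observed duration and never interpolates between two samples. The standalone sketch below restates that index arithmetic outside Rails so the edge cases are easier to see. The method name `nearest_rank_percentile` and the sample arrays are hypothetical and used only for illustration; they are not part of this PR.

```ruby
# Nearest-rank percentile over an ascending-sorted, non-empty array.
# Same index arithmetic as the private `percentile` helper in the diff;
# the method name here is hypothetical.
def nearest_rank_percentile(sorted, pct)
  # rank = ceil(pct/100 * n), converted to a 0-based index and clamped at 0
  idx = [(pct / 100.0 * sorted.size).ceil - 1, 0].max
  sorted[idx]
end

five   = [10, 20, 30, 40, 50]
four   = [10, 20, 30, 40]
twenty = (1..20).to_a

p50_of_five   = nearest_rank_percentile(five, 50)    # middle element
p50_of_four   = nearest_rank_percentile(four, 50)    # lower of the two middle elements
p95_of_twenty = nearest_rank_percentile(twenty, 95)  # 19th of 20 samples
p95_of_one    = nearest_rank_percentile([7], 95)     # single sample is every percentile

puts [p50_of_five, p50_of_four, p95_of_twenty, p95_of_one].inspect
```

With an even number of samples the p50 is the lower of the two middle values rather than their mean, so it can differ slightly from a database `PERCENTILE_CONT` result. With a single run per class, p50, p95, min, and max are all the same value.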