diff --git a/CHANGELOG.md b/CHANGELOG.md
index 2582114..7459ebd 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Added
+
+- Error frequency report — `GET /failed_jobs/errors` groups all failed jobs by exception class and message prefix, showing count and an expandable sample backtrace per group; links through to a filtered failed jobs list via `?error_class=`; the failed jobs index gains an "Error Summary" button and shows an active-filter breadcrumb with a clear link
+
 ## [1.0.0] - 2026-05-27
 
 ### Added
diff --git a/README.md b/README.md
index 202c3fa..ca3dcac 100644
--- a/README.md
+++ b/README.md
@@ -21,23 +21,13 @@ Run:
 bundle install
 ```
 
-Mount the engine in `config/routes.rb`:
-
-```ruby
-mount SolidStackWeb::Engine, at: "/solid_stack"
-```
-
-The dashboard will be available at `/solid_stack` (or whatever path you choose).
-
-### Install generator
-
 Run the install generator to create a documented initializer and wire up the mount point in one step:
 
 ```bash
 rails generate solid_stack_web:install
 ```
 
-This creates `config/initializers/solid_stack_web.rb` with every configuration option commented inline, and injects `mount SolidStackWeb::Engine, at: "/solid_stack"` into `config/routes.rb`.
+This creates `config/initializers/solid_stack_web.rb` with every configuration option commented inline, and injects `mount SolidStackWeb::Engine, at: "/solid_stack"` into `config/routes.rb`. The dashboard will then be available at `/solid_stack` (or whatever path you choose).
 
 ---
 
@@ -77,7 +67,7 @@ This creates `config/initializers/solid_stack_web.rb` with every configuration o
 
 ## General configuration
 
-Create an initializer at `config/initializers/solid_stack_web.rb`:
+The install generator creates `config/initializers/solid_stack_web.rb` with all options documented inline. The available options are:
 
 ```ruby
 SolidStackWeb.configure do |config|
@@ -159,6 +149,7 @@ The dashboard is designed to be mounted behind your application's existing authe
 - **Queue depth sparklines** — Queues index shows a 12-hour depth chart per queue; each bar is the ready-job count at an hourly snapshot with an instant hover tooltip
 - **Job detail page** — full arguments (pretty-printed JSON), queue, priority, enqueued time, Active Job ID, concurrency key, scheduled/blocked-until metadata, and a Discard button
 - **Failed jobs** — list with retry / discard / bulk retry / bulk discard; **Failed job detail page** — full error, backtrace, and an inline JSON argument editor; submit to update arguments and retry in one action
+- **Error frequency report** — `GET /failed_jobs/errors` groups all failed jobs by exception class and message prefix with a count and expandable sample backtrace; links through to a filtered list for each error group
 - **Scheduled job management** — "Run Now" and offset buttons (+1h / +24h / +7d) per row update the scheduled time inline via Turbo Stream; "Run All Now (N)" back-dates all matching executions at once
 - **Recurring task list** — enumerates all `SolidQueue::RecurringTask` records with cron schedule, job class or command, queue, next-run and last-run times, and a static/dynamic badge; each row has a "Run Now" button
 - **Performance statistics page** — `GET /stats` aggregates finished jobs by class name with execution count, avg, p50, p95, min, and max duration; click any column header to sort; defaults to p95 descending
diff --git a/ROADMAP.md b/ROADMAP.md
index 7b09ddd..988dbb4 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -11,7 +11,6 @@ The path to v1.0.0 is staged: first achieve feature parity with `solid_queue_das
 
 > _Surface patterns in failures, not just individual failed jobs._
 
-- **Error frequency report** — group all failed jobs by error class + message prefix; show count and a sample backtrace; makes "ArgumentError (x212), TimeoutError (x88)" visible at a glance without clicking into each job
 - **Failed job trend chart** — "Failures — Last 12 Hours" sparkline on the queue dashboard overview card; makes failure spikes visible before you click into the failed jobs list
 - **P99 + standard deviation in performance stats** — extend the stats table with a 99th-percentile and std-dev column; high std dev signals inconsistent jobs worth investigating
 
diff --git a/app/assets/stylesheets/solid_stack_web/_09_detail.css b/app/assets/stylesheets/solid_stack_web/_09_detail.css
index 608f2a8..4c44a95 100644
--- a/app/assets/stylesheets/solid_stack_web/_09_detail.css
+++ b/app/assets/stylesheets/solid_stack_web/_09_detail.css
@@ -84,6 +84,29 @@
 
 .sqw-value-truncated { font-size: 12px; margin-top: 0.5rem; }
 
+.sqw-error-details > summary {
+  cursor: pointer;
+  list-style: none;
+  display: block;
+  max-width: 480px;
+}
+.sqw-error-details > summary::-webkit-details-marker { display: none; }
+.sqw-error-details[open] > summary { margin-bottom: 0.5rem; }
+
+.sqw-error-backtrace {
+  font-family: ui-monospace, "SFMono-Regular", Menlo, monospace;
+  font-size: 11px;
+  background: var(--bg);
+  border: 1px solid var(--border);
+  border-radius: var(--radius);
+  padding: 0.5rem 0.75rem;
+  overflow-x: auto;
+  white-space: pre;
+  max-height: 200px;
+  overflow-y: auto;
+  margin-top: 0.25rem;
+}
+
 .sqw-link { color: var(--primary); text-decoration: none; }
 .sqw-link:hover { text-decoration: underline; }
 
diff --git a/app/controllers/solid_stack_web/failed_jobs/errors_controller.rb b/app/controllers/solid_stack_web/failed_jobs/errors_controller.rb
new file mode 100644
index 0000000..367ec61
--- /dev/null
+++ b/app/controllers/solid_stack_web/failed_jobs/errors_controller.rb
@@ -0,0 +1,9 @@
+module SolidStackWeb
+  module FailedJobs
+    class ErrorsController < ApplicationController
+      def index
+        @groups = ErrorFrequencyReport.new.groups
+      end
+    end
+  end
+end
diff --git a/app/controllers/solid_stack_web/failed_jobs_controller.rb b/app/controllers/solid_stack_web/failed_jobs_controller.rb
index 966c51e..11c8c1a 100644
--- a/app/controllers/solid_stack_web/failed_jobs_controller.rb
+++ b/app/controllers/solid_stack_web/failed_jobs_controller.rb
@@ -4,6 +4,8 @@ def index
       respond_to do |format|
         format.html do
           scope = ::SolidQueue::FailedExecution.includes(:job).order(created_at: :desc)
+          @error_class = params[:error_class].presence
+          scope = scope.where(id: ids_for_error_class(@error_class)) if @error_class
           @pagy, @executions = pagy(scope)
         end
         format.csv do
@@ -41,6 +43,15 @@ def retry
 
     private
 
+    def ids_for_error_class(ec)
+      ::SolidQueue::FailedExecution.pluck(:id, :error).filter_map do |id, raw|
+        error = raw.is_a?(Hash) ? raw : JSON.parse(raw)
+        id if error["exception_class"] == ec
+      rescue StandardError
+        nil
+      end
+    end
+
     def failed_jobs_csv
       CSV.generate(headers: true) do |csv|
         csv << %w[id class_name queue_name error_class error_message failed_at]
diff --git a/app/models/solid_stack_web/error_frequency_report.rb b/app/models/solid_stack_web/error_frequency_report.rb
new file mode 100644
index 0000000..f3e8d0d
--- /dev/null
+++ b/app/models/solid_stack_web/error_frequency_report.rb
@@ -0,0 +1,34 @@
+module SolidStackWeb
+  class ErrorFrequencyReport
+    Row = Data.define(:exception_class, :message_prefix, :count, :sample_backtrace)
+
+    MESSAGE_LIMIT = 120
+
+    def groups
+      ::SolidQueue::FailedExecution
+        .order(created_at: :desc)
+        .each_with_object({}) do |execution, acc|
+          key = [execution.exception_class.to_s, message_prefix(execution.message)]
+          entry = acc[key] ||= { count: 0, sample_backtrace: nil }
+          entry[:count] += 1
+          entry[:sample_backtrace] ||= execution.backtrace
+        end
+        .map do |(exception_class, prefix), data|
+          Row.new(
+            exception_class: exception_class,
+            message_prefix: prefix,
+            count: data[:count],
+            sample_backtrace: data[:sample_backtrace]
+          )
+        end
+        .sort_by { |row| -row.count }
+    end
+
+    private
+
+    def message_prefix(message)
+      return "" if message.nil?
+      message.length > MESSAGE_LIMIT ? "#{message[0, MESSAGE_LIMIT]}…" : message
+    end
+  end
+end
diff --git a/app/views/solid_stack_web/failed_jobs/errors/index.html.erb b/app/views/solid_stack_web/failed_jobs/errors/index.html.erb
new file mode 100644
index 0000000..cf67cab
--- /dev/null
+++ b/app/views/solid_stack_web/failed_jobs/errors/index.html.erb
@@ -0,0 +1,48 @@
+
| Error Class | +Message | +Count | +Actions | +
|---|---|---|---|
| <%= group.exception_class.presence || "—" %> | +
+ <% if group.sample_backtrace.present? %>
+
+
+ <% else %>
+ <%= group.message_prefix.presence || "—" %>
+ <% end %>
+ + <%= group.message_prefix.presence || "—" %> ++<%= Array(group.sample_backtrace).first(10).join("\n") %>
+ |
+ <%= group.count %> | ++ <%= link_to "View Jobs", failed_jobs_path(error_class: group.exception_class), + class: "sqw-btn sqw-btn--muted sqw-btn--sm" %> + | +
No failed jobs
+All clear — your jobs are running without errors.
+
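
Reviewer note: the grouping that `ErrorFrequencyReport#groups` performs can be tried outside Rails with a minimal sketch like the one below. It is an illustration under stated assumptions, not the code in this diff: `Failure` is a hypothetical stand-in for `SolidQueue::FailedExecution`, `group_failures` is a hypothetical helper name, and the input array is assumed to be ordered newest first so that the first row in each group supplies the sample backtrace.

```ruby
# Hypothetical stand-in for a failed execution record (not part of the diff).
Failure = Struct.new(:exception_class, :message, :backtrace, keyword_init: true)

# Same limit the diff uses for the message prefix.
MESSAGE_LIMIT = 120

# Truncate long messages so near-identical errors fall into the same group.
def message_prefix(message)
  return "" if message.nil?
  message.length > MESSAGE_LIMIT ? "#{message[0, MESSAGE_LIMIT]}…" : message
end

# Group by [exception class, message prefix], count each group, keep the
# first row's backtrace as the sample, and sort by count descending.
def group_failures(failures)
  failures
    .group_by { |f| [f.exception_class.to_s, message_prefix(f.message)] }
    .map do |(exception_class, prefix), rows|
      {
        exception_class: exception_class,
        message_prefix: prefix,
        count: rows.size,
        sample_backtrace: rows.first.backtrace
      }
    end
    .sort_by { |group| -group[:count] }
end
```

Unlike the `each_with_object` accumulator in the diff, this sketch uses `group_by`, which holds every row of a group in memory at once; either approach still loads all failed executions, so a large failed-jobs table may warrant pushing the grouping into SQL.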