PAYMENTS-11567 Resque latency metrics #30
Draft: WillemHoman wants to merge 2 commits into main from PAYMENTS-11567-resque_latency
65 changes: 65 additions & 0 deletions
lib/bigcommerce/prometheus/integrations/resque/active_job_payload.rb
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,65 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| # Copyright (c) 2019-present, BigCommerce Pty. Ltd. All rights reserved | ||
| # | ||
| # Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated | ||
| # documentation files (the "Software"), to deal in the Software without restriction, including without limitation the | ||
| # rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit | ||
| # persons to whom the Software is furnished to do so, subject to the following conditions: | ||
| # | ||
| # The above copyright notice and this permission notice shall be included in all copies or substantial portions of the | ||
| # Software. | ||
| # | ||
| # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE | ||
| # WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR | ||
| # COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR | ||
| # OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. | ||
| # | ||
| require 'time' | ||
|
|
||
| module Bigcommerce | ||
| module Prometheus | ||
| module Integrations | ||
| class Resque | ||
| ## | ||
| # Payload fields for an ActiveJob-shaped Resque job, read from the | ||
| # inner hash at `args[0]`. ActiveJob's JobWrapper stamps the three | ||
| # fields the per-job metrics consume: | ||
| # | ||
| # * job_class — the user's actual job class name; used as the | ||
| # metric label. | ||
| # * enqueued_at — ISO 8601 string; queue-latency anchor when | ||
| # scheduled_at is absent. | ||
| # * scheduled_at — ISO 8601 string; preferred over enqueued_at | ||
| # when present (e.g. retries-with-backoff, so the | ||
| # intentional wait isn't counted as latency). | ||
| class ActiveJobPayload | ||
| # @return [String] the user's actual job class name | ||
| attr_reader :job_class | ||
|
|
||
| # @return [Time, nil] the queue-latency anchor; nil when both | ||
| # timestamps are absent or unparseable | ||
| attr_reader :anchor_time | ||
|
|
||
| # @param [Hash] inner the ActiveJob-shaped hash at `args[0]`; | ||
| # JobPayload.for guarantees a truthy 'job_class' | ||
| def initialize(inner) | ||
| @job_class = inner['job_class'] | ||
| @anchor_time = parse_time(inner['scheduled_at']) || parse_time(inner['enqueued_at']) | ||
| end | ||
|
|
||
| private | ||
|
|
||
| def parse_time(value) | ||
| return value if value.is_a?(Time) | ||
| return nil if value.nil? || value.to_s.empty? | ||
|
|
||
| Time.iso8601(value.to_s) | ||
| rescue ArgumentError | ||
| nil | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end |
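The anchor rule in `ActiveJobPayload` (prefer `scheduled_at`, fall back to `enqueued_at`, give up on unparseable input) can be tried outside the gem. The following is a minimal sketch, not the PR's code: `anchor_time_for` is a hypothetical helper name, and unlike the class above it does not pass through values that are already `Time` objects.

```ruby
require 'time'

# Hypothetical helper (not part of the gem): returns the queue-latency anchor
# for an ActiveJob-shaped inner hash, or nil when no usable timestamp exists.
def anchor_time_for(inner)
  %w[scheduled_at enqueued_at].each do |key|
    value = inner[key]
    next if value.nil? || value.to_s.empty?

    begin
      return Time.iso8601(value.to_s)
    rescue ArgumentError
      next # unparseable timestamp: fall through to the next candidate
    end
  end
  nil
end
```

Under this sketch, a retried job carrying both timestamps anchors on `scheduled_at`, so the intentional backoff wait is not reported as queue latency, while a payload with no parseable timestamp yields `nil` and the latency observation is skipped.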
166 changes: 166 additions & 0 deletions
lib/bigcommerce/prometheus/integrations/resque/job_metrics.rb
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,166 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| # Copyright (c) 2019-present, BigCommerce Pty. Ltd. All rights reserved | ||
| # | ||
| # Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated | ||
| # documentation files (the "Software"), to deal in the Software without restriction, including without limitation the | ||
| # rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit | ||
| # persons to whom the Software is furnished to do so, subject to the following conditions: | ||
| # | ||
| # The above copyright notice and this permission notice shall be included in all copies or substantial portions of the | ||
| # Software. | ||
| # | ||
| # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE | ||
| # WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR | ||
| # COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR | ||
| # OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. | ||
| # | ||
| module Bigcommerce | ||
| module Prometheus | ||
| module Integrations | ||
| class Resque | ||
| ## | ||
| # Per-Resque-job histogram metrics, recorded from the parent worker process. | ||
| # Hooked via a prepend around Resque::Worker#perform_with_fork. | ||
| # Queue latency is captured before super, perform duration after. | ||
| # | ||
| # Off unless PROMETHEUS_RESQUE_PER_JOB_METRICS_ENABLED=1. | ||
| # Emits one histogram observation per metric for every job, labelled by job_class; at scale this can mean high volume and label cardinality. | ||
| # | ||
| # NOTE: queue_latency is supported for jobs enqueued via ActiveJob. | ||
| # The gem reads three fields from | ||
| # `payload['args'][0]` (which must be a Hash): | ||
| # | ||
| # * job_class — the user's actual job class name; used as the | ||
| # metric label. | ||
| # * enqueued_at — ISO 8601 string; used as the queue-latency | ||
| # anchor when scheduled_at is absent. | ||
| # * scheduled_at — ISO 8601 string; preferred over enqueued_at | ||
| # when present (e.g. retries-with-backoff, so | ||
| # the intentional wait isn't counted as latency). | ||
| # | ||
| # ActiveJob produces this shape natively — the payload is wrapped by | ||
| # ActiveJob::QueueAdapters::ResqueAdapter::JobWrapper, which stamps | ||
| # the three fields above into `args[0]`. | ||
| # | ||
| # Vanilla Resque jobs enqueued via Resque.enqueue carry no enqueue timestamps: | ||
| #   class MyJob | ||
| #     @queue = :foo | ||
| #     def self.perform; end | ||
| #   end | ||
| # Their args are raw primitive values, not a wrapping hash. | ||
| # For these jobs, queue_latency silently no-ops. | ||
| # perform_duration works for both styles regardless. | ||
| # | ||
| # Payloads that replicate the three fields above are read the same way. | ||
| # Detection is by shape, not by wrapper class name. | ||
| # This means a vanilla job can opt in to queue_latency by either: | ||
| #   - converting to ActiveJob, or | ||
| #   - enqueueing through a small wrapper class that stamps these fields into args[0]. | ||
| # | ||
| module JobMetrics | ||
| class << self | ||
| ## | ||
| # Install the parent-side hooks if the per-job metrics feature is enabled. | ||
| # Idempotent: safe to call multiple times. | ||
| # | ||
| # @param [PrometheusExporter::Client] client | ||
| # | ||
| def start(client:) | ||
| return unless ::Bigcommerce::Prometheus.resque_per_job_metrics_enabled | ||
|
|
||
| @client = client | ||
| install_hooks | ||
| end | ||
|
|
||
| ## | ||
| # Push the queue-latency observation for a job that's about to be picked up by a worker. | ||
| # Anchors on scheduled_at if present so retries-with-backoff don't show the intentional wait as latency. | ||
| # Falls back to enqueued_at if scheduled_at isn't present. | ||
| # | ||
| # @param [ActiveJobPayload, VanillaResquePayload] payload | ||
| # | ||
| def record_queue_latency(payload) | ||
| anchor = payload.anchor_time | ||
| return unless anchor | ||
|
|
||
| # Clock skew between the enqueuer/scheduler and the worker can put the anchor in the future. | ||
| # Clamp to zero so the histogram never records a negative latency. | ||
| latency = (Time.now - anchor).to_f.clamp(0.0..) | ||
|
|
||
| @client.send_json( | ||
| type: 'resque_job', | ||
| metric: 'queue_latency', | ||
| value: latency, | ||
| custom_labels: { job_class: payload.job_class } | ||
| ) | ||
| rescue StandardError => e | ||
| ::Bigcommerce::Prometheus.logger&.warn( | ||
| "[bigcommerce-prometheus] resque_job queue_latency push failed: #{e.message}" | ||
| ) | ||
|
|
||
| end | ||
|
|
||
| ## | ||
| # Push the perform-duration observation for a completed job. | ||
| # Called from the `Resque::Worker#perform_with_fork` prepend, so it measures the full child lifetime: | ||
| # fork + reconnect + perform + exit | ||
| # | ||
| # @param [ActiveJobPayload, VanillaResquePayload] payload | ||
| # @param [Float] duration in seconds | ||
| # | ||
| def record_perform_duration(payload, duration) | ||
| @client.send_json( | ||
| type: 'resque_job', | ||
| metric: 'perform_duration', | ||
| value: duration, | ||
| custom_labels: { job_class: payload.job_class } | ||
| ) | ||
| rescue StandardError => e | ||
| ::Bigcommerce::Prometheus.logger&.warn( | ||
| "[bigcommerce-prometheus] resque_job perform_duration push failed: #{e.message}" | ||
| ) | ||
|
|
||
| end | ||
|
|
||
| private | ||
|
|
||
| def install_hooks | ||
| return if @hooks_installed | ||
|
|
||
| ::Resque::Worker.prepend(WorkerInstrumentation) | ||
| @hooks_installed = true | ||
| end | ||
| end | ||
|
|
||
| ## | ||
| # Prepended onto Resque::Worker to capture two observations for every job that goes through perform_with_fork: | ||
| # - queue latency: before super | ||
| # - perform duration: after super | ||
| # The duration timer starts immediately before super so the | ||
| # observation matches the documented fork-to-waitpid window and | ||
| # excludes payload parsing and the queue-latency push. The ensure | ||
| # is guarded so instrumentation failures can never mask a job | ||
| # exception. | ||
| module WorkerInstrumentation | ||
| def perform_with_fork(job, &block) | ||
| payload = begin | ||
| JobPayload.for(job) | ||
| rescue StandardError | ||
| nil | ||
| end | ||
| JobMetrics.record_queue_latency(payload) if payload | ||
| started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC) | ||
| super | ||
| ensure | ||
| if payload && started_at | ||
| JobMetrics.record_perform_duration( | ||
| payload, | ||
| Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at | ||
| ) | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end | ||
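The prepend-plus-`ensure` pattern used by `WorkerInstrumentation` can be exercised in isolation. The sketch below is an illustration under stated assumptions: `FakeWorker` and `Recorder` are hypothetical stand-ins for `Resque::Worker` and the exporter client, and no real Resque or Prometheus call is made. It shows that the duration is recorded on both success and failure, and that the job's own exception still propagates.

```ruby
# Hypothetical stand-in for Resque::Worker; not the real class.
class FakeWorker
  def perform_with_fork(job)
    raise ArgumentError, 'boom' if job == :bad

    :done
  end
end

# Hypothetical recorder standing in for the metrics client.
module Recorder
  def self.durations
    @durations ||= []
  end
end

# Prepended so that `super` reaches the original perform_with_fork.
module TimingInstrumentation
  def perform_with_fork(job)
    # Monotonic clock: unaffected by wall-clock adjustments during the job.
    started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    super
  ensure
    # Runs whether or not super raised; the method's return value is still
    # whatever super returned, and any job exception keeps propagating.
    if started_at
      Recorder.durations << (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at)
    end
  end
end

FakeWorker.prepend(TimingInstrumentation)
```

Calling `FakeWorker.new.perform_with_fork(:ok)` still returns `:done`, and a failing job still raises `ArgumentError`; in both cases one duration is appended to `Recorder.durations`.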
69 changes: 69 additions & 0 deletions
lib/bigcommerce/prometheus/integrations/resque/job_payload.rb
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,69 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
|
|
||
| # Copyright (c) 2019-present, BigCommerce Pty. Ltd. All rights reserved | ||
| # | ||
| # Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated | ||
| # documentation files (the "Software"), to deal in the Software without restriction, including without limitation the | ||
| # rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit | ||
| # persons to whom the Software is furnished to do so, subject to the following conditions: | ||
| # | ||
| # The above copyright notice and this permission notice shall be included in all copies or substantial portions of the | ||
| # Software. | ||
| # | ||
| # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE | ||
| # WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR | ||
| # COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR | ||
| # OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. | ||
| # | ||
| module Bigcommerce | ||
| module Prometheus | ||
| module Integrations | ||
| class Resque | ||
| ## | ||
| # Classifies a Resque::Job's payload and builds the matching | ||
| # shape-specific payload object for per-job metrics. | ||
| # | ||
| # A payload is ActiveJob-shaped when `args[0]` is a Hash carrying a | ||
| # truthy 'job_class' — the shape | ||
| # ActiveJob::QueueAdapters::ResqueAdapter::JobWrapper produces | ||
| # natively. Detection is by shape rather than by wrapper class name: | ||
| # the fields are ActiveJob's stable serialization format (persisted | ||
| # payloads must survive Rails upgrades), while the wrapper's class | ||
| # name is a private Rails constant — matching on it would silently | ||
| # kill the metric if Rails ever moved it. Payloads that replicate | ||
| # these fields are read by the same mechanism. Everything | ||
| # else — vanilla Resque jobs with primitive args, nil or non-Hash | ||
| # payloads, `args` not being an Array — is treated as vanilla. | ||
| # | ||
| # Both payload classes expose the same interface: #job_class | ||
| # (String) and #anchor_time (Time or nil). | ||
| # | ||
| module JobPayload | ||
| class << self | ||
| ## | ||
| # @param [Resque::Job] resque_job | ||
| # @return [ActiveJobPayload, VanillaResquePayload] | ||
| # | ||
| def for(resque_job) | ||
| payload = resque_job.payload | ||
| payload = {} unless payload.is_a?(Hash) | ||
|
|
||
| inner = activejob_inner(payload) | ||
| inner ? ActiveJobPayload.new(inner) : VanillaResquePayload.new(payload) | ||
| end | ||
|
|
||
| private | ||
|
|
||
| def activejob_inner(payload) | ||
| args = payload['args'] | ||
| first = args.is_a?(Array) ? args.first : nil | ||
| return nil unless first.is_a?(Hash) && first['job_class'] | ||
|
|
||
| first | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end | ||
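Because detection is by shape, a vanilla job could opt in to queue latency by stamping the expected fields into `args[0]` at enqueue time. The following is a rough sketch of that idea, not code from this PR: `stamped_args` and `activejob_shaped?` are hypothetical helper names, the `'arguments'` key is only illustrative, and a real wrapper job would also need to unpack the hash on the worker side before calling the underlying job.

```ruby
require 'time'

# Hypothetical helper: builds the args a vanilla job could enqueue so that
# args[0] is a Hash carrying 'job_class' and an ISO 8601 'enqueued_at'.
def stamped_args(job_class_name, *job_args, now: Time.now.utc)
  [{ 'job_class' => job_class_name, 'enqueued_at' => now.iso8601, 'arguments' => job_args }]
end

# Shape check equivalent to the rule described in JobPayload: `args` must be
# an Array whose first element is a Hash with a truthy 'job_class'.
def activejob_shaped?(payload)
  return false unless payload.is_a?(Hash)

  args = payload['args']
  first = args.is_a?(Array) ? args.first : nil
  first.is_a?(Hash) && !!first['job_class']
end
```

With these helpers, a payload such as `{ 'class' => 'Wrapper', 'args' => stamped_args('MyJob', 1) }` passes the shape check, while `{ 'class' => 'MyJob', 'args' => [1, 2] }` is classified as vanilla.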