chore(sentry): lower polling rates #271

Merged — bitbeckers merged 1 commit into develop from chore/lower_sentry_sample_rates on Apr 7, 2025
Conversation

@bitbeckers (Collaborator) commented Mar 11, 2025

As we're getting high loads on Sentry spans, we reduce the tracing sample rate from 1.0 to 0.1 in production while keeping it at 1.0 on staging.

The primary goal is to lower the number of traces sent. We keep sampling at 100% on staging to catch bugs, since traffic there is much lower.

Closes #270
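
For illustration, a minimal sketch of the kind of change described here, assuming a Node/TypeScript service initialised with @sentry/node and an environment variable to distinguish staging from production; the actual file, variable names, and options in this repo may differ:

```typescript
import * as Sentry from "@sentry/node";

// SENTRY_ENVIRONMENT is an assumed variable name, not necessarily what this repo uses.
const environment = process.env.SENTRY_ENVIRONMENT ?? "production";

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  environment,
  // Keep 100% of traces on staging (low traffic, useful for debugging),
  // but only 10% in production to stay within the monthly span budget.
  tracesSampleRate: environment === "staging" ? 1.0 : 0.1,
});
```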

@github-actions

Coverage Report

Status | Category   | Percentage      | Covered / Total
🟢     | Lines      | 25.09% (🎯 24%) | 1094 / 4360
🟢     | Statements | 25.09% (🎯 24%) | 1094 / 4360
🟢     | Functions  | 62.1% (🎯 59%)  | 59 / 95
🟢     | Branches   | 72.87% (🎯 72%) | 180 / 247

File Coverage: No changed files found.
Generated in workflow #73 for commit e4ec417 by the Vitest Coverage Report Action

@Jipperism (Contributor) left a comment


I wonder if this is permanently necessary. Resolving those 429s would return us to the old situation where we have a trace for every error. We could also adjust the sampling rate based on the error type, i.e. apply the reduced rate only to 429s, which are a very typical "once it's there, you get many" error.
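
As a hedged sketch of this per-error-type idea (not what the PR implements): the Sentry SDK's beforeSend hook can drop a fraction of specific error events while keeping everything else. How a 429 is detected on the event is an assumption here and depends on how errors are captured in this codebase.

```typescript
import * as Sentry from "@sentry/node";

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  // Keep 100% of all errors except 429s, of which only ~10% are forwarded.
  beforeSend(event) {
    // Heuristic 429 detection on the error message; purely illustrative.
    const is429 = event.exception?.values?.some((e) => e.value?.includes("429"));
    if (is429 && Math.random() > 0.1) {
      return null; // discard this event
    }
    return event;
  },
});
```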

@bitbeckers (Collaborator, Author) replied:

This would happen again on the next 429, and at the previous sampling rate we almost hit our monthly budget within one week. As far as I can see, Sentry currently samples 100% of the calls to/by our API and Indexer, so this change would mitigate "once it's there, you get many" issues.

@Jipperism (Contributor) replied:

@bitbeckers I'm having a look, but it seems to be a uniform sample rate. Doesn't that mean it will randomly send 10% of the errors to the Sentry backend? So if we get 10,000 429s, other errors would have a pretty high chance of getting lost in the noise. I think we'd rather send 10% of 429s and 100% of all other errors.

@bitbeckers (Collaborator, Author) replied:

@Jipperism this is tracing, not error monitoring. As far as I can see, the tracing sample rate determines what fraction of events is selected at random (ref: tracing). Error monitoring is not specified because it defaults to sampling 100% of errors (ref: error monitoring).

For reference, in the last 14 days we've had 10.6M spans and 36K errors.
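
To illustrate the distinction being drawn here (a sketch, not this repo's actual config): in the Sentry SDK the error sample rate and the trace sample rate are separate options, so lowering tracesSampleRate leaves error events untouched.

```typescript
import * as Sentry from "@sentry/node";

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  // Error monitoring: sampleRate defaults to 1.0, so every error event is still sent.
  sampleRate: 1.0,
  // Performance tracing: only 10% of transactions/spans are sent,
  // which is what this PR changes for production.
  tracesSampleRate: 0.1,
});
```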

@Jipperism (Contributor) left a comment


Taking the differences between traces and error logs into account, I think it's good to go.

bitbeckers merged commit 5bba419 into develop on Apr 7, 2025
github-project-automation bot moved this from In review to Done in Hypercerts dev team daily ops, Apr 7, 2025

Labels: enhancement (New feature or request)
Project status: Done


2 participants