Spike protection
The spike protection feature enables Raygun to monitor your incoming error volumes and temporarily protect you from a sudden influx of errors that significantly exceeds the rate of errors previously processed for each of your applications.
Note: When you receive an email notification about a spike, please take the recommended actions to stop the spike from continuing, as you may otherwise incur additional usage charges.
Spike protection is a feature that is available and automatically enabled for our usage-based Crash Reporting plan. If you are on a different Crash Reporting plan but would like to use this feature, then please get in touch.
Definitions
Error send rate
Includes all errors Raygun receives from your application, before any inbound filters or permanently ignored errors have been applied.
Error process rate
Includes only the events that Raygun actually processed for your application, after removing events that matched inbound filters or permanently ignored error groups.
Example:
- Your application sends 100 errors per minute to Raygun's API before inbound filters and permanently ignored error groups are applied. This is your error send rate.
- Of the 100, Raygun will drop 60 errors that match inbound filters or permanently ignored error groups.
- Raygun then processes 40 qualifying events per minute. This is your error process rate.
The error chart you see on the Crash Reporting dashboard is an indication of your error process rate.
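The difference between the two rates can be sketched in a few lines of Python. This is only an illustration of the example above: the ErrorEvent class, its fields, and the 35/25 split between filtered and ignored errors are assumptions made for the sketch, not part of Raygun.

```python
from dataclasses import dataclass


@dataclass
class ErrorEvent:
    # Hypothetical event shape, used only for this illustration.
    message: str
    matches_inbound_filter: bool = False
    permanently_ignored: bool = False


def count_processed(events):
    """Count events that survive inbound filters and permanent ignores."""
    return sum(
        1
        for event in events
        if not (event.matches_inbound_filter or event.permanently_ignored)
    )


# One minute of traffic: 100 errors sent, 60 of which are dropped.
# The 35/25 split of the 60 dropped errors is an assumption.
one_minute = (
    [ErrorEvent("bot traffic", matches_inbound_filter=True)] * 35
    + [ErrorEvent("noisy plugin", permanently_ignored=True)] * 25
    + [ErrorEvent("checkout failure")] * 40
)

error_send_rate = len(one_minute)                 # everything received
error_process_rate = count_processed(one_minute)  # what is actually processed
```

Here error_send_rate corresponds to the 100 errors per minute in the example, and error_process_rate to the 40 qualifying events.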
How it works
We use the maximum error process rate of your application observed 7 days ago to determine your spike protection rate limit today. By using the processing rate from 7 days ago, each day of the week can have a different spike protection rate limit to cater for any weekly trends. If your application has not been sending data for a week yet, then we will use the maximum error process rate from yesterday instead, until a full week of processing rates is available.
The spike protection rate limit itself is calculated as follows:
maximum (50, 5 * maximum per minute error process rate from either 7 days ago or yesterday)
We enforce a minimum limit of 50 errors per minute, even for applications with very low error rates.
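As a rough illustration of this calculation, the following Python sketch computes the limit from the two possible reference rates. The function and constant names are hypothetical and are not part of any Raygun API. The sketch also assumes that yesterday's rate is always known and that a missing 7-day value is represented as None.

```python
from typing import Optional

MINIMUM_LIMIT = 50  # errors per minute, enforced even for quiet applications
MULTIPLIER = 5      # the limit is 5x the reference maximum rate


def spike_protection_rate_limit(
    max_rate_7_days_ago: Optional[int],
    max_rate_yesterday: int,
) -> int:
    """Return the per-minute limit for today (illustrative only).

    Uses the maximum per-minute process rate from 7 days ago when it is
    available, and falls back to yesterday's maximum otherwise.
    """
    if max_rate_7_days_ago is not None:
        reference = max_rate_7_days_ago
    else:
        reference = max_rate_yesterday
    return max(MINIMUM_LIMIT, MULTIPLIER * reference)
```

For example, a maximum of 50 errors per minute a week ago gives a limit of 250 errors per minute today, while a maximum of 3 errors per minute still gives the minimum limit of 50.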
Accepting and rejecting errors
If your errors exceed the spike protection rate, we'll reject any further errors for the remainder of the current minute. This process repeats each minute, ensuring that some error data continues to flow during the spike.
Example: If spike protection has calculated a rate limit of 250 errors/min for your application and a spike of 500 errors/min occurs, then only the first 250 errors within that minute will appear in your Raygun dashboard. The rest will be dropped. Data will be accepted again in each following minute, up to the spike protection rate limit for that minute.
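The per-minute behaviour can be sketched as a simple counter that resets at the start of each minute. This is a minimal illustration and not Raygun's implementation. The class name is hypothetical, and the sketch assumes that a "minute" is a fixed clock-aligned window.

```python
class MinuteRateLimiter:
    """Accept up to rate_limit errors per minute and reject the rest."""

    def __init__(self, rate_limit: int) -> None:
        self.rate_limit = rate_limit
        self.current_minute = None
        self.accepted_this_minute = 0

    def accept(self, timestamp_seconds: int) -> bool:
        minute = timestamp_seconds // 60
        if minute != self.current_minute:
            # A new minute has started, so the counter starts again.
            self.current_minute = minute
            self.accepted_this_minute = 0
        if self.accepted_this_minute >= self.rate_limit:
            return False  # limit reached: reject until the minute ends
        self.accepted_this_minute += 1
        return True


limiter = MinuteRateLimiter(rate_limit=250)

# 500 errors arrive in each of the first two minutes, then 100 in the third.
accepted_minute_0 = sum(limiter.accept(10) for _ in range(500))
accepted_minute_1 = sum(limiter.accept(70) for _ in range(500))
accepted_minute_2 = sum(limiter.accept(130) for _ in range(100))
```

In this sketch, 250 errors are accepted in each of the first two minutes and all 100 are accepted in the third, so some error data keeps flowing throughout the spike.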
How the rate limit adjusts over time
Since the spike protection rate limit for today is calculated from the observed maximum rate from previous days, the spike protection rate limit for your application is revised on a daily basis. The maximum error process rate used in the calculation includes the number of errors rejected by spike protection on those previous days. This allows the rate limit to adjust itself over time to fit a natural increase in traffic. This is why it's important to take action when a spike occurs to prevent it from happening again. If the spike does not happen again, then the spike protection rate limit will adjust itself downwards again.
Example: If the spike protection rate limit is 250 errors/min, and during high traffic your application receives a maximum of 500 errors/min (some of which get rejected), then the spike protection rate limit calculated for this day next week will be 5 * 500 = 2,500 errors/min. During that day next week, if the maximum rate of errors is only 10 errors/min, then the spike protection rate limit for that day the following week will be reduced to 5 * 10 = 50 errors/min.
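The week-over-week adjustment in this example can be traced with a short Python sketch. The function name is hypothetical, and the sketch follows a single day of the week, assuming that the observed maximum already includes any errors rejected by spike protection.

```python
def simulate_weekly_limits(initial_limit, weekly_max_rates):
    """Return the limit for the same weekday across successive weeks.

    weekly_max_rates holds the maximum errors/min observed on that weekday
    each week, including errors that spike protection rejected.
    """
    limits = [initial_limit]
    for observed_max in weekly_max_rates:
        # Next week's limit: 5x this week's maximum, never below 50.
        limits.append(max(50, 5 * observed_max))
    return limits


# Week 1 peaks at 500 errors/min, week 2 peaks at only 10 errors/min.
limits = simulate_weekly_limits(250, [500, 10])
```

Starting from a limit of 250, the limit rises to 2,500 after the spike week and falls back to 50 once the following week stays quiet.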
Correlating spike protection with the error chart
Since spike protection is triggered by spikes in the errors that we process for your applications, you will be able to correlate the spikes directly by looking at the spikes in your Crash Reporting error chart.
You will also have received spike protection emails about the spikes shown in your dashboard.
Enabling spike protection
Spike protection is enabled by default on each of your applications within Crash Reporting when you have a usage-based plan that includes on-demand events. To enable or disable spike protection, follow these steps:
- In the side navigation, go to Settings under Crash Reporting.
- Check the option to Enable automatic rate limiting when a spike is detected.
- Save changes to apply the setting.
Note that spike protection will take a few minutes to become active.
Notifications
We'll notify you when a spike begins, and continue to send a notification once a day while the spike remains in progress.
If you find spike protection emails too noisy, you can turn off the spike protection email and still keep the feature enabled.
You can learn more about how to adjust the notifications for spike protection in our Notifications documentation.
Plan owners cannot turn off spike protection email notifications. As the owner of the account, it is important that you are aware of spikes that may impact your usage billing.
What actions should I take when I receive a spike protection email?
Raygun has a usage-based billing model, which means we bill by the number of errors that we process. A spike protection email is a strong indication that you are sending more errors for us to process than previously observed, and that you may therefore incur additional usage charges.
Here are the steps you can take to bring your error spikes under control:
- Do not turn spike protection off. The built-in rate limiting will help limit your error volumes
- Identify and fix the error(s) responsible for the spike by reviewing your Crash Reporting error dashboard for spikes in error volume
- Permanently ignore errors caused by third-party plugins/libraries
- Use inbound filters to prevent irrelevant errors from being sent to Raygun
- Use the OnBeforeSend method to inspect the error before it is sent and determine whether or not it should be sent
- Keep an eye out for further spike protection notifications in your inbox, so that you can take the above actions quickly
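To illustrate the kind of decision an OnBeforeSend handler makes, here is a provider-agnostic Python sketch. Everything in it is hypothetical: the should_send function, the dictionary shape of the error, and the example message fragments are assumptions and not Raygun API. Check the documentation for your Raygun provider for the actual hook and payload.

```python
# Hypothetical message fragments identifying errors you do not want to send.
IGNORED_MESSAGE_FRAGMENTS = ("ResizeObserver loop", "Script error")


def should_send(error: dict) -> bool:
    """Return False for errors that should be discarded before sending."""
    message = error.get("message", "")
    return not any(fragment in message for fragment in IGNORED_MESSAGE_FRAGMENTS)


candidates = [
    {"message": "Script error."},
    {"message": "Payment gateway timed out"},
]
to_send = [error for error in candidates if should_send(error)]
```

Errors discarded this way are never sent, so they do not count towards your error send rate.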
If you do not take action, and your spike continues, you may incur additional usage charges based on your subscription. See our billing docs for details.