Manage your Raygun quota effectively with these tips
Posted Nov 19, 2022 | 13 min. (2590 words)
If you have any concerns about going over your plan’s monthly event quota, this guide will help you implement strategies to manage your Raygun error quota effectively.
The key to staying within your quota is to keep the Raygun dashboard organized. This saves you time too: you’ll prevent non-critical errors from staying active, so you and your team are only alerted to the errors that matter.
We recommend implementing the following to manage your event quota effectively. Each takes just a few minutes:
Crash Reporting:
- Pay attention to Spike Protection notifications
- Set up custom Alerts
- Triage and manage errors inside Raygun
- Mark errors as “Permanently Ignored”
- Use the “Usage” tab to see which errors are taking up the most of your plan quota
- Fix the errors in your application
- Order errors by highest occurrence
- Use the “Inbound Filters” feature to stop seeing unimportant errors
- Use the “Error Count” Dashboard tile as a visual aid
RUM:
- Disregard session data in RUM
APM:
- Set your sampling rate in APM
Crash Reporting
Pay attention to Spike Protection notifications
A single runaway error can spike out of control and consume thousands or even millions of events. Spike protection prevents this by slowing down your processing during the first 24 hours of a spike. During a spike, your error volume per minute will be capped at 5x your historical volume from the past 24 hours. The limit is applied per minute to ensure some data is still captured throughout the spike. For example, if your average process rate is around 50 errors/min, and you start sending 500 errors/min, Raygun will accept and process 250 of those errors per minute, dropping the rest.
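The arithmetic behind that example can be sketched in a few lines of TypeScript. This is only a simplified model of the per-minute cap described above, not Raygun's implementation; the function and field names are made up for illustration.

```typescript
// Hypothetical model of the per-minute cap; names are illustrative only.
const SPIKE_MULTIPLIER = 5; // the cap is 5x the historical per-minute volume

interface MinuteOutcome {
  accepted: number;
  dropped: number;
}

function applySpikeCap(historicalPerMinute: number, incomingPerMinute: number): MinuteOutcome {
  const cap = SPIKE_MULTIPLIER * historicalPerMinute;
  const accepted = Math.min(incomingPerMinute, cap);
  return { accepted: accepted, dropped: incomingPerMinute - accepted };
}

// The example from the text: 50 errors/min on average, 500 errors/min during the spike.
console.log(applySpikeCap(50, 500)); // { accepted: 250, dropped: 250 }
```

Volumes at or below the cap pass through untouched, so normal traffic is unaffected.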
Please note that the rate limit is calculated using your error process rate over the past 24 hours. So if a spike continues over a long period, the spike protection limit will gradually adjust to the new error process rate. This will continue to the point where all errors are processed again within 24 hours of a spike starting.
Taking action as soon as you get a spike protection notification will ensure that a single runaway error won’t rapidly consume your entire event quota. All plan owners will automatically receive all spike protection emails.
Set up threshold-based custom alerts
Set up a custom alert based on thresholds you choose, and have it delivered wherever your team already works: email, Slack, or any other tool via webhooks.
Setting your own alerting criteria means that only the most critical issues are sent to you, improving your signal-to-noise ratio. Use our custom filters to set thresholds for alerts based on an increase in error count, a spike in load time, or a new error occurrence. Once an alert goes off, the relevant stakeholders will be notified instantly. Alerts link directly to the issue in Raygun, with the actionable diagnostics to help you deploy a fix faster.
Extra tip: Name the alert in a way that prompts you to take notice and take action, so that you can spot the notification straight away in your inbox or third-party tools.
How to triage and manage errors inside Raygun
When you launch Raygun Crash Reporting, you’ll see a range of tabs under the central graph:
Any new errors from your application will land in your “Active” tab. Depending on your notification settings we will also send you an email to let you know that an error has occurred. If you have integrated with one of our ChatOps providers (e.g. Slack), you’ll also get notified there.
To make sure you stay aware of the active errors your system is experiencing, we recommend you check in with your “Active” tab regularly. Raygun automatically orders errors in the “Active” tab chronologically. However, you can reorder by frequency by clicking the “Count” tab, which will bring the highest occurring errors to the top of the list.
Errors ordered by “Count” will also indicate how many users have been affected. Raygun recommends that you prioritize error fixes based on how many users are affected by an error.
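If you export or copy error group figures out of the dashboard, the same prioritization is easy to reproduce. The sketch below assumes a hypothetical `ErrorGroup` shape with made-up sample data; it is not a Raygun API. It sorts by users affected and uses the occurrence count to break ties.

```typescript
// Hypothetical shape for an error group; field names are illustrative only.
interface ErrorGroup {
  message: string;
  count: number;         // total occurrences
  usersAffected: number; // distinct users who hit the error
}

// Sort by users affected (descending), breaking ties by occurrence count.
function prioritize(groups: ErrorGroup[]): ErrorGroup[] {
  return groups.slice().sort(
    (a, b) => b.usersAffected - a.usersAffected || b.count - a.count
  );
}

// Made-up sample data for illustration.
const triageOrder = prioritize([
  { message: "Timeout calling payments API", count: 120, usersAffected: 95 },
  { message: "Null reference in report export", count: 900, usersAffected: 3 },
  { message: "Checkout button unresponsive", count: 300, usersAffected: 95 },
]);
console.log(triageOrder.map((g) => g.message));
```

Note that the export error has the highest count but lands last, because it only affects three users.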
As long as an error remains in the “Active” tab, Raygun will continue to collect all the details related to the error which will count towards your quota.
Once you have fixed an error in your application and deployed the change, we recommend you immediately move the error group to the “Resolved” tab.
By keeping your dashboard organized, you can stay focused on the critical errors in your “Active” tab.
Mark errors as “Permanently Ignored”
Raygun will capture any errors that are happening in your system, including errors that may be low value or are not deemed to be of consequence for your system.
For example, a customer might use your web application in an out-of-date, unsupported browser, causing client-side errors to land in Raygun. You have no intention of fixing this type of error and do not wish to see it in your dashboard. Errors of this nature can be distracting and can also start consuming your quota.
To manage this, you can ignore them without consuming your quota by moving them into the “Permanently Ignored” tab.
You will see the following dialog where you can choose if you still want Raygun to collect any more information about this error (which will count towards your quota), or if you want to stop receiving the error information (which will mean that this error will no longer count towards your quota):
For errors that you do not intend to fix, invalid configurations that you do not intend to support, or false errors you want to ignore, we recommend that you select to stop collecting information about these errors to avoid consuming your quota. (Refer to the Inbound Filters area in this guide for how to set up filtering for errors that are deemed unimportant.)
Use the “Usage” tab to see which errors are taking up your quota
Have you ever wondered which application is contributing to the largest amount of error events reported inside Raygun? It’s quite easy to tell.
Head to your username located in the top right corner of your Raygun dashboard. Navigate to the connected plan for which you need to see your usage.
From here, select the “Usage” tab. On this page, you can view your total usage for the month with a complete error summary breakdown per application. If you have both Crash Reporting and Real User Monitoring enabled, you can sort this information by product:
The “Crash Reporting” tab shows the total processed events
This tab will break down the total processed events and reveal which applications on your plan were throwing the most errors contributing to your limit:
If you have a large number of applications, this tab quickly highlights which applications are consuming your quota. Use it to review the errors for a particular application and decide whether you need to start resolving errors in your “Active” tab, or whether you no longer wish to track them and can move them to your “Permanently Ignored” tab.
The “Real User Monitoring” tab shows total processed sessions:
Use this tab to understand how many sessions are contributing to your usage.
Fix the errors in your application
The most effective way to reduce the number of errors being processed by Raygun is to fix them in your application. That way they are never sent to Raygun in the first place to be processed:
Raygun offers detailed diagnostic information about every error happening in your code. By selecting an error from the application dashboard, you can drill into the error’s technical information, and highlight where the problem is occurring in your code. We refer to this page as the “Error details” page:
The “Error details” page allows you to loop in your team members either by adding a comment and “@” mentioning them, or by using the assignment controls located on the top right of the page to assign them ownership of the error.
When an issue is assigned to a team member, the error will not only show up in its associated workflow tab (e.g. “Active” or “Resolved”), but it will also show up on the “My Errors” tab. This lets each of your team quickly see what errors have been assigned to them to fix.
Raygun also has a comprehensive set of integrations with all popular issue tracking software (e.g. JIRA or Visual Studio Online) allowing you to link a Raygun report with a new or existing issue in your issue tracking system. This lets you leverage your existing workflow for getting work completed and bugs resolved.
Order errors by highest occurrence and fix the biggest problems first
On the main Raygun dashboard view, you can sort the table underneath the main graph by the number of events (or “Count”). The highest frequency issues affecting your application will appear at the top of the table:
Prioritizing the errors that have the highest occurrences means you can vastly reduce the number of errors happening in your app while saving your quota from being taken up by the processing of these errors.
The more errors that Raygun processes for you, the more you use up your quota. Stop these error groups first and fix them in your application to keep you well under your error quota.
Set up Inbound Filtering to avoid seeing unimportant errors
Once you have an active triage and resolution process in place, you may find there is a commonality to the errors not relevant to your system.
To save you time and effort in managing these, Raygun lets you define rules (referred to as “Inbound Filters”) that specify which errors it should ignore.
A typical example is when crawler bots or other robot agents try to index your web application and raise errors, which is quite common.
These errors often come from invalid or unsupported behavior. Rather than having to ignore these on a case by case basis, inbound filtering helps you create a rule that tells Raygun to discard any errors triggered by robots immediately.
Discarding errors in this way will save you the hassle of dealing with them and the frustration of being notified about a low priority issue.
Using “Inbound Filters” you can choose to ignore errors associated with the criteria listed below. For web applications, we also include a set of standard filters which cover bot traffic, 404s, and errors that we commonly see raised by vulnerability scanning tools.
Note: Inbound Filters don’t apply to currently stored errors.
Here’s how to set up the Inbound Filters feature
Step 1. Select the “Inbound filters” option
Head to your Raygun Crash Reporting application and select “Inbound Filtering”
Step 2. Select filter criteria
Select your filter based on one of the following criteria:
- IP address
- Machine name
- HTTP hostname
- Versions (including “no specific version”)
- Message
- Tag
- User agent
Note: Wildcards are supported for all inbound filter types. New inbound filters are not applied to currently stored exceptions, only to incoming exceptions after the inbound filters are created/activated.
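To get a feel for how a wildcard pattern might behave before you create a filter, here is a small sketch. It assumes that `*` matches any run of characters and that matching is case-insensitive. Both are assumptions made for illustration, so check a new filter against real traffic rather than relying on this model.

```typescript
// Assumption: "*" matches any run of characters; matching is case-insensitive.
// This models the idea of a wildcard filter, not Raygun's actual matcher.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")) // escape regex metacharacters
    .join(".*");
  return new RegExp("^" + escaped + "$", "i");
}

function matchesFilter(pattern: string, value: string): boolean {
  return wildcardToRegExp(pattern).test(value);
}

console.log(matchesFilter("192.168.*", "192.168.10.4")); // true
console.log(matchesFilter("*bot*", "Mozilla/5.0 (compatible; Googlebot/2.1)")); // true
console.log(matchesFilter("10.0.0.*", "10.0.1.7")); // false
```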
You can also toggle some of the more frequently requested filters on or off using the green sliders shown below:
Use the “Error Count” Dashboard tile as a visual aid
Raygun’s dashboard feature helps to organize vast amounts of data into easy to read charts and maps. Add the “Error Count” tile to your dashboard display to keep this number front of mind:
Error spikes are easy to spot with the “Error Count” tile. Use the TV Mode to display the data around your office, so error spikes are visible to the whole team.
Real User Monitoring (RUM)
Disregard session data in RUM
At the time of writing, the best way to prevent sessions from being recorded in RUM is the onBeforeSendRUM event in Raygun4JS. For more details, head to our Raygun4JS documentation on GitHub.
From here, you can inspect the raw payload that is about to be sent and then cancel sending. That could be on criteria like IP address, browser, tag, or any information Raygun collects.
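As a rough sketch, a handler for this event could look like the following. The `RumPayload` and `SessionContext` types and the sample rule are hypothetical. The sketch also assumes that returning `false` from the handler cancels the send and that the handler is registered with `rg4js('onBeforeSendRUM', ...)`, so confirm both against the Raygun4JS documentation before relying on them.

```typescript
// Hypothetical payload type: the real RUM payload shape is defined by Raygun4JS.
type RumPayload = { [key: string]: unknown };

// Hypothetical context you gather yourself, e.g. from navigator.userAgent.
interface SessionContext {
  userAgent: string;
  tags: string[];
}

type DropRule = (payload: RumPayload, ctx: SessionContext) => boolean;

// Build a handler that returns the payload to send it, or false to cancel
// (assumed cancellation convention; verify in the Raygun4JS docs).
function makeRumHandler(ctx: SessionContext, shouldDrop: DropRule) {
  return (payload: RumPayload): RumPayload | false =>
    shouldDrop(payload, ctx) ? false : payload;
}

// Example rule: skip sessions from crawlers or ones tagged as internal.
const dropBotsAndInternal: DropRule = (_payload, ctx) =>
  /bot|crawler|spider/i.test(ctx.userAgent) || ctx.tags.indexOf("internal") !== -1;

const rumHandler = makeRumHandler(
  { userAgent: "Mozilla/5.0 (compatible; Googlebot/2.1)", tags: [] },
  dropBotsAndInternal
);
console.log(rumHandler({ eventData: [] })); // false

// In the browser, registration would look roughly like:
//   rg4js('onBeforeSendRUM', rumHandler);
```

Keeping the rule separate from the handler makes it easy to test the filtering logic without a browser.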
Application Performance Monitoring (APM)
Set your sampling rate in APM
The sampling controls give you full control over the rate at which traces are accepted into the Raygun APM product. Sampling can be defined as a single rate applied to all traces or a rate per URL for a web application.
Raygun Agent-level sampling controls
All trace sampling settings are now controlled in-app. Any changes made to the sampling settings in the app are retrieved by all agents registered to that application and are applied within 10 minutes of making the sampling changes.
You can view trace sampling settings using the Raygun Profiler Configuration tool. To view the sampling settings for an application, open the Raygun Profiler Configuration tool then select the application that has been enabled.
In-app sampling controls
To view and edit trace sampling settings navigate to your Raygun application, select APM from the menu and then Sampling.
Sampling defaults
The sampling default options allow you to edit the trace sampling settings for the application. These settings are applied to all traces for the application, for each agent registered against the application.
Sampling can be specified as:
- Per trace - a single rate is applied over all traces
- Per URL - a rate defined in seconds, minutes or hours per URL and request verb (GET, PUT, POST, etc)
Sampling Overrides for Web Requests
For traces of web requests, sampling overrides can only be defined per URL, allowing you to override (increase or decrease) the sample rate of a given URL.
For example, you may have a specific website page you know is causing performance issues, and you want to capture more traces of requests to this page. This setting allows you to increase the sampling rate of this page.
To define an override, enter the URL you want to override and the sampling rate. Click Add override to apply the settings.
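To illustrate how a per-URL rate and an override interact, here is a conceptual sketch. It assumes that a per-URL rate means "at most one trace per interval for each verb and URL pair", which is an interpretation made for illustration. The `UrlSampler` class is hypothetical and is not how the Raygun agent is implemented.

```typescript
// Illustrative model only: assumes a per-URL rate means "at most one trace per
// interval for each verb + URL pair". Not the Raygun agent's implementation.
type IntervalTable = { [key: string]: number };

class UrlSampler {
  private lastAccepted: IntervalTable = {};

  constructor(
    private defaultIntervalMs: number,
    private overridesMs: IntervalTable = {}
  ) {}

  shouldSample(verb: string, url: string, nowMs: number): boolean {
    const key = verb.toUpperCase() + " " + url;
    const hasOverride = Object.prototype.hasOwnProperty.call(this.overridesMs, url);
    const interval = hasOverride ? this.overridesMs[url] : this.defaultIntervalMs;
    const seen = Object.prototype.hasOwnProperty.call(this.lastAccepted, key);
    if (seen && nowMs - this.lastAccepted[key] < interval) {
      return false; // a trace for this verb + URL was already taken in this interval
    }
    this.lastAccepted[key] = nowMs;
    return true;
  }
}

// Default: one trace per minute. Override: sample /checkout every 10 seconds.
const sampler = new UrlSampler(60 * 1000, { "/checkout": 10 * 1000 });
console.log(sampler.shouldSample("GET", "/checkout", 0));     // true
console.log(sampler.shouldSample("GET", "/checkout", 5000));  // false
console.log(sampler.shouldSample("GET", "/checkout", 10000)); // true
console.log(sampler.shouldSample("GET", "/home", 10000));     // true
console.log(sampler.shouldSample("GET", "/home", 20000));     // false
```

In this model, a shorter interval on the override captures more traces for the problem page while the default keeps overall trace volume low.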
Sampling Overrides for Other Traces
Not all traces will have a URL on them, for example, scheduled background activities. In this case, traces will be named by their entry point method - the first method in the trace.
To define an override:
Copy the trace name from the Traces tab and paste it into the Name field. Choose the desired sampling rate. Click Add override to apply the settings.
Note that this override type is only supported by APM Agent version 1.0.1172 or higher. Older agents will ignore these settings. This override is supported when profiling .NET / .NET Core code only. Support for other languages will be added in future.
Active Overrides
The list of active overrides allows you to see the list of current overrides. It also allows you to disable an override temporarily and delete overrides no longer needed.
To disable or enable an override, simply click the slider control. To delete an override, click on the X icon. Note that changes to this list are applied immediately.
In summary
Raygun ensures you have all the information you need to find and fix software errors. Adopt an effective workflow for triaging and resolving errors and you are sure to stay within your quota.
The Raygun team is always on hand to help with any questions you may have.