Grouping not working correctly
ToryB
Posted on
Jul 16 2015
Dear Raygun, we noticed a few weeks ago that our grouping has become increasingly inaccurate.
For example, some errors are miscategorized, causing us to overlook important bugs. Example: https://app.raygun.io/dashboard/b9tzih/errors/806837024#4251637693 is categorized as "Error: The security token included in the request is invalid." when you can clearly see the message is "Error: The level of configured provisioned throughput for the table was exceeded. Consider increasing your provisioning level with the UpdateTable API" - an error for which there's already a group.
Likewise, some errors aren't being grouped. https://app.raygun.io/dashboard/b9tzih/errors/809609731 and https://app.raygun.io/dashboard/b9tzih/errors/803738576 have the same error message, but aren't being automatically grouped.
What's going on with your magic grouping logic? It seems to not be performing as well as it was before :(.
Callum
Posted on
Jul 17 2015
Hi Tory,
The behavior you're seeing is the code-path grouping working the same way it has previously. In the first error group above, the various error instances all have the same stack trace (with the exception of the first frame, which is not used as it contains the message) and thus are grouped together. We don't use the message because in a large number of cases it contains junk instance data, such as user IDs or DateTime strings, that would otherwise cause instances to be split out into duplicate groups.
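As a rough illustration only (this is not Raygun's actual implementation), code-path grouping can be thought of as deriving a key from the class name and the stack frames while ignoring the message. The function name `code_path_key` and the `Logger::log` frame are hypothetical; the two messages are the ones from the first post.

```python
import hashlib

def code_path_key(class_name, frames):
    """Hypothetical grouping key: class name plus stack frames, message ignored."""
    # Skip the first frame, which in this example carries the message text.
    parts = [class_name] + list(frames[1:])
    return hashlib.sha1("\n".join(parts).encode("utf-8")).hexdigest()

# Two instances with different messages but the same (assumed) code path.
a = code_path_key("Error", [
    "Error: The security token included in the request is invalid.",
    "Logger::log",              # hypothetical centralized logger frame
    "DynamoDbClient::putItem",
])
b = code_path_key("Error", [
    "Error: The level of configured provisioned throughput for the table was exceeded.",
    "Logger::log",
    "DynamoDbClient::putItem",
])

print(a == b)  # True: same class name and frames, so both land in one group
```

Under these assumptions, two unrelated messages collapse into one group whenever the class name and the remaining frames match, which is the situation described for the centralized logger.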
The scenario with the first error group occurs because it is a PHP Error originating from a centralized logger class. Essentially, errors or exceptions need to provide a ClassName string so we can differentiate between them without using the message. If it is occurring in a library, you could get this working by trapping the Error at a lower level and rethrowing it as a subclass of Exception.
With regard to the apparent duplicate DynamoDB Throttled error groups, again these are different stack traces which represent different code paths. If you paste the stacks from two instances into a diff checker tool, you'll note that differences appear at frame 34, where some are DynamoDbClient::putItem and some are DynamoDbClient::getItem, alongside further differences with the daoObject.
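If you'd rather script the comparison than use a diff checker tool, a minimal sketch with Python's standard difflib module could look like the following. The stacks here are shortened, made-up stand-ins (only DynamoDbClient::putItem and DynamoDbClient::getItem come from the real traces), so the differing index will not match frame 34.

```python
import difflib

# Hypothetical, shortened stacks; real traces would have many more frames.
stack_put = ["Logger::log", "Dao::save", "DynamoDbClient::putItem"]
stack_get = ["Logger::log", "Dao::load", "DynamoDbClient::getItem"]

def first_difference(a, b):
    """Return the index of the first differing frame, or None if identical."""
    for i, (frame_a, frame_b) in enumerate(zip(a, b)):
        if frame_a != frame_b:
            return i
    # One stack may simply be a prefix of the other.
    return None if len(a) == len(b) else min(len(a), len(b))

# Print a unified diff of the two stacks, one frame per line.
for line in difflib.unified_diff(
    stack_put, stack_get,
    fromfile="instance_1", tofile="instance_2", lineterm="",
):
    print(line)

print(first_difference(stack_put, stack_get))  # 1
```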
To tidy these up, you can select them with the checkboxes on the error group list (https://app.raygun.io/dashboard/b9tzih), then click Merge.
There is an alternative: we could switch this app over to use naive message-based grouping. However, this will reassign all ignored errors to new, active groups, and as above, any messages that contain junk instance data will cause duplicate groups, potentially making the issue worse.
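To make the trade-off concrete, here is a small sketch of what naive message-based grouping implies, assuming the group key is simply the message text. The "Could not load user" messages are invented for illustration; the function name `message_key` is likewise hypothetical.

```python
from collections import Counter

def message_key(message):
    """Hypothetical naive key: the message text itself."""
    return message.strip()

messages = [
    "Error: Could not load user 1041",   # invented message with a user ID
    "Error: Could not load user 2977",   # same failure, different user ID
    "Error: The security token included in the request is invalid.",
    "Error: The security token included in the request is invalid.",
]

groups = Counter(message_key(m) for m in messages)
print(len(groups))  # 3
```

The two identical token errors share a group, but the two "Could not load user" instances are split apart purely because of the ID embedded in the message.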
Regards,
Callum Gavin
Mindscape Limited
ToryB
Posted on
Jul 18 2015
Hey Callum - I think we'd like to give the message-based grouping a try. That gives us more control. Can you let me know when that might roll out so we can keep an eye on it?
ToryB
Posted on
Jul 21 2015
Hi Callum - any word on when we can expect this setting to be changed? Also, we'd like it changed on all our applications.
We need this as soon as possible - we're having a lot of trouble differentiating the errors we need to act on from those that we just need to monitor right now. It's causing substantial problems for our team.
Thanks in advance.
Callum
Posted on
Jul 23 2015
Hi Tory,
This has now been actioned for your LeadId-Create application; if you want any other apps switched over to message-based let me know. Note that any instance data in the message will cause duplicate groups to appear.
Regards,
Callum Gavin
Mindscape Limited
ToryB
Posted on
Jul 23 2015
Can we get our Audit application switched over as well? Thanks.
RyanR
Posted on
Jan 12 2016
Can you also enable "naive message-based grouping" for the apps in my account? We have a number of instances where the grouping feature is not working as expected, and I expect this would resolve the issue. Better yet, I would like to see this as an option that can be enabled by the user (us).
klaird
Posted on
Mar 10 2017
Is this the default behaviour now, or is it something we can look at? I've got a bunch of .NET errors that are quite different from each other but are being grouped together.