Writing Alert Names That Don't Suck - How to Win the Hearts of SOC Analysts

“What you see and what you hear depends a great deal on where you are standing. It also depends on what sort of person you are.” ~ The Magician's Nephew, C.S. Lewis

Set the scene

Detection engineering is not just about the technical nitty gritty; it almost requires you to be a storyteller.

Detection engineering is inherently a very technical and often abstracted process, but there's also an intricately human part of creating a detection. The ultimate goal of detection engineering is to pick an event that requires further investigation out of a vast ocean of events and land it in front of an analyst (hopefully in the most efficient way possible), so the human interface is a key part of the craft.

But once you've picked a specific event to show to an analyst, how do you tell the story at a glance? How do you tell it so that the analyst is set up for success with triage and investigation? The alert name is often the very first contact, so it has to hit home first ball.

There’s a TLDR at the bottom (of sorts) if you don’t really feel led to read these (massively overcapitalized) ramblings

The Problem

How did we get to this blog post?

I was recently re-listening to Episode 24 of the DCP podcast, where they were interviewing Jamie Williams from MITRE. They were talking about EDR evaluations, and Jamie made a comment that really resonated. He said:

“…We know the scenario we executed like the back of our hand, we built slides, we walk them through it from a purple team perspective. Part of our results analysis is: turn your brain off, pretend I have no idea what's going on, look at the images from the vendor - does it actually detect and say what's going on, and how does it communicate that to the end user based on the scenario we're running? Some of the fun we do is just that, looking at a picture, and the vendor says 'this is this', and we say 'well, does it really?' I read everything in front of me, I understand under the hood what's happening, and what the motivation for this UI was, but it doesn't communicate it as well as you think it does.”

And that comment really struck me; I just went “damn, he's so right there.” It got me thinking about some similar problems I've observed in various SOCs and detection teams.

  • Problem numero uno: You're tasked with detection engineering. You pick a TTP, you pick a specific type of behavior, you research, you might make a tool graph (hey Jarrod), and you map out what happens - you do whatever you need to do. You get the syntax down in your tooling of choice, you add in the necessary doco required to triage and investigate it, and you might test the alert in your environment too.

But then you get to the end, having put in a huge amount of work, and you've got to save the alert. Sometimes you've gone so deep with the whole process that you name the alert in 30 seconds and save it because you're pushed for time.

The problem here is that you've got deep, intricate knowledge of what you're attempting to detect, and that can cloud your judgement when it comes to naming the alert. Something that's simple to you might be less simple to others with less background knowledge.

  • Problem numero due: We deal with this consistently in my SOC. This problem mainly comes from vendor security products such as an IPS/IDS, EDR, or Microsoft Azure Identity Protection, and their own individual and unique names for certain activities. It's not so much that they're wrong (although sometimes when I see them I very much feel they are), it's that they often use weird words or specific terminology that's unfamiliar. As Jamie said, vendors are often not able to remove themselves from the product, and oftentimes they're not fantastic at communicating what's happening. If you've ever been on-call and seen an “UNFAMILIAR SIGN IN PROPERTIES” alert show up in your paging app for the first time, you may agree.

How do we fix this?

Good Communication


Very quickly I think it might be sensible to figure out what good communication really looks like.

I only recently read this study on how to communicate with future societies about the storage of hazardous nuclear waste: Expert Judgment on Markers to Deter Inadvertent Human Intrusion into the Waste Isolation Pilot Plant.

And this quote stood out to me:

“Before one can communicate with future societies about the location and dangers of the wastes, it is important to consider with whom one is trying to communicate.”

  1. Point one: who you are trying to communicate with matters. Are they entry level, technical, or non-technical?
  2. It's clear. Good communication is clear and straight up; we don't dilly dally around the point.
  3. Using big words is cool, but maybe leave them out at this stage. You might want to say how something is Salient or rather Obstreperous, but it's super important when you're communicating with someone that they understand the words you're using; otherwise people will just think you're a bit of a weird unit.
  4. And finally, good communication is generally a two way street, meaning there is some sort of feedback mechanism.

The Plan ™

So here is how, in a perfect world, I would go about solving this problem, or at least making it better. Here are a few steps I like to take:

  1. At the top of the list: have a controlled language and documented terminology.
  2. Develop an SOP for creating alerts that includes a peer review process.
  3. Figure out what actually goes in that SOP - what works well in alerts.

Step 1 - Controlled Language And Terminology

If you're going to call an alert “Suspicious XYZ happened”, then you need to clearly define what the suspicious parameters are. They should be defined at least in a field in the alert or in an easily accessible taxonomy.
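As a rough sketch of what that could look like in practice (the qualifier words and definitions below are made-up examples, not a standard), a controlled vocabulary can be as simple as a shared lookup your team agrees on:

# Hypothetical controlled vocabulary for alert name qualifiers.
# These words and definitions are examples only - agree on your own as a team.
$AlertQualifiers = @{
    'Suspicious' = 'Deviates from a known baseline and needs human review'
    'Potential'  = 'Matches part of a known TTP but is not yet confirmed'
    'Confirmed'  = 'Verified as malicious by a prior investigation or a high-fidelity source'
}

# A quick peer review check: does the alert name start with an agreed qualifier?
$alertName = 'Suspicious XYZ happened'
$qualifier = ($alertName -split ' ')[0]
if (-not $AlertQualifiers.ContainsKey($qualifier)) {
    Write-Warning "Alert name uses an undefined qualifier: $qualifier"
}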

The University of Melbourne has a Great Guide on how you actually develop a naming convention.

Specifically, they call out the following:


Using consistent naming conventions has many benefits, including:

  • Improved retrieval of documents on shared drives and University systems
  • Facilitated disposal of documents when no longer required for business
  • Ensured the current or active version of a template can be easily identified
  • Supported sharing of information within your team and with collaborators
  • Easier and more efficient file naming for colleagues as they don’t have to ’re-think’ the process each time.

Step 2 - Develop SOP for Creating Alerts

I mean, this really is what it says on the can. When you're developing naming conventions, often just the simple act of getting together and talking about what works for people and what doesn't is a great starting place.

The end state here, however, is a well defined and written SOP. This SOP should go into the specific details and structure that you use for naming alerts at your organisation.

You should also include a peer review step as part of the alert implementation. This should sit alongside the technical peer review; I'd recommend calling it out as an individual step in the process, but do whatever works best for your team.

Step 3 - What actually works well in alerts

The 5 levels of messages

Super detailed messages are not always what you want in an alert; more words does not necessarily mean an alert name carries the right details.

Again drawing on my current obsession, the Sandia Labs report on long term radiation messaging: they have tiers of information in their messaging:

[Image: tiers of message information, from the Sandia Labs report]

Leaning on PowerShell

What I really want to draw on here is the verb-noun syntax for PowerShell. I really like the concept, as it sticks in your mind once you know it: “The verb part of the name identifies the action that the cmdlet performs. The noun part of the name identifies the entity on which the action is performed.”

https://learn.microsoft.com/en-us/powershell/scripting/developer/cmdlet/approved-verbs-for-windows-powershell-commands?view=powershell-7.3

For example, take this PowerShell string, where at the start we have what we want to do (the verb) and after that we have what we want to do it to (the noun):


Get-Service | Where-Object Status -eq "Stopped"

What would this potentially look like as an alert?

If we go along the lines of “this action happened here”, and draw on a common taxonomy for our wording, in this instance MITRE ATT&CK, we might end up with something like this:

Potential Credential Dumping of LSASS Occurred on LAP0002 By Admin_Tim

Suspicious Login from Anomalous location and device, user Tim logged in from new laptop and city
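As a sketch of how you might bake this into tooling (the function, its parameters, and the field names below are hypothetical, not taken from any product), you could generate names like the ones above from a few structured fields:

# Hypothetical helper that builds an alert name in a verb-noun style:
# qualifier + action + target, then where it happened and who did it.
function New-AlertName {
    param(
        [string]$Qualifier,  # from your controlled vocabulary, e.g. 'Potential'
        [string]$Technique,  # e.g. an ATT&CK technique name
        [string]$Target,     # the entity the action was performed on
        [string]$Hostname,
        [string]$Username
    )
    return "$Qualifier $Technique of $Target Occurred on $Hostname By $Username"
}

New-AlertName -Qualifier 'Potential' -Technique 'Credential Dumping' -Target 'LSASS' -Hostname 'LAP0002' -Username 'Admin_Tim'
# Potential Credential Dumping of LSASS Occurred on LAP0002 By Admin_Tim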

While ATT&CK is good for some things and not good for others (a topic for another day), it is good as a folk taxonomy, and using its Techniques and Procedures is a great way to get everyone singing from the same hymnbook.

It's also a good idea to put technical details in the alert to assist with early triage. To take the example from earlier: the things that are anomalous about the login, put them in the alert name.

Obviously we don't want entire paragraphs in the alert name, but where applicable you can help facilitate a faster triage process by including the information the analyst needs to make a decision, especially in times of pressure such as after hours (anyone who's ever triaged a Defender for Identity DCSync alert at 2am after deep sleep will know you're not exactly dealing with a full deck).
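For instance (sticking with purely hypothetical field values), appending the anomalous properties to the name keeps the decision-making context right in front of the analyst:

# Append the anomalous sign-in properties to the alert name so the analyst
# sees the triage context immediately. The values here are illustrative only.
$anomalies = @('new device', 'new city')
"Suspicious Login from Anomalous Location and Device ($($anomalies -join ', ')) - user Tim"
# Suspicious Login from Anomalous Location and Device (new device, new city) - user Tim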

All in all, it's very important for analysts to understand WHY an alert triggered, and we can help preload that by thinking about our alert names and using known words that trigger meaning in their minds.

Fin/TLDR

Thank you for your time, if you got this far nice work.

Upon reflection after writing all this, it really just boils down to this: don't create alerts in a vacuum, and ask your team if what you're trying to say makes sense.

You, as a team, must agree on how you will communicate with each other via alerts. If there are team specific terminologies, write them down, define them, make them easy for everyone to get to, and use a common taxonomy.

I would also recommend documenting how you name detections/alerts and communicating the nomenclature to your analyst team and detection engineers, to ensure important alerts are instantly recognized. This not only increases analyst productivity but also ensures critical alerts are treated with the priority they require during the triage process.

Short, sharp and concise is key to this. As a PowerShell nerd I have very much drawn on the verb-noun philosophy here: action happened to thing, using a known taxonomy and known categories of attacks (for example MITRE ATT&CK).

Getting your alerts technically peer reviewed is crucial, but as mentioned earlier it's also crucial that, as part of your peer review process, your reviewer reads out your alert name and is able to communicate back to you what it means. Hopefully, if you've done your job right, they can tell you the specific scenario you're looking for with the alert.

“Hello respected colleague…” Speak your alert name out loud to a minimum of one other person to ensure you're telling the story right.

To reiterate, contrary to popular belief there is sometimes too much of a good thing*; more words does not necessarily mean better explained.

*more cake is always a good thing

stay groovy