Maintaining Control within Incident Response Investigations - Part 1
Posted by Ryan Trost
Several weeks ago, I grabbed a drink with a friend and he randomly mentioned the term “incident pruning” and how critical it is to security operations efficiency. He has nearly 15 years of experience running a global incident response (IR) team for a large, highly targeted commercial enterprise, so I knew it was something he had been using in the trenches for quite some time. Our conversation didn’t focus on the term, but having run several large security operations center (SOC) teams (each including an IR component), I was intrigued enough to spend the next couple of days pondering it. What is incident pruning? Do all investigations require incident pruning? Are there best practices wrapped around incident pruning? Does a team’s size, maturity and/or skill set dictate incident pruning? Can automation (e.g., SOAR) play a role within incident pruning or is the process strictly manual? And do certain stages of an IR investigation use and benefit differently from incident pruning?
The definition I settled on can itself raise questions: the process of removing investigation paths during an incident that have been deemed benign, irrelevant, or out-of-scope. So I decided to bounce the topic off several industry friends to get their operational insights. The outcome was fascinating because my friends were pretty evenly split – 50% agreed with the definition, while the other 50% were strongly opposed to it because of the term ‘remove’. They argued that within their investigations, even if an investigation branch results in a negative finding – meaning either no results or no relevant results – it should not be removed, both for visibility and to ensure the task isn’t duplicated further down the investigation.
Another interesting observation surfaced as I was thinking through my friends’ responses and how they were pretty evenly divided into two camps. The friends who agreed with my original definition had spent the majority of their careers on an operational team NOT SOLELY focused on incident response but rather straddling many different security operations responsibilities. My friends who had worked on a dedicated incident response team or service offering, on the other hand, were hung up on the term “remove.”
This made a lot of sense, and I could relate given the separate teams I have managed in the past. Dedicated incident response teams, which provide that skill set on a daily basis, perform IR so frequently and repetitively that each engagement turns into a relatively sequential checklist of tasks to perform. So, as one friend pointed out, even if a task results in a negative finding, they still need to track and archive it in order to proceed down the checklist, avoid duplicating efforts in the future, and demonstrate they are following an established process. Given that perspective, incident pruning is more of a prioritized checklist of actions where NOTHING is deleted.
The opposing school of thought, where incident pruning relies heavily on the deletion of irrelevant branches, stems from analysts with a wide variety of daily responsibilities, oftentimes outside typical incident response efforts. These analysts tend to be jacks-of-all-trades for whom “incident response” is strictly a job title, whereas the reality of their roles and responsibilities is to tackle the spectrum of tasks found within a security department. Because their primary focus does not revolve solely around incident response actions, they do not have a predefined checklist to work through. Instead, they need to cast a wider net of intrusion possibilities and, for each path, explore the threat likelihood to either eliminate it and move on to the next path or advance the investigation by connecting the dots.
I struggled with the two viewpoints and decided the term itself, incident pruning, needed to be re-calibrated. For most audiences, the term incident pruning can probably cover both meanings without raising too many objections. However, I wanted to be a bit more precise since both sides pose valid but completely different viewpoints – delete or de-prioritize. As such, the term incident pruning needs to be broken down into two more distinct terms: “incident deadheading” for delete and “incident thinning” for de-prioritize. I realize they’re not as catchy, but both more accurately describe the overall intent of a team’s efforts.
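To make the distinction concrete, here is a minimal Python sketch of the two operations on a toy investigation tree. Everything in it (the `Path` class, the `thin` and `deadhead` functions, the sample path names) is hypothetical and invented purely for illustration; it is not taken from any IR tool or product, and a real team would track far more context per path.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    OPEN = "open"
    NEGATIVE = "negative"  # investigated, nothing relevant found


@dataclass
class Path:
    """One investigation path (hypothetical model, for illustration only)."""
    name: str
    priority: int = 1  # higher means work on it sooner
    status: Status = Status.OPEN
    children: list[Path] = field(default_factory=list)


def thin(path: Path) -> None:
    """Incident thinning: keep the path on record but de-prioritize it."""
    path.status = Status.NEGATIVE
    path.priority = 0


def deadhead(parent: Path, name: str) -> None:
    """Incident deadheading: delete the named child path and its subtree."""
    parent.children = [c for c in parent.children if c.name != name]


def active(path: Path) -> list[str]:
    """Names of all paths still open, collected depth-first."""
    names = [path.name] if path.status is Status.OPEN else []
    for child in path.children:
        names.extend(active(child))
    return names


# Sample incident with three made-up investigation paths.
incident = Path("phishing incident", children=[
    Path("credential theft"),
    Path("malware dropper"),
    Path("insider activity"),
])

thin(incident.children[2])             # negative finding, kept for the record
deadhead(incident, "malware dropper")  # disproven, removed from the board

print(active(incident))
print([c.name for c in incident.children])
```

After both calls, the thinned path is still attached to the incident (so a reviewer can see it was checked) but no longer shows up as active work, while the deadheaded path is gone entirely.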
Reaching into my own experience, I can relate to both approaches across two different teams. To demonstrate the ‘incident thinning’ mentality, I was managing a large government SOC and the client required that each investigation be peer-reviewed by a completely separate team for accuracy and thoroughness. In this situation, we relied 100% on ‘incident thinning’ because our incident responders could not delete any part of their investigation: their peer review counterparts wanted to see all the same information, whether it was a negative finding or not. However, this approach became exponentially more difficult when complex investigations spanned several weeks, jumped lateral points, included ‘sleeper cell backdoors’ or simply had a lot of contextual associations.
To highlight the ‘incident deadheading’ mentality, I was managing a 50+ analyst Defense Industrial Base (DIB) SOC team that worked hand-in-hand with a separate incident response team. The IR team was extremely talented but understaffed and had significant responsibilities outside the traditional incident response tradecraft. In most cases, when an incident was confirmed, both the SOC and IR teams would collaborate, and whiteboard sessions were more ‘organic’ as we brainstormed all the incident possibilities. As a result of this investigation strategy, we would subconsciously prioritize the list based on probability or likelihood. This approach does not lend itself to incident thinning because there are just too many investigation possibilities to track, so as investigation paths were deemed irrelevant or disproven, we would remove or “deadhead” them from the board. This kept the team focused on higher-risk efforts without the distraction of keeping everything and getting trapped within the ‘fog of war’.
To sum it up, incident response investigations are complex efforts, shifting between chaos and order, as the incident lead maintains investigation alignment with IR policies, while the team chases down every possible clue leaving no stone unturned. Without incident pruning and its offshoots (incident thinning and incident deadheading), investigations can spin out of control within a few minutes simply due to the number of possibilities – associated indicators, adversary aliases, MITRE ATT&CK tactics or techniques, victims, attributes, sightings, etc.
In Part 2 of this blog series I’ll talk about how to apply incident pruning to incident response investigations using ThreatQ Investigations.