Maintaining Control within Incident Response Investigations - Part 3
Posted by Ryan Trost
Security teams continuously look for ways to mature their processes and improve their incident response efforts. Incident pruning should be one of the first activities to consider, yet it is commonly overlooked. Often, when an event is escalated to an incident, it immediately attracts more eyes from fellow teams, which invites an increase in possible tasks, ideas, and a seemingly endless number of foxholes to investigate. This requires an incident response team lead, or someone with ‘bigger picture’ authority, to deadhead the investigation (top down) to avoid inefficiency and confusion, or an analyst to thin the investigation (bottom up) by publishing only relevant data points globally to the other teams.
Whether incident pruning is a new term to you or not, I guarantee your incident response team has been practicing it subconsciously to some degree to help prioritize tasks and available resources. Without incident pruning (or investigation pruning) investigations can spin out of control within a few minutes simply due to the number of possibilities – associated indicators, adversary aliases, MITRE ATT&CK tactics or techniques, victims, attributes, sightings, and more.
To wrap up this blog series, here are some of the most common questions I am asked when I discuss the topic.
Do you think an investigation needs to reach a certain size or complexity in order to require incident pruning – whether incident thinning or incident deadheading? If so, how can it be quantified? # of analysts? # of investigation possibilities? # of victims/targets? Lateral movement?
Absolutely. However, the point at which incident pruning becomes necessary will vary subtly from team to team and incident to incident. Analysts with several years of experience can often track several simple incident paths without relying on incident pruning, but that quickly spirals out of control once the investigation reaches a certain complexity.
There are several investigation characteristics that will initiate the need for incident pruning, including:
- Growing beyond four or five analysts
- Involving more than three roles
- Involving two or more analysts within the same role (symptomatic of a larger or more complex investigation)
- Transitioning from simple fact-finding efforts to more complex attack methods that extend beyond five to eight investigation paths (each path requiring its own micro-investigation, elaboration, and collaboration)
- The attack extending beyond 10 infected victims, each requiring individual dissection of the attack
- The attack reaching lateral movement across five hosts (either directly from patient-0 or sequentially)
The above are general rules of thumb to draw a line in the proverbial sand to ensure investigations run efficiently, but exceptions always exist at both ends of the spectrum. Some teams will initiate incident pruning much later in the investigation, whereas other teams will initiate incident pruning techniques almost immediately when an investigation is created.
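The rules of thumb above can be sketched as a simple check. This is a minimal illustration, not a ThreatQ feature: the `InvestigationStats` fields and the `pruning_triggers` function are hypothetical names, and the exact cut-offs (more than five analysts, more than eight paths) are one possible reading of the ranges given in the list.

```python
from dataclasses import dataclass


@dataclass
class InvestigationStats:
    """Snapshot of an investigation; all field names are hypothetical."""
    analysts: int
    roles: int
    max_analysts_in_one_role: int
    investigation_paths: int
    infected_victims: int
    lateral_movement_hosts: int


def pruning_triggers(stats: InvestigationStats) -> list[str]:
    """Return the rules of thumb that the investigation currently exceeds."""
    checks = [
        # Assumed cut-off: the upper end of "four or five".
        (stats.analysts > 5, "more than four or five analysts"),
        (stats.roles > 3, "more than three roles involved"),
        (stats.max_analysts_in_one_role >= 2, "two or more analysts in the same role"),
        # Assumed cut-off: the upper end of "five to eight".
        (stats.investigation_paths > 8, "more than five to eight investigation paths"),
        (stats.infected_victims > 10, "more than 10 infected victims"),
        (stats.lateral_movement_hosts >= 5, "lateral movement across five hosts"),
    ]
    return [reason for exceeded, reason in checks if exceeded]


stats = InvestigationStats(
    analysts=6,
    roles=2,
    max_analysts_in_one_role=3,
    investigation_paths=4,
    infected_victims=2,
    lateral_movement_hosts=0,
)
for reason in pruning_triggers(stats):
    print("Consider pruning:", reason)
```

Because teams draw the line in different places, the thresholds would normally be tuned per team rather than hard-coded as they are here.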
Should every possibility within an investigation have some sort of “likelihood or probability score” to better rank exploratory possibilities?
Maybe not “every” possibility, but it should be pretty close. Teams can do this explicitly by assigning a score to each node/object. Alternatively, teams can do it more subconsciously by prioritizing possibilities in a “waterfall approach,” where the items that pose the largest threat and are most likely to happen appear at the top, and priority decreases toward the end of the action list. The waterfall approach suits a link-analysis style of investigation because it is visually driven.
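As a rough sketch of the explicit approach, the snippet below assigns each possibility a score and sorts the list so the highest-priority items come first. The `Possibility` class, the 0.0 to 1.0 scales, and the choice to multiply threat by likelihood are all assumptions made for illustration; a team could just as easily use a single analyst-assigned number.

```python
from dataclasses import dataclass


@dataclass
class Possibility:
    """One exploratory lead in an investigation (hypothetical structure)."""
    name: str
    threat: float      # 0.0 to 1.0, analyst-assigned (assumed scale)
    likelihood: float  # 0.0 to 1.0, analyst-assigned (assumed scale)

    @property
    def score(self) -> float:
        # Assumed combination: threat weighted by likelihood.
        return self.threat * self.likelihood


def waterfall(possibilities: list[Possibility]) -> list[Possibility]:
    """Order possibilities from highest to lowest score."""
    return sorted(possibilities, key=lambda p: p.score, reverse=True)


leads = [
    Possibility("phishing email sender domain", threat=0.6, likelihood=0.9),
    Possibility("unrelated port scan", threat=0.2, likelihood=0.3),
    Possibility("credential reuse on file server", threat=0.9, likelihood=0.7),
]
for lead in waterfall(leads):
    print(f"{lead.score:.2f}  {lead.name}")
```

The lowest-ranked entries at the bottom of the list are the natural first candidates for thinning or deadheading.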
Does orchestration/playbook automation play a role within incident pruning efforts?
The human element will always remain vital in security operations, but automation has a place. With respect to incident pruning, automation logic can prune items based on:
- A low threat or probability score provided by an analyst
- A stale node/object based on the lack of activity taken against an item over a prolonged period of time
- A lack of associated characteristics, which strongly suggests the item is a conceptual stretch within the investigation and is irrelevant to the case
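The three criteria above could be expressed as playbook logic along the following lines. This is only a sketch under stated assumptions: the `Node` structure, the `prune_reasons` function, and the default thresholds (a score below 0.2, seven idle days, at least one related object) are hypothetical and do not correspond to any particular orchestration product's API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Node:
    """An investigation node/object (hypothetical structure)."""
    name: str
    score: float                 # analyst-assigned, 0.0 to 1.0 (assumed scale)
    last_activity: datetime      # last time anyone acted on the node
    related: set[str] = field(default_factory=set)  # names of associated objects


def prune_reasons(
    node: Node,
    now: datetime,
    min_score: float = 0.2,
    max_idle: timedelta = timedelta(days=7),
    min_related: int = 1,
) -> list[str]:
    """Return the reasons a node is a pruning candidate (empty list = keep)."""
    reasons = []
    if node.score < min_score:
        reasons.append("low score")
    if now - node.last_activity > max_idle:
        reasons.append("stale")
    if len(node.related) < min_related:
        reasons.append("no associations")
    return reasons


now = datetime(2024, 1, 10)
nodes = [
    Node("c2 domain", 0.8, datetime(2024, 1, 9), {"host-a"}),
    Node("odd user agent", 0.1, datetime(2023, 12, 1)),
]
for node in nodes:
    reasons = prune_reasons(node, now)
    if reasons:
        print(f"Flag '{node.name}' for review: {', '.join(reasons)}")
```

Returning reasons rather than deleting anything keeps the human element in the loop: the automation flags candidates and an analyst makes the final call.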
Are there different stages of an investigation that require more pruning than others?
IR workflows vary from team to team, so a good benchmark is the SANS Incident Response process, which includes Preparation, Identification, Containment, Eradication, Recovery, and Lessons Learned.
The SANS Incident Response checklist summarizes each IR stage. Below is a description of the checklist phases along with how incident pruning affects each stage.
Preparation: As expected, this stage revolves around ensuring the team is ready for incident response actions: ownership and responsibilities are assigned for all systems, communication channels are agreed upon, the team understands the structure of an attack, standard operating procedures (SOPs) are defined and agreed upon, and so on.
The Preparation stage does not lend itself to incident pruning. Preparation is more of an anticipatory checklist for the team and outside of minor periodic adjustments remains static.
Identification: This stage encompasses the discovery of the intrusion, whether generated internally or externally, as well as the research and categorization of the attack to determine the depth and breadth of the intrusion. The Identification stage accounts for a significant amount of the work because of the volume of information gathered and analyzed and the numerous attack possibilities. This includes matching the information gathered to the MITRE ATT&CK framework to better understand the attack landscape and possibly learn which adversary is behind the attack. This stage also concludes the Mean-Time-to-Detect (MTTD).
The Identification stage does lend itself to incident pruning, whether through incident thinning or deadheading, because a significant amount of the work in this stage requires brainstorming and exploration of possibilities. To minimize investigation clutter and duplicated effort, teams need to consciously sanitize and maintain a clean investigation to ensure effective and efficient defenses. The team’s dichotomy/taxonomy will determine the degree of incident pruning necessary, but this stage will inevitably entail the most incident pruning across the SANS IR stages.
Containment: This stage is a concerted effort to stop further entrenchment. However, this stage is controversial because some teams believe the priority should be to isolate and remove the threat immediately, whereas others believe it is better to monitor the adversary prior to “pulling the ripcord” in order to study the adversary’s movements and logic and to identify any “sleeper cells” hidden in the environment. Both are reasonable paths for an incident response team to pursue and are dictated largely by the skillset of the team and the sophistication of the tools available to “safely” isolate the adversary and prevent any additional harm. For example, the team may be able to re-route the adversary to a controlled environment or deception technology (e.g., Attivo Networks).
The Containment stage is an actionable phase where the security team implements steps to minimize additional adversary entrenchment. This stage includes updating signatures, deploying IOCs, correlating additional logs into a SIEM, pushing infected hosts into a controlled environment for monitoring, etc. The Containment stage is more of a checklist of countermeasures to enable or deploy, and it lends itself to incident pruning, specifically incident deadheading to eliminate countermeasures that are not effective.
Eradication: This stage focuses on the removal of all malicious software, implants, remote access trojans (RATs), internal command and control infrastructure, and backdoors from the environment. This is probably the hardest stage because adversaries can camouflage themselves extremely well.
The Eradication stage starts with a list of all the infected hosts and all the malicious software that needs to be removed. This phase lends itself well to a checklist effort and, therefore, to incident deadheading. However, rather than deadheading in the sense of deleting, leverage the Shared vs. Private feature to highlight which tasks still need to be accomplished: in this case, which hosts are still infected and which ones have been returned to a safe end state.
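The idea of tracking state rather than deleting entries can be illustrated with a small checklist sketch. It does not model the Shared vs. Private feature itself; `HostState`, `mark_clean`, and `still_infected` are hypothetical names, and the host names are made up for the example.

```python
from enum import Enum


class HostState(Enum):
    INFECTED = "infected"
    CLEAN = "clean"


def mark_clean(hosts: dict[str, HostState], host: str) -> None:
    """Record a host as remediated instead of deleting it from the list."""
    if host not in hosts:
        raise KeyError(f"unknown host: {host}")
    hosts[host] = HostState.CLEAN


def still_infected(hosts: dict[str, HostState]) -> list[str]:
    """Return the hosts that still need eradication work, sorted by name."""
    return sorted(name for name, state in hosts.items() if state is HostState.INFECTED)


hosts = {
    "ws-01": HostState.INFECTED,
    "ws-02": HostState.INFECTED,
    "srv-db": HostState.INFECTED,
}
mark_clean(hosts, "ws-02")
print("Remaining:", still_infected(hosts))
```

Keeping remediated hosts in the record, rather than removing them, preserves the full scope of the incident for the later Lessons Learned review.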
Recovery: This stage focuses on restoring all business functions, from bringing servers back online to regaining employee productivity, and includes implementing any defensive measures necessary to ensure future attacks will not be successful. The Recovery stage also concludes the Mean-Time-to-Respond (MTTR) stopwatch.
Restoring business function, though critical, is not likely to be enumerated within ThreatQ Investigations. However, tracking the implementation of the necessary defensive measures will align with ThreatQ Investigations and the efforts of the larger security team. In this situation, most teams will review the overall attack (i.e., align it to an attack framework like MITRE ATT&CK or the Cyber Kill Chain) to try to identify multiple defensive countermeasures for each attack phase. Typically, the countermeasures are not new technology deployments but rather correcting a misconfiguration, updating the proper signatures, funneling the relevant logs into the SIEM for correlation, or adjusting an internal workflow to ensure an alert receives attention significantly faster. Similar to the Eradication stage, this type of checklist effort lends itself well to a deadheading approach.
Lessons Learned: This stage is unequivocally the most commonly overlooked, primarily because either the existing daily tasks have piled up and the team is playing catch-up, or the team is inexperienced and doesn’t perform an after-action debrief.
The Lessons Learned stage is crucial, but most of the heavy lifting has already been done, so there is no need to prune any of the data. The focus of this stage is to summarize the incident and determine whether the countermeasures in place will be effective in the long run.
I hope you’ve found this series on incident pruning helpful and that you’ll start to use ThreatQ Investigations to help you in the process. As I mentioned at the beginning of this series, incident response investigations are complex efforts, shifting between chaos and order. Without incident pruning and its offshoots (incident thinning and incident deadheading), investigations can spin out of control within a few minutes simply due to the number of possibilities – associated indicators, adversary aliases, MITRE ATT&CK tactics or techniques, victims, attributes, sightings, etc.
While many teams may be practicing some level of incident thinning and/or deadheading subconsciously, the use case should help you formalize the process and apply it more effectively and efficiently at the right stages of an investigation.