News that as many as 143 million Americans may have had their data stolen from the consumer credit reporting agency Equifax has once again thrust data security into the spotlight.
While this attack was especially egregious — stolen data included Social Security numbers, birth dates, driver’s license numbers and some credit card numbers — every data breach has the potential to wreak havoc on lives and businesses.
Consumers want to know what they can do, and just as importantly, they want to know what the companies they entrust with their data are doing to keep it safe.
To learn how Rackspace handles data security, I sat down with Rackspace Managed Security Director of Operations Daniel Clayton and the Senior Manager of our Customer Security Operations Center Travis Mercier. We also discussed what practices all companies should follow to improve operational resiliency and respond better to similar attacks in the future.
Clayton, who came to Rackspace after more than two decades building and running operations for the NSA and the British military, oversees global customer security operations and strategy for Rackspace Managed Security. Mercier, who has more than 13 years of cybersecurity experience, runs day-to-day security operations for the company.
After a major breach, what goes on behind the scenes? What happens first?
TM: While every security operation is different, the first thing you have to do is assess the situation — understand what you’ve lost, who needs to be pulled in from various departments, and then start investigating the facts.
The next step is to determine the who, what, when, where, why and how. Specifically:
- Who or what actor was behind the data breach? This can help identify the why and the how, and help you understand motivations and tactics, techniques and procedures, or TTPs. It’s important not to spend too much time at this step, because you run the risk of getting bogged down before answering the other key questions. While attribution is helpful, it’s not the be-all and end-all of an investigation.
- What data was taken and what is the scope of the data compromised? This can help you understand what data is of interest to the adversary and what their motivations are, so you can begin looking at where else the adversary may have pivoted or tried to gain access.
- When did this data breach occur? When did this activity start and how long did it go on? This will help scope the investigation by narrowing down a timeframe for the security team to analyze. It avoids boiling the ocean — reviewing all activity over a larger period of time.
- Where was the breached data stored? This again will help limit the scope of the investigation and focus efforts on a specific set of systems, dramatically cutting down the time it takes to begin the investigation and start determining next steps.
- Why did the adversary target these systems? This is really to better understand the adversary’s motivations, which can help determine who the adversary is, what types of data they are after and what objectives they are trying to achieve.
- How did the adversary gain access to the compromised systems and applications? This is arguably the most important question and is key in remediating the breach and cutting off the attackers’ access. Understanding how attackers were able to exploit the environment enables organizations to deploy countermeasures and remediate vulnerabilities to stop the current attacker and help prevent other attackers from using similar techniques to gain access.
Once all that is determined, the next step is to figure out the best way to stop or slow the bleeding. This will depend on the adversary and how long this breach has been active. If you know this is the first time they’ve come into the environment and have only been in for 24 hours or less with minimal actions taken, it’s advisable to cut off access immediately. But if the breach has gone on for six months or longer, it’s likely they have multiple ways to get back into the environment, and you need to assess those before you fully engage.
If the compromise has gone on for a longer period of time, it’s critical to put some mitigating controls in place, but you want to be careful. You may not want to totally cut off the adversary, because if you tip them off, they could go dormant or, worse, become malicious and destroy evidence. That can really complicate an investigation.
How does a company organize the remediation efforts? How do they communicate what’s happened?
DC: There are three levels of communication that are very important in these situations. First, you need an internal communications plan. There will be multiple teams within the organization responding to a breach. Maybe that’s because you need to take forensic images of a particular system or you need to collaborate between the security and IT teams. Whatever it is, an internal communications plan has to be in place, has to be understood and has to be executed so everyone is working together.
Secondly, you have to think about what to tell your customers. Is it possible that whatever has happened is impacting them — not in terms of their data being out there, but has the adversary been able to move from your environment to a customer environment? Do you need to be actively communicating from an incident response perspective to your customers?
Finally, there’s the legal side. What are you obligated to tell people? What is the right thing to do in terms of transparency? When do you make those calls, and when do you speak to the press?
A lot of the time, security people will assume the worst: you’ll see something and be pretty convinced there’s been a data exfiltration, and then it will turn out later that wasn’t the case. I’ve seen that many times, so it’s important to have the communication piece buttoned up and be clear about what you’re saying. Otherwise you can make things worse.
Are the first 24-36 hours really key?
TM: Yes. That’s when it’s time to get all the appropriate parties into a war room — a large conference room or a digital platform where people can talk about the situation.
The goal is to keep the lines of communication open between teams. Typically, these are out-of-band types of communication, because if you have an adversary in your environment, you don’t know if email or other methods of communication have been compromised. So it’s key to develop a plan of action around how to divulge information to your teams.
What can be done to help prepare for a breach?
DC: When I presented at Solve New York, Sean Wessman, the principal and cybersecurity leader for Automotive and Transportation at EY, described an incident he responded to; when he arrived in the building, people were walking down the halls crying, because they thought it was going to be this Armageddon event for the company. It’s not uncommon for people to have an emotional reaction to mistakes that may have been made — people take it personally.
If you haven’t prepared for it, it’s during this period that people aren’t thinking straight, and things can happen that result in litigation. Someone tries to cover something up, or someone tries to fix something without understanding the implications of what they’re doing. These are the things that can come back and bite you. You must try to prepare for every eventuality, so when you’re reacting, it’s like a drill. It’s muscle memory, as opposed to an emotional response, which is what happens most of the time.
My advice is to think not just about the security team but also about how to support the security team, and all the peripheral things that need to happen to make an incident response effective. That way, when people have to deal with it, they don’t feel like they’re doing it for the first time. It’s not “oh my god,” it’s “oh my god, I know what to do.”
How do businesses improve operational resiliency?
DC: Generally, we think of operations in three layers — there’s maturity, which is what you’ve implemented from a people, process and technology perspective. There’s effectiveness, which is your ability to deliver the potential of what you’ve built. And then there’s resiliency, which is your ability to deliver under stress and under pressure.
The resiliency piece is what’s important in these situations. This is where things go wrong, where people get emotional and tired and make bad decisions. In all of my experience, I haven’t found a better way to prepare for that than practice. Most companies don’t practice at all – they put things in place they think will protect them then cross their fingers, but don’t think about what to do when things do go badly. If you haven’t thought through and processed what’s going to happen when you execute on something, then you’re really trusting to luck.
Even the most forward-thinking companies that do war gaming and take this seriously, nine times out of 10, they’re only testing the security operations center. They’re not testing their response on an enterprise level. I know I’ve said it before, but that is the stuff that gets you sued – it’s mistakes you make there that get you into trouble.