Sara Anstey, director of data analytics at Novacoast, knows a thing or two about data and how it can drive business decisions and optimize results. In this week’s episode, she shares her analytics expertise and how it can help shape an organization’s business and cyber model.
Ep. 25 | Reimagining Cyber | Data Analytics in Cyber: Work Smarter, Not Harder | Sara Anstey
Sara Anstey 00:03
You know, it's a great point that we're not reinventing the wheel here, we're not creating some brand new concept for how to analyze data. We're just reinventing it within cybersecurity, because cybersecurity as a whole has always been kind of lagging behind these other industries. And all we're saying is, if this is working for someone else, let's use it for us. We're trying to solve the same problems here.
Rob Aragao 00:27
Welcome to the Reimagining Cyber podcast, where we share short and to-the-point perspectives on the cyber landscape. It's all about engaging yet casual conversations and what organizations are doing to reimagine their cyber programs, while ensuring their business objectives are top priority. With my co-host, Stan Wisseman, Head of Security Strategist, I'm Robert Aragao, Chief Security Strategist, and this is Reimagining Cyber. So Stan, who do we have joining us for this episode?
Stan Wisseman 00:55
Rob, our guest today is Sara Anstey. Sara is a Data Analytics Manager at Novacoast and is passionate about empowering businesses to use everyday data to make strategic business decisions. She believes that the intentional adoption of a data-driven culture can be a key differentiator to companies in today's security climate. Sara has experience in custom web development, artificial intelligence, data analytics, business intelligence, and applied statistics. It's great to have you with us today, Sara. Can you start by sharing a little bit more about your background for us?
Sara Anstey 01:30
Yeah, thanks. No, it's great to be on. Thanks for inviting me. Um, so yeah, like, like you said, I'm a Data Analytics Manager for a company called Novacoast. We're a cybersecurity consulting firm. So, a lot of the work that I do is helping our clients understand their risk and understand their data within the cybersecurity space. I work with a lot of software products in the cybersecurity market to try to help with their dashboarding and their reporting. I try to do a lot of executive reporting to CISO levels, to the CEO to the board, and really help everyone make the best business decisions from their data and get the most use out of the data they're collecting in all of their systems. So, as far as my background, I've been working with Novacoast for about four years now. I worked for Apple on their maps team as a data scientist, and I majored in engineering from the University of Michigan.
Stan Wisseman 02:23
I think data analytics is being used seemingly by everybody, right, to apply that kind of science to different use cases. And when it comes to cybersecurity, what are some of the specific benefits that you're seeing companies start to realize by using data analytics?
Sara Anstey 02:41
Yeah, absolutely. It's definitely something that's being adopted a lot more lately, not only in cybersecurity, but in a lot of different industries in the business world. And what I find is that there's a huge step between using your data and analyzing it, and then using it to make business decisions and actually get intelligent information from it. And that's a step that companies are having a really hard time making, or trying to invest a lot of money into, to make that leap. Within cybersecurity specifically, one of the biggest benefits of analyzing data is that everything within cybersecurity is not very visible; it's something that is really hard to see. You can't really see your risk of getting breached, you can't see attacks coming in. And for that reason, a lot of the time CEOs or the boards of companies don't want to invest that much money in it, because they don't necessarily see a return on investment from it, or they don't see revenue being generated. So, a way that we've been using data analytics for a lot of our customers lately is to help people understand the value of the investment that they're making in cybersecurity, and why it's important to purchase different tools or have headcount within your cyber department, even though it's not necessarily generating revenue for a company.
Stan Wisseman 03:53
So, if you can demonstrate that value to executives, they can, you know, hopefully get behind the business case to purchase or to support the team.
Sara Anstey 04:03
Exactly. So it's about looking at not only, you know, what different areas are bringing in revenue, but how can we stop the potential loss of money for an organization, and realizing that investing in something that is going to stop you from losing money is just as good as making money in the organization. And that that still needs to be invested in, and helping to understand that, you know, the vulnerability management product that you purchased is actually helping you patch and it's helping make you more secure, and helping you understand that using a PAM solution is going to, you know, make you more secure within your access management. So really just helping people see the value of the tools that they already have, and then realizing whether or not they should purchase additional ones.
Rob Aragao 04:45
That makes sense. And it's a lot of the same things, I think, you know, we've been hearing, right? You're talking to different organizations out there, and the value realization, as you just mentioned, I think is still a problem area. So when you take into consideration the conversations that you're having, the engagements you're having with different clients, we're still seeing kind of this difficulty of risk quantification, cyber-related risk quantification, and providing clarity of that, right? And something that, as you said, is tangible, so they can understand it, they can understand the impacts to the organization. When you look at different models, you know, such as FAIR, there are some question marks at times around, you know, how applicable is it and things of that nature. That's fine. But really, what are some of the different ways that you're starting to see the board, the executive suite, have more clarity on what the actual risk ramifications can be, kind of one of the visual representations? What are the models that are able to start providing value, that are relatable and understandable to them as to the impacts that would mean to their organizations?
Sara Anstey 05:47
Yeah, yeah, absolutely. So one of the most important ways to help the board or the CEO realize that value and start to communicate this to them is, like you said, what we call quantifiable or quantitative risk analysis. So this is about stopping from using those qualitative scales. So think red, yellow, green, or, you know, scale of one to five, and rating risks on that type of risk matrix, to move into really more of a quantitative approach. And you mentioned FAIR, that's what they're all about. So the FAIR Institute is trying to help cybersecurity use quantitative models from the financial industry, and understand your risk in that way. And that's also, you know, what we've been doing and helping our customers with. So going to your CEO or your board and saying we have 800 critical vulnerabilities right now, and we can patch 100 in the next month, you know, that's really hard to communicate with them, especially if it's a non-technical audience, or it's hard for them to translate those technical metrics into ROI, or what should we invest in the organization. Whereas, if you come to them with dollar amounts and risk likelihoods, it's a lot more in their terms. And that's what quantitative risk analysis and Monte Carlo simulations are all about. It's helping to translate those really technical cybersecurity metrics into: our risk for getting breached this year is $250,000, we should invest an additional $1,000 or $100,000, you know, to mitigate that risk. And it's helping to translate things into what the board wants to hear and what they understand.
Stan Wisseman 07:26
So, you mentioned Monte Carlo models, and I guess I've seen those used elsewhere, but I hadn't really necessarily seen it applied in the cybersecurity context. Can you expand on how the use of Monte Carlo models can help potentially predict the probability of different outcomes and provide some examples? Have you utilized them?
Sara Anstey 07:48
Yeah, yeah, absolutely. So Monte Carlo models are a really interesting use case, because they're used a lot of times, like you said, in other industries, such as the financial industry or the insurance industry, to try to understand risk and how much premiums people should be charged. So if you think about the risk of insuring a patient, you don't know if they're going to get cancer in the next year, you don't know if they're going to get in a car crash, right? You don't know the risk of something happening to them. But you have to charge them a premium, to try to mitigate and understand your risk and still make a profit. The model that insurance companies use for that is called a Monte Carlo simulation, where you basically will simulate the next year 10,000 or 100,000 different times with all different variables, all different things that could possibly happen. So all of your inputs are going to be a range. It's something that could happen in the next year, you know, you could get cancer, you could stay healthy, you could develop diabetes, all these things that might happen and what the likelihood of them happening is. And then you simulate that next year 100,000 times and you see, on average, how much are we spending to insure this person, and from there, you can develop what your premium is going to be. If you translate that over to cybersecurity, we can actually do the same thing. Because what we're trying to figure out is what is our risk of getting breached in the next year? And how much should we invest to potentially mitigate that, and how much of that investment is going to be seen as ROI in terms of reducing our risk for a potential breach? And we can simulate the next year 1,000 or 100,000 different times and look at, well, if we look at the next year 100,000 different times, how many times do people click on a phishing link? How many times does that phishing link contain malware?
Or how many times do people put in their credentials and, you know, get their admin account taken over? How many times do these different things happen? And then we can start to look at the averages and say, on average in the next year, if we look at it 100,000 different times, we lost $285,000 due to getting breached and due to penalties that we had to pay for it and PII getting stolen and reputation loss. And then we can say, because we know that that is our risk, on average we're losing that much money. If we invest $100,000, it'll actually bring our risk down to, you know, only $50,000. And that's how we can see a positive ROI there. So that's kind of, in brief, what a Monte Carlo simulation is. It's using a variety of different inputs, because we don't know for sure what's going to happen, and doing multiple simulations based on different distributions to try to understand, on average, what could happen and what is our risk.
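The simulation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model Novacoast uses: the function names, the number of phishing emails per year, the probability ranges, the loss range, and the choice of uniform distributions are all hypothetical placeholders.

```python
import random
import statistics


def simulate_year(rng, p_click, p_malware, loss_range, n_emails):
    """Simulate one year: each phishing email may be clicked, and a click may deliver malware."""
    loss = 0.0
    for _ in range(n_emails):
        if rng.random() < p_click and rng.random() < p_malware:
            # A successful attack costs some amount drawn from the loss range.
            loss += rng.uniform(*loss_range)
    return loss


def simulate(n_trials, p_click_range, p_malware_range, loss_range, n_emails=50, seed=0):
    """Replay the next year n_trials times and return the loss from each trial."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    losses = []
    for _ in range(n_trials):
        # Every input is a range, so draw fresh values for each simulated year.
        p_click = rng.uniform(*p_click_range)
        p_malware = rng.uniform(*p_malware_range)
        losses.append(simulate_year(rng, p_click, p_malware, loss_range, n_emails))
    return losses


# Hypothetical inputs: 2-10% click rate, 5-20% malware rate, $50k-$500k per incident.
losses = simulate(10_000, (0.02, 0.10), (0.05, 0.20), (50_000, 500_000))
print(f"Average annual loss: ${statistics.mean(losses):,.0f}")
print(f"Years with no loss:  {sum(1 for x in losses if x == 0) / len(losses):.0%}")
```

Running the same function again with narrower ranges that reflect a proposed control gives the "after" figure to compare against this baseline.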
Stan Wisseman 10:27
I mean, I think that's much more defensible than what I used to do, which was, okay, so an average record, as far as how much it costs, impact-wise, if there's a data breach, is this much, and we have this many records, and therefore our potential breach cost is X, and we try to make a business case around that. It's sort of like pie in the sky and not very compelling to an executive, like, really, is that all you're doing, calculating the potential maximum breach costs? That doesn't really fly very well when you're trying to get money from them. Whereas if you actually have a much more defensible, analyzed, quantitative approach, you can actually, hopefully, get past those naysayers.
Rob Aragao 11:16
I think the example, though, of using proven models like Monte Carlo from other industries is a great way. I think, you know, if we look at cybersecurity, it's still very immature, comparatively speaking, as a kind of functional area that we have to support a business, right? Finance, marketing, all those things have been around for a while; they've learned along the way, right? Marketing does a lot of things around analytics to help them drive kind of targeting customers. So I think your tie-in here to leveraging something that is proven in other functional parts of an organization, and translating that, is great, because there's a foundation for them, from the board level, or at least the executive suite, to understand: okay, I see where we're going with this. If you take that one step further, Sara, have you seen some real-world examples of any organizations that are starting to take that path? And is it providing a better, you know, reality of, you know, what the probability of these things happening is, the actual accuracy, right, the actual cost associated? Are you seeing any kind of early adopters where that's coming to fruition?
Sara Anstey 12:20
Yeah, yeah, absolutely. And it's, you know, it's a great point that we're not reinventing the wheel here. We're not creating some brand new concept for how to analyze data. We're just reinventing it within cybersecurity, because cybersecurity as a whole has always been kind of lagging behind these other industries. And all we're saying is, if this is working for someone else, let's use it for us. We're trying to solve the same problems here. So, you know, let's use what's already been proven. And yeah, like you said, there's a lot of organizations that are starting to not only come to us to help them do that, but have also had internal champions trying to get FAIR, you know, more realized at their organization or having quantitative risk models applied. So we were working with one of the large tech firms in the U.S. and doing some other data analytics dashboarding work for them. And I just kind of started talking to one of their cybersecurity leaders. And he was telling me all about how he's bringing quantitative risk analysis into his organization to try to help translate to the board, because when he was taking over this new position, he realized, you know, he had an obligation to report to the CEO every month, they had to sync up about their risk in cybersecurity, and he really had no good way to do it. So we're seeing a lot of different organizations trying to do this. And it's everywhere from those large financials, who, like I said, are using it for their financial risk and are now starting to adopt it in cybersecurity, all the way down to small companies who are using it more to try to justify budget for purchasing a product. So we find a lot of times that either smaller financial institutions or smaller health care organizations are trying to implement this because their CISO, or one of their VPs in cybersecurity, knows that they need to buy, let's say, a PAM solution. And they know that they have a lot of risk and they need to buy it.
But they don't know how to justify that to the board and say I need $50,000 to buy this PAM solution, and here's why. And so we find for smaller use cases in smaller companies, they've been using this quantitative risk analysis to justify purchasing a single product, that a lot of times might be, you know, more of a stretch in budget for those smaller companies who just don't have as much that they can put toward a cybersecurity defense or budget.
Stan Wisseman 14:28
So obviously, the data you're working with is important, right, those data sets. You mentioned the insurance companies, right? The insurance companies have a lot of data to work with that they can then feed into the models for their simulations. As far as what you're working with, to ensure that you're, you know, hopefully working with data that doesn't lead to decision making that is flawed, right? Are you curating your own data to feed those models? Are you having to extract data from the customer? Is it a mix between the two? As far as ultimately to say, here's what we recommend, based on the models that we are running with the data sets we have, you know, how are you actually curating that data?
Sara Anstey 15:13
Yeah, yeah. So there's a couple of different resources that we use. The first would be, so, like you mentioned, insurance industries have a lot of historical data; they can see what's been happening in the past, you know, what people's likelihoods are to get different diseases or things like that. In cybersecurity, really, the only version of that kind of historical data that we're starting to have is with a lot of industry reports. So if you think of reports like the Verizon Data Breach Report or the IBM Cost of a Breach, we can pull data from those reports for the last few years to try to at least understand, you know, when companies are getting breached, and breached through different attack vectors, how much, on average, are they losing? How many records are being breached? You know, what's the time to detect that breach, things like that. So we can pull in a little bit of historical data, but we definitely don't have as much available as other areas. So we'll use some of that data. And then what we do is we supplement it with subject matter expert estimations. So if we look back to the Monte Carlo simulations, where I said we input a range of data, the way that we're doing this is we're talking to subject matter experts, whether it's in their organization or, you know, a third-party organization, and saying, when you look at your security environment right now, in your organization, what do you think the likelihood of someone clicking on a phishing attack is? Or what do you think the likelihood is, you know, if a phishing attack is clicked, that it's going to contain malware? And so we'll conduct these interviews to try to get their estimations and put ranges into that Monte Carlo simulation so that we can simulate different outcomes.
And then the other thing that we're doing in that space is, if you look at kind of a psychological concept, actually, about how humans can estimate things that they don't know, humans are actually really bad inherently at estimating when they don't know the true answer to something. So a really common example would be if your boss comes to you for a project and says, hey, how many hours is this going to take you? And you kind of estimate it. Inherently, you usually will underestimate a lot; you're really bad at doing those estimations. But as humans, if you look at a lot of different psychological principles, you can actually learn how to become better at estimating things, even if you don't know the outcome. So the other thing that we'll do is we will train people on how to get rid of their internal bias and become better estimators when we're using those estimations for the input data, to try to make those models more sticky and really have more realistic inputs. And then the last thing that we'll use for input data is just environment data for different companies. So we'll look at, you know, if you ran a phishing simulation last year, what percent of your users clicked? Or how many critical vulnerabilities do you have on your different laptops, things like that. So kind of a supplement of all three types: you get the industry reports, data from their environment, and estimations from experts who understand the industry and their current adoption of best practices.
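One way to turn an expert's range into a simulation input is sketched below in Python. The episode does not say which distribution is used; this sketch assumes a common convention in quantitative risk work, where the expert gives a lower and upper bound they are 90% confident about and the range is mapped to a lognormal distribution. The function names and the $50,000 to $500,000 example range are hypothetical.

```python
import math
import random

Z_90 = 1.645  # z-score that bounds the central 90% of a normal distribution


def lognormal_from_interval(low, high):
    """Map a 90% confidence range (low, high) to lognormal parameters (mu, sigma)."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z_90)
    return mu, sigma


def sample_loss(rng, low, high):
    """Draw one loss amount consistent with the expert's 90% range."""
    mu, sigma = lognormal_from_interval(low, high)
    return rng.lognormvariate(mu, sigma)


# Hypothetical estimate: "I'm 90% sure one incident costs between $50k and $500k."
rng = random.Random(42)
draws = [sample_loss(rng, 50_000, 500_000) for _ in range(20_000)]
inside = sum(1 for d in draws if 50_000 <= d <= 500_000) / len(draws)
print(f"Share of draws inside the expert's range: {inside:.1%}")
```

Roughly nine in ten draws land inside the stated range, with the rest spread across a short lower tail and a longer upper tail, which leaves room for the rare, very expensive incident.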
Rob Aragao 18:08
So you know, it's interesting, as you're talking about the different approaches to modeling risk overall, right, that people are taking, I'm just kind of putting on my visualization cap and saying, you know, when we present to gain, if it's the board, if it's, you know, to get executive buy-in, that simple visualization, that translation is what I'm getting at. You talked about earlier, like, you know, hey, we have 100 vulnerabilities, if we solve for these 100 it equals this, and it's still just too much of a technical conversation. They don't understand what that impact actually is. You painted the picture of an example of, you know, a smaller company that can take some of this kind of modeling and translate it back into why they need to invest in PAM and what the relative dollars could be there that are saved by spending X, right? But you're still saving, I think in your examples it was like $200,000, if something happens, relative to not having a type of solution. Great, great examples. Is there an example, though, that you can kind of walk through, again, even a visual representation, where they're having this conversation to get the executive buy-in? If it's a board-level conversation, you know, it's kind of the CISO seven-minute slot, as I like to call it. What's that simple kind of view that you've seen work, that just gets them to say, okay, that's logically understood by me at this level that I'm at within the business, to be effectively something that we're going to say yes to, and give you the knowledge to move forward with? Is there anything that kind of comes to mind as a visual way to look at this?
Sara Anstey 19:35
Yeah, there's a couple different things. So usually, when we run these simulations, the outputs are going to be what we call your inherent risk. So that's your risk right now in terms of a dollar amount, say it's, you know, 200,000, or whatever it is, if you don't implement any additional controls in your environment. And then we'll have what's called your residual risk. So say we're running a simulation of you buying the PAM solution, for example. It's, if you were to implement that, to not only buy it but also maintain it at a high level of efficacy, what is your residual risk, your leftover risk, going to be with that implemented? So say your inherent risk was at 250k, and now your residual risk is only 120,000, for example. We'll present those two numbers. And those are just dollar amounts that are obviously very easy for them to digest. And then we can also have what's called an ROI multiplier. So your ROI multiplier is just going to be looking at how much did it reduce your risk, so looking at that 250 minus 120,000, over what was the actual cost, not only to buy that control, but also to have people install it and implement it, and then maintain it for the next year, again, at a high level of usage and efficacy. So then we'll have an ROI multiplier, and we can look and say, if it's positive, we know that there's a positive ROI on this tool, and you should buy it. And then if it's negative, we know that you're actually going to be spending more money buying and maintaining this tool than it's going to be reducing your risk. So in this case, it's not a good investment. So those are some of the outputs. And then to visualize it a little bit more, what we'll also do is typically what's called a histogram. So looking at a plot of, if we were to simulate that next year, like I said, 100,000 times, we will plot out for every single simulation, all 100,000, how much money you lost in that trial. And think about it, a lot of them are going to be zero, right?
So let's say half the time in the next year, you don't get breached, you don't lose a single dollar due to any cybersecurity breaches or anything like that. But there could be some that are 1 million, 2 million, you know, accounting for those major data breaches, because we always want to account for the possibility that that can happen. So we'll plot all of those replications out into a graph. That way the board can actually see, hey, if we replicate this next year 100,000 times, here's how many times you guys are losing over $500,000 due to a cybersecurity breach, here's how many times you're losing over a million, and then we'll have the conversation with them: are you okay with that? Because some companies, if you have only a one in 100,000 chance of losing $2 million in the next year, they're like, yeah, we're okay with that amount of risk. That's below what our risk tolerance as an organization is. Some companies who, you know, maybe their total revenue is a million dollars, would say absolutely not, you know, we need to mitigate this, we need to try to bring that down. So it's all about just having a conversation with them, really, in terms of dollars and saying, what is your risk appetite? Are you okay with losing $100,000 20% of the time? And if not, we need to continue investing more in cybersecurity, to get that below what your acceptable level of risk is.
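The outputs described above (inherent risk, residual risk, the ROI multiplier, and the histogram of simulated losses) can be sketched in Python as follows. The 250k and 120k figures come from the example in the conversation; the $50,000 total cost, the ten sample trial losses, the bin edges, and every function name are hypothetical. The episode speaks of a "positive" or "negative" ROI, so the sketch also includes a net figure that subtracts cost and goes negative when the control costs more than the risk it removes; that reading is an assumption.

```python
import bisect


def roi_multiplier(inherent_risk, residual_risk, total_cost):
    """Risk reduction per dollar spent on buying, installing, and maintaining a control."""
    return (inherent_risk - residual_risk) / total_cost


def net_roi(inherent_risk, residual_risk, total_cost):
    """Risk reduction minus cost, relative to cost; negative means a poor investment."""
    return (inherent_risk - residual_risk - total_cost) / total_cost


def exceedance_probability(losses, threshold):
    """Share of simulated years in which the loss was greater than the threshold."""
    return sum(1 for loss in losses if loss > threshold) / len(losses)


def histogram(losses, bin_edges):
    """Count trials per bin; bin i covers [edge_i, edge_i+1) and the last bin is open-ended.

    Assumes non-negative losses and a first edge of 0.
    """
    counts = [0] * len(bin_edges)
    for loss in losses:
        counts[bisect.bisect_right(bin_edges, loss) - 1] += 1
    return counts


print("ROI multiplier:", roi_multiplier(250_000, 120_000, 50_000))
print("Net ROI:       ", net_roi(250_000, 120_000, 50_000))

# Hypothetical losses from ten simulated years (a real run would have 100,000).
trial_losses = [0, 0, 0, 0, 0, 20_000, 80_000, 150_000, 600_000, 2_000_000]
edges = [0, 1, 100_000, 500_000, 1_000_000]
print("Trials per bin:", histogram(trial_losses, edges))
print("P(loss > $500k):", exceedance_probability(trial_losses, 500_000))
print("P(loss > $1M):  ", exceedance_probability(trial_losses, 1_000_000))
```

With these numbers the control removes $130,000 of risk for $50,000 of spend, a multiplier of 2.6. The exceedance figures are the ones to put next to the organization's stated risk appetite.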
Stan Wisseman 22:50
Sara, thanks for describing your visuals. For our podcast audience, it'd be so much easier if we were able to just say, here are the examples. But I did feel I could understand what you're trying to convey there. We've been talking about the executive level and helping drive decisions, right, with this data. It seems to me, though, you can actually take it down, right? I mean, you should be able to help with other levels of the organization by providing them this information to help them with their roles. Are you seeing that as well?
Sara Anstey 23:24
Yeah, yeah, absolutely. So typically, the most effective way to use data within cybersecurity and visualize things is, you're going to have those very high-level dashboards, like we've been talking about, that just show dollar amounts, they just show risk, they just show ROI for the CEOs and the executives. But then if you take it a step down, let's say maybe your CISO, well, your CISO can get a little bit more effective and, you know, technical. They can kind of understand vulnerability count, maybe they can understand a couple, you know, SIEM alerts and different DLP things. And so we can take it down maybe one more level for them, still showing things at a pretty high level, but maybe showing, hey, your SIEM had a 500% increase in alerts in the past week. And then the CISO will go to, you know, their SOC and say, hey, what's going on with our SIEM? And then we want to drill down to the next level of dashboard. So now we want a specific SIEM-level dashboard, showing all the different alerts, all the different cases, what's been firing, when, what different rules are going off. So thinking about it as always just drilling down another level. So we'll start high level, and then say, if we're looking at a CISO, anything that's red, or anything that's kind of on fire, huge increases, let's drill into that. And then once we drill into that, maybe we can find what's the actual outlier here, or, oh, it was the past week that there's been more alerts going off. Okay, let's drill into the logs. And now we're looking at the logs. And now we're really understanding all of the source data. So it's just about continuing to think of the different perspectives, all the way from the CEO down to your security analyst, and what different data they're going to need to help perform their job and answer the questions that their boss is asking.
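The first drill-down step, spotting which tool is "on fire" before opening its detailed dashboard, can be sketched in Python like this. The tool names, alert counts, and the 100% threshold are hypothetical, as are the function names; a real dashboard would read these counts from the tools themselves.

```python
def percent_change(previous, current):
    """Week-over-week change in alert volume, as a percentage of the previous week."""
    if previous == 0:
        raise ValueError("previous week's count must be non-zero")
    return (current - previous) / previous * 100


def flag_spikes(weekly_counts, threshold_pct=100.0):
    """Return (tool, percent change) pairs at or above the threshold, largest first."""
    changes = [
        (tool, percent_change(previous, current))
        for tool, (previous, current) in weekly_counts.items()
    ]
    spikes = [item for item in changes if item[1] >= threshold_pct]
    return sorted(spikes, key=lambda item: item[1], reverse=True)


# Hypothetical (last week, this week) alert counts per tool.
weekly = {
    "siem": (200, 1200),
    "dlp": (40, 44),
    "vuln_scanner": (800, 760),
}
for tool, change in flag_spikes(weekly):
    print(f"{tool}: alerts up {change:.0f}% week over week, drill into this dashboard")
```

Only the tool whose alert volume jumped by 500% is surfaced at the CISO level; the smaller movements stay on the tool-level dashboards for the analysts who need them.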
Rob Aragao 25:04
Sara, I think, first off, very insightful. I think just sharing the examples of, again, kind of going back to the Monte Carlo models, and just, you know, it's like the light bulb went off. When we were talking about that, you know, the proven models, translating them to how we can actually apply them to cyber needs, it just makes sense, right, kind of going through different iterations of it to get to the point where it shows, you know, the way you want it to be presented. I get it takes a little bit of time. But then you took it all the way from there to painting the picture visually, kind of, as you described it, right on the podcast. But then down to the level of, we can work with different team members across the board within the cyber program, and give them value into what we're actually doing as well, right, give them guidance on the different things they need to work at. So, I think it's very holistic in the approach. It's nice to see that it's starting to be better adopted, and we've had conversations in the past. And I think of one kind of iconic CISO in the security space, I'm sorry, in the insurance space himself, that kind of started doing those things, but didn't kind of always get to where, you know, it was desired. So hearing some of these advancements is very, very nice to see. And I think with the effectiveness of it, we'll start to see better, you know, translation to value and how they actually represent what they're all about in their programs going forward. So thanks for the time. Really appreciate you coming out here today.
Sara Anstey 26:23
Yeah, no, absolutely. Thanks for having me. It was super fun. And I agree, I think it's just starting to meld different mindsets into cybersecurity. So not only having cybersecurity analysts be in the cybersecurity space now, but bringing in data analysts, bringing in financial, you know, actuaries or math people, or all these different things, who can understand the cybersecurity part, but understand, you know, data or math or all of these different aspects, and really just helping all of the CISOs of the world and everyone in cybersecurity to get the most value out of what they're doing in their organizations. Because I think, as we've all seen lately, cybersecurity is going to keep growing, it's not getting smaller. So we just need to keep bringing in new ideas and adopting things that we've seen work for other people.
Rob Aragao 27:08
No question about it. Well, thanks again. Really appreciate it.
Sara Anstey 27:11
Yeah, absolutely. Thank you.
Rob Aragao 27:15
Thanks for listening to the Reimagining Cyber podcast. We hope you enjoyed this episode. If you would like to have us cover a specific topic of interest, feel free to reach out to us. You can find out how in the show notes, and don't forget to subscribe. This podcast was brought to you by CyberRes, a Micro Focus line of business, where our mission is to deliver cyber resilience by engaging people, process, and technology to protect, detect, and evolve.