Machine learning does not deserve the hype and attention it is getting. Instead of enjoying such inflated expectations, it should be viewed as a helpful, additional security layer to augment the tools and solutions in place.
So said TK Keanini, distinguished engineer, Advanced Threat Solutions at Cisco, during his keynote address at ITWeb Security Summit 2019, in Sandton, this morning.
“Let’s talk about the big picture around machine learning. There are three high-level common techniques: supervised learning, unsupervised learning and reinforcement learning.
“With supervised learning, you know the question you are trying to ask and have examples of it being asked and answered correctly. It has a limited value in security because we don’t often find a ground truth.”
With unsupervised learning, you do not have answers, and may not fully know the questions. “You’re given a ton of data and have to figure it out. Machines help you cluster on that data. This is a very popular technique within retail: capture all the transactions across all stores and spending patterns will emerge.”
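As a rough illustration of the clustering Keanini describes, the sketch below groups unlabelled transaction amounts with a minimal one-dimensional k-means. The function name `kmeans_1d`, the sample amounts and the choice of k-means itself are assumptions made for this example; the talk did not name a specific algorithm.

```python
def kmeans_1d(values, k, iterations=20):
    """Group numbers into k clusters around their nearest centroid."""
    ordered = sorted(values)
    # Deterministic start: pick k evenly spaced values from the sorted data.
    step = (len(ordered) - 1) / (k - 1) if k > 1 else 0
    centroids = [float(ordered[round(i * step)]) for i in range(k)]

    def assign(points, centres):
        groups = [[] for _ in centres]
        for v in points:
            nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        return groups

    clusters = assign(values, centroids)
    for _ in range(iterations):
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
        clusters = assign(values, centroids)
    return centroids, clusters


# Made-up transaction amounts: no labels, yet two spending patterns emerge.
amounts = [5, 7, 6, 120, 130, 125]
centroids, clusters = kmeans_1d(amounts, k=2)
print(centroids)
print(clusters)
```

Nobody told the code which purchases were "small" or "large"; the grouping comes from the data alone, which is the point of the retail example.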
Reinforcement learning is a matter of trial-and-error behaviour, which is effective in game scenarios. “It’s used in a lot of game play that’s very structured. You can train reinforcement learning because the rules are rigid and participants have an interest in following those rules. Run-books or playbooks could be useful, but not for threat detection.”
So, what did we do before machine learning?
Several things, noted Keanini. “Simple pattern matching. I have nothing against using a simple list. If you are high fidelity at the start, you will end up with a high fidelity outcome. Then there’s the statistical methods we learned in high school. These still work. Look at the figures and if you find outliers, you might want to investigate.
“Finally, there’s the rules and first order logic. Through an axiomatic method you can reason whether or not you have a problem.”
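The three pre-machine learning approaches Keanini lists can each be sketched in a few lines. Everything specific here is an assumption for illustration: the blocklist addresses (taken from documentation-only IP ranges), the z-score threshold of 2.0, and the after-hours admin login rule are hypothetical and were not given in the talk.

```python
from statistics import mean, stdev

# 1. Simple pattern matching: a hypothetical list of known-bad addresses.
BLOCKLIST = {"203.0.113.7", "198.51.100.23"}

def matches_blocklist(ip):
    return ip in BLOCKLIST

# 2. Basic statistics: flag values far from the mean (z-score test).
def find_outliers(values, threshold=2.0):
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]

# 3. Rules and logic: an explicit, explainable condition.
def violates_rule(event):
    # Hypothetical rule: admin logins outside 08:00-18:00 need a look.
    return event["role"] == "admin" and not (8 <= event["hour"] < 18)


logins_per_hour = [10, 12, 11, 9, 10, 11, 10, 95]
print(matches_blocklist("203.0.113.7"))
print(find_outliers(logins_per_hour))
print(violates_rule({"role": "admin", "hour": 3}))
```

Each check can state exactly why it fired (the address was on the list, the value sat more than two standard deviations from the mean, the rule's conditions were met), which is the kind of explanation an analyst can take to a manager.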
It is about high-fidelity analytics, not about machine learning, he stressed.
“Understanding this is the only way we can build resilient security analytics pipelines that can withstand the targeted attacks of today.”
There are pitfalls with machine learning too, Keanini said.
“What is at stake matters. In the case of ‘because you watched Deadpool you might be interested in X-Men or The Flash’, it’s convenient, it’s cool. But, in security, if you receive an alert that says you need to take an executive off the network and quarantine his or her machine, that could be a career-limiting move. You have to be able to explain yourself, and there is nothing in machine learning that lets you do that. The efficacy of the classifier could mathematically be at 100%, but humans could still not explain it.”
To ensure good machine learning, Keanini said there are six questions to ask a potential vendor. Firstly, how are you applying machine learning in your product and why? Next, how do you measure its effectiveness? Then ask, regarding supervised learning, what are you using for ‘ground truth’?
Other questions include: what non-machine learning techniques are you using, and why? What papers or open source code has the vendor published regarding its analytics? And finally, for the machine learning-based assertions, what entailments are provided?
The lesson here is that rather than being used in isolation, machine learning must act as another layer to boost security. We need to harness a pipeline of hundreds of algorithms working in unison for the best hope of successful outcomes.
Finally, machine learning has several technical measures of success, but not every one is helpful for a security practitioner. For machine learning to work and be useful, it needs to generate understandable outputs and work within the security context.
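To illustrate how such measures can tell different stories on the same data, the sketch below scores a detector that never raises an alert. The article does not say which measures Keanini had in mind; accuracy, precision and recall are assumed here as common examples, and the event counts are made up.

```python
def detection_metrics(y_true, y_pred):
    """Return accuracy, precision and recall for binary labels (1 = malicious)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Report 0.0 when a ratio is undefined (no alerts, or no malicious events).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical day of traffic: 10 malicious events among 1,000.
y_true = [1] * 10 + [0] * 990
silent_detector = [0] * 1000  # never alerts on anything

result = detection_metrics(y_true, silent_detector)
print(result)
```

The silent detector scores 99% accuracy while catching none of the ten malicious events, so a single headline figure says little without the security context around it.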
But there is a positive side to machines, if we let them play the instrument they should be playing in the larger band. Machine learning is another security tool in the tool belt.