In a very short period of time, machine learning (ML) has had a major impact on the field of cybersecurity. Machine learning has proven to be adept at finding threats in ways that traditional signatures never could, whether detecting malware, finding vulnerabilities, or recognizing when a trusted employee has been compromised by an attacker.
However, one of the drawbacks of this shift is what I like to call the Sherlock Syndrome. In short, it’s easy to start thinking of machine learning as something akin to Sherlock Holmes built into an algorithm. Using near-supernatural powers of observation, Sherlock, or in this case an algorithm, could simply look at an environment and unravel the complex chain of events at the heart of a mystery.
This is an alluring idea, but one that has its pitfalls. Many organizations have mountains of security data, and it's enticing to think that an algorithm could digest that mountain of data and spit out hidden threats. But answers in the real world are rarely so tidy and certain. In cybersecurity, many solutions merely identify anomalies, which then require additional analysis and investigation from an analyst. In short order, our Sherlock algorithms can start to bury analysts in anomalies, false positives, and endless investigations.
We can solve this problem if we remember that investigation isn’t just about passive observation - we can ask questions. More specifically it’s about asking the right questions at the right times. Real-world detectives will observe a crime scene in minute detail, but they also talk to witnesses and ask probing questions. The investigation has to be interactive and able to adapt based on what is learned.
At Preempt, we take this process very seriously, and it’s why we have a very different approach to machine learning than other security solutions. When a machine learning model recognizes that a user is behaving abnormally, we may not know for sure if the user is compromised or simply doing something new. However, a simple multi-factor authentication challenge can quickly separate benign behavior from something more serious. By asking the right question at the right time, we can protect data without disrupting valid end-user activity, and in the process resolve false positives before they ever reach an analyst.
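To make the idea concrete, here is a minimal Python sketch of "ask the right question at the right time." It is an illustration only, not Preempt's implementation: the `AccessEvent` type, the `anomaly_score` field, the `0.7` threshold, and the `challenge_user` callback are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK_AND_ALERT = "block_and_alert"


@dataclass
class AccessEvent:
    user: str
    resource: str
    # Hypothetical score from some upstream model:
    # 0.0 means typical behavior, 1.0 means highly unusual.
    anomaly_score: float


def handle_event(
    event: AccessEvent,
    challenge_user: Callable[[str], bool],
    threshold: float = 0.7,  # assumed cut-off, purely illustrative
) -> Verdict:
    """Decide what to do with an access attempt.

    Normal-looking activity passes untouched. Unusual activity triggers
    a challenge (for example MFA) instead of going straight to an analyst.
    """
    if event.anomaly_score < threshold:
        return Verdict.ALLOW

    # Ask the question: can the user prove who they are?
    if challenge_user(event.user):
        # Unusual but legitimate; the false positive is resolved here.
        return Verdict.ALLOW

    # Challenge failed or was ignored; this one deserves an analyst's time.
    return Verdict.BLOCK_AND_ALERT
```

In this sketch, only events that are both unusual and fail the challenge ever reach a human, which is the point of the approach described above.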
For more on how to balance preventing threats against disrupting the business, see this blog by Boris Danilovich.
We then take this crucial user interaction and feed it back into the Preempt machine-learning models. The models get ongoing feedback that allows them to adapt to the unique character of the network. In addition to supervised ML models, which are trained by data scientists, and unsupervised ML models, which detect deviations from a local baseline, we also use a hybrid technique called semi-supervised machine learning. In this case, the model is trained by real-time feedback from the end user. As a result, individual detections are far more reliable, and the models themselves become far more attuned to the environment over time.
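The feedback loop can be pictured with a deliberately tiny Python sketch. This is a toy stand-in, not a real semi-supervised model and not Preempt's code: the `FeedbackBaseline` class and its methods are hypothetical, and "anomalous" here simply means "a resource this user has never been confirmed on."

```python
from collections import defaultdict


class FeedbackBaseline:
    """Toy per-user baseline that learns from challenge outcomes."""

    def __init__(self) -> None:
        # user -> set of resources confirmed as legitimate for that user
        self._confirmed: dict[str, set[str]] = defaultdict(set)

    def is_anomalous(self, user: str, resource: str) -> bool:
        # Anything not yet confirmed for this user counts as unusual.
        return resource not in self._confirmed.get(user, set())

    def record_feedback(self, user: str, resource: str, passed_challenge: bool) -> None:
        # A passed challenge acts as a label: this behavior is benign
        # for this user, so fold it into the baseline.
        if passed_challenge:
            self._confirmed[user].add(resource)
        # A failed challenge leaves the baseline unchanged, so the same
        # behavior is still flagged the next time it appears.


baseline = FeedbackBaseline()
baseline.record_feedback("alice", "hr-db", passed_challenge=True)
```

After that single confirmation, the same access by the same user no longer looks unusual, while it still does for everyone else. A production model would be far richer, but the shape is the same: each answered question becomes training signal.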
So when it comes to your cybersecurity, maybe it's time to be a little less Sherlock Holmes and a little more Rustin Cohle, and make sure that we are asking the right questions.