The world is filled with events. Our inboxes flood with events that marketers really want us to pay attention to, while news feeds push events they’re trying to raise above the background noise, and then the dog’s barking interrupts our consumption of that information. Meanwhile, our family is texting us about events at the family level that may be related to events on the national or world stage, and social media is full of garbage information about events that may or may not be real, and that we may or may not care about.
We ignore the vast majority of these inputs and move along with our day. The inputs that require our attention usually, but not always, get it. Sometimes, we fail horribly at filtering, judging importance, or both, and it has life-impacting ramifications.
This is not too far from the state of security event management when security information and event management (SIEM) was born. Disparate feeds were coming in from all over, and analysts – be they dedicated security folks or systems administrators wearing an extra hat – had to guess which of the dozens, hundreds, or even thousands of events posed a real threat to the environment.
The first step to gaining control of that environment was standardization and aggregation. Get the events into one place, with similar information available for similar events. That was where SIEM came from.
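To make that concrete, here is a minimal sketch of what “similar information available for similar events” might look like. The field names, sources, and event shapes are invented for illustration; every real SIEM has its own normalization schema.

```python
from datetime import datetime, timezone

# Hypothetical common event shape -- not any particular vendor's schema.
def normalize_firewall_event(raw: dict) -> dict:
    """Map a raw firewall log entry onto a shared event format."""
    return {
        "timestamp": datetime.fromtimestamp(raw["epoch"], tz=timezone.utc).isoformat(),
        "source": "firewall",
        "event_type": "connection_denied" if raw["action"] == "DENY" else "connection_allowed",
        "src_ip": raw["src"],
        "dst_ip": raw["dst"],
    }

def normalize_auth_event(raw: dict) -> dict:
    """Map a raw authentication log entry onto the same shared format."""
    return {
        "timestamp": raw["time"],
        "source": "auth",
        "event_type": "login_success" if raw["result"] == "ok" else "login_failure",
        "user": raw["user"],
        "src_ip": raw.get("client_ip"),
    }

# Aggregation is then just collecting normalized events in one store.
events = [
    normalize_firewall_event({"epoch": 1700000000, "action": "DENY", "src": "10.0.0.5", "dst": "10.0.1.9"}),
    normalize_auth_event({"time": "2023-11-14T22:13:20Z", "result": "ok", "user": "bob", "client_ip": "10.0.0.5"}),
]
```

Once everything is in the same shape, questions like “show me every event involving this IP” become trivial to answer, which is the whole point of aggregation.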
Since then, the number of events coming in has continued to rise. Companies that were seeing hundreds of events a day are now seeing hundreds an hour, and SIEM vendors have had to keep up. The complexity of attacks, and of the work needed to detect them, has gone up, and SIEM vendors have had to keep up. Tools for feeding the SIEM and for acting on the aggregated data have evolved … and SIEM vendors have had to keep up.
At this point, SIEM is best characterized as a data aggregation platform with limited intelligence that helps filter out irrelevant events. It’s akin to turning off social media and telling those close to you to call if needed: it filters out the worst of the noise and elevates the most important messages.
Other tools can analyze their own data and draw their own conclusions, but SIEM is where all security-related events are bundled in one place, so it is the logical place to filter the noise and raise awareness. That requires a high-volume, adaptable datastore, and vendors have reached that plateau.
Once volume issues were mastered, SIEM vendors turned to helping filter the insane volume of security event noise. Yes, it is a security event if Bob logged in on a new workstation, but it probably isn’t a noteworthy security event. Unless Bob’s new workstation is in a country where the company doesn’t do business, or Bob is still logged in on his own workstation, blissfully unaware of the other login.
So, rudimentary filters were applied to the event data in the SIEM. This got rid of a huge amount of noise. Next came filtering events that, logically, looked like notable security events, but just weren’t. Something was scanning ports on the firewall. An internal firewall. Sounds bad, unless the security tools’ logs showed that an authorized user kicked off the scan. Then, it’s a simple matter of asking the user in question, “Are you scanning X?” and moving along. It’d be better for the system to lower the importance of the event based on the fact that it is an authorized user doing the scanning, with corporate-approved tools. Then, an analyst may not need to get involved at all (my security friends are cringing because, literally, they trust no one, and would point out that authorized users can misuse tools, too – so I’ll note that here).
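A rough sketch of what such a rule might look like follows. The field names and context lookups are hypothetical; real SIEMs express this logic in their own rule languages, but the shape of it is the same.

```python
def adjust_priority(event: dict, context: dict) -> int:
    """Raise or lower an event's priority with simple, hand-maintained rules.
    Field names are illustrative, not a real SIEM schema."""
    priority = event.get("priority", 5)

    if event["event_type"] == "login_success" and event.get("new_device"):
        # A login from a new workstation is normally routine...
        priority = min(priority, 2)
        # ...unless it comes from a country the company doesn't operate in,
        # or the user already has an active session on another machine.
        if event.get("country") not in context["approved_countries"]:
            priority = 9
        elif event["user"] in context["active_sessions"]:
            priority = 8

    if event["event_type"] == "port_scan" and event.get("initiated_by") in context["authorized_scanners"]:
        # An authorized user running a corporate-approved scanner: lower the
        # priority rather than drop it, since authorized users can misuse tools too.
        priority = 1

    return priority
```

The catch, of course, is that someone has to write and maintain every one of those rules.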
But the event volume kept going up, and the analysts were even more swamped. Between new reporting apps coming online (HIPS alone can add thousands of reporting points to a SIEM) and the increase in events from points already being monitored, it was ugly.
Enter AI. This is when machine learning (ML) started becoming the norm in SIEM. Instead of hard-and-fast rules that must be maintained, we want the system to use the knowledge it gains from watching the flow of data, and from watching how analysts resolve events, to selectively raise or lower the priority of events. Now the events that actually bubble up to analysts are the ones the ML engine sees as fishy, and the number is greatly reduced. False positives still occur, but not nearly at the volume they did before, as the system learns how to detect them.
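Here is a toy illustration of that feedback loop, assuming scikit-learn and some invented features; real SIEM ML engines use far richer data and models, but the core idea is the same: past analyst resolutions become training labels.

```python
from sklearn.ensemble import RandomForestClassifier
import numpy as np

# Toy sketch only: learn from past analyst resolutions which events deserve attention.
# Features and labels are invented for illustration.
# Each row: [hour_of_day, is_new_device, failed_logins_last_hour, bytes_out_mb]
past_events = np.array([
    [9,  0,  0,   1.2],
    [14, 0,  1,   0.8],
    [3,  1, 12, 450.0],
    [2,  1,  8, 300.0],
])
# Analyst resolution: 0 = benign / false positive, 1 = real incident
resolutions = np.array([0, 0, 1, 1])

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(past_events, resolutions)

# Score a new event; only bubble it up to an analyst if the model finds it fishy.
new_event = np.array([[4, 1, 10, 380.0]])
risk = model.predict_proba(new_event)[0][1]
if risk > 0.7:
    print(f"Escalate to analyst (risk={risk:.2f})")
```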
But wait! There’s more!
All that data in one place, with ML engines crawling over it, was just too much temptation for data scientists and deeply knowledgeable security personnel. So the next step was the ability to plot a course through seemingly unconnected security events and determine that they were actually evidence of intruder or attacker activity spread across the system. This is where we are now: the ability to connect the dots between events that analysts (or the system) might have filtered out, and find patterns that indicate intrusions.
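As a simplified sketch, correlation can be as basic as grouping individually boring events by user inside a time window and flagging combinations that, together, tell a story. The event types, window, and threshold here are invented; real correlation engines are far more sophisticated.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Illustrative only: individually low-priority events, grouped by user inside a
# one-hour window, can add up to a pattern worth escalating.
STAGES = {"login_failure", "login_success", "privilege_change", "large_transfer"}

def correlate(events, window=timedelta(hours=1)):
    """Return (user, matched_stages) pairs where 3+ stages occur within one window."""
    by_user = defaultdict(list)
    for e in sorted(events, key=lambda e: e["timestamp"]):
        by_user[e["user"]].append(e)

    alerts = []
    for user, evts in by_user.items():
        for i, start in enumerate(evts):
            in_window = [e for e in evts[i:] if e["timestamp"] - start["timestamp"] <= window]
            seen = {e["event_type"] for e in in_window} & STAGES
            if len(seen) >= 3:
                alerts.append((user, sorted(seen)))
                break
    return alerts

events = [
    {"user": "bob", "event_type": "login_failure", "timestamp": datetime(2024, 1, 5, 2, 0)},
    {"user": "bob", "event_type": "login_success", "timestamp": datetime(2024, 1, 5, 2, 5)},
    {"user": "bob", "event_type": "large_transfer", "timestamp": datetime(2024, 1, 5, 2, 40)},
]
print(correlate(events))  # [('bob', ['large_transfer', 'login_failure', 'login_success'])]
```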
But it comes with a price. It’s not cheap to set up, it’s not easy to train, and maintenance of a system that takes inputs from everywhere and tries to correlate the resulting dataset is … work. We’re not talking a little bit of data here, and, like the environment it comes from, we’re not talking a little bit of complexity. Do you need it? I would say yes. Even if you only use SIEM as a big data store, it will help you perform post mortems when (not if) you have a breach. It is worth the effort, unless you are tiny or have no Internet presence to speak of.
Keep rocking it! SIEM is another tool to consider as you move toward DevSecOps dominance, but again, it’s just a tool; keeping the org safe is still on you.