As the number of cybersecurity technology vendors in the market continues to grow, an enormous amount of valuable technology has emerged to aid in detecting security threats. These technologies produce ever more data that needs to be logged, normalized, and retained in order to be useful. Yet even as organizations begin to think about leveraging machine learning or SOAR platforms for automation, they struggle to realize a return on investment from these technologies because, in most cases, they have never defined a logging strategy that makes the data their tools produce easy to use.
Data becomes far more useful when it is structured, tagged, parsed, normalized, and follows a consistent data model. Detecting security threats isn't about buying every product on the market, sending all of those logs to a logging platform, and assuming they can simply be used to detect a threat. A critical component of building an effective threat detection program is ensuring that all of that data is usable.
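As a minimal sketch of what that normalization looks like in practice, the snippet below maps two invented vendor log formats onto one common schema (field names loosely modeled on Elastic Common Schema). The vendor field names and values here are assumptions for illustration, not any real product's output.

```python
# Minimal sketch: normalizing two hypothetical vendor log formats into one
# common data model. Field names loosely follow Elastic Common Schema (ECS);
# the raw vendor formats below are invented for illustration.

def normalize_firewall_event(raw: dict) -> dict:
    """Map a hypothetical firewall vendor's fields to the common model."""
    return {
        "@timestamp": raw["ts"],
        "source.ip": raw["src"],
        "destination.ip": raw["dst"],
        "event.action": raw["action"].lower(),
        "event.category": "network",
    }

def normalize_edr_event(raw: dict) -> dict:
    """Map a hypothetical EDR vendor's fields to the same common model."""
    return {
        "@timestamp": raw["event_time"],
        "source.ip": raw["host_ip"],
        "destination.ip": raw.get("remote_ip"),
        "event.action": raw["verdict"].lower(),
        "event.category": "endpoint",
    }

fw = normalize_firewall_event(
    {"ts": "2023-05-01T12:00:00Z", "src": "10.0.0.5",
     "dst": "203.0.113.7", "action": "DENY"}
)
edr = normalize_edr_event(
    {"event_time": "2023-05-01T12:00:02Z", "host_ip": "10.0.0.5",
     "remote_ip": "203.0.113.7", "verdict": "Blocked"}
)

# Both events now share field names, so a single query or detection rule
# can operate over either data source without vendor-specific logic.
assert fw["source.ip"] == edr["source.ip"]
```

Once every source lands in the same model, a detection rule written once against `source.ip` or `event.action` works against data from every tool you own.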
Only once data is structured and normalized can you use those data sets to build advanced detection content across multiple data domains. Being able to detect threat scenarios across multiple data domains, rather than merely detecting multiple indicators of a threat within a single domain, is a huge advantage. It improves the quality of the detection, which lowers the false positive rate and can raise efficacy high enough to let you use SOAR systems to automate the mitigation of threat scenarios. None of this is possible without good logging practices.
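To illustrate the cross-domain idea, here is a small sketch that only flags an entity when indicators appear in more than one data domain. The event shape (`source.ip`, `event.category`) and the `min_domains` threshold are assumptions carried over from a common-schema model, not a specific product's API.

```python
from collections import defaultdict

# Minimal sketch: cross-domain correlation over normalized events.
# An entity (here, a source IP) is flagged only when it triggers
# indicators in at least `min_domains` distinct data domains, e.g.
# network AND endpoint, which cuts single-domain false positives.

def cross_domain_hits(events: list[dict], min_domains: int = 2) -> list[str]:
    domains_by_ip: dict[str, set] = defaultdict(set)
    for event in events:
        domains_by_ip[event["source.ip"]].add(event["event.category"])
    return [ip for ip, domains in domains_by_ip.items()
            if len(domains) >= min_domains]

events = [
    {"source.ip": "10.0.0.5", "event.category": "network"},   # firewall deny
    {"source.ip": "10.0.0.5", "event.category": "endpoint"},  # EDR block
    {"source.ip": "10.0.0.9", "event.category": "network"},   # lone indicator
]
print(cross_domain_hits(events))  # ['10.0.0.5']
```

Note that this only works because every event, regardless of which tool emitted it, carries the same `source.ip` and `event.category` fields; without normalization there is no common key to correlate on.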
It is striking how many companies continue to spend millions of dollars on security tools yet aren't necessarily improving their ability to detect or mitigate security threats. This happens because we often forget the basics needed for that data to become useful. The only way to properly use data from all the threat detection tools you have deployed is to follow a consistent process for how that data gets stored.
Data normalization is not sexy, it's not fun, and it's not always easy; that is why those efforts so often get thrown under the bus. But it is mandatory if you want to be serious about detecting actual threat scenarios, or about applying machine learning and automation to help detect or mitigate threats.
Detecting cyber threats is not just about understanding the threat and its tactics, techniques, and procedures. Nor is it just about purchasing a pile of technology that can identify suspicious activity on your network. It is about ensuring that the data is usable. A program that enforces proper logging patterns and data models is an extremely underinvested area, and neglecting it can seriously impair your ability to mitigate the risk of a cyber attack.