Philippe Owezarski (LAAS-CNRS)
Talk date: June 2, 2017
Analysis and improvements of unsupervised anomaly detection tools
Traffic anomaly detection is of prime importance for network administrators, as anomalies have a dramatic impact on network performance and on the QoS perceived by users. It is, however, a very time-consuming and costly task that often requires decisions from network and security experts. To make anomaly detection autonomous, many research efforts have investigated the use of unsupervised machine learning techniques, in most cases traffic clustering. Identifying the clusters that correspond to anomalous traffic classes among the full set of detected clusters remains a challenge. This is mostly due to the nature of clustering techniques, which work on traffic samples of a given duration, each cluster being classified after an uncertain post-processing stage.
In this presentation, we analyse the merits and limits of current unsupervised tools based on clustering. The presentation will specifically address:
- The problems related to the use of clustering algorithms on high-speed traffic, which is also highly variable over time;
- Some solutions, and more specifically ORUNADA (Online Real-time Network Anomaly Detection Algorithm);
- Some elements of their development, which takes advantage of the Hadoop Spark and Storm libraries, and of their execution on a large cluster such as Google Dataproc;
- The building of a ground truth for validating anomaly detection tools. It relies on a large amount of data collected on commercial networks, to which synthetic attacks have been added;
- The validation of ORUNADA. This evaluation provides rich information on how an anomaly detector should be designed and configured depending on the type of anomalies targeted.
This work has been supported by the European Commission as part of the FP7 ONTIC project.