A New Approach to Dimensionality Reduction for Anomaly Detection in Data Traffic

14 Jun 2016  ·  Tingshan Huang, Harish Sethu, Nagarajan Kandasamy ·

The monitoring and management of high-volume feature-rich traffic in large networks offers significant challenges in storage, transmission and computational costs. The predominant approach to reducing these costs is based on performing a linear mapping of the data to a low-dimensional subspace such that a certain large percentage of the variance in the data is preserved in the low-dimensional representation. This variance-based subspace approach to dimensionality reduction forces a fixed choice of the number of dimensions, is not responsive to real-time shifts in observed traffic patterns, and is vulnerable to normal traffic spoofing. Based on theoretical insights proved in this paper, we propose a new distance-based approach to dimensionality reduction motivated by the fact that the real-time structural differences between the covariance matrices of the observed and the normal traffic is more relevant to anomaly detection than the structure of the training data alone. Our approach, called the distance-based subspace method, allows a different number of reduced dimensions in different time windows and arrives at only the number of dimensions necessary for effective anomaly detection. We present centralized and distributed versions of our algorithm and, using simulation on real traffic traces, demonstrate the qualitative and quantitative advantages of the distance-based subspace approach.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here