Papers Published
- Jin, S; Zhang, Z; Chakrabarty, K; Gu, X, Failure prediction based on anomaly detection for complex core routers,
IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD
(November, 2018), ACM Press [doi].
(last updated on 2022/12/30)
Abstract:
Data-driven prognostic health management is essential to ensure high reliability and rapid error recovery in commercial core router systems. The effectiveness of prognostic health management depends on whether failures can be accurately predicted with sufficient lead time. This paper describes how time-series analysis and machine-learning techniques can be used to detect anomalies and predict failures in complex core router systems. First, both a feature-categorization-based hybrid method and a changepoint-based method have been developed to detect anomalies in time-varying features with different statistical characteristics. Next, an SVM-based failure predictor is developed to predict both the categories and the lead times of system failures from the collected anomalies. A comprehensive set of experimental results is presented for data collected during 30 days of field operation from over 20 core routers deployed by customers of a major telecom company.