Recent advances in contextual anomaly detection attempt to combine resource metrics and event logs to uncover unexpected system behaviors at run-time. This is highly relevant for critical software systems, where monitoring is often mandated by international standards and guidelines. In this paper, we analyze the effectiveness of a metrics-logs contextual anomaly detection technique in a middleware for Air Traffic Control systems.
Our study addresses the challenges of applying such techniques to a new case study with a dense volume of logs, and finer monitoring sampling rate. Guided by our experimental results, we propose and evaluate several actionable improvements, which include a change detection algorithm and the use of time windows on contextual anomaly detection.