Modern software systems expose management metrics to help track their health. Recently, it has been demonstrated that correlations among these metrics allow errors to be detected and their causes localized. Prior research shows that linear models can capture many of these correlations. However, our research shows that several factors may prevent linear models from accurately describing correlations, even if the underlying relationship is linear. Common phenomena we have observed include relationships that evolve, relationships with missing variables, and heterogeneous residual variance of the correlated metrics. Usually these phenomena can be discovered by testing for heteroscedasticity of the underlying linear models. Such behaviour violates the assumptions of simple linear regression, which thus fails to describe system dynamics correctly. In this paper we address the above challenges by employing efficient variants of Ordinary Least Squares regression models. In addition, we automate the process of error detection by introducing the Wilcoxon Rank-Sum test after proper correlation modeling. We validate our models using a realistic Java-Enterprise-Edition application. Using fault-injection experiments we show that our improved models capture system behavior accurately.
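The detection step described above can be illustrated with a minimal sketch. It assumes plain simple OLS rather than the paper's improved variants, uses synthetic metric values, and compares baseline residuals with residuals from a later window using a Wilcoxon Rank-Sum test under the normal approximation. All function names, the metric relationship, and the injected shift are hypothetical and chosen only for illustration.

```python
import math
import random


def fit_ols(x, y):
    """Fit y = a + b*x by ordinary least squares and return (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed minus predicted values for a fitted line."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def rank_sum_p_value(sample_a, sample_b):
    """Two-sided Wilcoxon Rank-Sum p-value via the normal approximation.

    Ties receive average ranks; the variance is not tie-corrected,
    which is acceptable for continuous-valued residuals.
    """
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(sample_a), len(sample_b)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    return math.erfc(abs(z) / math.sqrt(2))


def demo_fault_p_value(seed=0):
    """Synthetic example: fit on healthy data, then test a shifted window."""
    rng = random.Random(seed)
    # Hypothetical healthy relationship between two metrics: y = 5 + 2x + noise.
    x_base = [rng.uniform(0, 100) for _ in range(50)]
    y_base = [5 + 2 * xi + rng.gauss(0, 1) for xi in x_base]
    intercept, slope = fit_ols(x_base, y_base)
    base_res = residuals(x_base, y_base, intercept, slope)
    # Hypothetical injected fault: the second metric drifts upward by 3 units.
    x_new = [rng.uniform(0, 100) for _ in range(50)]
    y_new = [5 + 2 * xi + 3 + rng.gauss(0, 1) for xi in x_new]
    new_res = residuals(x_new, y_new, intercept, slope)
    return rank_sum_p_value(base_res, new_res)


if __name__ == "__main__":
    print("p-value for shifted window:", demo_fault_p_value())
```

A small p-value indicates that the new window's residuals no longer follow the baseline distribution, which is the signal an automated detector would act on. The significance threshold is left to the operator.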