notice
Master's Thesis Defense: Anunay Amar
Speaker: Anunay Amar
Supervisor: Dr. P. Rigby
Examining Committee: Drs. W. Shang, E. Shihab, E. Doedel (Chair)
Title: Fault Prediction and Localization with Test Logs
Date: Wednesday, June 27, 2018
Time: 14:00
Place: EV 11.119
ABSTRACT
Software testing is an integral part of modern software development. However, test runs produce thousands of lines of logged output, which makes it difficult to find the cause of a fault in the logs. This problem is exacerbated by environmental failures that distract from product faults. In this thesis, we present techniques that reduce the number of log lines that testers must manually investigate while still finding a maximal number of faults.
We observe that the location of a fault should be contained in the lines of a failing log. In contrast, a passing log should not contain the lines related to a failure. Lines that occur in both a passing and a failing log introduce noise when attempting to find the fault in the failing log. We introduce a novel approach in which we remove from the failing log the lines that also occur in the passing log.
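The line-removal idea can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function name `remove_passing_lines`, the sample log lines, and the use of exact string matching are all assumptions made for illustration. Real logs would likely need timestamps and other run-specific values normalized before comparison.

```python
def remove_passing_lines(failing_log, passing_log):
    """Keep only the failing-log lines that never appear in the passing log.

    Exact matching on whitespace-stripped lines is an illustrative assumption.
    """
    passing = {line.strip() for line in passing_log}
    return [line for line in failing_log if line.strip() not in passing]


# Hypothetical log contents, invented for this example.
failing = [
    "start test",
    "connect to node",
    "ERROR: timeout on port 8080",
    "teardown",
]
passing = ["start test", "connect to node", "teardown"]

remaining = remove_passing_lines(failing, passing)
print(remaining)
```

Only the line that is unique to the failing run survives, which is the subset a tester would then inspect.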
After removing these lines, we use information retrieval techniques to flag the most probable lines for investigation. We modify TF-IDF to identify the most relevant log lines related to past product failures. We then vectorize the logs and develop an exclusive version of KNN to identify which logs are likely to lead to product faults and which lines are the most probable indication of the failure.
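As a rough illustration of the vectorize-and-classify step, the sketch below uses textbook TF-IDF and a plain k-nearest-neighbour vote over cosine similarity. It does not reproduce the modified TF-IDF or the exclusive KNN variant developed in the thesis, whose details are not given here. The function names, the tokenized sample logs, and the labels are hypothetical, and computing the document frequencies over the historical and new logs together is a simplification.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(query, history, k=1):
    """history: list of (vector, label). Majority label of the k most similar logs."""
    ranked = sorted(history, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical tokenized logs: two labelled historical logs and one new log.
logs = [
    ["timeout", "port", "refused"],
    ["disk", "full", "environment"],
    ["timeout", "port", "retry"],
]
labels = ["product", "environment"]

vectors = tfidf_vectors(logs)
history = list(zip(vectors[:2], labels))
prediction = knn_predict(vectors[2], history, k=1)
print(prediction)
```

The new log shares the terms "timeout" and "port" with the first historical log and nothing with the second, so its nearest neighbour carries the "product" label.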
Our best approach, FaultFlagger, finds 89% of the total faults while flagging only 0.5% of lines for inspection. FaultFlagger drastically outperforms the previous work, CAM. We implemented FaultFlagger as a tool at Ericsson, where it presents daily fault prediction summaries to testers.