Lab 5: Logic-based Software Failure Detection

The goal of this lab is to create a tool that can monitor aspects of a software application's run-time behavior (e.g., memory usage, CPU usage, I/O usage) using software sensors and then automatically determine, with a high degree of accuracy, whether the application is operating normally or abnormally. Abnormal behavior can be due to a fault (bug) in the application or to a security attack.

The technical approach involves a training phase followed by a detection phase. In the training phase, each feature of the application is executed and the value range of each sensor is recorded. For example, the sensors may show that while the application executes, heap memory usage varies between 100K and 500K and CPU usage varies between 0% and 60%. Your solution involves creating a simple detection technique that produces a single detection predicate: an expression that conjoins (ANDs together) the value ranges observed for each sensor during the training phase (hint: run the test suite from the first lab).
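
The range-learning step described above can be sketched as follows. This is a minimal illustration in Python, not a prescribed solution. It assumes each measurement arrives as a dictionary keyed by sensor name; the function name learn_ranges and the sample values are hypothetical and chosen only to match the example ranges in the text.

```python
from typing import Dict, Iterable, Tuple

Sample = Dict[str, float]               # one measurement per sensor, e.g. {"Sm": 320, "Sc": 25}
Ranges = Dict[str, Tuple[float, float]]  # sensor name -> (lowest, highest) value seen


def learn_ranges(samples: Iterable[Sample]) -> Ranges:
    """Return the (min, max) observed for each sensor across all training samples."""
    ranges: Ranges = {}
    for sample in samples:
        for sensor, value in sample.items():
            if sensor in ranges:
                low, high = ranges[sensor]
                ranges[sensor] = (min(low, value), max(high, value))
            else:
                # First time this sensor is seen: the range is a single point.
                ranges[sensor] = (value, value)
    return ranges


# Hypothetical training measurements (heap memory in K, CPU usage in %).
training_samples = [
    {"Sm": 100, "Sc": 0},
    {"Sm": 320, "Sc": 60},
    {"Sm": 500, "Sc": 25},
]
learned = learn_ranges(training_samples)
print(learned)  # {'Sm': (100, 500), 'Sc': (0, 60)}
```

The learned ranges are only as good as the training runs: any feature that is never exercised during training may later produce values outside the ranges and be flagged as abnormal.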

At run time (application deployment phase), the sensors emit a measurement every k seconds (the sampling rate was determined in the previous lab) and the detection predicate is evaluated to either TRUE or FALSE. If the predicate evaluates to TRUE, the application is considered to be operating normally because every sensor measurement is within the value range established during the training phase. If it evaluates to FALSE, at least one measurement is outside its range and the application is considered to be operating abnormally. For example, if there are only two sensors, where Sm is the memory sensor (in K) and Sc is the CPU sensor (in %), the predicate may look like:

(100 <= Sm <= 500) AND (0 <= Sc <= 60)
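
One way to evaluate a predicate of this form is sketched below in Python. The ranges are copied from the example predicate above; the names RANGES, is_normal, and out_of_range are hypothetical, and treating a missing sensor reading as abnormal is an assumption made for this sketch rather than a requirement of the lab.

```python
from typing import Dict, List, Tuple

# Ranges from the example predicate: memory in K, CPU usage in %.
RANGES: Dict[str, Tuple[float, float]] = {"Sm": (100, 500), "Sc": (0, 60)}


def is_normal(measurement: Dict[str, float],
              ranges: Dict[str, Tuple[float, float]] = RANGES) -> bool:
    """Evaluate the detection predicate: the AND of every per-sensor range check."""
    return all(
        # Assumption: a sensor with no reading counts as out of range.
        sensor in measurement and low <= measurement[sensor] <= high
        for sensor, (low, high) in ranges.items()
    )


def out_of_range(measurement: Dict[str, float],
                 ranges: Dict[str, Tuple[float, float]] = RANGES) -> List[str]:
    """List the sensors whose readings violate their range (useful for diagnosis)."""
    return sorted(
        sensor
        for sensor, (low, high) in ranges.items()
        if sensor not in measurement or not (low <= measurement[sensor] <= high)
    )


print(is_normal({"Sm": 250, "Sc": 40}))     # True: both readings are in range
print(is_normal({"Sm": 650, "Sc": 40}))     # False: memory exceeds 500
print(out_of_range({"Sm": 650, "Sc": 40}))  # ['Sm']
```

Because the range checks are inclusive, a reading exactly at a boundary (for example Sm = 500) still counts as normal, which matches the <= operators in the predicate.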