[ CS 199 Progress Report ] 0. Author Josh Hyman SID: 802-971-681 Spring 2004 1. Initial Attempt (BLAST - http://www-cad.eecs.berkeley.edu/~blast/) It was relatively simple to build BLAST from source and test the simple examples provided in the documentation. The only snag lay in the fact that compile errors were cryptic if the user had never used Ocaml before. The initial examples demonstrated BLAST's power in proving the reachability of nodes in a program's control flow graph. BLAST performs this reachability proof by building a list of predicates and their values and then testing whether a certain path in the graph is feasible. For the intended use of catching NULL pointers, it seems that BLAST is not the tool of choice. BLAST seems to be far more useful for catching temporal problems involving locking. Specifically, it is not simple to track the NULL-ness of variables. If it were possible to abstract pointer types and add a field to denote that a verify for NULL had been done, then at every point where the pointer is dereferenced that field could be checked. Building such an abstract type is possible in BLAST, but only with struct types. A major goal of BLAST is checking for state reachability with a little hand holding from the user a possible. This has the desired affect that BLAST will not come to a false conclusion based on an incorrect description source code. In effect, it guarantees that the model being checked is in fact the model presented by the code. However, this goal is in conflict with this project's objective. For this project, it is acceptable to give the model checker "hints". Specifically, it is reasonable to give the model checker function annotations to allow it to track the NULL-ness of variables. Further, BLAST has some limitations which prohibit its general use. Specifically, BLAST has trouble with recursive functions and function pointers as noted in the documentation. Also, since BLAST's complexity is a function of the number of predicates its tracking, lengthy loops can cause exponential run times. This problem could be fixed if the only variables used within the loop verifiably have no effect on the reachability of a given state. In this case, the loop could be skipped and the variables could be given random values, causing a growth in the number of possible state transitions without the loss of any legitimate edges in the graph. 2. Second Attempt (Ccured - http://manju.cs.berkeley.edu/ccured/) Ccured provides extensive, and possibly redundant runtime checking for arbitrary C code. These checks are inserted into the source just after the preprocessing stage of the compile. This being the case, direct application would pollute the compiled output with excessive NULL checking causing a significant decrease in performance. Again, the difference in our approach becomes evident. Ccured was designed to bolt onto an existing application without the programmer modifying the "clean" source code. Our goal is to modify the existing source to check for these errors. Optimally, we would use this tool to find locations where it would be necessary to explicitly check for NULL pointers. 3. Success (BLAST + Ccured) With the help of Professor Rupak Majumdar, it became apparent that if BLAST and Ccured could be used in concert, they could provide the means necessary to catch NULL pointer related bugs at compile time. In fact, it is possible that this combination may eventually be able to check at compile time for all that Ccured can check for at run time. The simplest test case of this collaboration is NULL checking. Ccured performs an analysis of the source and intelligently inserts a NULL check whenever one might be missing. The goal is then to determine the reachability of any state where one of the Ccured inserted NULL checks fail. This could be done by simply having BLAST check for the reachability of a CCURED_FAIL() function call. However, this function is called any time any Ccured check fails. Since BLAST's complexity is exponential in the number of states its tracking, any program of reasonable size would render this method unrealistic. To solve this problem, Ccured was modified to add an additional unique error state. Then BLAST was modified to loop through all of the possible error states, replacing each in turn with the unique error state, and then checking that unique state's reachability. This would then allow BLAST to only track a single state at a time, a huge performance win. This procedure also lends itself to parallelism. Since each execution of the BLAST model checker is independent, a number of threads (on an SMP machine or on a cluster) could be utilized to distribute the work. Initial tests on very simple C source code have proven successful. More complex code, such as the source for GNU getopt, has unearthed bugs within BLAST. Specifically, BLAST seems to waste significant time in meaningless (with respect to the state being checked) loops. A possible solution to this problem was mentioned above. 4. Future Plans This approach needs a significant amount of testing. However, once all of the bugs have been worked out of the design, the goal is to apply this technique to the GNU C Library at large. Also, it would be useful to build a wrapper for this tool to simply its execution, simplify BLAST"s error messages, and more obviously denote where NULL checks need to be explicitly added. This wrapper could also divide the Ccured preprocessed output into a given number of shards and invoke the BLAST model checker on each shard in parallel. At this point, only the simple NULL check is being applied. It could also be very useful to provide reachability checking for the other run-time checks that Ccured provides. Array bounds checking, for example, would be an extremely valuable check for applications concerned about buffer over- or under-flow. This sort of modification would not be difficult. It is only being put off to lower the development and testing complexity of the existing model. It is the hope that at least the all run-time NULL checks inserted by Ccured could be accurately modeled and tested at pseudo compile-time by BLAST by the end of this quarter. This goal seems feasible given the current state of the project. [ Original at http://www.joshhyman.com/opensource/bugs/progress.txt ]