A First Step Towards Automated Detection of Buffer Overrun Vulnerabilities

Author(s): David Wagner, Jeffrey Foster, Eric Brewer, Alexander Aiken
Venue: ISOC Network and Distributed System Security Symposium
Date: 2000


Language: C
Defect Type: Buffer overruns
Uses Annotations: No
Requires Annotations: No
Sound: No
Flow-sensitive: No
Context-sensitive: No

This paper proposes a method for finding buffer overrun vulnerabilities in C programs. The authors explicitly state that scalability was one of their goals, and they made tradeoffs that favor scalability over accuracy. The method looks only for buffer overruns in string handling, so it will miss other kinds of buffer overrun. It treats C strings as an abstract data type and models each string buffer as a pair of values: length, the number of bytes in the buffer actually occupied by a string, and size, the total number of bytes in the buffer. The method tracks string buffers and the standard C library string functions that operate on them, such as strnlen() and strcpy(), and checks each call to ensure that the string operation stays within the bounds of its buffer(s).
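
The length/size model can be illustrated with a minimal C sketch. This is not the authors' implementation: the names str_model, model_is_safe, model_strcpy, and model_strcat are hypothetical, the sketch assumes that length counts the terminating '\0', and it checks concrete values at run time purely for illustration, whereas the tool described here analyzes source code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one string buffer (illustrative names only). */
typedef struct {
    size_t size;   /* total number of bytes in the buffer */
    size_t length; /* bytes occupied by the string; assumed to include '\0' */
} str_model;

/* The property being checked: the string fits inside its buffer. */
bool model_is_safe(const str_model *s)
{
    return s->length <= s->size;
}

/* Model of strcpy(dst, src): dst takes on src's length, its size is unchanged.
 * Returns false if the copy would overrun dst. */
bool model_strcpy(str_model *dst, const str_model *src)
{
    assert(dst != NULL && src != NULL);
    dst->length = src->length;
    return model_is_safe(dst);
}

/* Model of strcat(dst, src): the two strings share a single terminator,
 * so the new length is the sum of both lengths minus one. */
bool model_strcat(str_model *dst, const str_model *src)
{
    assert(dst != NULL && src != NULL);
    assert(dst->length >= 1 && src->length >= 1); /* both hold a '\0' */
    dst->length = dst->length + src->length - 1;
    return model_is_safe(dst);
}
```

For example, modeling an 8-byte destination and a source holding "hello" (length 6 under the assumption above), model_strcpy reports the copy as safe, while a following model_strcat of the same source yields a length of 11 and is reported as an overrun.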

The authors implemented their technique in a tool and ran it on roughly a dozen programs, but reported results for only a few (the ones that generated enough results to be interesting, according to the authors): sendmail 8.7.5, sendmail 8.9.3, and the Linux nettools package (netstat, ifconfig, etc.). The tool found several vulnerabilities in the nettools package despite a previous manual security audit, and it found multiple bugs in both sendmail versions. It also reported a large number of false alarms for all three programs, which was the authors’ main dissatisfaction with the method. One tradeoff made for scalability was ignoring pointer aliasing, which the authors conclude caused the tool both to generate false positives and to miss some real vulnerabilities. As further work, they propose a more detailed analysis that should reduce the false positive rate. The method has much room for improvement, but it did find bugs in large software systems that had previously been manually audited.