CEG 499/699:
Internet Security


College of Engineering & CS
Wright State University
Dayton, Ohio 45435-0001

Buffer Overflow

 

Prabhaker Mateti

 
Abstract:  A large number of exploits have been due to sloppy software development.  Exceeding array bounds is referred to in security circles as "buffer overflow."   These  are by far the most common security problems in software. This lecture explains the stack-smashing technique, and presents a few techniques that help in avoiding the exploit.
 
This work is supported in part by NSF DUE-9951380.
  06/07/01

Table of Contents

  1. Educational Objectives
  2. Buffer Overflow
    1. A Few Recent Buffer Overflow Exploits
    2. The Buffer Overflow Error
      1. Stack Smashing
      2. Heap overflows versus stack overflows
    3. Techniques of Avoiding Buffer Overflow
      1. Modern Programming Languages
      2. Careful Use of C/C++ Library Functions
      3. Static and Dynamically Allocated Buffers
      4. Newer Libraries
      5. Compilation Solutions in C/C++
      6. Non-executable user stack area
      7. No set-user-id Programs
  3. Lab Experiment
  4. Acknowledgements
  5. References

Educational Objectives

  1. Bring awareness of how widespread the buffer overflow flaw is.
  2. Show several real-life examples of buffer overflow.
  3. Describe the stack smashing technique.
  4. Describe several techniques of overflow exploit avoidance.

Buffer Overflow

"Quick: What's the computer vulnerability of the decade?  It's not the Y2K bug, according to computer science and security analysts, but a security weakness known as the buffer overflow. "

November 23, 1999 news.cnet.com/news/0-1003-200-1462855.html 

Buffer overflow is a common programming error.  No operating system, together with its bundled utilities, is free of it.  Buffer overflows have been causing serious security problems for decades. In the most famous example, the Internet worm of 1988 exploited a buffer overflow in fingerd.  New instances have been reported several times so far in 2000. What is surprising is that even security-oriented software such as SSH and Kerberos has had these errors.

A Few Recent Buffer Overflow Exploits

  1. July 26, 2000 SPS Advisory #39: Adobe Acrobat Series PDF File Buffer Overflow. Vulnerable: Acrobat 4.05J for Windows 95 / 98 / NT / 2000, and others.  Acrobat overflows when reading a PDF file that has a long Registry or Ordering entry. These entries are part of the font CID system information that can be seen in PDF files generated by Acrobat. The overflow overwrites a local buffer, so EIP can be controlled and code prepared in the font CID system information can be executed. This opens the possibility of virus and trojan infection, system destruction, intrusion, and so on. ... A patch for this problem was released on 26 July: http://www.adobe.com/misc/pdfsecurity.html
  2. July 18, 2000: Security Advisory: Buffer Overflow in MS Outlook & Outlook Express Email Clients.  Author: Aaron Drew. Versions Affected: MS Outlook 97/2000 and MS Outlook Express 4/5.  A bug in a shared component of the Microsoft Outlook and Outlook Express mail clients can allow a remote user to write arbitrary data to the stack. This bug has been found to exist in all versions of MS Outlook and Outlook Express on both Windows 95/98 and Windows NT 4. The vulnerability lies in the parsing of the GMT section of the date field in the header of an email: bounds checking on the token representing the GMT is not properly handled. The bug can be witnessed by opening an email with an exceptionally long string directly preceding the GMT specification in the Date header field.  It lies in the shared library INETCOMM.DLL and has been successfully exploited on Windows 95, 98 and NT with both Outlook and Outlook Express.  ...  Microsoft was notified of this bug on July 3, 2000.
  3. May 24, 2000, KDE kdm Buffer Overflow Vulnerability, vulnerable: KDE 1.1.2 in  Linux Mandrake 7.0 and Caldera OpenLinux 2.3; bugtraq id 1279.
  4. May 8, 2000 Solaris Security Roundup  www.securityportal.com/topnews/weekly/solaris20000515.html  "... three weakness were discovered and discussed on BugTraq during the last week of April, as yet no patches are available from Sun. These weaknesses concern:

    Xsun Buffer Overflow
    lp -d option Buffer Overflow
    lpstat -r option Buffer Overflow"

     

The Buffer Overflow Error

The essence of this problem can be explained as follows.  The line strcpy(p, q) is a common piece of code in most systems programs.  An example is:

    char env[32];
    strcpy(env, getenv("TERM"));

The call strcpy(p, q) is proper only when

  1. p is pointing to a char array of size m,
  2. q is pointing to a char array of size n,
  3. m >= n,
  4. q[i] == '\0' for some i where 0 <= i <= n-1

Unfortunately, only a few programs verify that all of the above hold before invoking strcpy(p, q).  A buffer overflow occurs when an object of size m + d, with d > 0, is placed into a container of size m. This can happen whenever the programmer does not bounds-check what a function places into the variables of the program.  In the strcpy(p, q) above, if the string at q, including its terminating NUL, occupies more than m bytes (which is possible whenever n > m), the memory starting at &p[m] gets overwritten.
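The conditions listed above can be checked before copying. The following is a minimal sketch, not a library routine: copy_checked is a hypothetical helper name, and it assumes that q, when not NULL, really is NUL-terminated.

```c
#include <stddef.h>
#include <string.h>

/*
 * copy_checked is a hypothetical helper, not a libc function.
 * It copies the NUL-terminated string q into the m-byte array p only
 * when the whole string, terminator included, fits.
 * Returns 0 on success, -1 otherwise (p is then left as an empty string).
 */
int copy_checked(char *p, size_t m, const char *q)
{
    size_t len;

    if (p == NULL || m == 0)
        return -1;
    p[0] = '\0';          /* keep p a valid empty string on failure */
    if (q == NULL)        /* getenv() returns NULL for an unset variable */
        return -1;
    len = strlen(q);      /* assumes q is NUL-terminated */
    if (len + 1 > m)      /* +1 for the terminating NUL */
        return -1;
    memcpy(p, q, len + 1);
    return 0;
}
```

With this helper, the earlier example could be written as copy_checked(env, sizeof env, getenv("TERM")), with the caller deciding what to do when -1 is returned.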

A few other examples of such buffer overflows:

Buffer Overflow Exploits

An attacker exploits this programming mistake by injecting cleverly constructed data or executable code into the area beyond the declared size of the buffer.  If the "buffer" is a local C variable, the overflow can be used to force the function to run code of the attacker's choosing. This specific variation is often called a "stack smashing" attack. A buffer in the heap isn't much better: attackers have been able to use such overflows to control other variables in the program.

Stack Smashing

Stack-smashing attacks target a specific programming fault: careless use of variables allocated on the program's run-time stack such as local variables and function arguments.  The idea is straightforward: Insert attack code (for example, code that invokes a shell) somewhere and overwrite the stack in such a way that control gets passed to the attack code.   If the program being exploited runs with root privilege, the attacker gets that privilege in the interactive session. 

The paper by Aleph One, "Smashing The Stack For Fun And Profit,"  describes the technique in great detail, and is required reading. 
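To make the overwrite concrete without touching a real stack, the sketch below simulates a frame with a struct. This is an illustration under stated assumptions only: struct fake_frame and simulate_copy are hypothetical names, and real frame layouts depend on the compiler, the ABI, and the optimization level.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A simulated stack frame, NOT a real one. */
struct fake_frame {
    char      buf[16];     /* the local buffer */
    uintptr_t saved_ret;   /* stands in for the saved return address */
};

/*
 * Copies len bytes of input starting at the buffer, with no check against
 * the size of buf, as a careless strcpy-style copy would. The copy is capped
 * at the size of the whole struct so the simulation itself stays within
 * defined behavior.
 * Returns 1 if the simulated return address changed, 0 otherwise.
 */
int simulate_copy(struct fake_frame *f, const unsigned char *input, size_t len)
{
    uintptr_t before = f->saved_ret;

    if (len > sizeof *f)
        len = sizeof *f;
    memcpy((unsigned char *)f, input, len);   /* buf is the first member */
    return f->saved_ret != before;
}
```

An input no longer than buf leaves saved_ret untouched; a longer one replaces it with attacker-supplied bytes, which is the step that redirects control in a real attack.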

Heap overflows versus stack overflows

"Heap overflows are generally much harder to exploit than stack overflows (although successful heap overflow attacks do exist). For this reason, some programmers never statically allocate buffers. Instead, they malloc() or new everything, and believe this will protect them from overflow problems. Often they are right, because there aren't many people who have the expertise required to exploit heap overflows. But dynamic buffer allocation is not intrinsically less dangerous than other approaches. Don't rely on dynamic allocation for everything and forget about the buffer overflow problem. Dynamic allocation is not a cure-all."

For more details on heap overflows, read the article "w00w00 on Heap Overflows" cited in the references.

Techniques of Avoiding Buffer Overflow

Modern Programming Languages

Most modern programming languages are essentially immune to this problem, either because they automatically resize arrays (e.g., Perl), or because they normally detect and prevent buffer overflows (e.g., Ada95 and Java). However, the C language provides no protection against such problems, and C++ can easily be used in ways that cause this problem too.

Careful Use of C/C++ Library Functions

The discussion of this section is specific to current (2000) implementations on Unix, its variants, and Windows 98/NT/2000 of the standard C libraries, often called the libc library.  The awareness of the buffer overflow is now causing revisions of the library, and extra checks by compilers.

C users must avoid using functions that do not check bounds unless they have ensured that the bounds will never be exceeded. Functions to avoid in most cases include: strcpy(3), strcat(3), sprintf(3), and gets(3). These should be replaced with functions such as strncpy(3), strncat(3), snprintf(3), and fgets(3) respectively, but see the discussion below. The function strlen(3) should be avoided unless you can guarantee that there will be a terminating NUL (ASCII code zero) character to find. Other functions that may permit buffer overruns include fscanf(3), scanf(3), vsprintf(3), realpath(3), getopt(3), getpass(3), streadd(3), strecpy(3), and strtrns(3).
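As one illustration of such a replacement, the sketch below uses snprintf(3) where sprintf(3) might otherwise appear. The helper name format_greeting and the message format are hypothetical; the code assumes a C99-conforming snprintf, which returns the length the complete output would have had.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * format_greeting is a hypothetical helper. It writes "Hello, <name>!"
 * into the size-byte buffer out. snprintf never writes more than size
 * bytes, terminator included.
 * Returns 0 on success, -1 if the output was truncated or an error occurred.
 */
int format_greeting(char *out, size_t size, const char *name)
{
    int n = snprintf(out, size, "Hello, %s!", name);

    /* n is the length the full output would have had, excluding the NUL */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

Checking the return value matters: a silently truncated string can be a security problem of its own.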

Beware that strncpy(3) and strncat(3) have somewhat surprising semantics and are hard to use correctly. E.g., strncpy(3) does not NUL-terminate the destination string if the length of the source string is greater than or equal to the size passed for the destination. So be sure to set the last character of the destination string to NUL after calling strncpy(3).  Both strncpy(3) and strncat(3) require that you pass the amount of space available. Neither provides a simple mechanism to determine whether an overflow has occurred. Also note that strncpy(3) has a significant performance penalty compared to strcpy(3), because strncpy(3) NUL-fills the remainder of the destination.
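The advice about strncpy(3) can be sketched as a small wrapper. The name copy_truncating is hypothetical; the wrapper terminates the destination explicitly and reports truncation, which strncpy(3) itself does not do.

```c
#include <stddef.h>
#include <string.h>

/*
 * copy_truncating is a hypothetical wrapper around strncpy.
 * It copies at most size-1 bytes of src into dst and always NUL-terminates
 * when size > 0.
 * Returns 1 if src did not fit completely, 0 otherwise.
 */
int copy_truncating(char *dst, size_t size, const char *src)
{
    if (size == 0)
        return 1;                  /* no room even for the terminator */
    strncpy(dst, src, size - 1);   /* may leave dst unterminated ... */
    dst[size - 1] = '\0';          /* ... so terminate explicitly */
    return strlen(src) >= size;    /* strncpy does not report this itself */
}
```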

Static and Dynamically Allocated Buffers

The fact that a buffer is of a fixed length may be exploitable. The basic idea is that the attacker sets up a really long string so that, when the string is truncated, the final result will be what the attacker wanted (instead of what the developer intended). Perhaps the string is concatenated from several smaller pieces; the attacker might make the first piece as long as the entire buffer, so all later attempts to concatenate strings do nothing. Here are some specific examples:

An alternative is to dynamically (re-) allocate all strings instead of using fixed-size buffers. This general approach is recommended by the GNU programming guidelines, mainly because it permits programs to handle arbitrarily-sized inputs (until they run out of memory). However, one must be prepared for dynamic allocation to fail.  The program must be designed to be fail-safe when memory is exhausted.  The memory may be exhausted at some other point in the program than the portion where you're worried about buffer overflows. Also, since dynamic reallocation may cause memory to be inefficiently allocated, it is entirely possible to run out of memory even though there is enough virtual memory available to the program to continue. In addition, before running out of memory the program will probably use a great deal of virtual memory, easily resulting in "thrashing", a situation in which the system spends all its time just paging in and out. This can have the effect of a denial of service attack.
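A minimal sketch of the dynamic approach follows, with the allocation failure handled rather than ignored. The type struct dynstr and the function dynstr_append are hypothetical names chosen for this illustration; the growth policy (doubling from 16 bytes) is likewise an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* A growable string; hypothetical type for illustration. */
struct dynstr {
    char  *data;   /* NUL-terminated once allocated, NULL before that */
    size_t len;    /* length, not counting the NUL */
    size_t cap;    /* bytes allocated */
};

/*
 * Appends s to d, growing the allocation as needed.
 * Returns 0 on success, -1 if memory could not be obtained or the size
 * computation would overflow; on failure d is left unchanged.
 */
int dynstr_append(struct dynstr *d, const char *s)
{
    size_t add = strlen(s);
    size_t need;

    if (add > (size_t)-1 - d->len - 1)
        return -1;                      /* len + add + 1 would wrap around */
    need = d->len + add + 1;
    if (need > d->cap) {
        size_t newcap = d->cap ? d->cap : 16;
        char *p;

        while (newcap < need) {
            if (newcap > (size_t)-1 / 2) {
                newcap = need;          /* doubling would wrap around */
                break;
            }
            newcap *= 2;
        }
        p = realloc(d->data, newcap);
        if (p == NULL)
            return -1;                  /* the old block is still valid */
        d->data = p;
        d->cap = newcap;
    }
    memcpy(d->data + d->len, s, add + 1);
    d->len += add;
    return 0;
}
```

A caller would start from struct dynstr d = {NULL, 0, 0}, check every return value, and release d.data with free() when done. A real program would probably also cap the total size to limit the memory-exhaustion risk described above.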

Newer Libraries

Newer libraries for C include the strlcpy(3) and strlcat(3) functions, with prototypes:

size_t strlcpy (char *dst, const char *src, size_t size);
size_t strlcat (char *dst, const char *src, size_t size);

Both strlcpy and strlcat take the full size of the destination buffer as a parameter (not the maximum number of characters to be copied) and guarantee to NUL-terminate the result (as long as size is larger than 0). The strlcpy function copies up to size-1 characters from the NUL-terminated string src to dst, NUL-terminating the result. The strlcat function appends the NUL-terminated string src to the end of dst. It will append at most size - strlen(dst) - 1 bytes, NUL-terminating the result.
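Where strlcpy(3) is not installed, its behavior is easy to approximate. The following is a sketch of the semantics described above, under the hypothetical name my_strlcpy so that it cannot clash with a system-provided version; it is not the library's own source.

```c
#include <stddef.h>
#include <string.h>

/*
 * my_strlcpy: a sketch of strlcpy semantics under a hypothetical name.
 * Copies up to size-1 bytes of src into dst and NUL-terminates the result
 * if size > 0.
 * Returns strlen(src); a return value >= size means the copy was truncated.
 */
size_t my_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);

    if (size > 0) {
        size_t n = (srclen < size - 1) ? srclen : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}
```

Truncation can then be detected with a single comparison, e.g. if (my_strlcpy(buf, s, sizeof buf) >= sizeof buf), which is what strncpy(3) makes awkward.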

One nuisance is that such newer libraries are not, by default, installed in most systems.

Compilation Solutions in C/C++

Some newer compilers and compiler extensions perform bounds checking; see, e.g., http://www-ala.doc.ic.ac.uk/~phjk/BoundsChecking.html.  Such tools provide one more layer of defense, but it is not wise to depend on this technique as your sole defense. There are at least two reasons for this. First, most such tools provide only a partial defense against buffer overflows (and the "complete" defenses are generally 12-30 times slower). C and C++ were simply not designed to protect against buffer overflow. Second, for open source programs you cannot be certain what tools will be used to compile the program; using the default "normal" compiler for a given system might suddenly open security flaws.

StackGuard is a modification of the standard GNU C compiler, which is typically invoked through its driver, gcc. StackGuard works by inserting a "guard" value (called a "canary") in front of the return address; if a buffer overflow overwrites the return address, the canary's value (hopefully) changes and the system detects this before using the return address. This is quite valuable, but note that it does not protect against buffer overflows that overwrite other values (which an attacker may still be able to use to attack a system).
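The canary idea can be illustrated with a simulated frame. This is only a sketch of the principle, not StackGuard's implementation: struct guarded_frame, the three functions, and the constant CANARY_VALUE are all hypothetical, and a real compiler places and checks the canary in generated prologue and epilogue code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CANARY_VALUE ((uintptr_t)0x5AFEC0DEUL)  /* arbitrary, for this sketch */

/* A simulated frame: the canary sits between the buffer and the return address. */
struct guarded_frame {
    char      buf[16];
    uintptr_t canary;
    uintptr_t saved_ret;   /* stands in for the saved return address */
};

/* What a function prologue would do: plant the canary. */
void frame_enter(struct guarded_frame *f, uintptr_t ret)
{
    memset(f->buf, 0, sizeof f->buf);
    f->canary = CANARY_VALUE;
    f->saved_ret = ret;
}

/* A careless copy starting at buf; capped at the struct size so the
 * simulation itself stays within defined behavior. */
void unchecked_write(struct guarded_frame *f, const unsigned char *input,
                     size_t len)
{
    if (len > sizeof *f)
        len = sizeof *f;
    memcpy((unsigned char *)f, input, len);
}

/* What a function epilogue would do: returns 1 if the canary is intact. */
int frame_exit_ok(const struct guarded_frame *f)
{
    return f->canary == CANARY_VALUE;
}
```

A linear overflow from buf cannot reach saved_ret without first passing over the canary, so frame_exit_ok reports it. An overflow that stops short of the canary, or that damages other data, goes unnoticed, which is the limitation noted above.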

Non-executable user stack area

It is possible to modify the kernel of any OS so that the stack segment is not executable. E.g., see Solar Designer's patch at http://www.openwall.com/linux/ that modifies the Linux kernel 2.2.19. However, as of March 2001 this is not built into the standard Linux kernel 2.4.x. Part of the rationale for not including such a modification is that this protection is illusory; attackers can simply force the system to call other "interesting" locations already in the program (e.g., in its library, the heap, or static data segments). Also, sometimes Linux does require executable code in the stack, e.g., to implement signals and to implement GCC "trampolines".

Even in the presence of a non-executable stack, Linus Torvalds, the original author and now the integrator of the Linux kernel, explains that "It's really easy. You do something like this: 1) overflow the buffer on the stack, so that the return value is overwritten by a pointer to the system() library function. 2) the next four bytes are crap (a "return pointer" for the system call, which you don't care about) 3) the next four bytes are a pointer to some random place in the shared library again that contains the string "/bin/sh" (and yes, just do a strings on the thing and you'll find it). Voila. You didn't have to write any code, the only thing you needed to know was where the library is loaded by default. And yes, it's library-specific, but hey, you just select one specific commonly used version to crash. Suddenly you have a root shell on the system. So it's not only doable, it's fairly trivial to do. In short, anybody who thinks that the non-executable stack gives them any real security is very very much living in a dream world. It may catch a few attacks for old binaries that have security problems, but the basic problem is that the binaries allow you to overwrite their stacks. And if they allow that, then they allow the above exploit. It probably takes all of five lines of changes to some existing exploit, and some random program to find out where in the address space the shared libraries tend to be loaded."

No set-user-id Programs?

An attacker targets set-user-id (suid) programs so that after the exploit he is the root, and can do arbitrary things.  So, some "people believe that if their program is not running suid root, they don't have to worry about security problems in their code, since the program can't be leveraged to achieve greater access levels. That idea has some merit, but is still a risky proposition. For one thing, you never know who is going to take your program and set the suid bit on the binary. When people can't get something to work properly, they get desperate. We've seen this sort of situation lead to entire directories of programs needlessly set setuid root."

"There can also be users of your software with no privileges at all. That means any successful buffer overflow attack will give them more privileges than they previously had. Usually, such attacks involve the network. For example, a buffer overflow in a network server program that can be tickled by outside users may provide an attacker with a login on the machine. The resulting session has the privileges of the process running the compromised network service. This type of attack happens all the time. Often, such services run as root (and generally for no good reason other than to make use of a privileged low port). Even when such services don't run as root, as soon as a cracker gets an interactive shell on a machine, it is usually only a matter of time before the machine is "owned" -- that is, the attacker gains complete control over the machine, such as root access on a UNIX box or administrator access on a Windows NT box. Such control is typically garnered by running a different exploit through the interactive shell to escalate privileges." [Quoted from http://www-4.ibm.com/software/developer/library/buffer-defend.html?dwzone=security]

Conclusion

In short, it's better to work first on developing a correct program that defends itself against buffer overflows. Then, after you've done this, by all means use techniques and tools like StackGuard as an additional safety net. If you've worked hard to eliminate buffer overflows in the code itself, then StackGuard is likely to be more effective because there will be fewer "chinks in the armor" that StackGuard will be called on to protect.


Lab Experiment

All work should be carried out in Operating Systems and Internet Security (OSIS) Lab, 429 Russ.   Use any of the PCs numbered 19 to 30.  No other WSU facilities are allowed. 

Objective: Understand the stack smashing buffer exploit thoroughly.

  1. Download the article by Aleph One into your own disk.  You will be extracting the source code of exploit3.c and exploit4.c files from it.
  2. Study the code of exploit3.c and exploit4.c that you extracted. 
  3. Improve the code so that there are no warning messages from gcc even after using the flags as in
        gcc -ansi -pedantic -Wall.
  4. Reduce the size of their compiled binaries by at least 5%.  Make sure no functionality is lost.
  5. Answer the question: What is the "environment"?
  6. Log in as any non-root user, and run the exploit3 program.
  7. Answer the question: Why does exploit3.c run system("/bin/bash") at the end of main()?
  8. Turn in a lab report of say 2-4 pages with answers to the questions above, thoroughly describing your changes, and how you verified that there was no loss of functionality.  Attach hard copies of properly indented versions of your exploit[34].c files.  Use indent -kr.

Acknowledgements

The section on "Techniques of Avoiding Buffer Overflow" is based on the "Secure Programming for Linux and Unix HOWTO" and "The Unix Secure Programming FAQ."


References

  1. Aleph One, "Smashing The Stack For Fun And Profit," Phrack, Vol 7, Issue 49, File 14 of 16, www.phrack.com.  A classic article, but it has a few inaccuracies.  Required Reading.
  2. Arash Baratloo, Navjot Singh, and Timothy Tsai, "Transparent Run-Time Defense Against Stack Smashing Attacks," Usenix 2000, http://www.bell-labs.com/org/11356/docs/usenix00/paper.html  Reference.
  3. Matt Conover and WSD, "w00w00 on Heap Overflows," January 1999, www.w00w00.org/files/articles/heaptut.txt  Required Reading.
  4. Crispin Cowan, Calton Pu, Dave Maier, Heather Hinton, Jonathan Walpole, Peat Bakke, Steve Beattie, Aaron Grier, Perry Wagle and Qian Zhang, "StackGuard: Automatic Adaptive Detection and Prevention of Buffer-Overflow Attacks," 1998, www.cse.ogi.edu/DISC/projects/immunix/StackGuard/usenixsc98_html/  Reference.
  5. DilDog, "The Tao of Windows Buffer Overflow," Date unknown, www.cultdeadcow.com/cDc_files/cDc-351/  Worth a visit.
  6. Peter Baer Galvin, "The Unix Secure Programming FAQ: Tips on security design principles, programming methods, and testing," www.sunworld.com/sunworldonline/swol-08-1998/swol-08-security.html  Required Reading.
  7. mudge@l0pht.com, "Compromised - Buffer Overflows, from Intel to SPARC Version 8," Date unknown.  Reference.
  8. Nathan P. Smith, "Stack Smashing Vulnerabilities in the UNIX Operating System," 1997, Southern Connecticut State University.  Reference.
  9. David A. Wheeler, "Secure Programming for Linux and Unix HOWTO," April 2000, www.linuxdoc.org/HOWTO/Secure-Programs-HOWTO.html  Reference.
06/07/01 03:29:22 PM
Open Content Copyright © 2001 pmateti@cs.wright.edu