v1.00
The purpose of this document is to educate the reader on secure programming practices. The goal is to provide the user with techniques that can be applied immediately to their job in the software industry. We will cover the process involved in designing a secure application, as well as common mistakes and flaws that developers introduce in the implementation of a product.
In the hectic pace of today’s software development cycles, someone with little or no security experience is all too often brought into a project. The individual is then tasked with a crucial role in the implementation of an application that has security relevance. This is by no means the fault of the individual, but usually of the organization, which is rushing to produce a product. Little do they realize that in the end this could hurt the organization more than investing in education in the first place.
There are some basic principles which, when applied, can alleviate exposure to the majority of security problems. It should be noted that these problems apply to any type of application, not only security products. The fact that security products also contain many of these problems is disturbing, to say the least.
Vulnerabilities are most often the result of programs being coerced to operate in a different manner than their author intended. There are usually patterns that these flaws follow, and our hope is to demonstrate how to recognize these. In many cases, it’s as simple as consistently thinking ‘what if?’, and emulating the mindset of an attacker.
A number of studies have been performed on the reliability of computer software. Researchers at the University of Wisconsin tested a number of basic utility programs that are included with different versions of the UNIX operating system [2]. In their paper, “Fuzz Revisited: A Re-examination of the Reliability of UNIX Utilities and Services”, the researchers test these utilities by passing random input data to the programs. In the original test, performed in 1990, over 40% of normal programs and 25% of X Window programs failed (crashed or hung). When the study was performed again in 1995, the failure rate had improved but was still 18-23%. The failure rate of commercial operating systems was much higher than that of freely available systems such as Linux, which rated 2nd. The failure rate of public GNU software was the lowest of all, placing 1st, at 7%.
The study reports that the problems were the result of “clumsy or confusing code that did the wrong things”. The key result is that “while these errors can certainly cause programs to crash when they are fed random streams of data, these errors are exactly the kinds of problems that can be exploited by carefully crafted streams of data to achieve malicious results.”
One thing to keep in mind from the above study is that the problems discussed were the result of attacking the target programs with random data. The attacks did not involve studying the architecture and source code of the programs themselves. A focused attack would have resulted in much higher failure rates. Remember, it only takes a single vulnerability to compromise your software system.
More esoteric and complex attacks take advantage of multiple flaws within a combination of software subsystems. Independently these flaws are not significant, but in combination they can introduce serious security concerns. Such attacks are less common than directed attacks against a single piece of software, but they do exist. It is therefore important to stress that even what appears to be an insignificant security concern can prove significant when it appears within a complex environment such as a modern operating system.
This document is currently maintained by Oliver Friedrichs (of@securityfocus.com). This document is a work in progress and is expanded upon by discussions that have taken place on the SECPROG mailing list at http://www.securityfocus.com/forums/secprog/.
This section discusses secure programming techniques that apply to both UNIX and Windows operating systems. It covers topics that are applicable to a variety of programming languages and operating systems. They are programming conventions or concepts that can be applied to computing in general.
By far one of the most common security vulnerabilities, buffer overflows run rampant in many of today's applications. Surprisingly enough, this problem is not new, and in many cases has existed in the same operating systems and applications for decades (such as some UNIX variants). The frequency with which this type of problem is being discovered and exploited in mission critical software has grown significantly within the past decade. This isn't because the problems didn't exist, but because this type of attack required a higher level of sophistication than the average attacker possessed, and most creators and users of the technology were previously not interested in exploiting it to begin with. Today, with cookie-cutter instructions on how to take advantage of these problems, and the sheer increase in the number of miscreants, minimal expertise is required.
The purpose of this section is not to describe how to exploit a buffer overflow vulnerability. However, it is important to understand the effects of the problem and how it impacts a running program.
A buffer overflow occurs when a piece of data is copied into a location in memory that is not large enough to hold it. The copy succeeds, but memory outside the boundary of the target is also overwritten. Variables in a program are allocated either on the program's stack or on the program's heap. It is therefore common to hear the terms stack overflow and heap overflow. While both types of overflow can be exploited, the stack overflow is in many cases much easier.
Let's look at an example of flawed source code which contains a buffer overflow.
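#include <string.h>

void function(char *str)
{
    char buffer[16];

    strcpy(buffer, str);         /* flaw: no check that str fits in buffer */
}

int main(void)
{
    char large_string[256];
    int i;

    for (i = 0; i < 255; i++)
        large_string[i] = 'A';
    large_string[255] = '\0';    /* terminate the 255-byte string */
    function(large_string);
    return 0;
}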
In the above program, function() copies a string of 255 bytes into a 16-byte buffer using strcpy(), without checking its size. If you run this program, you will typically get a segmentation violation.
When function() is executed, its stack frame has a layout as follows, simplified for the example:
As the 255 character string is copied into the buffer, the buffer fills up after the first 16 bytes. After this, other variables on the stack are overwritten, including the function return address. This address specifies where the program is to continue execution when returning from the current function. By modifying this address, the attacker can cause the program to continue execution at a different location, including in the buffer they have just sent in. If machine code instructions exist within the buffer, and the address is overwritten to point to this code, the code will be executed.
By using the strncpy() function instead, this function could be fixed to safely copy the string, only copying as much data as the target buffer can contain.
Many functions that are prone to buffer overflows have safer counterparts that accept a size parameter to restrict how much data is copied.
The strcpy() function does not perform any length validation. Use the strncpy() function instead to restrict the length of data copied. When using strncpy(), you need to explicitly NUL-terminate the string, since termination will not occur if the length of the source string is greater than or equal to the size specified.
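As a minimal sketch of this advice, the flawed function from the earlier example could be rewritten along the following lines. The name copy_bounded(), the BUFFER_SIZE constant and the return convention are illustrative assumptions, not part of the original example.

```c
#include <string.h>

#define BUFFER_SIZE 16

/* Hypothetical replacement for function(): copies at most
 * BUFFER_SIZE - 1 bytes of str into dst and always terminates dst.
 * Returns 1 if the input had to be truncated, 0 otherwise. */
int copy_bounded(char dst[BUFFER_SIZE], const char *str)
{
    strncpy(dst, str, BUFFER_SIZE - 1);
    dst[BUFFER_SIZE - 1] = '\0';    /* strncpy() does not terminate on truncation */
    return strlen(str) >= BUFFER_SIZE;
}
```

Whether silent truncation is acceptable depends on the application; the return value lets the caller reject oversized input instead.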
The strcat() function, much like strcpy(), does not perform any length checking on the string to which it appends. Use strncat() to restrict the length of data copied. The strncat() function appends at most the number of characters specified in the size parameter and then NUL-terminates the resulting string. The size passed must therefore be the space remaining in the destination minus one byte for the terminator, not the total size of the buffer.
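The size calculation is easy to get wrong, so a short sketch may help. The helper name append_bounded() is hypothetical, and it assumes that dst already holds a terminated string within dst_size bytes.

```c
#include <string.h>

/* Hypothetical helper: appends src to the string already in dst, where
 * dst_size is the total size of the dst array. The size passed to
 * strncat() is the space remaining, minus one byte for the terminator. */
void append_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);

    if (used + 1 >= dst_size)
        return;                     /* no room left for any data */
    strncat(dst, src, dst_size - used - 1);
}
```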
The sprintf() function, when used to format string variables, is a common place for buffer overflows to occur. Use snprintf() to restrict the length of data printed into the buffer. The snprintf() function returns the total number of characters that would be required to store the formatted data, not counting the terminator. If this is greater than or equal to the size passed in, the output did not fit: C99-conforming implementations truncate it to fit and NUL-terminate it, while some older implementations behave differently. Always check the return value of snprintf() to ensure that the complete output was written to the buffer.
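A minimal sketch of checking the snprintf() return value, assuming C99 behavior. The function name format_greeting() and the format string are illustrative only.

```c
#include <stdio.h>

/* Hypothetical helper: formats a greeting into out (out_size bytes).
 * Returns 0 on success, -1 on an output error or if the result did not fit. */
int format_greeting(char *out, size_t out_size, const char *name)
{
    int needed = snprintf(out, out_size, "Hello, %s!", name);

    if (needed < 0)
        return -1;                  /* encoding or output error */
    if ((size_t)needed >= out_size)
        return -1;                  /* output was truncated */
    return 0;
}
```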
The gets() function is inherently flawed and should never be used. gets() has no provision for specifying a length, so any use of it can lead to an overflow. It will read from standard input until a newline or EOF is received, filling the specified buffer. No length checking is ever performed, making overflowing the buffer trivial.
The fgets() function should be used in place of the gets() function. It reads from the specified file stream until a newline or EOF is received, or until ‘n - 1’ characters have been read, whichever comes first.
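As a sketch of the replacement, the following helper reads a single line with fgets(). The name read_line() is hypothetical; unlike gets(), fgets() keeps the newline in the buffer, so the helper strips it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from stream into line (size bytes),
 * stripping the trailing newline if present.
 * Returns 0 on success, -1 on EOF or a read error. */
int read_line(FILE *stream, char *line, size_t size)
{
    if (fgets(line, (int)size, stream) == NULL)
        return -1;
    line[strcspn(line, "\n")] = '\0';
    return 0;
}
```

A line longer than the buffer is returned in pieces over successive calls, so a caller that needs whole lines has to detect the missing newline itself.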
Care needs to be taken when using these functions to read strings into fixed size buffers. Ensure that you specify a maximum length to be read into a specific buffer. This can be best described by demonstrating a vulnerable and safe version:
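A minimal sketch of such a pair follows, under the assumption that the functions in question are the scanf() family used with the %s conversion. sscanf() is used here so that the input is a plain string, and both function names are illustrative.

```c
#include <stdio.h>
#include <string.h>

/* Vulnerable: %s places no limit on how much is stored in buf. */
int read_word_unsafe(const char *input)
{
    char buf[256];

    return sscanf(input, "%s", buf);
}

/* Safe: the field width caps the conversion at 255 characters,
 * leaving one byte for the terminator. Returns the number of
 * characters stored, or -1 if no word was found. */
int read_word_safe(const char *input)
{
    char buf[256];

    if (sscanf(input, "%255s", buf) != 1)
        return -1;
    return (int)strlen(buf);
}
```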
In the second example above, only a maximum of 255 characters can be read into the specified buffer, while in the vulnerable example, an unlimited number can be read in, overwriting the stack in the function.
There have been a few instances where overflows have occurred due to
unsafe usage of the memcpy() function. This may occur when
the length specified to the memcpy() function can be manipulated by
an outside source. It is important to ensure that the length is not
larger than the memory structure being copied into. A good example
of how an overflow like this can occur is illustrated as follows:
The above example is taken from an actual vulnerability that was present in the BIND (Berkeley Internet Name Daemon) distribution, and resulted in a number of vulnerabilities in various programs. It has been simplified for the purpose of this example. The above function copies hp->h_length number of bytes into the ‘address’ variable (which is 4 bytes). Under normal circumstances, hp->h_length will always be 4, since that is the size of an internet address. If, however, an attacker can manipulate the h_length variable, which he can, if he can spoof a fake DNS reply, he can make the length larger, and pass in more data via the hp->h_addr_list variable. This will cause more than 4 bytes to be copied into the ‘address’ variable, overflowing the variable and copying data into the stack. Always ensure that you check the length before performing such an action. For example:
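A minimal sketch of such a check, not the actual BIND code. The helper name copy_first_address() is hypothetical, and it assumes an IPv4 destination of type struct in_addr.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: copies the first address of hp into *address only
 * if the reported length matches the size of the destination exactly.
 * Returns 0 on success, -1 if the entry is missing or has the wrong length. */
int copy_first_address(const struct hostent *hp, struct in_addr *address)
{
    if (hp == NULL || hp->h_addr_list == NULL || hp->h_addr_list[0] == NULL)
        return -1;
    if (hp->h_length != (int)sizeof(*address))
        return -1;                  /* refuse anything but a 4-byte address */
    memcpy(address, hp->h_addr_list[0], sizeof(*address));
    return 0;
}
```

Note that the copy uses the size of the destination rather than the attacker-influenced h_length, so the two can never disagree.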
In reality, this vulnerability was eventually fixed in the BIND system itself, protecting users from this specific mistake. Since Internet addresses cannot be anything but 4 bytes, it was ensured that the hp->h_length variable was always 4 bytes when received from the network.
The C++ language contains some unique additional problems to be conscious of. For example, the following will open the program up to a buffer overflow attack in much the same way that the gets() function in C does:
No length checking is performed when reading characters in the above function. Ensure that the maximum buffer size is specified by first using the cin.width() member function.
In some cases it is necessary to validate user input, and remove characters or data that are illegitimate. A good example may be reading in a username for authentication. It is possible to strip out all invalid characters that we know of, such as high-bit characters, spaces, or numbers. The better way, however, is to strip out everything except that which we want to allow. So, instead of guessing which characters may be dangerous and stripping them out, only allow those that we know are safe. Getting this wrong is a common mistake when passing data to a second program through a shell command.
One of the most prominent examples of this problem occurred in a web-based CGI application known as phf, which shipped with NCSA and Apache web servers by default. phf was one of the leading causes of internet break-ins a number of years ago. The phf program stripped out known bad characters before passing the data to a program which was called via popen(). As it happens, it missed one character, the newline (\n) character, represented as %0a in the URL query string. By using this character in the data that was passed to the program, an attacker could execute arbitrary commands on the target host. When parsed by the shell interpreter on the remote host, the newline character acted as a command separator: the string before the newline was treated as one command, and the string after it as a new command. By asking for the following URL, it was possible to execute the command “cat /etc/passwd” on the target host, and view the password file in the web server’s response.
/cgi-bin/phf?Qalias=hell%0acat%20/etc/passwd%0a
After this vulnerability was found, a fix was made to a common library function that was responsible for cleaning the input. The newline character was added to a list of characters that were removed from the input. This was fine and dandy for a period of time, until someone found out that bash (the Bourne Again Shell), a common UNIX shell interpreter, also allowed the ASCII character 255 as a command separator. This opened up the same attack again for any operating system that had bash as its default shell (Linux). If the fix had instead only allowed known good characters, this problem would never have recurred.
Incorrect

#include <stdlib.h>
#include <string.h>

#define BAD "/ ;[]<>&\t"

char *query()
{
    char *user_data, *cp;

    /* Get the data */
    user_data = getenv("QUERY_STRING");
    if (user_data == NULL)
        return NULL;
    /* Replace known bad characters */
    for (cp = user_data; *(cp += strcspn(cp, BAD)); )
        *cp = '_';
    return user_data;
}

Correct

#include <stdlib.h>
#include <string.h>

#define OK "abcdefghijklmnopqrstuvwxyz" \
           "ABCDEFGHIJKLMNOPQRSTUVWXYZ" \
           "1234567890_-.@"

char *query()
{
    char *user_data, *cp;

    /* Get the data */
    user_data = getenv("QUERY_STRING");
    if (user_data == NULL)
        return NULL;
    /* Replace all but known good characters */
    for (cp = user_data; *(cp += strspn(cp, OK)); )
        *cp = '_';
    return user_data;
}

In the incorrect example, only known bad characters are replaced in the query string, which leaves any unknown dangerous characters in place. The correct example is safer: it replaces everything except known good characters.
It is recommended that you never use DNS names for any type of authentication. The only secure authentication mechanism is one that is cryptographically secure. If you must use DNS however, it is important to understand the correct way in which to verify a hostname.
Let's assume that you wish to restrict the usage of a particular network service you’re writing to a single authorized, hard-coded hostname. When a connection is received, we obtain the connecting address from the accept() call. With this address, we can now determine the DNS hostname of the connecting host. We pass the address into a validation function.
Incorrect

#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

int validate(u_int32_t ipaddr, char *hostname)
{
    struct in_addr ia;
    struct hostent *he;

    memset(&ia, 0, sizeof(ia));
    ia.s_addr = ipaddr;
    /* Reverse lookup only */
    he = gethostbyaddr(&ia, sizeof(ia), AF_INET);
    if (!he)
        return 0;
    if (!he->h_name)
        return 0;
    if (!strcmp(he->h_name, hostname))
        return 1;
    return 0;
}

Correct

/* Requires the same headers as the incorrect version */
int validate(u_int32_t ipaddr, char *hostname)
{
    struct in_addr ia;
    struct hostent *he;
    int count;

    memset(&ia, 0, sizeof(ia));
    ia.s_addr = ipaddr;
    /* Reverse lookup */
    he = gethostbyaddr(&ia, sizeof(ia), AF_INET);
    if (!he)
        return 0;
    if (!he->h_name)
        return 0;
    if (strcmp(he->h_name, hostname))
        return 0;
    /* Forward lookup: the name must resolve back to the same address */
    he = gethostbyname(hostname);
    if (!he)
        return 0;
    for (count = 0; he->h_addr_list[count]; count++)
        if (!memcmp(&ipaddr, he->h_addr_list[count], 4))
            return 1;
    return 0;
}
In the “Incorrect” example, a reverse lookup is performed on the passed in IP address. Once the reverse lookup is complete, the hostname that was looked up is compared to our “allowed” hostname. If they match, the authentication succeeds. This introduces a security vulnerability that makes it fairly trivial for an attacker to circumvent this security mechanism. If a valid user connects, the following will occur:
An attacker can exploit this via the following:
The “Correct” scenario solves this problem since it then performs a forward lookup on the hostname to ensure that the hostname resolves to the connecting IP address. Since a hostname can have multiple IP addresses, all of them are checked. This check makes it much more difficult for an attacker to perform the above attack. The attacker would now also need to control the DNS server that serves the allowed.com domain, that is not on their own network. They would have to compromise the remote host’s network to control the DNS server, making this attack much more difficult and almost infeasible.
Checking the return value of functions may seem logical, but in some cases, it is simply overlooked! In any security critical application it is very important that the return values of all functions are checked. Let’s look at an example of how this can impact security:
#include <fcntl.h>
#include <unistd.h>

char *maketemp(void)
{
    char *name = "/tmp/tempfile";

    creat(name, 0644);    /* return value ignored */
    chown(name, 0, 0);    /* return value ignored */
    return name;
}
The above scenario has much more wrong with it than simply not checking the return value of functions, but let's look at the part related to return values. In the above example, suppose an attacker copies the UNIX shell, /bin/sh, to /tmp/tempfile and makes the file setuid. The creat() call may then fail, for instance because the file already exists and cannot be opened for writing. Since the return value of creat() is never checked, the function will happily continue and change the owner of the attacker's file to root, giving the attacker a setuid root shell and with it super-user access. Some operating systems prevent this by resetting the setuid bit when a chown is performed; the scenario is used here simply for demonstration purposes.
Always make sure you check the return value for failure or success. On UNIX, when a system call fails, the program wide errno variable is set to indicate the cause of the error.
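As a sketch of what checking every return value looks like, the following variant creates a file and reports failure to its caller instead of continuing blindly. The name create_checked() is hypothetical, and the use of O_EXCL is one possible design choice, made so that a file already planted at the path causes a clean failure.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical rework of maketemp(): every call is checked, and the file
 * is created with O_EXCL so that an existing file (or symbolic link) at
 * the path causes a failure instead of being reused.
 * Returns an open descriptor, or -1 with errno describing the failure. */
int create_checked(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);

    if (fd < 0)
        return -1;                  /* errno says why, e.g. EEXIST */
    if (fchmod(fd, 0600) != 0) {    /* set the exact mode, independent of umask */
        int saved = errno;

        close(fd);
        errno = saved;              /* report the fchmod() failure, not close() */
        return -1;
    }
    return fd;
}
```

Operating on the descriptor with fchmod(), rather than on the path, also ensures that the file being modified is the one that was just created.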
Device drivers introduce new security concerns into the equation. Device drivers have a unique characteristic, in that they run within the operating system’s kernel. This fact gives the device driver complete control over the operating system, and all operating system memory. Device drivers are not limited in the same fashion that individual processes are. Device drivers can access all operating system memory, and are not compartmentalized like regular system processes. A flaw or bug within a device driver can cause the entire operating system to malfunction and crash. Under UNIX, this is known as a kernel panic, and under Windows based operating systems, the blue screen. In a normal process this would cause the process to crash, leaving the operating system itself intact.
Device drivers normally expose an interface, or API, to the “user-land” portion of the operating system. It is this interface that can allow unwanted side effects if the appropriate precautions are not taken. System calls, which provide the core set of functionality that the operating system exposes to the user, also suffer from this problem, although it is unlikely that you will be adding new system calls to the operating system. That being said, the standard system calls that ship with the operating system can sometimes have these same problems.
When writing a device driver that has outside inputs, it is imperative that extreme caution is taken to sanity check all input variables. The following are precautions that should be taken to validate input data:
An example of a very interesting historical bug is one that appeared in a BSD based console driver many years ago [4]. The driver contained a screen buffer in memory, but did not correctly validate particular keystrokes, such as the backspace key. A user could use the backspace key to back up to the beginning of the screen, and then continue to backspace outside of the screen buffer. It was then possible to scribble on memory locations preceding the buffer.
This section discusses secure programming techniques that apply to UNIX based operating systems. The UNIX family of operating systems contains a number of security mechanisms that do not appear under other operating systems such as Microsoft Windows. These differences are ones that the software developer should be aware of when developing UNIX based software.
All programs that run on UNIX operating systems run under the privilege of a particular system user and group. Each process possesses a set of credentials, which are used throughout the operating system to perform access validation checks. Permissions on files, directories and system resources are all assigned based on the user intended to use them. The operating system kernel itself is the entity that performs access validation checks to determine whether a process has the privilege to access a resource. It is very important to understand how the credential system works, as many security vulnerabilities have been introduced when this is not fully understood.
Current POSIX compliant operating systems apply 3 types of credentials to each system process:
| Credential | Description |
|---|---|
| Real UID and Real GID | The real user ID and real group ID the process is running as. A process can only change the uid and gid that it is running as if the process is running as the super-user, or if the user ID or group ID it wishes to change to is the same as the effective user ID or group ID (see below). |
| Effective UID and Effective GID | Each process has an effective user ID, which under normal circumstances will initially be set to the same value as the real user ID. When a setuid file is executed, however, the effective user ID is set to the owner of the file. Therefore, after executing a setuid root program, the effective uid of the process will be 0 (root), while the real uid will remain that of the user. This is the credential which system calls will use to determine whether the process has sufficient privilege to access a resource. |
| Saved UID and Saved GID | Each process has a saved user ID and group ID. This is set to the effective user ID or effective group ID when a setuid or setgid program is executed. These saved IDs allow the process to temporarily drop privileges (by setting the effective uid or gid to a less privileged user), and then regain those privileges in the future. |
NOTE: POSIX defines no system call to obtain or set the saved uid or gid directly (some systems provide the non-standard getresuid() and getresgid()). Since the saved IDs are equivalent to the effective IDs at program execution, you can determine the saved IDs by calling geteuid() and getegid() at program startup.
Normally when a user executes a program, the program runs with the permissions of the current user. The UNIX operating system supports the ability for a program executed by a user to run with the permissions of the file’s owner or group instead. This occurs if the file has the “setuid” or “setgid” bit set in the file permissions. If the “setuid” bit is set in the file permissions, the program will run with the permissions of the file’s owner. If the “setgid” bit is set, the program will run with the permissions of the file’s group. These two permissions can also be set simultaneously on a file, and the program will execute with both the user and group permissions of the file. Take for example the following file permissions:
-rwsr-sr-x 1 root wheel 32045 Jun 6 10:15 program
This program has both the “setuid” and “setgid” bits set on it, indicated by the “s” permission being present instead of the execute permission. When this program is run, regardless of who runs it, it will execute with user root and group wheel permissions. This presents very interesting security concerns, since a single flaw in this program can lead to a user elevating their privileges from a normal user to the root user (the super-user or administrator).
Specific precautions for dealing with issues in these types of programs are covered in this chapter.
Setuid and setgid programs are directly affected by all of the problems discussed in this manuscript. There are some general methodologies that should be applied:
Keep privileged portions of your program as simple and small as possible.
Drop privileges when you have accomplished the tasks that require those privileges. Accomplish these tasks at the beginning of the program, and then drop privileges.
Have a well defined API. This should cover all functions used internally in the program, as well as the interface exposed to the user. Look for possible security consequences in the exposed API, since this is a common route an attacker takes to exploit a privileged program.
Do not trust any external inputs. Ensure that you have sufficiently validated and cleaned the data before using it.
Many of the techniques discussed here, which apply to UNIX systems, apply to both network services and privileged setuid/setgid programs.
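The advice to drop privileges early can be sketched as follows. The function name drop_privileges() is hypothetical, and the sketch covers only the permanent case: in a setuid root program, setuid() resets the real, effective and saved user IDs together, while programs that are setuid to a non-root user may need platform-specific calls to clear the saved ID.

```c
#include <sys/types.h>
#include <unistd.h>

/* Sketch: permanently give up setuid/setgid privileges by resetting the
 * effective IDs to the real IDs. The group is changed first, because a
 * process that has already given up root may no longer change groups.
 * Returns 0 on success, -1 if a call fails or the drop did not take effect. */
int drop_privileges(void)
{
    uid_t uid = getuid();
    gid_t gid = getgid();

    if (setgid(gid) != 0)
        return -1;
    if (setuid(uid) != 0)
        return -1;
    /* Verify rather than assume that the privileges are gone. */
    if (geteuid() != uid || getegid() != gid)
        return -1;
    return 0;
}
```

A caller should treat a return value of -1 as fatal and exit, since continuing would mean running untrusted input through code that still holds privileges.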
Environment variables, much like command line options, offer a way to pass arbitrary data to programs. The security concerns of environment variables apply more to local system programs than network services, since network services do not accept environment variables in the same fashion (except for a few). Environment variables can be fashioned in a number of ways to elicit unexpected responses from programs.
Buffer overflow - The same problem that appears in a plethora of other situations also runs rampant in the handling of environment variables. The following is a typical example of the type of problem commonly seen in the field:
…
char *s, buf[128];

if (!(s = getenv("HOME")))
    return -1;
strcpy(buf, s);
…
In the above function, no consideration is given to the size of the “HOME” environment variable. Regardless of its size, it is copied into a 128 byte buffer, introducing a buffer overflow condition.
An extreme case of this problem was found by Thomas Ptacek in the FreeBSD operating system. Ptacek found a problem in the C runtime library on FreeBSD. The C runtime library is statically linked with every program on the system. As a result, every single setuid/setgid program on the system was vulnerable to a buffer overflow via its environment (as was every other program, but this is inconsequential).
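One way to avoid the problem is to verify the length before copying. The following is a sketch only; the helper name copy_env() is hypothetical, and rejecting oversized values (rather than truncating them) is a design choice.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies the value of the environment variable
 * `name` into buf (size bytes). Returns 0 on success, -1 if the variable
 * is unset or its value does not fit; oversized values are rejected
 * rather than silently truncated. */
int copy_env(const char *name, char *buf, size_t size)
{
    const char *s = getenv(name);
    size_t len;

    if (s == NULL)
        return -1;
    len = strlen(s);
    if (len >= size)
        return -1;
    memcpy(buf, s, len + 1);        /* includes the terminator */
    return 0;
}
```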
The following text is quoted from Ptacek’s message to the Bugtraq mailing list in February of 1997:
“There is a critically important security problem in FreeBSD 2.1.5's C runtime support library that will enable anyone with control of the environment of a process to cause it to execute arbitrary code. All executable SUID programs on the system are vulnerable to this problem.
An immediately exploitable problem is evident in "startup_setrunelocale()", which, if certain environment variables are set, will copy the value of "PATH_LOCALE" directly into a 1024 byte buffer on the routine's stack. An attacker simply needs to insert machine code and virtual memory addresses into the "PATH_LOCALE" variable, enable startup locale processing, and run a SUID program.”
Inheritance - It is important to note that environment variables are commonly inherited by child processes that are spawned by the main process (this depends on how the child is spawned, however). Therefore, be very careful when passing environment variables to a child program. You can use the execle() and execve() calls to run a program and specify its set of environment variables. To be safe, it is usually a good idea to build an environment from scratch, and specify only those variables which are required.
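Building the environment from scratch might look like the following sketch. The function name run_with_clean_env() and the particular PATH and IFS values are assumptions for illustration; a real program should list only what its child actually requires.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sketch: runs the program at `path` in a child process with an
 * environment built from scratch, so nothing is inherited from the caller.
 * Returns the child's exit status, or -1 on failure. */
int run_with_clean_env(const char *path, char *const argv[])
{
    char *const clean_env[] = {
        "PATH=/bin:/usr/bin",       /* assumed minimal search path */
        "IFS= \t\n",                /* assumed default field separators */
        NULL
    };
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(path, argv, clean_env);
        _exit(127);                 /* only reached if execve() failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```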
When a file is created by you, or a program that you are running, it is created with a default set of file permissions. These permissions are dictated by the setting of the process’s umask. This is a setting that is inherited from the login shell or parent process that executed the current process. It is important to ensure that files that are created do not have unsafe permissions, allowing unauthorized users to access them.
The umask setting can be adjusted with the umask() system call, which takes a set of bits as its parameter. These bits are cleared in the mode requested for each newly created file. Some example settings, assuming the program requests mode 0666 when creating the file, are:
umask(0) results in -rw-rw-rw-
This does not turn off any bits in the permissions of newly created files.
umask(022) results in -rw-r--r--
This turns off the write bits for the group and world portion of the file permission.
umask(066) results in -rw-------
This turns off the read and write bits for the group and world portion of the file permission.
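A program that must not depend on the umask it inherited can set its own before creating files. The following is a sketch; the function name create_private() is hypothetical.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch: creates a file that only its owner can read or write,
 * regardless of the umask inherited from the parent process.
 * Returns an open descriptor, or -1 on failure. */
int create_private(const char *path)
{
    mode_t old = umask(077);        /* clear all group and world bits */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0666);

    umask(old);                     /* restore the caller's setting */
    return fd;
}
```

Here the requested mode 0666 combined with a umask of 077 yields -rw-------. Note that the umask is process-wide, so changing it temporarily is not safe in a multithreaded program.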
Many vulnerabilities occur as the result of a program, not necessarily
privileged, accessing a well-known or predictable file on the file system.
A program containing this problem may open a file in the system temporary
directory, blindly writing data to the file. By utilizing symbolic links,
an attacker can often redirect this data to other files. There are 2
scenarios under which this type of attack is commonly launched:
/tmp/program.temp
By creating a symbolic link prior to execution, an attacker can point this file to other files that are owned by the privileged user that the program is executing as.
ln -s /etc/passwd /tmp/program.temp
When the program writes to this file, the symbolic link is followed, and the file written to is actually /etc/passwd.
ln -s /etc/passwd /tmp/program.temp
The attacker then expects another system user to execute the program, appending to the password file, assuming the user executing the program has this privilege.
The primary difference between these 2 scenarios is that in the first example, the program being executed is setuid root, and when executed runs as the super-user, while in the second scenario, the program is only executed with the executing user’s permissions.
There are a number of solutions to these problems:
There are a number of system provided functions that can be used to provide temporary files, some of which need to be used correctly to avoid security consequences.
The tmpfile() function creates a temporary file, and returns an open handle to the file stream. tmpfile() avoids the race condition between the generation of a temporary filename, and the creation of the file. tmpfile() is defined as follows:
FILE *tmpfile(void);
On many operating systems, tmpfile() will use the mkstemp() function to obtain and create a temporary filename, then unlink() the file, and fdopen() the file descriptor to return a stream handle.
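A short sketch of typical tmpfile() usage follows. The function name roundtrip_tmpfile() is hypothetical; it writes some text to an anonymous temporary file and reads the first line back.

```c
#include <stdio.h>

/* Sketch: writes text to an anonymous temporary file and reads the
 * first line back into out (out_size bytes).
 * Returns 0 on success, -1 on failure. */
int roundtrip_tmpfile(const char *text, char *out, size_t out_size)
{
    FILE *fp = tmpfile();
    int ok;

    if (fp == NULL)
        return -1;
    ok = fputs(text, fp) != EOF;
    rewind(fp);
    ok = ok && fgets(out, (int)out_size, fp) != NULL;
    fclose(fp);                     /* the file is removed automatically */
    return ok ? 0 : -1;
}
```

Because the file never has a name that the program handles, there is nothing for the program to clean up and no path for it to mistakenly reopen.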
Benefits
Restrictions
The mkstemp() function creates a temporary file, given a template, and returns an open file descriptor to the file. mkstemp() avoids the race condition between the generation of a temporary filename, and the creation of the file. mkstemp() is defined as follows:
int mkstemp(char *template);
A template is passed into the mkstemp() function which gives the path to the temporary file. In the template are a series of X’s, which are filled in by the function with random values to create the random temporary file name. An example usage would be:
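char template[] = "/tmp/tempfileXXXXXX";
fd = mkstemp(template);
The template must be a writable character array, not a string literal, because mkstemp() modifies it in place. When using this function, make sure that you specify at least six trailing X’s. Some operating systems support more than six X’s to increase the randomness of the filename.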
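Putting these points together, a sketch of mkstemp() usage might look as follows. The function name make_temp() and its interface are illustrative assumptions.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Sketch: creates a uniquely named temporary file. The template is a
 * writable array, because mkstemp() overwrites the trailing X's in place.
 * On success, returns the descriptor and copies the chosen name to
 * name_out (name_size bytes); returns -1 on failure. */
int make_temp(char *name_out, size_t name_size)
{
    char template[] = "/tmp/tempfileXXXXXX";
    int fd;

    if (name_size < sizeof(template))
        return -1;
    fd = mkstemp(template);
    if (fd < 0)
        return -1;
    memcpy(name_out, template, sizeof(template));
    return fd;
}
```

The caller is responsible for closing the descriptor and removing the file with unlink() when it is no longer needed.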
Benefits
Restrictions
The mktemp() function is used to generate a unique filename, without actually creating the file. mktemp() is defined as follows:
char *mktemp(char *template);
This function takes a template parameter in the same fashion as the mkstemp() function above. In the template are a series of X’s, which are filled in by the function with random values (sometimes) to create the random temporary file name. As with mkstemp(), the template is modified in place and must be a writable array. An example usage would be:
char template[] = "/tmp/tempfileXXXXXX";
filename = mktemp(template);
Benefits
Restrictions
When using mktemp() it is necessary to open the temporary file once the filename has been generated. Be careful when opening the file.
| open(filename, O_WRONLY | O_CREAT, 0644); |
The above call will create the file, succeeding even if the file already exists. This is dangerous, and can be used to overwrite existing files if a race condition exists.
| open(filename, O_WRONLY | O_CREAT | O_EXCL, 0644); |
The above is a safer way to create the file, since the call will fail, if the file already exists.
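The behaviour of O_EXCL can be sketched as follows. The helper name create_exclusive() is hypothetical, and the mode 0600 is an assumption that only the owner needs access to the file.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Create filename for writing, but only if nothing exists at that path.
 * Returns an open descriptor, or -1 on failure; errno is EEXIST when
 * the name was already taken. */
int create_exclusive(const char *filename)
{
    /* O_CREAT | O_EXCL makes the existence check and the creation
     * a single operation, so no other process can slip in between */
    return open(filename, O_WRONLY | O_CREAT | O_EXCL, 0600);
}
```

With O_CREAT and O_EXCL both set, open() also fails if the final path component is a symbolic link, which closes off the most common way of redirecting the creation to another file.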
There are a large number of different situations in which a race condition can occur, giving an attacker the ability to subvert access checks or file creations. There is a common pattern, which when identified, can be alleviated to secure the operation.
The problem that arises here is that in between the first and second operations, an attacker can manipulate the file, causing the permissions or status check to succeed, and the file operation to reference a different file.
This type of attack commonly utilizes symbolic links to take advantage of a program’s insecurity. Let’s look at an example source code fragment, which could be present in an insecure setuid root program:
int unsafeopen(char *filename)
{
struct stat st;
int fd;
/* obtain the files status information */
if (stat(filename, &st) != 0)
return -1;
/* make sure that the file is owned by root – uid 0 */
if (st.st_uid != 0)
return -1;
fd = open(filename, O_RDWR, 0);
if (fd < 0)
return -1;
return fd;
}
|
Essentially the above function does the following:
Since these are 2 separate system calls, there is no atomicity between them, leaving a time delay between the 2 specific operations. Within this time delay, it is possible for file and system characteristics to change. An attacker can exploit this in the following fashion:
A safe version of this function would do the following:
int safeopen(char *filename)
{
struct stat st, st2;
int fd;
/* obtain the file’s status information */
if (lstat(filename, &st) != 0)
return -1;
/* make sure the file is a regular file */
if (!S_ISREG(st.st_mode))
return -1;
/* make sure that the file is owned by root – uid 0 */
if (st.st_uid != 0)
return -1;
/* open the file */
fd = open(filename, O_RDWR, 0);
if (fd < 0)
return -1;
/* now we fstat() the file, to make sure it’s the same file still! */
if (fstat(fd, &st2) != 0) {
close(fd);
return -1;
}
/* here we make sure the inode and device numbers are the
* same in the file we actually opened, compared to the file
* we performed the initial lstat() call on.
*/
if (st.st_ino != st2.st_ino || st.st_dev != st2.st_dev) {
close(fd);
return -1;
}
return fd;
}
|
The above function uses lstat() instead of stat(). This returns the status of the link, if the specified filename happens to be a symbolic link. It then opens the file, and obtains the status of the open file descriptor. The inode and device numbers of the status information are compared (they are unique between files), and if they are not found to be identical, the function is aborted.
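On systems that provide the O_NOFOLLOW flag to open(), a shorter variant is possible. This is an alternative technique rather than the approach described above: the file is opened first, refusing symbolic links, and every check is then made on the open descriptor. The name open_checked() and its owner parameter are assumptions for this sketch.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Open filename read/write only if it is a regular file owned by owner
 * and is not reached through a symbolic link.
 * Returns an open descriptor, or -1 on failure. */
int open_checked(const char *filename, uid_t owner)
{
    struct stat st;
    int fd;

    /* O_NOFOLLOW makes open() fail if the last path component is a symlink */
    fd = open(filename, O_RDWR | O_NOFOLLOW);
    if (fd < 0)
        return -1;

    /* the checks are made on the descriptor, so they describe the file
     * that was actually opened, not whatever the name points to now */
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode) || st.st_uid != owner) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Note that O_NOFOLLOW only protects the final path component; directories earlier in the path must still be trustworthy.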
When implementing a network service that exposes an interface to the outside world, an inherent risk is present. While the developer may have taken all precautions to ensure that their code is correct and free from the obvious flaws, outside factors such as vulnerable operating system library calls can introduce unknown vulnerabilities into the service. UNIX systems possess the ability to limit this exposure by limiting a program’s view of the operating system. This is achieved by using the chroot system call.
The semantics of this call are as follows:
| int chroot(const char *path); |
The chroot system call is used by a program to alter its view of the current file system. Note that this only limits file system access, and the process still has access to other parts of the operating system (system and network calls for example). When used, the path argument specified to this call will be the new root directory of the filesystem. Once this has been performed, the process can no longer access files or directories outside of this new root directory. Only the super-user can execute the chroot system call.
Once you have used the chroot system call, you need to change to the new root directory. An example usage is given here:
if (chroot("/jail") < 0 || chdir("/") < 0)
    perror("Failure setting new root directory");
|
In many situations, once you have used the chroot call, it is also wise to drop super-user privileges. This will limit the exposure if a vulnerability does exist in your program. If a user is able to compromise your program, and break into the limited environment, he has a much greater chance of breaking out of this environment if he has super-user privileges.
A user with super-user access within the chrooted environment has access to key operating system functionality which can allow them to break out of the chroot environment:
Some operating systems provide protection from the above scenarios by implementing a mechanism called securelevels. This mechanism causes the operating system to run at a higher security level. This can prevent even the super-user from breaking out of the chroot environment. You should never expect this to be the case however, and should always expect a worst-case configuration.
It is common for a program to drop privileges and run as the “nobody” user after performing a chroot (and after performing all tasks which require super-user privileges). While some programs do this, it is important to remember that even when running as the “nobody” user, the process can affect the actions of other processes if they also run as the “nobody” user. It is suggested that a special account be created for the process to run as, and that this account not be used for any other purpose.
Ensure that the chroot environment is free from any setuid or setgid programs that may allow an attacker to escalate their privileges if compromised.
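One way to audit a chroot directory for such programs is to walk the tree and report every regular file with the setuid or setgid bit set. The sketch below uses nftw() for the walk; the names count_setid_files() and check_entry() are hypothetical, and the limit of 16 open descriptors is an arbitrary choice.

```c
#define _XOPEN_SOURCE 700
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

static int setid_count;            /* nftw() callbacks cannot receive user data */

/* Called by nftw() for every entry found below the starting directory. */
static int check_entry(const char *path, const struct stat *st,
                       int type, struct FTW *ftw)
{
    (void)ftw;
    if (type == FTW_F && (st->st_mode & (S_ISUID | S_ISGID))) {
        fprintf(stderr, "setuid/setgid file: %s\n", path);
        setid_count++;
    }
    return 0;                      /* keep walking */
}

/* Returns the number of setuid or setgid regular files under root,
 * or -1 if the tree could not be walked. */
int count_setid_files(const char *root)
{
    setid_count = 0;
    /* FTW_PHYS: do not follow symbolic links out of the tree */
    if (nftw(root, check_entry, 16, FTW_PHYS) != 0)
        return -1;
    return setid_count;
}
```

A check like this could be run at installation time, or at start-up before the program calls chroot.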
Most privileged programs need their privilege only to obtain access to a system resource that is restricted to the super-user, or to other specific accounts. In a network service, this is often required to allocate a privileged TCP or UDP port, which a normal user cannot bind to. A good example of this is a program like BIND (Berkeley Internet Name Domain), which needs to bind to port 53 to serve domain name queries. In a local privileged program, this is normally required to access other restricted system resources, such as memory, disk, or system configuration information.
While this privilege is normally only required upon initialization, there are many very large programs, consisting of hundreds of thousands of lines of complex source code, which never drop their privileges (today this occurs mostly in commercial programs).
Sometimes a program is given such a privilege without any forethought. In a number of past situations this has led to privileged programs containing trivially exploitable vulnerabilities. When the developers were asked about the reasoning behind this, it was found that the privilege had been granted to overcome some simple file access or system access restrictions, since the developer could not see any other solution.
It is important that any privileged program be designed to allocate all resources upon initialization, and then drop these privileges. Many privileged programs in the OpenBSD operating system were redesigned with this goal in mind.
To drop privileges, the process needs to set its effective user ID and group IDs to those of the less privileged user. This is accomplished by using the seteuid() and setegid() system calls to drop them temporarily, and the setuid() and setgid() system calls to drop them permanently (this is explained in more detail below in the setuid program section).
WARNING: When dropping privileges, ensure that you first change the group ID of the process (if this is necessary). If you set the user ID first, the program is no longer running as the super-user, and therefore does not have sufficient privilege to change the group ID! This will mean that the group ID privilege is not dropped, and anyone exploiting a vulnerability in the subsequent program will be able to obtain the permissions of the privileged group ID.
It is important that you check the return values from the setuid and setgid calls. When you are dropping privileges, and these calls fail, the privileges will not be dropped. If the return value is not checked, and appropriate action taken, the program will continue operation as the privileged user.
To drop privileges in a network service, you must choose a user for the program to run as once those privileges are dropped. In many cases, the “nobody” user is chosen, as this user has minimal access to the operating system in general.
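#include <pwd.h>
#include <unistd.h>

int drop(void)
{
    /* look up the unprivileged account to run as */
    struct passwd *pep = getpwnam("nobody");
    if (!pep)
        return -1;
    /* change the group ID first, while we still have the privilege to do so */
    if (setgid(pep->pw_gid) < 0)
        return -1;
    if (setuid(pep->pw_uid) < 0)
        return -1;
    return 0;
}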
|
When a privileged setuid or setgid program is executed, the process’s real user ID (uid) and real group ID (gid) remain set to the executing user’s uid and gid. The effective user ID (euid), effective group ID (egid), saved user ID and saved group ID, however, are set to those of the file’s owner and group (the privileged user). You can drop privileges temporarily or permanently, depending on your goal.
To drop privileges permanently, the program needs to set the euid, egid, saved uid and saved gid to the real uid and gid. When called by a privileged process, the setuid() and setgid() calls will set the real, effective and saved IDs to the specified value. Since the real uid and gid in a setuid program are those of the unprivileged user, the following example will set all 3 IDs to the current real IDs.
if (setgid(getgid()) < 0)
    return -1;
if (setuid(getuid()) < 0)
    return -1;
|
To temporarily drop privileges, with the intention of regaining them in the future, set the effective ID’s to the desired values. This works since the effective ID’s are used to perform system call and permission checks. This will ensure that the saved ID’s are preserved, allowing you to revert back to them in the future. Ensure that you store the values of the ID’s for future use however (unless you know that it will always be super-user).
struct passwd *pep = getpwnam("nobody");
uid_t saved_uid;
gid_t saved_gid;
if (!pep)
    return -1;
/* remember the privileged effective IDs so they can be restored */
saved_uid = geteuid();
saved_gid = getegid();
/* drop the group first, while the effective uid is still privileged */
if (setegid(pep->pw_gid) < 0)
    return -1;
if (seteuid(pep->pw_uid) < 0)
    return -1;
/* perform desired unprivileged operations then revert back */
if (setegid(saved_gid) < 0)
    return -1;
if (seteuid(saved_uid) < 0)
    return -1;
|
Random number generation has been an issue without an easy solution ever since random numbers were needed. Most operating systems provide a pseudo-random number generator library call, which is appropriate for some purposes, however remember the word “pseudo” in the name. Some operating systems offer built-in random number generators, usually via a device driver that provides random data. This is most often accomplished via a kernel driver that mixes and hashes various events and variables on the system. We will cover some well known methods for obtaining random data in various operating systems.
Current Linux operating systems provide the /dev/random and /dev/urandom devices. These devices provide random numbers based on various system states, that are collected and then hashed to produce a random number.
It is claimed that both /dev/random and /dev/urandom are secure enough to use in generating cryptographic keys, challenges, and other applications where secure random numbers are requisite. It should not be possible to predict the next random number from these sources.
The difference between the two is that /dev/random can run out of random bytes and the reader must wait for more to become available. This can occur if not enough activity is present on the system to allow generation of additional random data, and can sometimes take a long time for new data to become available.
/dev/random is high quality entropy, generated from measuring the inter-interrupt times etc. It blocks until enough bits of random data are available.
/dev/urandom is similar, but when the store of entropy is running low, it'll return a cryptographically strong hash of what there is. This isn't as secure, but it's enough for most applications.
To use these devices, simply open the device name and perform a read call on the device for the desired number of bytes.
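A minimal sketch of this, assuming /dev/urandom is the device wanted, is shown here. The function name read_random() is hypothetical. Since read() may return fewer bytes than requested, the call is repeated until the buffer is full.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Fill buf with len bytes from /dev/urandom.
 * Returns 0 on success, -1 on failure. */
int read_random(void *buf, size_t len)
{
    unsigned char *p = buf;
    int fd = open("/dev/urandom", O_RDONLY);

    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0 && errno == EINTR)
            continue;              /* interrupted by a signal, try again */
        if (n <= 0) {              /* error or unexpected end of data */
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    close(fd);
    return 0;
}
```

It is important that the return value is checked; a caller that ignores a failure would continue with an unfilled, predictable buffer.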
The OpenBSD kernel uses the mouse interrupt timing, network data interrupt latency, inter-keypress timing and disk IO information to fill an entropy pool. Random numbers are available for kernel routines and are exported via devices to userland programs. OpenBSD exposes the device /dev/random to userland programs requiring random numbers.
The following is taken from the OpenBSD manual page:
The various random devices produce random output data with different random qualities. Entropy data is collected from system activity (like disk and network device interrupts and such), and then run through various hash or message digest functions to generate the output.
| /dev/random | This device is reserved for future support of hardware random generators. |
| /dev/srandom | Strong random data. This device returns reliable random data. If sufficient entropy is not currently available (i.e., the entropy pool quality starts to run low), the driver pauses while more of such data is collected. The entropy pool data is converted into output data using MD5. |
| /dev/urandom | Same as above, but does not guarantee the data to be strong. The entropy pool data is converted into output data using MD5. When the entropy pool quality runs low, the driver will continue to output data. |
| /dev/prandom | Simple pseudo-random generator. |
| /dev/arandom | As required, entropy pool data re-seeds an ARC4 generator, which then generates high-quality pseudo-random output data. The arc4random(3) function in userland libraries seeds itself from this device, providing a second level of ARC4 hashed data. |
It is common for one program to require the execution of another. When this is required in a privileged program, such as one that is setuid, or in a network service, it is important to do this very carefully.
system() popen() |
Both of these functions execute the specified program by utilizing the UNIX system’s shell interpreter, /bin/sh. By using this, a wide range of potential security problems are unnecessarily introduced. Instead, use the execl or execv system calls.
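As a rough sketch, a replacement for system() could fork a child and call execv() with an explicit program path and argument vector, so that no shell interprets the arguments. The function name run_program() is hypothetical, and the exit code 127 for a failed execv() is a convention assumed here.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program at path with the given argument vector.
 * Returns the child's exit status, or -1 on failure. */
int run_program(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        execv(path, argv);         /* only returns if the exec failed */
        _exit(127);
    }

    /* parent: wait for the child, retrying if interrupted by a signal */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Each element of argv reaches the program unchanged, so characters such as `;` or `|` in user-supplied data are not treated as shell syntax.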
Core files are generated by UNIX programs when an exception occurs. These exceptions normally occur when the program memory or stack is corrupted, or invalid memory or misaligned structures are accessed. This situation occurs due to program flaws, introduced by the developer. When the exception occurs, the operating system writes the memory of the currently executing program to a disk file, usually called ‘core’ or ‘program.core’, where program is the name of the program which was executing. The normal use of the core is to analyze the state of the program when it crashed, and assist in determining where the problem occurred. Two security problems arise out of the creation of this core file.
A good example of where the first problem occurred in the past was in the FTP server which shipped with the Solaris operating system. A flaw existed whereby an attacker could connect to the FTP server, and issue the PASV command before any other command, causing the FTP server to crash. Upon crashing, the FTP server would dump its memory contents into a core file, located in the system root directory. By analyzing this file, a user with local system access could extract password file hashes, which could then be cracked to obtain usernames and passwords.
If your program will contain security critical information in memory, it is wise to disable the creation of core files upon an exception. You can use the setrlimit() function call to accomplish this.
| int setrlimit(int resource, const struct rlimit *rlp); |
By using this function, with a resource type of RLIMIT_CORE, you can set the size of the core file that is created (in bytes). If you specify a size of 0, this prevents a core file from being created.
#include <sys/resource.h>

int nocore(void)
{
    struct rlimit rlp;
    /* a limit of 0 bytes prevents a core file from being written */
    rlp.rlim_cur = 0;
    rlp.rlim_max = 0;
    return setrlimit(RLIMIT_CORE, &rlp);
}
|
This section discusses secure programming techniques that are specific to Windows operating systems. We focus primarily on Windows NT, however many of the topics also apply to other versions of Windows, such as Windows 95 and Windows 98. All topics also apply to Windows 2000.
Much of Windows NT security is based on Access Control Lists (ACL’s for short). ACL’s can be applied to many different types of objects, used in various parts of the system. Some examples of these objects include:
ACL’s define who the owner of an object is, and who can access the object. In many cases, when creating a new object, the ACL’s applied to the object are not safe. It is up to the implementer to ensure that the secure attributes are correctly set. There are generally 4 different types of security information that apply to an object. When setting or retrieving the security information on an object, it is important to be familiar with these.
| OWNER_SECURITY_INFORMATION |
This option specifies the owner of an object. In many cases, this will default to the user who initially created the object.
| GROUP_SECURITY_INFORMATION |
This option indicates the group permissions applied to the object. This defines which groups can access the object.
| DACL_SECURITY_INFORMATION |
This option sets the “discretionary” access control list on the specified object.
| SACL_SECURITY_INFORMATION |
This option sets the “system” access control list on the specified object.
Windows NT provides a function called SetSecurityInfo() that can be used to apply security descriptors to any of the objects shown above. This function is used as follows:
DWORD SetSecurityInfo(
    HANDLE handle,                     // handle to the object
    SE_OBJECT_TYPE ObjectType,         // type of object
    SECURITY_INFORMATION SecurityInfo, // type of security information to set
    PSID psidOwner,                    // pointer to the new owner SID
    PSID psidGroup,                    // pointer to the new primary group SID
    PACL pDacl,                        // pointer to the new DACL
    PACL pSacl                         // pointer to the new SACL
);
|
The SetSecurityInfo() function only sets the items (owner, group, DACL, or SACL) that are selected in the SecurityInfo variable, so you must specify there which of them you wish to set. You must also specify in the ObjectType variable, which type of object you have specified a handle to (registry key, file, etc).
In addition to this function, there are also a number of object specific functions that can be used to apply security descriptors to their respective types of objects. The registry for example, has its own set of functions that can be used to set ACLs on registry keys, essentially duplicating the purpose of the SetSecurityInfo() function.
One of the biggest problems present is that when most objects are created, they possess insecure ACLs, usually granting “Everyone” access to the object. It is important that the application developer is aware of this, and takes appropriate precautions to restrict access.
It is quite common to use the Windows registry to store vital configuration information, that is required for normal operation of your program. Sometimes vital security information is stored in the registry as well. More often than not, however, registry keys are not sufficiently protected to prevent an attacker from reading or modifying the data stored in the registry. Instead of using the SetSecurityInfo() function, there are also specific functions for managing ACLs on registry keys. There are 2 functions present within Windows to support the viewing and setting of registry key permissions.
LONG RegGetKeySecurity(
    HKEY hKey,                                 // open handle of key to set
    SECURITY_INFORMATION SecurityInformation,  // descriptor contents
    PSECURITY_DESCRIPTOR pSecurityDescriptor,  // address of descriptor for key
    LPDWORD lpcbSecurityDescriptor             // address of size of buffer and descriptor
);
|
LONG RegSetKeySecurity(
    HKEY hKey,                                 // open handle of key to set
    SECURITY_INFORMATION SecurityInformation,  // descriptor contents
    PSECURITY_DESCRIPTOR pSecurityDescriptor   // address of descriptor for key
);
|
It is very important to ensure that you adequately protect your program’s files with ACLs. There are two primary scenarios that are common in Windows NT application development.
1. Application is a system-wide application, available to all users.
Access to the application and its directories cannot be limited, since all users require the ability to access them.
Limit access to all application generated files – those that are created for each user. This will ensure that only the user who created the files will have access to them. Ensure that only the user has access to read and write to these files.
2. Application is installed for, and used by, a single user.
Limit access to the application, and its directories. No other users require access to the application. These ACLs can usually be applied during the installation process.
Limit access to all application generated files – those that are created for each user. This will ensure that only the user who created the files will have access to them. Ensure that only the user has access to read and write to these files.
There are 2 functions available in Windows NT to support the viewing and setting of file permissions.
BOOL GetFileSecurity(
    LPCTSTR lpFileName,                        // address of string for file name
    SECURITY_INFORMATION RequestedInformation, // requested information
    PSECURITY_DESCRIPTOR pSecurityDescriptor,  // address of security descriptor
    DWORD nLength,                             // size of security descriptor buffer
    LPDWORD lpnLengthNeeded                    // address of required size of buffer
);
|
BOOL SetFileSecurity(
    LPCTSTR lpFileName,                        // address of string for filename
    SECURITY_INFORMATION SecurityInformation,  // type of information to set
    PSECURITY_DESCRIPTOR pSecurityDescriptor   // address of security descriptor
);
|
The Windows NT CryptoAPI library provides functions that can be used to obtain random data. The CryptoAPI exports a function called CryptGenRandom() that will fill a specified buffer with the required number of bytes of random data.
BOOL WINAPI CryptGenRandom(
    HCRYPTPROV hProv,
    DWORD dwLen,
    BYTE *pbBuffer
);
|
The following is quoted from the Microsoft MSDN and is Copyright 2000, Microsoft Corporation.
“The data produced by this function is cryptographically random. It is far more random than the data generated by the typical random number generator such as the one shipped with your C compiler.
This function is often used to generate random initialization vectors and salt values.
All software random number generators work in fundamentally the same way. They start with a random number, known as the seed, and then use an algorithm to generate a pseudo-random sequence of bits based on it. The most difficult part of this process is to get a seed that is truly random. This is usually based on user input latency, or the jitter from one or more hardware components.
If an application has access to a good random source, it can fill the pbBuffer buffer with some random data before calling CryptGenRandom. The CSP then uses this data to further randomize its internal seed. It is acceptable to omit the step of initializing the pbBuffer buffer before calling CryptGenRandom.”
A network protocol defines the format of communication between networking devices. When designing a network protocol, there are many factors involved that have security implications. Take the following security concerns into account when considering designing a network protocol:
It is possible for an attacker to passively monitor a network, examining all data traveling on the network.
It is possible for an attacker to intercept traffic, modify data, and resend the modified traffic.
It is possible for an attacker to forge packets from another host.
It is possible for an attacker to send packets that do not conform to the protocol specifications.
The above items are all concerns that apply to a network protocol. By combining data confidentiality (encryption), authentication, and correct input validation, most of these concerns can be alleviated.
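To illustrate the input validation part, here is a sketch of a parser for a made-up message format: one type byte, a two byte big-endian length, then the payload. The format, the MAX_PAYLOAD limit and the name parse_message() are all assumptions for this example and are not taken from any real protocol. The point is that the length claimed by the sender is checked against both a fixed limit and the number of bytes actually received before anything is copied.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PAYLOAD 512            /* hypothetical limit for this example */

/* Parse one message from pkt. payload must have room for MAX_PAYLOAD bytes.
 * Returns 0 on success, -1 if the packet does not conform to the format. */
int parse_message(const uint8_t *pkt, size_t pktlen,
                  uint8_t *type, uint8_t *payload, size_t *paylen)
{
    size_t claimed;

    if (pktlen < 3)                /* the 3 byte header must be complete */
        return -1;

    claimed = ((size_t)pkt[1] << 8) | pkt[2];
    if (claimed > MAX_PAYLOAD)     /* never trust the sender's length field */
        return -1;
    if (claimed != pktlen - 3)     /* it must match the bytes we really have */
        return -1;

    *type = pkt[0];
    memcpy(payload, pkt + 3, claimed);
    *paylen = claimed;
    return 0;
}
```

A packet whose length field claims more data than was received is rejected outright, rather than being read past the end of the buffer.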
Designing and implementing a protocol are two completely different tasks. The implementation portion consists of following many of the guidelines and examples set forth in this document. The design portion, however, requires some forethought into the purpose of the proposed protocol.
Authentication within network protocols can occur at varying degrees, as listed below:
Authentication via account login
This authentication usually occurs initially upon session startup. Services such as POP, IMAP, NNTP, telnet and ftp all authenticate by requesting a username and password after the client connects to the service. In many cases, user and password information is sent in the clear across the network, exposing it to eavesdroppers.
Cryptographic authentication
This type of authentication usually utilizes public key cryptography to authenticate a user or client. A secret-key algorithm can also be used if both parties are configured to know the common key.
Protecting protocol data from prying eyes is very important if your protocol is designed to carry sensitive information. The SSL protocol, which is widely used to protect e-commerce transactions on the World Wide Web, is a good example of a protocol that does this.
Manually reviewing thousands of lines of source code can be a very tedious process. A number of tools exist which can assist in the discovery of security problems in UNIX and Windows source code. These are usually source code analysis tools which can be run on source code, and provide output designating potentially dangerous calls.
In addition to tools which assist in the discovery of security vulnerabilities, a number of tools also exist which aid in preventing the exploitation of these vulnerabilities. While these tools do not solve the problem of faulty and insecure source code, they can assist in the prevention of attacks if there is no alternative. These tools come in several varieties, ranging from those that provide compiler add-ons to prevent buffer overflows, to run-time libraries which catch dangerous library calls.
ITS4
“The Software Security Group at RST designs, analyzes, and tests security-critical software. We developed ITS4 to help automate source code review for security. ITS4 is a simple tool that statically scans C and C++ source code for potential security vulnerabilities. It is a command-line tool that works across Unix environments, and will also run on Windows if you have CygWin installed.”
StackGuard from Immunix
“StackGuard is a compiler approach for defending programs and systems against "stack smashing" attacks. Stack smashing attacks are the most common form of security vulnerability. Programs that have been compiled with StackGuard are largely immune to stack smashing attack. Protection requires no source code changes at all. When a vulnerability is exploited, StackGuard detects the attack in progress, raises an intrusion alert, and halts the victim program.”
Libsafe from Bell Labs
http://www.bell-labs.com/org/11356/libsafe.html
“The exploitation of buffer overflow vulnerabilities in process stacks constitutes a significant portion of security attacks in recent years. We present a new method to detect and handle such attacks. In contrast to previous work, our method does not require any modification to the operating system and works with existing binary programs. Our method does not require access to the source code of defective programs, nor does it require recompilation or off-line processing of binaries. Furthermore, it can be implemented on a system-wide basis transparently. Our solution is based on a middleware software layer that intercepts all function calls made to library functions that are known to be vulnerable. A substitute version of the corresponding function implements the original functionality, but in a manner that ensures that any buffer overflows are contained within the current stack frame, thus, preventing attackers from 'smashing' (overwriting) the return address and hijacking the control flow of a running program.”
The tools listed below can assist in the prevention of security vulnerabilities and protection of Windows based software. It should be noted that the professionalism of the tools below varies greatly, and caution should be taken when implementing and utilizing them.
BOWall
http://developer.nizhny.ru/bo/eng/BOWall
“BOWall implements protection against buffer overflow attacks for programs executed Windows NT 4.0 files. Two protection methods are provided:
Monitor vulnerable functions. Potentially vulnerable DLLs are updated, replacing functions such as strcpy, wstrcpy, strncpy, wstrncpy, strcat, wcscat, strncat, wstrncat, memcpy, memmove, sprintf, swprintf, scanf, wscanf, gets, getws, fgets, fgetws. The update consists of an addition of an integrity checking code that checks the local variable frame base pointer.
Prevent execution of dynamic library functions from data and stack memory. This method updates the exported vulnerable DLL functions by the addition of code that checks the address of a given function call. If the address of a call belongs to the data or stack then program execution is blocked.”
| [1] | Smashing the Stack for Fun and Profit, by Aleph One. http://www.securityfocus.com/data/library/P49-14.txt |
| [2] | Fuzz Revisited: A Re-examination of the Reliability of UNIX Utilities and Services. ftp://grilled.cs.wisc.edu/technical_papers/fuzz-revisited.ps.Z |
| [3] | The Unix Secure Programming FAQ: Tips on security design principles, programming methods, and testing, by Peter Galvin. http://www.sunworld.com/sunworldonline/swol-08-1998/swol-08-security.html |
| [4] | Communications with Theo de Raadt (deraadt@openbsd.org) |
| [5] | Thomas Ptacek, Bugtraq message regarding FreeBSD C runtime library overflow |
The following individuals have contributed to this document by taking part in the discussions on the SECPROG mailing list at http://www.securityfocus.com/: