Monday, 12 March 2012

Closing file handles on Windows

While using Visual Studio 2010 I ran into a really annoying issue that prevented the IDE from copying files to the output directory, because the files were locked:

Unable to copy file '...bin\Debug\[ProjectName].config'. Access to the path '...bin\Debug\[ProjectName].config' is denied.

This behaviour seems to be a fairly common bug in Visual Studio, as can be seen from this Stack Overflow post.

Having tried many of the suggested solutions and still getting the error occasionally, I decided that manually finding out which process was using the directory and killing it via the Sysinternals Process Explorer was too much of a hassle.

So I wrote a small one-liner that uses the handle.exe program from the Sysinternals Suite to automatically find and close all handles on the project directory.

To run this "script" you need the Unix utilities awk and sed on your path (you can get them either via Cygwin or from GnuWin32). Moreover, the one-liner writes the commands that actually close the handles to a PowerShell script, which is then executed.

Warning: This script requires administrative privileges to run (otherwise handle.exe will not work), and forcefully closing handles can cause system instability. If you are unsure about closing the handles, just delete the | .\kill_handles.ps1 part in the first listing - the commands that would close the handles are then stored in a file called .\kill_handles.ps1, but that file is not run immediately.

Because of some peculiarities of awk on Windows, three files are needed: writing the awk programs inline in the one-liner did not work.

This code is the main script that just runs all commands in order. It is safest to replace [path_to_project_dir] with the full path - otherwise handle.exe might pick up directories whose handles you do not want to close. The same is true for [project_name]: this pattern is used to extract only the needed lines from the handle.exe output, and the folder name of the solution should normally suffice. To check that the output looks good, delete everything after that awk command and inspect what is printed.

handle.exe [path_to_project_dir]\bin | awk '/[project_name]/' | awk -f extract_pid_hex.awk | sed 's/://g' | awk -f write_handle_commands.awk > kill_handles.ps1 | .\kill_handles.ps1

This snippet should be put into an awk-file called extract_pid_hex.awk - it extracts the process- and handle-ids from the handle.exe output.

{print $3 " " $6}

The following snippet is an awk-file that writes the commands to close handles. This file should be named write_handle_commands.awk.

{print "handle.exe -p " $1 " -c " $2 " -y"}

After running these commands I could, at least, always delete the bin directory of the project and rebuild it without any errors.
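For readers without awk and sed at hand, the filtering and rewriting steps of the pipeline above can be sketched in Python. This is only an illustration, not a replacement I have run against a live system: the function names, the sample output and the project name MyProject are made up, and the sample assumes the column layout implied by the awk script (process id in the third field, hexadecimal handle id in the sixth). Note that the project name is matched as a plain substring here, whereas awk treats it as a regular expression.

```python
def extract_pid_and_handle(line):
    """Mimic extract_pid_hex.awk plus the sed step: fields 3 and 6, colons removed."""
    fields = line.split()
    if len(fields) < 6:
        return None  # not a handle line (banner, blank line, ...)
    return fields[2], fields[5].replace(":", "")


def build_close_commands(handle_output, project_name):
    """Mimic write_handle_commands.awk for every line that mentions the project."""
    commands = []
    for line in handle_output.splitlines():
        if project_name not in line:
            continue  # plain substring match instead of awk's regex match
        parsed = extract_pid_and_handle(line)
        if parsed is None:
            continue
        pid, handle_id = parsed
        commands.append("handle.exe -p {} -c {} -y".format(pid, handle_id))
    return commands


# Hypothetical handle.exe output, only for illustration.
SAMPLE_OUTPUT = r"""devenv.exe         pid: 4242   type: File           1A4: C:\work\MyProject\bin\Debug\MyProject.config
explorer.exe       pid: 1337   type: File           2B8: C:\other\bin
"""

if __name__ == "__main__":
    for command in build_close_commands(SAMPLE_OUTPUT, "MyProject"):
        print(command)
```

The sketch only prints the commands, which corresponds to leaving out the | .\kill_handles.ps1 part: you can review them before anything is closed.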

If anyone knows a way of doing this using only PowerShell I would be glad to know how - it might be a good way to finally get to know it.

Saturday, 3 March 2012

C-strings and high-level languages

When using languages like Python or Java it seems easy to forget that most of these languages still use the C-level APIs of the operating system.

While learning C with the great book Learn C the Hard Way by Zed Shaw, and attending a really great seminar on application security at the same time, I was able to confirm a sneaky bug in Python 3 that one should be aware of:

Strings in Python that contain a null-byte[1] can lead to a security vulnerability when dealing with files:

Python 3

Say you have two files - one called file and one called file.txt.

Contents of the files:


% cat file
-rwxrwxrwx 1 root root   0 2012-03-02 19:17 file
-rwxrwxrwx 1 root root  35 2012-03-02 19:15 file.txt

% cat file.txt 
not important

Assume that the data in file is very important and should be kept secret, whereas file.txt is safe to be read by the user via a program. If the program implements this check by looking only at the file extension, the check can be bypassed, as the following session shows:


In [3]: file_name = "file\0.txt"

In [4]: print(file_name)
file.txt

In [5]: file_name
Out[5]: 'file\x00.txt'

In [6]: if file_name.endswith(".txt"):
    for line in open(file_name):
        print(line)
   ...:         
total 1

-rwxrwxrwx 1 root root   0 2012-03-02 19:17 file

-rwxrwxrwx 1 root root  35 2012-03-02 19:15 file.txt

This happens because Python 3 opens the file using the C APIs of the operating system (Linux Mint in my case). The null-byte terminates the string prematurely at that level, so the file named file is opened instead of file.txt.
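The mismatch between the two views of the string can be made visible with the standard ctypes module. The following is a minimal sketch, not the code path open() actually takes: the helper name as_c_string is made up, and it only shows what a C function relying on null termination would see when handed the same bytes.

```python
import ctypes


def as_c_string(name):
    """Return the bytes a null-terminated C string view of `name` would contain."""
    buffer = ctypes.create_string_buffer(name.encode("utf-8"))
    # .value stops at the first null byte, just as strlen() would in C;
    # .raw would still contain the complete buffer.
    return buffer.value


if __name__ == "__main__":
    file_name = "file\0.txt"
    # The Python-level check sees the whole string ...
    print(file_name.endswith(".txt"))
    # ... while the C-level view ends at the null byte.
    print(as_c_string(file_name))
```

The extension check and the C-level name therefore disagree about which file is meant, which is exactly the gap the example above exploits.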

Java

Just as another example - the following Java code is affected as well when using OpenJDK on Linux[2]:


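import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

public class ZeroPoison {
    public static void main(String[] args) {
        // The name contains an embedded null byte before the extension.
        String fileName = "file\0.txt";
        // try-with-resources closes the reader even if reading fails.
        try (Reader fileReader = new FileReader(fileName)) {
            for (int c; (c = fileReader.read()) != -1; ) {
                System.out.print((char) c);
            }
        } catch (IOException e) {
            // Report the failure instead of silently swallowing it.
            System.err.println("Could not read file: " + e.getMessage());
        }
    }
}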

Prevention

As with many problems, the solution is to validate your input - in Python 3 you can simply remove every occurrence of the null-byte like this:

safe_string = unsafe_string.replace("\0", "")

This will remove the null-byte even if it was written in hexadecimal (\x00) or octal (\000) notation, because all three escapes denote the same character.
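As a small sketch of how such validation might look in practice, the snippet below wraps the replace call in a helper and adds a stricter variant that rejects suspicious names instead of repairing them. Both function names are made up for illustration, and whether to strip or to reject is a design choice that depends on the application; rejecting has the advantage that a manipulated name is never silently turned into a different, valid one.

```python
def strip_null_bytes(unsafe_string):
    """Remove every null byte, regardless of the escape used to write it."""
    return unsafe_string.replace("\0", "")


def is_allowed_text_file(file_name):
    """Stricter variant: refuse names with a null byte, then check the extension."""
    if "\0" in file_name:
        return False
    return file_name.endswith(".txt")


if __name__ == "__main__":
    # "\0", "\x00" and "\000" are three spellings of the same character.
    print("\0" == "\x00" == "\000")
    print(repr(strip_null_bytes("file\x00.txt")))
    print(is_allowed_text_file("file\0.txt"))
    print(is_allowed_text_file("file.txt"))
```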

Apart from that, it would be worth checking whether allowing the null-byte is intended behaviour in Python 3 or not - if I find the time I might look into it.

Python 2.7 and PyPy

Neither Python 2.7 nor PyPy, however, will open a file with a null-byte in the file name; both quit with the following error message:


Traceback (most recent call last):
  File "zero_poison.py", line 4, in <module>
    for line in open(string):
TypeError: file() argument 1 must be encoded string without NULL bytes, not str
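To find out how a particular interpreter treats such a name, a small probe like the following can be run under it. This is a sketch with a made-up function name; it assumes that an interpreter which refuses the name does so with a TypeError (as in the traceback above) or a ValueError, and that any other failure, such as a missing file, surfaces as an OSError.

```python
def probe_open(file_name):
    """Report whether this interpreter opens, rejects, or fails on `file_name`."""
    try:
        handle = open(file_name)
    except (TypeError, ValueError) as error:
        # The interpreter refused the name before reaching the operating system.
        return "rejected: " + type(error).__name__
    except OSError as error:
        # The name was passed on, but the operating system could not open it.
        return "os error: " + type(error).__name__
    handle.close()
    return "opened"


if __name__ == "__main__":
    print(probe_open("file\0.txt"))
```

If the probe prints "opened", or an OS error about a file whose name ends at the null-byte, the interpreter passes the truncated name through and the input needs to be validated by hand.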


[1]: In C the null-byte is used to terminate a string, thus cutting it off at the position of the null-byte. Languages that followed C abstracted away the need to insert a null-byte (or use a different mechanism to determine how long a string is or where it ends - the Wikipedia article on strings has a good overview here).

[2]: This Java code is probably pretty awful, because I have not written any Java in quite a while - but it should get the point across.