Sunday, 11 November 2012

Install matplotlib on Ubuntu 12.10 for Python 3

This is a follow-up to my earlier post on installing matplotlib for Python 3, now that matplotlib 1.2 has been released and officially supports Python 3. It describes how to install all the tools necessary to set up a virtual environment, and how to create one that will host the installation of matplotlib and numpy.

Install distribute & pip

Installation instructions for the current version can be found at the Python Package Index.

Normally the following code will suffice:


wget http://python-distribute.org/distribute_setup.py
sudo python3 distribute_setup.py

This installs distribute, which in turn allows us to install pip.


# Careful to use the correct easy_install if you have it installed
# for python2 as well!
sudo easy_install pip

Now you have pip installed and can proceed to install virtualenv and virtualenvwrapper (a number of scripts that make using virtual environments a lot easier).

Install virtualenv & virtualenvwrapper

To install these two packages make sure to use the correct version of pip if you have it installed for Python 2 and Python 3:


sudo pip install virtualenv
sudo pip install virtualenvwrapper

Now you have to make sure virtualenvwrapper is loaded and is using Python 3. To do this just add the following two lines to your .bashrc (or .zshrc) file:


VIRTUALENVWRAPPER_PYTHON='/usr/bin/python3'
source /usr/local/bin/virtualenvwrapper.sh

Installing matplotlib

First create a new virtual environment for matplotlib:


mkvirtualenv <name_of_environment>

The virtual environment is normally activated automatically; you can tell because the shell prompt is then prefixed with the name of the environment: (<name_of_environment>)<rest_of_your_prompt>.

If it is not activated simply issue the following command:


workon <name_of_environment>
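To confirm from inside Python that the environment really is active, a small check like the following can help. It is only a sketch: the function name in_virtual_environment is my own, and it relies on sys.real_prefix (set by older virtualenv releases) or sys.base_prefix (available on newer interpreters).

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer interpreters expose sys.base_prefix; it differs from sys.prefix
    # only inside an environment.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtual_environment())
```

Run it with the python of the activated environment; the printed interpreter path should point into the environment's directory.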

Before installing matplotlib we need a few more dependencies: building numpy requires the Python header files and a C compiler.

To be able to display plots using the TkAgg back-end we will also need the Tk-libraries, as well as the tk-headers and libpng headers.

All these dependencies can be installed with apt-get:


sudo apt-get install python3-dev build-essential python3-tk tk-dev libpng12-dev

Then you can install the dependencies for matplotlib and matplotlib itself:


pip install numpy
# using <package>==<version> syntax makes sure to install exactly the
# specified version.
pip install matplotlib==1.2

This should allow you to run the following script and produce a simple linear graph:


import matplotlib.pyplot as plt
plt.plot([1,2,3,4])
plt.ylabel('some numbers')
plt.show()
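If the script fails with an import error, a quick check like this one shows which packages the active environment cannot see. It is a sketch that assumes Python 3.4 or newer (for importlib.util.find_spec); the helper name missing_modules is made up for this example.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # tkinter is the standard-library module behind the TkAgg back-end.
    missing = missing_modules(["numpy", "matplotlib", "tkinter"])
    print("missing:", ", ".join(missing) if missing else "nothing")
```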

Wednesday, 22 August 2012

Installing Matplotlib on Python3

Having recently tried to install matplotlib in a virtual environment for Python 3, I encountered some problems and decided to document the steps I had to take to make it work.

In retrospect it was actually really easy and can be achieved with the following few steps [1]:


sudo apt-get install python3-tk tk tk-dev
pip install numpy
pip install git+https://github.com/matplotlib/matplotlib.git

This will install the necessary packages for matplotlib:

  • numpy for fast mathematical computations.
  • tk for displaying the generated plots.

After these steps the following small program should display a very simple linear graph:


import matplotlib
matplotlib.use("TkAgg")
import matplotlib.pyplot as plt
plt.plot([1,2,3,4])
plt.ylabel('some numbers')
plt.show()

[1] Assuming you have already installed pip and are running Ubuntu or Linux Mint.

Monday, 9 April 2012

Populating the vim error-list from a file

When using vim to compile code it is possible to let it parse the messages from the build tool and create an error list that can be used to jump to the correct files (and most often the correct line).

When using a tool such as grep or the Windows PowerShell cmdlet Select-String it is possible to obtain output of the following form:

filename:linenumber: found-string
filename2:linenumber2: found-string

This can be used in vim to create an error list as well, which makes it possible to jump between these matches incredibly fast. The only things that need configuration are the grep command, the makeprg command and the errorformat [1].

Adjusting grep

To obtain all the necessary information from grep it is necessary to issue the command with the following parameters:

grep -Hn pattern file
-H

Will make grep print the file the match was found in.

-n

Will print the line-number of the match.
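Where grep is not available, the same kind of output can be produced with a few lines of Python. This is only a sketch: grep_lines is a hypothetical helper, not an existing tool, and it writes the filename:linenumber: found-string form shown above.

```python
import re


def grep_lines(pattern, paths):
    """Yield 'filename:linenumber: text' for every line matching pattern."""
    regex = re.compile(pattern)
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield "{}:{}: {}".format(path, number, line.rstrip("\n"))
```

Writing the yielded lines to a file gives a buffer that can be turned into an error list in the same way as real grep output.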

Adjusting makeprg

Instead of compiling a program, the makeprg command should simply print the contents of the file behind the current buffer, which is then used to create the error list. This is easily achieved with the cat command.

Therefore the following command must be used in vim:

set makeprg=cat\ %

When running the command :make in vim now, it will simply print the file behind the currently active buffer and try to match the pre-defined errorformats against it [2].

Adjusting the errorformat

Running :make after adjusting the makeprg command might already work; if not, a new errorformat can be added with the following command:

set errorformat+=%f:%l:%m
%f

Will identify the token that represents the file-name.

%l

Will identify the token that represents the line-number.

%m

Will identify the token that represents the error-message [3].

Due to the += the new errorformat will simply be appended to the already defined ones and not overwrite former lists.
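To make the three tokens more concrete, here is a rough Python equivalent of what %f:%l:%m picks out of one line. It is a sketch and not how vim implements it: the regular expression and the name parse_entry are assumptions of mine, and it does not handle Windows paths with a drive letter.

```python
import re

# Rough regular-expression equivalent of the errorformat %f:%l:%m.
# Assumption: the file name itself contains no colon.
ENTRY = re.compile(r"^(?P<file>[^:]+):(?P<line>\d+):(?P<message>.*)$")


def parse_entry(text):
    """Split one 'filename:linenumber:message' line, or return None."""
    match = ENTRY.match(text)
    if match is None:
        return None
    return match.group("file"), int(match.group("line")), match.group("message")
```

Lines that do not fit the pattern yield None, much like vim falls through to the next errorformat when one does not match.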


  1. It is also possible to use the vimgrep command, but especially on Windows it seemed to be very slow; using the system's own search commands together with this method proved much faster.

  2. To see the pre-defined formats the command set errorformat? can be used.

  3. To see more possible tokens that can be used check :h errorformat.

Monday, 12 March 2012

Closing file handles on Windows

While using Visual Studio 2010 I ran into a really annoying issue that prevented the IDE from copying files to the output directory, because the files were locked:

Unable to copy file '...bin\Debug\[ProjectName].config'. Access to the path '...bin\Debug\[ProjectName].config' is denied.

This behaviour seems to be a not-so-rare bug in Visual Studio, as can be seen from this Stack Overflow post.

Having tried many of the suggested solutions and still getting the error occasionally, I decided that manually finding the process that is using the directory and killing it via the Sysinternals Process Explorer was too much of a hassle.

So I wrote a small one-liner that uses the handle.exe program from the Sysinternals Suite to automatically find and close all handles on the project directory.

To run this "script" you need the Unix utilities awk and sed on your path (you can get them either via Cygwin or from GnuWin32). Moreover, it writes the commands that actually close the handles to a PowerShell script, which is then executed.

Warning: This script requires administrative privileges to run (otherwise handle.exe will not work), and forcefully closing handles can cause system instability. If you are unsure about closing the handles just delete the | .\kill_handles.ps1-part in the first listing - this will store all commands that will really close the handles in a file called .\kill_handles.ps1 but will not immediately run that file.

Because of some peculiarities of awk on Windows three files are needed, as directly using the awk-parts did not work on Windows.

This code is the main script that just runs all commands in order. It should be safest to replace [path_to_project_dir] with the full path - otherwise handle.exe might pick up directories whose handles you do not want to close. The same is true for [project_name]: it is used to extract just the needed lines from the handle.exe output, and the folder name of the solution should normally suffice. To check whether the output looks good, delete everything behind that command and inspect what is left.

handle.exe [path_to_project_dir]\bin | awk '/[project_name]/' | awk -f extract_pid_hex.awk | sed 's/://g' | awk -f write_handle_commands.awk > kill_handles.ps1 | .\kill_handles.ps1

This snippet should be put into an awk-file called extract_pid_hex.awk - it extracts the process- and handle-ids from the handle.exe output.

{print $3 " " $6}

The following snippet is an awk-file that writes the commands to close handles. This file should be named write_handle_commands.awk.

{print "handle.exe -p " $1 " -c " $2 " -y"}
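The filtering and rewriting steps of the pipeline can also be collected in one place. The following Python sketch mirrors the awk and sed stages above; build_close_commands is a name I made up, and it only builds the command strings without running anything.

```python
def build_close_commands(handle_output, project_name):
    """Turn handle.exe output into 'handle.exe -p PID -c HANDLE -y' commands."""
    commands = []
    for line in handle_output.splitlines():
        # Plain substring test; the awk '/[project_name]/' stage uses a regex.
        if project_name not in line:
            continue
        fields = line.split()
        if len(fields) < 6:
            continue
        # Fields 3 and 6 (1-based, as in awk), with colons removed like sed.
        pid = fields[2].replace(":", "")
        handle_id = fields[5].replace(":", "")
        commands.append("handle.exe -p {} -c {} -y".format(pid, handle_id))
    return commands
```

Printing the returned list before executing anything gives the same safety check as leaving off the last part of the one-liner.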

After running these commands I at least could always delete the bin directory of the project and rebuild it again without any errors.

If anyone knows a way of doing this using only PowerShell I would be glad to know how - it might be a good way to finally get to know it.

Saturday, 3 March 2012

C-strings and high-level languages

When using languages like Python or Java it seems easy to forget that most of these languages still use the C-level APIs of the operating system.

While learning C with the great book Learn C the Hard Way by Zed Shaw, and attending a really great seminar on application security, I was able to confirm a sneaky bug in Python 3 that one should be aware of:

Strings in Python that contain a null-byte [1] can lead to a security vulnerability when dealing with files:

Python 3

Say you have two files - one called file and one called file.txt.

Contents of the files:


% cat file
-rwxrwxrwx 1 root root   0 2012-03-02 19:17 file
-rwxrwxrwx 1 root root  35 2012-03-02 19:15 file.txt

% cat file.txt 
not important

Assume that the data in file is very important and should be kept secret, whereas file.txt is safe to be read by the user via a program. If that check is implemented by looking at the file extension, the following code gets around it:


In [3]: file_name = "file\0.txt"

In [4]: print(file_name)
file.txt

In [5]: file_name
Out[5]: 'file\x00.txt'

In [6]: if file_name.endswith(".txt"):
    for line in open(file_name):
        print(line)
   ...:         
total 1

-rwxrwxrwx 1 root root   0 2012-03-02 19:17 file

-rwxrwxrwx 1 root root  35 2012-03-02 19:15 file.txt

This happens because Python 3 opens the file using the C APIs of the operating system (Linux Mint in my case) and thus will open the file file instead of file.txt, because the null-byte terminates the string prematurely.

Java

Just as another example - the following Java code is affected as well, when using the Open-JDK on Linux[2]:


import java.io.FileReader;
import java.io.Reader;
import java.io.IOException;

public class ZeroPoison {
    public static void main (String [] args)
    {
        String fileName = "file\0.txt";
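        // try-with-resources closes the reader even if reading fails.
        try (Reader fileReader = new FileReader(fileName)) {
            for (int c; (c = fileReader.read()) != -1; ) {
                System.out.print((char) c);
            }
        } catch (IOException e) {
            // Report the failure instead of silently swallowing it.
            e.printStackTrace();
        }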
    }
}

Prevention

As with many problems, the solution is to validate your input - in Python 3 you can simply choose to replace every occurrence of the null-byte like this:

safe_string = unsafe_string.replace("\0", "")

This will replace the null-byte even if it was added in hexadecimal (\x00) or octal (\000) notation.
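Instead of silently stripping the null-byte, the check itself can refuse such names. This is a sketch of that alternative, not the approach shown above; the function name is_safe_text_file_name is hypothetical.

```python
def is_safe_text_file_name(file_name):
    """Accept only names that end in .txt and contain no null-byte."""
    if "\0" in file_name:
        return False
    return file_name.endswith(".txt")


# All three notations denote the same single character.
assert "\0" == "\x00" == "\000"
```

Rejecting has the advantage that a manipulated name is noticed rather than quietly turned into a different, valid-looking one.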

Apart from that it might be possible to check if allowing the null-byte is desired functionality in Python 3 or not - if I find the time I might look into it.

Python 2.7 and PyPy

However, neither Python 2.7 nor PyPy will open a file with a null-byte in the file name; both quit with the following error message:


Traceback (most recent call last):
  File "zero_poison.py", line 4, in <module>
    for line in open(string):
TypeError: file() argument 1 must be encoded string without NULL bytes, not str


[1]: In C the null-byte is used to terminate a string, thus cutting it off at the position of the null-byte. Languages that followed C abstracted away the need to insert a null-byte (or use a different mechanism to determine how long a string is / where it ends - the Wikipedia article on strings has a good overview here).

[2]: This Java code is probably pretty awful, because I have not written any Java in quite a while - but it should get the point across.

Wednesday, 29 February 2012

PyQt Signals and Slots

To capture events generated by GUI elements in PyQt the signals and slots mechanism from Qt is used:

Signals: signals are emitted when a user interacts with a Qt widget (e.g. the user clicks a button).

Slots: slots can be connected to signals and act upon receiving the signal.

It is possible to connect multiple slots to one signal.
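The idea can be illustrated without Qt at all. The following toy class is not how Qt implements signals; it is a minimal pure-Python sketch (the class name Signal and its methods are chosen to resemble the PyQt API) that shows one signal calling several connected slots.

```python
class Signal:
    """Toy signal that calls every connected slot when emitted."""

    def __init__(self):
        self._slots = []

    def connect(self, slot):
        """Remember a callable that should run on every emit."""
        self._slots.append(slot)

    def emit(self, *args):
        """Call all connected slots in the order they were connected."""
        for slot in self._slots:
            slot(*args)


if __name__ == "__main__":
    clicked = Signal()
    clicked.connect(lambda: print("first slot"))
    clicked.connect(lambda: print("second slot"))
    clicked.emit()
```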

With PyQt there are three different ways one can connect a slot to a signal:

The classic way:

To connect a slot to a signal in the classical Qt way the QObject's connect method is used:

object.connect(signal_widget, signal, slot_widget, slot)


import sys
from PyQt4 import QtCore, QtGui

def slot_function():
    """This function will be triggered when the button is clicked."""
    print("The function was triggered.")

if __name__ == '__main__':
    application = QtGui.QApplication(sys.argv)
    window = QtGui.QWidget()
    window.resize(QtCore.QSize(20, 20))
    # This button will emit the signal when it is clicked.
    signal_button = QtGui.QPushButton("Click here", window)

    # Connect the button-click signal to the slot_function we defined earlier.
    QtCore.QObject.connect(signal_button, QtCore.SIGNAL("clicked()"), slot_function)
    window.show()
    application.exec_()

The signal that is emitted is specified via the QtCore.SIGNAL() function. In this example no slot object is specified, because our self-defined function is to be executed when the signal is received. By passing a slot object and a slot instead, it is possible to declare which slot the slot object will execute upon receiving the signal.

There are, however, two more ways to connect signals and slots, starting with the new-style signals and slots introduced by PyQt.

New style signals and slots:

Connecting a signal to a slot with the new style system is easy and only uses the object that will emit the signal:

object.signal.connect(function)

In the example above the QtCore.QObject.connect(...)-method would be replaced with:

signal_button.clicked.connect(slot_function)

Autoconnecting slots and signals:

Moreover it is also possible to autoconnect slots using Qt. This is especially helpful when using the Qt Designer, but it can easily be activated by hand as well.

However, it only works when the object that emits the signal is a child (direct or indirect) of the object that defines the slots.

To autoconnect the objects the slots need to adhere to the following form:

on_<object name>_<signal name>(<signal parameters>)
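The naming rule is easy to try out in plain Python. The sketch below is not Qt code and only imitates the lookup by name; find_autoconnect_slots and the Window class are hypothetical stand-ins.

```python
def find_autoconnect_slots(owner, signals_by_object):
    """Map (object name, signal name) to the matching on_..._... method."""
    found = {}
    for object_name, signal_names in signals_by_object.items():
        for signal_name in signal_names:
            slot = getattr(owner, "on_{}_{}".format(object_name, signal_name), None)
            if callable(slot):
                found[(object_name, signal_name)] = slot
    return found


class Window:
    """Hypothetical stand-in for a widget class; not a Qt class."""

    def on_signal_button_clicked(self):
        return "triggered"
```

Calling find_autoconnect_slots(Window(), {"signal_button": ["clicked", "pressed"]}) finds a slot only for clicked, because no method named on_signal_button_pressed exists.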

To adjust the example to this signals/slots connection the MainWindow containing the button and the slot-function are moved into a class:


import sys
from PyQt4 import QtCore, QtGui

class MainWindow(QtGui.QMainWindow):
    """A new window for the application."""
    def __init__(self):
        super(MainWindow, self).__init__()
        self.resize(QtCore.QSize(20, 20))
        # This button will emit the signal when it is clicked.
        self.signal_button = QtGui.QPushButton("Click here", self)
        # This name is used as the <object name> part of the slot function's name.
        self.signal_button.setObjectName("signal_button")
        # This will wire up the signals and slots depending on names.
        QtCore.QMetaObject.connectSlotsByName(self)

    @QtCore.pyqtSlot()
    def on_signal_button_clicked(self):
        """This function will be triggered when the button is clicked."""
        print("The function was triggered.")

def main():
    """Create the application, show the main window and run the event loop."""
    application = QtGui.QApplication(sys.argv)
    window = MainWindow()
    window.show()
    application.exec_()

if __name__ == '__main__':
    main()

This is all there is to autoconnecting slots and signals. When using the Qt Designer to create GUIs, all the setup (setting an object name and calling connectSlotsByName) is done inside the code generated by pyuic4, so you don't have to do this yourself.

Tuesday, 28 February 2012

Find files by extension on Windows and Linux

Using Windows at work I am kind of forced to find new ways to do familiar things: today I needed a way to find only files with certain extensions.

Doing this on a Unix-based system is pretty easy when using the command line and the find command:

find . -name "*.ext1" -o -name "*.ext2"

or:

find . -regex ".*\.\(ext1\|ext2\)"

However, Windows does not contain the standard Linux tools, so I had the chance to try the same in PowerShell and succeeded with the following command:

Get-ChildItem -Include @("*.ext1", "*.ext2") -Rec

Get-ChildItem takes a number of strings it matches against, which can be passed with the @(string, ...) construct, and the -Rec flag makes Get-ChildItem also search in sub-directories.

For anyone not wanting to use the shell: to find files with a certain extension in the Windows Explorer simply use the search toolbar and the following construct:

typ:=.ext1 OR typ:=.ext2

Make sure to capitalize the OR, or it will not work.

Moreover, the = sign before the dot makes sure that the extension is matched exactly, whereas using a * would also match partially - e.g. *.h would also match all files that end in .html, whereas =.h only matches .h files.
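For a variant that behaves the same on both systems, a short Python script can do the search as well. This is a sketch using only the standard library; find_by_extension is a made-up name, and it compares extensions exactly (ignoring case), similar to the = form above.

```python
import os


def find_by_extension(root, extensions):
    """Yield paths below root whose extension is one of extensions (case-insensitive)."""
    wanted = {ext.lower() for ext in extensions}
    for directory, _subdirs, file_names in os.walk(root):
        for file_name in file_names:
            # splitext("page.html") gives ".html", so ".h" does not match it.
            if os.path.splitext(file_name)[1].lower() in wanted:
                yield os.path.join(directory, file_name)
```

For example, list(find_by_extension(".", [".ext1", ".ext2"])) collects all matching files below the current directory.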