Open Source Software for Science: Python

Matlab claims to be "The Language of Technical Computing". It really is a nice collection of software. I learned how to program using Matlab, in order to settle a dispute relating to my research. As it turns out, I no longer use Matlab. There are two reasons: 1) It's expensive, and 2) Python is much more pleasant to work with. You can find some websites advocating Python that give a far more compelling argument than I can in what little spare time I had today to update this page.

I think it is important to use and contribute to open source software, especially in the pursuit of science funded by the public. This speaks more broadly to the ridiculous state of scientific publishing, where we actually pay somebody to publish our results, and then pay again to get access to those publications. I think it is important to maintain some degree of independence, and open source software can play an important part in that goal. Two books that played an important role in shaping my opinions on open source are Cryptonomicon by Neal Stephenson and The Cathedral and the Bazaar by Eric S. Raymond.

Software packages I use in place of Matlab

Python: The basic programming language. It is easy to learn and use, much more compact and readable than C/C++, is powerful, comes with many useful libraries, and works with several full-featured Graphical User Interface (GUI) toolkits, like Qt, wxWidgets, GTK and Tk. Learning Python by Mark Lutz and David Ascher is an excellent introduction and reference. If you can handle Matlab, you can handle Python.
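
To give a feel for how compact Python can be, here is a minimal sketch that averages each column of a small whitespace-separated table using nothing but the core language. The function name column_means and the sample data are made up for this illustration.

```python
def column_means(text):
    """Return the mean of each column in a whitespace-separated table."""
    # Parse every non-empty line into a list of floats (one list per row).
    rows = [[float(value) for value in line.split()]
            for line in text.strip().splitlines()]
    # zip(*rows) transposes the rows into columns.
    return [sum(column) / len(column) for column in zip(*rows)]


# Hypothetical two-column data set, e.g. time and signal.
data = """
1.0 10.0
2.0 20.0
3.0 30.0
"""

means = column_means(data)
print(means)
```

The equivalent C program would need explicit memory management and parsing code; here the whole task fits in two statements.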

IPython: An interactive Python environment, similar to the Matlab command line. Note to Windows users: with older versions of IPython you may need to turn off your system beep, due to some unfortunate behavior of the readline library. Newer versions of IPython, which use pyreadline, do not have this problem.

NumPy: The NumPy project provides efficient arrays and matrices for Python. NumPy can use the very efficient ATLAS/BLAS/LAPACK linear algebra routines, and has a syntax similar enough to Matlab's that the transition is not difficult. NumPy-1.0 was released on October 25, 2006.
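
As a rough illustration of how close the syntax is to Matlab, the sketch below solves a small linear system with NumPy. The matrix and right-hand side are made-up values, and the @ operator assumes a reasonably recent Python (3.5 or later).

```python
import numpy as np

# Made-up 2x2 system: 3x + y = 9, x + 2y = 8.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# Roughly what Matlab writes as x = A \ b; this calls into LAPACK.
x = np.linalg.solve(A, b)

# A @ x is the matrix product (Matlab's A * x).
residual = A @ x - b

# Note that A * A is element-wise in NumPy (Matlab's A .* A).
squared_elements = A * A

print(x)
```

The main habit to unlearn is that * on arrays is element-wise; matrix products use @ or np.dot.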

SciPy: Here is a nugget from the SciPy website: "SciPy is an Open Source library of scientific tools for Python. It gathers a variety of high level science and engineering modules together as a single package. SciPy provides modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, genetic algorithms, ODE solvers, special functions, and more." Many of the modules are well established scientific computing libraries, written in FORTRAN or C. SciPy requires NumPy, and should not feel too foreign to an experienced Matlab user.
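
Two of the modules named above, integration and optimization, can be sketched in a few lines. The integrand and the quadratic being minimized are arbitrary choices for this example, picked because their answers are known exactly.

```python
import numpy as np
from scipy import integrate, optimize

# Numerically integrate sin(x) from 0 to pi; the exact answer is 2.
# quad returns the estimate and an estimate of the absolute error.
area, abs_error = integrate.quad(np.sin, 0.0, np.pi)


def parabola(x):
    """Arbitrary test function with its minimum at x = 1.5."""
    return (x - 1.5) ** 2 + 0.5


# Minimize a function of one variable (Brent's method by default).
result = optimize.minimize_scalar(parabola)

print(area, result.x)
```

Both routines wrap long-established numerical libraries, so the Python layer is mostly a convenient calling convention.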

Matplotlib: I saved my favorite for last, and I personally contribute to this project. Matplotlib, along with IPython and NumPy, provides an environment for making plots, with a syntax very similar to Matlab. In addition, Matplotlib is an object-oriented programming library that works well with many GUI toolkits, including Qt (versions 3 and 4), wxWidgets, GTK and Tk. I once wrote a program in one day to monitor the output of a CCD X-ray detector and plot the integrated intensity of various regions of interest as a function of time. Matplotlib provides lots of examples, and the publication quality output is, in my opinion, far superior to Matlab.
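
Here is a minimal sketch of the object-oriented interface, plotting a made-up intensity trace against time. The data, labels and the "Agg" off-screen backend are assumptions chosen so the example runs without a GUI; it is not the detector program described above.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen so no GUI toolkit is required
import matplotlib.pyplot as plt
import numpy as np

# Made-up signal: a decaying oscillation standing in for real data.
t = np.linspace(0.0, 10.0, 200)
intensity = np.exp(-t / 4.0) * np.cos(2.0 * np.pi * t)

# Object-oriented interface: work with explicit Figure and Axes objects.
fig, ax = plt.subplots()
(line,) = ax.plot(t, intensity, label="region 1")
ax.set_xlabel("time (s)")
ax.set_ylabel("integrated intensity (arb. units)")
ax.legend()

# Save to an in-memory buffer; a filename such as "plot.png" works too.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150)
```

The same plot can be made with Matlab-style calls such as plt.plot and plt.xlabel; the explicit Figure and Axes objects become useful when embedding a plot in a GUI application.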

Together, Python, IPython, NumPy, SciPy and Matplotlib form an excellent set of tools for scientific research.

You should familiarize yourself with the license for each of these packages. I cannot speak for any individual project, but the community around these tools generally strives to produce free (as in speech and beer) software that is unencumbered by the restrictions of the GPL. The GPL requires that if you distribute a project that incorporates code covered by the GPL, then your project must also be released under the GPL. The BSD and Python licenses do not include such restrictions. As a result, you can use Python in your project, but you do not have to distribute that project under the conditions of the Python license. Many groups use Python-based tools to develop their commercial products.

contact me: darren dot dale <nospam> at cornell dot edu
revised: Thursday November 16 2006