After reading this post about equations every computer scientist should know, I got to thinking. I love clear, plain-English explanations of powerful mathematical concepts. Not the what, but the why. I’d like to share my understanding of the binomial formula:

This is usually stated “N choose K”, and provides a way to calculate the number of ways in which you may select (or draw) K items from a collection of N items in total, where you don’t care about the order in which you draw them.

Now, if we have N elements, there are N ways to choose a first element; and, if you don’t put that first element back, N-1 ways to choose a second; and so on. That means the total number of ways to draw N elements (counting the order in which we draw them) is N x (N-1) x … x 1. We call this N!, “N factorial”.

But, we don’t care about drawing N elements. We actually care about just K elements. We don’t care about the remaining (N-K) elements. For example, if we are trying to figure out 5 choose 2, we draw from five choices, and then from four, but that’s it. No need to worry about the other (5-2) = 3 draws. So, instead of calculating 5x4x3x2x1, we calculate just 5×4 by “removing” 3x2x1, by dividing. That is, we calculate (5x4x3x2x1)/(3x2x1), or, 5!/(5-2)!. That’s where (N-K)! in the denominator comes from.

Finally, when we draw K elements, we don’t care the order in which we draw them. Following our example, with 5 choose 2, if we calculate 5×4 = 20 choices, we double-count the cases where we choose the same two items in a different order. That means we have to “remove” (divide by) the ways in which we choose K elements in different orders. This is K! ways.

So, there’s the prize: N choose K = N! / [K!x(N-K)!]

I’ve got a spanking new install of Kubuntu 11.10, and I need to get it set up for Python data hacking.  Sure, I could spring for an Enthought Python Distrubution, but where would be the masochism in that?

Inspired by Zed, let’s do this the hard way.

The linux distro comes with Python 2.7.2. Perfect! Or, use Pythonbrew to set up a local Python build that you want to use. I presume you know how to get to the command line, as well as how to edit text files using emacs, vim, pico, whatever.

Let’s get some tools:

sudo apt-get install git gfortran g++

We need to compile against Python headers and get setuptools and pip:

sudo apt-get install python-dev python-pip

Let’s isolate our Python distro from carnage:

sudo apt-get python-virtualenv
sudo pip install virtualenvwrapper

Now these lines to your ~/.bashrc:

source /usr/local/bin/virtualenvwrapper.sh
export WORKON_HOME=$HOME/.virtualenvs

Now open a new terminal and establish a virtual environment, say “py27”:

mkvirtualenv py27
workon py27

We need some math libraries (ATLAS + LAPACK):

sudo apt-get install libatlas-base-dev liblapack-dev

Ok, now to install and build all the scientific python hotness:

pip install numpy scipy

For matplotlib, we need lots of libraries. This one is dependency heavy. Note we can ask Ubuntu what we need, what’s installed, and what is not:

apt-cache depends python-matplotlib | awk '/Depends:/{print $2}' | xargs dpkg --get-selections

Easiest thing to do is just build all the dependencies (just say yes if it asks to install deps of matplotlib instead of python-matplotlib):

sudo apt-get build-dep python-matplotlib

Ok, now this should work:

pip install matplotlib

Now, of course, we need the IPython interpreter. Don’t settle for 0.11!

pip install -e git+https://github.com/ipython/ipython.git#egg=ipython
cd ~/.virtualenvs/py27/src/ipython
python setupegg.py install

Note, you may need to sudo rm /usr/bin/ipython.py if there is a conflict.

Ok, let’s beef up the IPython interpreter. Note the pip commands FAIL. This is ok. We’ll do it by hand.

sudo apt-get install qt4-dev-tools

pip install sip
cd ~/.virtualenvs/py27/build/sip
python configure.py
make
sudo make install

pip install pyqt
cd ~/.virtualenvs/py27/build/pyqt
python configure.py
make
sudo make install

# clean up
cd ~/.virtualenvs/py27/
rm -rf build

Just a few more things, you won’t be disappointed.

sudo apt-get install libzmq-dev
pip install tornado pygments pyzmq

Alright, let’s get pandas. It’s under heavy development (Wes is a beast); so lets pull the latest from git.

pip install nose cython
pip install -e git+https://github.com/wesm/pandas#egg=pandas

# we should get statsmodels too
pip install -e git+https://github.com/statsmodels/statsmodels#egg=statsmodels

Btw, you’ll note this git stuff goes into your ~/.virtualenvs/py27/src directory, if you want to git pull and update.

OK! Phew! For the grand finale:

Run the amazing qtconsole:

ipython qtconsole --pylab=inline

Or the even more amazing WEB BROWSER:

ipython notebook --pylab=inline

Launch a browser and point to http://localhost:8888/. For kicks, try opening one of Wes’s tutorial workbooks, here. You may have to fiddle a bit, but it should work.

Enjoy!

© 2014 Adam Klein's Blog Suffusion theme by Sayontan Sinha, modified by Adam :)