My friend was kind enough to give me a 64 GB solid-state drive to throw into my development box. I was curious how its speed compares to what I’ve been running: a four-disk RAID5 array of Seagate Barracuda 7200 320GB 7200 RPM SATA (3.0Gb/s) drives on an Areca ARC-1210 PCI-Express x8 SATA II (3.0Gb/s) Controller Card. Just checked my Newegg order history: two of the hard drives are from January 2006, the other two from Sept 2006. I didn’t know they had gotten so old! Haven’t had any failures yet, though.

A few tests (sdb5 = RAID5, sda5 = SSD):

sudo hdparm -tT /dev/sdb5

/dev/sdb5:
 Timing cached reads:   15534 MB in  1.99 seconds = 7788.19 MB/sec
 Timing buffered disk reads: 456 MB in  3.01 seconds = 151.39 MB/sec

sudo hdparm -tT /dev/sda5

/dev/sda5:
 Timing cached reads:   14776 MB in  1.99 seconds = 7407.07 MB/sec
 Timing buffered disk reads: 558 MB in  3.00 seconds = 185.80 MB/sec
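
As a quick sanity check on those numbers, here is a small Python sketch that pulls the MB/sec figure out of each hdparm result line and reports the gap between the two devices. The `throughput_mb_per_sec` helper and the dictionary layout are names I made up for illustration; they are not part of hdparm.

```python
import re

# Result lines copied from the two hdparm runs above.
RESULTS = {
    "raid5": {
        "cached": "Timing cached reads:   15534 MB in  1.99 seconds = 7788.19 MB/sec",
        "buffered": "Timing buffered disk reads: 456 MB in  3.01 seconds = 151.39 MB/sec",
    },
    "ssd": {
        "cached": "Timing cached reads:   14776 MB in  1.99 seconds = 7407.07 MB/sec",
        "buffered": "Timing buffered disk reads: 558 MB in  3.00 seconds = 185.80 MB/sec",
    },
}


def throughput_mb_per_sec(line):
    """Return the trailing 'NNN.NN MB/sec' figure of an hdparm result line."""
    match = re.search(r"=\s*([\d.]+)\s*MB/sec", line)
    if match is None:
        raise ValueError("not an hdparm timing line: %r" % line)
    return float(match.group(1))


# Positive numbers mean the device named in the message is the faster one.
cached_gap = (throughput_mb_per_sec(RESULTS["raid5"]["cached"])
              - throughput_mb_per_sec(RESULTS["ssd"]["cached"]))
buffered_gap = (throughput_mb_per_sec(RESULTS["ssd"]["buffered"])
                - throughput_mb_per_sec(RESULTS["raid5"]["buffered"]))

print("cached reads: RAID5 ahead by %.2f MB/sec" % cached_gap)
print("buffered reads: SSD ahead by %.2f MB/sec" % buffered_gap)
```

Keep in mind that the cached figure mostly reflects memory and cache bandwidth, so the buffered line is the one that says something about the disks themselves.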

Interesting: the cached throughput on the RAID controller is very good, beating my motherboard controller by a sound 380MB/sec. Buffered reads on the SSD are faster by about 34MB/sec. What about random access seek times? I found some C code to do random seeks. Save the code as seeker.c, and compile with gcc seeker.c -o seeker.

#define _LARGEFILE64_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <time.h>
#include <signal.h>
#include <sys/fcntl.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

#define BLOCKSIZE 4096
#define TIMEOUT 30

int count;
time_t start;

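/* SIGALRM handler: print a progress dot each second, then the results */
void done(int signum)
{
    time_t end;

    (void) signum; /* unused, but required by the handler signature */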

    time(&end);

    if (end < start + TIMEOUT) {
        printf(".");
        alarm(1);
        return;
    }

    if (count) {
      printf(".\nResults: %d seeks/second, %.2f ms random access time\n",
         count / TIMEOUT, 1000.0 * TIMEOUT / count);
    }
    exit(EXIT_SUCCESS);
}

void handle(const char *string, int error)
{
    if (error) {
        perror(string);
        exit(EXIT_FAILURE);
    }
}

int main(int argc, char **argv)
{
    char buffer[BLOCKSIZE];
    int fd, retval;
    unsigned long numblocks;
    off64_t offset;

    setvbuf(stdout, NULL, _IONBF, 0);

    printf("Seeker v2.0, 2007-01-15, "
           "http://www.linuxinsight.com/how_fast_is_your_disk.html\n");

    if (argc != 2) {
        printf("Usage: seeker <raw disk device>\n");
        exit(EXIT_SUCCESS);
    }

    fd = open(argv[1], O_RDONLY);
    handle("open", fd < 0);

    retval = ioctl(fd, BLKGETSIZE, &numblocks);
    handle("ioctl", retval == -1);
    printf("Benchmarking %s [%luMB], wait %d seconds",
           argv[1], numblocks / 2048, TIMEOUT);

    time(&start);
    srandom(start); /* random() below is seeded by srandom(), not srand() */
    signal(SIGALRM, &done);
    alarm(1);

    for (;;) {
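        /* BLKGETSIZE reports 512-byte sectors; scale to BLOCKSIZE units so
           every seek lands inside the device */
        offset = (off64_t) (numblocks / (BLOCKSIZE / 512)) * random() / RAND_MAX;
        /* compare the full off64_t result; an int would truncate large offsets */
        handle("lseek64", lseek64(fd, BLOCKSIZE * offset, SEEK_SET) == (off64_t) -1);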
        retval = read(fd, buffer, BLOCKSIZE);
        handle("read", retval < 0);
        count++;
    }
    /* notreached */
}

You need an accurate blocksize; for me, that was 4096.

$ sudo dumpe2fs /dev/sdb5 | grep 'Block size'
dumpe2fs 1.41.14 (22-Dec-2010)
Block size:               4096
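
If you would rather not compile C, the same idea (jump to a random block, read it, count how many reads fit in a fixed time window) can be sketched in Python. This is my own rough port, not the linuxinsight code: the function name, the defaults and the scratch-file demo are all assumptions, and it relies on `os.pread`, so it needs Python 3 on a Unix-like system.

```python
import os
import random
import tempfile
import time


def random_read_benchmark(path, block_size=4096, duration=1.0):
    """Read randomly chosen aligned blocks from path for about `duration` seconds.

    Returns (reads_per_second, milliseconds_per_read).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end reports the size of regular files and block devices alike.
        size = os.lseek(fd, 0, os.SEEK_END)
        num_blocks = size // block_size
        if num_blocks == 0:
            raise ValueError("%s is smaller than one block" % path)
        count = 0
        deadline = time.monotonic() + duration
        while True:
            # Pick a block index, then convert it to a byte offset.
            offset = random.randrange(num_blocks) * block_size
            os.pread(fd, block_size, offset)
            count += 1
            if time.monotonic() >= deadline:
                break
    finally:
        os.close(fd)
    return count / duration, 1000.0 * duration / count


if __name__ == "__main__":
    # Demo on a 1 MiB scratch file; pass a device path (as root) for a real run.
    with tempfile.NamedTemporaryFile() as scratch:
        scratch.write(os.urandom(1024 * 1024))
        scratch.flush()
        rate, ms = random_read_benchmark(scratch.name, duration=0.5)
        print("%.0f reads/second, %.3f ms per read" % (rate, ms))
```

The scratch-file demo will be served from the page cache, so its numbers say nothing about the disk; only a run against a device much larger than RAM approximates what seeker measures.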

So the results?

RAID5:

Seeker v2.0, 2007-01-15, http://www.linuxinsight.com/how_fast_is_your_disk.html
Benchmarking /dev/sdb5 [420909MB], wait 30 seconds..............................
Results: 85 seeks/second, 11.73 ms random access time

SSD:

Seeker v2.0, 2007-01-15, http://www.linuxinsight.com/how_fast_is_your_disk.html
Benchmarking /dev/sda5 [27235MB], wait 30 seconds..............................
Results: 5429 seeks/second, 0.18 ms random access time

The SSD is roughly a 64x improvement over the RAID5 (5429 vs. 85 seeks/second), just shy of two orders of magnitude. Not surprising, but still pretty awesome.
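
For the curious, the ratio is easy to work out from the two "Results" lines. A tiny Python sketch, using only the seek counts printed above (the variable names are mine):

```python
import math

# Seeks per second, copied from the two seeker runs above.
raid5_seeks = 85
ssd_seeks = 5429

speedup = ssd_seeks / float(raid5_seeks)
orders_of_magnitude = math.log10(speedup)

print("speedup: %.1fx" % speedup)
print("orders of magnitude: %.2f" % orders_of_magnitude)

# Mean access time is the reciprocal of the seek rate. seeker prints the
# rate as a truncated integer, so this lands close to, not exactly on,
# the access times it reported.
for name, seeks in (("RAID5", raid5_seeks), ("SSD", ssd_seeks)):
    print("%s: about %.2f ms per seek" % (name, 1000.0 / seeks))
```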

Finally, I wanted to migrate my /home directory onto the SSD. This process worked for me:

sudo mkdir /mnt/ocz
sudo mount /dev/sda5 /mnt/ocz
cd /home
sudo find . -depth -print0 | sudo cpio --null --sparse -pvd /mnt/ocz
sudo mv /home /home_old
sudo mkdir /home
sudo mount /dev/sda5 /home
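
Before renaming the old /home it is worth confirming the copy is complete. Here is a minimal Python sketch of one way to do that, comparing relative paths and file sizes between the two trees. `compare_trees` and `relative_entries` are hypothetical helpers of mine, and a size match is only a cheap check, not a byte-for-byte comparison.

```python
import os


def relative_entries(root):
    """Return {relative path: size in bytes} for regular files below root.

    Symlinks and special files are skipped.
    """
    entries = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            entries[os.path.relpath(full, root)] = os.path.getsize(full)
    return entries


def compare_trees(source, target):
    """Return (missing, mismatched): paths absent from target, and paths
    present in both trees whose sizes differ."""
    src = relative_entries(source)
    dst = relative_entries(target)
    missing = sorted(path for path in src if path not in dst)
    mismatched = sorted(path for path in src
                        if path in dst and src[path] != dst[path])
    return missing, mismatched
```

Run it as root between the cpio and mv steps, e.g. `missing, mismatched = compare_trees("/home", "/mnt/ocz")`; two empty lists are what you hope to see.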

Don’t forget to add the mount to /etc/fstab so it stays there! Best to use the UUID of the device, so:

11:29 ~ $ sudo blkid
[sudo] password for adam:
/dev/sda5: LABEL="Linux_OCZ" UUID="..." TYPE="ext4"

So /etc/fstab gets another entry, e.g.:

# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# /home => ocz ssd disk
UUID=...        /home           ext4    errors=remount-ro 0       2
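
If the six columns are hard to keep straight, this Python sketch splits an fstab line into named fields. The field names are my own labels for the columns in the header comment, and the sample line is shaped like the entry above with the UUID left elided, as in the post.

```python
FIELDS = ("device", "mount_point", "fs_type", "options", "dump", "fsck_pass")


def parse_fstab_line(line):
    """Split one fstab line into a dict; return None for blanks and comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    parts = stripped.split()
    # fstab(5) allows the last two fields to be omitted; both default to 0.
    if not 4 <= len(parts) <= 6:
        raise ValueError("expected 4 to 6 fields, got %d" % len(parts))
    parts = parts + ["0"] * (len(FIELDS) - len(parts))
    entry = dict(zip(FIELDS, parts))
    entry["dump"] = int(entry["dump"])
    entry["fsck_pass"] = int(entry["fsck_pass"])
    return entry


# Pass 2 here, following the fstab(5) suggestion for non-root file systems.
sample = "UUID=...        /home           ext4    errors=remount-ro 0       2"
entry = parse_fstab_line(sample)
for field in FIELDS:
    print("%-12s %s" % (field, entry[field]))
```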

Sadly, this is often the case:

X11

But not today! I wanted to get my dual monitors set up with my good ole’ GTS 250 and got a cringe-inducing nvidia-settings error:

“Failed to set MetaMode (1) ‘DFP-0: nvidia-auto-select @1680x1050 +0+0, DFP-1: nvidia-auto-select @1680x1050 +0+0’ (Mode 1680x1050, id: 52) on X screen 0.”

Could a recently-released, updated driver solve this issue? Yes.

From the command line:

sudo add-apt-repository ppa:ubuntu-x-swat/x-updates
sudo apt-get update
sudo apt-get install nvidia-current

Next: reboot. Update additional packages per any suggestions.

No tears!

I’ve got a spanking new install of Kubuntu 11.10, and I need to get it set up for Python data hacking. Sure, I could spring for an Enthought Python Distribution, but where would be the masochism in that?

Inspired by Zed, let’s do this the hard way.

The Linux distro comes with Python 2.7.2. Perfect! Alternatively, use Pythonbrew to set up a local build of whichever Python you want. I presume you know how to get to the command line, as well as how to edit text files using emacs, vim, pico, whatever.

Let’s get some tools:

sudo apt-get install git gfortran g++

We need to compile against Python headers and get setuptools and pip:

sudo apt-get install python-dev python-pip

Let’s isolate our Python distro from carnage:

sudo apt-get install python-virtualenv
sudo pip install virtualenvwrapper

Now add these lines to your ~/.bashrc:

export WORKON_HOME=$HOME/.virtualenvs
source /usr/local/bin/virtualenvwrapper.sh

Now open a new terminal and establish a virtual environment, say “py27”:

mkvirtualenv py27
workon py27
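
To double-check that the shell really switched interpreters, you can ask Python itself. A small sketch, written to run on both Python 2.7 and Python 3; `in_virtualenv` is a hypothetical helper name of mine.

```python
import sys


def in_virtualenv():
    """Best-effort check for whether this interpreter lives in a virtual environment."""
    # Classic virtualenv sets sys.real_prefix on the interpreter it creates.
    if hasattr(sys, "real_prefix"):
        return True
    # venv (and newer virtualenv) instead keep sys.base_prefix pointing at
    # the system installation while sys.prefix points at the environment.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("interpreter: %s" % sys.executable)
print("inside a virtualenv: %s" % in_virtualenv())
```

After `workon py27` the interpreter path should sit somewhere under ~/.virtualenvs/py27.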

We need some math libraries (ATLAS + LAPACK):

sudo apt-get install libatlas-base-dev liblapack-dev

Ok, now to install and build all the scientific python hotness:

# scipy's build expects numpy to be installed already
pip install numpy
pip install scipy

matplotlib is dependency-heavy and needs lots of libraries. Note we can ask Ubuntu what it needs, and which of those packages are already installed:

apt-cache depends python-matplotlib | awk '/Depends:/{print $2}' | xargs dpkg --get-selections
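
The awk step in that pipeline just prints the second whitespace-separated field of every line containing "Depends:". Here is the same filter as a Python sketch, in case you want to post-process the list. The sample text is an illustrative stand-in I wrote by hand, not captured from a real apt-cache run, and the function name is mine.

```python
# Illustrative sample only; real apt-cache output will list different packages.
SAMPLE_OUTPUT = """\
python-matplotlib
  Depends: python-dateutil
  Depends: python-numpy
 |Depends: python-tk
  Suggests: ipython
  Recommends: python-glade2
"""


def second_field_of_depends_lines(text):
    """Mimic awk '/Depends:/{print $2}' on apt-cache style text."""
    names = []
    for line in text.splitlines():
        if "Depends:" in line:
            fields = line.split()
            # awk would print an empty line when $2 is absent; skip those here.
            if len(fields) >= 2:
                names.append(fields[1])
    return names


depends = second_field_of_depends_lines(SAMPLE_OUTPUT)
print("\n".join(depends))
```

Like the awk pattern, this is a substring match, so a "PreDepends:" line would be picked up as well.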

Easiest thing to do is just install all the build dependencies (just say yes if it asks to install deps of matplotlib instead of python-matplotlib):

sudo apt-get build-dep python-matplotlib

Ok, now this should work:

pip install matplotlib

Now, of course, we need the IPython interpreter. Don’t settle for 0.11!

pip install -e git+https://github.com/ipython/ipython.git#egg=ipython
cd ~/.virtualenvs/py27/src/ipython
python setupegg.py install

Note, you may need to sudo rm /usr/bin/ipython.py if there is a conflict.

Ok, let’s beef up the IPython interpreter. Note the pip commands FAIL. This is ok. We’ll do it by hand.

sudo apt-get install qt4-dev-tools

pip install sip
cd ~/.virtualenvs/py27/build/sip
python configure.py
make
sudo make install

pip install pyqt
cd ~/.virtualenvs/py27/build/pyqt
python configure.py
make
sudo make install

# clean up
cd ~/.virtualenvs/py27/
rm -rf build

Just a few more things, you won’t be disappointed.

sudo apt-get install libzmq-dev
pip install tornado pygments pyzmq

Alright, let’s get pandas. It’s under heavy development (Wes is a beast); so let’s pull the latest from git.

pip install nose cython
pip install -e git+https://github.com/wesm/pandas#egg=pandas

# we should get statsmodels too
pip install -e git+https://github.com/statsmodels/statsmodels#egg=statsmodels

Btw, you’ll note this git stuff goes into your ~/.virtualenvs/py27/src directory, if you want to git pull and update.
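
With that many builds it is easy to miss one that quietly failed. Here is a minimal Python sketch that tries each import and prints what it finds. The import names in `PACKAGES` are my assumption about how each package is imported (pyzmq as `zmq`, for instance, and older statsmodels releases lived under `scikits.statsmodels`), so adjust the list to your setup.

```python
import importlib

# Import names are assumptions; edit to match what you installed.
PACKAGES = ["numpy", "scipy", "matplotlib", "IPython", "zmq",
            "tornado", "pygments", "pandas", "statsmodels"]


def report_versions(names):
    """Return {name: version string, or a MISSING note} for each import name."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:
            # A half-built extension can fail with errors other than ImportError.
            report[name] = "MISSING (%s)" % exc
        else:
            report[name] = getattr(module, "__version__", "unknown version")
    return report


if __name__ == "__main__":
    for name, status in sorted(report_versions(PACKAGES).items()):
        print("%-12s %s" % (name, status))
```

Run it inside the py27 environment; anything flagged MISSING is worth reinstalling before moving on.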

OK! Phew! For the grand finale:

Run the amazing qtconsole:

ipython qtconsole --pylab=inline

Or the even more amazing WEB BROWSER:

ipython notebook --pylab=inline

Launch a browser and point to http://localhost:8888/. For kicks, try opening one of Wes’s tutorial workbooks, here. You may have to fiddle a bit, but it should work.
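
If the page does not load, a first thing to check is whether anything is listening on that port at all. A small Python sketch, assuming the notebook server is on localhost:8888 as above; `port_is_open` is a hypothetical helper of mine.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers refused connections and timeouts alike.
        return False
    conn.close()
    return True


print("notebook server reachable: %s" % port_is_open("localhost", 8888))
```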

Enjoy!

© 2014 Adam Klein's Blog Suffusion theme by Sayontan Sinha, modified by Adam :)