MCMC on the Parallella

Introduction

A very, very, very long time ago, I backed an interesting looking Kickstarter project called Parallella. The idea, somewhat inspired by the Raspberry Pi project, was to enable a small hardware company, Adapteva, to create a small (credit card sized), cheap, single board computer. The difference with the Raspberry Pi would be that this computer would have a more powerful CPU, and would also have an on-board FPGA and a 16 (or 64) core co-processor, called the Epiphany. In fact, the whole project is really a way to kick-start the production and development of the Epiphany chip, but that isn’t necessarily the main interest of the backers, many of whom are just geeky hobbyists like myself. The project was funded (I backed it at the $99 level), but then went fairly quiet for around 2 years(!) until I got a message from UK Customs telling me I had to pay import duty on my board… The board arrived earlier this year, but I’ve had lots of problems getting it going, so it’s only now that I have used it enough to warrant writing a blog post.

The tag-line of the Kickstarter project was “A Supercomputer For Everyone”, and that certainly attracted my interest. I do a lot of computationally intensive work, and have dabbled, on-and-off, with parallel, multi-core, and distributed computing for over 15 years. To have a serious parallel computer for not much more than the cost of two Raspberry Pis was just too good an opportunity to pass up. The Parallella board looks very much like a Raspberry Pi, and is similar to set up and use, but generally a lot more flaky and fiddly – if you are new to single board computers, I would strongly recommend getting up to speed with a Raspberry Pi model B+ first! The Pi is just a bit simpler, easier, more robust, and has a large, well-organised, supportive user community. I had various issues with faulty/incompatible adaptor cables, cheap SD cards, and a home router which refused to give it an IP address… I won’t bore everyone with all of the details, but I am now in a position where I can get it to boot up and run well enough to be able to begin to put the thing through its paces.

There is a recommended Linux distro for the Parallella, based on Linaro, which, like Raspbian, is a Debian derived Linux for ARM. Just like the Pi (B+), you write the OS to a micro SD card and then use that as the boot media. Typing cat /proc/cpuinfo reveals two cores of an ARM (v7) processor. The distro comes with various dev tools pre-installed (git, make, gcc, etc.), as well as some example code to get you started, so it’s actually pretty quick to get going with once the thing boots and runs.

Benchmarking C code on the ARM chip

To start off with I’m just looking at the dual core ARM processor – I will look at the Epiphany chip as time permits. To begin with, I benchmarked the processor using my standard C-based Gibbs sampling program. For completeness, the source code of gibbs.c is given below:

/*
gcc -O4 gibbs.c -lgsl -lgslcblas -lm -o gibbs
time ./gibbs > datac.tab
*/

#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
 
int main(void)
{
  int N=50000;      /* number of samples to output */
  int thin=1000;    /* Gibbs sweeps per output sample */
  int i,j;
  gsl_rng *r = gsl_rng_alloc(gsl_rng_mt19937);
  double x=0;
  double y=0;
  printf("Iter x y\n");
  for (i=0;i<N;i++) {
    for (j=0;j<thin;j++) {
      /* gsl_ran_gamma takes (shape, scale) */
      x=gsl_ran_gamma(r,3.0,1.0/(y*y+4));
      y=1.0/(x+1)+gsl_ran_gaussian(r,1.0/sqrt(2*x+2));
    }
    printf("%d %f %f\n",i,x,y);
  }
  gsl_rng_free(r);
  return 0;
}

This code requires the GSL, which isn’t installed by default, but a simple sudo apt-get install libgsl0-dev soon fixes that (I love Debian-derived Linux distros). The code runs in about 75 seconds on (a single core of) the Parallella. This is about 10 times slower than my (very fancy) laptop. However, it’s about twice as fast as the Pi (which only has one core), and around the same speed as my (Intel Atom based) netbook. So, given that the Parallella has two cores, this very superficial and limited experiment suggests that the ARM chip on the Parallella is around 4 times as powerful as the ARM chip in the Raspberry Pi. Of course, this is ignoring the FPGA and the Epiphany chip, but even without these, you are arguably getting more bang for your buck (and more importantly, more bang for your Watt) with the Parallella than you do with the Pi (though without the FPGA and Epiphany, there wouldn’t be much in it).
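For readers without the GSL to hand, the same sampler can be sketched in pure Python. This is only a minimal illustration of the algorithm, not one of the benchmarks above: the function name gibbs and its arguments are hypothetical, and it relies on random.gammavariate taking a shape and a scale, which matches the parameterisation of gsl_ran_gamma. It will almost certainly run far slower than the C version, so the defaults are deliberately tiny.

```python
import math
import random


def gibbs(n=10, thin=100, seed=None):
    """Return n thinned (x, y) draws from the same Gibbs sampler as gibbs.c."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    samples = []
    for _ in range(n):
        for _ in range(thin):
            # x | y ~ Gamma(shape=3, scale=1/(y^2+4))
            x = rng.gammavariate(3.0, 1.0 / (y * y + 4))
            # y | x ~ Normal(mean=1/(x+1), sd=1/sqrt(2x+2))
            y = rng.gauss(1.0 / (x + 1), 1.0 / math.sqrt(2 * x + 2))
        samples.append((x, y))
    return samples


if __name__ == "__main__":
    # Same output layout as the C program, but with far fewer iterations.
    print("Iter x y")
    for i, (x, y) in enumerate(gibbs(n=10, thin=100, seed=1)):
        print("%d %f %f" % (i, x, y))
```

Passing a seed makes a run reproducible, which is handy when comparing output across machines.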

Scala code

These days I use Scala in preference to C whenever I possibly can. So, I was also interested to see how well Scala runs on this board. Scala is JVM based, and the Parallella distro has the OpenJDK JVM pre-installed, so if you are an sbt person (and if you use Scala you should be), it’s just a matter of copying the sbt launcher to ~/bin and you are good to go. Again, for completeness, a simple (non-idiomatic!) Scala program equivalent to the C version above is given below:

/*
Gibbs2.scala

time sbt run > gibbs.dat
*/

object Gibbs2 {
 
    import java.util.Date
    import scala.math.sqrt
    import breeze.stats.distributions._
 
    def main(args: Array[String]) {
        val N=50000
        val thin=1000
        var x=0.0
        var y=0.0
        println("Iter x y")
        for (i <- 0 until N) {
            for (j <- 0 until thin) {
                // Breeze's Gamma takes (shape, scale), like gsl_ran_gamma
                x=new Gamma(3.0,1.0/(y*y+4)).draw
                y=new Gaussian(1.0/(x+1),1.0/sqrt(2*x+2)).draw
            }
            println(i+" "+x+" "+y)
        }
    }
 
}

An appropriate build.sbt file for resolving dependencies is given below:

name := "Gibbs2"

version := "0.1"

scalacOptions ++= Seq("-unchecked", "-deprecation", "-feature")

libraryDependencies  ++= Seq(
            "org.scalacheck" %% "scalacheck" % "1.11.4" % "test",
            "org.scalatest" %% "scalatest" % "2.1.7" % "test",
            "org.scalanlp" %% "breeze" % "0.10",
            "org.scalanlp" %% "breeze-natives" % "0.10"
)

resolvers ++= Seq(
            "Sonatype Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots/",
            "Sonatype Releases" at "https://oss.sonatype.org/content/repositories/releases/"
)

scalaVersion := "2.11.1"

This Scala version of the code takes around 30 minutes to complete on the Parallella! This is pretty dire, but entirely expected. Scala runs on the JVM, and the OpenJDK JVM is known to perform very poorly on ARM architectures. If you want to run computationally intensive JVM code on ARM, you have no choice but to get the Oracle JDK8 for ARM. Although this isn’t real “free” software, it can be downloaded and used for free for personal use.

Using Oracle’s JVM

Download and unpack Oracle’s JVM somewhere sensible on your system. Then update your sbt launcher script to point at the new java binary location. You may also want to add a javaHome line to your sbt build script. Once this is done, using sbt is just as normal, but you get to use Oracle’s pretty fast ARM JVM. Running the above example with the new JVM takes around 4 minutes 40 seconds on my Parallella. This is more than 5 times faster than the OpenJDK JVM, and it is a similar story on the Raspberry Pi as well. But it still isn’t blazingly fast. On Intel this code runs within a factor of 2 of the C version. Here it is closer to a factor of 4 slower. But again, this is very similar to the situation on the Pi. The Oracle JVM is good, but for whatever reason, the ARM JVM doesn’t get as close to the speed of native code as it does on x86.

A simple Monte Carlo example in Scala

I recently gave a talk on using Scala for statistical computing, and for that I prepared slides and some code examples. One of the examples was a simple Monte Carlo example, coded in several different ways, including one version which would automatically exploit multiple cores. There is a github repo containing the slides of the talk and all of the code examples. Running the Monte Carlo example (with 10^6 iterations) using the OpenJDK JVM produces the following results:

$ sbt run
[info] Set current project to monte-carlo (in build file:/home/linaro/git/statslang-scala/monte-carlo/)
[info] Running MonteCarlo 
Running with 1000000 iterations
Idiomatic vectorised solution
0.99498
time: 37922.050389ms
Fast efficient (serial) tail call
0.999875
time: 25327.705376ms
Parallelised version
0.99997
time: 15633.49492ms
Done
[success]

Again, using the OpenJDK leads to poor performance, but the important thing to note is that the multi-core version is about 1.6 times as fast as the single-threaded tail-call version, because the Parallella’s ARM chip has 2 cores. Re-running with the Oracle JVM leads to much better results:

$ sbt run
[info] Set current project to monte-carlo (in build file:/home/linaro/git/statslang-scala/monte-carlo/)
[info] Running MonteCarlo 
Running with 1000000 iterations
Idiomatic vectorised solution
1.002605
time: 8214.310728ms
Fast efficient (serial) tail call
0.997675
time: 2410.760695ms
Parallelised version
0.999525
time: 1800.324405ms
Done
[success] 

Here, the Oracle JVM is between roughly 5 and 10 times faster than the OpenJDK JVM, depending on the version of the code, and again the parallel version runs faster than the serial version (by a factor of about 1.3).
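The ratios between these timings are easy to check directly. The short Python sketch below just divides the numbers copied from the two sbt runs above; the dictionary and function names are my own and purely illustrative.

```python
# Timings in milliseconds, copied from the two sbt runs above.
openjdk = {"vectorised": 37922.050389, "tail call": 25327.705376, "parallel": 15633.49492}
oracle = {"vectorised": 8214.310728, "tail call": 2410.760695, "parallel": 1800.324405}


def speedup(slow_ms, fast_ms):
    """How many times faster the second timing is than the first."""
    return slow_ms / fast_ms


# Oracle JVM against OpenJDK, for each version of the code.
for name in sorted(openjdk):
    print("%-10s Oracle vs OpenJDK: %.1fx" % (name, speedup(openjdk[name], oracle[name])))

# Parallel version against the serial tail-call version, on each JVM.
for label, t in (("OpenJDK", openjdk), ("Oracle", oracle)):
    print("%s parallel vs serial: %.2fx" % (label, speedup(t["tail call"], t["parallel"])))
```

Bear in mind that these are single runs, so the ratios are only indicative.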

A Parallella gotcha to be aware of here is that I got weird certificate errors when cloning my github repos, which turned out to be caused by the system clock being wrong. Like the Pi, the Parallella doesn’t have an RTC chip built in, but unlike the Pi, it doesn’t always seem to automatically sync when an internet connection comes up. Manually running ntpdate fixed the date and then git worked fine…

Summary

The Parallella packs a lot of processing power in a very small, low power form-factor. Even just using the main ARM chip, it is a better starting point for computationally intensive workflows than the Raspberry Pi (provided that you can get it to work!). If you can find a way to exploit the Epiphany chip, things will look even better. Java and other JVM languages (eg. Scala) are fine on the Parallella, but only if you use Oracle’s JVM, otherwise you are going to have very poor performance. My next task is to have a play with the Epiphany examples that are bundled with the Parallella distro. At the moment it seems like the best way to use the Epiphany chip is via C+OpenCL. That involves a degree of masochism I tend not to subject myself to very often these days, so it may have to wait until I’m in an appropriate mood!

Programming Minecraft Pi edition using the Python API

This post will give a very quick introduction to programming the special edition of Minecraft for the Raspberry Pi using the Python API. It will be a very short post, covering the very basics, since the topic has been covered well elsewhere.

First, you need a Raspberry Pi running Minecraft Pi Edition. Follow the instructions on that site to get the game itself up and running. The game itself is rather unremarkable – it is a cut down version of the pocket edition. What makes it interesting is the fact that it has an API, and this API can be a good way to illustrate some basic programming concepts to kids. My kids are now quite interested in writing games in Scratch, I think because you can do interesting stuff on screen with very little code. But I’ve struggled to get them excited about Python, and other “real” programming languages, I think because the start-up costs are much higher. The Minecraft Pi API also allows you to build stuff on screen with very little Python code, so I’m hoping that this will turn out to be a good way to get kids interested in real programming languages.

Once the game is running, hit Escape to get your mouse pointer back and run a terminal. If you’ve just unpacked the game in your home directory, the Python API won’t be in the system Python path, so you’ll need to add it using a command like:

export PYTHONPATH=/home/pi/mcpi/api/python/mcpi

though this may need editing depending on the location of your copy of minecraft. You may want to add this to your .profile or .bashrc so that you don’t have to re-enter it. Once the path is set correctly, just run python from the command prompt, and start interacting. The following session should get you started:

import minecraft
import block
import math

mc=minecraft.Minecraft.create()

mc.postToChat("Hello")
mc.setBlock(10,10,10,block.STONE)

Then fly to (10,10,10) to check to see if there’s a stone block there.

Once that’s all working, there’s loads of stuff on-line that I don’t want to repeat here – start with Minecraft API basics, or dive in at MCPIPY.

I’ll just record here a few functions I’ve found useful for building stuff. First, drawing a circle. When I first learned to program as a kid, I did this by drawing a regular n-sided polygon for large n. However, that’s fairly unsatisfactory for various reasons. It occurs to me that it makes more sense to trace out the first one eighth of a circle pixel by pixel using Pythagoras, and then fill in the other 7 eighths via symmetry. I guess this is Computer Graphics 101, but I’ve never studied graphics… Anyway, a function that seems to work OK is below:

def circle(cx,cy,cz,r,b):
  x=r
  y=0
  while (x>=y):
    mc.setBlock(cx+x,cy,cz+y,b)
    mc.setBlock(cx+x,cy,cz-y,b)
    mc.setBlock(cx-x,cy,cz+y,b)
    mc.setBlock(cx-x,cy,cz-y,b)
    mc.setBlock(cx+y,cy,cz+x,b)
    mc.setBlock(cx+y,cy,cz-x,b)
    mc.setBlock(cx-y,cy,cz+x,b)
    mc.setBlock(cx-y,cy,cz-x,b)
    y=y+1
    x=int(round(math.sqrt(r*r-y*y)))
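The same idea can be sketched so that it can be tried out without the game running. This is just one possible arrangement, with hypothetical function names: circle_offsets does the geometry and returns a set of offsets, and draw_circle hands each one to whatever block-setting function it is given. It assumes an integer radius of at least 1.

```python
import math


def circle_offsets(r):
    """Return the set of (dx, dz) offsets on a circle of integer radius r >= 1."""
    points = set()
    x, y = r, 0
    while x >= y:
        # One point in the first octant gives eight points by symmetry.
        for a, b in ((x, y), (y, x)):
            for sa in (1, -1):
                for sb in (1, -1):
                    points.add((sa * a, sb * b))
        y += 1
        x = int(round(math.sqrt(r * r - y * y)))
    return points


def draw_circle(set_block, cx, cy, cz, r, b):
    """Call set_block(x, y, z, b) once for each block on the circle."""
    for dx, dz in sorted(circle_offsets(r)):
        set_block(cx + dx, cy, cz + dz, b)
```

With the session above, draw_circle(mc.setBlock, 10, 10, 10, 5, block.STONE) should place the same blocks as circle(10,10,10,5,block.STONE), but because the offsets are collected in a set, no block is sent to the game twice.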

Once you have a way of drawing circles, cylinders are easy:

def cylinder(cx,cy,cz,r,h,b):
  if (h>0):
    circle(cx,cy,cz,r,b)
    cylinder(cx,cy+1,cz,r,h-1,b)

Note that I’ve expressed this recursively rather than as an iteration, just because. It turns out that cylinders are useful for building castle turrets… Speaking of recursion, a straightforward function to draw a Menger sponge is:

def menger(x,y,z,l):
  if (l==0):
    mc.setBlock(x,y,z,block.STONE)
  else:
    s=3**(l-1)
    menger(x+0*s,y+0*s,z+0*s,l-1)
    menger(x+0*s,y+0*s,z+1*s,l-1)
    menger(x+0*s,y+0*s,z+2*s,l-1)
    menger(x+0*s,y+1*s,z+0*s,l-1)
    #menger(x+0*s,y+1*s,z+1*s,l-1)
    menger(x+0*s,y+1*s,z+2*s,l-1)
    menger(x+0*s,y+2*s,z+0*s,l-1)
    menger(x+0*s,y+2*s,z+1*s,l-1)
    menger(x+0*s,y+2*s,z+2*s,l-1)
    menger(x+1*s,y+0*s,z+0*s,l-1)
    #menger(x+1*s,y+0*s,z+1*s,l-1)
    menger(x+1*s,y+0*s,z+2*s,l-1)
    #menger(x+1*s,y+1*s,z+0*s,l-1)
    #menger(x+1*s,y+1*s,z+1*s,l-1)
    #menger(x+1*s,y+1*s,z+2*s,l-1)
    menger(x+1*s,y+2*s,z+0*s,l-1)
    #menger(x+1*s,y+2*s,z+1*s,l-1)
    menger(x+1*s,y+2*s,z+2*s,l-1)
    menger(x+2*s,y+0*s,z+0*s,l-1)
    menger(x+2*s,y+0*s,z+1*s,l-1)
    menger(x+2*s,y+0*s,z+2*s,l-1)
    menger(x+2*s,y+1*s,z+0*s,l-1)
    #menger(x+2*s,y+1*s,z+1*s,l-1)
    menger(x+2*s,y+1*s,z+2*s,l-1)
    menger(x+2*s,y+2*s,z+0*s,l-1)
    menger(x+2*s,y+2*s,z+1*s,l-1)
    menger(x+2*s,y+2*s,z+2*s,l-1)

So drawing M3 at (10,10,10) can be done with:

menger(10,10,10,3)

I know that other people wrote code for this as soon as the API was released, but I haven’t looked to see how their code compares to mine. In any case, this isn’t code golf – I could have written the function much more succinctly – I was trying to write it in a simple pedagogic style here, explicitly enumerating the 20 component blocks and commenting out the 7 “missing” blocks… Note that M3 is 27x27x27 blocks, which is a good size for Minecraft on the Pi. Generating M4 actually runs OK, though it takes a while. But this is 81x81x81, which is a struggle for Minecraft Pi to render satisfactorily.
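For comparison, here is a sketch of what a more succinct version might look like. It relies on the observation that the 7 missing sub-cubes are exactly those with at least two of their three indices equal to 1 (the six face centres and the body centre). The generator name menger_cells is hypothetical, and it only yields coordinates, so it can be checked without the game running.

```python
from itertools import product


def menger_cells(x, y, z, l):
    """Yield the (x, y, z) coordinates of every block in a level-l Menger sponge."""
    if l == 0:
        yield (x, y, z)
        return
    s = 3 ** (l - 1)
    for i, j, k in product(range(3), repeat=3):
        # Skip the 7 sub-cubes with at least two indices in the middle.
        if (i, j, k).count(1) >= 2:
            continue
        for cell in menger_cells(x + i * s, y + j * s, z + k * s, l - 1):
            yield cell
```

Looping over menger_cells(10,10,10,3) and calling mc.setBlock(cx,cy,cz,block.STONE) for each (cx,cy,cz) should then build the same M3 as before. A level-l sponge has 20**l blocks, so M3 needs 8000 setBlock calls and M4 needs 160000, which goes some way to explaining why M4 takes a while.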

Setting up a Minecraft server on a Raspberry Pi

I’ve recently set up a Minecraft server on a Raspberry Pi. There’s lots of information on line describing how to do this, but I still had some problems, in part due to a lot of information on-line being out-of-date, in part due to the fact that running a Minecraft server is right on the limit of what a Pi is capable of, and in part due to the fact that I don’t really know much about Minecraft…

Just to be clear, this is about running a Minecraft SERVER, not the game client. The game client doesn’t work well on the Pi, as it doesn’t have enough memory. There is a special game client for the Pi, the Minecraft Raspberry Pi Edition, which is free, and programmable, but is creative mode only, and has no monsters. That is not what this post is about. You can run a server on the Pi (for free) which you can use from a standard game client (which is not free). That is what this post is about.

On-line information I found useful includes the Gamepedia tutorial, this Forum thread, and this how-to-geek article. It is worth having a quick read through these before continuing.

It is important to understand that there are lots of different Minecraft servers out there, most of which are Java based, but not all. There are potential advantages to not using the standard vanilla Mojang reference server, as some servers are lighter weight, and hence could potentially run better on the Pi. However, lots of servers “out there” are not compatible with the latest (1.7.x) versions of the game client, so it’s probably best to get the vanilla server up and running first, before exploring other possibilities. Note that I’m assuming a Revision 2 Model B Pi with 512MB RAM. I don’t imagine that you will have a good experience with 256MB RAM. You should use raspi-config to allocate as little RAM as possible to the GPU, and you should overclock the Pi as much as you dare.

The Mojang server is a Java server, so you need a fast JVM installed on the Pi. The OpenJDK JVM is too slow – you need the Oracle JVM. Lots of info on the web refers to Oracle’s developer preview of Java 8, but that isn’t necessary now, as Oracle’s Java 7 is now a standard part of Raspbian. So just install that:

sudo apt-get update
sudo apt-get install oracle-java7-jdk

Use java -version to make sure it has worked. Then just pull the server jar, stick it into an appropriately named directory, and run the server from that directory with a command like:

java -Xms256M -Xmx384M -jar minecraft_server.1.7.2.jar nogui

The first time you run this it will take an age (possibly up to half an hour). Subsequent starts will be much quicker (15 seconds). Once it is up and running, try connecting to the address/name of the Pi from the machine you usually use for running the Minecraft game client. If the connection fails due to a protocol error, this usually means that the server is too old to cope with the latest version of the game client, so you need to find a newer server. If it fails due to a timeout, then it means that your Pi isn’t running fast enough. Tweak settings. Overclock more, etc. There are various settings in the “server.properties” file that you can tweak to improve the speed at which the server runs. Disabling “nether” and dropping the view distance down (to, say, 5) seem to be particularly effective. Again, you can find out more about this by googling around. My current server.properties looks like this:

#Minecraft server properties
#Sat Nov 30 14:23:39 UTC 2013
generator-settings=
op-permission-level=4
allow-nether=false
level-name=world
enable-query=false
allow-flight=false
announce-player-achievements=true
server-port=25565
level-type=DEFAULT
enable-rcon=false
force-gamemode=false
level-seed=
server-ip=
max-build-height=256
spawn-npcs=true
white-list=true
spawn-animals=true
hardcore=false
snooper-enabled=true
texture-pack=
online-mode=true
resource-pack=
pvp=true
difficulty=1
enable-command-block=false
player-idle-timeout=0
gamemode=0
max-players=10
spawn-monsters=true
generate-structures=true
view-distance=5
spawn-protection=16
motd=Pi server
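If you would rather script these tweaks than edit the file by hand, something like the following Python sketch could be used. The function name set_properties is hypothetical, and the sketch assumes the simple key=value layout shown above, with comment lines starting with #. Keyword arguments use underscores in place of hyphens.

```python
def _as_text(value):
    """Format a Python value the way the properties file writes it."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def set_properties(text, **changes):
    """Return text with the given settings replaced, or appended if absent.

    view_distance=5 sets the "view-distance" key, and so on.
    """
    wanted = dict((k.replace("_", "-"), _as_text(v)) for k, v in changes.items())
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0]
        if not line.startswith("#") and "=" in line and key in wanted:
            line = key + "=" + wanted.pop(key)
        out.append(line)
    # Any setting not already present is appended at the end.
    out.extend(k + "=" + v for k, v in sorted(wanted.items()))
    return "\n".join(out) + "\n"
```

For example, reading the file into a string, passing it through set_properties(text, allow_nether=False, view_distance=5), and writing the result back would apply the two changes mentioned above. Stop the server before editing the file, and keep a copy of the original.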

If you’ve never set up a server before and are having trouble, it may be easier to set things up on another machine and then copy things across to the Pi when you get it working. I actually got mine up and running on a fast Ubuntu laptop first, which saved a lot of time.

I’m still running the Mojang server. It’s generally fine, but gives lots of warnings about “Not keeping up”, and riding horses doesn’t really work very well. If anyone has suggestions for another server that will work as a swap-in replacement and run better on the Pi, please do leave a comment. I’d be interested in something that runs a bit faster, and copes better with riding horses, smashing blocks, etc. But my kids will not be pleased if they have to re-build their world…

A new personal blog

Welcome to my new personal blog site. Unfortunately my old personal blog was hosted on Posterous, which is about to shut down. So I’ve created this new blog on wordpress.com, and I’ve imported my Posterous blog. This has done a basic job of copying my old posts across, but the formatting and cross-linking is all messed up, and not all of the media has copied across correctly. I’ll try and fix some of the worst formatting issues as time permits. But be warned that any post older than this one is likely to look a bit messed up…

 

 

The Raspberry Pi as a simple DIY NAS

Network-attached storage (NAS) devices are becoming increasingly popular. I thought it would be interesting to try using a Pi with a USB HDD to make a simple DIY NAS as backup for my home network. After plugging in the HDD I used “dmesg” to check the device and then I formatted it with a command like

% sudo mkfs.ext3 -L 'freecom2tb' /dev/sda1

I use EXT3 only because I’m old enough to remember when EXT4 was experimental, and not for any good reason (actually, I’m old enough to remember when EXT2 was considered somewhat avant-garde, but that’s another matter…). Once formatted, it can be mounted temporarily with a command like

% sudo mount -t ext3 /dev/sda1 /mnt

For a more permanent solution, first create a mount point for it with a command like

% sudo mkdir /mnt/sda1

and then add a line to the fstab with a command like

% echo "/dev/sda1 /mnt/sda1 ext3 defaults 0 0" | sudo tee -a /etc/fstab

Then the drive should mount on /mnt/sda1 automatically on boot. Once the storage device is enabled, you need to consider how to make it available over the network. If SSH is enabled on the Pi, then simple “scp” is supported already. For more “live” file system access, NFS or Samba can be used. As I want to use this for backup, “rsync” is all I want, which can be installed with

% sudo apt-get install rsync

That’s it. The Pi is now sitting on the network acting as an rsync destination. It’s not very fast, but it seems to work quite reliably, which is probably fine as a backup solution.
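On the client side, a backup run might be scripted along the following lines. This is only a sketch: the function names, the default user “pi”, and the choice of flags are my assumptions, and the --delete flag in particular makes the destination an exact mirror (files removed locally are removed on the Pi too), which may or may not be what you want from a backup.

```python
import subprocess


def rsync_command(src_dir, host, dest_dir, user="pi"):
    """Build an rsync-over-SSH command that mirrors src_dir to the Pi."""
    return [
        "rsync",
        "-az",        # archive mode, compress in transit
        "--delete",   # remove files on the Pi that no longer exist locally
        src_dir.rstrip("/") + "/",   # trailing slash: copy the contents
        "%s@%s:%s" % (user, host, dest_dir),
    ]


def backup(src_dir, host, dest_dir, user="pi"):
    """Run the backup, raising CalledProcessError if rsync fails."""
    subprocess.check_call(rsync_command(src_dir, host, dest_dir, user))
```

A call such as backup("/home/me/docs", "raspberrypi", "/mnt/sda1/backup") would then push a directory to the drive; the host name and paths here are placeholders to be replaced with your own.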

 

Controlling a USB robotic arm with a Raspberry Pi

My son was recently given a Maplin robotic arm kit with USB interface by a neighbour of ours who had received it as an unwanted gift. Over the Xmas break I got a chance to help my son build it (it only took a couple of hours, and was reasonably straightforward). Once we’d built it, we tested it out using a laptop booted into Windows with the supplied Windows software. The Windows software is OK for manual control, but the “programming” software is really quite dire… Clearly, what is needed here is a way to drive it from a Python session running on the Raspberry Pi. It turns out to be very easy to do this.

First, get hold of PyUSB (version >= 1.0.0), and note that the version currently in the Raspbian repo is too old. Unpack PyUSB and build it in the usual python way using

sudo python setup.py install

Once you are ready to go, plug in the arm and switch it on. Use

lsusb

to check that the Pi has detected the arm correctly.

I expected to have a bit of hassle figuring out how to drive the arm from python, but it turns out that very clear instructions for this arm are given in the article “Skutter: Part 2” in Issue 3 of the MagPi magazine, so just following those instructions gets the arm up and running in a few minutes. Very cool.

How to rename the default account on the Raspberry Pi

The standard Raspbian OS image for the Raspberry Pi comes with a default account called “pi” (with UID 1000, and password “raspberry”). One of the first things you should do before putting the Pi on the internet is to change the password to something more secure. However, you may also prefer a different username. This is a question which has come up on the Raspberry Pi StackExchange site. The simplest thing to do is to create a new account with the desired username, then grant it sudo privileges, and then lock the “pi” account. However, sometimes it really is desirable to actually rename the “pi” account (eg. because you want it to have the UID 1000). You can do this, but it is very easy to mess up, locking yourself out of your Pi, so here is a method that I have found to work well. But BE CAREFUL! YMMV…

It is very tricky to rename an account while you are logged in to it, so first enable the root account with

% sudo passwd root

Use a secure password, even if you intend to lock the root account again later. Then log out and log back in as root. The rest supposes a desired username of “myuname” – replace with whatever you want.

# usermod -l myuname pi
# usermod -m -d /home/myuname myuname

Then log out and log back in again as “myuname”. If you are still using the default password of “raspberry” on this account, do

% passwd

and change password to something more secure. That should be it. Test carefully! “sudo” users seem to get updated OK, but check that your renamed account works and really does have “sudo” privileges before disabling the root account.
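As a quick sanity check after the rename, a few lines of Python can confirm that the new name exists, still has UID 1000, and has the expected home directory. This is a sketch with a hypothetical function name, and the group check assumes that sudo rights come from membership of a group called “sudo”. Rights can also be granted directly in the sudoers files, so running a sudo command as the renamed user remains the real test.

```python
import grp
import pwd


def account_report(username, expected_uid=1000, admin_group="sudo"):
    """Return a dict describing whether the renamed account looks sane."""
    try:
        entry = pwd.getpwnam(username)
    except KeyError:
        return {"exists": False}
    try:
        in_group = username in grp.getgrnam(admin_group).gr_mem
    except KeyError:
        in_group = False   # no such group on this system
    return {
        "exists": True,
        "uid_ok": entry.pw_uid == expected_uid,
        "home": entry.pw_dir,
        "in_admin_group": in_group,
    }
```

Calling account_report("myuname") and account_report("pi") from a Python prompt should show the new name present with the right UID and home directory, and the old name gone.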

Should you prefer to disable the root account, do

% sudo passwd -l root

Technically, this just locks the password – it doesn’t completely disable the account. But that’s probably what you want.