Docker on Snappy Ubuntu Core on a Raspberry Pi 2

In a previous post I gave a quick introduction to Snappy Ubuntu Core on the Raspberry Pi 2. This was based on the very early version of Snappy Ubuntu Core that was released around the time of the Raspberry Pi 2. There were various problems with that early release that have since been fixed, so in this post I want to give a quick update on the status of Snappy on the Pi, especially in relation to Docker.

If you have an early version of Snappy for the Pi 2, you should manually download a new image and re-flash your SD card. I did not have any luck in using the Snappy upgrade mechanism (which is somewhat ironic). You can download a new image here. I used ubuntu-15.04-snappy-armhf-rpi2.img for this post.

You can check if you have an old or new version of Snappy by running snappy list. That gave an error with the old version, but should work fine with the new version. The new version also sets the system clock using the internet, so date should return the correct time for any system that is connected to the internet. As before, Snappy runs an SSH server from first boot (ubuntu/ubuntu), so it can happily be run completely headless.

Once logged in, the snappy guides should more-or-less work. Commonly used Snappy commands are indicated below.

snappy --help
snappy list --help
snappy info
snappy list
snappy list -a
snappy list -u
snappy search docker
sudo snappy update ubuntu-core
sudo reboot
sudo snappy install docker

There is now a Snappy framework for Docker which can be easily installed, as indicated above. If you are a Docker person, this is essentially the only Snappy framework/app you need, as you can do everything else with Docker images.

The main issue with Docker on the Pi 2 (and on ARM more generally) is that the vast majority of images on Docker Hub are for x86, and therefore will not work on ARM. For the Pi, the best bet is to search for images containing the text rpi or armhf. Again, some indicative Docker commands are given below.

docker search rpi
docker info
docker images
docker pull hypriot/rpi-java
docker pull resin/rpi-raspbian
docker images
docker run -i -t hypriot/rpi-java /bin/bash
docker ps
docker ps -a
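The advice above about looking for rpi or armhf in the image name can be turned into a crude filter. This is only a sketch: the function names are made up for illustration, and matching on the name is a heuristic, so an image that passes may still turn out not to run on the Pi.

```shell
# Heuristic only: guess from an image name whether it targets the Pi / ARM.
# The substrings "rpi" and "armhf" are the ones suggested in the text; a
# match is a hint, not a guarantee that the image will actually run.
looks_like_arm_image() {
    case "$1" in
        *rpi*|*armhf*) return 0 ;;
        *)             return 1 ;;
    esac
}

# Filter a list of image names (one per line on stdin).
filter_arm_images() {
    while IFS= read -r name; do
        if looks_like_arm_image "$name"; then
            printf '%s\n' "$name"
        fi
    done
}

# Example: only the first two of these three names pass the filter.
printf '%s\n' hypriot/rpi-java resin/rpi-raspbian ubuntu | filter_arm_images
```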

Some useful base images include resin/rpi-raspbian, which is a minimal Raspbian base, and hypriot/rpi-java, which includes OpenJDK 7. With these (or other) base images, it is easy to layer your own Dockerfile on top to create custom images. As discussed previously on this blog, OpenJDK is very slow relative to Oracle’s JVM on ARM. I’ve not yet found a public image containing Oracle’s JVM (perhaps due to licensing issues), but it’s easy enough to roll your own from a minimal base image. You can follow the standard Docker User Guides for information on how to build your own images.
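To make the idea of layering on top of a base image a little more concrete, here is a minimal sketch that writes a tiny Dockerfile and prints the commands you might then run. The build directory, the package being installed (wget) and the tag my-rpi-image are all placeholders chosen for illustration; only the base image name comes from the text above.

```shell
# Write a small Dockerfile that layers on top of the Raspbian base image.
# Assumptions (not from the post): the build directory, the installed
# package, and the tag "my-rpi-image" are placeholders.
build_dir="${BUILD_DIR:-${TMPDIR:-/tmp}/my-rpi-image}"
mkdir -p "$build_dir"

cat > "$build_dir/Dockerfile" <<'EOF'
FROM resin/rpi-raspbian
RUN apt-get update && apt-get install -y wget
CMD ["/bin/bash"]
EOF

# Print (rather than run) the docker commands, so this is safe to try anywhere.
echo "docker build -t my-rpi-image $build_dir"
echo "docker run -i -t my-rpi-image"
```

The printed commands would need to be run on the Pi itself, since the resulting image is for ARM.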

Getting started with Snappy Ubuntu Core on the Raspberry Pi 2

[UPDATE: Note that the version of Snappy described in this post is now obsolete – I have a new post describing a newer version which works better, and slightly differently.]

This post consists of a few notes which may be helpful for people trying to get started with Snappy Ubuntu Core on the new Raspberry Pi 2. First of all, note that Ubuntu requires an ARMv7 processor, so it won’t run on any model of Raspberry Pi prior to the Raspberry Pi 2 (model B), released February 2015.

The Ubuntu Core image can be downloaded from the Raspberry Pi downloads page, and written to uSD card in the usual way, just as you would a Raspbian image. From my Ubuntu laptop, I use a command like:

% sudo dd bs=1M if=pi-snappy.img of=/dev/mmcblk0

Be very careful to get the device name correct for of, as this command will completely trash the output device.
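Since a wrong device name is so destructive, it can be worth wrapping the dd call in a couple of sanity checks. The following is a cautious sketch rather than a complete safeguard: the function names are hypothetical, it assumes /proc/mounts is available, and it only prints the dd command instead of running it.

```shell
# Refuse obviously wrong targets before handing them to dd.
# Assumption: /proc/mounts is available (true on typical Linux systems).
check_target() {
    dev="$1"
    if [ ! -b "$dev" ]; then
        echo "refusing: $dev is not a block device" >&2
        return 1
    fi
    # Any mounted partition of the device (e.g. /dev/mmcblk0p1) also counts.
    if grep -q "^$dev" /proc/mounts 2>/dev/null; then
        echo "refusing: $dev (or one of its partitions) is mounted" >&2
        return 1
    fi
    return 0
}

# Dry run: only print the dd command once the target has passed the checks.
write_image() {
    img="$1"; dev="$2"
    check_target "$dev" || return 1
    echo "sudo dd bs=1M if=$img of=$dev"
}

write_image pi-snappy.img /dev/mmcblk0 || echo "nothing written"
```

These checks will not stop you from overwriting the wrong unmounted disk, so it is still worth confirming the device name by hand before running the printed command.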

Once you have an image on a uSD card, you can insert it into your Pi 2 and boot it up as usual. If you have a keyboard and display hooked up you can log in on the console, but note that Ubuntu Core can be used headless from first boot via ssh. The default username and password are both ubuntu. I ssh in to my device with a command like ssh ubuntu@raspi08.home. Ubuntu Core uses the new containerised “snappy” system for managing packages, so “apt” doesn’t work. To get an idea of how snappy works, you can read through the snappy tour from Canonical, but note that much of this tour won’t actually work on the Pi 2 right now. Here’s a console session:

ubuntu@localhost:~$ snappy info
release: ubuntu-core/devel
ubuntu@localhost:~$ snappy versions
Part         Tag   Installed  Available  Fingerprint     Active  
ubuntu-core  edge  2          -          f442b1d8d6db3f  *              
This command needs root, please run with sudo
ubuntu@localhost:~$ snappy search docker
No matching packages found: docker

You will note that Docker isn’t currently available for Ubuntu Core on the Pi 2, though hopefully that will change soon. However, at the time of writing you may also find that the final command bombs with a certificate error. This turns out to be due to the fact that the system time is incorrect. You can verify this by running date. You can manually set it with a command like

sudo date -s "Sat Feb  7 09:57:32 GMT 2015"

where you paste in the output from running date on a system which does have the correct time. You can also automate this by instead using a command like

sudo date -s "`ssh username@linuxserver date`"

where username and linuxserver are replaced appropriately. At the time of writing there are very few Snappy packages available, but one useful package is webdm. You can search for it with snappy search webdm and install it with

sudo snappy install webdm

Running snappy info will confirm that it has installed correctly. This runs a web based package manager on port 4200, so, for example, I can connect to this from a web browser on the local network using the URL http://raspi08.home:4200/. This allows the browsing of available frameworks and apps.

There currently isn’t any app or framework for Java, but manually downloading and installing Oracle’s JDK8 for ARM works fine, and runs code at the same speed as using the JVM which ships with Raspbian. It would be very easy to package up the JDK8 as a Snappy app or framework, but I guess that there are licensing issues, so I’ll leave that to others to sort out! You can find out more about how Snappy works by reading Canonical’s snappy guides.

I quite like the Snappy system, and running Ubuntu Core on a Pi 2 is potentially a great way to learn about Cloud computing in a very cheap, simple and safe way. However, we need a few key apps and frameworks before it will become genuinely useful. Ubuntu Core certainly isn’t about to replace Raspbian as the main OS for the Raspberry Pi 2 any time soon.

Benchmarking MCMC codes on the Raspberry Pi 2


In the previous post I looked at running some MCMC codes in C and Scala on the Parallella. In that post I explained how the Parallella was significantly faster than the Raspberry Pi, and how it represented better “bang for buck” than the Raspberry Pi for computationally intensive MCMC codes. However, since that post was written, the Raspberry Pi 2 has been released. This board has a much better processor than the old Pi, and double the RAM, for the same price. This changes things considerably. The processor is a quad-core ARMv7. Each core is around twice as fast as the single core on the original Pi, and there are 4 of them. In this post I will re-run the codes from the previous post and compare against the Parallella.

Gibbs sampler in C

I’m using the new Raspbian image for the Pi 2. This includes gcc by default, but not the GSL library. This can be installed with sudo apt-get install libgsl0-dev. Then the file gibbs.c from the previous post will compile and run. On the Pi 2 this runs in around 75 seconds – very similar to the time on the Parallella, and around twice as fast as all of the previous Raspberry Pis.

Gibbs sampler in Scala

The Raspbian image ships with Oracle’s fast and efficient ARM-optimised JVM by default, so there’s no issue with installing Java at all. As usual, installing “sbt” is a simple matter of copying the launcher script (and jar) into your ~/bin directory. Then the Scala version of the Gibbs sampler can be run with a command like time sbt run > /dev/null. Again, it runs in around 4 minutes 40 seconds, just like on the Parallella. So, the ARM cores on the Parallella and the Pi 2 have very similar performance. However, the Parallella ARM chip has just two cores, whereas the Pi 2 is quad core.

Parallel Monte Carlo in Scala

Again, as for the previous post, I next ran the Monte Carlo example from this github repo. This gives output like:

$ sbt run
[info] Set current project to monte-carlo (in build file:/home/pi/src/git/statslang-scala/monte-carlo/)
[info] Running MonteCarlo 
Running with 1000000 iterations
Idiomatic vectorised solution
time: 6768.504487ms
Fast efficient (serial) tail call
time: 2473.331672ms
Parallelised version
time: 1391.2828ms

Here again we see that the single threaded versions run in a similar time to the Parallella (and around twice as fast as the old Pis), but that the parallelised version runs significantly faster on the Pi 2 than on the Parallella (due to having 4 ARM cores rather than 2).
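To put a rough number on this, the parallel speed-up (serial time divided by parallel time) and the efficiency (speed-up divided by the number of cores) can be worked out from the timings. This is just back-of-the-envelope arithmetic: the function name is made up, the Pi 2 timings are taken from the output above, and the Parallella timings are from the Oracle JVM run in the earlier post.

```shell
# Parallel speed-up = serial time / parallel time;
# efficiency = speed-up / number of cores.
speedup() {
    awk -v s="$1" -v p="$2" -v n="$3" \
        'BEGIN { printf "speed-up %.2f, efficiency %.2f\n", s / p, s / p / n }'
}

speedup 2473.331672 1391.2828 4      # Pi 2, 4 cores
speedup 2410.760695 1800.324405 2    # Parallella (Oracle JVM), 2 cores
```

On these numbers the Pi 2 gets the larger speed-up overall, but a smaller share of the ideal speed-up for its core count.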


For my test MCMC codes, the cores on the Pi 2 are around twice as fast as the single core on the old Raspberry Pis, and a similar speed to the cores on the Parallella. However, multi-threaded codes run faster still, due to there being 4 cores on the Pi 2 (versus 2 on the Parallella and one on the old Pis). Furthermore, the Pi 2 is the same price as the old Pis (which are still being sold), and around a quarter of the price of the cheapest Parallella. So for standard single and multi-threaded codes running on the ARM cores, the Pi 2 wins hands down in terms of “bang for buck”, and is sufficiently quick and cheap that it starts looking like a credible platform for building cheap clusters for compute-intensive jobs like MCMC. Now to be fair to the Parallella, really the whole point of it is that it has a multi-core Epiphany co-processor that I’ve not been using or factoring in to the discussion at all so far. That said, the Pi 2 is so much cheaper than the Parallella (not to mention, less “fragile”), that I suspect that even for codes which effectively exploit the Epiphany chip it is unlikely that the Parallella will outperform the Pi 2 in terms of “bang for buck”. Now “bang per watt” is another matter entirely, and the Parallella may well outperform the Pi 2 in that regard if efficient use can be made of the Epiphany chip. But development time costs money too, and it’s really not clear that it’s going to be easy for me to run my multi-threaded Scala codes effectively on the Epiphany chip any time soon. So the Pi 2 currently looks like a real winner from my personal perspective.

MCMC on the Parallella


A very, very, very long time ago, I backed an interesting looking Kickstarter project called Parallella. The idea, somewhat inspired by the Raspberry Pi project, was to enable a small hardware company, Adapteva, to create a small (credit card sized), cheap, single board computer. The difference with the Raspberry Pi would be that this computer would have a more powerful CPU, and would also have an on-board FPGA and a 16 (or 64) core co-processor, called the Epiphany. In fact, the whole project is really a way to kick-start the production and development of the Epiphany chip, but that isn’t necessarily the main interest of the backers, many of whom are just geeky hobbyists like myself. The project was funded (I backed it at the $99 level), but then went fairly quiet for around 2 years(!) until I got a message from UK Customs telling me I had to pay import duty on my board… The board arrived earlier this year, but I’ve had lots of problems getting it going, so it’s only now that I have used it enough to warrant writing a blog post.

The tag-line of the Kickstarter project was “A Supercomputer For Everyone”, and that certainly attracted my interest. I do a lot of computationally intensive work, and have dabbled, on-and-off, with parallel, multi-core, and distributed computing for over 15 years. To have a serious parallel computer for not much more than the cost of two Raspberry Pis was just too good an opportunity to pass up on. The Parallella board looks very much like a Raspberry Pi, and is similar to set up and use, but generally a lot more flaky and fiddly – if you are new to single board computers, I would strongly recommend getting up to speed with a Raspberry Pi model B+ first! The Pi is just a bit simpler, easier, more robust, and has a large, well-organised, supportive user community. I had various issues with faulty/incompatible adaptor cables, cheap SD cards, and a home router which refused to give it an IP address… I won’t bore everyone with all of the details, but I am now in a position where I can get it to boot up and run well enough to be able to begin to put the thing through its paces.

There is a recommended Linux distro for the Parallella, based on Linaro, which, like Raspbian, is a Debian derived Linux for ARM. Just like the Pi (B+), you write the OS to a micro SD card and then use that as the boot media. Typing cat /proc/cpuinfo reveals two cores of an ARM (v7) processor. The distro comes with various dev tools pre-installed (git, make, gcc, etc.), as well as some example code to get you started, so it’s actually pretty quick to get going with once the thing boots and runs.
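If you just want the core count rather than the full listing, something like the following sketch will do. It assumes that each core shows up as a line starting with "processor" in /proc/cpuinfo, which is the usual layout on Linux; the function name and the optional file argument are only there for illustration.

```shell
# Count the "processor" entries in a cpuinfo-style file.
# Defaults to /proc/cpuinfo; the optional argument makes it easy to try
# the function on a saved copy from another machine.
count_cores() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

echo "cores: $(count_cores)"
```

On the Parallella this should report the two ARM cores mentioned above; it says nothing about the Epiphany co-processor.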

Benchmarking C code on the ARM chip

To start off with I’m just looking at the dual core ARM processor – I will start to look at the Epiphany chip as time permits. To begin with, I benchmarked the processor using my standard C-based Gibbs sampling code. For completeness, the commands used to compile and time it are given below, followed by the source code of gibbs.c:

gcc -O4 gibbs.c -lgsl -lgslcblas -lm -o gibbs
time ./gibbs >

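#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>

int main(void)
{
  int N=50000;     /* number of iterations written out */
  int thin=1000;   /* Gibbs sweeps per output iteration */
  int i,j;
  gsl_rng *r = gsl_rng_alloc(gsl_rng_mt19937);
  double x=0;
  double y=0;
  printf("Iter x y\n");
  for (i=0;i<N;i++) {
    for (j=0;j<thin;j++) {
      /* draw x given y (gsl_ran_gamma takes shape, then scale) */
      x=gsl_ran_gamma(r,3.0,1.0/(y*y+4));
      /* draw y given x: mean 1/(x+1), standard deviation 1/sqrt(2x+2) */
      y=1.0/(x+1)+gsl_ran_gaussian(r,1.0/sqrt(2*x+2));
    }
    printf("%d %f %f\n",i,x,y);
  }
  gsl_rng_free(r);
  return 0;
}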

This code requires the GSL, which isn’t installed by default, but a simple sudo apt-get install libgsl0-dev soon fixes that (I love Debian-derived Linux distros). The code runs in about 75 seconds on (a single core of) the Parallella. This is about 10 times slower than my (very fancy) laptop. However, it’s about twice as fast as the Pi (which only has one core), and around the same speed as my (Intel Atom based) netbook. So, given that the Parallella has two cores, this very superficial and limited experiment suggests that the ARM chip on the Parallella is around 4 times as powerful as the ARM chip in the Raspberry Pi. Of course, this is ignoring the FPGA and the Epiphany chip, but even without these, you are arguably getting more bang for your buck (and more importantly, more bang for your Watt) with the Parallella than you do with the Pi (though without the FPGA and Epiphany, there wouldn’t be much in it).

Scala code

These days I use Scala in preference to C whenever I possibly can. So, I was also interested to see how well Scala runs on this board. Scala is JVM based, and the Parallella distro has the OpenJDK JVM pre-installed, so if you are an sbt person (and if you use Scala you should be), it’s just a matter of copying the sbt launcher to ~/bin and you are good to go. Again, for completeness, a simple (non-idiomatic!) Scala program equivalent to the C version above is given below:


time sbt run > gibbs.dat

object Gibbs2 {
    import java.util.Date
    import scala.math.sqrt
    import breeze.stats.distributions._
    def main(args: Array[String]) {
        val N=50000
        val thin=1000
        var x=0.0
        var y=0.0
        println("Iter x y")
        for (i <- 0 until N) {
            for (j <- 0 until thin) {
                x=new Gamma(3.0,y*y+4).draw
                y=new Gaussian(1.0/(x+1),1.0/sqrt(2*x+2)).draw
            }
            println(i+" "+x+" "+y)
        }
    }
}

An appropriate build.sbt file for resolving dependencies is given below:

name := "Gibbs2"

version := "0.1"

scalacOptions ++= Seq("-unchecked", "-deprecation", "-feature")

libraryDependencies  ++= Seq(
            "org.scalacheck" %% "scalacheck" % "1.11.4" % "test",
            "org.scalatest" %% "scalatest" % "2.1.7" % "test",
            "org.scalanlp" %% "breeze" % "0.10",
            "org.scalanlp" %% "breeze-natives" % "0.10"
)

resolvers ++= Seq(
            "Sonatype Snapshots" at "",
            "Sonatype Releases" at ""
)

scalaVersion := "2.11.1"

This Scala version of the code takes around 30 minutes to complete on the Parallella! This is pretty dire, but entirely expected. Scala runs on the JVM, and the OpenJDK JVM is known to perform very poorly on ARM architectures. If you want to run computationally intensive JVM code on ARM, you have no choice but to get the Oracle JDK8 for ARM. Although this isn’t real “free” software, it can be downloaded and used for free for personal use.

Using Oracle’s JVM

Download and unpack Oracle’s JVM somewhere sensible on your system. Then update your sbt launcher script to point at the new java binary location. You may also want to add a javaHome line to your sbt build script. Once this is done, using sbt is just as normal, but you get to use Oracle’s pretty fast ARM JVM. Running the above example with the new JVM takes around 4 minutes 40 seconds on my Parallella. This is more than 5 times faster than the OpenJDK JVM, and it is a similar story on the Raspberry Pi as well. But it still isn’t blazingly fast. On Intel this code runs within a factor of 2 of the C version. Here it is closer to a factor of 4 slower. But again, this is very similar to the situation on the Pi. The Oracle JVM is good, but for whatever reason, the ARM JVM doesn’t get as close to the speed of native code as it does on x86.
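As a rough illustration of what pointing the launcher at the new java binary might look like, here is a sketch of a launcher function. The JDK location /opt/jdk1.8.0 and the position of sbt-launch.jar next to the script are placeholders, not the paths I actually used, so adjust both for your own system.

```shell
# Sketch of an sbt launcher that uses a specific JVM.
# Assumptions (placeholders, not from the post): the JDK location
# /opt/jdk1.8.0 and an sbt-launch.jar sitting next to this script.
JAVA_BIN="${JAVA_BIN:-/opt/jdk1.8.0/bin/java}"
SBT_JAR="${SBT_JAR:-$(dirname "$0")/sbt-launch.jar}"

launch_sbt() {
    if [ ! -x "$JAVA_BIN" ]; then
        echo "java binary not found at $JAVA_BIN" >&2
        return 1
    fi
    # Hand all arguments (e.g. "run") straight through to sbt.
    "$JAVA_BIN" -jar "$SBT_JAR" "$@"
}
```

Saved as ~/bin/sbt with a final line of launch_sbt "$@", this would behave like the usual launcher but with the JVM of your choice.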

A simple Monte Carlo example in Scala

I recently gave a talk on using Scala for statistical computing, and for that I prepared slides and some code examples. One of the examples was a simple Monte Carlo example, coded in several different ways, including one version which would automatically exploit multiple cores. There is a github repo containing the slides of the talk and all of the code examples. Running the Monte Carlo example (with 10^6 iterations) using the OpenJDK JVM produces the following results:

$ sbt run
[info] Set current project to monte-carlo (in build file:/home/linaro/git/statslang-scala/monte-carlo/)
[info] Running MonteCarlo 
Running with 1000000 iterations
Idiomatic vectorised solution
time: 37922.050389ms
Fast efficient (serial) tail call
time: 25327.705376ms
Parallelised version
time: 15633.49492ms

Again, using the OpenJDK leads to poor performance, but the important thing to note is that the multi-core version is around 1.6 times as fast as the fastest single threaded version, due to the Parallella’s ARM chip having 2 cores. Re-running with the Oracle JVM leads to much better results:

$ sbt run
[info] Set current project to monte-carlo (in build file:/home/linaro/git/statslang-scala/monte-carlo/)
[info] Running MonteCarlo 
Running with 1000000 iterations
Idiomatic vectorised solution
time: 8214.310728ms
Fast efficient (serial) tail call
time: 2410.760695ms
Parallelised version
time: 1800.324405ms

Here, the Oracle JVM is between roughly 5 and 10 times faster than the OpenJDK JVM, depending on the example, and again the parallel version runs faster than the serial version (by a factor of around 1.3).
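For anyone wanting to make the same comparison on their own board, the timings can be pulled out of saved output and turned into ratios with a few lines of shell. This is only a sketch: the function names are made up, and it assumes each result is printed on a line of the form "time: <number>ms", as in the output above.

```shell
# Pull the millisecond timings out of saved "sbt run" output.
# Assumption: each result appears on a line of the form "time: <number>ms".
extract_times() {
    sed -n 's/^time: \([0-9.]*\)ms.*/\1/p' "$1"
}

# Ratio of two timings, to two decimal places.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Timings copied from the two runs above (OpenJDK first, then Oracle).
ratio 37922.050389 8214.310728   # idiomatic vectorised solution
ratio 25327.705376 2410.760695   # fast efficient (serial) tail call
ratio 15633.49492  1800.324405   # parallelised version
```

With output saved to a file, extract_times prints one timing per line, in the same order as the examples.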

A Parallella gotcha to be aware of here is that I got weird certificate errors when cloning my github repos, which turned out to be caused by an incorrect system clock. Like the Pi, the Parallella doesn’t have an RTC chip built in, but unlike the Pi, it doesn’t always seem to automatically sync the time when an internet connection comes up. Manually running ntpdate fixed the date and then git worked fine…
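A quick way to spot this problem before it shows up as a confusing certificate error is to check whether the clock reads earlier than some date known to be in the past. The sketch below uses 1 January 2015 as an arbitrary reference and assumes that date +%s is supported (it is on GNU, BSD and BusyBox date); the function name is hypothetical.

```shell
# Treat the clock as suspect if it reads earlier than a known reference date.
# Assumption: 1420070400 (2015-01-01 00:00:00 UTC) is used as the reference;
# any date safely in the past would do.
clock_looks_wrong() {
    now="${1:-$(date +%s)}"
    [ "$now" -lt 1420070400 ]
}

if clock_looks_wrong; then
    echo "system clock looks wrong: $(date)"
    echo "try running ntpdate against a time server of your choice"
else
    echo "system clock looks plausible: $(date)"
fi
```

A board without an RTC typically boots with a date years in the past, which is exactly the case this catches.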


The Parallella packs a lot of processing power in a very small, low power form-factor. Even just using the main ARM chip, it is a better starting point for computationally intensive workflows than the Raspberry Pi (provided that you can get it to work!). If you can find a way to exploit the Epiphany chip, things will look even better. Java and other JVM languages (e.g. Scala) are fine on the Parallella, but only if you use Oracle’s JVM, otherwise you are going to have very poor performance. My next task is to have a play with the Epiphany examples that are bundled with the Parallella distro. At the moment it seems like the best way to use the Epiphany chip is via C+OpenCL. That involves a degree of masochism I tend not to subject myself to very often these days, so it may have to wait until I’m in an appropriate mood!