Setting up a standalone Apache Spark cluster of Raspberry Pi 2

In the previous post I explained how to install Apache Spark in “local” mode on a Raspberry Pi 2. In this post I will explain how to link together a collection of such nodes into a standalone Apache Spark cluster. Here, “standalone” refers to the fact that Spark is managing the cluster itself, and that it is not running on top of Hadoop or some other cluster management solution.

I will assume at least two Raspberry Pi 2 nodes on the same local network, with identical Spark distributions installed in the same directory of the same user account on each node. See the previous post for instructions on how to do this. I will use two such nodes, with Spark installed under the user account spark.

First, decide which of your nodes is to be the master. I have two nodes, raspi08 and raspi09, and I will set up raspi08 as the master. The spark account on the master needs to be able to SSH into the same account on all of the other nodes without being prompted for a password, so it makes sense to begin by setting up passwordless SSH. Log in to the spark account on the master and generate SSH keys with:

ssh-keygen

Just press return when asked for a passphrase to keep the key passphrase-free. Then copy the identity to each of the other nodes. eg. to copy it to raspi09 I use:

ssh-copy-id spark@raspi09

You will obviously need to provide a password at this point. Once the identity is copied, SSH into each node to ensure that you can indeed connect without the need for a password.
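
A quick way to check them all in one go is a throwaway loop like the following (just a sketch using the two node names from my setup; if any node prompts for a password, the key copying didn't work):

# print each node's hostname over SSH; no password prompts should appear
for host in raspi08 raspi09; do
  ssh spark@${host} hostname
done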

Once passwordless SSH is set up and working, log into the master node to configure the Spark installation. Within the conf/ directory of the Spark installation, create a file called slaves containing the hostnames of all the nodes you want to use as “workers”, one per line. eg. mine looks like:

raspi08
raspi09

Note that in my case raspi08 is listed as a worker despite also acting as the master. That is perfectly fine. If you have plenty of nodes you might not want to do this, as the Pi 2 doesn’t really have enough RAM to run a master and a worker comfortably side by side, but since I have only two nodes, it seems like a good idea here.

Also within the conf/ directory, take a copy of the environment template:

cp spark-env.sh.template spark-env.sh

and then edit spark-env.sh according to your needs. I solved some cryptic Akka errors about workers not being able to connect back to the master by hardcoding the IP address of the master node into the config file:

SPARK_MASTER_IP=192.168.1.177

You can find out the IP address of your node by running ifconfig. You should also set the memory that can be used by each worker node. This needs a little thought, as RAM is tight on the Pi 2. I went for 512MB, leaving nearly half a gig for the OS.

SPARK_WORKER_MEMORY=512m
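
Putting those two settings together, the relevant part of my spark-env.sh looks something like this (the IP address is obviously specific to my network, so substitute your own master's address):

# spark-env.sh (excerpt)
# bind the master to an explicit IP address to avoid Akka connection problems
SPARK_MASTER_IP=192.168.1.177
# cap the memory available to each worker, since RAM is tight on the Pi 2
SPARK_WORKER_MEMORY=512m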

Once you are done, the edited environment file (but not the slaves list) needs to be copied to each worker. eg.

scp spark-env.sh spark@raspi09:spark-1.3.0-bin-hadoop2.4/conf/

You shouldn’t need to supply a password…
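
If you have more than a couple of workers, a quick loop over the slaves file saves some typing. A rough sketch (run from the top of the Spark installation, assuming the same directory layout on every node, and skipping the master itself, where the file already lives):

# push the environment settings to every node listed in conf/slaves
for host in $(cat conf/slaves); do
  [ "$host" = "$(hostname)" ] && continue
  scp conf/spark-env.sh spark@${host}:spark-1.3.0-bin-hadoop2.4/conf/
done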

At this point, you should be ready to bring up the cluster. You can bring up the master and workers all in one go with:

sbin/start-all.sh

When the master comes up, it starts a web service on port 8080. eg. I connect to it from any machine on the local network by pointing my browser at: http://raspi08:8080/

If a web page comes up, then the master is running. The page gives other diagnostic information, including a list of the workers that have been brought up and registered with the master, together with links to plenty of debugging info. If everything looks OK, make a note of the URL for the Spark master, which is displayed in large text at the top of the page. It should be something like spark://192.168.1.177:7077, where the IP address is that of your master node.

Try bringing up a Spark shell on the master with:

bin/spark-shell --master spark://192.168.1.177:7077

Once the shell comes up, go back to your web browser and refresh the page; the shell should now be listed as a running application. Then go back to the shell and try a simple test like:

sc.textFile("README.md").count

As usual, it is Ctrl-D to exit the shell. To bring down the cluster, use

sbin/stop-all.sh

Once you are happy that everything is working as it should, you probably want to reduce the amount of diagnostic debugging info that is echoed to the console. Do this by going back into the conf/ directory and copying the log4j template:

cp log4j.properties.template log4j.properties

and then editing log4j.properties. There is a line near the beginning of the file:

log4j.rootCategory=INFO, console

Change INFO to WARN so it reads:

log4j.rootCategory=WARN, console

Then when you next bring up the cluster, everything should be a bit less noisy.

That’s it. You’ve built a Spark cluster! Note that when accessing files from Spark scripts (and applications) it is assumed that the file exists in the same directory on every worker node. For testing purposes, it is easy enough to use scp before running the script to copy the files to all of the workers. But that is obviously somewhat unsatisfactory in the long term. Another possibility is to set up an NFS file server and mount it at the same mount point on each worker. Then make sure that any files you access are shared via the NFS file server. Even that solution isn’t totally satisfactory, due to the slow interconnect on the Pi 2. Ultimately, it would be better to set up a proper distributed file system such as Hadoop’s HDFS on your cluster and then share files via HDFS. That is how most production Spark clusters are set up. I may look at that in another post, but in the meantime check out the Spark standalone documentation for further information.
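
As a rough illustration of the NFS option, each node would need the NFS client tools and the same mount point, along the lines of the following (a sketch only; the server name and export path here are entirely made up):

# install NFS client support and mount a shared data directory
sudo apt-get install nfs-common
sudo mkdir -p /mnt/data
sudo mount -t nfs fileserver:/export/data /mnt/data

Any file placed under /mnt/data is then visible at the same path on every worker.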

Benchmarking MCMC codes on the Raspberry Pi 2

Introduction

In the previous post I looked at running some MCMC codes in C and Scala on the Parallella. In that post I explained how the Parallella was significantly faster than the Raspberry Pi, and how it represented better “bang for buck” than the Raspberry Pi for computationally intensive MCMC codes. However, since that post was written, the Raspberry Pi 2 has been released. This board has a much better processor than the old Pi, and double the RAM, for the same price. This changes things considerably. The processor is a quad-core ARM Cortex-A7. Each core is around twice as fast as the single core on the original Pi, and there are 4 of them. In this post I will re-run the codes from the previous post and compare against the Parallella.

Gibbs sampler in C

I’m using the new Raspbian image for the Pi 2. This includes gcc by default, but not the GSL library, which can be installed with sudo apt-get install libgsl0-dev. The file gibbs.c from the previous post will then compile and run. On the Pi 2 this runs in around 75 seconds – very similar to the time on the Parallella, and around twice as fast as on all of the previous Raspberry Pis.
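
For anyone wanting to reproduce this, the steps look something like the following (a sketch: the compiler flags are just sensible defaults rather than anything special):

# install the GSL development headers and library
sudo apt-get install libgsl0-dev
# compile the Gibbs sampler from the previous post, linking against GSL
gcc -O3 gibbs.c -lgsl -lgslcblas -lm -o gibbs
# time a run, discarding the output
time ./gibbs > /dev/null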

Gibbs sampler in Scala

The Raspbian image ships with Oracle’s fast and efficient ARM-optimised JVM by default, so there is no need to install Java separately. As usual, installing “sbt” is a simple matter of copying the launcher script (and jar) into your ~/bin directory. Then the Scala version of the Gibbs sampler can be run with a command like time sbt run > /dev/null. Again, it runs in around 4 minutes 40 seconds, just like on the Parallella. So, the ARM cores on the Parallella and the Pi 2 have very similar performance. However, the Parallella ARM chip has just two cores, whereas the Pi 2 is quad core.
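
For completeness, the launcher script in ~/bin can be as minimal as the following (a sketch; the jar name depends on where you downloaded the launcher from, and the script needs to be made executable with chmod +x):

#!/bin/sh
# ~/bin/sbt: a minimal wrapper around the sbt launcher jar
java -jar "$HOME/bin/sbt-launch.jar" "$@"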

Parallel Monte Carlo in Scala

Again, as in the previous post, I next ran the Monte Carlo example from this github repo. This gives output like:

$ sbt run
[info] Set current project to monte-carlo (in build file:/home/pi/src/git/statslang-scala/monte-carlo/)
[info] Running MonteCarlo 
Running with 1000000 iterations
Idiomatic vectorised solution
1.00136
time: 6768.504487ms
Fast efficient (serial) tail call
1.000705
time: 2473.331672ms
Parallelised version
1.0007
time: 1391.2828ms
Done
[success] 

Here again we see that the single threaded versions run in a similar time to the Parallella (and around twice as fast as the old Pis), but that the parallelised version runs significantly faster on the Pi 2 than on the Parallella (due to having 4 ARM cores rather than 2).

Summary

For my test MCMC codes, the cores on the Pi 2 are around twice as fast as the single core on the old Raspberry Pis, and a similar speed to the cores on the Parallella. However, multi-threaded codes run faster still, due to there being 4 cores on the Pi 2 (versus 2 on the Parallella and one on the old Pis). Furthermore, the Pi 2 is the same price as the old Pis (which are still being sold), and around a quarter of the price of the cheapest Parallella. So for standard single- and multi-threaded codes running on the ARM cores, the Pi 2 wins hands down in terms of “bang for buck”, and is sufficiently quick and cheap that it starts looking like a credible platform for building cheap clusters for compute-intensive jobs like MCMC. Now, to be fair to the Parallella, the whole point of it is that it has a multi-core Epiphany co-processor that I’ve not been using or factoring into the discussion at all so far. That said, the Pi 2 is so much cheaper than the Parallella (not to mention less “fragile”) that I suspect that even for codes which effectively exploit the Epiphany chip it is unlikely that the Parallella will outperform the Pi 2 in terms of “bang for buck”. Now “bang per watt” is another matter entirely, and the Parallella may well outperform the Pi 2 in that regard if efficient use can be made of the Epiphany chip. But development time costs money too, and it’s really not clear that it’s going to be easy for me to run my multi-threaded Scala codes effectively on the Epiphany chip any time soon. So the Pi 2 currently looks like a real winner from my personal perspective.