My first idea was to get a few Beaglebones so that I could build multiple versions at any given time. This worked well, but it still took 20+ hours to get a testable output. If only there were a way to get these Beaglebones to help each other out... ah, but there is!
Enter distcc. Distcc is nice in that it is simply a wrapper around your existing build commands, which makes it easy to set up and use. It works by inserting itself between make and the compiler (gcc or g++), which it accomplishes through the standard CC and CXX variables that make accepts on the command line. By redirecting gcc and g++ commands through distcc, you can get a parallel build across machines using the normal make -j option.
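To see the redirection without touching a real project, here is a minimal sketch that asks make to print the compile command it would run once CC is overridden. The function name show_compile_command, the throwaway hello.c, and the one-rule Makefile are all made up for this illustration; make -n only prints commands, so distcc does not need to be installed to try it.

```shell
# Sketch: print the command make would run when CC is overridden.
# hello.c and the Makefile are hypothetical files created in a temp directory.
show_compile_command() {
    dir=$(mktemp -d) || return 1
    printf 'int main(void) { return 0; }\n' > "$dir/hello.c"
    # One explicit rule that compiles with whatever $(CC) happens to be
    printf 'hello.o: hello.c\n\t$(CC) -c hello.c -o hello.o\n' > "$dir/Makefile"
    # -n is a dry run: the recipe is printed, not executed
    (cd "$dir" && make -n CC="$1" hello.o)
    status=$?
    rm -rf "$dir"
    return "$status"
}

# Example call:
#   show_compile_command distcc
```

The same mechanism applies to CXX for the C++ sources. The compiler can also be named explicitly, as in CC="distcc gcc", if you would rather not rely on the default.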
Overview
- Configure hardware
- Install image
- Install packages
- Configure Main build client
- Configure compile hosts
- Run
Hardware
- 8 Beaglebone Blacks (2 Rev. A6A, 1 Rev. B, 3 Rev. C, 2 Element 14 Rev. C)
- 16 port Ethernet Switch (N-tron 516TX)
- 50 Watt 5V DC Power Supply (Omron or Astrodyne)
- DHCP Server (Asus RT-N56U)
Software Installation
Use the default operating system image or download a Debian console eMMC flasher such as BBB-eMMC-flasher-debian-7.7-console-armhf-2014-10-29-2gb.img.xz, found at the elinux.org site. Add the required tools:
> sudo apt-get install distcc distcc-pump g++ make
Compile Hosts Configuration
Choose one Beaglebone to set aside as the main build client. Ideally this would be the Beaglebone with the most memory (RAM and filesystem). The other seven "workers" will be referred to as compile hosts. In order to simplify the calling of distcc on the main build client, we need to give our workers hostnames. Run the following three commands on each of them:
> sudo nano /etc/hostname
> sudo nano /etc/hosts
> sudo hostname boris
In the 'hostname' and 'hosts' files, replace the default hostname (usually "beaglebone") with the desired hostname, which in the example above would be "boris".
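With seven workers to rename, the two file edits can also be scripted instead of done by hand in nano. This is only a sketch: rename_host is a made-up helper, it assumes the old name appears in the hosts file exactly as typed, and the optional third argument exists so you can try it on copies of the files first.

```shell
# rename_host OLD NEW [ETC_DIR]
# Hypothetical helper: writes NEW into ETC_DIR/hostname and replaces
# every occurrence of OLD with NEW in ETC_DIR/hosts.
rename_host() {
    old=$1 new=$2 etc=${3:-/etc}
    printf '%s\n' "$new" > "$etc/hostname" || return 1
    tmp=$(mktemp) || return 1
    # Write through a temp file, then copy back so the original
    # file keeps its owner and permissions
    sed "s/$old/$new/g" "$etc/hosts" > "$tmp" && cat "$tmp" > "$etc/hosts"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example (as root on a worker):
#   rename_host beaglebone boris
```

The running hostname still has to be changed with the hostname command shown above, or by rebooting.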
A reboot will force an update to the hostname, but before we do that, we need to tell distcc to start a service at boot and to allow local network traffic. Edit the first few lines of the /etc/default/distcc config file:
> sudo nano /etc/default/distcc
to look something like this:
# Defaults for distcc initscript
# sourced by /etc/init.d/distcc
#
# should distcc be started on boot?
#
# STARTDISTCC="true"
STARTDISTCC="true"
#
# Which networks/hosts should be allowed to connect to the daemon?
# You can list multiple hosts/networks separated by spaces.
# Networks have to be in CIDR notation, f.e. 192.168.1.0/24
# Hosts are represented by a single IP Adress
#
# ALLOWEDNETS="127.0.0.1"
ALLOWEDNETS="192.168.0.0/24"
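If you would rather not open an editor on every worker, the same two settings can be changed with sed. A minimal sketch follows; configure_distccd is a hypothetical helper name, and it assumes the file already contains uncommented STARTDISTCC= and ALLOWEDNETS= lines, as the stock Debian file does.

```shell
# configure_distccd FILE NETWORK
# Hypothetical helper: turns on the daemon and sets the allowed network.
# Commented-out example lines in the file are left alone.
configure_distccd() {
    file=$1 net=$2
    tmp=$(mktemp) || return 1
    # "|" is used as the sed delimiter because NETWORK contains a "/"
    sed -e 's|^STARTDISTCC=.*|STARTDISTCC="true"|' \
        -e "s|^ALLOWEDNETS=.*|ALLOWEDNETS=\"$net\"|" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example (as root on a worker):
#   configure_distccd /etc/default/distcc 192.168.0.0/24
```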
Reboot to commit the changes.
Main Build Client Configuration
As on the workers, we need to give the master a hostname. Two of the commands are identical:
> sudo nano /etc/hostname
> sudo hostname pluto
but we need to add the workers, in addition to the master hostname, to the hosts file:
> sudo nano /etc/hosts
Which, if your master unit is named "pluto", will look something like this:
127.0.0.1 localhost
127.0.1.1 pluto
192.168.0.106 droopy
192.168.0.71 astro
192.168.0.232 dogbert
192.168.0.9 scooby
192.168.0.100 underdog
192.168.0.139 snoopy
192.168.0.180 goofy
192.168.0.41 peabody
192.168.0.20 brian
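Typing a dozen address/name pairs by hand invites typos, so here is a small sketch for appending them from a script. The helper name add_host_entry is made up for this example. It skips a name that is already mapped, so it can be re-run safely.

```shell
# add_host_entry FILE IP NAME
# Hypothetical helper: appends "IP NAME" to FILE unless a non-comment
# line already lists NAME.
add_host_entry() {
    file=$1 ip=$2 name=$3
    if awk -v n="$name" '
        !/^[[:space:]]*#/ {
            for (i = 2; i <= NF; i++) if ($i == n) found = 1
        }
        END { exit !found }
    ' "$file"; then
        return 0    # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Example (as root on the master):
#   add_host_entry /etc/hosts 192.168.0.106 droopy
#   add_host_entry /etc/hosts 192.168.0.71 astro
```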
That's it! We are now ready to run distcc.
Run distcc
Before we run distcc, we need to set up a couple of environment variables. The first variable is handy if you are having trouble talking to workers, or would like more feedback from distcc:
> export DISTCC_VERBOSE=1
I found the extra output helpful in diagnosing issues.
The following variable must be set again any time the list of worker names changes. The order matters mainly if you include the master in the worker list. Due to the limited resources on the Beaglebone, I chose not to allow normal building on the master.
> export DISTCC_POTENTIAL_HOSTS='astro dogbert snoopy underdog droopy scooby goofy peabody brian'
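Since the worker names already live in the hosts file, the list can be derived from it rather than retyped. A minimal sketch, assuming one name per IPv4 line as in the hosts file above; worker_hosts is a made-up function name. Loopback entries and the master are left out, in keeping with the choice not to build on the master.

```shell
# worker_hosts HOSTS_FILE MASTER
# Hypothetical helper: prints the worker names from HOSTS_FILE on one
# line, separated by spaces, in file order.
worker_hosts() {
    hosts_file=$1 master=$2
    awk -v master="$master" '
        # Keep IPv4 entries only (drops comments, blank lines, IPv6)
        $1 !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { next }
        # Drop loopback addresses such as 127.0.0.1 and 127.0.1.1
        $1 ~ /^127\./ { next }
        # Drop the master if it is listed under its LAN address
        $2 == master { next }
        { printf "%s%s", sep, $2; sep = " " }
        END { print "" }
    ' "$hosts_file"
}

# Example:
#   export DISTCC_POTENTIAL_HOSTS="$(worker_hosts /etc/hosts pluto)"
```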
We can now call distcc, but instead of calling it directly, we are going to use the distcc-pump tool. Distcc-pump parses the "DISTCC_POTENTIAL_HOSTS" variable and automatically configures and starts the appropriate distcc services.
Most of the websites that show examples for distcc show something like this:
> distcc-pump make -j12 CC="distcc"
which works, but when compiling large projects that are written in both C and C++, like Qt, that command only compiles SOME of the code on the cluster. Because CXX is left untouched, the C++ sources are still compiled locally on the master. Needless to say, this is bad, and the master Beaglebone dies a quick death from running out of memory.
The workaround took me a while to figure out, but is really simple. Because distcc is smart enough to figure out which compiler to use, call this line instead:
> distcc-pump make -j12 CC=distcc CXX=distcc
Now that you have a working compiler, you can play around with the -j12 parameter to get the best results. Many distcc examples claim that this parameter could be very large, but that does not work for compiling a large project like Qt. The main choke point is the amount of RAM and the huge memory requirements for the include server.
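One way to play around with the parameter is to time a few values back to back. The helper below is only a sketch: sweep_jobs is a made-up name, it does not clean the build tree between runs (you would need to do that yourself for a fair comparison), and for something the size of Qt it makes more sense to point it at a smaller sub-project than at the full build.

```shell
# sweep_jobs "J1 J2 ..." COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND once per job count, appending -jN,
# and reports the exit status and wall-clock seconds for each run.
sweep_jobs() {
    jobs_list=$1
    shift
    for j in $jobs_list; do
        start=$(date +%s)
        "$@" "-j$j"
        status=$?
        end=$(date +%s)
        echo "-j$j: exit=$status, $((end - start)) s"
    done
}

# Example:
#   sweep_jobs "8 12 16" distcc-pump make CC=distcc CXX=distcc
```

A run that dies partway through shows up as a non-zero exit status, which is a hint that the job count is too high for the master's RAM.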
Building Qt and Optimizations
As I alluded to earlier, I did manage to compile Qt on the cluster, but it wasn't "clean". After a half hour of compiling, the master Beaglebone runs out of memory. It's not that big of a deal, because you can just run the distcc-pump command again, but I wanted to see if I could help it out. To do this, I created 1 GB of swap space on a micro SD card. This helped, but only extended the build time to an hour before the memory was full and the Beaglebone started spending most of its time paging memory.
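For reference, here is one way such swap space could be set up, sketched as a swap file. The path /media/sdcard/swapfile is a placeholder for wherever the micro SD card is mounted, and make_swap and run are made-up helper names. Setting DRY_RUN=1 prints the commands instead of running them, which is a good first step before running anything as root.

```shell
# run COMMAND [ARGS...]
# Hypothetical helper: prints the command when DRY_RUN=1, otherwise runs it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# make_swap FILE SIZE_MB
# Creates a zero-filled file, restricts its permissions, formats it as
# swap, and enables it. Stops at the first step that fails.
make_swap() {
    file=$1 size_mb=$2
    run dd if=/dev/zero of="$file" bs=1M count="$size_mb" &&
    run chmod 600 "$file" &&
    run mkswap "$file" &&
    run swapon "$file"
}

# Example (as root; the path is a placeholder):
#   make_swap /media/sdcard/swapfile 1024
```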
The ultimate solution was to use a master with more memory. I chose a Wandboard i.MX6 paired with a SATA HD, which worked well.
Here are the resultant compile times for various configurations:
- Single Beaglebone: ~19-20 hours
- Beaglebone Master + 7 Beaglebone Workers: ~5-6 hours
- Beaglebone Master + 9 Beaglebone Workers: ~5 hours
- Wandboard Master + 8 Beaglebones: <4 hours
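To put those times in perspective, the speedup over a single Beaglebone is simple division. The sketch below uses range midpoints as stand-ins (19.5 hours for "~19-20", 5.5 for "~5-6") and treats "<4" as 4, so these inputs are my own approximations rather than measured values. The speedup function name is made up for this example.

```shell
# speedup BASELINE_HOURS CLUSTER_HOURS
# Hypothetical helper: prints the speedup factor with one decimal place.
speedup() {
    awk -v b="$1" -v t="$2" 'BEGIN { printf "%.1f\n", b / t }'
}

# Midpoint estimates taken from the list above (assumed, not measured)
echo "7 workers:             $(speedup 19.5 5.5)x"
echo "9 workers:             $(speedup 19.5 5)x"
echo "Wandboard + 8 workers: $(speedup 19.5 4)x"
```

That works out to roughly 3.5x, 3.9x, and 4.9x. The last figure is a lower bound, since the Wandboard build finished in under 4 hours.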