
Use GNU Parallel to Speed Up Script Execution on Multiple Cores and/or Machines

I attended BarCamp Chiang Mai 5 last weekend. Many sessions were related to project management, business apps, and web development, but there were also a few embedded-systems sessions dealing with subjects such as Arduino (showing how to blink an LED…) and the IOIO board for Android, as well as some Linux-related sessions.

The most useful talk I attended was about “GNU Parallel”, a command-line tool that can dramatically speed up time-consuming tasks by spreading them across multiple cores and/or local machines on a LAN. The session was presented by the developer himself, Ole Tange.

This tool is used for intensive data processing tasks such as DNA sequencing analysis (bioinformatics), but it might also be possible to use GNU Parallel to shorten the time it takes to build binaries. Make already does a good job of distributing compilation tasks across several cores, and distcc can extend that to multiple machines, so parallel could potentially be used at the beginning of the build (e.g. decompressing multiple files) and after the build (e.g. compressing multiple packages). I haven’t figured out a good way to use it for this task yet, so today I’ll just give an introduction to GNU Parallel, some examples, and links to useful resources.

There is a “parallel” utility available in the Ubuntu repositories (apt-get install moreutils). I tried it, and it did not work with the samples, for a good reason:

In February 2009 I tried getting parallel added to the package moreutils. The author never replied to the email or the two reminders, but in June 2009 moreutils chose to add another program called parallel. This choice leads to some confusion even today.

So if you want to give GNU Parallel a try, you’ll have to download the source and install it. I installed the “Chiang Mai” release (22/6/2012):

wget http://mirrors.ispros.com.bd/gnu/parallel/parallel-20120622.tar.bz2
tar xjvf parallel-20120622.tar.bz2 
cd parallel-20120622/
./configure
make
sudo make install

You’d better make sure you’re going to use the correct version of parallel, and not the one from moreutils:

/usr/local/bin/parallel --version
GNU parallel 20120622
...

The best way to get started is probably to watch the 3 tutorial YouTube videos. There is also a man page, but it is pretty massive: it would be 53 pages long if printed. GNU Parallel has its own syntax (the same as xargs), so you may need to spend some time familiarizing yourself with the tool.
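The syntax will feel familiar if you have used xargs: arguments arrive either on standard input or after a “:::” separator, and “{}” is replaced by the current argument. A minimal sketch to try it out (this assumes GNU Parallel is installed and on the PATH):

```shell
# Pass arguments on stdin (xargs style); {} is the replacement string
printf '%s\n' one two three | parallel echo got {}

# Or pass them on the command line after the ::: separator
parallel echo got {} ::: one two three

# Both forms run 'echo got one', 'echo got two' and 'echo got three',
# with one job per CPU core by default (output order may vary).
```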

In the first video (shown below), Ole gives the instructions to install parallel, and compares the time taken to compress multiple files directly with gzip against the time taken with parallel + gzip, first on a dual-core machine and then on multiple machines.

Using “gzip -1 *” on some log files takes about 20 seconds, but after decompressing the files and using GNU Parallel instead:

ls | parallel gzip -1

it only takes about 10 seconds, as both cores are used. Decompressing the files takes about the same time whether or not you use parallel, since that task is mainly bound by disk I/O.
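A rough pure-shell sketch of what “ls | parallel gzip -1” automates, using background jobs and wait (the file names are made up for illustration; parallel additionally caps concurrency at the number of cores and manages job slots for any number of files):

```shell
#!/bin/sh
# Create two sample log files (hypothetical names, for illustration only)
printf 'line\n%.0s' $(seq 1000) > a.log
printf 'line\n%.0s' $(seq 1000) > b.log

# Compress both at once with shell background jobs, then wait for both.
# This is roughly what 'ls | parallel gzip -1' automates, minus the
# job-slot accounting that limits concurrency to the number of cores.
gzip -1 a.log &
gzip -1 b.log &
wait

ls a.log.gz b.log.gz
```

With only two files and two cores the background-job trick works, but with hundreds of files it would start them all at once, which is exactly the scheduling problem parallel solves.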

It’s also possible to display task progress on each core. This example shows how to re-compress gz files into bz2 with parallel while outputting the progress and an ETA:

ls *.gz | parallel -j+0 --eta 'zcat {} | bzip2 -9 >{.}.bz2'

“-j+0” tells parallel to run one job per CPU core, and “--eta” is used to show the progress:

Computers / CPU cores / Max jobs to run
1:local / 2 / 2

Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
ETA: 1s 1left 0.40avg  local:2/15/99%/0.5s

You can also use parallel on multiple machines. First, install GNU Parallel on all computers and set up the machines so that they can be accessed over ssh without a password, using public/private key authentication instead.

The task is the same, but the command line becomes a little more complicated:

ls *.gz |  time /usr/local/bin/parallel -j+0 --eta -S192.168.0.101,: --transfer --return {.}.bz2 --cleanup 'zcat {} | bzip2 -9 >{.}.bz2'

“-S” is used to specify the list of servers (“:” being the localhost), “--transfer” copies the input files to the remote machines, “--return” copies the resulting bz2 files back to the localhost, and “--cleanup” deletes the working files from the other machines.
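The “{.}” in the command above is another replacement string: the input with its last extension removed, which is how “{.}.bz2” turns each .gz name into the matching .bz2 name. The same transformations are easy to check against plain shell parameter expansion (the file name below is hypothetical):

```shell
#!/bin/sh
# GNU Parallel's replacement strings, and their plain-shell equivalents:
#   {}   the input line as-is
#   {.}  the input with its last extension removed
f=access.log.gz
echo "$f"            # {}  -> access.log.gz
echo "${f%.*}"       # {.} -> access.log
echo "${f%.*}.bz2"   # the target name used by the command -> access.log.bz2
```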

Computers / CPU cores / Max jobs to run
1:192.168.0.101 / 2 / 2
2:local / 1 / 1

Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
192.168.0.101:2/0/66%/0.0s  local:1/0/33%/0.0s 
Computer:jobs running/jobs completed/%of started jobs
ETA: 2s 1left 2.75avg  1:1/1/15%/39.0s  2:0/11/84%/3.5s 

If you are interested in this tool, you can start by watching the video below (which covers basically what I have written above). Two other video tutorials are linked at the end of the video.

I’ll try to get hold of the presentation slides used at Barcamp Chiang Mai. In the meantime, if you want to know more about this tool, you can check the examples in the manpage or simply visit GNU Parallel page.


  1. kcg
    July 2nd, 2012 at 16:26 | #1

    Anybody tried ppmake[1] or pvmgmake[2] already? Still don’t have a time for it, but it’ll probably help a lot using cluster of pandas to do compilation on native ARM…
    [1]: http://sourceforge.net/projects/ppmake/
    [2]: http://pvmgmake.sourceforge.net/

  2. July 2nd, 2012 at 17:33 | #2

    @ kcg
    I never used ppmake/pvmmake, but I did use distcc, and I found it relatively easy to setup at the time.
    If you have a cluster of pandas, install distcc on all of them and use “make -j3”; you should get pretty fast build times.

  3. kcg
    July 2nd, 2012 at 20:16 | #3

    @ cnxsoft
    Hi, yes, thanks, distcc looks like a nice solution except that I compile Haskell and not C/C++ so I really need distributed make.
