Facebook Zstandard “zstd” & “pzstd” Data Compression Tools Deliver High Performance & Efficiency

Ubuntu 16.04 and, I assume, other recent operating systems still use single-threaded versions of file & data compression utilities such as bzip2 or gzip by default, but I’ve recently learned that compatible multi-threaded compression tools such as lbzip2, pigz or pixz have been around for a while, and you can replace the default tools with them for much faster compression and decompression on multi-core systems. This post led to further discussion about Facebook’s Zstandard 1.0, which promises both smaller compressed files and faster compression. The implementation is open source, released under a BSD license, and offers both the single-threaded zstd tool and the multi-threaded pzstd tool. So we all started to run our own little tests and were impressed by the results. Some concerns were raised about patents, and development is still a work in progress with a few bugs here and there, including pzstd segfaulting on ARM. Zlib has 9 levels of compression, while Zstd has 19, so Facebook has tested all …

Support CNX Software – Donate via PayPal or become a Patron on Patreon

Compress & Decompress Files Faster with lbzip2 multi-threaded version of bzip2

Bzip2 is still one of the most commonly used compression tools in Linux, but it only works with a single thread, and I’ve been made aware that lbzip2 allows multi-threaded bzip2 compression, which should lead to much better performance on multi-core systems. lbzip2 was not installed by default on my Ubuntu 16.04 machine, but it’s easy enough to install. I have cloned the mainline Linux repository on my machine, so let’s see how long it takes to compress the directory with bzip2 (single-core compression): 9 minutes and 22 seconds. Now let’s repeat the test with lbzip2 using all 8 cores of my AMD FX8350 processor: 2 minutes 32 seconds. Almost 4 times faster, not bad at all. It’s not 8 times faster because you have to take I/O into account, and at the beginning the system is scanning the drive, using all 8 cores but not at full throttle. The files were also stored on a hard drive, so I’d assume the …
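To make the "almost 4 times" figure concrete, here is a small sketch that converts the two timings quoted above into a speedup and a per-core efficiency. The helper name `to_seconds` and the variable names are made up for illustration; only the two durations and the core count come from the excerpt.

```python
# Speedup arithmetic for the bzip2 vs lbzip2 timings quoted in the excerpt.
# Names below are illustrative; only the durations and core count are from the post.

def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds duration to whole seconds."""
    return minutes * 60 + seconds

bzip2_time = to_seconds(9, 22)    # single-threaded bzip2
lbzip2_time = to_seconds(2, 32)   # lbzip2 on 8 cores
cores = 8

speedup = bzip2_time / lbzip2_time   # how many times faster lbzip2 was
efficiency = speedup / cores         # fraction of the ideal 8x actually achieved

print(f"speedup: {speedup:.2f}x")
print(f"parallel efficiency: {efficiency:.0%}")
```

The result is roughly 3.7x, or a bit under half of the ideal 8x, which is consistent with the I/O and drive-scanning overhead mentioned in the text.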

Use GNU Parallel to Speed Up Script Execution on Multiple Cores and/or Machines

I attended BarCamp Chiang Mai 5 last weekend, and a lot of sessions were related to project management, business apps and web development, but there were also a few embedded systems sessions dealing with subjects such as Arduino (showing how to blink an LED…) and the IOIO board for Android, as well as some Linux-related sessions. The most useful talk I attended was about “GNU Parallel”, a command line tool that can dramatically speed up time-consuming tasks that can be executed in parallel, by spreading them across multiple cores and/or local machines on a LAN. This session was presented by the developer himself (Ole Tange). This tool is used for intensive data processing tasks such as DNA sequencing analysis (bioinformatics), but it might be possible to find a way to use GNU Parallel to shorten the time it takes to build binaries. Make is already doing a good job at distributing compilation tasks to several cores, and distcc can be …
