Compress & Decompress Files Faster with lbzip2 multi-threaded version of bzip2

Bzip2 is still one of the most commonly used compression tools in Linux, but it only works with a single thread, and I’ve been made aware that lbzip2 allows multi-threaded bzip2 compressions which should lead to much better performance on multi-core systems.

Tar with lbzip2 on a 8-core Processor - Click to Enlarge
Tar with lbzip2 on an 8-core Processor – Click to Enlarge

lbzip2 was not installed by default in my Ubuntu 16.04 machine, but it’s easy enough to install:


I have cloned mainline linux repository on my machine, so let’s see how long it takes to compress the directory with bzip2 (one core compression):


9 minutes and 22 seconds. Now let’s repeat the test with lbzip2 using all 8 cores from my AMD FX8350 processor:


2 minutes 32 seconds. Almost 4x times, not bad at all. It’s not 8 times faster because you have to take into account I/Os, and at the beginning the system is scanning the drive, using all 8-core but not all full throttle. The files were also stored in a hard drive, so I’d assume the performance difference should be even more noticeable from an SSD.

We can see both files are about the same size as they should be:


I’m not exactly sure why there’s about 771 KB difference as both tools offer the same compression.

That was for compression. What about decompression? I’ll decompress the lbzip2 compressed file with bzip2 first:


2 minutes and 49 seconds. Now let’s decompress the bzip2 compressed file with lbzip2:


45 seconds! Again the performance difference is massive.

If you want tar to always use lbzip2 instead of bzip2, you could create an alias:


Please note that this will cause a conflict (“Conflicting compression options”) when you try to compress files using -j /–bzip2 or -J, –xz options, so instead of tar, you may want to create another alias, for example tarfast.

lbzip2 is not the only tool to support multi-threaded bzip2 compression, as pbzip2 is another implementation. However, one report indicates that lbzip2 may be twice as fast as pbzip2 to compress files (decompression speed is about the same), which may be significant if you have a backup script…

tkaiser also tested various compression algorithms (gzip, pbzip2, lz4, pigz) for a backup script for Orange Pi boards running armbian, and measured overall performance piping his eMMC through the different compressors to /dev/null:


pigz looks the best solution here (25.2 MB/s) compared to pbzip2 (15.2 MB/s). lbzip2 has not been tested, and could offer an improvement over pigz both in terms of speed and compression based on the previous report, albeit actual results may vary depending on the CPU used.

Support CNX Software - Donate via PayPal or cryptocurrencies, become a Patron on Patreon, or buy review samples
Subscribe
Notify of
guest
30 Comments
oldest
newest most voted
Advertisements