Dan
Published

Tue 10 June 2014


pxz - Parallelized LZMA compression

A while back Tim posted about pigz. I'm the "ops guy" and in the spirit of #devops (and brotherhood) I must admit I use it fairly often now. I quickly discovered pbzip2 but figured everyone else had, too.

But speeding up LZMA? That's a potential game changer. Let's take pxz out for a spin on an 8-core, 32GB SSD server.

The field

First I backed up a random website plus its SQL dump. This provides a mix of already-compressed images and structured SQL data to compress. Re-compressing data that is already compressed is usually to be avoided, since it yields little further reduction.
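To illustrate why re-compressing is usually avoided, here is a minimal sketch using Python's standard lzma module rather than the xz binary. The random bytes and the repeated INSERT line are made-up stand-ins for the images and the SQL dump, not the actual backup used in this test.

```python
import lzma
import os


def xz_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(lzma.compress(data)) / len(data)


# Hypothetical stand-in for already-compressed content such as images:
# random bytes have no redundancy left for LZMA to exploit.
already_compressed = os.urandom(1_000_000)

# Hypothetical stand-in for an SQL dump: highly repetitive structured text.
sql_like = b"INSERT INTO posts (id, title) VALUES (1, 'hello');\n" * 20_000

print(f"random bytes:  {xz_ratio(already_compressed):.3f}")
print(f"SQL-like text: {xz_ratio(sql_like):.3f}")
```

The random input should come out at roughly its original size, while the SQL-like text shrinks to a tiny fraction. A real backup sits somewhere in between, which fits the 0.515 ratio xz reports in the run below.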

Installing pxz on a recent Ubuntu was trivial: apt-get install pxz. I redirected the output to /dev/null to further reduce I/O on the 112MB/s SSD.

15 minutes with xz

$ time xz -zcv random.files.tar > /dev/null
random.files.tar (1/1)
100 %     547.2 MiB / 1,063.4 MiB = 0.515   1.2 MiB/s      15:14

real    15m14.476s
user    15m8.258s
sys     0m4.781s

3 minutes with pxz

$ time pxz -zcv random.files.tar > /dev/null
context size per thread: 25169920 B
8 threads: [7 2 6 0 4 5 1 3 6 7 5 4 3 2 1 0 3 0 1 7 4 2 6 5 1 6 7 5 4 0 2 3 1 2 0 4 7 3 5 6 4 3 2 0 1 ]
real    2m47.175s
user    16m42.338s
sys     0m18.357s
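The general idea behind this kind of speedup can be sketched in a few lines: split the input into chunks, compress each chunk independently on its own worker, and concatenate the resulting .xz streams. This is not pxz's actual implementation, only an illustration with Python's standard lzma module. The function name parallel_xz, the 1 MiB chunk size and the sample payload are all assumptions made for the example, and threads are used on the assumption that CPython's lzma module releases the GIL while compressing.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor


def parallel_xz(data: bytes, chunk_size: int = 1 << 20, workers: int = 8) -> bytes:
    """Compress fixed-size chunks independently and concatenate the .xz streams."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the streams are joined in file order
        # even if the workers finish out of order.
        streams = list(pool.map(lzma.compress, chunks))
    return b"".join(streams)


# Hypothetical payload standing in for the tar file.
payload = b"structured SQL data to re-compress\n" * 100_000
packed = parallel_xz(payload)

# lzma.decompress accepts several concatenated streams and joins the results.
assert lzma.decompress(packed) == payload
```

The trade-off is that each chunk is compressed without knowledge of the others, so the output can be slightly larger than a single-stream run over the same data.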

Note the user times for both runs. How interesting. pxz burned 16m42s of CPU time against 15m8s for xz, roughly 10% more work, but spread it over all eight cores and finished in under three minutes of wall-clock time instead of fifteen, about 5.5 times faster. Since most cores are usually idle, even running pxz under nice should give you massive returns without taking CPU away from other work.
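The arithmetic behind those numbers can be checked with a short calculation. The timings are copied from the two runs above; the variable names and the definition of efficiency as speedup divided by core count are my own framing.

```python
def seconds(minutes: int, secs: float) -> float:
    """Convert a `time`-style XmY.YYYs reading to seconds."""
    return minutes * 60 + secs


# Timings taken from the xz and pxz runs above.
xz_real, xz_user = seconds(15, 14.476), seconds(15, 8.258)
pxz_real, pxz_user = seconds(2, 47.175), seconds(16, 42.338)
cores = 8

speedup = xz_real / pxz_real            # wall-clock improvement
efficiency = speedup / cores            # share of the ideal 8x speedup achieved
cpu_overhead = pxz_user / xz_user - 1   # extra CPU time spent by pxz

print(f"speedup:        {speedup:.2f}x")     # 5.47x
print(f"efficiency:     {efficiency:.0%}")   # 68%
print(f"extra CPU time: {cpu_overhead:.0%}") # 10%
```

So the eight cores deliver a bit over two thirds of the ideal speedup, at the cost of about a tenth more total CPU time.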
