JPEG acceleration with libjpeg-turbo

2011-04-24, 16:16 by Jonas Wallden in Development

Pike, the language environment in which Roxen products execute, includes a very common open-source JPEG library called libjpeg. This library implements decoding and encoding of JPEG images and has been used for years and years. If you work with RXML you have probably used <cimg src="..." format="jpeg" /> at some time or another to generate image thumbnails.

Recently I stumbled upon a variant called libjpeg-turbo which has been designed as a drop-in replacement for libjpeg. The developers claim 2-4x faster execution due to large amounts of hand-written SSE assembly for Intel x86 and x86_64 architectures. The library has been adopted by Chromium and Firefox and that's naturally a stamp of approval that greatly reduces any fears of incompatibilities or future abandonment.

In addition to Pike we also ship ImageMagick with Roxen CMS and Roxen Editorial Portal. ImageMagick is used primarily with Roxen EP to offload scaling of images in news feeds. This is a task which can strain even the fastest server; it can take many seconds per image to generate the set of medium- and low-res thumbnails that EP needs. Luckily it runs as independent sub-processes that take advantage of all cores in your machine (up to an admin-defined limit), but obviously parallelization doesn't reduce latency for a single image. With that in mind, since ImageMagick also relies on libjpeg we have another candidate for the turbo library.
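To illustrate why extra cores help a whole feed but not a single image, here is a minimal Python sketch. The per-image time and worker counts are made-up numbers for illustration, not measurements from Roxen EP, and the model assumes each image is handled start to finish by one sub-process.

```python
import math


def batch_time(images, seconds_per_image, workers):
    """Wall-clock time for a batch when each image occupies one worker."""
    return math.ceil(images / workers) * seconds_per_image


def single_image_latency(seconds_per_image, workers):
    """One image runs on one worker, so additional workers do not help."""
    return seconds_per_image


if __name__ == "__main__":
    # Hypothetical figures: 8 images in a feed, 3 seconds of scaling per image.
    for workers in (1, 4):
        print(
            f"{workers} worker(s): batch {batch_time(8, 3.0, workers):.0f} s, "
            f"single image {single_image_latency(3.0, workers):.0f} s"
        )
```

In this model the batch finishes four times sooner with four workers, while the wait for any one image stays the same. Only a faster codec shortens that wait.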

As of this week I've successfully compiled and integrated the libjpeg-turbo library for Pike and ImageMagick in all of our Mac OS X and RHEL 4/5 builds. Initial benchmarks on a Core i5 iMac show that the claimed 2-4x speed improvement was overly conservative. Have a look at some benchmark numbers for Image.JPEG.decode() and Image.JPEG.encode() in Pike:

3888 x 2600 pixel image (12 MB compressed)

Action           libjpeg      libjpeg-turbo
Decode           1.28 sec     0.26 sec
Encode (Q=95)    1.14 sec     0.14 sec
Encode (Q=75)    0.86 sec     0.09 sec
Encode (Q=25)    0.75 sec     0.07 sec

640 x 480 pixel image (0.2 MB compressed)

Action           libjpeg      libjpeg-turbo
Decode           0.024 sec    0.005 sec
Encode (Q=95)    0.030 sec    0.004 sec
Encode (Q=75)    0.026 sec    0.003 sec
Encode (Q=25)    0.024 sec    0.002 sec

In this particular test the replacement is 5-12x faster than the old implementation! Of course this represents RAM-based throughput only so ImageMagick will not see improvements on the same scale, but nevertheless it's very impressive! I can also add that all output was byte-for-byte identical with both libraries.
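As a rough check, the speedup factors can be recomputed from the timings listed above. The Python sketch below is not part of the Pike benchmark; it only divides the published libjpeg time by the libjpeg-turbo time for each row, and the names `BENCHMARKS` and `speedups` are chosen here for illustration.

```python
# Timings in seconds copied from the benchmark figures above:
# (libjpeg, libjpeg-turbo).
BENCHMARKS = {
    "3888 x 2600": {
        "Decode": (1.28, 0.26),
        "Encode (Q=95)": (1.14, 0.14),
        "Encode (Q=75)": (0.86, 0.09),
        "Encode (Q=25)": (0.75, 0.07),
    },
    "640 x 480": {
        "Decode": (0.024, 0.005),
        "Encode (Q=95)": (0.030, 0.004),
        "Encode (Q=75)": (0.026, 0.003),
        "Encode (Q=25)": (0.024, 0.002),
    },
}


def speedups(benchmarks):
    """Return {(image, action): old_time / new_time} for every measurement."""
    return {
        (image, action): old / new
        for image, rows in benchmarks.items()
        for action, (old, new) in rows.items()
    }


if __name__ == "__main__":
    results = speedups(BENCHMARKS)
    for (image, action), factor in results.items():
        print(f"{image:12} {action:14} {factor:5.1f}x")
    low, high = min(results.values()), max(results.values())
    print(f"range: {low:.1f}x to {high:.1f}x")
```

The factors run from about 4.8x (decoding the small image) to 12x (encoding it at Q=25), which matches the rounded 5-12x figure. Encoding gains more than decoding at both image sizes.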

The updated library will be included in future builds of Roxen 5.x for the platforms mentioned earlier.

 


1   Arjan van Staalduijnen

2011-05-12 09:00

Wow, that's an impressive gain! I've often wondered whether there would be any gain in using libraries that use the GPU for encoding and decoding images on the Roxen platform, but this library already provides an impressive improvement on the CPU itself and is even completely compatible. I'm curious to find out what the impact is when it's running on our platform.

2   Arjan van Staalduijnen

2011-05-12 09:06

As for GPU-supported enhancements, the page at http://shader.kaist.edu/sslshader/ describes a research project where a group of people achieved a 2 to 4 times gain in SSL performance by offloading RSA, AES and HMAC-SHA1 calculations to a GPU. Could be some nice food for thought for more of these enhancements.

3   Arjan van Staalduijnen

2011-06-01 15:46

Now that you mention replacing image libraries... This could enable support for animated PNGs in Pike, similar to animated GIFs: http://animatedpng.com/ ... It seems to be supported by at least Firefox and Opera (and people are asking for it in Chrome, if it's not already there?)

4   Manithetical

2012-08-03 22:34

This is a bit unrelated, but how exactly did you compile ImageMagick with libjpeg-turbo on Mac OS X?

5   Jonas Wallden

2012-08-20 12:56

The library is a drop-in replacement for libjpeg so I don't think I did much aside from having the library search path point to the correct location. Nowadays we build it automatically in our build system so I don't have a minimized build command handy, but if you experience any issues I can dig deeper and hopefully find some clues.

Sep 23, 2017
