Merge lp:~linaro-graphics-wg/glmark2/image-readers into lp:glmark2/2011.11

Proposed by Alexandros Frantzis on 2012-06-27

Status:	Merged
Merged at revision:	226
Proposed branch:	lp:~linaro-graphics-wg/glmark2/image-readers
Merge into:	lp:glmark2/2011.11
Diff against target:	33152 lines (+32505/-119) 80 files modified
	android/jni/Android.mk (+14/-1)
	android/jni/Android.ndk.mk (+14/-1)
	src/image-reader.cpp (+386/-0)
	src/image-reader.h (+77/-0)
	src/libjpeg-turbo/README (+290/-0)
	src/libjpeg-turbo/README-turbo.txt (+361/-0)
	src/libjpeg-turbo/config.h (+137/-0)
	src/libjpeg-turbo/jaricom.c (+153/-0)
	src/libjpeg-turbo/jcapimin.c (+292/-0)
	src/libjpeg-turbo/jcapistd.c (+161/-0)
	src/libjpeg-turbo/jcarith.c (+925/-0)
	src/libjpeg-turbo/jccoefct.c (+449/-0)
	src/libjpeg-turbo/jccolext.c.inc (+114/-0)
	src/libjpeg-turbo/jccolor.c (+599/-0)
	src/libjpeg-turbo/jcdctmgr.c (+642/-0)
	src/libjpeg-turbo/jchuff.c (+1026/-0)
	src/libjpeg-turbo/jchuff.h (+47/-0)
	src/libjpeg-turbo/jcinit.c (+76/-0)
	src/libjpeg-turbo/jcmainct.c (+293/-0)
	src/libjpeg-turbo/jcmarker.c (+666/-0)
	src/libjpeg-turbo/jcmaster.c (+624/-0)
	src/libjpeg-turbo/jcomapi.c (+106/-0)
	src/libjpeg-turbo/jconfig.h (+58/-0)
	src/libjpeg-turbo/jcparam.c (+649/-0)
	src/libjpeg-turbo/jcphuff.c (+831/-0)
	src/libjpeg-turbo/jcprepct.c (+354/-0)
	src/libjpeg-turbo/jcsample.c (+527/-0)
	src/libjpeg-turbo/jctrans.c (+399/-0)
	src/libjpeg-turbo/jdapimin.c (+395/-0)
	src/libjpeg-turbo/jdapistd.c (+277/-0)
	src/libjpeg-turbo/jdarith.c (+761/-0)
	src/libjpeg-turbo/jdatadst-tj.c (+188/-0)
	src/libjpeg-turbo/jdatasrc-tj.c (+182/-0)
	src/libjpeg-turbo/jdcoefct.c (+749/-0)
	src/libjpeg-turbo/jdcolext.c.inc (+104/-0)
	src/libjpeg-turbo/jdcolor.c (+529/-0)
	src/libjpeg-turbo/jdct.h (+184/-0)
	src/libjpeg-turbo/jddctmgr.c (+288/-0)
	src/libjpeg-turbo/jdhuff.c (+808/-0)
	src/libjpeg-turbo/jdhuff.h (+234/-0)
	src/libjpeg-turbo/jdinput.c (+471/-0)
	src/libjpeg-turbo/jdmainct.c (+514/-0)
	src/libjpeg-turbo/jdmarker.c (+1364/-0)
	src/libjpeg-turbo/jdmaster.c (+601/-0)
	src/libjpeg-turbo/jdmerge.c (+455/-0)
	src/libjpeg-turbo/jdmrgext.c.inc (+156/-0)
	src/libjpeg-turbo/jdphuff.c (+668/-0)
	src/libjpeg-turbo/jdpostct.c (+290/-0)
	src/libjpeg-turbo/jdsample.c (+496/-0)
	src/libjpeg-turbo/jdtrans.c (+152/-0)
	src/libjpeg-turbo/jerror.c (+252/-0)
	src/libjpeg-turbo/jerror.h (+314/-0)
	src/libjpeg-turbo/jfdctflt.c (+168/-0)
	src/libjpeg-turbo/jfdctfst.c (+224/-0)
	src/libjpeg-turbo/jfdctint.c (+283/-0)
	src/libjpeg-turbo/jidctflt.c (+242/-0)
	src/libjpeg-turbo/jidctfst.c (+368/-0)
	src/libjpeg-turbo/jidctint.c (+389/-0)
	src/libjpeg-turbo/jidctred.c (+398/-0)
	src/libjpeg-turbo/jinclude.h (+91/-0)
	src/libjpeg-turbo/jmemmgr.c (+1151/-0)
	src/libjpeg-turbo/jmemnobs.c (+109/-0)
	src/libjpeg-turbo/jmemsys.h (+198/-0)
	src/libjpeg-turbo/jmorecfg.h (+404/-0)
	src/libjpeg-turbo/jpegcomp.h (+26/-0)
	src/libjpeg-turbo/jpegint.h (+401/-0)
	src/libjpeg-turbo/jpeglib.h (+1213/-0)
	src/libjpeg-turbo/jquant1.c (+860/-0)
	src/libjpeg-turbo/jquant2.c (+1293/-0)
	src/libjpeg-turbo/jsimd.h (+98/-0)
	src/libjpeg-turbo/jsimddct.h (+102/-0)
	src/libjpeg-turbo/jutils.c (+179/-0)
	src/libjpeg-turbo/jversion.h (+31/-0)
	src/libjpeg-turbo/simd/jsimd.h (+666/-0)
	src/libjpeg-turbo/simd/jsimd_arm.c (+670/-0)
	src/libjpeg-turbo/simd/jsimd_arm_neon.S (+2159/-0)
	src/texture.cpp (+58/-106)
	src/texture.h (+17/-6)
	src/wscript_build (+2/-2)
	wscript (+3/-3)
To merge this branch:	bzr merge lp:~linaro-graphics-wg/glmark2/image-readers
Related blueprints:	Support loading texture data from JPEG files (Medium)

Reviewer	Review Type	Date Requested	Status
Jesse Barker		2012-06-27	Approve on 2012-06-27
Review via email: mp+112294@code.launchpad.net

Description of the change

Add support for loading texture data from JPEG files, plus some refactoring of the image reading infrastructure.
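
As a rough illustration of the refactored infrastructure, the sketch below shows how a caller might drain any ImageReader row by row into one packed pixel buffer. The ImageReader interface is copied from src/image-reader.h in this diff. SolidColorReader and read_all_rows() are hypothetical names introduced only so the example runs without libpng or libjpeg; they are not part of the branch, and the real caller in src/texture.cpp may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Mirrors the abstract interface added in src/image-reader.h.
class ImageReader
{
public:
    virtual bool error() = 0;
    virtual bool nextRow(unsigned char *dst) = 0;
    virtual unsigned int width() const = 0;
    virtual unsigned int height() const = 0;
    virtual unsigned int pixelBytes() const = 0;
    virtual ~ImageReader() {}
};

// Hypothetical in-memory reader (not in the branch): every pixel byte has
// the same value, which stands in for a decoded PNG or JPEG image.
class SolidColorReader : public ImageReader
{
public:
    SolidColorReader(unsigned int w, unsigned int h, unsigned char value) :
        w_(w), h_(h), value_(value), row_(0)
    {
        assert(w > 0 && h > 0);
    }

    bool error() { return false; }

    bool nextRow(unsigned char *dst)
    {
        if (row_ >= h_)
            return false;
        std::memset(dst, value_, w_ * pixelBytes());
        ++row_;
        return true;
    }

    unsigned int width() const { return w_; }
    unsigned int height() const { return h_; }
    unsigned int pixelBytes() const { return 3; }

private:
    unsigned int w_;
    unsigned int h_;
    unsigned char value_;
    unsigned int row_;
};

// Hypothetical helper: copy every row into a tightly packed buffer.
// Returns false if the reader reports an error, describes an empty image,
// or yields fewer rows than height() promised.
bool read_all_rows(ImageReader& reader, std::vector<unsigned char>& pixels)
{
    if (reader.error())
        return false;

    const std::size_t stride =
        static_cast<std::size_t>(reader.width()) * reader.pixelBytes();
    const unsigned int rows = reader.height();
    if (stride == 0 || rows == 0)
        return false;

    pixels.assign(stride * rows, 0);

    unsigned int row = 0;
    while (row < rows && reader.nextRow(&pixels[row * stride]))
        ++row;

    return row == rows && !reader.error();
}
```

Because the caller only sees width(), height(), pixelBytes() and nextRow(), the same loop would work unchanged with either PNGReader or JPEGReader.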

Jesse Barker (jesse-barker) wrote on 2012-06-27:

In the function that fills in the "struct jpeg_source_mgr" function pointer for skip_input_data, while I normally object to C casting in C++, it seems especially odd in this case as there's a mix of C and C++ casts in this function. Aside from that, the code looks great.

review: Approve

Alexandros Frantzis (afrantzis) wrote on 2012-06-27:

Thanks, I copied that from the jpeg source code and didn't pay attention to the casts. Fixed in 232.

lp:~linaro-graphics-wg/glmark2/image-readers updated on 2012-06-27

232. By Alexandros Frantzis on 2012-06-27: ImageReader: Remove C casts.
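
For readers without the diff at hand, the cast cleanup discussed in the review can be sketched in isolation. The code below is a reduced stand-in, not the code from the branch: SourceState is a hypothetical two-field replacement for libjpeg's jpeg_source_mgr, IStreamSource plays the role of JPEGIStreamSourceMgr, and the buffer is deliberately tiny. It shows the skip loop written with C++ casts only, assuming (as the branch does) that the public struct is the first member so the pointer can be converted back to the enclosing object.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>

// Hypothetical stand-in for jpeg_source_mgr, reduced to the two fields
// that the skip logic touches.
struct SourceState
{
    const unsigned char *next_input_byte;
    std::size_t bytes_in_buffer;
};

struct IStreamSource
{
    enum { BUFFER_SIZE = 8 };  // tiny on purpose, to force refills

    SourceState pub;           // must remain the first member
    std::istream *is;
    unsigned char buffer[BUFFER_SIZE];

    explicit IStreamSource(std::istream *stream) : is(stream)
    {
        assert(stream != 0);
        pub.next_input_byte = 0;
        pub.bytes_in_buffer = 0;
    }

    // Refill the buffer from the stream; returns false when nothing was read.
    static bool fill(SourceState *state)
    {
        IStreamSource *src = reinterpret_cast<IStreamSource *>(state);

        src->is->read(reinterpret_cast<char *>(src->buffer), BUFFER_SIZE);
        src->pub.next_input_byte = src->buffer;
        src->pub.bytes_in_buffer =
            static_cast<std::size_t>(src->is->gcount());

        return src->pub.bytes_in_buffer > 0;
    }

    // Skip num_bytes of input, refilling as often as needed.
    static void skip(SourceState *state, long num_bytes)
    {
        IStreamSource *src = reinterpret_cast<IStreamSource *>(state);

        if (num_bytes <= 0)
            return;

        // The C-style spelling would be "(size_t) num_bytes".
        std::size_t n = static_cast<std::size_t>(num_bytes);
        while (n > src->pub.bytes_in_buffer) {
            n -= src->pub.bytes_in_buffer;
            if (!fill(state))
                return;  // input exhausted; this sketch simply stops
        }
        src->pub.next_input_byte += n;
        src->pub.bytes_in_buffer -= n;
    }
};
```

Using static_cast for the numeric conversion and reinterpret_cast for the struct conversion makes the two kinds of cast visibly different, which a C-style cast would hide.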

Preview Diff

The diff has been truncated for viewing.

 === modified file 'android/jni/Android.mk'
 --- android/jni/Android.mk	2012-06-21 12:57:43 +0000
 +++ android/jni/Android.mk	2012-06-27 16:20:24 +0000
@@ -24,6 +24,18 @@
  include $(CLEAR_VARS)
++LOCAL_MODULE := libglmark2-jpeg
++LOCAL_CFLAGS := -Werror -Wall -Wextra -Wno-error=attributes \
++                -Wno-error=unused-parameter -Wno-error=unused-function -Wno-error=unused-variable
++LOCAL_C_INCLUDES := $(LOCAL_PATH)/src/libjpeg-turbo/
++LOCAL_SRC_FILES := $(subst $(LOCAL_PATH)/,,$(wildcard $(LOCAL_PATH)/src/libjpeg-turbo/simd/*.c)) \
++                   $(subst $(LOCAL_PATH)/,,$(wildcard $(LOCAL_PATH)/src/libjpeg-turbo/simd/*.S)) \
++                   $(subst $(LOCAL_PATH)/,,$(wildcard $(LOCAL_PATH)/src/libjpeg-turbo/*.c))
++
++include $(BUILD_STATIC_LIBRARY)
++
++include $(CLEAR_VARS)
++
  LOCAL_CPP_EXTENSION := .cc
  LOCAL_MODULE := libglmark2-ideas
  LOCAL_CFLAGS := -DGLMARK_DATA_PATH="" -DUSE_GLESv2 -Werror -Wall -Wextra\
@@ -41,7 +53,7 @@
  LOCAL_MODULE_TAGS := optional
  LOCAL_MODULE := libglmark2-android
--LOCAL_STATIC_LIBRARIES := libglmark2-matrix libglmark2-png libglmark2-ideas
++LOCAL_STATIC_LIBRARIES := libglmark2-matrix libglmark2-png libglmark2-ideas libglmark2-jpeg
  LOCAL_CFLAGS := -DGLMARK_DATA_PATH="" -DGLMARK_VERSION="\"2012.06\"" \
                  -DUSE_GLESv2 -Werror -Wall -Wextra -Wnon-virtual-dtor \
                  -Wno-error=unused-parameter
@@ -49,6 +61,7 @@
  LOCAL_C_INCLUDES := $(LOCAL_PATH)/src \
                      $(LOCAL_PATH)/src/libmatrix \
                      $(LOCAL_PATH)/src/scene-ideas \
++                    $(LOCAL_PATH)/src/libjpeg-turbo \
                      $(LOCAL_PATH)/src/libpng \
                      external/zlib
  LOCAL_SRC_FILES := $(filter-out src/canvas% src/main.cpp, \
 === modified file 'android/jni/Android.ndk.mk'
 --- android/jni/Android.ndk.mk	2012-06-21 12:57:43 +0000
 +++ android/jni/Android.ndk.mk	2012-06-27 16:20:24 +0000
@@ -20,6 +20,18 @@
  include $(CLEAR_VARS)
++LOCAL_MODULE := libglmark2-jpeg
++LOCAL_CFLAGS := -Werror -Wall -Wextra -Wno-error=attributes \
++                -Wno-error=unused-parameter -Wno-error=unused-function -Wno-error=unused-variable
++LOCAL_C_INCLUDES := $(LOCAL_PATH)/src/libjpeg-turbo/
++LOCAL_SRC_FILES := $(subst $(LOCAL_PATH)/,,$(wildcard $(LOCAL_PATH)/src/libjpeg-turbo/simd/*.c)) \
++                   $(subst $(LOCAL_PATH)/,,$(wildcard $(LOCAL_PATH)/src/libjpeg-turbo/simd/*.S)) \
++                   $(subst $(LOCAL_PATH)/,,$(wildcard $(LOCAL_PATH)/src/libjpeg-turbo/*.c))
++
++include $(BUILD_STATIC_LIBRARY)
++
++include $(CLEAR_VARS)
++
  LOCAL_CPP_EXTENSION := .cc
  LOCAL_MODULE := libglmark2-ideas
  LOCAL_CFLAGS := -DGLMARK_DATA_PATH="" -DUSE_GLESv2 -Werror -Wall -Wextra\
@@ -34,7 +46,7 @@
  LOCAL_MODULE_TAGS := optional
  LOCAL_MODULE := libglmark2-android
--LOCAL_STATIC_LIBRARIES := libglmark2-matrix libglmark2-png libglmark2-ideas
++LOCAL_STATIC_LIBRARIES := libglmark2-matrix libglmark2-png libglmark2-ideas libglmark2-jpeg
  LOCAL_CFLAGS := -DGLMARK_DATA_PATH="" -DGLMARK_VERSION="\"2012.06\"" \
                  -DUSE_GLESv2 -Werror -Wall -Wextra -Wnon-virtual-dtor \
                  -Wno-error=unused-parameter
@@ -42,6 +54,7 @@
  LOCAL_C_INCLUDES := $(LOCAL_PATH)/src \
                      $(LOCAL_PATH)/src/libmatrix \
                      $(LOCAL_PATH)/src/scene-ideas \
++                    $(LOCAL_PATH)/src/libjpeg-turbo \
                      $(LOCAL_PATH)/src/libpng
  LOCAL_SRC_FILES := $(filter-out src/canvas% src/main.cpp, \
                       $(subst $(LOCAL_PATH)/,,$(wildcard $(LOCAL_PATH)/src/*.cpp))) \
 === added file 'src/image-reader.cpp'
 --- src/image-reader.cpp	1970-01-01 00:00:00 +0000
 +++ src/image-reader.cpp	2012-06-27 16:20:24 +0000
@@ -0,0 +1,386 @@
++/*
++ * Copyright © 2012 Linaro Limited
++ *
++ * This file is part of the glmark2 OpenGL (ES) 2.0 benchmark.
++ *
++ * glmark2 is free software: you can redistribute it and/or modify it under the
++ * terms of the GNU General Public License as published by the Free Software
++ * Foundation, either version 3 of the License, or (at your option) any later
++ * version.
++ *
++ * glmark2 is distributed in the hope that it will be useful, but WITHOUT ANY
++ * WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
++ * FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
++ * details.
++ *
++ * You should have received a copy of the GNU General Public License along with
++ * glmark2.  If not, see <http://www.gnu.org/licenses/>.
++ *
++ * Authors:
++ *  Alexandros Frantzis
++ */
++#include <cstdio>
++#include <png.h>
++#include <jpeglib.h>
++#include <memory>
++
++#include "image-reader.h"
++#include "log.h"
++#include "util.h"
++
++/*******
++ * PNG *
++ *******/
++
++struct PNGReaderPrivate
++{
++    PNGReaderPrivate() :
++        png(0), info(0), rows(0), png_error(0),
++        current_row(0), row_stride(0) {}
++
++    static void png_read_fn(png_structp png_ptr, png_bytep data, png_size_t length)
++    {
++        std::istream *is = reinterpret_cast<std::istream*>(png_get_io_ptr(png_ptr));
++        is->read(reinterpret_cast<char *>(data), length);
++    }
++
++    png_structp png;
++    png_infop info;
++    png_bytepp rows;
++    bool png_error;
++    unsigned int current_row;
++    unsigned int row_stride;
++};
++
++PNGReader::PNGReader(const std::string& filename):
++    priv_(new PNGReaderPrivate())
++{
++    priv_->png_error = !init(filename);
++}
++
++PNGReader::~PNGReader()
++{
++    finish();
++    delete priv_;
++}
++
++bool
++PNGReader::error()
++{
++    return priv_->png_error;
++}
++
++bool
++PNGReader::nextRow(unsigned char *dst)
++{
++    bool ret;
++
++    if (priv_->current_row < height()) {
++        memcpy(dst, priv_->rows[priv_->current_row], priv_->row_stride);
++        priv_->current_row++;
++        ret = true;
++    }
++    else {
++        ret = false;
++    }
++
++    return ret;
++}
++
++unsigned int
++PNGReader::width() const
++{
++    return png_get_image_width(priv_->png, priv_->info);
++}
++
++unsigned int
++PNGReader::height() const
++{
++    return png_get_image_height(priv_->png, priv_->info);
++}
++
++unsigned int
++PNGReader::pixelBytes() const
++{
++    if (png_get_color_type(priv_->png, priv_->info) == PNG_COLOR_TYPE_RGB)
++    {
++        return 3;
++    }
++    return 4;
++}
++
++
++bool
++PNGReader::init(const std::string& filename)
++{
++    static const int png_transforms = PNG_TRANSFORM_STRIP_16 |
++                                      PNG_TRANSFORM_GRAY_TO_RGB |
++                                      PNG_TRANSFORM_PACKING |
++                                      PNG_TRANSFORM_EXPAND;
++
++    Log::debug("Reading PNG file %s\n", filename.c_str());
++
++    const std::auto_ptr<std::istream> is_ptr(Util::get_resource(filename));
++    if (!(*is_ptr)) {
++        Log::error("Cannot open file %s!\n", filename.c_str());
++        return false;
++    }
++
++    /* Set up all the libpng structs we need */
++    priv_->png = png_create_read_struct(PNG_LIBPNG_VER_STRING, 0, 0, 0);
++    if (!priv_->png) {
++        Log::error("Couldn't create libpng read struct\n");
++        return false;
++    }
++
++    priv_->info = png_create_info_struct(priv_->png);
++    if (!priv_->info) {
++        Log::error("Couldn't create libpng info struct\n");
++        return false;
++    }
++
++    /* Set up libpng error handling */
++    if (setjmp(png_jmpbuf(priv_->png))) {
++        Log::error("libpng error while reading file %s\n", filename.c_str());
++        return false;
++    }
++
++    /* Read the image information and data */
++    png_set_read_fn(priv_->png, reinterpret_cast<voidp>(is_ptr.get()),
++                    PNGReaderPrivate::png_read_fn);
++
++    png_read_png(priv_->png, priv_->info, png_transforms, 0);
++
++    priv_->rows = png_get_rows(priv_->png, priv_->info);
++
++    priv_->current_row = 0;
++    priv_->row_stride = width() * pixelBytes();
++
++    return true;
++}
++
++void
++PNGReader::finish()
++{
++    if (priv_->png)
++    {
++        png_destroy_read_struct(&priv_->png, &priv_->info, 0);
++    }
++}
++
++
++/********
++ * JPEG *
++ ********/
++
++struct JPEGErrorMgr
++{
++    struct jpeg_error_mgr pub;
++    jmp_buf jmp_buffer;
++
++    JPEGErrorMgr()
++    {
++        jpeg_std_error(&pub);
++        pub.error_exit = error_exit;
++    }
++
++    static void error_exit(j_common_ptr cinfo)
++    {
++        JPEGErrorMgr *err =
++            reinterpret_cast<JPEGErrorMgr *>(cinfo->err);
++
++        char buffer[JMSG_LENGTH_MAX];
++
++        /* Create the message */
++        (*cinfo->err->format_message)(cinfo, buffer);
++        /* Pass the message as an argument, never as the format string */
++        Log::error("%s\n", buffer);
++
++        longjmp(err->jmp_buffer, 1);
++    }
++};
++
++struct JPEGIStreamSourceMgr
++{
++    static const int BUFFER_SIZE = 4096;
++    struct jpeg_source_mgr pub;
++    std::istream *is;
++    JOCTET buffer[BUFFER_SIZE];
++
++    JPEGIStreamSourceMgr(const std::string& filename) : is(0)
++    {
++        is = Util::get_resource(filename);
++
++        /* Fill in jpeg_source_mgr pub struct */
++        pub.init_source = init_source;
++        pub.fill_input_buffer = fill_input_buffer;
++        pub.skip_input_data = skip_input_data;
++        pub.resync_to_restart = jpeg_resync_to_restart; /* use default method */
++        pub.term_source = term_source;
++        pub.bytes_in_buffer = 0; /* forces fill_input_buffer on first read */
++        pub.next_input_byte = NULL; /* until buffer loaded */
++    }
++
++    ~JPEGIStreamSourceMgr()
++    {
++        delete is;
++    }
++
++    bool error()
++    {
++        return !is || (is->fail() && !is->eof());
++    }
++
++    static void init_source(j_decompress_ptr cinfo)
++    {
++        static_cast<void>(cinfo);
++    }
++
++    static boolean fill_input_buffer(j_decompress_ptr cinfo)
++    {
++        JPEGIStreamSourceMgr *src =
++            reinterpret_cast<JPEGIStreamSourceMgr *>(cinfo->src);
++
++        src->is->read(reinterpret_cast<char *>(src->buffer), BUFFER_SIZE);
++
++        src->pub.next_input_byte = src->buffer;
++        src->pub.bytes_in_buffer = src->is->gcount();
++
++        /*
++         * If the decoder needs more data but there are no bytes left to
++         * read, insert a fake EOI marker (0xFF, JPEG_EOI) to end the input.
++         */
++        if (src->pub.bytes_in_buffer == 0) {
++            src->pub.bytes_in_buffer = 2;
++            src->buffer[0] = 0xFF;
++            src->buffer[1] = JPEG_EOI;
++        }
++
++        return TRUE;
++    }
++
++    static void skip_input_data(j_decompress_ptr cinfo, long num_bytes)
++    {
++        JPEGIStreamSourceMgr *src =
++            reinterpret_cast<JPEGIStreamSourceMgr *>(cinfo->src);
++
++        if (num_bytes > 0) {
++            size_t n = static_cast<size_t>(num_bytes);
++            while (n > src->pub.bytes_in_buffer) {
++                n -= src->pub.bytes_in_buffer;
++                (*src->fill_input_buffer)(cinfo);
++            }
++            src->pub.next_input_byte += n;
++            src->pub.bytes_in_buffer -= n;
++        }
++    }
++
++    static void term_source(j_decompress_ptr cinfo)
++    {
++        static_cast<void>(cinfo);
++    }
++};
++
++struct JPEGReaderPrivate
++{
++    JPEGReaderPrivate(const std::string& filename) :
++        source_mgr(filename), jpeg_error(false) {}
++
++    struct jpeg_decompress_struct cinfo;
++    JPEGErrorMgr error_mgr;
++    JPEGIStreamSourceMgr source_mgr;
++    bool jpeg_error;
++};
++
++
++JPEGReader::JPEGReader(const std::string& filename) :
++    priv_(new JPEGReaderPrivate(filename))
++{
++    priv_->jpeg_error = !init(filename);
++}
++
++JPEGReader::~JPEGReader()
++{
++    finish();
++    delete priv_;
++}
++
++bool
++JPEGReader::error()
++{
++    return priv_->jpeg_error || priv_->source_mgr.error();
++}
++
++bool
++JPEGReader::nextRow(unsigned char *dst)
++{
++    bool ret = true;
++    unsigned char *buffer[1];
++    buffer[0] = dst;
++
++    /* Set up error handling */
++    if (setjmp(priv_->error_mgr.jmp_buffer)) {
++        return false;
++    }
++
++    /* While there are lines left, read next line */
++    if (priv_->cinfo.output_scanline < priv_->cinfo.output_height) {
++        jpeg_read_scanlines(&priv_->cinfo, buffer, 1);
++    }
++    else {
++        jpeg_finish_decompress(&priv_->cinfo);
++        ret = false;
++    }
++
++    return ret;
++}
++
++unsigned int
++JPEGReader::width() const
++{
++    return priv_->cinfo.output_width;
++}
++
++unsigned int
++JPEGReader::height() const
++{
++    return priv_->cinfo.output_height;
++}
++
++unsigned int
++JPEGReader::pixelBytes() const
++{
++    return priv_->cinfo.output_components;
++}
++
++bool
++JPEGReader::init(const std::string& filename)
++{
++    Log::debug("Reading JPEG file %s\n", filename.c_str());
++
++    /* Initialize error manager */
++    priv_->cinfo.err = reinterpret_cast<jpeg_error_mgr*>(&priv_->error_mgr);
++
++    if (setjmp(priv_->error_mgr.jmp_buffer)) {
++        return false;
++    }
++
++    jpeg_create_decompress(&priv_->cinfo);
++    priv_->cinfo.src = reinterpret_cast<jpeg_source_mgr*>(&priv_->source_mgr);
++
++    /* Read header */
++    jpeg_read_header(&priv_->cinfo, TRUE);
++
++    jpeg_start_decompress(&priv_->cinfo);
++
++    return true;
++}
++
++void
++JPEGReader::finish()
++{
++    jpeg_destroy_decompress(&priv_->cinfo);
++}
++
++
++
 === added file 'src/image-reader.h'
 --- src/image-reader.h	1970-01-01 00:00:00 +0000
 +++ src/image-reader.h	2012-06-27 16:20:24 +0000
@@ -0,0 +1,77 @@
++/*
++ * Copyright © 2012 Linaro Limited
++ *
++ * This file is part of the glmark2 OpenGL (ES) 2.0 benchmark.
++ *
++ * glmark2 is free software: you can redistribute it and/or modify it under the
++ * terms of the GNU General Public License as published by the Free Software
++ * Foundation, either version 3 of the License, or (at your option) any later
++ * version.
++ *
++ * glmark2 is distributed in the hope that it will be useful, but WITHOUT ANY
++ * WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
++ * FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
++ * details.
++ *
++ * You should have received a copy of the GNU General Public License along with
++ * glmark2.  If not, see <http://www.gnu.org/licenses/>.
++ *
++ * Authors:
++ *  Alexandros Frantzis
++ */
++#include <string>
++
++class ImageReader
++{
++public:
++    virtual bool error() = 0;
++    virtual bool nextRow(unsigned char *dst) = 0;
++    virtual unsigned int width() const = 0;
++    virtual unsigned int height() const = 0;
++    virtual unsigned int pixelBytes() const = 0;
++    virtual ~ImageReader() {}
++};
++
++class PNGReaderPrivate;
++
++class PNGReader : public ImageReader
++{
++public:
++    PNGReader(const std::string& filename);
++
++    virtual ~PNGReader();
++    bool error();
++    bool nextRow(unsigned char *dst);
++
++    unsigned int width() const;
++    unsigned int height() const;
++    unsigned int pixelBytes() const;
++
++private:
++    bool init(const std::string& filename);
++    void finish();
++
++    PNGReaderPrivate *priv_;
++};
++
++class JPEGReaderPrivate;
++
++class JPEGReader : public ImageReader
++{
++public:
++    JPEGReader(const std::string& filename);
++
++    virtual ~JPEGReader();
++    bool error();
++    bool nextRow(unsigned char *dst);
++    unsigned int width() const;
++    unsigned int height() const;
++    unsigned int pixelBytes() const;
++
++private:
++    bool init(const std::string& filename);
++    void finish();
++
++    JPEGReaderPrivate *priv_;
++};
++
 === added directory 'src/libjpeg-turbo'
 === added file 'src/libjpeg-turbo/README'
 --- src/libjpeg-turbo/README	1970-01-01 00:00:00 +0000
 +++ src/libjpeg-turbo/README	2012-06-27 16:20:24 +0000
@@ -0,0 +1,290 @@
++libjpeg-turbo note:  This file contains portions of the libjpeg v6b and v8
++README files, with additional wordsmithing by The libjpeg-turbo Project.
++It is included only for reference, as some parts of it may not apply to
++libjpeg-turbo.  Please see README-turbo.txt for information specific to
++libjpeg-turbo.
++
++
++The Independent JPEG Group's JPEG software
++==========================================
++
++This distribution contains a release of the Independent JPEG Group's free JPEG
++software.  You are welcome to redistribute this software and to use it for any
++purpose, subject to the conditions under LEGAL ISSUES, below.
++
++This software is the work of Tom Lane, Guido Vollbeding, Philip Gladstone,
++Bill Allombert, Jim Boucher, Lee Crocker, Bob Friesenhahn, Ben Jackson,
++Julian Minguillon, Luis Ortiz, George Phillips, Davide Rossi, Ge' Weijers,
++and other members of the Independent JPEG Group.
++
++IJG is not affiliated with the official ISO JPEG standards committee.
++
++
++DOCUMENTATION ROADMAP
++=====================
++
++This file contains the following sections:
++
++OVERVIEW            General description of JPEG and the IJG software.
++LEGAL ISSUES        Copyright, lack of warranty, terms of distribution.
++REFERENCES          Where to learn more about JPEG.
++ARCHIVE LOCATIONS   Where to find newer versions of this software.
++FILE FORMAT WARS    Software *not* to get.
++TO DO               Plans for future IJG releases.
++
++Other documentation files in the distribution are:
++
++User documentation:
++  install.txt       How to configure and install the IJG software.
++  usage.txt         Usage instructions for cjpeg, djpeg, jpegtran,
++                    rdjpgcom, and wrjpgcom.
++  *.1               Unix-style man pages for programs (same info as usage.txt).
++  wizard.txt        Advanced usage instructions for JPEG wizards only.
++  change.log        Version-to-version change highlights.
++Programmer and internal documentation:
++  libjpeg.txt       How to use the JPEG library in your own programs.
++  example.c         Sample code for calling the JPEG library.
++  structure.txt     Overview of the JPEG library's internal structure.
++  filelist.txt      Road map of IJG files.
++  coderules.txt     Coding style rules --- please read if you contribute code.
++
++Please read at least the files install.txt and usage.txt.  Some information
++can also be found in the JPEG FAQ (Frequently Asked Questions) article.  See
++ARCHIVE LOCATIONS below to find out where to obtain the FAQ article.
++
++If you want to understand how the JPEG code works, we suggest reading one or
++more of the REFERENCES, then looking at the documentation files (in roughly
++the order listed) before diving into the code.
++
++
++OVERVIEW
++========
++
++This package contains C software to implement JPEG image encoding, decoding,
++and transcoding.  JPEG (pronounced "jay-peg") is a standardized compression
++method for full-color and gray-scale images.  JPEG's strong suit is compressing
++photographic images or other types of images that have smooth color and
++brightness transitions between neighboring pixels.  Images with sharp lines or
++other abrupt features may not compress well with JPEG, and a higher JPEG
++quality may have to be used to avoid visible compression artifacts with such
++images.
++
++JPEG is lossy, meaning that the output pixels are not necessarily identical to
++the input pixels.  However, on photographic content and other "smooth" images,
++very good compression ratios can be obtained with no visible compression
++artifacts, and extremely high compression ratios are possible if you are
++willing to sacrifice image quality (by reducing the "quality" setting in the
++compressor.)
++
++This software implements JPEG baseline, extended-sequential, and progressive
++compression processes.  Provision is made for supporting all variants of these
++processes, although some uncommon parameter settings aren't implemented yet.
++We have made no provision for supporting the hierarchical or lossless
++processes defined in the standard.
++
++We provide a set of library routines for reading and writing JPEG image files,
++plus two sample applications "cjpeg" and "djpeg", which use the library to
++perform conversion between JPEG and some other popular image file formats.
++The library is intended to be reused in other applications.
++
++In order to support file conversion and viewing software, we have included
++considerable functionality beyond the bare JPEG coding/decoding capability;
++for example, the color quantization modules are not strictly part of JPEG
++decoding, but they are essential for output to colormapped file formats or
++colormapped displays.  These extra functions can be compiled out of the
++library if not required for a particular application.
++
++We have also included "jpegtran", a utility for lossless transcoding between
++different JPEG processes, and "rdjpgcom" and "wrjpgcom", two simple
++applications for inserting and extracting textual comments in JFIF files.
++
++The emphasis in designing this software has been on achieving portability and
++flexibility, while also making it fast enough to be useful.  In particular,
++the software is not intended to be read as a tutorial on JPEG.  (See the
++REFERENCES section for introductory material.)  Rather, it is intended to
++be reliable, portable, industrial-strength code.  We do not claim to have
++achieved that goal in every aspect of the software, but we strive for it.
++
++We welcome the use of this software as a component of commercial products.
++No royalty is required, but we do ask for an acknowledgement in product
++documentation, as described under LEGAL ISSUES.
++
++
++LEGAL ISSUES
++============
++
++In plain English:
++
++1. We don't promise that this software works.  (But if you find any bugs,
++   please let us know!)
++2. You can use this software for whatever you want.  You don't have to pay us.
++3. You may not pretend that you wrote this software.  If you use it in a
++   program, you must acknowledge somewhere in your documentation that
++   you've used the IJG code.
++
++In legalese:
++
++The authors make NO WARRANTY or representation, either express or implied,
++with respect to this software, its quality, accuracy, merchantability, or
++fitness for a particular purpose.  This software is provided "AS IS", and you,
++its user, assume the entire risk as to its quality and accuracy.
++
++This software is copyright (C) 1991-2010, Thomas G. Lane, Guido Vollbeding.
++All Rights Reserved except as specified below.
++
++Permission is hereby granted to use, copy, modify, and distribute this
++software (or portions thereof) for any purpose, without fee, subject to these
++conditions:
++(1) If any part of the source code for this software is distributed, then this
++README file must be included, with this copyright and no-warranty notice
++unaltered; and any additions, deletions, or changes to the original files
++must be clearly indicated in accompanying documentation.
++(2) If only executable code is distributed, then the accompanying
++documentation must state that "this software is based in part on the work of
++the Independent JPEG Group".
++(3) Permission for use of this software is granted only if the user accepts
++full responsibility for any undesirable consequences; the authors accept
++NO LIABILITY for damages of any kind.
++
++These conditions apply to any software derived from or based on the IJG code,
++not just to the unmodified library.  If you use our work, you ought to
++acknowledge us.
++
++Permission is NOT granted for the use of any IJG author's name or company name
++in advertising or publicity relating to this software or products derived from
++it.  This software may be referred to only as "the Independent JPEG Group's
++software".
++
++We specifically permit and encourage the use of this software as the basis of
++commercial products, provided that all warranty or liability claims are
++assumed by the product vendor.
++
++
++ansi2knr.c is included in this distribution by permission of L. Peter Deutsch,
++sole proprietor of its copyright holder, Aladdin Enterprises of Menlo Park, CA.
++ansi2knr.c is NOT covered by the above copyright and conditions, but instead
++by the usual distribution terms of the Free Software Foundation; principally,
++that you must include source code if you redistribute it.  (See the file
++ansi2knr.c for full details.)  However, since ansi2knr.c is not needed as part
++of any program generated from the IJG code, this does not limit you more than
++the foregoing paragraphs do.
++
++The Unix configuration script "configure" was produced with GNU Autoconf.
++It is copyright by the Free Software Foundation but is freely distributable.
++The same holds for its supporting scripts (config.guess, config.sub,
++ltmain.sh).  Another support script, install-sh, is copyright by X Consortium
++but is also freely distributable.
++
++The IJG distribution formerly included code to read and write GIF files.
++To avoid entanglement with the Unisys LZW patent, GIF reading support has
++been removed altogether, and the GIF writer has been simplified to produce
++"uncompressed GIFs".  This technique does not use the LZW algorithm; the
++resulting GIF files are larger than usual, but are readable by all standard
++GIF decoders.
++
++We are required to state that
++    "The Graphics Interchange Format(c) is the Copyright property of
++    CompuServe Incorporated.  GIF(sm) is a Service Mark property of
++    CompuServe Incorporated."
++
++
++REFERENCES
++==========
++
++We recommend reading one or more of these references before trying to
++understand the innards of the JPEG software.
++
++The best short technical introduction to the JPEG compression algorithm is
++	Wallace, Gregory K.  "The JPEG Still Picture Compression Standard",
++	Communications of the ACM, April 1991 (vol. 34 no. 4), pp. 30-44.
++(Adjacent articles in that issue discuss MPEG motion picture compression,
++applications of JPEG, and related topics.)  If you don't have the CACM issue
++handy, a PostScript file containing a revised version of Wallace's article is
++available at http://www.ijg.org/files/wallace.ps.gz.  The file (actually
++a preprint for an article that appeared in IEEE Trans. Consumer Electronics)
++omits the sample images that appeared in CACM, but it includes corrections
++and some added material.  Note: the Wallace article is copyright ACM and IEEE,
++and it may not be used for commercial purposes.
++
++A somewhat less technical, more leisurely introduction to JPEG can be found in
++"The Data Compression Book" by Mark Nelson and Jean-loup Gailly, published by
++M&T Books (New York), 2nd ed. 1996, ISBN 1-55851-434-1.  This book provides
++good explanations and example C code for a multitude of compression methods
++including JPEG.  It is an excellent source if you are comfortable reading C
++code but don't know much about data compression in general.  The book's JPEG
++sample code is far from industrial-strength, but when you are ready to look
++at a full implementation, you've got one here...
++
++The best currently available description of JPEG is the textbook "JPEG Still
++Image Data Compression Standard" by William B. Pennebaker and Joan L.
++Mitchell, published by Van Nostrand Reinhold, 1993, ISBN 0-442-01272-1.
++Price US$59.95, 638 pp.  The book includes the complete text of the ISO JPEG
++standards (DIS 10918-1 and draft DIS 10918-2).
++
++The original JPEG standard is divided into two parts, Part 1 being the actual
++specification, while Part 2 covers compliance testing methods.  Part 1 is
++titled "Digital Compression and Coding of Continuous-tone Still Images,
++Part 1: Requirements and guidelines" and has document numbers ISO/IEC IS
++10918-1, ITU-T T.81.  Part 2 is titled "Digital Compression and Coding of
++Continuous-tone Still Images, Part 2: Compliance testing" and has document
++numbers ISO/IEC IS 10918-2, ITU-T T.83.
++
++The JPEG standard does not specify all details of an interchangeable file
++format.  For the omitted details we follow the "JFIF" conventions, revision
++1.02.  JFIF 1.02 has been adopted as an Ecma International Technical Report
++and thus received a formal publication status.  It is available as a free
++download in PDF format from
++http://www.ecma-international.org/publications/techreports/E-TR-098.htm.
++A PostScript version of the JFIF document is available at
++http://www.ijg.org/files/jfif.ps.gz.  There is also a plain text version at
++http://www.ijg.org/files/jfif.txt.gz, but it is missing the figures.
++
++The TIFF 6.0 file format specification can be obtained by FTP from
++ftp://ftp.sgi.com/graphics/tiff/TIFF6.ps.gz.  The JPEG incorporation scheme
++found in the TIFF 6.0 spec of 3-June-92 has a number of serious problems.
++IJG does not recommend use of the TIFF 6.0 design (TIFF Compression tag 6).
++Instead, we recommend the JPEG design proposed by TIFF Technical Note #2
++(Compression tag 7).  Copies of this Note can be obtained from
++http://www.ijg.org/files/.  It is expected that the next revision
++of the TIFF spec will replace the 6.0 JPEG design with the Note's design.
++Although IJG's own code does not support TIFF/JPEG, the free libtiff library
++uses our library to implement TIFF/JPEG per the Note.
++
++
++ARCHIVE LOCATIONS
++=================
++
++The "official" archive site for this software is www.ijg.org.
++The most recent released version can always be found there in
++directory "files".  This particular version will be archived as
++http://www.ijg.org/files/jpegsrc.v8d.tar.gz, and in Windows-compatible
++"zip" archive format as http://www.ijg.org/files/jpegsr8d.zip.
++
++The JPEG FAQ (Frequently Asked Questions) article is a source of some
++general information about JPEG.
++It is available on the World Wide Web at http://www.faqs.org/faqs/jpeg-faq/
++and other news.answers archive sites, including the official news.answers
++archive at rtfm.mit.edu: ftp://rtfm.mit.edu/pub/usenet/news.answers/jpeg-faq/.
++If you don't have Web or FTP access, send e-mail to mail-server@rtfm.mit.edu
++with body
++	send usenet/news.answers/jpeg-faq/part1
++	send usenet/news.answers/jpeg-faq/part2
++
++
++FILE FORMAT WARS
++================
++
++The ISO JPEG standards committee actually promotes different formats like
++"JPEG 2000" or "JPEG XR", which are incompatible with original DCT-based
++JPEG.  IJG therefore does not support these formats (see REFERENCES).  Indeed,
++one of the original reasons for developing this free software was to help
++force convergence on common, interoperable format standards for JPEG files.
++Don't use an incompatible file format!
++(In any case, our decoder will remain capable of reading existing JPEG
++image files indefinitely.)
++
++
++TO DO
++=====
++
++Please send bug reports, offers of help, etc. to jpeg-info@jpegclub.org.
 === added file 'src/libjpeg-turbo/README-turbo.txt'
 --- src/libjpeg-turbo/README-turbo.txt	1970-01-01 00:00:00 +0000
 +++ src/libjpeg-turbo/README-turbo.txt	2012-06-27 16:20:24 +0000
@@ -0,0 +1,361 @@
++*******************************************************************************
++**     Background
++*******************************************************************************
++
++libjpeg-turbo is a derivative of libjpeg that uses SIMD instructions (MMX,
++SSE2, NEON) to accelerate baseline JPEG compression and decompression on x86,
++x86-64, and ARM systems.  On such systems, libjpeg-turbo is generally 2-4x as
++fast as the unmodified version of libjpeg, all else being equal.
++
++libjpeg-turbo was originally based on libjpeg/SIMD by Miyasaka Masaru, but
++the TigerVNC and VirtualGL projects made numerous enhancements to the codec in
++2009, including improved support for Mac OS X, 64-bit support, support for
++32-bit and big-endian pixel formats (RGBX, XBGR, etc.), accelerated Huffman
++encoding/decoding, and various bug fixes.  The goal was to produce a fully
++open-source codec that could replace the partially closed-source TurboJPEG/IPP
++codec used by VirtualGL and TurboVNC.  libjpeg-turbo generally achieves 80-120%
++of the performance of TurboJPEG/IPP.  It is faster in some areas but slower in
++others.
++
++In early 2010, libjpeg-turbo spun off into its own independent project, with
++the goal of making high-speed JPEG compression/decompression technology
++available to a broader range of users and developers.
++
++
++*******************************************************************************
++**     License
++*******************************************************************************
++
++Most of libjpeg-turbo inherits the non-restrictive, BSD-style license used by
++libjpeg (see README.)  The TurboJPEG/OSS wrapper (both C and Java versions) and
++associated test programs bear a similar license, which is reproduced below:
++
++Redistribution and use in source and binary forms, with or without
++modification, are permitted provided that the following conditions are met:
++
++- Redistributions of source code must retain the above copyright notice,
++  this list of conditions and the following disclaimer.
++- Redistributions in binary form must reproduce the above copyright notice,
++  this list of conditions and the following disclaimer in the documentation
++  and/or other materials provided with the distribution.
++- Neither the name of the libjpeg-turbo Project nor the names of its
++  contributors may be used to endorse or promote products derived from this
++  software without specific prior written permission.
++
++THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS",
++AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
++IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
++ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE
++LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
++CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
++SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
++INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
++CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
++ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
++POSSIBILITY OF SUCH DAMAGE.
++
++
++*******************************************************************************
++**     Using libjpeg-turbo
++*******************************************************************************
++
++libjpeg-turbo includes two APIs that can be used to compress and decompress
++JPEG images:
++
++  TurboJPEG API:  This API provides an easy-to-use interface for compressing
++  and decompressing JPEG images in memory.  It also provides some functionality
++  that would not be straightforward to achieve using the underlying libjpeg
++  API, such as generating planar YUV images and performing multiple
++  simultaneous lossless transforms on an image.  The Java interface for
++  libjpeg-turbo is written on top of the TurboJPEG API.
++
++  libjpeg API:  This is the de facto industry-standard API for compressing and
++  decompressing JPEG images.  It is more difficult to use than the TurboJPEG
++  API but also more powerful.  libjpeg-turbo is both API/ABI-compatible and
++  mathematically compatible with libjpeg v6b.  It can also optionally be
++  configured to be API/ABI-compatible with libjpeg v7 and v8 (see below.)
++
++
++=============================
++Replacing libjpeg at Run Time
++=============================
++
++If a Unix application is dynamically linked with libjpeg, then you can replace
++libjpeg with libjpeg-turbo at run time by manipulating LD_LIBRARY_PATH.
++For instance:
++
++  [Using libjpeg]
++  > time cjpeg <vgl_5674_0098.ppm >vgl_5674_0098.jpg
++  real  0m0.392s
++  user  0m0.074s
++  sys   0m0.020s
++
++  [Using libjpeg-turbo]
++  > export LD_LIBRARY_PATH=/opt/libjpeg-turbo/{lib}:$LD_LIBRARY_PATH
++  > time cjpeg <vgl_5674_0098.ppm >vgl_5674_0098.jpg
++  real  0m0.109s
++  user  0m0.029s
++  sys   0m0.010s
++
++NOTE: {lib} can be lib, lib32, lib64, or lib/64, depending on the O/S and
++architecture.
++
++System administrators can also replace the libjpeg sym links in /usr/{lib} with
++links to the libjpeg-turbo dynamic library located in /opt/libjpeg-turbo/{lib}.
++This will effectively accelerate every application that uses the libjpeg
++dynamic library on the system.
++
++The libjpeg-turbo SDK for Visual C++ installs the libjpeg-turbo DLL
++(jpeg62.dll, jpeg7.dll, or jpeg8.dll, depending on whether it was built with
++libjpeg v6b, v7, or v8 emulation) into c:\libjpeg-turbo[64]\bin, and the PATH
++environment variable can be modified such that this directory is searched
++before any others that might contain a libjpeg DLL.  However, if a libjpeg
++DLL exists in an application's install directory, then Windows will load this
++DLL first whenever the application is launched.  Thus, if an application ships
++with jpeg62.dll, jpeg7.dll, or jpeg8.dll, then back up the application's
++version of this DLL and copy c:\libjpeg-turbo[64]\bin\jpeg*.dll into the
++application's install directory to accelerate it.
++
++The version of the libjpeg-turbo DLL distributed in the libjpeg-turbo SDK for
++Visual C++ requires the Visual C++ 2008 C run-time DLL (msvcr90.dll).
++msvcr90.dll ships with more recent versions of Windows, but users of older
++Windows releases can obtain it from the Visual C++ 2008 Redistributable
++Package, which is available as a free download from Microsoft's web site.
++
++NOTE:  Features of libjpeg that require passing a C run-time structure, such
++as a file handle, from an application to libjpeg will probably not work with
++the version of the libjpeg-turbo DLL distributed in the libjpeg-turbo SDK for
++Visual C++, unless the application is also built to use the Visual C++ 2008 C
++run-time DLL.  In particular, this affects jpeg_stdio_dest() and
++jpeg_stdio_src().
++
++Mac applications typically embed their own copies of the libjpeg dylib inside
++the (hidden) application bundle, so it is not possible to globally replace
++libjpeg on OS X systems.  If an application uses a shared library version of
++libjpeg, then it may be possible to replace the application's version of it.
++This would generally involve copying libjpeg.*.dylib from libjpeg-turbo into
++the appropriate place in the application bundle and using install_name_tool to
++repoint the dylib to the new directory.  This requires an advanced knowledge of
++OS X and would not survive an upgrade or a re-install of the application.
++Thus, it is not recommended for most users.
++
++=======================
++Replacing TurboJPEG/IPP
++=======================
++
++libjpeg-turbo is a drop-in replacement for the TurboJPEG/IPP SDK used by
++VirtualGL 2.1.x and TurboVNC 0.6 (and prior.)  libjpeg-turbo contains a wrapper
++library (TurboJPEG/OSS) that emulates the TurboJPEG API using libjpeg-turbo
++instead of the closed-source Intel Performance Primitives.  You can replace the
++TurboJPEG/IPP package on Linux systems with the libjpeg-turbo package in order
++to make existing releases of VirtualGL 2.1.x and TurboVNC 0.x use the new codec
++at run time.  Note that the 64-bit libjpeg-turbo packages contain only 64-bit
++binaries, whereas the TurboJPEG/IPP 64-bit packages contained both 64-bit and
++32-bit binaries.  Thus, to replace a TurboJPEG/IPP 64-bit package, install
++both the 64-bit and 32-bit versions of libjpeg-turbo.
++
++You can also build the VirtualGL 2.1.x and TurboVNC 0.6 source code with
++the libjpeg-turbo SDK instead of TurboJPEG/IPP.  It should work identically.
++libjpeg-turbo also includes static library versions of TurboJPEG/OSS, which
++are used to build VirtualGL 2.2 and TurboVNC 1.0 and later.
++
++========================================
++Using libjpeg-turbo in Your Own Programs
++========================================
++
++For the most part, libjpeg-turbo should work identically to libjpeg, so in
++most cases, an application can be built against libjpeg and then run against
++libjpeg-turbo.  On Unix systems (including Cygwin), you can build against
++libjpeg-turbo instead of libjpeg by setting
++
++  CPATH=/opt/libjpeg-turbo/include
++  and
++  LIBRARY_PATH=/opt/libjpeg-turbo/{lib}
++
++({lib} = lib32 or lib64, depending on whether you are building a 32-bit or a
++64-bit application.)
++
++If using MinGW, then set
++
++  CPATH=/c/libjpeg-turbo-gcc[64]/include
++  and
++  LIBRARY_PATH=/c/libjpeg-turbo-gcc[64]/lib
++
++Building against libjpeg-turbo is useful, for instance, if you want to build an
++application that leverages the libjpeg-turbo colorspace extensions (see below.)
++On Linux and Solaris systems, you would still need to manipulate
++LD_LIBRARY_PATH or create appropriate sym links to use libjpeg-turbo at run
++time.  On such systems, you can pass -R /opt/libjpeg-turbo/{lib} to the linker
++to force the use of libjpeg-turbo at run time rather than libjpeg (also useful
++if you want to leverage the colorspace extensions), or you can link against the
++libjpeg-turbo static library.
++
++To force a Linux, Solaris, or MinGW application to link against the static
++version of libjpeg-turbo, you can use the following linker options:
++
++  -Wl,-Bstatic -ljpeg -Wl,-Bdynamic
++
++On OS X, simply add /opt/libjpeg-turbo/lib/libjpeg.a to the linker command
++line (this also works on Linux and Solaris.)
++
++To build Visual C++ applications using libjpeg-turbo, add
++c:\libjpeg-turbo[64]\include to the system or user INCLUDE environment
++variable and c:\libjpeg-turbo[64]\lib to the system or user LIB environment
++variable, and then link against either jpeg.lib (to use the DLL version of
++libjpeg-turbo) or jpeg-static.lib (to use the static version of libjpeg-turbo.)
++
++=====================
++Colorspace Extensions
++=====================
++
++libjpeg-turbo includes extensions that allow JPEG images to be compressed
++directly from (and decompressed directly to) buffers that use BGR, BGRX,
++RGBX, XBGR, and XRGB pixel ordering.  This is implemented with ten new
++colorspace constants:
++
++  JCS_EXT_RGB   /* red/green/blue */
++  JCS_EXT_RGBX  /* red/green/blue/x */
++  JCS_EXT_BGR   /* blue/green/red */
++  JCS_EXT_BGRX  /* blue/green/red/x */
++  JCS_EXT_XBGR  /* x/blue/green/red */
++  JCS_EXT_XRGB  /* x/red/green/blue */
++  JCS_EXT_RGBA  /* red/green/blue/alpha */
++  JCS_EXT_BGRA  /* blue/green/red/alpha */
++  JCS_EXT_ABGR  /* alpha/blue/green/red */
++  JCS_EXT_ARGB  /* alpha/red/green/blue */
++
++Setting cinfo.in_color_space (compression) or cinfo.out_color_space
++(decompression) to one of these values will cause libjpeg-turbo to read the
++red, green, and blue values from (or write them to) the appropriate position in
++the pixel when compressing from/decompressing to an RGB buffer.
++
++Your application can check for the existence of these extensions at compile
++time with:
++
++  #ifdef JCS_EXTENSIONS
++
++At run time, attempting to use these extensions with a version of libjpeg
++that doesn't support them will result in a "Bogus input colorspace" error.
++
++When using the RGBX, BGRX, XBGR, and XRGB colorspaces during decompression, the
++X byte is undefined, and in order to ensure the best performance, libjpeg-turbo
++can set that byte to whatever value it wishes.  If an application expects the X
++byte to be used as an alpha channel, then it should specify JCS_EXT_RGBA,
++JCS_EXT_BGRA, JCS_EXT_ABGR, or JCS_EXT_ARGB.  When these colorspace constants
++are used, the X byte is guaranteed to be 0xFF, which is interpreted as opaque.
++
++Your application can check for the existence of the alpha channel colorspace
++extensions at compile time with:
++
++  #ifdef JCS_ALPHA_EXTENSIONS
++
++jcstest.c, located in the libjpeg-turbo source tree, demonstrates how to check
++for the existence of the colorspace extensions at compile time and run time.
++
++=================================
++libjpeg v7 and v8 API/ABI support
++=================================
++
++With libjpeg v7 and v8, new features were added that necessitated extending the
++compression and decompression structures.  Unfortunately, due to the exposed
++nature of those structures, extending them also necessitated breaking backward
++ABI compatibility with previous libjpeg releases.  Thus, programs that were
++built to use libjpeg v7 or v8 did not work with libjpeg-turbo, since it is
++based on the libjpeg v6b code base.  Although libjpeg v7 and v8 are still not
++as widely used as v6b, enough programs (including a few Linux distros) have
++made the switch that it was desirable to provide support for the libjpeg v7/v8
++API/ABI in libjpeg-turbo.  Although libjpeg-turbo can now be configured as a
++drop-in replacement for libjpeg v7 or v8, it should be noted that not all of
++the features in libjpeg v7 and v8 are supported (see below.)
++
++By passing an argument of --with-jpeg7 or --with-jpeg8 to configure, or an
++argument of -DWITH_JPEG7=1 or -DWITH_JPEG8=1 to cmake, you can build a version
++of libjpeg-turbo that emulates the libjpeg v7 or v8 API/ABI, so that programs
++that are built against libjpeg v7 or v8 can be run with libjpeg-turbo.  The
++following section describes which libjpeg v7+ features are supported and which
++aren't.
++
++libjpeg v7 and v8 Features:
++---------------------------
++
++Fully supported:
++
++-- cjpeg: Separate quality settings for luminance and chrominance
++   Note that the libjpeg v7+ API was extended to accommodate this feature only
++   for convenience purposes.  It has always been possible to implement this
++   feature with libjpeg v6b (see rdswitch.c for an example.)
++
++-- cjpeg: 32-bit BMP support
++
++-- jpegtran: lossless cropping
++
++-- jpegtran: -perfect option
++
++-- rdjpgcom: -raw option
++
++-- rdjpgcom: locale awareness
++
++
++Fully supported when using libjpeg v7/v8 emulation:
++
++-- libjpeg: In-memory source and destination managers
++
++
++Not supported:
++
++-- libjpeg: DCT scaling in compressor
++   cinfo.scale_num and cinfo.scale_denom are silently ignored.
++   There is no technical reason why DCT scaling cannot be supported, but
++   without the SmartScale extension (see below), it would only be able to
++   down-scale using ratios of 1/2, 8/15, 4/7, 8/13, 2/3, 8/11, 4/5, and 8/9,
++   which is of limited usefulness.
++
++-- libjpeg: SmartScale
++   cinfo.block_size is silently ignored.
++   SmartScale is an extension to the JPEG format that allows for DCT block
++   sizes other than 8x8.  It would be difficult to support this feature while
++   retaining backward compatibility with libjpeg v6b.
++
++-- libjpeg: IDCT scaling extensions in decompressor
++   libjpeg-turbo still supports IDCT scaling with scaling factors of 1/2, 1/4,
++   and 1/8 (same as libjpeg v6b.)
++
++-- libjpeg: Fancy downsampling in compressor
++   cinfo.do_fancy_downsampling is silently ignored.
++   This requires the DCT scaling feature, which is not supported.
++
++-- jpegtran: Scaling
++   This requires both the DCT scaling and SmartScale features, which are not
++   supported.
++
++-- Lossless RGB JPEG files
++   This requires the SmartScale feature, which is not supported.
++
++
++*******************************************************************************
++**     Performance pitfalls
++*******************************************************************************
++
++===============
++Restart Markers
++===============
++
++The optimized Huffman decoder in libjpeg-turbo does not handle restart markers
++in a way that makes the rest of the libjpeg infrastructure happy, so it is
++necessary to use the slow Huffman decoder when decompressing a JPEG image that
++has restart markers.  This can cause the decompression performance to drop by
++as much as 20%, but the performance will still be much greater than that of
++libjpeg.  Many consumer packages, such as Photoshop, use restart markers when
++generating JPEG images, so images generated by those programs will experience
++this issue.
++
++===============================================
++Fast Integer Forward DCT at High Quality Levels
++===============================================
++
++The algorithm used by the SIMD-accelerated quantization function cannot produce
++correct results whenever the fast integer forward DCT is used along with a JPEG
++quality of 98-100.  Thus, libjpeg-turbo must use the non-SIMD quantization
++function in those cases.  This causes performance to drop by as much as 40%.
++It is therefore strongly advised that you use the slow integer forward DCT
++whenever encoding images with a JPEG quality of 98 or higher.
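Review note on the section above: the advice reduces to a small decision rule, sketched here in C. This is only an illustration and not code from libjpeg-turbo. The enum values and the helper name choose_forward_dct are local stand-ins invented for this note; in jpeglib.h the corresponding constants are JDCT_ISLOW and JDCT_IFAST, which an application assigns to cinfo.dct_method before starting compression.

```c
#include <assert.h>

/* Local stand-ins for this sketch only.  In jpeglib.h the corresponding
 * values are JDCT_ISLOW and JDCT_IFAST (assigned to cinfo.dct_method). */
typedef enum { SKETCH_DCT_ISLOW, SKETCH_DCT_IFAST } sketch_dct_method;

/* Hypothetical helper: pick a forward DCT for a JPEG quality in 1..100.
 * At quality 98 or higher the fast integer DCT falls back to non-SIMD
 * quantization (per the README text), so the slow integer DCT is chosen
 * regardless of the caller's preference. */
sketch_dct_method choose_forward_dct(int quality, int prefer_fast)
{
  assert(quality >= 1 && quality <= 100);
  if (quality >= 98)
    return SKETCH_DCT_ISLOW;
  return prefer_fast ? SKETCH_DCT_IFAST : SKETCH_DCT_ISLOW;
}
```

A caller that normally prefers the fast DCT would pass prefer_fast = 1 and still get the slow integer DCT for qualities 98 through 100.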
 === added file 'src/libjpeg-turbo/config.h'
 --- src/libjpeg-turbo/config.h	1970-01-01 00:00:00 +0000
 +++ src/libjpeg-turbo/config.h	2012-06-27 16:20:24 +0000
@@ -0,0 +1,137 @@
++/* config.h.  Generated from config.h.in by configure.  */
++/* config.h.in.  Generated from configure.ac by autoheader.  */
++
++/* Build number */
++#define BUILD "20120626"
++
++/* Support arithmetic encoding */
++#define C_ARITH_CODING_SUPPORTED 1
++
++/* Support arithmetic decoding */
++#define D_ARITH_CODING_SUPPORTED 1
++
++/* Define to 1 if you have the <dlfcn.h> header file. */
++#define HAVE_DLFCN_H 1
++
++/* Define to 1 if you have the <inttypes.h> header file. */
++#define HAVE_INTTYPES_H 1
++
++/* Define to 1 if you have the <jni.h> header file. */
++/* #undef HAVE_JNI_H */
++
++/* Define to 1 if you have the `memcpy' function. */
++#define HAVE_MEMCPY 1
++
++/* Define to 1 if you have the <memory.h> header file. */
++#define HAVE_MEMORY_H 1
++
++/* Define to 1 if you have the `memset' function. */
++#define HAVE_MEMSET 1
++
++/* Define if your compiler supports prototypes */
++#define HAVE_PROTOTYPES 1
++
++/* Define to 1 if you have the <stddef.h> header file. */
++#define HAVE_STDDEF_H 1
++
++/* Define to 1 if you have the <stdint.h> header file. */
++#define HAVE_STDINT_H 1
++
++/* Define to 1 if you have the <stdlib.h> header file. */
++#define HAVE_STDLIB_H 1
++
++/* Define to 1 if you have the <strings.h> header file. */
++#define HAVE_STRINGS_H 1
++
++/* Define to 1 if you have the <string.h> header file. */
++#define HAVE_STRING_H 1
++
++/* Define to 1 if you have the <sys/stat.h> header file. */
++#define HAVE_SYS_STAT_H 1
++
++/* Define to 1 if you have the <sys/types.h> header file. */
++#define HAVE_SYS_TYPES_H 1
++
++/* Define to 1 if you have the <unistd.h> header file. */
++#define HAVE_UNISTD_H 1
++
++/* Define to 1 if the system has the type `unsigned char'. */
++#define HAVE_UNSIGNED_CHAR 1
++
++/* Define to 1 if the system has the type `unsigned short'. */
++#define HAVE_UNSIGNED_SHORT 1
++
++/* Compiler does not support pointers to undefined structures. */
++/* #undef INCOMPLETE_TYPES_BROKEN */
++
++/* How to obtain function inlining. */
++#define INLINE __attribute__((always_inline))
++
++/* libjpeg API version */
++#define JPEG_LIB_VERSION 62
++
++/* libjpeg-turbo version */
++#define LIBJPEG_TURBO_VERSION 1.2.0
++
++/* Define to the sub-directory in which libtool stores uninstalled libraries.
++   */
++#define LT_OBJDIR ".libs/"
++
++/* Define if you have BSD-like bzero and bcopy */
++/* #undef NEED_BSD_STRINGS */
++
++/* Define if you need short function names */
++/* #undef NEED_SHORT_EXTERNAL_NAMES */
++
++/* Define if you have sys/types.h */
++#define NEED_SYS_TYPES_H 1
++
++/* Name of package */
++#define PACKAGE "libjpeg-turbo"
++
++/* Define to the address where bug reports for this package should be sent. */
++#define PACKAGE_BUGREPORT ""
++
++/* Define to the full name of this package. */
++#define PACKAGE_NAME "libjpeg-turbo"
++
++/* Define to the full name and version of this package. */
++#define PACKAGE_STRING "libjpeg-turbo 1.2.0"
++
++/* Define to the one symbol short name of this package. */
++#define PACKAGE_TARNAME "libjpeg-turbo"
++
++/* Define to the home page for this package. */
++#define PACKAGE_URL ""
++
++/* Define to the version of this package. */
++#define PACKAGE_VERSION "1.2.0"
++
++/* Define if shift is unsigned */
++/* #undef RIGHT_SHIFT_IS_UNSIGNED */
++
++/* Define to 1 if you have the ANSI C header files. */
++#define STDC_HEADERS 1
++
++/* Version number of package */
++#define VERSION "1.2.0"
++
++/* Use accelerated SIMD routines. */
++#define WITH_SIMD 1
++
++/* Define to 1 if type `char' is unsigned and you are not using gcc.  */
++#ifndef __CHAR_UNSIGNED__
++/* # undef __CHAR_UNSIGNED__ */
++#endif
++
++/* Define to empty if `const' does not conform to ANSI C. */
++/* #undef const */
++
++/* Define to `__inline__' or `__inline' if that's what the C compiler
++   calls it, or to nothing if 'inline' is not supported under any name.  */
++#ifndef __cplusplus
++/* #undef inline */
++#endif
++
++/* Define to `unsigned int' if <sys/types.h> does not define. */
++/* #undef size_t */
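Review note on the config.h above: LIBJPEG_TURBO_VERSION is defined as the bare token 1.2.0 rather than as a string literal, so code that wants to print or compare it has to stringize it first. A minimal sketch of the usual two-level stringizing idiom follows. The SKETCH_* macros and both helper functions are invented for this note, and the #define is repeated only so that the example is self-contained.

```c
#include <assert.h>
#include <string.h>

/* Unquoted, exactly as in the generated config.h. */
#define LIBJPEG_TURBO_VERSION 1.2.0

/* Two levels are needed so that the macro is expanded before '#' applies. */
#define SKETCH_STR_(x) #x
#define SKETCH_STR(x) SKETCH_STR_(x)

/* Hypothetical helper: the version as a C string. */
const char *sketch_turbo_version(void)
{
  return SKETCH_STR(LIBJPEG_TURBO_VERSION);
}

/* Hypothetical helper: nonzero if the version equals the expected string. */
int sketch_is_turbo_version(const char *expected)
{
  assert(expected != NULL);
  return strcmp(sketch_turbo_version(), expected) == 0;
}
```

Using SKETCH_STR_ directly would yield the macro's name instead of its value, which is why the extra level of indirection is there.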
 === added file 'src/libjpeg-turbo/jaricom.c'
 --- src/libjpeg-turbo/jaricom.c	1970-01-01 00:00:00 +0000
 +++ src/libjpeg-turbo/jaricom.c	2012-06-27 16:20:24 +0000
@@ -0,0 +1,153 @@
++/*
++ * jaricom.c
++ *
++ * Developed 1997-2009 by Guido Vollbeding.
++ * This file is part of the Independent JPEG Group's software.
++ * For conditions of distribution and use, see the accompanying README file.
++ *
++ * This file contains probability estimation tables for common use in
++ * arithmetic entropy encoding and decoding routines.
++ *
++ * This data represents Table D.2 in the JPEG spec (ISO/IEC IS 10918-1
++ * and CCITT Recommendation ITU-T T.81) and Table 24 in the JBIG spec
++ * (ISO/IEC IS 11544 and CCITT Recommendation ITU-T T.82).
++ */
++
++#define JPEG_INTERNALS
++#include "jinclude.h"
++#include "jpeglib.h"
++
++/* The following #define specifies the packing of the four components
++ * into the compact INT32 representation.
++ * Note that this formula must match the actual arithmetic encoder
++ * and decoder implementation.  The implementation has to be changed
++ * if this formula is changed.
++ * The current organization is modeled on Markus Kuhn's JBIG
++ * implementation (jbig_tab.c).
++ */
++
++#define V(i,a,b,c,d) (((INT32)a << 16) | ((INT32)c << 8) | ((INT32)d << 7) | b)
++
++const INT32 jpeg_aritab[113+1] = {
++/*
++ * Index, Qe_Value, Next_Index_LPS, Next_Index_MPS, Switch_MPS
++ */
++  V(   0, 0x5a1d,   1,   1, 1 ),
++  V(   1, 0x2586,  14,   2, 0 ),
++  V(   2, 0x1114,  16,   3, 0 ),
++  V(   3, 0x080b,  18,   4, 0 ),
++  V(   4, 0x03d8,  20,   5, 0 ),
++  V(   5, 0x01da,  23,   6, 0 ),
++  V(   6, 0x00e5,  25,   7, 0 ),
++  V(   7, 0x006f,  28,   8, 0 ),
++  V(   8, 0x0036,  30,   9, 0 ),
++  V(   9, 0x001a,  33,  10, 0 ),
++  V(  10, 0x000d,  35,  11, 0 ),
++  V(  11, 0x0006,   9,  12, 0 ),
++  V(  12, 0x0003,  10,  13, 0 ),
++  V(  13, 0x0001,  12,  13, 0 ),
++  V(  14, 0x5a7f,  15,  15, 1 ),
++  V(  15, 0x3f25,  36,  16, 0 ),
++  V(  16, 0x2cf2,  38,  17, 0 ),
++  V(  17, 0x207c,  39,  18, 0 ),
++  V(  18, 0x17b9,  40,  19, 0 ),
++  V(  19, 0x1182,  42,  20, 0 ),
++  V(  20, 0x0cef,  43,  21, 0 ),
++  V(  21, 0x09a1,  45,  22, 0 ),
++  V(  22, 0x072f,  46,  23, 0 ),
++  V(  23, 0x055c,  48,  24, 0 ),
++  V(  24, 0x0406,  49,  25, 0 ),
++  V(  25, 0x0303,  51,  26, 0 ),
++  V(  26, 0x0240,  52,  27, 0 ),
++  V(  27, 0x01b1,  54,  28, 0 ),
++  V(  28, 0x0144,  56,  29, 0 ),
++  V(  29, 0x00f5,  57,  30, 0 ),
++  V(  30, 0x00b7,  59,  31, 0 ),
++  V(  31, 0x008a,  60,  32, 0 ),
++  V(  32, 0x0068,  62,  33, 0 ),
++  V(  33, 0x004e,  63,  34, 0 ),
++  V(  34, 0x003b,  32,  35, 0 ),
++  V(  35, 0x002c,  33,   9, 0 ),
++  V(  36, 0x5ae1,  37,  37, 1 ),
++  V(  37, 0x484c,  64,  38, 0 ),
++  V(  38, 0x3a0d,  65,  39, 0 ),
++  V(  39, 0x2ef1,  67,  40, 0 ),
++  V(  40, 0x261f,  68,  41, 0 ),
++  V(  41, 0x1f33,  69,  42, 0 ),
++  V(  42, 0x19a8,  70,  43, 0 ),
++  V(  43, 0x1518,  72,  44, 0 ),
++  V(  44, 0x1177,  73,  45, 0 ),
++  V(  45, 0x0e74,  74,  46, 0 ),
++  V(  46, 0x0bfb,  75,  47, 0 ),
++  V(  47, 0x09f8,  77,  48, 0 ),
++  V(  48, 0x0861,  78,  49, 0 ),
++  V(  49, 0x0706,  79,  50, 0 ),
++  V(  50, 0x05cd,  48,  51, 0 ),
++  V(  51, 0x04de,  50,  52, 0 ),
++  V(  52, 0x040f,  50,  53, 0 ),
++  V(  53, 0x0363,  51,  54, 0 ),
++  V(  54, 0x02d4,  52,  55, 0 ),
++  V(  55, 0x025c,  53,  56, 0 ),
++  V(  56, 0x01f8,  54,  57, 0 ),
++  V(  57, 0x01a4,  55,  58, 0 ),
++  V(  58, 0x0160,  56,  59, 0 ),
++  V(  59, 0x0125,  57,  60, 0 ),
++  V(  60, 0x00f6,  58,  61, 0 ),
++  V(  61, 0x00cb,  59,  62, 0 ),
++  V(  62, 0x00ab,  61,  63, 0 ),
++  V(  63, 0x008f,  61,  32, 0 ),
++  V(  64, 0x5b12,  65,  65, 1 ),
++  V(  65, 0x4d04,  80,  66, 0 ),
++  V(  66, 0x412c,  81,  67, 0 ),
++  V(  67, 0x37d8,  82,  68, 0 ),
++  V(  68, 0x2fe8,  83,  69, 0 ),
++  V(  69, 0x293c,  84,  70, 0 ),
++  V(  70, 0x2379,  86,  71, 0 ),
++  V(  71, 0x1edf,  87,  72, 0 ),
++  V(  72, 0x1aa9,  87,  73, 0 ),
++  V(  73, 0x174e,  72,  74, 0 ),
++  V(  74, 0x1424,  72,  75, 0 ),
++  V(  75, 0x119c,  74,  76, 0 ),
++  V(  76, 0x0f6b,  74,  77, 0 ),
++  V(  77, 0x0d51,  75,  78, 0 ),
++  V(  78, 0x0bb6,  77,  79, 0 ),
++  V(  79, 0x0a40,  77,  48, 0 ),
++  V(  80, 0x5832,  80,  81, 1 ),
++  V(  81, 0x4d1c,  88,  82, 0 ),
++  V(  82, 0x438e,  89,  83, 0 ),
++  V(  83, 0x3bdd,  90,  84, 0 ),
++  V(  84, 0x34ee,  91,  85, 0 ),
++  V(  85, 0x2eae,  92,  86, 0 ),
++  V(  86, 0x299a,  93,  87, 0 ),
++  V(  87, 0x2516,  86,  71, 0 ),
++  V(  88, 0x5570,  88,  89, 1 ),
++  V(  89, 0x4ca9,  95,  90, 0 ),
++  V(  90, 0x44d9,  96,  91, 0 ),
++  V(  91, 0x3e22,  97,  92, 0 ),
++  V(  92, 0x3824,  99,  93, 0 ),
++  V(  93, 0x32b4,  99,  94, 0 ),
++  V(  94, 0x2e17,  93,  86, 0 ),
++  V(  95, 0x56a8,  95,  96, 1 ),
++  V(  96, 0x4f46, 101,  97, 0 ),
++  V(  97, 0x47e5, 102,  98, 0 ),
++  V(  98, 0x41cf, 103,  99, 0 ),
++  V(  99, 0x3c3d, 104, 100, 0 ),
++  V( 100, 0x375e,  99,  93, 0 ),
++  V( 101, 0x5231, 105, 102, 0 ),
++  V( 102, 0x4c0f, 106, 103, 0 ),
++  V( 103, 0x4639, 107, 104, 0 ),
++  V( 104, 0x415e, 103,  99, 0 ),
++  V( 105, 0x5627, 105, 106, 1 ),
++  V( 106, 0x50e7, 108, 107, 0 ),
++  V( 107, 0x4b85, 109, 103, 0 ),
++  V( 108, 0x5597, 110, 109, 0 ),
++  V( 109, 0x504f, 111, 107, 0 ),
++  V( 110, 0x5a10, 110, 111, 1 ),
++  V( 111, 0x5522, 112, 109, 0 ),
++  V( 112, 0x59eb, 112, 111, 1 ),
++/*
++ * This last entry is used for fixed probability estimate of 0.5
++ * as recommended in Section 10.3 Table 5 of ITU-T Rec. T.851.
++ */
++  V( 113, 0x5a1d, 113, 113, 0 )
++};
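Review note on jaricom.c above: the V() macro packs four fields into one 32-bit word, with Qe_Value in bits 16 and up, Next_Index_MPS starting at bit 8, Switch_MPS at bit 7, and Next_Index_LPS in the low 7 bits. The sketch below re-creates that layout and unpacks it again. It is an illustration only: SKETCH_V, sketch_entry, sketch_pack and sketch_unpack are names invented for this note, int32_t stands in for INT32, and the actual encoder and decoder may extract the fields differently.

```c
#include <assert.h>
#include <stdint.h>

/* Same bit layout as V() in jaricom.c, with arguments parenthesized.
 * The index argument i is documentation only and does not affect the value. */
#define SKETCH_V(i, qe, next_lps, next_mps, sw) \
  (((int32_t)(qe) << 16) | ((int32_t)(next_mps) << 8) | \
   ((int32_t)(sw) << 7) | (int32_t)(next_lps))

typedef struct {
  int qe;          /* Qe_Value, bits 16..30 */
  int next_lps;    /* Next_Index_LPS, bits 0..6 */
  int next_mps;    /* Next_Index_MPS, bits 8..14 */
  int switch_mps;  /* Switch_MPS, bit 7 */
} sketch_entry;

/* Pack one table entry after checking that every field fits its slot. */
int32_t sketch_pack(int qe, int next_lps, int next_mps, int sw)
{
  assert(qe >= 0 && qe <= 0x7FFF);
  assert(next_lps >= 0 && next_lps <= 0x7F);
  assert(next_mps >= 0 && next_mps <= 0x7F);
  assert(sw == 0 || sw == 1);
  return SKETCH_V(0, qe, next_lps, next_mps, sw);
}

/* Split a packed entry back into its four fields. */
sketch_entry sketch_unpack(int32_t v)
{
  sketch_entry e;
  e.qe = (int)((v >> 16) & 0xFFFF);
  e.next_mps = (int)((v >> 8) & 0x7F);
  e.switch_mps = (int)((v >> 7) & 1);
  e.next_lps = (int)(v & 0x7F);
  return e;
}
```

The layout works because every index in the table is at most 113, which fits in 7 bits, and every Qe value is below 0x8000, so the shifted value stays within a signed 32-bit integer.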
 === added file 'src/libjpeg-turbo/jcapimin.c'
 --- src/libjpeg-turbo/jcapimin.c	1970-01-01 00:00:00 +0000
 +++ src/libjpeg-turbo/jcapimin.c	2012-06-27 16:20:24 +0000
@@ -0,0 +1,292 @@
++/*
++ * jcapimin.c
++ *
++ * Copyright (C) 1994-1998, Thomas G. Lane.
++ * Modified 2003-2010 by Guido Vollbeding.
++ * This file is part of the Independent JPEG Group's software.
++ * For conditions of distribution and use, see the accompanying README file.
++ *
++ * This file contains application interface code for the compression half
++ * of the JPEG library.  These are the "minimum" API routines that may be
++ * needed in either the normal full-compression case or the transcoding-only
++ * case.
++ *
++ * Most of the routines intended to be called directly by an application
++ * are in this file or in jcapistd.c.  But also see jcparam.c for
++ * parameter-setup helper routines, jcomapi.c for routines shared by
++ * compression and decompression, and jctrans.c for the transcoding case.
++ */
++
++#define JPEG_INTERNALS
++#include "jinclude.h"
++#include "jpeglib.h"
++
++
++/*
++ * Initialization of a JPEG compression object.
++ * The error manager must already be set up (in case memory manager fails).
++ */
++
++GLOBAL(void)
++jpeg_CreateCompress (j_compress_ptr cinfo, int version, size_t structsize)
++{
++  int i;
++
++  /* Guard against version mismatches between library and caller. */
++  cinfo->mem = NULL;		/* so jpeg_destroy knows mem mgr not called */
++  if (version != JPEG_LIB_VERSION)
++    ERREXIT2(cinfo, JERR_BAD_LIB_VERSION, JPEG_LIB_VERSION, version);
++  if (structsize != SIZEOF(struct jpeg_compress_struct))
++    ERREXIT2(cinfo, JERR_BAD_STRUCT_SIZE,
++	     (int) SIZEOF(struct jpeg_compress_struct), (int) structsize);
++
++  /* For debugging purposes, we zero the whole master structure.
++   * But the application has already set the err pointer, and may have set
++   * client_data, so we have to save and restore those fields.
++   * Note: if application hasn't set client_data, tools like Purify may
++   * complain here.
++   */
++  {
++    struct jpeg_error_mgr * err = cinfo->err;
++    void * client_data = cinfo->client_data; /* ignore Purify complaint here */
++    MEMZERO(cinfo, SIZEOF(struct jpeg_compress_struct));
++    cinfo->err = err;
++    cinfo->client_data = client_data;
++  }
++  cinfo->is_decompressor = FALSE;
++
++  /* Initialize a memory manager instance for this object */
++  jinit_memory_mgr((j_common_ptr) cinfo);
++
++  /* Zero out pointers to permanent structures. */
++  cinfo->progress = NULL;
++  cinfo->dest = NULL;
++
++  cinfo->comp_info = NULL;
++
++  for (i = 0; i < NUM_QUANT_TBLS; i++) {
++    cinfo->quant_tbl_ptrs[i] = NULL;
++#if JPEG_LIB_VERSION >= 70
++    cinfo->q_scale_factor[i] = 100;
++#endif
++  }
++
++  for (i = 0; i < NUM_HUFF_TBLS; i++) {
++    cinfo->dc_huff_tbl_ptrs[i] = NULL;
++    cinfo->ac_huff_tbl_ptrs[i] = NULL;
++  }
++
++#if JPEG_LIB_VERSION >= 80
++  /* Must do it here for emit_dqt in case jpeg_write_tables is used */
++  cinfo->block_size = DCTSIZE;
++  cinfo->natural_order = jpeg_natural_order;
++  cinfo->lim_Se = DCTSIZE2-1;
++#endif
++
++  cinfo->script_space = NULL;
++
++  cinfo->input_gamma = 1.0;	/* in case application forgets */
++
++  /* OK, I'm ready */
++  cinfo->global_state = CSTATE_START;
++}
++
++
++/*
++ * Destruction of a JPEG compression object
++ */
++
++GLOBAL(void)
++jpeg_destroy_compress (j_compress_ptr cinfo)
++{
++  jpeg_destroy((j_common_ptr) cinfo); /* use common routine */
++}
++
++
++/*
++ * Abort processing of a JPEG compression operation,
++ * but don't destroy the object itself.
++ */
++
++GLOBAL(void)
++jpeg_abort_compress (j_compress_ptr cinfo)
++{
++  jpeg_abort((j_common_ptr) cinfo); /* use common routine */
++}
++
++
++/*
++ * Forcibly suppress or un-suppress all quantization and Huffman tables.
++ * Marks all currently defined tables as already written (if suppress)
++ * or not written (if !suppress).  This will control whether they get emitted
++ * by a subsequent jpeg_start_compress call.
++ *
++ * This routine is exported for use by applications that want to produce
++ * abbreviated JPEG datastreams.  It logically belongs in jcparam.c, but
++ * since it is called by jpeg_start_compress, we put it here --- otherwise
++ * jcparam.o would be linked whether the application used it or not.
++ */
++
++GLOBAL(void)
++jpeg_suppress_tables (j_compress_ptr cinfo, boolean suppress)
++{
++  int i;
++  JQUANT_TBL * qtbl;
++  JHUFF_TBL * htbl;
++
++  for (i = 0; i < NUM_QUANT_TBLS; i++) {
++    if ((qtbl = cinfo->quant_tbl_ptrs[i]) != NULL)
++      qtbl->sent_table = suppress;
++  }
++
++  for (i = 0; i < NUM_HUFF_TBLS; i++) {
++    if ((htbl = cinfo->dc_huff_tbl_ptrs[i]) != NULL)
++      htbl->sent_table = suppress;
++    if ((htbl = cinfo->ac_huff_tbl_ptrs[i]) != NULL)
++      htbl->sent_table = suppress;
++  }
++}
++
++
++/*
++ * Finish JPEG compression.
++ *
++ * If a multipass operating mode was selected, this may do a great deal of
++ * work including most of the actual output.
++ */
++
++GLOBAL(void)
++jpeg_finish_compress (j_compress_ptr cinfo)
++{
++  JDIMENSION iMCU_row;
++
++  if (cinfo->global_state == CSTATE_SCANNING ||
++      cinfo->global_state == CSTATE_RAW_OK) {
++    /* Terminate first pass */
++    if (cinfo->next_scanline < cinfo->image_height)
++      ERREXIT(cinfo, JERR_TOO_LITTLE_DATA);
++    (*cinfo->master->finish_pass) (cinfo);
++  } else if (cinfo->global_state != CSTATE_WRCOEFS)
++    ERREXIT1(cinfo, JERR_BAD_STATE, cinfo->global_state);
++  /* Perform any remaining passes */
++  while (! cinfo->master->is_last_pass) {
++    (*cinfo->master->prepare_for_pass) (cinfo);
++    for (iMCU_row = 0; iMCU_row < cinfo->total_iMCU_rows; iMCU_row++) {
++      if (cinfo->progress != NULL) {
++	cinfo->progress->pass_counter = (long) iMCU_row;
++	cinfo->progress->pass_limit = (long) cinfo->total_iMCU_rows;
++	(*cinfo->progress->progress_monitor) ((j_common_ptr) cinfo);
++      }
++      /* We bypass the main controller and invoke coef controller directly;
++       * all work is being done from the coefficient buffer.
++       */
++      if (! (*cinfo->coef->compress_data) (cinfo, (JSAMPIMAGE) NULL))
++	ERREXIT(cinfo, JERR_CANT_SUSPEND);
++    }
++    (*cinfo->master->finish_pass) (cinfo);
++  }
++  /* Write EOI, do final cleanup */
++  (*cinfo->marker->write_file_trailer) (cinfo);
++  (*cinfo->dest->term_destination) (cinfo);
++  /* We can use jpeg_abort to release memory and reset global_state */
++  jpeg_abort((j_common_ptr) cinfo);
++}
++
++
++/*
++ * Write a special marker.
++ * This is only recommended for writing COM or APPn markers.
++ * Must be called after jpeg_start_compress() and before
++ * first call to jpeg_write_scanlines() or jpeg_write_raw_data().
++ */
++
++GLOBAL(void)
++jpeg_write_marker (j_compress_ptr cinfo, int marker,
++		   const JOCTET *dataptr, unsigned int datalen)
++{
++  JMETHOD(void, write_marker_byte, (j_compress_ptr info, int val));
++
++  if (cinfo->next_scanline != 0 ||
++      (cinfo->global_state != CSTATE_SCANNING &&
++       cinfo->global_state != CSTATE_RAW_OK &&
++       cinfo->global_state != CSTATE_WRCOEFS))
++    ERREXIT1(cinfo, JERR_BAD_STATE, cinfo->global_state);
++
++  (*cinfo->marker->write_marker_header) (cinfo, marker, datalen);
++  write_marker_byte = cinfo->marker->write_marker_byte;	/* copy for speed */
++  while (datalen--) {
++    (*write_marker_byte) (cinfo, *dataptr);
++    dataptr++;
++  }
++}
++
++/* Same, but piecemeal. */
++
++GLOBAL(void)
++jpeg_write_m_header (j_compress_ptr cinfo, int marker, unsigned int datalen)
++{
++  if (cinfo->next_scanline != 0 ||
++      (cinfo->global_state != CSTATE_SCANNING &&
++       cinfo->global_state != CSTATE_RAW_OK &&
++       cinfo->global_state != CSTATE_WRCOEFS))
++    ERREXIT1(cinfo, JERR_BAD_STATE, cinfo->global_state);
++
++  (*cinfo->marker->write_marker_header) (cinfo, marker, datalen);
++}
++
++GLOBAL(void)
++jpeg_write_m_byte (j_compress_ptr cinfo, int val)
++{
++  (*cinfo->marker->write_marker_byte) (cinfo, val);
++}
++
++
++/*
++ * Alternate compression function: just write an abbreviated table file.
++ * Before calling this, all parameters and a data destination must be set up.
++ *
++ * To produce a pair of files containing abbreviated tables and abbreviated
++ * image data, one would proceed as follows:
++ *
++ *		initialize JPEG object
++ *		set JPEG parameters
++ *		set destination to table file
++ *		jpeg_write_tables(cinfo);
++ *		set destination to image file
++ *		jpeg_start_compress(cinfo, FALSE);
++ *		write data...
++ *		jpeg_finish_compress(cinfo);
++ *
++ * jpeg_write_tables has the side effect of marking all tables written
++ * (same as jpeg_suppress_tables(..., TRUE)).  Thus a subsequent start_compress
++ * will not re-emit the tables unless it is passed write_all_tables=TRUE.
++ */
++
++GLOBAL(void)
++jpeg_write_tables (j_compress_ptr cinfo)
++{
++  if (cinfo->global_state != CSTATE_START)
++    ERREXIT1(cinfo, JERR_BAD_STATE, cinfo->global_state);
++
++  /* (Re)initialize error mgr and destination modules */
++  (*cinfo->err->reset_error_mgr) ((j_common_ptr) cinfo);
++  (*cinfo->dest->init_destination) (cinfo);
++  /* Initialize the marker writer ... bit of a crock to do it here. */
++  jinit_marker_writer(cinfo);
++  /* Write them tables! */
++  (*cinfo->marker->write_tables_only) (cinfo);
++  /* And clean up. */
++  (*cinfo->dest->term_destination) (cinfo);
++  /*
++   * In library releases up through v6a, we called jpeg_abort() here to free
++   * any working memory allocated by the destination manager and marker
++   * writer.  Some applications had a problem with that: they allocated space
++   * of their own from the library memory manager, and didn't want it to go
++   * away during write_tables.  So now we do nothing.  This will cause a
++   * memory leak if an app calls write_tables repeatedly without doing a full
++   * compression cycle or otherwise resetting the JPEG object.  However, that
++   * seems less bad than unexpectedly freeing memory in the normal case.
++   * An app that prefers the old behavior can call jpeg_abort for itself after
++   * each call to jpeg_write_tables().
++   */
++}
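
Review note (not part of the patch): the interaction between jpeg_write_tables, jpeg_suppress_tables and the write_all_tables argument comes down to one sent_table flag per table. The toy model below is a sketch under stated assumptions, not the library's code. Every toy_* name is hypothetical, a single table array stands in for the quantization and Huffman tables, and table emission is collapsed into one helper, whereas the real library emits tables from its marker writer when headers are written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NUM_TBLS 4          /* hypothetical; stands in for NUM_QUANT_TBLS */

typedef struct { bool sent_table; } toy_table;
typedef struct { toy_table *tbl[TOY_NUM_TBLS]; } toy_compress;

/* Same loop shape as jpeg_suppress_tables: only defined tables are touched. */
static void toy_suppress_tables(toy_compress *c, bool suppress)
{
  int i;
  for (i = 0; i < TOY_NUM_TBLS; i++)
    if (c->tbl[i] != NULL)
      c->tbl[i]->sent_table = suppress;
}

/* Emit every defined table not yet marked as sent, mark it sent, and
 * return how many were emitted.  Models the effect of jpeg_write_tables.
 */
static int toy_emit_pending(toy_compress *c)
{
  int i, emitted = 0;
  for (i = 0; i < TOY_NUM_TBLS; i++)
    if (c->tbl[i] != NULL && !c->tbl[i]->sent_table) {
      c->tbl[i]->sent_table = true;
      emitted++;
    }
  return emitted;
}

/* Models jpeg_start_compress: write_all_tables clears the flags first,
 * so every defined table is emitted again.
 */
static int toy_start_compress(toy_compress *c, bool write_all_tables)
{
  if (write_all_tables)
    toy_suppress_tables(c, false);
  return toy_emit_pending(c);
}
```

In this model, writing a table file and then starting compression with write_all_tables set to false emits no tables, which corresponds to the abbreviated image stream described in the comment above jpeg_write_tables. Passing true re-emits them all.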
 === added file 'src/libjpeg-turbo/jcapistd.c'
 --- src/libjpeg-turbo/jcapistd.c	1970-01-01 00:00:00 +0000
 +++ src/libjpeg-turbo/jcapistd.c	2012-06-27 16:20:24 +0000
@@ -0,0 +1,161 @@
++/*
++ * jcapistd.c
++ *
++ * Copyright (C) 1994-1996, Thomas G. Lane.
++ * This file is part of the Independent JPEG Group's software.
++ * For conditions of distribution and use, see the accompanying README file.
++ *
++ * This file contains application interface code for the compression half
++ * of the JPEG library.  These are the "standard" API routines that are
++ * used in the normal full-compression case.  They are not used by a
++ * transcoding-only application.  Note that if an application links in
++ * jpeg_start_compress, it will end up linking in the entire compressor.
++ * We thus must separate this file from jcapimin.c to avoid linking the
++ * whole compression library into a transcoder.
++ */
++
++#define JPEG_INTERNALS
++#include "jinclude.h"
++#include "jpeglib.h"
++
++
++/*
++ * Compression initialization.
++ * Before calling this, all parameters and a data destination must be set up.
++ *
++ * We require a write_all_tables parameter as a failsafe check when writing
++ * multiple datastreams from the same compression object.  Since prior runs
++ * will have left all the tables marked sent_table=TRUE, a subsequent run
++ * would emit an abbreviated stream (no tables) by default.  This may be what
++ * is wanted, but for safety's sake it should not be the default behavior:
++ * programmers should have to make a deliberate choice to emit abbreviated
++ * images.  Therefore the documentation and examples should encourage people
++ * to pass write_all_tables=TRUE; then it will take active thought to do the
++ * wrong thing.
++ */
++
++GLOBAL(void)
++jpeg_start_compress (j_compress_ptr cinfo, boolean write_all_tables)
++{
++  if (cinfo->global_state != CSTATE_START)
++    ERREXIT1(cinfo, JERR_BAD_STATE, cinfo->global_state);
++
++  if (write_all_tables)
++    jpeg_suppress_tables(cinfo, FALSE);	/* mark all tables to be written */
++
++  /* (Re)initialize error mgr and destination modules */
++  (*cinfo->err->reset_error_mgr) ((j_common_ptr) cinfo);
++  (*cinfo->dest->init_destination) (cinfo);
++  /* Perform master selection of active modules */
++  jinit_compress_master(cinfo);
++  /* Set up for the first pass */
++  (*cinfo->master->prepare_for_pass) (cinfo);
++  /* Ready for application to drive first pass through jpeg_write_scanlines
++   * or jpeg_write_raw_data.
++   */
++  cinfo->next_scanline = 0;
++  cinfo->global_state = (cinfo->raw_data_in ? CSTATE_RAW_OK : CSTATE_SCANNING);
++}
++
++
++/*
++ * Write some scanlines of data to the JPEG compressor.
++ *
++ * The return value will be the number of lines actually written.
++ * This should be less than the supplied num_lines only in case that
++ * the data destination module has requested suspension of the compressor,
++ * or if more than image_height scanlines are passed in.
++ *
++ * Note: we warn about excess calls to jpeg_write_scanlines() since
++ * this likely signals an application programmer error.  However,
++ * excess scanlines passed in the last valid call are *silently* ignored,
++ * so that the application need not adjust num_lines for end-of-image
++ * when using a multiple-scanline buffer.
++ */
++
++GLOBAL(JDIMENSION)
++jpeg_write_scanlines (j_compress_ptr cinfo, JSAMPARRAY scanlines,
++		      JDIMENSION num_lines)
++{
++  JDIMENSION row_ctr, rows_left;
++
++  if (cinfo->global_state != CSTATE_SCANNING)
++    ERREXIT1(cinfo, JERR_BAD_STATE, cinfo->global_state);
++  if (cinfo->next_scanline >= cinfo->image_height)
++    WARNMS(cinfo, JWRN_TOO_MUCH_DATA);
++
++  /* Call progress monitor hook if present */
++  if (cinfo->progress != NULL) {
++    cinfo->progress->pass_counter = (long) cinfo->next_scanline;
++    cinfo->progress->pass_limit = (long) cinfo->image_height;
++    (*cinfo->progress->progress_monitor) ((j_common_ptr) cinfo);
++  }
++
++  /* Give master control module another chance if this is first call to
++   * jpeg_write_scanlines.  This lets output of the frame/scan headers be
++   * delayed so that application can write COM, etc, markers between
++   * jpeg_start_compress and jpeg_write_scanlines.
++   */
++  if (cinfo->master->call_pass_startup)
++    (*cinfo->master->pass_startup) (cinfo);
++
++  /* Ignore any extra scanlines at bottom of image. */
++  rows_left = cinfo->image_height - cinfo->next_scanline;
++  if (num_lines > rows_left)
++    num_lines = rows_left;
++
++  row_ctr = 0;
++  (*cinfo->main->process_data) (cinfo, scanlines, &row_ctr, num_lines);
++  cinfo->next_scanline += row_ctr;
++  return row_ctr;
++}
++
++
++/*
++ * Alternate entry point to write raw data.
++ * Processes exactly one iMCU row per call, unless suspended.
++ */
++
++GLOBAL(JDIMENSION)
++jpeg_write_raw_data (j_compress_ptr cinfo, JSAMPIMAGE data,
++		     JDIMENSION num_lines)
++{
++  JDIMENSION lines_per_iMCU_row;
++
++  if (cinfo->global_state != CSTATE_RAW_OK)
++    ERREXIT1(cinfo, JERR_BAD_STATE, cinfo->global_state);
++  if (cinfo->next_scanline >= cinfo->image_height) {
++    WARNMS(cinfo, JWRN_TOO_MUCH_DATA);
++    return 0;
++  }
++
++  /* Call progress monitor hook if present */
++  if (cinfo->progress != NULL) {
++    cinfo->progress->pass_counter = (long) cinfo->next_scanline;
++    cinfo->progress->pass_limit = (long) cinfo->image_height;
++    (*cinfo->progress->progress_monitor) ((j_common_ptr) cinfo);
++  }
++
++  /* Give master control module another chance if this is first call to
++   * jpeg_write_raw_data.  This lets output of the frame/scan headers be
++   * delayed so that application can write COM, etc, markers between
++   * jpeg_start_compress and jpeg_write_raw_data.
++   */
++  if (cinfo->master->call_pass_startup)
++    (*cinfo->master->pass_startup) (cinfo);
++
++  /* Verify that at least one iMCU row has been passed. */
++  lines_per_iMCU_row = cinfo->max_v_samp_factor * DCTSIZE;
++  if (num_lines < lines_per_iMCU_row)
++    ERREXIT(cinfo, JERR_BUFFER_SIZE);
++
++  /* Directly compress the row. */
++  if (! (*cinfo->coef->compress_data) (cinfo, data)) {
++    /* If compressor did not consume the whole row, suspend processing. */
++    return 0;
++  }
++
++  /* OK, we processed one iMCU row. */
++  cinfo->next_scanline += lines_per_iMCU_row;
++  return lines_per_iMCU_row;
++}
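
Review note (not part of the patch): jpeg_write_scanlines silently clamps the final call, so a caller can hand over a fixed-size multi-scanline buffer without adjusting the count at the bottom of the image. The sketch below models only that arithmetic, assuming the destination never requests suspension. The toy_* names are hypothetical, and TOY_DCTSIZE assumes the usual DCTSIZE value of 8.

```c
#include <assert.h>

typedef unsigned int toy_dim;   /* stands in for JDIMENSION */

#define TOY_DCTSIZE 8           /* assumed value of DCTSIZE */

/* Lines consumed by one jpeg_write_scanlines call, ignoring suspension:
 * excess lines in the last valid call are dropped, and a call made after
 * the image is complete consumes nothing (the library also warns there).
 */
static toy_dim toy_lines_accepted(toy_dim image_height, toy_dim next_scanline,
                                  toy_dim num_lines)
{
  toy_dim rows_left;

  if (next_scanline >= image_height)
    return 0;
  rows_left = image_height - next_scanline;
  return num_lines > rows_left ? rows_left : num_lines;
}

/* Number of calls a caller needs when it always offers `chunk` lines. */
static unsigned int toy_feed_all(toy_dim image_height, toy_dim chunk)
{
  toy_dim next = 0;
  unsigned int calls = 0;

  if (chunk == 0)
    return 0;                   /* nothing would ever be consumed */
  while (next < image_height) {
    next += toy_lines_accepted(image_height, next, chunk);
    calls++;
  }
  return calls;
}

/* Minimum num_lines jpeg_write_raw_data accepts: one full iMCU row. */
static toy_dim toy_lines_per_imcu_row(int max_v_samp_factor)
{
  return (toy_dim) max_v_samp_factor * TOY_DCTSIZE;
}
```

For a 100-line image fed 16 lines at a time, the seventh call accepts only the remaining 4 lines. With a maximum vertical sampling factor of 2, the raw-data entry point expects at least 16 lines per call.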
 === added file 'src/libjpeg-turbo/jcarith.c'
 --- src/libjpeg-turbo/jcarith.c	1970-01-01 00:00:00 +0000
 +++ src/libjpeg-turbo/jcarith.c	2012-06-27 16:20:24 +0000
@@ -0,0 +1,925 @@
++/*
++ * jcarith.c
++ *
++ * Developed 1997-2009 by Guido Vollbeding.
++ * This file is part of the Independent JPEG Group's software.
++ * For conditions of distribution and use, see the accompanying README file.
++ *
++ * This file contains portable arithmetic entropy encoding routines for JPEG
++ * (implementing the ISO/IEC IS 10918-1 and CCITT Recommendation ITU-T T.81).
++ *
++ * Both sequential and progressive modes are supported in this single module.
++ *
++ * Suspension is not currently supported in this module.
++ */
++
++#define JPEG_INTERNALS
++#include "jinclude.h"
++#include "jpeglib.h"
++
++
++/* Expanded entropy encoder object for arithmetic encoding. */
++
++typedef struct {
++  struct jpeg_entropy_encoder pub; /* public fields */
++
++  INT32 c; /* C register, base of coding interval, layout as in sec. D.1.3 */
++  INT32 a;               /* A register, normalized size of coding interval */
++  INT32 sc;        /* counter for stacked 0xFF values which might overflow */
++  INT32 zc;          /* counter for pending 0x00 output values which might *
++                          * be discarded at the end ("Pacman" termination) */
++  int ct;  /* bit shift counter, determines when next byte will be written */
++  int buffer;                /* buffer for most recent output byte != 0xFF */
++
++  int last_dc_val[MAX_COMPS_IN_SCAN]; /* last DC coef for each component */
++  int dc_context[MAX_COMPS_IN_SCAN]; /* context index for DC conditioning */
++
++  unsigned int restarts_to_go;	/* MCUs left in this restart interval */
++  int next_restart_num;		/* next restart number to write (0-7) */
++
++  /* Pointers to statistics areas (these workspaces have image lifespan) */
++  unsigned char * dc_stats[NUM_ARITH_TBLS];
++  unsigned char * ac_stats[NUM_ARITH_TBLS];
++
++  /* Statistics bin for coding with fixed probability 0.5 */
++  unsigned char fixed_bin[4];
++} arith_entropy_encoder;
++
++typedef arith_entropy_encoder * arith_entropy_ptr;
++
++/* The following two definitions specify the allocation chunk size
++ * for the statistics area.
++ * According to sections F.1.4.4.1.3 and F.1.4.4.2, we need at least
++ * 49 statistics bins for DC, and 245 statistics bins for AC coding.
++ *
++ * We use a compact representation with 1 byte per statistics bin,
++ * thus the numbers directly represent byte sizes.
++ * This 1 byte per statistics bin contains the meaning of the MPS
++ * (more probable symbol) in the highest bit (mask 0x80), and the
++ * index into the probability estimation state machine table
++ * in the lower bits (mask 0x7F).
++ */
++
++#define DC_STAT_BINS 64
++#define AC_STAT_BINS 256
++
++/* NOTE: Uncomment the following #define if you want to use the
++ * given formula for calculating the AC conditioning parameter Kx
++ * for spectral selection progressive coding in section G.1.3.2
++ * of the spec (Kx = Kmin + SRL (8 + Se - Kmin) 4).
++ * Although the spec and P&M authors claim that this "has proven
++ * to give good results for 8 bit precision samples", I'm not
++ * convinced yet that this is really beneficial.
++ * Early tests gave only very marginal compression enhancements
++ * (a few - around 5 or so - bytes even for very large files),
++ * which would turn out rather negative if we'd suppress the
++ * DAC (Define Arithmetic Conditioning) marker segments for
++ * the default parameters in the future.
++ * Note that currently the marker writing module emits 12-byte
++ * DAC segments for a full-component scan in a color image.
++ * This is not worth worrying about IMHO. However, since the
++ * spec defines the default values to be used if the tables
++ * are omitted (unlike Huffman tables, which are required
++ * anyway), one might optimize this behaviour in the future,
++ * and then it would be disadvantageous to use custom tables if
++ * they don't provide sufficient gain to exceed the DAC size.
++ *
++ * On the other hand, I'd consider it as a reasonable result
++ * that the conditioning has no significant influence on the
++ * compression performance. This means that the basic
++ * statistical model is already rather stable.
++ *
++ * Thus, at the moment, we use the default conditioning values
++ * anyway, and do not use the custom formula.
++ *
++#define CALCULATE_SPECTRAL_CONDITIONING
++ */
++
++/* IRIGHT_SHIFT is like RIGHT_SHIFT, but works on int rather than INT32.
++ * We assume that int right shift is unsigned if INT32 right shift is,
++ * which should be safe.
++ */
++
++#ifdef RIGHT_SHIFT_IS_UNSIGNED
++#define ISHIFT_TEMPS	int ishift_temp;
++#define IRIGHT_SHIFT(x,shft)  \
++	((ishift_temp = (x)) < 0 ? \
++	 (ishift_temp >> (shft)) | ((~0) << (16-(shft))) : \
++	 (ishift_temp >> (shft)))
++#else
++#define ISHIFT_TEMPS
++#define IRIGHT_SHIFT(x,shft)	((x) >> (shft))
++#endif
++
++
++LOCAL(void)
++emit_byte (int val, j_compress_ptr cinfo)
++/* Write next output byte; we do not support suspension in this module. */
++{
++  struct jpeg_destination_mgr * dest = cinfo->dest;
++
++  *dest->next_output_byte++ = (JOCTET) val;
++  if (--dest->free_in_buffer == 0)
++    if (! (*dest->empty_output_buffer) (cinfo))
++      ERREXIT(cinfo, JERR_CANT_SUSPEND);
++}
++
++
++/*
++ * Finish up at the end of an arithmetic-compressed scan.
++ */
++
++METHODDEF(void)
++finish_pass (j_compress_ptr cinfo)
++{
++  arith_entropy_ptr e = (arith_entropy_ptr) cinfo->entropy;
++  INT32 temp;
++
++  /* Section D.1.8: Termination of encoding */
++
++  /* Find the e->c in the coding interval with the largest
++   * number of trailing zero bits */
++  if ((temp = (e->a - 1 + e->c) & 0xFFFF0000L) < e->c)
++    e->c = temp + 0x8000L;
++  else
++    e->c = temp;
++  /* Send remaining bytes to output */
++  e->c <<= e->ct;
++  if (e->c & 0xF8000000L) {
++    /* One final overflow has to be handled */
++    if (e->buffer >= 0) {
++      if (e->zc)
++	do emit_byte(0x00, cinfo);
++	while (--e->zc);
++      emit_byte(e->buffer + 1, cinfo);
++      if (e->buffer + 1 == 0xFF)
++	emit_byte(0x00, cinfo);
++    }
++    e->zc += e->sc;  /* carry-over converts stacked 0xFF bytes to 0x00 */
++    e->sc = 0;
++  } else {
++    if (e->buffer == 0)
++      ++e->zc;
++    else if (e->buffer >= 0) {
++      if (e->zc)
++	do emit_byte(0x00, cinfo);
++	while (--e->zc);
++      emit_byte(e->buffer, cinfo);
++    }
++    if (e->sc) {
++      if (e->zc)
++	do emit_byte(0x00, cinfo);
++	while (--e->zc);
++      do {
++	emit_byte(0xFF, cinfo);
++	emit_byte(0x00, cinfo);
++      } while (--e->sc);
++    }
++  }
++  /* Output final bytes only if they are not 0x00 */
++  if (e->c & 0x7FFF800L) {
++    if (e->zc)  /* output final pending zero bytes */
++      do emit_byte(0x00, cinfo);
++      while (--e->zc);
++    emit_byte((e->c >> 19) & 0xFF, cinfo);
++    if (((e->c >> 19) & 0xFF) == 0xFF)
++      emit_byte(0x00, cinfo);
++    if (e->c & 0x7F800L) {
++      emit_byte((e->c >> 11) & 0xFF, cinfo);
++      if (((e->c >> 11) & 0xFF) == 0xFF)
++	emit_byte(0x00, cinfo);
++    }
++  }
++}
++
++
++/*
++ * The core arithmetic encoding routine (common in JPEG and JBIG).
++ * This needs to go as fast as possible.
++ * Machine-dependent optimization facilities
++ * are not utilized in this portable implementation.
++ * However, this code should be fairly efficient and
++ * may be a good base for further optimizations anyway.
++ *
++ * Parameter 'val' to be encoded may be 0 or 1 (binary decision).
++ *
++ * Note: I've added full "Pacman" termination support to the
++ * byte output routines, which is equivalent to the optional
++ * Discard_final_zeros procedure (Figure D.15) in the spec.
++ * Thus, we always produce the shortest possible output
++ * stream compliant to the spec (no trailing zero bytes,
++ * except for FF stuffing).
++ *
++ * I've also introduced a new scheme for accessing
++ * the probability estimation state machine table,
++ * derived from Markus Kuhn's JBIG implementation.
++ */
++
++LOCAL(void)
++arith_encode (j_compress_ptr cinfo, unsigned char *st, int val)
++{
++  register arith_entropy_ptr e = (arith_entropy_ptr) cinfo->entropy;
++  register unsigned char nl, nm;
++  register INT32 qe, temp;
++  register int sv;
++
++  /* Fetch values from our compact representation of Table D.2:
++   * Qe values and probability estimation state machine
++   */
++  sv = *st;
++  qe = jpeg_aritab[sv & 0x7F];	/* => Qe_Value */
++  nl = qe & 0xFF; qe >>= 8;	/* Next_Index_LPS + Switch_MPS */
++  nm = qe & 0xFF; qe >>= 8;	/* Next_Index_MPS */
++
++  /* Encode & estimation procedures per sections D.1.4 & D.1.5 */
++  e->a -= qe;
++  if (val != (sv >> 7)) {
++    /* Encode the less probable symbol */
++    if (e->a >= qe) {
++      /* If the interval size (qe) for the less probable symbol (LPS)
++       * is larger than the interval size for the MPS, then exchange
++       * the two symbols for coding efficiency, otherwise code the LPS
++       * as usual: */
++      e->c += e->a;
++      e->a = qe;
++    }
++    *st = (sv & 0x80) ^ nl;	/* Estimate_after_LPS */
++  } else {
++    /* Encode the more probable symbol */
++    if (e->a >= 0x8000L)
++      return;  /* A >= 0x8000 -> ready, no renormalization required */
++    if (e->a < qe) {
++      /* If the interval size (qe) for the less probable symbol (LPS)
++       * is larger than the interval size for the MPS, then exchange
++       * the two symbols for coding efficiency: */
++      e->c += e->a;
++      e->a = qe;
++    }
++    *st = (sv & 0x80) ^ nm;	/* Estimate_after_MPS */
++  }
++
++  /* Renormalization & data output per section D.1.6 */
++  do {
++    e->a <<= 1;
++    e->c <<= 1;
++    if (--e->ct == 0) {
++      /* Another byte is ready for output */
++      temp = e->c >> 19;
++      if (temp > 0xFF) {
++	/* Handle overflow over all stacked 0xFF bytes */
++	if (e->buffer >= 0) {
++	  if (e->zc)
++	    do emit_byte(0x00, cinfo);
++	    while (--e->zc);
++	  emit_byte(e->buffer + 1, cinfo);
++	  if (e->buffer + 1 == 0xFF)
++	    emit_byte(0x00, cinfo);
++	}
++	e->zc += e->sc;  /* carry-over converts stacked 0xFF bytes to 0x00 */
++	e->sc = 0;
++	/* Note: The 3 spacer bits in the C register guarantee
++	 * that the new buffer byte can't be 0xFF here
++	 * (see page 160 in the P&M JPEG book). */
++	e->buffer = temp & 0xFF;  /* new output byte, might overflow later */
++      } else if (temp == 0xFF) {
++	++e->sc;  /* stack 0xFF byte (which might overflow later) */
++      } else {
++	/* Output all stacked 0xFF bytes, they will not overflow any more */
++	if (e->buffer == 0)
++	  ++e->zc;
++	else if (e->buffer >= 0) {
++	  if (e->zc)
++	    do emit_byte(0x00, cinfo);
++	    while (--e->zc);
++	  emit_byte(e->buffer, cinfo);
++	}
++	if (e->sc) {
++	  if (e->zc)
++	    do emit_byte(0x00, cinfo);
++	    while (--e->zc);
++	  do {
++	    emit_byte(0xFF, cinfo);
++	    emit_byte(0x00, cinfo);
++	  } while (--e->sc);
++	}
++	e->buffer = temp & 0xFF;  /* new output byte (can still overflow) */
++      }
++      e->c &= 0x7FFFFL;
++      e->ct += 8;
++    }
++  } while (e->a < 0x8000L);
++}
++
++
++/*
++ * Emit a restart marker & resynchronize predictions.
++ */
++
++LOCAL(void)
++emit_restart (j_compress_ptr cinfo, int restart_num)
++{
++  arith_entropy_ptr entropy = (arith_entropy_ptr) cinfo->entropy;
++  int ci;
++  jpeg_component_info * compptr;
++
++  finish_pass(cinfo);
++
++  emit_byte(0xFF, cinfo);
++  emit_byte(JPEG_RST0 + restart_num, cinfo);
++
++  /* Re-initialize statistics areas */
++  for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
++    compptr = cinfo->cur_comp_info[ci];
++    /* DC needs no table for refinement scan */
++    if (cinfo->progressive_mode == 0 || (cinfo->Ss == 0 && cinfo->Ah == 0)) {
++      MEMZERO(entropy->dc_stats[compptr->dc_tbl_no], DC_STAT_BINS);
++      /* Reset DC predictions to 0 */
++      entropy->last_dc_val[ci] = 0;
++      entropy->dc_context[ci] = 0;
++    }
++    /* AC needs no table when not present */
++    if (cinfo->progressive_mode == 0 || cinfo->Se) {
++      MEMZERO(entropy->ac_stats[compptr->ac_tbl_no], AC_STAT_BINS);
++    }
++  }
++
++  /* Reset arithmetic encoding variables */
++  entropy->c = 0;
++  entropy->a = 0x10000L;
++  entropy->sc = 0;
++  entropy->zc = 0;
++  entropy->ct = 11;
++  entropy->buffer = -1;  /* empty */
++}
++
++
++/*
++ * MCU encoding for DC initial scan (either spectral selection,
++ * or first pass of successive approximation).
++ */
++
++METHODDEF(boolean)
++encode_mcu_DC_first (j_compress_ptr cinfo, JBLOCKROW *MCU_data)
++{
++  arith_entropy_ptr entropy = (arith_entropy_ptr) cinfo->entropy;
++  JBLOCKROW block;
++  unsigned char *st;
++  int blkn, ci, tbl;
++  int v, v2, m;
++  ISHIFT_TEMPS
++
++  /* Emit restart marker if needed */
++  if (cinfo->restart_interval) {
++    if (entropy->restarts_to_go == 0) {
++      emit_restart(cinfo, entropy->next_restart_num);
++      entropy->restarts_to_go = cinfo->restart_interval;
++      entropy->next_restart_num++;
++      entropy->next_restart_num &= 7;
++    }
++    entropy->restarts_to_go--;
++  }
++
++  /* Encode the MCU data blocks */
++  for (blkn = 0; blkn < cinfo->blocks_in_MCU; blkn++) {
++    block = MCU_data[blkn];
++    ci = cinfo->MCU_membership[blkn];
++    tbl = cinfo->cur_comp_info[ci]->dc_tbl_no;
++
++    /* Compute the DC value after the required point transform by Al.
++     * This is simply an arithmetic right shift.
++     */
++    m = IRIGHT_SHIFT((int) ((*block)[0]), cinfo->Al);
++
++    /* Sections F.1.4.1 & F.1.4.4.1: Encoding of DC coefficients */
++
++    /* Table F.4: Point to statistics bin S0 for DC coefficient coding */
++    st = entropy->dc_stats[tbl] + entropy->dc_context[ci];
++
++    /* Figure F.4: Encode_DC_DIFF */
++    if ((v = m - entropy->last_dc_val[ci]) == 0) {
++      arith_encode(cinfo, st, 0);
++      entropy->dc_context[ci] = 0;	/* zero diff category */
++    } else {
++      entropy->last_dc_val[ci] = m;
++      arith_encode(cinfo, st, 1);
++      /* Figure F.6: Encoding nonzero value v */
++      /* Figure F.7: Encoding the sign of v */
++      if (v > 0) {
++	arith_encode(cinfo, st + 1, 0);	/* Table F.4: SS = S0 + 1 */
++	st += 2;			/* Table F.4: SP = S0 + 2 */
++	entropy->dc_context[ci] = 4;	/* small positive diff category */
++      } else {
++	v = -v;
++	arith_encode(cinfo, st + 1, 1);	/* Table F.4: SS = S0 + 1 */
++	st += 3;			/* Table F.4: SN = S0 + 3 */
++	entropy->dc_context[ci] = 8;	/* small negative diff category */
++      }
++      /* Figure F.8: Encoding the magnitude category of v */
++      m = 0;
++      if (v -= 1) {
++	arith_encode(cinfo, st, 1);
++	m = 1;
++	v2 = v;
++	st = entropy->dc_stats[tbl] + 20; /* Table F.4: X1 = 20 */
++	while (v2 >>= 1) {
++	  arith_encode(cinfo, st, 1);
++	  m <<= 1;
++	  st += 1;
++	}
++      }
++      arith_encode(cinfo, st, 0);
++      /* Section F.1.4.4.1.2: Establish dc_context conditioning category */
++      if (m < (int) ((1L << cinfo->arith_dc_L[tbl]) >> 1))
++	entropy->dc_context[ci] = 0;	/* zero diff category */
++      else if (m > (int) ((1L << cinfo->arith_dc_U[tbl]) >> 1))
++	entropy->dc_context[ci] += 8;	/* large diff category */
++      /* Figure F.9: Encoding the magnitude bit pattern of v */
++      st += 14;
++      while (m >>= 1)
++	arith_encode(cinfo, st, (m & v) ? 1 : 0);
++    }
++  }
++
++  return TRUE;
++}
++
++
++/*
++ * MCU encoding for AC initial scan (either spectral selection,
++ * or first pass of successive approximation).
++ */
++
++METHODDEF(boolean)
++encode_mcu_AC_first (j_compress_ptr cinfo, JBLOCKROW *MCU_data)
++{
++  arith_entropy_ptr entropy = (arith_entropy_ptr) cinfo->entropy;
++  JBLOCKROW block;
++  unsigned char *st;
++  int tbl, k, ke;
++  int v, v2, m;
++
++  /* Emit restart marker if needed */
++  if (cinfo->restart_interval) {
++    if (entropy->restarts_to_go == 0) {
++      emit_restart(cinfo, entropy->next_restart_num);
++      entropy->restarts_to_go = cinfo->restart_interval;
++      entropy->next_restart_num++;
++      entropy->next_restart_num &= 7;
++    }
++    entropy->restarts_to_go--;
++  }
++
++  /* Encode the MCU data block */
++  block = MCU_data[0];
++  tbl = cinfo->cur_comp_info[0]->ac_tbl_no;
++
++  /* Sections F.1.4.2 & F.1.4.4.2: Encoding of AC coefficients */
++
++  /* Establish EOB (end-of-block) index */
++  for (ke = cinfo->Se; ke > 0; ke--)
++    /* We must apply the point transform by Al.  For AC coefficients this
++     * is an integer division with rounding towards 0.  To do this portably
++     * in C, we shift after obtaining the absolute value.
++     */
++    if ((v = (*block)[jpeg_natural_order[ke]]) >= 0) {
++      if (v >>= cinfo->Al) break;
++    } else {
++      v = -v;
++      if (v >>= cinfo->Al) break;
++    }
++
++  /* Figure F.5: Encode_AC_Coefficients */
++  for (k = cinfo->Ss; k <= ke; k++) {
++    st = entropy->ac_stats[tbl] + 3 * (k - 1);
++    arith_encode(cinfo, st, 0);		/* EOB decision */
++    for (;;) {
++      if ((v = (*block)[jpeg_natural_order[k]]) >= 0) {
++	if (v >>= cinfo->Al) {
++	  arith_encode(cinfo, st + 1, 1);
++	  arith_encode(cinfo, entropy->fixed_bin, 0);
++	  break;
++	}
++      } else {
++	v = -v;
++	if (v >>= cinfo->Al) {
++	  arith_encode(cinfo, st + 1, 1);
++	  arith_encode(cinfo, entropy->fixed_bin, 1);
++	  break;
++	}
++      }
++      arith_encode(cinfo, st + 1, 0); st += 3; k++;
++    }
++    st += 2;
++    /* Figure F.8: Encoding the magnitude category of v */
++    m = 0;
++    if (v -= 1) {
++      arith_encode(cinfo, st, 1);
++      m = 1;
++      v2 = v;
++      if (v2 >>= 1) {
++	arith_encode(cinfo, st, 1);
++	m <<= 1;
++	st = entropy->ac_stats[tbl] +
++	     (k <= cinfo->arith_ac_K[tbl] ? 189 : 217);
++	while (v2 >>= 1) {
++	  arith_encode(cinfo, st, 1);
++	  m <<= 1;
++	  st += 1;
++	}
++      }
++    }
++    arith_encode(cinfo, st, 0);
++    /* Figure F.9: Encoding the magnitude bit pattern of v */
++    st += 14;
++    while (m >>= 1)
++      arith_encode(cinfo, st, (m & v) ? 1 : 0);
++  }
++  /* Encode EOB decision only if k <= cinfo->Se */
++  if (k <= cinfo->Se) {
++    st = entropy->ac_stats[tbl] + 3 * (k - 1);
++    arith_encode(cinfo, st, 1);
++  }
++
++  return TRUE;
++}
++
++
++/*
++ * MCU encoding for DC successive approximation refinement scan.
++ */
++
++METHODDEF(boolean)
++encode_mcu_DC_refine (j_compress_ptr cinfo, JBLOCKROW *MCU_data)
++{
++  arith_entropy_ptr entropy = (arith_entropy_ptr) cinfo->entropy;
++  unsigned char *st;
++  int Al, blkn;
++
++  /* Emit restart marker if needed */
++  if (cinfo->restart_interval) {
++    if (entropy->restarts_to_go == 0) {
++      emit_restart(cinfo, entropy->next_restart_num);
++      entropy->restarts_to_go = cinfo->restart_interval;
++      entropy->next_restart_num++;
++      entropy->next_restart_num &= 7;
++    }
++    entropy->restarts_to_go--;
++  }
++
++  st = entropy->fixed_bin;	/* use fixed probability estimation */
++  Al = cinfo->Al;
++
++  /* Encode the MCU data blocks */
++  for (blkn = 0; blkn < cinfo->blocks_in_MCU; blkn++) {
++    /* We simply emit the Al'th bit of the DC coefficient value. */
++    arith_encode(cinfo, st, (MCU_data[blkn][0][0] >> Al) & 1);
++  }
++
++  return TRUE;
++}
++
++
++/*
++ * MCU encoding for AC successive approximation refinement scan.
++ */
++
++METHODDEF(boolean)
++encode_mcu_AC_refine (j_compress_ptr cinfo, JBLOCKROW *MCU_data)
++{
++  arith_entropy_ptr entropy = (arith_entropy_ptr) cinfo->entropy;
++  JBLOCKROW block;
++  unsigned char *st;
++  int tbl, k, ke, kex;
++  int v;
++
++  /* Emit restart marker if needed */
++  if (cinfo->restart_interval) {
++    if (entropy->restarts_to_go == 0) {
++      emit_restart(cinfo, entropy->next_restart_num);
++      entropy->restarts_to_go = cinfo->restart_interval;
++      entropy->next_restart_num++;
++      entropy->next_restart_num &= 7;
++    }
++    entropy->restarts_to_go--;
++  }
++
++  /* Encode the MCU data block */
++  block = MCU_data[0];
++  tbl = cinfo->cur_comp_info[0]->ac_tbl_no;
++
++  /* Section G.1.3.3: Encoding of AC coefficients */
++
++  /* Establish EOB (end-of-block) index */
++  for (ke = cinfo->Se; ke > 0; ke--)
++    /* We must apply the point transform by Al.  For AC coefficients this
++     * is an integer division with rounding towards 0.  To do this portably
++     * in C, we shift after obtaining the absolute value.
++     */
++    if ((v = (*block)[jpeg_natural_order[ke]]) >= 0) {
++      if (v >>= cinfo->Al) break;
++    } else {
++      v = -v;
++      if (v >>= cinfo->Al) break;
++    }
++
++  /* Establish EOBx (previous stage end-of-block) index */
++  for (kex = ke; kex > 0; kex--)
++    if ((v = (*block)[jpeg_natural_order[kex]]) >= 0) {
++      if (v >>= cinfo->Ah) break;
++    } else {
++      v = -v;
++      if (v >>= cinfo->Ah) break;
++    }
++
++  /* Figure G.10: Encode_AC_Coefficients_SA */
++  for (k = cinfo->Ss; k <= ke; k++) {
++    st = entropy->ac_stats[tbl] + 3 * (k - 1);
++    if (k > kex)
++      arith_encode(cinfo, st, 0);	/* EOB decision */
++    for (;;) {
++      if ((v = (*block)[jpeg_natural_order[k]]) >= 0) {
++	if (v >>= cinfo->Al) {
++	  if (v >> 1)			/* previously nonzero coef */
++	    arith_encode(cinfo, st + 2, (v & 1));
++	  else {			/* newly nonzero coef */
++	    arith_encode(cinfo, st + 1, 1);
++	    arith_encode(cinfo, entropy->fixed_bin, 0);
++	  }
++	  break;
++	}
++      } else {
++	v = -v;
++	if (v >>= cinfo->Al) {
++	  if (v >> 1)			/* previously nonzero coef */
++	    arith_encode(cinfo, st + 2, (v & 1));
++	  else {			/* newly nonzero coef */
++	    arith_encode(cinfo, st + 1, 1);
++	    arith_encode(cinfo, entropy->fixed_bin, 1);
++	  }
++	  break;
++	}
++      }
++      arith_encode(cinfo, st + 1, 0); st += 3; k++;
++    }
++  }
++  /* Encode EOB decision only if k <= cinfo->Se */
++  if (k <= cinfo->Se) {
++    st = entropy->ac_stats[tbl] + 3 * (k - 1);
++    arith_encode(cinfo, st, 1);
++  }
++
++  return TRUE;
++}
++
++
++/*
++ * Encode and output one MCU's worth of arithmetic-compressed coefficients.
++ */
++
++METHODDEF(boolean)
++encode_mcu (j_compress_ptr cinfo, JBLOCKROW *MCU_data)
++{
++  arith_entropy_ptr entropy = (arith_entropy_ptr) cinfo->entropy;
++  jpeg_component_info * compptr;
++  JBLOCKROW block;
++  unsigned char *st;
++  int blkn, ci, tbl, k, ke;
++  int v, v2, m;
++
++  /* Emit restart marker if needed */
++  if (cinfo->restart_interval) {
++    if (entropy->restarts_to_go == 0) {
++      emit_restart(cinfo, entropy->next_restart_num);
++      entropy->restarts_to_go = cinfo->restart_interval;
++      entropy->next_restart_num++;
++      entropy->next_restart_num &= 7;
++    }
++    entropy->restarts_to_go--;
++  }
++
++  /* Encode the MCU data blocks */
++  for (blkn = 0; blkn < cinfo->blocks_in_MCU; blkn++) {
++    block = MCU_data[blkn];
++    ci = cinfo->MCU_membership[blkn];
++    compptr = cinfo->cur_comp_info[ci];
++
++    /* Sections F.1.4.1 & F.1.4.4.1: Encoding of DC coefficients */
++
++    tbl = compptr->dc_tbl_no;
++
++    /* Table F.4: Point to statistics bin S0 for DC coefficient coding */
++    st = entropy->dc_stats[tbl] + entropy->dc_context[ci];
++
++    /* Figure F.4: Encode_DC_DIFF */
++    if ((v = (*block)[0] - entropy->last_dc_val[ci]) == 0) {
++      arith_encode(cinfo, st, 0);
++      entropy->dc_context[ci] = 0;	/* zero diff category */
++    } else {
++      entropy->last_dc_val[ci] = (*block)[0];
++      arith_encode(cinfo, st, 1);
++      /* Figure F.6: Encoding nonzero value v */
++      /* Figure F.7: Encoding the sign of v */
++      if (v > 0) {
++	arith_encode(cinfo, st + 1, 0);	/* Table F.4: SS = S0 + 1 */
++	st += 2;			/* Table F.4: SP = S0 + 2 */
++	entropy->dc_context[ci] = 4;	/* small positive diff category */
++      } else {
++	v = -v;
++	arith_encode(cinfo, st + 1, 1);	/* Table F.4: SS = S0 + 1 */
++	st += 3;			/* Table F.4: SN = S0 + 3 */
++	entropy->dc_context[ci] = 8;	/* small negative diff category */
++      }
++      /* Figure F.8: Encoding the magnitude category of v */
++      m = 0;
++      if (v -= 1) {
++	arith_encode(cinfo, st, 1);
++	m = 1;
++	v2 = v;
++	st = entropy->dc_stats[tbl] + 20; /* Table F.4: X1 = 20 */
++	while (v2 >>= 1) {
++	  arith_encode(cinfo, st, 1);
++	  m <<= 1;
++	  st += 1;
++	}
++      }
++      arith_encode(cinfo, st, 0);
++      /* Section F.1.4.4.1.2: Establish dc_context conditioning category */
++      if (m < (int) ((1L << cinfo->arith_dc_L[tbl]) >> 1))
++	entropy->dc_context[ci] = 0;	/* zero diff category */
++      else if (m > (int) ((1L << cinfo->arith_dc_U[tbl]) >> 1))
++	entropy->dc_context[ci] += 8;	/* large diff category */
++      /* Figure F.9: Encoding the magnitude bit pattern of v */
++      st += 14;
++      while (m >>= 1)
++	arith_encode(cinfo, st, (m & v) ? 1 : 0);
++    }
++
++    /* Sections F.1.4.2 & F.1.4.4.2: Encoding of AC coefficients */
++
++    tbl = compptr->ac_tbl_no;
++
++    /* Establish EOB (end-of-block) index */
++    for (ke = DCTSIZE2 - 1; ke > 0; ke--)
++      if ((*block)[jpeg_natural_order[ke]]) break;
++
++    /* Figure F.5: Encode_AC_Coefficients */
++    for (k = 1; k <= ke; k++) {
++      st = entropy->ac_stats[tbl] + 3 * (k - 1);
++      arith_encode(cinfo, st, 0);	/* EOB decision */
++      while ((v = (*block)[jpeg_natural_order[k]]) == 0) {
++	arith_encode(cinfo, st + 1, 0); st += 3; k++;
++      }
++      arith_encode(cinfo, st + 1, 1);
++      /* Figure F.6: Encoding nonzero value v */
++      /* Figure F.7: Encoding the sign of v */
++      if (v > 0) {
++	arith_encode(cinfo, entropy->fixed_bin, 0);
++      } else {
++	v = -v;
++	arith_encode(cinfo, entropy->fixed_bin, 1);
++      }
++      st += 2;
++      /* Figure F.8: Encoding the magnitude category of v */
++      m = 0;
++      if (v -= 1) {
++	arith_encode(cinfo, st, 1);
++	m = 1;
++	v2 = v;
++	if (v2 >>= 1) {
++	  arith_encode(cinfo, st, 1);
++	  m <<= 1;
++	  st = entropy->ac_stats[tbl] +
++	       (k <= cinfo->arith_ac_K[tbl] ? 189 : 217);
++	  while (v2 >>= 1) {
++	    arith_encode(cinfo, st, 1);
++	    m <<= 1;
++	    st += 1;
++	  }
++	}
++      }
++      arith_encode(cinfo, st, 0);
++      /* Figure F.9: Encoding the magnitude bit pattern of v */
++      st += 14;
++      while (m >>= 1)
++	arith_encode(cinfo, st, (m & v) ? 1 : 0);
++    }
++    /* Encode EOB decision only if k <= DCTSIZE2 - 1 */
++    if (k <= DCTSIZE2 - 1) {
++      st = entropy->ac_stats[tbl] + 3 * (k - 1);
++      arith_encode(cinfo, st, 1);
++    }
++  }
++
++  return TRUE;
++}
++
++
++/*
++ * Initialize for an arithmetic-compressed scan.
++ */
++
++METHODDEF(void)
++start_pass (j_compress_ptr cinfo, boolean gather_statistics)
++{
++  arith_entropy_ptr entropy = (arith_entropy_ptr) cinfo->entropy;
++  int ci, tbl;
++  jpeg_component_info * compptr;
++
++  if (gather_statistics)
++    /* The master control logic must never request this mode:
++     * arithmetic coding is fully adaptive and needs no separate
++     * statistics-gathering pass.
++     */
++    ERREXIT(cinfo, JERR_NOT_COMPILED);
++
++  /* We assume jcmaster.c already validated the progressive scan parameters. */
++
++  /* Select execution routines */
++  if (cinfo->progressive_mode) {
++    if (cinfo->Ah == 0) {
++      if (cinfo->Ss == 0)
++	entropy->pub.encode_mcu = encode_mcu_DC_first;
++      else
++	entropy->pub.encode_mcu = encode_mcu_AC_first;
++    } else {
++      if (cinfo->Ss == 0)
++	entropy->pub.encode_mcu = encode_mcu_DC_refine;
++      else
++	entropy->pub.encode_mcu = encode_mcu_AC_refine;
++    }
++  } else
++    entropy->pub.encode_mcu = encode_mcu;
++
++  /* Allocate & initialize requested statistics areas */
++  for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
++    compptr = cinfo->cur_comp_info[ci];
++    /* DC needs no table for refinement scan */
++    if (cinfo->progressive_mode == 0 || (cinfo->Ss == 0 && cinfo->Ah == 0)) {
++      tbl = compptr->dc_tbl_no;
++      if (tbl < 0 || tbl >= NUM_ARITH_TBLS)
++	ERREXIT1(cinfo, JERR_NO_ARITH_TABLE, tbl);
++      if (entropy->dc_stats[tbl] == NULL)
++	entropy->dc_stats[tbl] = (unsigned char *) (*cinfo->mem->alloc_small)
++	  ((j_common_ptr) cinfo, JPOOL_IMAGE, DC_STAT_BINS);
++      MEMZERO(entropy->dc_stats[tbl], DC_STAT_BINS);
++      /* Initialize DC predictions to 0 */
++      entropy->last_dc_val[ci] = 0;
++      entropy->dc_context[ci] = 0;
++    }
++    /* AC needs no table when not present */
++    if (cinfo->progressive_mode == 0 || cinfo->Se) {
++      tbl = compptr->ac_tbl_no;
++      if (tbl < 0 || tbl >= NUM_ARITH_TBLS)
++	ERREXIT1(cinfo, JERR_NO_ARITH_TABLE, tbl);
++      if (entropy->ac_stats[tbl] == NULL)
++	entropy->ac_stats[tbl] = (unsigned char *) (*cinfo->mem->alloc_small)
++	  ((j_common_ptr) cinfo, JPOOL_IMAGE, AC_STAT_BINS);
++      MEMZERO(entropy->ac_stats[tbl], AC_STAT_BINS);
++#ifdef CALCULATE_SPECTRAL_CONDITIONING
++      if (cinfo->progressive_mode)
++	/* Section G.1.3.2: Set appropriate arithmetic conditioning value Kx */
++	cinfo->arith_ac_K[tbl] = cinfo->Ss + ((8 + cinfo->Se - cinfo->Ss) >> 4);
++#endif
++    }
++  }
++
++  /* Initialize arithmetic encoding variables */
++  entropy->c = 0;
++  entropy->a = 0x10000L;
++  entropy->sc = 0;
++  entropy->zc = 0;
++  entropy->ct = 11;
++  entropy->buffer = -1;  /* empty */
++
++  /* Initialize restart stuff */
++  entropy->restarts_to_go = cinfo->restart_interval;
++  entropy->next_restart_num = 0;
++}
++
++
++/*
++ * Module initialization routine for arithmetic entropy encoding.
++ */
++
++GLOBAL(void)
++jinit_arith_encoder (j_compress_ptr cinfo)
++{
++  arith_entropy_ptr entropy;
++  int i;
++
++  entropy = (arith_entropy_ptr)
++    (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
++				SIZEOF(arith_entropy_encoder));
++  cinfo->entropy = (struct jpeg_entropy_encoder *) entropy;
++  entropy->pub.start_pass = start_pass;
++  entropy->pub.finish_pass = finish_pass;
++
++  /* Mark tables unallocated */
++  for (i = 0; i < NUM_ARITH_TBLS; i++) {
++    entropy->dc_stats[i] = NULL;
++    entropy->ac_stats[i] = NULL;
++  }
++
++  /* Initialize index for fixed probability estimation */
++  entropy->fixed_bin[0] = 113;
++}
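Note on the jcarith.c hunk above: the EOB loops in encode_mcu_AC_first and encode_mcu_AC_refine apply the point transform by Al as a division that rounds towards zero, which they get by shifting the absolute value rather than the signed value. The standalone sketch below illustrates that idea only. The helper names point_transform and find_eob are hypothetical and do not exist in libjpeg-turbo, and the coefficient array is assumed to be in zigzag order already, so no jpeg_natural_order lookup is needed.

```c
#include <assert.h>

/* Hypothetical helper (not part of libjpeg-turbo): apply the point
 * transform by Al with rounding towards zero.  Negative inputs are
 * negated before the shift, so the shift only ever sees a
 * non-negative value, as in the EOB loops of the diff.
 */
static int point_transform(int v, int Al)
{
  if (v >= 0)
    return v >> Al;
  return -((-v) >> Al);
}

/* Hypothetical helper: return the index of the last AC coefficient in
 * 1..Se that is still nonzero after the point transform, or 0 if every
 * AC coefficient in that band becomes zero.  coef[] is assumed to be in
 * zigzag order (an assumption of this sketch).
 */
static int find_eob(const int *coef, int Se, int Al)
{
  int ke;

  for (ke = Se; ke > 0; ke--)
    if (point_transform(coef[ke], Al) != 0)
      break;
  return ke;
}
```

With Al = 1, a coefficient of -1 or 1 transforms to 0 and no longer counts as nonzero, so the EOB index can move towards the start of the band compared with Al = 0.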
 === added file 'src/libjpeg-turbo/jccoefct.c'
 --- src/libjpeg-turbo/jccoefct.c	1970-01-01 00:00:00 +0000
 +++ src/libjpeg-turbo/jccoefct.c	2012-06-27 16:20:24 +0000
@@ -0,0 +1,449 @@
++/*
++ * jccoefct.c
++ *
++ * Copyright (C) 1994-1997, Thomas G. Lane.
++ * This file is part of the Independent JPEG Group's software.
++ * For conditions of distribution and use, see the accompanying README file.
++ *
++ * This file contains the coefficient buffer controller for compression.
++ * This controller is the top level of the JPEG compressor proper.
++ * The coefficient buffer lies between forward-DCT and entropy encoding steps.
++ */
++
++#define JPEG_INTERNALS
++#include "jinclude.h"
++#include "jpeglib.h"
++
++
++/* We use a full-image coefficient buffer when doing Huffman optimization,
++ * and also for writing multiple-scan JPEG files.  In all cases, the DCT
++ * step is run during the first pass, and subsequent passes need only read
++ * the buffered coefficients.
++ */
++#ifdef ENTROPY_OPT_SUPPORTED
++#define FULL_COEF_BUFFER_SUPPORTED
++#else
++#ifdef C_MULTISCAN_FILES_SUPPORTED
++#define FULL_COEF_BUFFER_SUPPORTED
++#endif
++#endif
++
++
++/* Private buffer controller object */
++
++typedef struct {
++  struct jpeg_c_coef_controller pub; /* public fields */
++
++  JDIMENSION iMCU_row_num;	/* iMCU row # within image */
++  JDIMENSION mcu_ctr;		/* counts MCUs processed in current row */
++  int MCU_vert_offset;		/* counts MCU rows within iMCU row */
++  int MCU_rows_per_iMCU_row;	/* number of such rows needed */
++
++  /* For single-pass compression, it's sufficient to buffer just one MCU
++   * (although this may prove a bit slow in practice).  We allocate a
++   * workspace of C_MAX_BLOCKS_IN_MCU coefficient blocks, and reuse it for each
++   * MCU constructed and sent.  (On 80x86, the workspace is FAR even though
++   * it's not really very big; this is to keep the module interfaces unchanged
++   * when a large coefficient buffer is necessary.)
++   * In multi-pass modes, this array points to the current MCU's blocks
++   * within the virtual arrays.
++   */
++  JBLOCKROW MCU_buffer[C_MAX_BLOCKS_IN_MCU];
++
++  /* In multi-pass modes, we need a virtual block array for each component. */
++  jvirt_barray_ptr whole_image[MAX_COMPONENTS];
++} my_coef_controller;
++
++typedef my_coef_controller * my_coef_ptr;
++
++
++/* Forward declarations */
++METHODDEF(boolean) compress_data
++    JPP((j_compress_ptr cinfo, JSAMPIMAGE input_buf));
++#ifdef FULL_COEF_BUFFER_SUPPORTED
++METHODDEF(boolean) compress_first_pass
++    JPP((j_compress_ptr cinfo, JSAMPIMAGE input_buf));
++METHODDEF(boolean) compress_output
++    JPP((j_compress_ptr cinfo, JSAMPIMAGE input_buf));
++#endif
++
++
++LOCAL(void)
++start_iMCU_row (j_compress_ptr cinfo)
++/* Reset within-iMCU-row counters for a new row */
++{
++  my_coef_ptr coef = (my_coef_ptr) cinfo->coef;
++
++  /* In an interleaved scan, an MCU row is the same as an iMCU row.
++   * In a noninterleaved scan, an iMCU row has v_samp_factor MCU rows.
++   * But at the bottom of the image, process only what's left.
++   */
++  if (cinfo->comps_in_scan > 1) {
++    coef->MCU_rows_per_iMCU_row = 1;
++  } else {
++    if (coef->iMCU_row_num < (cinfo->total_iMCU_rows-1))
++      coef->MCU_rows_per_iMCU_row = cinfo->cur_comp_info[0]->v_samp_factor;
++    else
++      coef->MCU_rows_per_iMCU_row = cinfo->cur_comp_info[0]->last_row_height;
++  }
++
++  coef->mcu_ctr = 0;
++  coef->MCU_vert_offset = 0;
++}
++
++
++/*
++ * Initialize for a processing pass.
++ */
++
++METHODDEF(void)
++start_pass_coef (j_compress_ptr cinfo, J_BUF_MODE pass_mode)
++{
++  my_coef_ptr coef = (my_coef_ptr) cinfo->coef;
++
++  coef->iMCU_row_num = 0;
++  start_iMCU_row(cinfo);
++
++  switch (pass_mode) {
++  case JBUF_PASS_THRU:
++    if (coef->whole_image[0] != NULL)
++      ERREXIT(cinfo, JERR_BAD_BUFFER_MODE);
++    coef->pub.compress_data = compress_data;
++    break;
++#ifdef FULL_COEF_BUFFER_SUPPORTED
++  case JBUF_SAVE_AND_PASS:
++    if (coef->whole_image[0] == NULL)
++      ERREXIT(cinfo, JERR_BAD_BUFFER_MODE);
++    coef->pub.compress_data = compress_first_pass;
++    break;
++  case JBUF_CRANK_DEST:
++    if (coef->whole_image[0] == NULL)
++      ERREXIT(cinfo, JERR_BAD_BUFFER_MODE);
++    coef->pub.compress_data = compress_output;
++    break;
++#endif
++  default:
++    ERREXIT(cinfo, JERR_BAD_BUFFER_MODE);
++    break;
++  }
++}
++
++
++/*
++ * Process some data in the single-pass case.
++ * We process the equivalent of one fully interleaved MCU row ("iMCU" row)
++ * per call, i.e., v_samp_factor block rows for each component in the image.
++ * Returns TRUE if the iMCU row is completed, FALSE if suspended.
++ *
++ * NB: input_buf contains a plane for each component in image,
++ * which we index according to the component's SOF position.
++ */
++
++METHODDEF(boolean)
++compress_data (j_compress_ptr cinfo, JSAMPIMAGE input_buf)
++{
++  my_coef_ptr coef = (my_coef_ptr) cinfo->coef;
++  JDIMENSION MCU_col_num;	/* index of current MCU within row */
++  JDIMENSION last_MCU_col = cinfo->MCUs_per_row - 1;
++  JDIMENSION last_iMCU_row = cinfo->total_iMCU_rows - 1;
++  int blkn, bi, ci, yindex, yoffset, blockcnt;
++  JDIMENSION ypos, xpos;
++  jpeg_component_info *compptr;
++
++  /* Loop to write as much as one whole iMCU row */
++  for (yoffset = coef->MCU_vert_offset; yoffset < coef->MCU_rows_per_iMCU_row;
++       yoffset++) {
++    for (MCU_col_num = coef->mcu_ctr; MCU_col_num <= last_MCU_col;
++	 MCU_col_num++) {
++      /* Determine where data comes from in input_buf and do the DCT thing.
++       * Each call on forward_DCT processes a horizontal row of DCT blocks
++       * as wide as an MCU; we rely on having allocated the MCU_buffer[] blocks
++       * sequentially.  Dummy blocks at the right or bottom edge are filled in
++       * specially.  The data in them does not matter for image reconstruction,
++       * so we fill them with values that will encode to the smallest amount of
++       * data, viz: all zeroes in the AC entries, DC entries equal to previous
++       * block's DC value.  (Thanks to Thomas Kinsman for this idea.)
++       */
++      blkn = 0;
++      for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
++	compptr = cinfo->cur_comp_info[ci];
++	blockcnt = (MCU_col_num < last_MCU_col) ? compptr->MCU_width
++						: compptr->last_col_width;
++	xpos = MCU_col_num * compptr->MCU_sample_width;
++	ypos = yoffset * DCTSIZE; /* ypos == (yoffset+yindex) * DCTSIZE */
++	for (yindex = 0; yindex < compptr->MCU_height; yindex++) {
++	  if (coef->iMCU_row_num < last_iMCU_row ||
++	      yoffset+yindex < compptr->last_row_height) {
++	    (*cinfo->fdct->forward_DCT) (cinfo, compptr,
++					 input_buf[compptr->component_index],
++					 coef->MCU_buffer[blkn],
++					 ypos, xpos, (JDIMENSION) blockcnt);
++	    if (blockcnt < compptr->MCU_width) {
++	      /* Create some dummy blocks at the right edge of the image. */
++	      jzero_far((void FAR *) coef->MCU_buffer[blkn + blockcnt],
++			(compptr->MCU_width - blockcnt) * SIZEOF(JBLOCK));
++	      for (bi = blockcnt; bi < compptr->MCU_width; bi++) {
++		coef->MCU_buffer[blkn+bi][0][0] = coef->MCU_buffer[blkn+bi-1][0][0];
++	      }
++	    }
++	  } else {
++	    /* Create a row of dummy blocks at the bottom of the image. */
++	    jzero_far((void FAR *) coef->MCU_buffer[blkn],
++		      compptr->MCU_width * SIZEOF(JBLOCK));
++	    for (bi = 0; bi < compptr->MCU_width; bi++) {
++	      coef->MCU_buffer[blkn+bi][0][0] = coef->MCU_buffer[blkn-1][0][0];
++	    }
++	  }
++	  blkn += compptr->MCU_width;
++	  ypos += DCTSIZE;
++	}
++      }
++      /* Try to write the MCU.  In event of a suspension failure, we will
++       * re-DCT the MCU on restart (a bit inefficient, could be fixed...)
++       */
++      if (! (*cinfo->entropy->encode_mcu) (cinfo, coef->MCU_buffer)) {
++	/* Suspension forced; update state counters and exit */
++	coef->MCU_vert_offset = yoffset;
++	coef->mcu_ctr = MCU_col_num;
++	return FALSE;
++      }
++    }
++    /* Completed an MCU row, but perhaps not an iMCU row */
++    coef->mcu_ctr = 0;
++  }
++  /* Completed the iMCU row, advance counters for next one */
++  coef->iMCU_row_num++;
++  start_iMCU_row(cinfo);
++  return TRUE;
++}
++
++
++#ifdef FULL_COEF_BUFFER_SUPPORTED
++
++/*
++ * Process some data in the first pass of a multi-pass case.
++ * We process the equivalent of one fully interleaved MCU row ("iMCU" row)
++ * per call, i.e., v_samp_factor block rows for each component in the image.
++ * This amount of data is read from the source buffer, DCT'd and quantized,
++ * and saved into the virtual arrays.  We also generate suitable dummy blocks
++ * as needed at the right and lower edges.  (The dummy blocks are constructed
++ * in the virtual arrays, which have been padded appropriately.)  This makes
++ * it possible for subsequent passes not to worry about real vs. dummy blocks.
++ *
++ * We must also emit the data to the entropy encoder.  This is conveniently
++ * done by calling compress_output() after we've loaded the current strip
++ * of the virtual arrays.
++ *
++ * NB: input_buf contains a plane for each component in image.  All
++ * components are DCT'd and loaded into the virtual arrays in this pass.
++ * However, it may be that only a subset of the components are emitted to
++ * the entropy encoder during this first pass; be careful about looking
++ * at the scan-dependent variables (MCU dimensions, etc).
++ */
++
++METHODDEF(boolean)
++compress_first_pass (j_compress_ptr cinfo, JSAMPIMAGE input_buf)
++{
++  my_coef_ptr coef = (my_coef_ptr) cinfo->coef;
++  JDIMENSION last_iMCU_row = cinfo->total_iMCU_rows - 1;
++  JDIMENSION blocks_across, MCUs_across, MCUindex;
++  int bi, ci, h_samp_factor, block_row, block_rows, ndummy;
++  JCOEF lastDC;
++  jpeg_component_info *compptr;
++  JBLOCKARRAY buffer;
++  JBLOCKROW thisblockrow, lastblockrow;
++
++  for (ci = 0, compptr = cinfo->comp_info; ci < cinfo->num_components;
++       ci++, compptr++) {
++    /* Align the virtual buffer for this component. */
++    buffer = (*cinfo->mem->access_virt_barray)
++      ((j_common_ptr) cinfo, coef->whole_image[ci],
++       coef->iMCU_row_num * compptr->v_samp_factor,
++       (JDIMENSION) compptr->v_samp_factor, TRUE);
++    /* Count non-dummy DCT block rows in this iMCU row. */
++    if (coef->iMCU_row_num < last_iMCU_row)
++      block_rows = compptr->v_samp_factor;
++    else {
++      /* NB: can't use last_row_height here, since may not be set! */
++      block_rows = (int) (compptr->height_in_blocks % compptr->v_samp_factor);
++      if (block_rows == 0) block_rows = compptr->v_samp_factor;
++    }
++    blocks_across = compptr->width_in_blocks;
++    h_samp_factor = compptr->h_samp_factor;
++    /* Count number of dummy blocks to be added at the right margin. */
++    ndummy = (int) (blocks_across % h_samp_factor);
++    if (ndummy > 0)
++      ndummy = h_samp_factor - ndummy;
++    /* Perform DCT for all non-dummy blocks in this iMCU row.  Each call
++     * on forward_DCT processes a complete horizontal row of DCT blocks.
++     */
++    for (block_row = 0; block_row < block_rows; block_row++) {
++      thisblockrow = buffer[block_row];
++      (*cinfo->fdct->forward_DCT) (cinfo, compptr,
++				   input_buf[ci], thisblockrow,
++				   (JDIMENSION) (block_row * DCTSIZE),
++				   (JDIMENSION) 0, blocks_across);
++      if (ndummy > 0) {
++	/* Create dummy blocks at the right edge of the image. */
++	thisblockrow += blocks_across; /* => first dummy block */
++	jzero_far((void FAR *) thisblockrow, ndummy * SIZEOF(JBLOCK));
++	lastDC = thisblockrow[-1][0];
++	for (bi = 0; bi < ndummy; bi++) {
++	  thisblockrow[bi][0] = lastDC;
++	}
++      }
++    }
++    /* If at end of image, create dummy block rows as needed.
++     * The tricky part here is that within each MCU, we want the DC values
++     * of the dummy blocks to match the last real block's DC value.
++     * This squeezes a few more bytes out of the resulting file...
++     */
++    if (coef->iMCU_row_num == last_iMCU_row) {
++      blocks_across += ndummy;	/* include lower right corner */
++      MCUs_across = blocks_across / h_samp_factor;
++      for (block_row = block_rows; block_row < compptr->v_samp_factor;
++	   block_row++) {
++	thisblockrow = buffer[block_row];
++	lastblockrow = buffer[block_row-1];
++	jzero_far((void FAR *) thisblockrow,
++		  (size_t) (blocks_across * SIZEOF(JBLOCK)));
++	for (MCUindex = 0; MCUindex < MCUs_across; MCUindex++) {
++	  lastDC = lastblockrow[h_samp_factor-1][0];
++	  for (bi = 0; bi < h_samp_factor; bi++) {
++	    thisblockrow[bi][0] = lastDC;
++	  }
++	  thisblockrow += h_samp_factor; /* advance to next MCU in row */
++	  lastblockrow += h_samp_factor;
++	}
++      }
++    }
++  }
++  /* NB: compress_output will increment iMCU_row_num if successful.
++   * A suspension return will result in redoing all the work above next time.
++   */
++
++  /* Emit data to the entropy encoder, sharing code with subsequent passes */
++  return compress_output(cinfo, input_buf);
++}
++
++
++/*
++ * Process some data in subsequent passes of a multi-pass case.
++ * We process the equivalent of one fully interleaved MCU row ("iMCU" row)
++ * per call, i.e., v_samp_factor block rows for each component in the scan.
++ * The data is obtained from the virtual arrays and fed to the entropy coder.
++ * Returns TRUE if the iMCU row is completed, FALSE if suspended.
++ *
++ * NB: input_buf is ignored; it is likely to be a NULL pointer.
++ */
++
++METHODDEF(boolean)
++compress_output (j_compress_ptr cinfo, JSAMPIMAGE input_buf)
++{
++  my_coef_ptr coef = (my_coef_ptr) cinfo->coef;
++  JDIMENSION MCU_col_num;	/* index of current MCU within row */
++  int blkn, ci, xindex, yindex, yoffset;
++  JDIMENSION start_col;
++  JBLOCKARRAY buffer[MAX_COMPS_IN_SCAN];
++  JBLOCKROW buffer_ptr;
++  jpeg_component_info *compptr;
++
++  /* Align the virtual buffers for the components used in this scan.
++   * NB: during first pass, this is safe only because the buffers will
++   * already be aligned properly, so jmemmgr.c won't need to do any I/O.
++   */
++  for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
++    compptr = cinfo->cur_comp_info[ci];
++    buffer[ci] = (*cinfo->mem->access_virt_barray)
++      ((j_common_ptr) cinfo, coef->whole_image[compptr->component_index],
++       coef->iMCU_row_num * compptr->v_samp_factor,
++       (JDIMENSION) compptr->v_samp_factor, FALSE);
++  }
++
++  /* Loop to process one whole iMCU row */
++  for (yoffset = coef->MCU_vert_offset; yoffset < coef->MCU_rows_per_iMCU_row;
++       yoffset++) {
++    for (MCU_col_num = coef->mcu_ctr; MCU_col_num < cinfo->MCUs_per_row;
++	 MCU_col_num++) {
++      /* Construct list of pointers to DCT blocks belonging to this MCU */
++      blkn = 0;			/* index of current DCT block within MCU */
++      for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
++	compptr = cinfo->cur_comp_info[ci];
++	start_col = MCU_col_num * compptr->MCU_width;
++	for (yindex = 0; yindex < compptr->MCU_height; yindex++) {
++	  buffer_ptr = buffer[ci][yindex+yoffset] + start_col;
++	  for (xindex = 0; xindex < compptr->MCU_width; xindex++) {
++	    coef->MCU_buffer[blkn++] = buffer_ptr++;
++	  }
++	}
++      }
++      /* Try to write the MCU. */
++      if (! (*cinfo->entropy->encode_mcu) (cinfo, coef->MCU_buffer)) {
++	/* Suspension forced; update state counters and exit */
++	coef->MCU_vert_offset = yoffset;
++	coef->mcu_ctr = MCU_col_num;
++	return FALSE;
++      }
++    }
++    /* Completed an MCU row, but perhaps not an iMCU row */
++    coef->mcu_ctr = 0;
++  }
++  /* Completed the iMCU row, advance counters for next one */
++  coef->iMCU_row_num++;
++  start_iMCU_row(cinfo);
++  return TRUE;
++}
++
++#endif /* FULL_COEF_BUFFER_SUPPORTED */
++
++
++/*
++ * Initialize coefficient buffer controller.
++ */
++
++GLOBAL(void)
++jinit_c_coef_controller (j_compress_ptr cinfo, boolean need_full_buffer)
++{
++  my_coef_ptr coef;
++
++  coef = (my_coef_ptr)
++    (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
++				SIZEOF(my_coef_controller));
++  cinfo->coef = (struct jpeg_c_coef_controller *) coef;
++  coef->pub.start_pass = start_pass_coef;
++
++  /* Create the coefficient buffer. */
++  if (need_full_buffer) {
++#ifdef FULL_COEF_BUFFER_SUPPORTED
++    /* Allocate a full-image virtual array for each component, */
++    /* padded to a multiple of samp_factor DCT blocks in each direction. */
++    int ci;
++    jpeg_component_info *compptr;
++
++    for (ci = 0, compptr = cinfo->comp_info; ci < cinfo->num_components;
++	 ci++, compptr++) {
++      coef->whole_image[ci] = (*cinfo->mem->request_virt_barray)
++	((j_common_ptr) cinfo, JPOOL_IMAGE, FALSE,
++	 (JDIMENSION) jround_up((long) compptr->width_in_blocks,
++				(long) compptr->h_samp_factor),
++	 (JDIMENSION) jround_up((long) compptr->height_in_blocks,
++				(long) compptr->v_samp_factor),
++	 (JDIMENSION) compptr->v_samp_factor);
++    }
++#else
++    ERREXIT(cinfo, JERR_BAD_BUFFER_MODE);
++#endif
++  } else {
++    /* We only need a single-MCU buffer. */
++    JBLOCKROW buffer;
++    int i;
++
++    buffer = (JBLOCKROW)
++      (*cinfo->mem->alloc_large) ((j_common_ptr) cinfo, JPOOL_IMAGE,
++				  C_MAX_BLOCKS_IN_MCU * SIZEOF(JBLOCK));
++    for (i = 0; i < C_MAX_BLOCKS_IN_MCU; i++) {
++      coef->MCU_buffer[i] = buffer + i;
++    }
++    coef->whole_image[0] = NULL; /* flag for no virtual arrays */
++  }
++}
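Note on the jccoefct.c hunk above: compress_first_pass pads each component to a whole number of MCUs by adding dummy blocks at the right margin and dummy block rows at the bottom, and jinit_c_coef_controller sizes the virtual arrays to the same padded dimensions. The sketch below reproduces only that arithmetic so it can be checked in isolation. All three function names are hypothetical; round_up_to_multiple stands in for the role jround_up plays in the diff and is not claimed to match its implementation.

```c
#include <assert.h>

/* Hypothetical helper: number of dummy blocks needed at the right
 * margin so that the block row width is a multiple of h_samp_factor.
 */
static int right_margin_dummies(unsigned int width_in_blocks,
                                int h_samp_factor)
{
  int ndummy = (int) (width_in_blocks % (unsigned int) h_samp_factor);

  if (ndummy > 0)
    ndummy = h_samp_factor - ndummy;
  return ndummy;
}

/* Hypothetical helper: number of real (non-dummy) block rows in the
 * last iMCU row of a component.
 */
static int last_row_block_rows(unsigned int height_in_blocks,
                               int v_samp_factor)
{
  int block_rows = (int) (height_in_blocks % (unsigned int) v_samp_factor);

  if (block_rows == 0)
    block_rows = v_samp_factor;
  return block_rows;
}

/* Hypothetical stand-in for the rounding used when the virtual arrays
 * are requested: round a up to the next multiple of b (a >= 0, b > 0).
 */
static long round_up_to_multiple(long a, long b)
{
  a += b - 1L;
  return a - (a % b);
}
```

For a component that is 7 blocks wide with h_samp_factor 3, the sketch yields 2 dummy blocks, and 7 + 2 equals the padded width of 9 that the rounding helper returns, which is the invariant the virtual array sizing relies on.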
 === added file 'src/libjpeg-turbo/jccolext.c.inc'
 --- src/libjpeg-turbo/jccolext.c.inc	1970-01-01 00:00:00 +0000
 +++ src/libjpeg-turbo/jccolext.c.inc	2012-06-27 16:20:24 +0000
@@ -0,0 +1,114 @@
++/*
++ * jccolext.c.inc
++ *
++ * Copyright (C) 1991-1996, Thomas G. Lane.
++ * Copyright (C) 2009-2011, D. R. Commander.
++ * This file is part of the Independent JPEG Group's software.
++ * For conditions of distribution and use, see the accompanying README file.
++ *
++ * This file contains input colorspace conversion routines.
++ */
++
++
++/* This file is included by jccolor.c */
++
++
++/*
++ * Convert some rows of samples to the JPEG colorspace.
++ *
++ * Note that we change from the application's interleaved-pixel format
++ * to our internal noninterleaved, one-plane-per-component format.
++ * The input buffer is therefore RGB_PIXELSIZE times as wide as the output buffer.
++ *
++ * A starting row offset is provided only for the output buffer.  The caller
++ * can easily adjust the passed input_buf value to accommodate any row
++ * offset required on that side.
++ */
++
++INLINE
++LOCAL(void)
++rgb_ycc_convert_internal (j_compress_ptr cinfo,
++                          JSAMPARRAY input_buf, JSAMPIMAGE output_buf,
++                          JDIMENSION output_row, int num_rows)
++{
++  my_cconvert_ptr cconvert = (my_cconvert_ptr) cinfo->cconvert;
++  register int r, g, b;
++  register INT32 * ctab = cconvert->rgb_ycc_tab;
++  register JSAMPROW inptr;
++  register JSAMPROW outptr0, outptr1, outptr2;
++  register JDIMENSION col;
++  JDIMENSION num_cols = cinfo->image_width;
++
++  while (--num_rows >= 0) {
++    inptr = *input_buf++;
++    outptr0 = output_buf[0][output_row];
++    outptr1 = output_buf[1][output_row];
++    outptr2 = output_buf[2][output_row];
++    output_row++;
++    for (col = 0; col < num_cols; col++) {
++      r = GETJSAMPLE(inptr[RGB_RED]);
++      g = GETJSAMPLE(inptr[RGB_GREEN]);
++      b = GETJSAMPLE(inptr[RGB_BLUE]);
++      inptr += RGB_PIXELSIZE;
++      /* If the inputs are 0..MAXJSAMPLE, the outputs of these equations
++       * must be too; we do not need an explicit range-limiting operation.
++       * Hence the value being shifted is never negative, and we don't
++       * need the general RIGHT_SHIFT macro.
++       */
++      /* Y */
++      outptr0[col] = (JSAMPLE)
++		((ctab[r+R_Y_OFF] + ctab[g+G_Y_OFF] + ctab[b+B_Y_OFF])
++		 >> SCALEBITS);
++      /* Cb */
++      outptr1[col] = (JSAMPLE)
++		((ctab[r+R_CB_OFF] + ctab[g+G_CB_OFF] + ctab[b+B_CB_OFF])
++		 >> SCALEBITS);
++      /* Cr */
++      outptr2[col] = (JSAMPLE)
++		((ctab[r+R_CR_OFF] + ctab[g+G_CR_OFF] + ctab[b+B_CR_OFF])
++		 >> SCALEBITS);
++    }
++  }
++}
++
++
++/**************** Cases other than RGB -> YCbCr **************/
++
++
++/*
++ * Convert some rows of samples to the JPEG colorspace.
++ * This version handles RGB->grayscale conversion, which is the same
++ * as the RGB->Y portion of RGB->YCbCr.
++ * We assume rgb_ycc_start has been called (we only use the Y tables).
++ */
++
++INLINE
++LOCAL(void)
++rgb_gray_convert_internal (j_compress_ptr cinfo,
++                           JSAMPARRAY input_buf, JSAMPIMAGE output_buf,
++                           JDIMENSION output_row, int num_rows)
++{
++  my_cconvert_ptr cconvert = (my_cconvert_ptr) cinfo->cconvert;
++  register int r, g, b;
++  register INT32 * ctab = cconvert->rgb_ycc_tab;
++  register JSAMPROW inptr;
++  register JSAMPROW outptr;
++  register JDIMENSION col;
++  JDIMENSION num_cols = cinfo->image_width;
++
++  while (--num_rows >= 0) {
++    inptr = *input_buf++;
++    outptr = output_buf[0][output_row];
++    output_row++;
++    for (col = 0; col < num_cols; col++) {
++      r = GETJSAMPLE(inptr[RGB_RED]);
++      g = GETJSAMPLE(inptr[RGB_GREEN]);
++      b = GETJSAMPLE(inptr[RGB_BLUE]);
++      inptr += RGB_PIXELSIZE;
++      /* Y */
++      outptr[col] = (JSAMPLE)
++		((ctab[r+R_Y_OFF] + ctab[g+G_Y_OFF] + ctab[b+B_Y_OFF])
++		 >> SCALEBITS);
++    }
++  }
++}
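Review note on the template approach: jccolext.c.inc has no fixed pixel layout of its own. jccolor.c includes it once per layout after redefining RGB_RED, RGB_GREEN, RGB_BLUE, RGB_PIXELSIZE and the function names. The sketch below is not the library's code. It swaps the repeated #include for a function-generating macro so that it fits in one file, and it multiplies by the fixed-point luma weights directly instead of using the lookup table. The names DEFINE_GRAY_CONVERTER, rgb_gray_row, bgrx_gray_row and xrgb_gray_row are hypothetical, and the byte offsets passed for each layout are assumptions about what the EXT_* macros expand to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generate one RGB->gray row converter per pixel layout.
 * R_IDX/G_IDX/B_IDX are byte offsets inside a pixel and PIXELSIZE is
 * the number of bytes per pixel, mirroring RGB_RED, RGB_GREEN,
 * RGB_BLUE and RGB_PIXELSIZE in the template file.
 * The weights are 0.299, 0.587 and 0.114 scaled by 2^16, and 32768 is
 * the rounding term (0.5 scaled by 2^16).
 */
#define DEFINE_GRAY_CONVERTER(name, R_IDX, G_IDX, B_IDX, PIXELSIZE)      \
  static void name(const uint8_t *in, uint8_t *out, size_t num_cols)     \
  {                                                                      \
    assert(in != NULL && out != NULL);                                   \
    for (size_t col = 0; col < num_cols; col++) {                        \
      uint32_t r = in[R_IDX], g = in[G_IDX], b = in[B_IDX];              \
      in += (PIXELSIZE);                                                 \
      out[col] = (uint8_t)                                               \
        ((19595u * r + 38470u * g + 7471u * b + 32768u) >> 16);          \
    }                                                                    \
  }

/* One instantiation per layout, as jccolor.c does with its includes. */
DEFINE_GRAY_CONVERTER(rgb_gray_row,  0, 1, 2, 3)   /* R G B   */
DEFINE_GRAY_CONVERTER(bgrx_gray_row, 2, 1, 0, 4)   /* B G R X */
DEFINE_GRAY_CONVERTER(xrgb_gray_row, 1, 2, 3, 4)   /* X R G B */
```

Each generated function has the offsets baked in as compile-time constants, which is the same effect the repeated inclusion achieves: the inner loop carries no per-pixel layout branching.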
 === added file 'src/libjpeg-turbo/jccolor.c'
 --- src/libjpeg-turbo/jccolor.c	1970-01-01 00:00:00 +0000
 +++ src/libjpeg-turbo/jccolor.c	2012-06-27 16:20:24 +0000
@@ -0,0 +1,599 @@
++/*
++ * jccolor.c
++ *
++ * Copyright (C) 1991-1996, Thomas G. Lane.
++ * Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
++ * Copyright (C) 2009-2011, D. R. Commander.
++ * This file is part of the Independent JPEG Group's software.
++ * For conditions of distribution and use, see the accompanying README file.
++ *
++ * This file contains input colorspace conversion routines.
++ */
++
++#define JPEG_INTERNALS
++#include "jinclude.h"
++#include "jpeglib.h"
++#include "jsimd.h"
++#include "config.h"
++
++
++/* Private subobject */
++
++typedef struct {
++  struct jpeg_color_converter pub; /* public fields */
++
++  /* Private state for RGB->YCC conversion */
++  INT32 * rgb_ycc_tab;		/* => table for RGB to YCbCr conversion */
++} my_color_converter;
++
++typedef my_color_converter * my_cconvert_ptr;
++
++
++/**************** RGB -> YCbCr conversion: most common case **************/
++
++/*
++ * YCbCr is defined per CCIR 601-1, except that Cb and Cr are
++ * normalized to the range 0..MAXJSAMPLE rather than -0.5 .. 0.5.
++ * The conversion equations to be implemented are therefore
++ *	Y  =  0.29900 * R + 0.58700 * G + 0.11400 * B
++ *	Cb = -0.16874 * R - 0.33126 * G + 0.50000 * B  + CENTERJSAMPLE
++ *	Cr =  0.50000 * R - 0.41869 * G - 0.08131 * B  + CENTERJSAMPLE
++ * (These numbers are derived from TIFF 6.0 section 21, dated 3-June-92.)
++ * Note: older versions of the IJG code used a zero offset of MAXJSAMPLE/2,
++ * rather than CENTERJSAMPLE, for Cb and Cr.  This gave equal positive and
++ * negative swings for Cb/Cr, but meant that grayscale values (Cb=Cr=0)
++ * were not represented exactly.  Now we sacrifice exact representation of
++ * maximum red and maximum blue in order to get exact grayscales.
++ *
++ * To avoid floating-point arithmetic, we represent the fractional constants
++ * as integers scaled up by 2^16 (about 4 digits precision); we have to divide
++ * the products by 2^16, with appropriate rounding, to get the correct answer.
++ *
++ * For even more speed, we avoid doing any multiplications in the inner loop
++ * by precalculating the constants times R,G,B for all possible values.
++ * For 8-bit JSAMPLEs this is very reasonable (only 256 entries per table);
++ * for 12-bit samples it is still acceptable.  It's not very reasonable for
++ * 16-bit samples, but if you want lossless storage you shouldn't be changing
++ * colorspace anyway.
++ * The CENTERJSAMPLE offsets and the rounding fudge-factor of 0.5 are included
++ * in the tables to save adding them separately in the inner loop.
++ */
++
++#define SCALEBITS	16	/* speediest right-shift on some machines */
++#define CBCR_OFFSET	((INT32) CENTERJSAMPLE << SCALEBITS)
++#define ONE_HALF	((INT32) 1 << (SCALEBITS-1))
++#define FIX(x)		((INT32) ((x) * (1L<<SCALEBITS) + 0.5))
++
++/* We allocate one big table and divide it up into eight parts, instead of
++ * doing eight alloc_small requests.  This lets us use a single table base
++ * address, which can be held in a register in the inner loops on many
++ * machines (more than can hold all eight addresses, anyway).
++ */
++
++#define R_Y_OFF		0			/* offset to R => Y section */
++#define G_Y_OFF		(1*(MAXJSAMPLE+1))	/* offset to G => Y section */
++#define B_Y_OFF		(2*(MAXJSAMPLE+1))	/* etc. */
++#define R_CB_OFF	(3*(MAXJSAMPLE+1))
++#define G_CB_OFF	(4*(MAXJSAMPLE+1))
++#define B_CB_OFF	(5*(MAXJSAMPLE+1))
++#define R_CR_OFF	B_CB_OFF		/* B=>Cb, R=>Cr are the same */
++#define G_CR_OFF	(6*(MAXJSAMPLE+1))
++#define B_CR_OFF	(7*(MAXJSAMPLE+1))
++#define TABLE_SIZE	(8*(MAXJSAMPLE+1))
++
++
++/* Include inline routines for colorspace extensions */
++
++#include "jccolext.c.inc"
++#undef RGB_RED
++#undef RGB_GREEN
++#undef RGB_BLUE
++#undef RGB_PIXELSIZE
++
++#define RGB_RED EXT_RGB_RED
++#define RGB_GREEN EXT_RGB_GREEN
++#define RGB_BLUE EXT_RGB_BLUE
++#define RGB_PIXELSIZE EXT_RGB_PIXELSIZE
++#define rgb_ycc_convert_internal extrgb_ycc_convert_internal
++#define rgb_gray_convert_internal extrgb_gray_convert_internal
++#include "jccolext.c.inc"
++#undef RGB_RED
++#undef RGB_GREEN
++#undef RGB_BLUE
++#undef RGB_PIXELSIZE
++#undef rgb_ycc_convert_internal
++#undef rgb_gray_convert_internal
++
++#define RGB_RED EXT_RGBX_RED
++#define RGB_GREEN EXT_RGBX_GREEN
++#define RGB_BLUE EXT_RGBX_BLUE
++#define RGB_PIXELSIZE EXT_RGBX_PIXELSIZE
++#define rgb_ycc_convert_internal extrgbx_ycc_convert_internal
++#define rgb_gray_convert_internal extrgbx_gray_convert_internal
++#include "jccolext.c.inc"
++#undef RGB_RED
++#undef RGB_GREEN
++#undef RGB_BLUE
++#undef RGB_PIXELSIZE
++#undef rgb_ycc_convert_internal
++#undef rgb_gray_convert_internal
++
++#define RGB_RED EXT_BGR_RED
++#define RGB_GREEN EXT_BGR_GREEN
++#define RGB_BLUE EXT_BGR_BLUE
++#define RGB_PIXELSIZE EXT_BGR_PIXELSIZE
++#define rgb_ycc_convert_internal extbgr_ycc_convert_internal
++#define rgb_gray_convert_internal extbgr_gray_convert_internal
++#include "jccolext.c.inc"
++#undef RGB_RED
++#undef RGB_GREEN
++#undef RGB_BLUE
++#undef RGB_PIXELSIZE
++#undef rgb_ycc_convert_internal
++#undef rgb_gray_convert_internal
++
++#define RGB_RED EXT_BGRX_RED
++#define RGB_GREEN EXT_BGRX_GREEN
++#define RGB_BLUE EXT_BGRX_BLUE
++#define RGB_PIXELSIZE EXT_BGRX_PIXELSIZE
++#define rgb_ycc_convert_internal extbgrx_ycc_convert_internal
++#define rgb_gray_convert_internal extbgrx_gray_convert_internal
++#include "jccolext.c.inc"
++#undef RGB_RED
++#undef RGB_GREEN
++#undef RGB_BLUE
++#undef RGB_PIXELSIZE
++#undef rgb_ycc_convert_internal
++#undef rgb_gray_convert_internal
++
++#define RGB_RED EXT_XBGR_RED
++#define RGB_GREEN EXT_XBGR_GREEN
++#define RGB_BLUE EXT_XBGR_BLUE
++#define RGB_PIXELSIZE EXT_XBGR_PIXELSIZE
++#define rgb_ycc_convert_internal extxbgr_ycc_convert_internal
++#define rgb_gray_convert_internal extxbgr_gray_convert_internal
++#include "jccolext.c.inc"
++#undef RGB_RED
++#undef RGB_GREEN
++#undef RGB_BLUE
++#undef RGB_PIXELSIZE
++#undef rgb_ycc_convert_internal
++#undef rgb_gray_convert_internal
++
++#define RGB_RED EXT_XRGB_RED
++#define RGB_GREEN EXT_XRGB_GREEN
++#define RGB_BLUE EXT_XRGB_BLUE
++#define RGB_PIXELSIZE EXT_XRGB_PIXELSIZE
++#define rgb_ycc_convert_internal extxrgb_ycc_convert_internal
++#define rgb_gray_convert_internal extxrgb_gray_convert_internal
++#include "jccolext.c.inc"
++#undef RGB_RED
++#undef RGB_GREEN
++#undef RGB_BLUE
++#undef RGB_PIXELSIZE
++#undef rgb_ycc_convert_internal
++#undef rgb_gray_convert_internal
++
++
++/*
++ * Initialize for RGB->YCC colorspace conversion.
++ */
++
++METHODDEF(void)
++rgb_ycc_start (j_compress_ptr cinfo)
++{
++  my_cconvert_ptr cconvert = (my_cconvert_ptr) cinfo->cconvert;
++  INT32 * rgb_ycc_tab;
++  INT32 i;
++
++  /* Allocate and fill in the conversion tables. */
++  cconvert->rgb_ycc_tab = rgb_ycc_tab = (INT32 *)
++    (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
++				(TABLE_SIZE * SIZEOF(INT32)));
++
++  for (i = 0; i <= MAXJSAMPLE; i++) {
++    rgb_ycc_tab[i+R_Y_OFF] = FIX(0.29900) * i;
++    rgb_ycc_tab[i+G_Y_OFF] = FIX(0.58700) * i;
++    rgb_ycc_tab[i+B_Y_OFF] = FIX(0.11400) * i     + ONE_HALF;
++    rgb_ycc_tab[i+R_CB_OFF] = (-FIX(0.16874)) * i;
++    rgb_ycc_tab[i+G_CB_OFF] = (-FIX(0.33126)) * i;
++    /* We use a rounding fudge-factor of 0.5-epsilon for Cb and Cr.
++     * This ensures that the maximum output will round to MAXJSAMPLE
++     * not MAXJSAMPLE+1, and thus that we don't have to range-limit.
++     */
++    rgb_ycc_tab[i+B_CB_OFF] = FIX(0.50000) * i    + CBCR_OFFSET + ONE_HALF-1;
++/*  B=>Cb and R=>Cr tables are the same
++    rgb_ycc_tab[i+R_CR_OFF] = FIX(0.50000) * i    + CBCR_OFFSET + ONE_HALF-1;
++*/
++    rgb_ycc_tab[i+G_CR_OFF] = (-FIX(0.41869)) * i;
++    rgb_ycc_tab[i+B_CR_OFF] = (-FIX(0.08131)) * i;
++  }
++}
++
++
++/*
++ * Convert some rows of samples to the JPEG colorspace.
++ */
++
++METHODDEF(void)
++rgb_ycc_convert (j_compress_ptr cinfo,
++		 JSAMPARRAY input_buf, JSAMPIMAGE output_buf,
++		 JDIMENSION output_row, int num_rows)
++{
++  switch (cinfo->in_color_space) {
++    case JCS_EXT_RGB:
++      extrgb_ycc_convert_internal(cinfo, input_buf, output_buf, output_row,
++                                  num_rows);
++      break;
++    case JCS_EXT_RGBX:
++    case JCS_EXT_RGBA:
++      extrgbx_ycc_convert_internal(cinfo, input_buf, output_buf, output_row,
++                                   num_rows);
++      break;
++    case JCS_EXT_BGR:
++      extbgr_ycc_convert_internal(cinfo, input_buf, output_buf, output_row,
++                                  num_rows);
++      break;
++    case JCS_EXT_BGRX:
++    case JCS_EXT_BGRA:
++      extbgrx_ycc_convert_internal(cinfo, input_buf, output_buf, output_row,
++                                   num_rows);
++      break;
++    case JCS_EXT_XBGR:
++    case JCS_EXT_ABGR:
++      extxbgr_ycc_convert_internal(cinfo, input_buf, output_buf, output_row,
++                                   num_rows);
++      break;
++    case JCS_EXT_XRGB:
++    case JCS_EXT_ARGB:
++      extxrgb_ycc_convert_internal(cinfo, input_buf, output_buf, output_row,
++                                   num_rows);
++      break;
++    default:
++      rgb_ycc_convert_internal(cinfo, input_buf, output_buf, output_row,
++                               num_rows);
++      break;
++  }
++}
++
++
++/**************** Cases other than RGB -> YCbCr **************/
++
++
++/*
++ * Convert some rows of samples to the JPEG colorspace.
++ */
++
++METHODDEF(void)
++rgb_gray_convert (j_compress_ptr cinfo,
++		  JSAMPARRAY input_buf, JSAMPIMAGE output_buf,
++		  JDIMENSION output_row, int num_rows)
++{
++  switch (cinfo->in_color_space) {
++    case JCS_EXT_RGB:
++      extrgb_gray_convert_internal(cinfo, input_buf, output_buf, output_row,
++                                   num_rows);
++      break;
++    case JCS_EXT_RGBX:
++    case JCS_EXT_RGBA:
++      extrgbx_gray_convert_internal(cinfo, input_buf, output_buf, output_row,
++                                    num_rows);
++      break;
++    case JCS_EXT_BGR:
++      extbgr_gray_convert_internal(cinfo, input_buf, output_buf, output_row,
++                                   num_rows);
++      break;
++    case JCS_EXT_BGRX:
++    case JCS_EXT_BGRA:
++      extbgrx_gray_convert_internal(cinfo, input_buf, output_buf, output_row,
++                                    num_rows);
++      break;
++    case JCS_EXT_XBGR:
++    case JCS_EXT_ABGR:
++      extxbgr_gray_convert_internal(cinfo, input_buf, output_buf, output_row,
++                                    num_rows);
++      break;
++    case JCS_EXT_XRGB:
++    case JCS_EXT_ARGB:
++      extxrgb_gray_convert_internal(cinfo, input_buf, output_buf, output_row,
++                                    num_rows);
++      break;
++    default:
++      rgb_gray_convert_internal(cinfo, input_buf, output_buf, output_row,
++                                num_rows);
++      break;
++  }
++}
++
++
++/*
++ * Convert some rows of samples to the JPEG colorspace.
++ * This version handles Adobe-style CMYK->YCCK conversion,
++ * where we convert R=1-C, G=1-M, and B=1-Y to YCbCr using the same
++ * conversion as above, while passing K (black) unchanged.
++ * We assume rgb_ycc_start has been called.
++ */
++
++METHODDEF(void)
++cmyk_ycck_convert (j_compress_ptr cinfo,
++		   JSAMPARRAY input_buf, JSAMPIMAGE output_buf,
++		   JDIMENSION output_row, int num_rows)
++{
++  my_cconvert_ptr cconvert = (my_cconvert_ptr) cinfo->cconvert;
++  register int r, g, b;
++  register INT32 * ctab = cconvert->rgb_ycc_tab;
++  register JSAMPROW inptr;
++  register JSAMPROW outptr0, outptr1, outptr2, outptr3;
++  register JDIMENSION col;
++  JDIMENSION num_cols = cinfo->image_width;
++
++  while (--num_rows >= 0) {
++    inptr = *input_buf++;
++    outptr0 = output_buf[0][output_row];
++    outptr1 = output_buf[1][output_row];
++    outptr2 = output_buf[2][output_row];
++    outptr3 = output_buf[3][output_row];
++    output_row++;
++    for (col = 0; col < num_cols; col++) {
++      r = MAXJSAMPLE - GETJSAMPLE(inptr[0]);
++      g = MAXJSAMPLE - GETJSAMPLE(inptr[1]);
++      b = MAXJSAMPLE - GETJSAMPLE(inptr[2]);
++      /* K passes through as-is */
++      outptr3[col] = inptr[3];	/* don't need GETJSAMPLE here */
++      inptr += 4;
++      /* If the inputs are 0..MAXJSAMPLE, the outputs of these equations
++       * must be too; we do not need an explicit range-limiting operation.
++       * Hence the value being shifted is never negative, and we don't
++       * need the general RIGHT_SHIFT macro.
++       */
++      /* Y */
++      outptr0[col] = (JSAMPLE)
++		((ctab[r+R_Y_OFF] + ctab[g+G_Y_OFF] + ctab[b+B_Y_OFF])
++		 >> SCALEBITS);
++      /* Cb */
++      outptr1[col] = (JSAMPLE)
++		((ctab[r+R_CB_OFF] + ctab[g+G_CB_OFF] + ctab[b+B_CB_OFF])
++		 >> SCALEBITS);
++      /* Cr */
++      outptr2[col] = (JSAMPLE)
++		((ctab[r+R_CR_OFF] + ctab[g+G_CR_OFF] + ctab[b+B_CR_OFF])
++		 >> SCALEBITS);
++    }
++  }
++}
++
++
++/*
++ * Convert some rows of samples to the JPEG colorspace.
++ * This version handles grayscale output with no conversion.
++ * The source can be either plain grayscale or YCbCr (since Y == gray).
++ */
++
++METHODDEF(void)
++grayscale_convert (j_compress_ptr cinfo,
++		   JSAMPARRAY input_buf, JSAMPIMAGE output_buf,
++		   JDIMENSION output_row, int num_rows)
++{
++  register JSAMPROW inptr;
++  register JSAMPROW outptr;
++  register JDIMENSION col;
++  JDIMENSION num_cols = cinfo->image_width;
++  int instride = cinfo->input_components;
++
++  while (--num_rows >= 0) {
++    inptr = *input_buf++;
++    outptr = output_buf[0][output_row];
++    output_row++;
++    for (col = 0; col < num_cols; col++) {
++      outptr[col] = inptr[0];	/* don't need GETJSAMPLE() here */
++      inptr += instride;
++    }
++  }
++}
++
++
++/*
++ * Convert some rows of samples to the JPEG colorspace.
++ * This version handles multi-component colorspaces without conversion.
++ * We assume input_components == num_components.
++ */
++
++METHODDEF(void)
++null_convert (j_compress_ptr cinfo,
++	      JSAMPARRAY input_buf, JSAMPIMAGE output_buf,
++	      JDIMENSION output_row, int num_rows)
++{
++  register JSAMPROW inptr;
++  register JSAMPROW outptr;
++  register JDIMENSION col;
++  register int ci;
++  int nc = cinfo->num_components;
++  JDIMENSION num_cols = cinfo->image_width;
++
++  while (--num_rows >= 0) {
++    /* It seems fastest to make a separate pass for each component. */
++    for (ci = 0; ci < nc; ci++) {
++      inptr = *input_buf;
++      outptr = output_buf[ci][output_row];
++      for (col = 0; col < num_cols; col++) {
++	outptr[col] = inptr[ci]; /* don't need GETJSAMPLE() here */
++	inptr += nc;
++      }
++    }
++    input_buf++;
++    output_row++;
++  }
++}
++
++
++/*
++ * Empty method for start_pass.
++ */
++
++METHODDEF(void)
++null_method (j_compress_ptr cinfo)
++{
++  /* no work needed */
++}
++
++
++/*
++ * Module initialization routine for input colorspace conversion.
++ */
++
++GLOBAL(void)
++jinit_color_converter (j_compress_ptr cinfo)
++{
++  my_cconvert_ptr cconvert;
++
++  cconvert = (my_cconvert_ptr)
++    (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
++				SIZEOF(my_color_converter));
++  cinfo->cconvert = (struct jpeg_color_converter *) cconvert;
++  /* set start_pass to null method until we find out differently */
++  cconvert->pub.start_pass = null_method;
++
++  /* Make sure input_components agrees with in_color_space */
++  switch (cinfo->in_color_space) {
++  case JCS_GRAYSCALE:
++    if (cinfo->input_components != 1)
++      ERREXIT(cinfo, JERR_BAD_IN_COLORSPACE);
++    break;
++
++  case JCS_RGB:
++  case JCS_EXT_RGB:
++  case JCS_EXT_RGBX:
++  case JCS_EXT_BGR:
++  case JCS_EXT_BGRX:
++  case JCS_EXT_XBGR:
++  case JCS_EXT_XRGB:
++  case JCS_EXT_RGBA:
++  case JCS_EXT_BGRA:
++  case JCS_EXT_ABGR:
++  case JCS_EXT_ARGB:
++    if (cinfo->input_components != rgb_pixelsize[cinfo->in_color_space])
++      ERREXIT(cinfo, JERR_BAD_IN_COLORSPACE);
++    break;
++
++  case JCS_YCbCr:
++    if (cinfo->input_components != 3)
++      ERREXIT(cinfo, JERR_BAD_IN_COLORSPACE);
++    break;
++
++  case JCS_CMYK:
++  case JCS_YCCK:
++    if (cinfo->input_components != 4)
++      ERREXIT(cinfo, JERR_BAD_IN_COLORSPACE);
++    break;
++
++  default:			/* JCS_UNKNOWN can be anything */
++    if (cinfo->input_components < 1)
++      ERREXIT(cinfo, JERR_BAD_IN_COLORSPACE);
++    break;
++  }
++
++  /* Check num_components, set conversion method based on requested space */
++  switch (cinfo->jpeg_color_space) {
++  case JCS_GRAYSCALE:
++    if (cinfo->num_components != 1)
++      ERREXIT(cinfo, JERR_BAD_J_COLORSPACE);
++    if (cinfo->in_color_space == JCS_GRAYSCALE)
++      cconvert->pub.color_convert = grayscale_convert;
++    else if (cinfo->in_color_space == JCS_RGB ||
++             cinfo->in_color_space == JCS_EXT_RGB ||
++             cinfo->in_color_space == JCS_EXT_RGBX ||
++             cinfo->in_color_space == JCS_EXT_BGR ||
++             cinfo->in_color_space == JCS_EXT_BGRX ||
++             cinfo->in_color_space == JCS_EXT_XBGR ||
++             cinfo->in_color_space == JCS_EXT_XRGB ||
++             cinfo->in_color_space == JCS_EXT_RGBA ||
++             cinfo->in_color_space == JCS_EXT_BGRA ||
++             cinfo->in_color_space == JCS_EXT_ABGR ||
++             cinfo->in_color_space == JCS_EXT_ARGB) {
++      if (jsimd_can_rgb_gray())
++        cconvert->pub.color_convert = jsimd_rgb_gray_convert;
++      else {
++        cconvert->pub.start_pass = rgb_ycc_start;
++        cconvert->pub.color_convert = rgb_gray_convert;
++      }
++    } else if (cinfo->in_color_space == JCS_YCbCr)
++      cconvert->pub.color_convert = grayscale_convert;
++    else
++      ERREXIT(cinfo, JERR_CONVERSION_NOTIMPL);
++    break;
++
++  case JCS_RGB:
++  case JCS_EXT_RGB:
++  case JCS_EXT_RGBX:
++  case JCS_EXT_BGR:
++  case JCS_EXT_BGRX:
++  case JCS_EXT_XBGR:
++  case JCS_EXT_XRGB:
++  case JCS_EXT_RGBA:
++  case JCS_EXT_BGRA:
++  case JCS_EXT_ABGR:
++  case JCS_EXT_ARGB:
++    if (cinfo->num_components != 3)
++      ERREXIT(cinfo, JERR_BAD_J_COLORSPACE);
++    if (cinfo->in_color_space == cinfo->jpeg_color_space &&
++      rgb_pixelsize[cinfo->in_color_space] == 3)
++      cconvert->pub.color_convert = null_convert;
++    else
++      ERREXIT(cinfo, JERR_CONVERSION_NOTIMPL);
++    break;
++
++  case JCS_YCbCr:
++    if (cinfo->num_components != 3)
++      ERREXIT(cinfo, JERR_BAD_J_COLORSPACE);
++    if (cinfo->in_color_space == JCS_RGB ||
++        cinfo->in_color_space == JCS_EXT_RGB ||
++        cinfo->in_color_space == JCS_EXT_RGBX ||
++        cinfo->in_color_space == JCS_EXT_BGR ||
++        cinfo->in_color_space == JCS_EXT_BGRX ||
++        cinfo->in_color_space == JCS_EXT_XBGR ||
++        cinfo->in_color_space == JCS_EXT_XRGB ||
++        cinfo->in_color_space == JCS_EXT_RGBA ||
++        cinfo->in_color_space == JCS_EXT_BGRA ||
++        cinfo->in_color_space == JCS_EXT_ABGR ||
++        cinfo->in_color_space == JCS_EXT_ARGB) {
++      if (jsimd_can_rgb_ycc())
++        cconvert->pub.color_convert = jsimd_rgb_ycc_convert;
++      else {
++        cconvert->pub.start_pass = rgb_ycc_start;
++        cconvert->pub.color_convert = rgb_ycc_convert;
++      }
++    } else if (cinfo->in_color_space == JCS_YCbCr)
++      cconvert->pub.color_convert = null_convert;
++    else
++      ERREXIT(cinfo, JERR_CONVERSION_NOTIMPL);
++    break;
++
++  case JCS_CMYK:
++    if (cinfo->num_components != 4)
++      ERREXIT(cinfo, JERR_BAD_J_COLORSPACE);
++    if (cinfo->in_color_space == JCS_CMYK)
++      cconvert->pub.color_convert = null_convert;
++    else
++      ERREXIT(cinfo, JERR_CONVERSION_NOTIMPL);
++    break;
++
++  case JCS_YCCK:
++    if (cinfo->num_components != 4)
++      ERREXIT(cinfo, JERR_BAD_J_COLORSPACE);
++    if (cinfo->in_color_space == JCS_CMYK) {
++      cconvert->pub.start_pass = rgb_ycc_start;
++      cconvert->pub.color_convert = cmyk_ycck_convert;
++    } else if (cinfo->in_color_space == JCS_YCCK)
++      cconvert->pub.color_convert = null_convert;
++    else
++      ERREXIT(cinfo, JERR_CONVERSION_NOTIMPL);
++    break;
++
++  default:			/* allow null conversion of JCS_UNKNOWN */
++    if (cinfo->jpeg_color_space != cinfo->in_color_space ||
++	cinfo->num_components != cinfo->input_components)
++      ERREXIT(cinfo, JERR_CONVERSION_NOTIMPL);
++    cconvert->pub.color_convert = null_convert;
++    break;
++  }
++}
 === added file 'src/libjpeg-turbo/jcdctmgr.c'
 --- src/libjpeg-turbo/jcdctmgr.c	1970-01-01 00:00:00 +0000
 +++ src/libjpeg-turbo/jcdctmgr.c	2012-06-27 16:20:24 +0000
@@ -0,0 +1,642 @@
++/*
++ * jcdctmgr.c
++ *
++ * Copyright (C) 1994-1996, Thomas G. Lane.
++ * Copyright (C) 1999-2006, MIYASAKA Masaru.
++ * Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
++ * Copyright (C) 2011 D. R. Commander
++ * This file is part of the Independent JPEG Group's software.
++ * For conditions of distribution and use, see the accompanying README file.
++ *
++ * This file contains the forward-DCT management logic.
++ * This code selects a particular DCT implementation to be used,
++ * and it performs related housekeeping chores including coefficient
++ * quantization.
++ */
++
++#define JPEG_INTERNALS
++#include "jinclude.h"
++#include "jpeglib.h"
++#include "jdct.h"		/* Private declarations for DCT subsystem */
++#include "jsimddct.h"
++
++
++/* Private subobject for this module */
++
++typedef JMETHOD(void, forward_DCT_method_ptr, (DCTELEM * data));
++typedef JMETHOD(void, float_DCT_method_ptr, (FAST_FLOAT * data));
++
++typedef JMETHOD(void, convsamp_method_ptr,
++                (JSAMPARRAY sample_data, JDIMENSION start_col,
++                 DCTELEM * workspace));
++typedef JMETHOD(void, float_convsamp_method_ptr,
++                (JSAMPARRAY sample_data, JDIMENSION start_col,
++                 FAST_FLOAT *workspace));
++
++typedef JMETHOD(void, quantize_method_ptr,
++                (JCOEFPTR coef_block, DCTELEM * divisors,
++                 DCTELEM * workspace));
++typedef JMETHOD(void, float_quantize_method_ptr,
++                (JCOEFPTR coef_block, FAST_FLOAT * divisors,
++                 FAST_FLOAT * workspace));
++
++METHODDEF(void) quantize (JCOEFPTR, DCTELEM *, DCTELEM *);
++
++typedef struct {
++  struct jpeg_forward_dct pub;	/* public fields */
++
++  /* Pointer to the DCT routine actually in use */
++  forward_DCT_method_ptr dct;
++  convsamp_method_ptr convsamp;
++  quantize_method_ptr quantize;
++
++  /* The actual post-DCT divisors --- not identical to the quant table
++   * entries, because of scaling (especially for an unnormalized DCT).
++   * Each table is given in normal array order.
++   */
++  DCTELEM * divisors[NUM_QUANT_TBLS];
++
++  /* work area for FDCT subroutine */
++  DCTELEM * workspace;
++
++#ifdef DCT_FLOAT_SUPPORTED
++  /* Same as above for the floating-point case. */
++  float_DCT_method_ptr float_dct;
++  float_convsamp_method_ptr float_convsamp;
++  float_quantize_method_ptr float_quantize;
++  FAST_FLOAT * float_divisors[NUM_QUANT_TBLS];
++  FAST_FLOAT * float_workspace;
++#endif
++} my_fdct_controller;
++
++typedef my_fdct_controller * my_fdct_ptr;
++
++
++/*
++ * Find the highest bit in an integer through binary search.
++ */
++LOCAL(int)
++flss (UINT16 val)
++{
++  int bit;
++
++  bit = 16;
++
++  if (!val)
++    return 0;
++
++  if (!(val & 0xff00)) {
++    bit -= 8;
++    val <<= 8;
++  }
++  if (!(val & 0xf000)) {
++    bit -= 4;
++    val <<= 4;
++  }
++  if (!(val & 0xc000)) {
++    bit -= 2;
++    val <<= 2;
++  }
++  if (!(val & 0x8000)) {
++    bit -= 1;
++    val <<= 1;
++  }
++
++  return bit;
++}
++
++/*
++ * Compute values to do a division using reciprocal.
++ *
++ * This implementation is based on an algorithm described in
++ *   "How to optimize for the Pentium family of microprocessors"
++ *   (http://www.agner.org/assem/).
++ * More information about the basic algorithm can be found in
++ * the paper "Integer Division Using Reciprocals" by Robert Alverson.
++ *
++ * The basic idea is to replace x/d by x * d^-1. In order to store
++ * d^-1 with enough precision we shift it left a few places. It turns
++ * out that this algorithm gives just enough precision, and also fits
++ * into DCTELEM:
++ *
++ *   b = (the number of significant bits in divisor) - 1
++ *   r = (word size) + b
++ *   f = 2^r / divisor
++ *
++ * f will not be an integer for most cases, so we need to compensate
++ * for the rounding error introduced:
++ *
++ *   no fractional part:
++ *
++ *       result = input >> r
++ *
++ *   fractional part of f < 0.5:
++ *
++ *       round f down to nearest integer
++ *       result = ((input + 1) * f) >> r
++ *
++ *   fractional part of f > 0.5:
++ *
++ *       round f up to nearest integer
++ *       result = (input * f) >> r
++ *
++ * This is the original algorithm that gives truncated results. But we
++ * want properly rounded results, so we replace "input" with
++ * "input + divisor/2".
++ *
++ * In order to allow SIMD implementations we also tweak the values to
++ * allow the same calculation to be made at all times:
++ *
++ *   dctbl[0] = f rounded to nearest integer
++ *   dctbl[1] = divisor / 2 (+ 1 if fractional part of f < 0.5)
++ *   dctbl[2] = 1 << ((word size) * 2 - r)
++ *   dctbl[3] = r - (word size)
++ *
++ * dctbl[2] is for stupid instruction sets where the shift operation
++ * isn't member-wise (e.g. MMX).
++ *
++ * The reason dctbl[2] and dctbl[3] reduce the shift by (word size)
++ * is that most SIMD implementations have a "multiply and store top
++ * half" operation.
++ *
++ * Lastly, we store each of the values in their own table instead
++ * of in a consecutive manner, yet again in order to allow SIMD
++ * routines.
++ */
++LOCAL(int)
++compute_reciprocal (UINT16 divisor, DCTELEM * dtbl)
++{
++  UDCTELEM2 fq, fr;
++  UDCTELEM c;
++  int b, r;
++
++  b = flss(divisor) - 1;
++  r  = sizeof(DCTELEM) * 8 + b;
++
++  fq = ((UDCTELEM2)1 << r) / divisor;
++  fr = ((UDCTELEM2)1 << r) % divisor;
++
++  c = divisor / 2; /* for rounding */
++
++  if (fr == 0) { /* divisor is power of two */
++    /* fq will be one bit too large to fit in DCTELEM, so adjust */
++    fq >>= 1;
++    r--;
++  } else if (fr <= (divisor / 2U)) { /* fractional part is < 0.5 */
++    c++;
++  } else { /* fractional part is > 0.5 */
++    fq++;
++  }
++
++  dtbl[DCTSIZE2 * 0] = (DCTELEM) fq;      /* reciprocal */
++  dtbl[DCTSIZE2 * 1] = (DCTELEM) c;       /* correction + roundfactor */
++  dtbl[DCTSIZE2 * 2] = (DCTELEM) (1 << (sizeof(DCTELEM)*8*2 - r));  /* scale */
++  dtbl[DCTSIZE2 * 3] = (DCTELEM) r - sizeof(DCTELEM)*8; /* shift */
++
++  if(r <= 16) return 0;
++  else return 1;
++}
++
++/*
++ * Initialize for a processing pass.
++ * Verify that all referenced Q-tables are present, and set up
++ * the divisor table for each one.
++ * In the current implementation, DCT of all components is done during
++ * the first pass, even if only some components will be output in the
++ * first scan.  Hence all components should be examined here.
++ */
++
++METHODDEF(void)
++start_pass_fdctmgr (j_compress_ptr cinfo)
++{
++  my_fdct_ptr fdct = (my_fdct_ptr) cinfo->fdct;
++  int ci, qtblno, i;
++  jpeg_component_info *compptr;
++  JQUANT_TBL * qtbl;
++  DCTELEM * dtbl;
++
++  for (ci = 0, compptr = cinfo->comp_info; ci < cinfo->num_components;
++       ci++, compptr++) {
++    qtblno = compptr->quant_tbl_no;
++    /* Make sure specified quantization table is present */
++    if (qtblno < 0 || qtblno >= NUM_QUANT_TBLS ||
++	cinfo->quant_tbl_ptrs[qtblno] == NULL)
++      ERREXIT1(cinfo, JERR_NO_QUANT_TABLE, qtblno);
++    qtbl = cinfo->quant_tbl_ptrs[qtblno];
++    /* Compute divisors for this quant table */
++    /* We may do this more than once for same table, but it's not a big deal */
++    switch (cinfo->dct_method) {
++#ifdef DCT_ISLOW_SUPPORTED
++    case JDCT_ISLOW:
++      /* For LL&M DCT method, divisors are equal to raw quantization
++       * coefficients multiplied by 8 (to counteract scaling).
++       */
++      if (fdct->divisors[qtblno] == NULL) {
++	fdct->divisors[qtblno] = (DCTELEM *)
++	  (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
++				      (DCTSIZE2 * 4) * SIZEOF(DCTELEM));
++      }
++      dtbl = fdct->divisors[qtblno];
++      for (i = 0; i < DCTSIZE2; i++) {
++	if(!compute_reciprocal(qtbl->quantval[i] << 3, &dtbl[i])
++	  && fdct->quantize == jsimd_quantize)
++	  fdct->quantize = quantize;
++      }
++      break;
++#endif
++#ifdef DCT_IFAST_SUPPORTED
++    case JDCT_IFAST:
++      {
++	/* For the AA&N forward DCT method, divisors are equal to quantization
++	 * coefficients scaled by scalefactor[row]*scalefactor[col], where
++	 *   scalefactor[0] = 1
++	 *   scalefactor[k] = cos(k*PI/16) * sqrt(2)    for k=1..7
++	 * We apply a further scale factor of 8.
++	 */
++#define CONST_BITS 14
++	static const INT16 aanscales[DCTSIZE2] = {
++	  /* precomputed values scaled up by 14 bits */
++	  16384, 22725, 21407, 19266, 16384, 12873,  8867,  4520,
++	  22725, 31521, 29692, 26722, 22725, 17855, 12299,  6270,
++	  21407, 29692, 27969, 25172, 21407, 16819, 11585,  5906,
++	  19266, 26722, 25172, 22654, 19266, 15137, 10426,  5315,
++	  16384, 22725, 21407, 19266, 16384, 12873,  8867,  4520,
++	  12873, 17855, 16819, 15137, 12873, 10114,  6967,  3552,
++	   8867, 12299, 11585, 10426,  8867,  6967,  4799,  2446,
++	   4520,  6270,  5906,  5315,  4520,  3552,  2446,  1247
++	};
++	SHIFT_TEMPS
++
++	if (fdct->divisors[qtblno] == NULL) {
++	  fdct->divisors[qtblno] = (DCTELEM *)
++	    (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
++					(DCTSIZE2 * 4) * SIZEOF(DCTELEM));
++	}
++	dtbl = fdct->divisors[qtblno];
++	for (i = 0; i < DCTSIZE2; i++) {
++	  if(!compute_reciprocal(
++	    DESCALE(MULTIPLY16V16((INT32) qtbl->quantval[i],
++				  (INT32) aanscales[i]),
++		    CONST_BITS-3), &dtbl[i])
++	    && fdct->quantize == jsimd_quantize)
++	    fdct->quantize = quantize;
++	}
++      }
++      break;
++#endif
++#ifdef DCT_FLOAT_SUPPORTED
++    case JDCT_FLOAT:
++      {
++	/* For the float AA&N forward DCT method, divisors are equal to quantization
++	 * coefficients scaled by scalefactor[row]*scalefactor[col], where
++	 *   scalefactor[0] = 1
++	 *   scalefactor[k] = cos(k*PI/16) * sqrt(2)    for k=1..7
++	 * We apply a further scale factor of 8.
++	 * What's actually stored is 1/divisor so that the inner loop can
++	 * use a multiplication rather than a division.
++	 */
++	FAST_FLOAT * fdtbl;
++	int row, col;
++	static const double aanscalefactor[DCTSIZE] = {
++	  1.0, 1.387039845, 1.306562965, 1.175875602,
++	  1.0, 0.785694958, 0.541196100, 0.275899379
++	};
++
++	if (fdct->float_divisors[qtblno] == NULL) {
++	  fdct->float_divisors[qtblno] = (FAST_FLOAT *)
++	    (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
++					DCTSIZE2 * SIZEOF(FAST_FLOAT));
++	}
++	fdtbl = fdct->float_divisors[qtblno];
++	i = 0;
++	for (row = 0; row < DCTSIZE; row++) {
++	  for (col = 0; col < DCTSIZE; col++) {
++	    fdtbl[i] = (FAST_FLOAT)
++	      (1.0 / (((double) qtbl->quantval[i] *
++		       aanscalefactor[row] * aanscalefactor[col] * 8.0)));
++	    i++;
++	  }
++	}
++      }
++      break;
++#endif
++    default:
++      ERREXIT(cinfo, JERR_NOT_COMPILED);
++      break;
++    }
++  }
++}
++
++
++/*
++ * Load data into workspace, applying unsigned->signed conversion.
++ */
++
++METHODDEF(void)
++convsamp (JSAMPARRAY sample_data, JDIMENSION start_col, DCTELEM * workspace)
++{
++  register DCTELEM *workspaceptr;
++  register JSAMPROW elemptr;
++  register int elemr;
++
++  workspaceptr = workspace;
++  for (elemr = 0; elemr < DCTSIZE; elemr++) {
++    elemptr = sample_data[elemr] + start_col;
++
++#if DCTSIZE == 8		/* unroll the inner loop */
++    *workspaceptr++ = GETJSAMPLE(*elemptr++) - CENTERJSAMPLE;
++    *workspaceptr++ = GETJSAMPLE(*elemptr++) - CENTERJSAMPLE;
++    *workspaceptr++ = GETJSAMPLE(*elemptr++) - CENTERJSAMPLE;
++    *workspaceptr++ = GETJSAMPLE(*elemptr++) - CENTERJSAMPLE;
++    *workspaceptr++ = GETJSAMPLE(*elemptr++) - CENTERJSAMPLE;
++    *workspaceptr++ = GETJSAMPLE(*elemptr++) - CENTERJSAMPLE;
++    *workspaceptr++ = GETJSAMPLE(*elemptr++) - CENTERJSAMPLE;
++    *workspaceptr++ = GETJSAMPLE(*elemptr++) - CENTERJSAMPLE;
++#else
++    {
++      register int elemc;
++      for (elemc = DCTSIZE; elemc > 0; elemc--)
++        *workspaceptr++ = GETJSAMPLE(*elemptr++) - CENTERJSAMPLE;
++    }
++#endif
++  }
++}
++
++
++/*
++ * Quantize/descale the coefficients, and store into coef_blocks[].
++ */
++
++METHODDEF(void)
++quantize (JCOEFPTR coef_block, DCTELEM * divisors, DCTELEM * workspace)
++{
++  int i;
++  DCTELEM temp;
++  UDCTELEM recip, corr, shift;
++  UDCTELEM2 product;
++  JCOEFPTR output_ptr = coef_block;
++
++  for (i = 0; i < DCTSIZE2; i++) {
++    temp = workspace[i];
++    recip = divisors[i + DCTSIZE2 * 0];
++    corr =  divisors[i + DCTSIZE2 * 1];
++    shift = divisors[i + DCTSIZE2 * 3];
++
++    if (temp < 0) {
++      temp = -temp;
++      product = (UDCTELEM2)(temp + corr) * recip;
++      product >>= shift + sizeof(DCTELEM)*8;
++      temp = product;
++      temp = -temp;
++    } else {
++      product = (UDCTELEM2)(temp + corr) * recip;
++      product >>= shift + sizeof(DCTELEM)*8;
++      temp = product;
++    }
++
++    output_ptr[i] = (JCOEF) temp;
++  }
++}
++
++
++/*
++ * Perform forward DCT on one or more blocks of a component.
++ *
++ * The input samples are taken from the sample_data[] array starting at
++ * position start_row/start_col, and moving to the right for any additional
++ * blocks. The quantized coefficients are returned in coef_blocks[].
++ */
++
++METHODDEF(void)
++forward_DCT (j_compress_ptr cinfo, jpeg_component_info * compptr,
++	     JSAMPARRAY sample_data, JBLOCKROW coef_blocks,
++	     JDIMENSION start_row, JDIMENSION start_col,
++	     JDIMENSION num_blocks)
++/* This version is used for integer DCT implementations. */
++{
++  /* This routine is heavily used, so it's worth coding it tightly. */
++  my_fdct_ptr fdct = (my_fdct_ptr) cinfo->fdct;
++  DCTELEM * divisors = fdct->divisors[compptr->quant_tbl_no];
++  DCTELEM * workspace;
++  JDIMENSION bi;
++
++  /* Make sure the compiler doesn't look up these every pass */
++  forward_DCT_method_ptr do_dct = fdct->dct;
++  convsamp_method_ptr do_convsamp = fdct->convsamp;
++  quantize_method_ptr do_quantize = fdct->quantize;
++  workspace = fdct->workspace;
++
++  sample_data += start_row;	/* fold in the vertical offset once */
++
++  for (bi = 0; bi < num_blocks; bi++, start_col += DCTSIZE) {
++    /* Load data into workspace, applying unsigned->signed conversion */
++    (*do_convsamp) (sample_data, start_col, workspace);
++
++    /* Perform the DCT */
++    (*do_dct) (workspace);
++
++    /* Quantize/descale the coefficients, and store into coef_blocks[] */
++    (*do_quantize) (coef_blocks[bi], divisors, workspace);
++  }
++}
++
++
++#ifdef DCT_FLOAT_SUPPORTED
++
++
++METHODDEF(void)
++convsamp_float (JSAMPARRAY sample_data, JDIMENSION start_col, FAST_FLOAT * workspace)
++{
++  register FAST_FLOAT *workspaceptr;
++  register JSAMPROW elemptr;
++  register int elemr;
++
++  workspaceptr = workspace;
++  for (elemr = 0; elemr < DCTSIZE; elemr++) {
++    elemptr = sample_data[elemr] + start_col;
++#if DCTSIZE == 8		/* unroll the inner loop */
++    *workspaceptr++ = (FAST_FLOAT)(GETJSAMPLE(*elemptr++) - CENTERJSAMPLE);
++    *workspaceptr++ = (FAST_FLOAT)(GETJSAMPLE(*elemptr++) - CENTERJSAMPLE);
++    *workspaceptr++ = (FAST_FLOAT)(GETJSAMPLE(*elemptr++) - CENTERJSAMPLE);
++    *workspaceptr++ = (FAST_FLOAT)(GETJSAMPLE(*elemptr++) - CENTERJSAMPLE);
++    *workspaceptr++ = (FAST_FLOAT)(GETJSAMPLE(*elemptr++) - CENTERJSAMPLE);
++    *workspaceptr++ = (FAST_FLOAT)(GETJSAMPLE(*elemptr++) - CENTERJSAMPLE);
++    *workspaceptr++ = (FAST_FLOAT)(GETJSAMPLE(*elemptr++) - CENTERJSAMPLE);
++    *workspaceptr++ = (FAST_FLOAT)(GETJSAMPLE(*elemptr++) - CENTERJSAMPLE);
++#else
++    {
++      register int elemc;
++      for (elemc = DCTSIZE; elemc > 0; elemc--)
++        *workspaceptr++ = (FAST_FLOAT)
++                          (GETJSAMPLE(*elemptr++) - CENTERJSAMPLE);
++    }
++#endif
++  }
++}
++
++
++METHODDEF(void)
++quantize_float (JCOEFPTR coef_block, FAST_FLOAT * divisors, FAST_FLOAT * workspace)
++{
++  register FAST_FLOAT temp;
++  register int i;
++  register JCOEFPTR output_ptr = coef_block;
++
++  for (i = 0; i < DCTSIZE2; i++) {
++    /* Apply the quantization and scaling factor */
++    temp = workspace[i] * divisors[i];
++
++    /* Round to nearest integer.
++     * Since C does not specify the direction of rounding for negative
++     * quotients, we have to force the dividend positive for portability.
++     * The maximum coefficient size is +-16K (for 12-bit data), so this
++     * code should work for either 16-bit or 32-bit ints.
++     */
++    output_ptr[i] = (JCOEF) ((int) (temp + (FAST_FLOAT) 16384.5) - 16384);
++  }
++}
++
++
++METHODDEF(void)
++forward_DCT_float (j_compress_ptr cinfo, jpeg_component_info * compptr,
++		   JSAMPARRAY sample_data, JBLOCKROW coef_blocks,
++		   JDIMENSION start_row, JDIMENSION start_col,
++		   JDIMENSION num_blocks)
++/* This version is used for floating-point DCT implementations. */
++{
++  /* This routine is heavily used, so it's worth coding it tightly. */
++  my_fdct_ptr fdct = (my_fdct_ptr) cinfo->fdct;
++  FAST_FLOAT * divisors = fdct->float_divisors[compptr->quant_tbl_no];
++  FAST_FLOAT * workspace;
++  JDIMENSION bi;
++
++
++  /* Make sure the compiler doesn't look up these every pass */
++  float_DCT_method_ptr do_dct = fdct->float_dct;
++  float_convsamp_method_ptr do_convsamp = fdct->float_convsamp;
++  float_quantize_method_ptr do_quantize = fdct->float_quantize;
++  workspace = fdct->float_workspace;
++
++  sample_data += start_row;	/* fold in the vertical offset once */
++
++  for (bi = 0; bi < num_blocks; bi++, start_col += DCTSIZE) {
++    /* Load data into workspace, applying unsigned->signed conversion */
++    (*do_convsamp) (sample_data, start_col, workspace);
++
++    /* Perform the DCT */
++    (*do_dct) (workspace);
++
++    /* Quantize/descale the coefficients, and store into coef_blocks[] */
++    (*do_quantize) (coef_blocks[bi], divisors, workspace);
++  }
++}
++
++#endif /* DCT_FLOAT_SUPPORTED */
++
++
++/*
++ * Initialize FDCT manager.
++ */
++
++GLOBAL(void)
++jinit_forward_dct (j_compress_ptr cinfo)
++{
++  my_fdct_ptr fdct;
++  int i;
++
++  fdct = (my_fdct_ptr)
++    (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
++				SIZEOF(my_fdct_controller));
++  cinfo->fdct = (struct jpeg_forward_dct *) fdct;
++  fdct->pub.start_pass = start_pass_fdctmgr;
++
++  /* First determine the DCT... */
++  switch (cinfo->dct_method) {
++#ifdef DCT_ISLOW_SUPPORTED
++  case JDCT_ISLOW:
++    fdct->pub.forward_DCT = forward_DCT;
++    if (jsimd_can_fdct_islow())
++      fdct->dct = jsimd_fdct_islow;
++    else
++      fdct->dct = jpeg_fdct_islow;
++    break;
++#endif
++#ifdef DCT_IFAST_SUPPORTED
++  case JDCT_IFAST:
++    fdct->pub.forward_DCT = forward_DCT;
++    if (jsimd_can_fdct_ifast())
++      fdct->dct = jsimd_fdct_ifast;
++    else
++      fdct->dct = jpeg_fdct_ifast;
++    break;
++#endif
++#ifdef DCT_FLOAT_SUPPORTED
++  case JDCT_FLOAT:
++    fdct->pub.forward_DCT = forward_DCT_float;
++    if (jsimd_can_fdct_float())
++      fdct->float_dct = jsimd_fdct_float;
++    else
++      fdct->float_dct = jpeg_fdct_float;
++    break;
++#endif
++  default:
++    ERREXIT(cinfo, JERR_NOT_COMPILED);
++    break;
++  }
++
++  /* ...then the supporting stages. */
++  switch (cinfo->dct_method) {
++#ifdef DCT_ISLOW_SUPPORTED
++  case JDCT_ISLOW:
++#endif
++#ifdef DCT_IFAST_SUPPORTED
++  case JDCT_IFAST:
++#endif
++#if defined(DCT_ISLOW_SUPPORTED) || defined(DCT_IFAST_SUPPORTED)
++    if (jsimd_can_convsamp())
++      fdct->convsamp = jsimd_convsamp;
++    else
++      fdct->convsamp = convsamp;
++    if (jsimd_can_quantize())
++      fdct->quantize = jsimd_quantize;
++    else
++      fdct->quantize = quantize;
++    break;
++#endif
++#ifdef DCT_FLOAT_SUPPORTED
++  case JDCT_FLOAT:
++    if (jsimd_can_convsamp_float())
++      fdct->float_convsamp = jsimd_convsamp_float;
++    else
++      fdct->float_convsamp = convsamp_float;
++    if (jsimd_can_quantize_float())
++      fdct->float_quantize = jsimd_quantize_float;
++    else
++      fdct->float_quantize = quantize_float;
++    break;
++#endif
++  default:
++    ERREXIT(cinfo, JERR_NOT_COMPILED);
++    break;
++  }
++
++  /* Allocate workspace memory */
++#ifdef DCT_FLOAT_SUPPORTED
++  if (cinfo->dct_method == JDCT_FLOAT)
++    fdct->float_workspace = (FAST_FLOAT *)
++      (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
++				  SIZEOF(FAST_FLOAT) * DCTSIZE2);
++  else
++#endif
++    fdct->workspace = (DCTELEM *)
++      (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
++				  SIZEOF(DCTELEM) * DCTSIZE2);
++
++  /* Mark divisor tables unallocated */
++  for (i = 0; i < NUM_QUANT_TBLS; i++) {
++    fdct->divisors[i] = NULL;
++#ifdef DCT_FLOAT_SUPPORTED
++    fdct->float_divisors[i] = NULL;
++#endif
++  }
++}
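
Note on the rounding in quantize_float above: adding 16384.5 before the integer cast keeps the operand positive, so the truncating cast behaves like floor and the result is round-to-nearest for negative inputs as well. The following is a minimal standalone sketch of that one expression, not code from the patch. The helper names round_biased and round_naive are hypothetical, and the sketch assumes inputs within the roughly +-16K range that the source comment mentions.

```c
/* Hypothetical helpers for illustration only; not part of libjpeg-turbo. */

/* Round to nearest by biasing the value into the positive range first.
 * For |temp| < 16384 the sum is positive, so the cast truncates downward.
 */
static int round_biased(float temp)
{
  return (int) (temp + 16384.5f) - 16384;
}

/* Naive variant: add 0.5 and truncate.  The cast truncates toward zero,
 * so negative inputs are rounded in the wrong direction.
 */
static int round_naive(float temp)
{
  return (int) (temp + 0.5f);
}
```

For example, round_biased(-2.6f) gives -3, whereas round_naive(-2.6f) gives -2. With this form, exact halves round toward positive infinity, so round_biased(-0.5f) is 0 and round_biased(0.5f) is 1.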
 === added file 'src/libjpeg-turbo/jchuff.c'
 --- src/libjpeg-turbo/jchuff.c	1970-01-01 00:00:00 +0000
 +++ src/libjpeg-turbo/jchuff.c	2012-06-27 16:20:24 +0000
@@ -0,0 +1,1026 @@
++/*
++ * jchuff.c
++ *
++ * Copyright (C) 1991-1997, Thomas G. Lane.
++ * Copyright (C) 2009-2011, D. R. Commander.
++ * This file is part of the Independent JPEG Group's software.
++ * For conditions of distribution and use, see the accompanying README file.
++ *
++ * This file contains Huffman entropy encoding routines.
++ *
++ * Much of the complexity here has to do with supporting output suspension.
++ * If the data destination module demands suspension, we want to be able to
++ * back up to the start of the current MCU.  To do this, we copy state
++ * variables into local working storage, and update them back to the
++ * permanent JPEG objects only upon successful completion of an MCU.
++ */
++
++#define JPEG_INTERNALS
++#include "jinclude.h"
++#include "jpeglib.h"
++#include "jchuff.h"		/* Declarations shared with jcphuff.c */
++#include <limits.h>
++
++static unsigned char jpeg_nbits_table[65536];
++static int jpeg_nbits_table_init = 0;
++
++#ifndef min
++ #define min(a,b) ((a)<(b)?(a):(b))
++#endif
++
++
++/* Expanded entropy encoder object for Huffman encoding.
++ *
++ * The savable_state subrecord contains fields that change within an MCU,
++ * but must not be updated permanently until we complete the MCU.
++ */
++
++typedef struct {
++  size_t put_buffer;		/* current bit-accumulation buffer */
++  int put_bits;			/* # of bits now in it */
++  int last_dc_val[MAX_COMPS_IN_SCAN]; /* last DC coef for each component */
++} savable_state;
++
++/* This macro is to work around compilers with missing or broken
++ * structure assignment.  You'll need to fix this code if you have
++ * such a compiler and you change MAX_COMPS_IN_SCAN.
++ */
++
++#ifndef NO_STRUCT_ASSIGN
++#define ASSIGN_STATE(dest,src)  ((dest) = (src))
++#else
++#if MAX_COMPS_IN_SCAN == 4
++#define ASSIGN_STATE(dest,src)  \
++	((dest).put_buffer = (src).put_buffer, \
++	 (dest).put_bits = (src).put_bits, \
++	 (dest).last_dc_val[0] = (src).last_dc_val[0], \
++	 (dest).last_dc_val[1] = (src).last_dc_val[1], \
++	 (dest).last_dc_val[2] = (src).last_dc_val[2], \
++	 (dest).last_dc_val[3] = (src).last_dc_val[3])
++#endif
++#endif
++
++
++typedef struct {
++  struct jpeg_entropy_encoder pub; /* public fields */
++
++  savable_state saved;		/* Bit buffer & DC state at start of MCU */
++
++  /* These fields are NOT loaded into local working state. */
++  unsigned int restarts_to_go;	/* MCUs left in this restart interval */
++  int next_restart_num;		/* next restart number to write (0-7) */
++
++  /* Pointers to derived tables (these workspaces have image lifespan) */
++  c_derived_tbl * dc_derived_tbls[NUM_HUFF_TBLS];
++  c_derived_tbl * ac_derived_tbls[NUM_HUFF_TBLS];
++
++#ifdef ENTROPY_OPT_SUPPORTED	/* Statistics tables for optimization */
++  long * dc_count_ptrs[NUM_HUFF_TBLS];
++  long * ac_count_ptrs[NUM_HUFF_TBLS];
++#endif
++} huff_entropy_encoder;
++
++typedef huff_entropy_encoder * huff_entropy_ptr;
++
++/* Working state while writing an MCU.
++ * This struct contains all the fields that are needed by subroutines.
++ */
++
++typedef struct {
++  JOCTET * next_output_byte;	/* => next byte to write in buffer */
++  size_t free_in_buffer;	/* # of byte spaces remaining in buffer */
++  savable_state cur;		/* Current bit buffer & DC state */
++  j_compress_ptr cinfo;		/* dump_buffer needs access to this */
++} working_state;
++
++
++/* Forward declarations */
++METHODDEF(boolean) encode_mcu_huff JPP((j_compress_ptr cinfo,
++					JBLOCKROW *MCU_data));
++METHODDEF(void) finish_pass_huff JPP((j_compress_ptr cinfo));
++#ifdef ENTROPY_OPT_SUPPORTED
++METHODDEF(boolean) encode_mcu_gather JPP((j_compress_ptr cinfo,
++					  JBLOCKROW *MCU_data));
++METHODDEF(void) finish_pass_gather JPP((j_compress_ptr cinfo));
++#endif
++
++
++/*
++ * Initialize for a Huffman-compressed scan.
++ * If gather_statistics is TRUE, we do not output anything during the scan,
++ * just count the Huffman symbols used and generate Huffman code tables.
++ */
++
++METHODDEF(void)
++start_pass_huff (j_compress_ptr cinfo, boolean gather_statistics)
++{
++  huff_entropy_ptr entropy = (huff_entropy_ptr) cinfo->entropy;
++  int ci, dctbl, actbl;
++  jpeg_component_info * compptr;
++
++  if (gather_statistics) {
++#ifdef ENTROPY_OPT_SUPPORTED
++    entropy->pub.encode_mcu = encode_mcu_gather;
++    entropy->pub.finish_pass = finish_pass_gather;
++#else
++    ERREXIT(cinfo, JERR_NOT_COMPILED);
++#endif
++  } else {
++    entropy->pub.encode_mcu = encode_mcu_huff;
++    entropy->pub.finish_pass = finish_pass_huff;
++  }
++
++  for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
++    compptr = cinfo->cur_comp_info[ci];
++    dctbl = compptr->dc_tbl_no;
++    actbl = compptr->ac_tbl_no;
++    if (gather_statistics) {
++#ifdef ENTROPY_OPT_SUPPORTED
++      /* Check for invalid table indexes */
++      /* (make_c_derived_tbl does this in the other path) */
++      if (dctbl < 0 || dctbl >= NUM_HUFF_TBLS)
++	ERREXIT1(cinfo, JERR_NO_HUFF_TABLE, dctbl);
++      if (actbl < 0 || actbl >= NUM_HUFF_TBLS)
++	ERREXIT1(cinfo, JERR_NO_HUFF_TABLE, actbl);
++      /* Allocate and zero the statistics tables */
++      /* Note that jpeg_gen_optimal_table expects 257 entries in each table! */
++      if (entropy->dc_count_ptrs[dctbl] == NULL)
++	entropy->dc_count_ptrs[dctbl] = (long *)
++	  (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
++				      257 * SIZEOF(long));
++      MEMZERO(entropy->dc_count_ptrs[dctbl], 257 * SIZEOF(long));
++      if (entropy->ac_count_ptrs[actbl] == NULL)
++	entropy->ac_count_ptrs[actbl] = (long *)
++	  (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
++				      257 * SIZEOF(long));
++      MEMZERO(entropy->ac_count_ptrs[actbl], 257 * SIZEOF(long));
++#endif
++    } else {
++      /* Compute derived values for Huffman tables */
++      /* We may do this more than once for a table, but it's not expensive */
++      jpeg_make_c_derived_tbl(cinfo, TRUE, dctbl,
++			      & entropy->dc_derived_tbls[dctbl]);
++      jpeg_make_c_derived_tbl(cinfo, FALSE, actbl,
++			      & entropy->ac_derived_tbls[actbl]);
++    }
++    /* Initialize DC predictions to 0 */
++    entropy->saved.last_dc_val[ci] = 0;
++  }
++
++  /* Initialize bit buffer to empty */
++  entropy->saved.put_buffer = 0;
++  entropy->saved.put_bits = 0;
++
++  /* Initialize restart stuff */
++  entropy->restarts_to_go = cinfo->restart_interval;
++  entropy->next_restart_num = 0;
++}
++
++
++/*
++ * Compute the derived values for a Huffman table.
++ * This routine also performs some validation checks on the table.
++ *
++ * Note this is also used by jcphuff.c.
++ */
++
++GLOBAL(void)
++jpeg_make_c_derived_tbl (j_compress_ptr cinfo, boolean isDC, int tblno,
++			 c_derived_tbl ** pdtbl)
++{
++  JHUFF_TBL *htbl;
++  c_derived_tbl *dtbl;
++  int p, i, l, lastp, si, maxsymbol;
++  char huffsize[257];
++  unsigned int huffcode[257];
++  unsigned int code;
++
++  /* Note that huffsize[] and huffcode[] are filled in code-length order,
++   * paralleling the order of the symbols themselves in htbl->huffval[].
++   */
++
++  /* Find the input Huffman table */
++  if (tblno < 0 || tblno >= NUM_HUFF_TBLS)
++    ERREXIT1(cinfo, JERR_NO_HUFF_TABLE, tblno);
++  htbl =
++    isDC ? cinfo->dc_huff_tbl_ptrs[tblno] : cinfo->ac_huff_tbl_ptrs[tblno];
++  if (htbl == NULL)
++    ERREXIT1(cinfo, JERR_NO_HUFF_TABLE, tblno);
++
++  /* Allocate a workspace if we haven't already done so. */
++  if (*pdtbl == NULL)
++    *pdtbl = (c_derived_tbl *)
++      (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
++				  SIZEOF(c_derived_tbl));
++  dtbl = *pdtbl;
++
++  /* Figure C.1: make table of Huffman code length for each symbol */
++
++  p = 0;
++  for (l = 1; l <= 16; l++) {
++    i = (int) htbl->bits[l];
++    if (i < 0 || p + i > 256)	/* protect against table overrun */
++      ERREXIT(cinfo, JERR_BAD_HUFF_TABLE);
++    while (i--)
++      huffsize[p++] = (char) l;
++  }
++  huffsize[p] = 0;
++  lastp = p;
++
++  /* Figure C.2: generate the codes themselves */
++  /* We also validate that the counts represent a legal Huffman code tree. */
++
++  code = 0;
++  si = huffsize[0];
++  p = 0;
++  while (huffsize[p]) {
++    while (((int) huffsize[p]) == si) {
++      huffcode[p++] = code;
++      code++;
++    }
++    /* code is now 1 more than the last code used for codelength si; but
++     * it must still fit in si bits, since no code is allowed to be all ones.
++     */
++    if (((INT32) code) >= (((INT32) 1) << si))
++      ERREXIT(cinfo, JERR_BAD_HUFF_TABLE);
++    code <<= 1;
++    si++;
++  }
++
++  /* Figure C.3: generate encoding tables */
++  /* These are code and size indexed by symbol value */
++
++  /* Set all codeless symbols to have code length 0;
++   * this lets us detect duplicate VAL entries here, and later
++   * allows emit_bits to detect any attempt to emit such symbols.
++   */
++  MEMZERO(dtbl->ehufsi, SIZEOF(dtbl->ehufsi));
++
++  /* This is also a convenient place to check for out-of-range
++   * and duplicated VAL entries.  We allow 0..255 for AC symbols
++   * but only 0..15 for DC.  (We could constrain them further
++   * based on data depth and mode, but this seems enough.)
++   */
++  maxsymbol = isDC ? 15 : 255;
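
Note on the Figure C.1 and Figure C.2 steps in jpeg_make_c_derived_tbl above: the per-length counts are first expanded into a list of code lengths, then consecutive code values are assigned within each length and the running code is shifted left by one bit when moving to the next length. The following is a minimal standalone sketch of those two steps under stated assumptions, not the patch's implementation. The function name build_canonical_codes and its parameters are hypothetical, and it returns -1 where the original calls ERREXIT.

```c
/* Hypothetical helper for illustration only; not part of libjpeg-turbo.
 *
 * bits[l] is the number of codes of length l, for l = 1..16 (bits[0] unused).
 * On success, size_out[] and code_out[] hold the code length and code value
 * of each symbol in code-length order, and the symbol count is returned.
 * Returns -1 if the counts overrun the table or do not form a legal code.
 */
static int build_canonical_codes(const unsigned char bits[17],
                                 unsigned char size_out[256],
                                 unsigned int code_out[256])
{
  int p = 0, l, i, n;
  unsigned int code = 0;

  /* Figure C.1: expand the counts into one code length per symbol. */
  for (l = 1; l <= 16; l++) {
    n = bits[l];
    if (p + n > 256)
      return -1;                /* protect against table overrun */
    for (i = 0; i < n; i++)
      size_out[p++] = (unsigned char) l;
  }
  n = p;

  /* Figure C.2: assign consecutive codes within each length. */
  p = 0;
  for (l = 1; l <= 16 && p < n; l++) {
    while (p < n && size_out[p] == l)
      code_out[p++] = code++;
    /* code is one past the last code of length l; it must still fit in
     * l bits, because no code may consist entirely of one-bits.
     */
    if (code >= (1u << l))
      return -1;
    code <<= 1;
  }
  return n;
}
```

As a usage example, the counts {0, 0,1,5,1,1,1,1,1,1,0,0,0,0,0,0,0} (the typical DC luminance counts, as far as I recall from the JPEG specification) yield 12 symbols: the first gets the 2-bit code 00, the next five get the 3-bit codes 010 through 110, and the last gets the 9-bit code 111111110. Counts with two 1-bit codes are rejected, since the second code would be all ones.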
