~dokomix/dokomix/+git/taisei-basis-universal:taisei-1.12

Last commit made on 2020-08-05
Get this branch:
git clone -b taisei-1.12 https://git.launchpad.net/~dokomix/dokomix/+git/taisei-basis-universal

Branch information

Name:
taisei-1.12
Repository:
lp:~dokomix/dokomix/+git/taisei-basis-universal

Recent commits

8abe8b3... by Andrei Alexeyev <email address hidden>

meson: Force C++14

f24780c... by Arseny Kapoulkine

Optimize mipmap generation (BinomialLLC/basis_universal#113)

Squashed commit of the following:

commit 31439ce0e62ca97d41002bfe5fa6e7af90373f36
Author: Arseny Kapoulkine <email address hidden>
Date: Wed Mar 11 20:21:17 2020 -0700

    Optimize mipmap generation

    Instead of starting from the first image every time we generate a mip,
    skip 3 levels (this results in 8x8 => 1x1 reduction).

    This provides a good compromise between performance and quality - beyond
    8x8 the extra rounding errors introduced by temporary copies become
    insignificant.

    This makes the process of generating a mip chain much faster; the full
    2K color texture compression drops from 4.7 to 3.7 seconds after this.
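
As an illustration only, the idea in this commit message can be sketched in Python. This is one reading of "skip 3 levels": each mip level n is resampled from level max(0, n - 3) rather than from the base image. The box filter, the square power-of-two image, and the names `box_downsample` and `build_mip_chain` are assumptions made for the sketch, not the library's code.

```python
def box_downsample(img, factor):
    """Average each factor x factor tile of a square 2D list into one texel."""
    size = len(img) // factor
    area = factor * factor
    return [
        [
            sum(img[y * factor + dy][x * factor + dx]
                for dy in range(factor) for dx in range(factor)) / area
            for x in range(size)
        ]
        for y in range(size)
    ]


def build_mip_chain(base, max_skip=3):
    """Build mips down to 1x1, sourcing level n from level max(0, n - max_skip)."""
    levels = [base]
    size = len(base)
    n = 1
    while size > 1:
        size //= 2
        src = max(0, n - max_skip)  # never reach back more than max_skip levels
        levels.append(box_downsample(levels[src], 2 ** (n - src)))
        n += 1
    return levels
```

With `max_skip=3` the largest reduction performed in one step is 8x8 texels to 1, so the cost of each level stays bounded instead of growing with the size of the base image.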

ac4453e... by Arseny Kapoulkine

Optimize color_distance function (BinomialLLC/basis_universal#112)

Squashed commit of the following:

commit 72d34a2496feaa861e91f0a5ea5c994c0491ffce
Merge: a5c9c1f c012fe6
Author: Arseny Kapoulkine <email address hidden>
Date: Thu Apr 2 14:14:59 2020 -0700

    Merge branch 'master' into color-distance

commit a5c9c1f50d135882d68626c4dd420b88eb1a8288
Author: Arseny Kapoulkine <email address hidden>
Date: Fri Mar 20 09:14:12 2020 -0700

    Use color_distance4 in two more code paths

    These are only exercised in comp_level 3+, which is why they weren't
    converted originally - but the conversion process is straightforward.

commit 010998759ff7549e6804bbfd6f71e74cf6ff6623
Author: Arseny Kapoulkine <email address hidden>
Date: Wed Mar 11 00:27:05 2020 -0700

    Optimize color_distance function

    This change optimizes the perceptual color distance function in a few
    steps:

    1. Instead of converting each color to YCrCb, and then taking the
    deltas, we compute the deltas in RGB and then convert deltas to YCrCb
    space. This is using the fact that (R1 - Y1) - (R2 - Y2) = (R1 - R2) -
    (Y1 - Y2), and Y delta can be computed from R/G/B deltas using the same
    coefficients.

    2. Instead of computing color distances one at a time, we introduce an
    SSE2-optimized function that computes the distance from one color to a
    block of 4. It also returns the index of the minimum error since that's
    used by almost every caller. All performance-sensitive locations where
    this is feasible are converted to color_distance4.

    3. In find_optimal_selector_clusters_for_each_block, a lot of time was
    taken computing color distances, and they didn't form neat groups of
    4. However, because we only have 4 colors to compare against, and 16
    colors to choose from, we have to compute a grand total of 64 unique
    color deltas; the cluster count, however, is typically ~200 and we
    computed 16 deltas for each cluster - so it's cheaper to precompute all
    64 deltas and just add them up.
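
Step 3 above can be sketched roughly as follows, in Python for brevity. The sketch assumes a block of 16 pixels, 4 candidate block colors, and clusters given as lists of 16 selector values; `color_distance` here is a plain squared RGB distance standing in for the perceptual metric, and all function names are hypothetical.

```python
def color_distance(a, b):
    """Plain squared RGB distance; a stand-in for the perceptual metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_cluster_naive(pixels, block_colors, clusters):
    """Evaluate 16 color distances for every cluster."""
    errors = [
        sum(color_distance(pixels[p], block_colors[sel[p]]) for p in range(16))
        for sel in clusters
    ]
    return errors.index(min(errors))


def best_cluster_precomputed(pixels, block_colors, clusters):
    """Evaluate the 16 x 4 = 64 unique distances once, then only add them up."""
    table = [[color_distance(px, c) for c in block_colors] for px in pixels]
    errors = [sum(table[p][sel[p]] for p in range(16)) for sel in clusters]
    return errors.index(min(errors))
```

With around 200 clusters, the naive version evaluates roughly 3200 distances per block, while the precomputed version evaluates 64 and replaces the rest with table lookups and additions.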

    Together these optimizations reduce the time to compress a 2Kx2K color
    image from 6.6 sec (post #105) to 4.7 sec, including image load and mipmap
    generation time.

    Note that the results aren't bit-exact due to the small floating point
    drift in perceptually correct color distance, but PSNR is the same as
    before.
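
The identity from step 1, and the reason the results can drift slightly, can be checked with a small Python sketch. The Rec. 709 luma weights and the unweighted sum of squared luma and chroma deltas are assumptions chosen for illustration; the weights used by the actual function may differ.

```python
# Rec. 709 luma weights; assumed here for illustration only.
KR, KG, KB = 0.2126, 0.7152, 0.0722


def luma(r, g, b):
    return KR * r + KG * g + KB * b


def distance_convert_first(c1, c2):
    """Convert both colors to (Y, R - Y, B - Y), then take deltas."""
    y1, y2 = luma(*c1), luma(*c2)
    dy = y1 - y2
    dcr = (c1[0] - y1) - (c2[0] - y2)
    dcb = (c1[2] - y1) - (c2[2] - y2)
    return dy * dy + dcr * dcr + dcb * dcb


def distance_delta_first(c1, c2):
    """Take RGB deltas first, then convert the deltas with the same weights."""
    dr, dg, db = (a - b for a, b in zip(c1, c2))
    dy = luma(dr, dg, db)
    dcr = dr - dy
    dcb = db - dy
    return dy * dy + dcr * dcr + dcb * dcb
```

The two functions are algebraically equal because the luma conversion is linear, but they round at different points, so floating point results agree only to within a small tolerance.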

ee95645... by Arseny Kapoulkine

Optimize ETC1 selector palette matching (BinomialLLC/basis_universal#105)

Squashed commit of the following:

commit 42bc2cbabd5a86e319176d61597ad23afec4aaa3
Author: Arseny Kapoulkine <email address hidden>
Date: Thu Feb 6 00:04:31 2020 -0800

    Optimize ETC1 selector error computation

    Using a similar observation to the previous change, we don't need to
    evaluate up to 16 color distances for each of the 64 palette entries -
    instead, we can precompute the per-pixel errors introduced by changing
    the selector ahead of time.

    This reduces the number of color_distance evaluations, which are pretty
    expensive, and further improves the encode times: on a 2Kx2K image on a
    Core i7-8700K, the previous change reduced the time from 8.5 seconds to 7.3
    seconds, and this one further reduces it to 6.6 seconds.

commit 3acffe254b379aeba6f8f84bac711129257ee545
Author: Arseny Kapoulkine <email address hidden>
Date: Wed Feb 5 23:42:37 2020 -0800

    Optimize ETC1 selector palette matching

    Right now when compression level isn't 0, we take each 4x4 block and try
    to replace the selector indices with one of the 64 entries in the
    palette. If we find a resulting selector block that has a low enough
    error, we use that instead.

    The way the code is implemented right now, we decode each block from
    scratch for each entry, which means we decode each pixel 64 times.

    However, each pixel's decoded result only depends on the endpoints (that
    are static) and on the selector in that pixel (which has only 4 possible
    values), so for each pixel we need at most 4 decodes.

    This change switches the code around so that we pre-decode all 4
    variants of the entire block ahead of time, and then just index into
    this when evaluating matches.

    This results in bit-identical images since we look at the exact same
    data, but the encoding is faster - on a 2Kx2K PNG on a Core i7-8700K,
    this change reduces the encoding time including a full mip chain from
    8.5 seconds to 7.3 seconds.
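
A simplified Python sketch of the pre-decoding idea described in this commit message follows. It assumes each pixel has a fixed base color and that a selector picks one of four additive modifiers; the `MODIFIERS` values and all function names are illustrative placeholders, not the ETC1 tables or the library's API.

```python
MODIFIERS = (-8, -2, 2, 8)  # illustrative values, one per selector 0..3


def decode_pixel(base, selector):
    """Add the selector's modifier to each channel of the base color, clamped to 0..255."""
    m = MODIFIERS[selector]
    return tuple(min(255, max(0, ch + m)) for ch in base)


def decode_per_entry(bases, palette):
    """Original approach: decode all 16 pixels again for every palette entry."""
    return [[decode_pixel(bases[p], sel[p]) for p in range(16)] for sel in palette]


def decode_with_variants(bases, palette):
    """Pre-decode the 4 variants of each pixel once, then index into them."""
    variants = [[decode_pixel(base, s) for s in range(4)] for base in bases]
    return [[variants[p][sel[p]] for p in range(16)] for sel in palette]
```

For a 64-entry palette, the first function performs 64 x 16 = 1024 pixel decodes per block and the second performs 4 x 16 = 64. Both read the same decoded values, which is why the output is unchanged.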

e917bd8... by Andrei Alexeyev <email address hidden>

Add some missing C API wrappers

5dad062... by Andrei Alexeyev <email address hidden>

fix C++ crap

43d4103... by Andrei Alexeyev <email address hidden>

Add a C wrapper for the transcoder (untested)

060e52b... by Andrei Alexeyev <email address hidden>

Add meson build system

102275a... by Rich Geldreich

Fixing filename

5f43e1f... by Rich Geldreich

Rebuilt