Finding the "right levels" for quantization with anim GIFs


Post by SineSwiper »

Okay, I'm not really getting anywhere on this thread, so I figured I would post here. I'm trying to dive deeper into the movie to anim GIF guide and trying to find ways of improving the technique using a tri-level ordered-dither. The problem is two-fold:

1. How many "levels" of colors are needed for the animation as a whole?
2. How can we leverage the average human's visual perception to add or subtract levels of different colors?

While I've dived into item #1 with only partial success, I think I'm on to something with item #2. According to this article (and others), and especially the graph below, humans have a harder time seeing blue than red or green, and green seems to be the easiest color to spot. (I am going to assume this applies to differences in color levels for RGB as well.) Because the "blue" cone's absorbance leans towards violet (blue+red) rather than true blue, red gets an advantage over blue, but not as much as green.

Image

Unfortunately, I do not have a more scientific set of mathematical ratios, but for the time being, I can use a 2:1 ratio for G:R, and a 2:1 ratio for R:B. Going back to the plane.avi, we can compare a single-level vs tri-level ordered dither:

$ convert plane.avi -ordered-dither o8x8,12,24,6 -append -format %k info:
102
$ convert plane.avi -ordered-dither o8x8,12 -append -format %k info:
73
$ convert plane.avi -ordered-dither o8x8,14 -append -format %k info:
92
$ convert plane.avi -ordered-dither o8x8,15 -append -format %k info:
109
$ convert -quiet -delay 1 plane.avi -ordered-dither o8x8,15 +map plane_od1.gif
$ convert -quiet -delay 1 plane.avi -ordered-dither o8x8,12,24,6 +map plane_od2.gif
$ cat > test.html
<img src="plane_od1.gif">
<img src="plane_od2.gif">
^D
$
So, we have two anim GIFs with around the same number of colors (109 vs. 102), but one is a single-level ordered dither (OD) and the other is a tri-level OD using the above ratio. (The single-level base had to go from 12 to 15 to match the tri-level color count, but both 12 and 15 give similar results.) What do we get? Well, the effect is subtle, but the tri-level OD appears to be of slightly better quality than the single-level one: the single-level version (plane_od1.gif) shows a few more dither pixels that are visible to the eye. So, how can we abuse this for the full-scale 256-color GIF?

$ convert plane.avi -ordered-dither o8x8,23 -append -format %k info:
235
$ convert plane.avi -ordered-dither o8x8,23,46,11 -append -format %k info:
288
$ convert plane.avi -ordered-dither o8x8,23,35,15 -append -format %k info:
252
$ convert plane.avi -ordered-dither o8x8,21,42,10 -append -format %k info:
247
$ convert -quiet -delay 1 plane.avi -ordered-dither o8x8,23 +map plane_od1.gif
$ convert -quiet -delay 1 plane.avi -ordered-dither o8x8,23,35,15 +map plane_od_tri1.5.gif
$ convert -quiet -delay 1 plane.avi -ordered-dither o8x8,21,42,10 +map plane_od_tri2.0.gif
$ cat > test.html
<img src="plane_od1.gif">
<img src="plane_od_tri1.5.gif">
<img src="plane_od_tri2.0.gif">
^D
$
So, using all of the palette, we get a 23/23/23 at 235 colors, a 23/35/15 (1.5:1 ratio) at 252 colors, and a 21/42/10 (2:1 ratio with a slight reduction of the base) at 247 colors. (The straight 2:1 version, 23/46/11, overshoots the palette at 288 colors.) Right off the bat, we're already squeezing more colors into the palette, due to the finer control. The images look almost identical, and the file sizes are about the same (plus or minus a few KB). Let's try lowering the tri-color base a little bit:

$ convert plane.avi -ordered-dither o8x8,18,27,12 -append -format %k info:
160
$ convert plane.avi -ordered-dither o8x8,18,36,9 -append -format %k info:
188
$ convert -quiet -delay 1 plane.avi -ordered-dither o8x8,18,27,12 +map plane_od_tri1.5.gif
$ convert -quiet -delay 1 plane.avi -ordered-dither o8x8,18,36,9 +map plane_od_tri2.0.gif
Awesome! We squeezed out a bunch of colors while maintaining a visually identical quality. The blue tends to get a little darker, but there aren't any signs of dithering differences. And the file sizes are 646K vs. 567K/583K for the 1.5/2.0:1 ratios, respectively. That's a 79K difference on the high end, with a direct translation when we OptimizeTransparency. How low can we go?

$ convert plane.avi -ordered-dither o8x8,15,23,10 -append -format %k info:
128
$ convert plane.avi -ordered-dither o8x8,15,30,8 -append -format %k info:
145
$ convert -quiet -delay 1 plane.avi -ordered-dither o8x8,15,23,10 +map plane_od_tri1.5.gif
$ convert -quiet -delay 1 plane.avi -ordered-dither o8x8,15,30,8 +map plane_od_tri2.0.gif
Okay, now we're starting to see some extra dithering artifacts, but the difference is very slight. However, that slight difference gives us a 506K image, which is a 140K difference from the original! What about dropping the base on a single-level OD?

$ convert plane.avi -ordered-dither o8x8,15 -append -format %k info:
109
$ convert plane.avi -ordered-dither o8x8,18 -append -format %k info:
149
$ convert -quiet -delay 1 plane.avi -ordered-dither o8x8,18 +map plane_od2.gif
$ convert -quiet -delay 1 plane.avi -ordered-dither o8x8,15 +map plane_od3.gif
$ cat > test.html
23: <img src="plane_od1.gif"><br>
18: <img src="plane_od2.gif"><br>
15: <img src="plane_od3.gif">
<img src="plane_od_tri1.5.gif">
<img src="plane_od_tri2.0.gif">
Is there a difference? Yes, definitely! The base 15 single OD version looks fairly crappy compared to all of them. Even the base 18 single OD version, at 569K, isn't as good as the base 15 tri-color OD versions.

So, in summary, this is a valid technique to reduce the size of the animation without reducing the quality, or if you prefer, raising the quality of the image without raising the size of the animation. However, if anybody has any other clues to item #1 and the "official" ratios, I'd like to improve upon this.
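
The level triples used in this post follow a simple pattern: red keeps the base, green is the base times the ratio, and blue is the base divided by the ratio. A minimal sketch of that arithmetic follows. The function name `tri_levels` and its rounding rule (green rounded half up, blue rounded down) are assumptions rather than something stated above; the rule reproduces most of the triples in this post, but not all of them (15,30,8 comes out as 15,30,7).

```shell
# tri_levels BASE NUM DEN
# Prints "R,G,B" levels for a G:R and R:B ratio of NUM:DEN.
# Hypothetical helper: the rounding rule is an assumption.
tri_levels() {
  local base=$1 num=$2 den=$3
  # green = round_half_up(base * num / den), in integer arithmetic
  local green=$(( (2 * base * num + den) / (2 * den) ))
  # blue = floor(base * den / num)
  local blue=$(( base * den / num ))
  echo "${base},${green},${blue}"
}

tri_levels 12 2 1   # 2:1 ratio on base 12
tri_levels 23 3 2   # 1.5:1 ratio on base 23
tri_levels 18 2 1   # 2:1 ratio on base 18
```

The output can be dropped straight into the dither argument, for example `-ordered-dither "o8x8,$(tri_levels 18 2 1)"`.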

Re: Finding the "right levels" for quantization with anim GI

Post by SineSwiper »

Hmmm, okay, I started investigating colorspaces and may have found a more "official ratio". It looks like L*a*b* is my answer:
Wikipedia wrote: "Unlike the RGB and CMYK color models, Lab color is designed to approximate human vision. It aspires to perceptual uniformity, and its L component closely matches human perception of lightness."
Great... now I don't have to make guesses at human perception ratios. So, some more testing, this time using a movie clip that is closer to the "real world". In this case, we're more worried about the highest quality within the 256 color palette.

(format guessing excluded)

$ convert WorldNotEnough.pam -ordered-dither o8x8,11 -append -format %k info:
232
$ convert WorldNotEnough.pam -quiet -delay 1x10 -ordered-dither o8x8,11 +map -coalesce -deconstruct -layers RemoveDups -layers Optimize twine_rgb_11.gif
$ convert WorldNotEnough.pam -ordered-dither o8x8,10,21,5 -append -format %k info:
241
$ convert WorldNotEnough.pam -quiet -delay 1x10 -ordered-dither o8x8,10,21,5 +map -coalesce -deconstruct -layers RemoveDups -layers Optimize twine_rgb_10-21-5.gif
$ convert WorldNotEnough.pam -ordered-dither o8x8,11,17,7 -append -format %k info:
251
$ convert WorldNotEnough.pam -quiet -delay 1x10 -ordered-dither o8x8,11,17,7 +map -coalesce -deconstruct -layers RemoveDups -layers Optimize twine_rgb_11-17-7.gif
$ convert WorldNotEnough.pam -colorspace YUV -ordered-dither o8x8,27,12,18 -append -format "%r - %k - %z" info:
DirectClassYUV - 437 - 8
$ convert WorldNotEnough.pam -colorspace YUV -ordered-dither o8x8,27,12,18 -append -colorspace RGB -format "%r - %k - %z" info:
DirectClassRGB - 437 - 8
$ convert WorldNotEnough.pam -colorspace YUV -ordered-dither o8x8,21,9,14 -append -colorspace RGB -format "%r - %k - %z" info:
DirectClassRGB - 254 - 8
$ convert WorldNotEnough.pam -quiet -delay 1x10 -colorspace YUV -ordered-dither o8x8,21,9,14 +map -coalesce -deconstruct -layers RemoveDups -layers Optimize twine_yuv_21-9-14.gif
A few RGB examples, like before, as well as a YUV variant I was testing earlier (YUV with a 1.5 ratio favoring luma and punishing blue). Since I'm playing with colorspaces, the color counting requires a conversion back to RGB to make sure we're looking at an accurate count of RGB colors. This doesn't matter for YUV (437 colors either way), but Lab has a much larger gamut, including colors that aren't defined in RGB, so the conversion to RGB will be lossy. Now for the Lab example:

$ convert WorldNotEnough.pam -colorspace Lab -ordered-dither o8x8,11 -append -format "%r - %k - %z" info:
DirectClassLab - 201 - 8
$ convert WorldNotEnough.pam -colorspace Lab +depth -ordered-dither o8x8,11 -append -format "%r - %k - %z" info:
DirectClassLab - 201 - 16
$ convert WorldNotEnough.pam +depth -colorspace Lab -ordered-dither o8x8,11 -append -colorspace RGB -format "%r - %k - %z" info:
DirectClassRGB - 117 - 16
$ convert WorldNotEnough.pam +depth -colorspace Lab -ordered-dither o8x8,11 -append -colorspace RGB -depth 8 -format "%r - %k - %z" info:
DirectClassRGB - 117 - 8
$ convert WorldNotEnough.pam +depth -colorspace Lab -ordered-dither o8x8,16 -append -colorspace RGB -depth 8 -format "%r - %k - %z" info:
DirectClassRGB - 243 - 8
$ convert WorldNotEnough.pam +depth -colorspace Lab -ordered-dither o8x8,17 -append -colorspace RGB -depth 8 -format "%r - %k - %z" info:
DirectClassRGB - 275 - 8
$ convert WorldNotEnough.pam -colorspace Lab -ordered-dither o8x8,16 -append -colorspace RGB -format "%r - %k - %z" info:
DirectClassRGB - 243 - 8
$ convert WorldNotEnough.pam -colorspace Lab -ordered-dither o8x8,17 -append -colorspace RGB -format "%r - %k - %z" info:
DirectClassRGB - 275 - 8
$ convert WorldNotEnough.pam -delay 1x10 -colorspace Lab -ordered-dither o8x8,16 +map -coalesce -deconstruct -layers RemoveDups -layers Optimize twine_lab_16.gif
One thing to notice is that you can clearly see the RGB color loss, going from 201 Lab colors to 117 RGB colors. Also, while a 16-bit Lab depth is ideal for perfect RGB conversion, I'm guessing that IM does it internally already (the counts are identical with and without +depth), so there's no need for depth parameters.

And the result? Terrible. A noticeable amount of dithering. But, we are treating all three factors equally. Remember that the L* is "lightness", and the other two are the "color-opponent dimensions". So, given that we can see the pixels because of the larger differences in lightness, we can try different flavors of L vs. ab:

$ convert WorldNotEnough.pam -quiet -delay 1x10 -colorspace Lab -ordered-dither o8x8,18,15,15 +map -coalesce -deconstruct -layers RemoveDups -layers Optimize twine_lab_18-15-15.gif
$ convert WorldNotEnough.pam -quiet -delay 1x10 -colorspace Lab -ordered-dither o8x8,19,14,14 +map -coalesce -deconstruct -layers RemoveDups -layers Optimize twine_lab_19-14-14.gif
$ convert WorldNotEnough.pam -quiet -delay 1x10 -colorspace Lab -ordered-dither o8x8,21,13,13 +map -coalesce -deconstruct -layers RemoveDups -layers Optimize twine_lab_21-13-13.gif
$ convert WorldNotEnough.pam -quiet -delay 1x10 -colorspace Lab -ordered-dither o8x8,23,12,12 +map -coalesce -deconstruct -layers RemoveDups -layers Optimize twine_lab_23-12-12.gif
$ convert WorldNotEnough.pam -quiet -delay 1x10 -colorspace Lab -ordered-dither o8x8,25,11,11 +map -coalesce -deconstruct -layers RemoveDups -layers Optimize twine_lab_25-11-11.gif
This is basically moving up the L scale and moving down the ab scale. The last one goes all the way to a 2.27:1 ratio. Yet, it's the best one so far. Even when comparing against all of the other formats, the Lab 25,11,11 one just gets the dithering much closer to what it's supposed to be. It's noticeable especially on dark spots. Blackness just diffuses better. Well, if the lowest ab is the winner, how low can we go here?

$ convert WorldNotEnough.pam -quiet -delay 1x10 -colorspace Lab -ordered-dither o8x8,27,10,10 +map -coalesce -deconstruct -layers RemoveDups -layers Optimize twine_lab_27-10-10.gif
$ convert WorldNotEnough.pam -quiet -delay 1x10 -colorspace Lab -ordered-dither o8x8,30,9,9 +map -coalesce -deconstruct -layers RemoveDups -layers Optimize twine_lab_30-9-9.gif
$ convert WorldNotEnough.pam -quiet -delay 1x10 -colorspace Lab -ordered-dither o8x8,32,8,8 +map -coalesce -deconstruct -layers RemoveDups -layers Optimize twine_lab_32-8-8.gif
$ convert WorldNotEnough.pam -quiet -delay 1x10 -colorspace Lab -ordered-dither o8x8,38,7,7 +map -coalesce -deconstruct -layers RemoveDups -layers Optimize twine_lab_38-7-7.gif
Here's where green pixels start to invade. The L*=25 version has 11*11=121 a*b* combinations to choose from, vs. 100 for ab=10 and 81 for ab=9. It gets kinda ridiculous after that. So, somewhere between ab = 11 and 12 seems to be the sweet spot. I'd imagine that's universal, unless a video only uses a narrow range of colors. (The plane is actually a good example of that, since it is mostly blue.)
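
The two numbers quoted for each Lab triple (the L:ab ratio and the number of a*b* combinations) are easy to tabulate. A small sketch, assuming `awk` is available; `lab_summary` is a hypothetical helper name, and it only does arithmetic on the level counts without touching any image:

```shell
# lab_summary L_LEVELS AB_LEVELS
# Prints the triple, the L:ab ratio and the number of a*b* combinations.
lab_summary() {
  awk -v l="$1" -v ab="$2" 'BEGIN {
    printf "%d,%d,%d ratio=%.2f ab_combos=%d\n", l, ab, ab, l / ab, ab * ab
  }'
}

lab_summary 23 12
lab_summary 25 11
lab_summary 27 10
lab_summary 30 9
```

For 25,11,11 this gives the 2.27 ratio and the 121 combinations mentioned above.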

I've included a screenshot of my test page, with 12 faces of Shirley Manson staring at you :) (It's saved in PNG, since it's a lossless format.)

PLEASE include some of this data on the Video Handling and Color Quantization and Dithering pages! The Lab colorspace is THE colorspace for making Quantization changes of any sort. If it's not the automatic default, it should at least be advertised on both of those pages.

Re: Finding the "right levels" for quantization with anim GI

Post by The 8472 »

For anyone interested in using IM's ordered dithering logic, I've built on these findings. I've devised the following algorithm:

1. Convert to 16bit L*a*b*

2. calculate minima, maxima and standard deviation for the 3 channels separately.

For image sequences this requires the combined minima/maxima of all frames, i.e. use -append
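
A possible way to collect the statistics for step 2, as a sketch only. It relies on the FX statistics `minima`, `maxima` and `standard_deviation`, and it assumes that the `.r`, `.g` and `.b` channel selectors address L, a and b once the image is in Lab and that the values come back normalized to 0..1. The function name `lab_stats` is hypothetical.

```shell
# lab_stats FILE
# Prints one line per channel: name, minimum, maximum, standard deviation.
# All frames are appended first so the numbers cover the whole sequence.
lab_stats() {
  convert "$1" -colorspace Lab -append \
    -format 'L %[fx:minima.r] %[fx:maxima.r] %[fx:standard_deviation.r]\na %[fx:minima.g] %[fx:maxima.g] %[fx:standard_deviation.g]\nb %[fx:minima.b] %[fx:maxima.b] %[fx:standard_deviation.b]\n' \
    info:
}

# Example (needs ImageMagick and an input file):
#   lab_stats WorldNotEnough.pam
```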

3. do some adjustments to the standard deviations:

a) since the input image (presumably coming from an RGB color space) can never make full use of the a/b channels, the standard deviation measured in those channels is not directly comparable to the standard deviation of the L channel. The values here correct for that by dividing by the a/b range (maximum minus minimum) observed for a fully saturated RGB rainbow.

stddev_a /= 0.88f-0.16f
stddev_b /= 0.87f-0.07f
b) normalize the stdevs so that one of the three components has a weight of 1.0

weight_L = sqrt(stddev_L / max(stddev_L, stddev_a, stddev_b))
weight_a = sqrt(stddev_a / max(stddev_L, stddev_a, stddev_b))
weight_b = sqrt(stddev_b / max(stddev_L, stddev_a, stddev_b))
Now we have weights that signify the "importance" of the individual Lab channels in the picture, which can be used for the individual ordered dithering levels.
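
Steps 3a and 3b can be combined into one small function. This is a sketch using `awk` for the floating-point math; `lab_weights` is a hypothetical name, the range constants are the ones given above, and the input standard deviations in the example call are made-up illustration values, not measurements.

```shell
# lab_weights STDDEV_L STDDEV_A STDDEV_B
# Prints the three channel weights (L, a, b), each in 0..1.
lab_weights() {
  awk -v sl="$1" -v sa="$2" -v sb="$3" 'BEGIN {
    # step 3a: rescale a/b by the range seen for a saturated RGB rainbow
    sa /= (0.88 - 0.16)
    sb /= (0.87 - 0.07)
    # step 3b: normalize against the largest of the three
    max = sl
    if (sa > max) max = sa
    if (sb > max) max = sb
    if (max == 0) exit 1   # flat image, no meaningful weights
    printf "%.2f %.2f %.2f\n", sqrt(sl / max), sqrt(sa / max), sqrt(sb / max)
  }'
}

lab_weights 0.16 0.0288 0.008   # illustration values only
```

With these illustration values the weights come out as 1.00, 0.50 and 0.25.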

4. normalize the color channels individually.

"-level-colors 'cielab(' + (min_L*100) + '%,' + (min_a*100) + '%,' + (min_b*100)+ '%),cielab(' + (max_L*100) + '%,' + (max_a*100) + '%,' + (max_b*100)+ '%)'"
ImageMagick's -ordered-dither uses fixed thresholds at regular intervals. With 2 levels this means black and white, i.e. the most extreme colors are used. Both of those would normally lie far outside the range of actual levels of the image, especially on the a and b channels.
Normalizing the per-channel ranges means that even the extreme values chosen by the dithering algorithm will not lie outside the ranges used by the source image.
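
The string concatenation for the -level-colors argument can be written as a small shell helper. This is a sketch: `cielab_range` is a hypothetical name, the `cielab(...)` syntax is taken as given from this post, and the minima/maxima in the example call are illustration values on a 0..1 scale.

```shell
# cielab_range MIN_L MIN_A MIN_B MAX_L MAX_A MAX_B
# Prints "cielab(min...),cielab(max...)" with the values as percentages.
cielab_range() {
  awk -v a="$1" -v b="$2" -v c="$3" -v d="$4" -v e="$5" -v f="$6" 'BEGIN {
    printf "cielab(%.3f%%,%.3f%%,%.3f%%),cielab(%.3f%%,%.3f%%,%.3f%%)\n",
      a * 100, b * 100, c * 100, d * 100, e * 100, f * 100
  }'
}

range=$(cielab_range 0.10 0.16 0.07 0.90 0.88 0.87)   # illustration values
echo "-level-colors '$range'"   # normalize before dithering
echo "+level-colors '$range'"   # the same range undoes it afterwards
```

The same range string serves both the normalization and its later reversal, so it only needs to be built once per animation.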

5. perform ordered dithering with levels according to <some level> * weight_L, weight_a, weight_b.

To find a good level you will have to do an iterative search as suggested by other threads in this forum.

E.g. if we have 1.0, 0.5, 0.25 as weights then we can try the following values:
20, 10, 5
19, 10, 5
18, 9, 5
17, 9, 4
etc.

The iterative search can be done with one of the two following goals
a) less than N colors per frame. This results in per-frame palettes but allows for more colors overall
b) less than N colors globally for a single global color palette
where N <= 255
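
The level scaling and the downward search can be sketched as follows. Everything named here is hypothetical: `scaled_levels` rounds half up and never goes below 2 levels (an assumption), `count_colors` is a hook that must print the number of unique colors in the final 8-bit output, and `INPUT` is a placeholder for the source file. The hook below counts globally (goal b, via -append) and leaves out the normalization steps 4 and 6 for brevity, so it would need extending for the full algorithm.

```shell
INPUT=${INPUT:-input.pam}   # placeholder file name

# scaled_levels BASE W_L W_A W_B: prints "L,a,b" levels
scaled_levels() {
  awk -v base="$1" -v wl="$2" -v wa="$3" -v wb="$4" '
    function lvl(x,   v) { v = int(x + 0.5); return v < 2 ? 2 : v }
    BEGIN { printf "%d,%d,%d\n", lvl(base * wl), lvl(base * wa), lvl(base * wb) }'
}

# count_colors LEVELS: hook, prints the global color count after
# converting back to 8-bit sRGB
count_colors() {
  convert "$INPUT" -colorspace Lab -ordered-dither "o8x8,$1" -append \
    -colorspace sRGB -depth 8 -format '%k' info:
}

# search_levels START LIMIT W_L W_A W_B
# Walks the base down from START until the count is at most LIMIT,
# then prints "LEVELS COUNT".
search_levels() {
  local base=$1 limit=$2 levels n
  while [ "$base" -ge 2 ]; do
    levels=$(scaled_levels "$base" "$3" "$4" "$5")
    n=$(count_colors "$levels") || return 1
    if [ "$n" -le "$limit" ]; then
      echo "$levels $n"
      return 0
    fi
    base=$((base - 1))
  done
  return 1
}

scaled_levels 19 1.0 0.5 0.25   # one row of the example list above
# Example (needs ImageMagick): search_levels 40 255 1.0 0.5 0.25
```

A linear walk is the simplest choice; since the color count generally grows with the base, a binary search over the base would also work if each run is slow.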

6. undo the normalization of the color channels to restore original colors

"+level-colors 'cielab(' + (min_L*100) + '%,' + (min_a*100) + '%,' + (min_b*100)+ '%),cielab(' + (max_L*100) + '%,' + (max_a*100) + '%,' + (max_b*100)+ '%)'"
Since this is a linear transform and Lab is a perceptually linear color space, it should only result in minimal color space distortions.

7. convert from 16bit Lab to 8bit sRGB, output as sRGB gif



Important Note: Since converting Lab to sRGB collapses some distinct Lab colors into the same RGB values, especially while also converting from 16bit to 8bit values, the iterative search will have to be done over the final output values.
I.e. the iterative search in step 5 will have to include steps 6 and 7 for each iteration.