Archive for February, 2010

Starting wavelet difference reduction

Friday, February 26th, 2010

Its been slightly under a month since my last post. In the interim I have done much battling with a teething baby that has resulted in a lack of motivation to attack a complicated problem. The past couple of nights, though, I have set back to the WDR compressor with the intention of getting it working.

I now, fortunately, have a limited compressor/decompressor going.

In the previous article I described the basics behind a Daub 5/3 forward wavelet transform. In this article I will be describing how to do the significance pass of the WDR algorithm.

The basic idea is to run through the high pass sections of the image (LH, HL and HH) with a specific threshold. At each threshold level the algorithm needs to find the number of pixels to the next pixel that is greater than (or equal to) the threshold.

For example, take an image that has these values stored in its high pass:

0     0   -2    0     0    8    0   0

0     9     0  -3     0    4    0   0

0     0     1    7     0    0   6    0

0     5     0    0    0   -8   0    1

0     0     0    0    0    0   2    0

0     0   -4     0    6    0   0    0

0     3     0    0    0    7   0    0

0   -8     0    0    0    0    5    0

0    0     0    0   -2    0    0   0

First start with an arbitrary threshold. In this example I will choose 8. We now start from the beginning and scan through for the first value greater-than-or-equal-to 8. Pixel 6 is the first, pixel 10 is the next, pixel 30 the 3rd and so on. In WDR we want the difference between those pixels. So We would store 6, 4 and 20 for the 3 pixels above. The other obvious part is the sign. In fact we’d want to store those first 3 pixels as +6, +4 and -20. When you have found all the pixels in the image that are above your threshold then you write a + to indicate the end of a run (Though I’m pretty sure it would be quite simple to eliminate the trailing +). You then halve the threshold and process again (Though ignoring the pixels you have already processed).

Thats the basics. Of course its far more complicated than that. For one each high pass section of the image is read in a different way and not as simply as I explained above. This is explained in many papers on WDR (An example). The scan orders are as follows:

For LH:

1   2    3    4

8   7   6    5

9   10 11 12

16 15 14 13

(ie You go form left to right on odd lines and right to left on even forming a weaving pattern)

For HL:

1 8   9   16

2 7  10 15

3 6  11  14

4 5  12  13

(ie Similar to LH but vertical scans instead of horizontal)

For HH:

1    2   6    7

3    5   8   13

4    9  12  14

10 11 15  16

(ie This one is performed by zig-zagging across the texture).

The reason the image is processed in this way is to try and maximise the runs of pixels that you will get. Again this is all explained well in various papers (Though it may well be interesting to play with these to see if you get better compressions by running across the pixels in different ways).

So now you have your list of differences and signs for a set of thresholds. The final part of the WDR algorithm is to bit encode your data. This is something that is poorly explained in many of the articles that I have read, I would truly have been lost if it was not this paper. If we look at the bit patterns of each of the numbers we retrieved earlier (ie 6 = 110, 4 = 100 and 20 = 10100) we can see that the highest bit is always 1 this is the case for all our numbers (If this doesn’t seem obvious now scribble out some binary numbers and you will see why its quite obvious!). Thus when bit encoding we can ignore the leading 1 and store the subsequent bits (ie 10, 00, 0100). Brilliant. We do, however, need to store the sign along side this number. For this reason we use a 2 bit per digit scheme. – = 11, + = 10, 1 = 01, 0 = 00. Armed with this information we can store the numbers in a bit stream as 100100 — 100000 — 11000100. When we build this up over each pass for each threshold we end up saving quite a few bits. This is the main meat of the compression.

Of course writing the bits is not the most fun of things. For this reason I wrote my own templated bitstream class for handling easy addition and lookup of the bit patterns I was after.

My code looked something like this:

 int count	= 0;
int max	 = (width >> 1) * (height >> 1);
while( count < max ) { const int pixelStart	= buffStart	+ iter->GetX() + (iter->GetY() * pitch);
const int16_t val	 = plane.pixels[pixelStart];
const int16_t absVal	= abs( val );
if ( absVal > threshold && absVal < (threshold << 1))
{
	// Push sign.
	bitStreamOut.wdrBitStream += (val < 0) ? BitEncoding::Bit_Minus : BitEncoding::Bit_Plus;
	// Push numbers into bit stream.
	uint8_t bitNum	 = 0;//HighestBitSet( pixelPos );;
	uint8_t bitMax	 = HighestBitSet( pixelPos );
	while( bitNum < bitMax )
	{
		bitStreamOut.wdrBitStream	+= (((pixelPos & (1 << bitNum)) >> bitNum) == 0) ? BitEncoding::Bit_Zero : BitEncoding::Bit_One;
		bitNum++;
	}
	pixelPos	= 1;
}
pixelPos++;
iter->Next();
count++;
}
//Emit +remainingPixels+ to indicate end …
bitStreamOut.wdrBitStream += BitEncoding::Bit_Plus;
uint8_t bitNum	 = 0;//HighestBitSet( pixelPos );;
uint8_t bitMax	 = HighestBitSet( pixelPos );
while( bitNum < bitMax )
{
	bitStreamOut.wdrBitStream	+= (((pixelPos & (1 << bitNum)) >> bitNum) == 0) ? BitEncoding::Bit_Zero : BitEncoding::Bit_One;
	bitNum++;
}
bitStreamOut.wdrBitStream += BitEncoding::Bit_Plus;

Something to note in the code ismy iter->Next() which is one of my 3 iterators that define the way we wind through each highpass section.
We now have a compressed bit stream. I used the now ubiquitous lena image for testing this out.
This is the result of the Wavelet transform:

The arrangement of this image is slightly backward. This is due to the way windows reverses images. You can however see the HL and HH passes on the top and the LL and LH passes on the bottom (instead of the other way round).
From there we need to decompress the bit stream and end up with something like the original image:

This decompression took me a while to get right. Basically, though, you need to go back over the bit stream dropping the appropriate bits in the correct places until you get back an image like the former. Running the inverse wavelet transform should result in the above image.
Easy for me to say. Firstly we need to drop the correct values in the correct places. To do this we need the original threshold value we started with. The bit stream will always start with a plus or minus. Following that we have the bits of the difference value. When we find another sign we are on to the next pixel. Don’t forget that the 1 needs to be added to the left of the number! When we get a number that takes us to the end of the image we will find an extra plus and we can go back to the beginning at half the threshold.
So the first value we get out is a 6 and we have a sign of +. This means we are writing a +8 at the 6th pixel position. We then read a +4 so we are writing a +8 at the 10th position. We next get that -20 which means we write a -8 at the 30th position and so on.
The quick (er than me) will noticed that this means that the value of 9 in our earlier example will get written back as an 8 (or a value of 7 written as a 4). This is where the refinement pass comes in. I will explain this in a further article (ie when I get it working ).
My code for doing this looked something like the following:

iter->SetToStart();

int pixelPos                = 0;
while( pixelPos < size )
{
	int sign                    = bitStream.wdrBitStream[bit];
	bit++;

	int pixelDiff               = 0;
	int bitPos                  = 0;
	BitEncoding::Bit readBit    = BitEncoding::Bit_Zero;
	while( readBit < BitEncoding::Bit_Plus )
	{
		readBit = bitStream.wdrBitStream[bit];
		switch( readBit )
		{
		case BitEncoding::Bit_One:
			pixelDiff   |= readBit << bitPos;
		case BitEncoding::Bit_Zero:
			bitPos++;
			break;

		case BitEncoding::Bit_Minus:
		case BitEncoding::Bit_Plus:
			break;

		}
		bit++;
	}
	pixelDiff   |= 1 << bitPos;   // Always a 1 at the top end.

	pixelPos    += (pixelDiff - 1);

	if ( pixelPos < size )
	{
		// Step forward through the iterator pixelPos times.
		uint32_t count  = 0;
		while( count < pixelDiff )           {               count++;               iter->Next();
		}

		const uint32_t writePos   = buffStart   + (iter->GetX() + iter->GetY() * pitch));
		planeOut.pixels[writePos] = (sign == BitEncoding::Bit_Plus) ? threshold : -threshold;

		bit--;
	}
}

Of course you can run the wavelet transform on the low-pass sub-image created from previous steps. This will give you something like the following:

Decompressing this image returns you to the following:

In this you can see some colour errors. I think this is caused by my lack of a refinement pass however its well worth noting the details on her hat brim and her boa. Both of these have returned and look good in the re-sized image.

Anyway … should anyone actually be reading my ramblings or have any questions then say hi in the comments!

Update:   I identified a bug whereby the software would get confused by a 0 and a 1 appearing in the WDR stream as both end up as the same value.  Zeros really shouldn’t be turning up so this has now been fixed.