Chiltern cycleway

June 30th, 2019

It started off after Bez did a 150 mile cycle around west yorkshire with an insane amount of climbing. I suggested that I’d love to give a crack of the chilterns cycleway. Within 24 hours, the date was set; Saturday 29th of June 2019. Less than one week until the ride. We check the weather forecast and it says there will be a mid-week heatwave followed by a nice cool 21 degrees with the possibility of a few showers. Thats perfect. As the week wore on, however, the forecast temperature increased more and more until they were suggesting it might get up to 31 degrees. Still, the date was set. The route was slightly modified to avoid some of the worst bridleways reducing the route to a, mere, 170 miles.

Saturday morning started early. Bez was up at 3:10am I got up at about 3:50am. After a bit of early morning faffage we got on the road arriving in Ewelme, next to RAF Benson, at about 5:30 am. 15 minutes of fixing bags to the bikes later, and we were on our way in the early morning sun.

Almost immediately we hit the first climb of the day. Its quite gentle as climbs go and a nice, though slightly on the long side, warm up to start the day. We headed on towards the M40 along the top of the chilterns ridgeway making good speed into a minor headwind. We crossed the M40 and turned towards Stokenchurch along the A40. After passing through Stokenchurch we turned off the A40 and headed towards the next climb of the day through Radnage. Again it didn’t pose too much of a challenge thanks to our nice fresh legs. We pressed on towards Princes Risborough.

Our next climb of the day was supposed to be Kop Hill but we decided to do a small detour to do an even harder climb, Whiteleaf hill (Number 23 on Simon Warren’s 100 climbs list). This definitely burned but again it was still early and so we were still feeling good and fresh. We pressed on towards Wendover through lovely rolling fields.

Around the time we arrived at Wendover we were beginning to discuss our first break of the day but decided to press on to find a cafe that was next to the road. A few miles futher on we entered Wigginton and spotted that the local village shop had a cafe attached. What a perfect place to stop, eat, drink, refill our water, apply suncream and it also gave me a good chance to get my “bike stereo” set up. We were now 37 miles in and the temperature was increasing rapidly.

After a lovely stop we headed off listening to some drum and bass and pressed on towards Luton. There was only one major climb along this route at Aldbury but we were still feeling fresh and we climbed it quickly and continued past the Ashridge estate on the road. As we passed south of Luton the temperature was really beginning to build. We were now 5 hours and 60 miles into the cycle and my phone was saying it was 27 degrees and we could easily believe it. Our water was beginning to be depleted at a significantly increased rate. As we past through Lilley, looking for a cafe to stop at and get a refill, Bez stopped to ask a lovely lady if she had any recommendations. She refilled Bez’s bottles and recommended a little tea shop in Hexton so we pressed on. Sadly we turned off just before arriving at Hexton and pressed on through the most northerly part of the cycle. It was 15 miles and a little over an hour before we arrived in Hexton and found the little cafe recommended earlier. It was time to eat some food.

What an amazing lunch in a lovely little quirky cafe. I had stilton and bacon sandwiches and Bez had boiled potatoes, quiche and salad. The heat was really beginning to build and we were really appreciative of the break. After a little over an hour, we headed off up the hill out of Hexton and headed off back through Lilley, avoiding the bridleways that were part of the proper route.

As we stopped, so that Bez could get a photo of one of the Chilterns cycleway signs, the talk turned to icecream.  Although neither of us were huge icecream eaters we both agreed that as the temperature shot past 30, now was the time to be eating some.  A few miles further on we reached a right turn in Whitwell and as we began the turn up the hill, Bez spotted a likely looking shop down the road, so we diverted to get those icecreams going!  Bez went for a twister and I went for a calippo.  Oh god it was nice inhaling the icey cold in that heat, even if the brain freeze attack I suffered was one of the most debilitating i’ve ever suffered necessitating a sit down.

Post icecream, we pressed on towards Harpenden over some rolling hills arriving on the outskirts of the town around 2.45pm. We headed up the Lea valley walk enjoying the shade that the trees provided even though it was much harder going on the softer surface. While cycling this route Bez enxcouraged me to add a fizzy pill to my water to help me replace salts that I was clearly losing in the, quickly becoming, extreme heat. We pressed on through Harpenden and out along the Nickey line and back under the M1 heading towards the Gaddesdens.

We were now discussing how much chips would be a great plan. As we pressed on towards Great Gaddesden we were not seeing any pubs and we stopped by a group of walkers to ask for information. We were told of a couple of pubs in the near area that sadly turned out not to be on our route. However as we descended into Great Gaddesden we found a garden centre just before the next climb. I was beginning to hurt and lose my sense of humour so we stopped at the cafe for some refreshments. We were now 106 miles in and I was in to unknown territory, having only ever cycled 101 miles before. Bez dug into cake and smoothie while I stuck with apple juice and my peanuts. My phone was now telling me that it was 34 outside and I was beginning to dream of a pint of cider and those aforementioned chips. Feeling ever so slightly refreshed, we pressed on up the hill suffering in the heat. We had a long slow climb up towards Berkhamstead but it was a thankfully minor incline.

We descended into Berkhamstead and decided to find a pub. Finally, after some indecision on my part, we stopped at a likely looking pub that had outside seating and shade. A pint of cider and a, frankly undersized, plate of chips were imbibed. While there I popped into the local tsco to grab some paracetemol and ibuprofen to help with the leg burn. It was an odd town and not one I’d reocmmend. As we were kitting up to head on, some woman in a blue Audi shouted something at Bez though the only word we heard was Lycra. Still the pit stop was done, the temperature was beginning to drop and we were ready to press on. It was now about 5:45pm. We’d been out for 12 hours!

We headed out of Berkhamstead before turning left up a hill and out of the town. The next section was nice cruising along the tops of the hills toward the river Chess below Little Chalfont. As we crossed the river we approached a cross roads from which we could see the insanely steep hill in front of us. From a standing start we were off and up the hill. We only had a short climb but the hill averaged about 14%. It was a true leg killer but thanks to the chips (and cider!) we made quick work of it and turned off onto a bridleway. We bumped along the bridleway for a mile or so before taking a wrong turn into a field where we had a quick chat with a couple of lads who were incredibly suprised at the 120 miles we had cycled by this stage. We pressed on up the bridleway as I worried that my bike was beginning to make some very strange noises. When we finally exited the bridleway I inspected my bike to find out my front and rear quick releases had managed to undo themselves! I also discovered that the bearings in my rear wheel were beginning to fail. Oh well. I have to press on.

We headed round Amersham, through Little Missenden and up the hill towards Great Kingshill before heading on towards Speen. The ascent into Speen was long and gentle with a steep descent out of the town with a very sharp corner that took me by surprise. We climbed out of that descent stopping at a crossroads with a distinct doof of some kind of dance party in the distance. From there we headed over the hill towards Saunderton and through for a blissful run into West Wycombe down some thin but fast roads. The sun was beginning to get low by this stage and it was getting noticeably darker. As we reached the bottom of the next climb at the amusingly named, to me anyway, Bullock Farm lane we had cycled roughly 140 miles and it was now about 8:30pm. We pressed on up the challenging hill but Bez started to suffer from knee pain. We pressed on to the top and across a dip towards the M40. We stopped in the evening sun while Bez let his knee rest before pressing on across the M40 towards Hambleden. The descent to Hambleden was crazily steep but from previous rides I knew what was approaching.

We headed out of Hambleden towards Dudley Lane. Even when feeling a hell of a lot fresher this hill was a difficult one. The first part of Dudley Lane averages 11% peaking at around 15% and the legs were really beginning to burn. It eases off after the steep section but keeps on going. Finally we reached the top and after a quick break we headed on down hill towards Henley. After a quick dash into Henley we stopped by the Thames for a quick photo with the Henley Regatta being set up on the opposite bank. It was 9:45pm and the sun had now set. It was getting dark quickly so we headed straight on to the final 20 miles. Not far out of Henley we hit a steep leg burner, Chalk Hill, which was thankfully short and pressed on towards Ewelme. For the next 11 miles we headed over rolling terrain that averaged slowly upwards. It was about this time that we were losing the will to live. Bez was sounding like his sense of humour had entirely gone but at this point we just had to press on. We couldn’t really see anything in the dark and the last few miles were up hill along NCN 5 through dark forests. The ascent was not challenging but we were pretty broken by this stage and every pedal turn was an effort.

We finally reached the top of the final hill and had 6 odd miles to the end of our loop. We headed on down, along roads that had somewhat more ascent than we had been expecting or hoping. The miles truly began to drag on. 4 miles now remained. We were putting everything into it as we pressed on. Surely it can’t be far to go now? 3.5 miles!? Both of us had well and truly lost our sense of humour now but at this point we just had to press on. 3 miles. 2.5 miles. 2 miles. 1.5 miles. 1 mile, half a mile, quarter of a mile. Then .. we’d done it! 171 miles now completed we just needed to return to the car. A surprisingly long mile later we finally arrived back and it was done. What an epic day. It was now 11:25pm and I was beginning to feel very cold as my now inactive muscles stopped producing heat.

We took all the equipment off the bikes, put them in the car and headed off home. We were both completely brain dead after the day’s exertion but happy with the achievment. What a day! I’m not going to forget that for a long time!

My First DnB mix

March 23rd, 2013

I clipped it carelessly but I’m quite happy with it given I’ve never mixed DnB before about 2pm today 😀

Hope someone enjoys 🙂

Spectrograms and how to get a good result

May 17th, 2012

This is something I spent a lot of time working on for Oxford Wave Research’s SpectrumView app.

I saw a question on stackoverflow today on exactly all the problems I spent a good while working out over time so I thought I’d share my response on my blog for posterity. It now follows:

Well it all depends on the frequency range you’re after. An FFT works by taking 2^n samples and providing you with 2^(n-1) real and imaginary numbers. I have to admit I’m quite hazy on what exactly these values represent (I’ve got a friend who has promised to go through it all with me in lieu of a loan I made him when he had financial issues ;)) other than an angle around a circle. Effectively they provide you with an arccos of the angle parameter for a sine and cosine for each frequency bin from which the original 2^n samples can be, perfectly, reconstructed.

Anyway this has the huge advantage that you can calculate magnitude by taking the euclidean distance of the real and imaginary parts (sqrtf( (real * real) + (imag * imag) )). This provides you with an unnormalised distance value. This value can then be used to build a magnitude for each frequency band.

So lets take an order 10 FFT (2^10). You input 1024 samples. You FFT those samples and you get 512 imaginary and real values back (the particular ordering of those values depends on the FFT algorithm you use). So this means that for a 44.1Khz audio file each bin represents 44100/512 Hz or ~86Hz per bin.

One thing that should stand out from this is that if you use more samples (from whats called the time or spatial domain when dealing with multi dimensional signals such as images) you get better frequency representation (in whats called the frequency domain). However you sacrifice one for the other. This is just the way things go and you will have to live with it.

Basically you will need to tune the frequency bins and time/spatial resolution to get the data you require.

First a bit of nomenclature, the 1024 time domain samples I referred to earlier is called your window. Generally when performing this sort of process you will want to slide the window on by some amount to get the next 1024 samples you FFT. The obvious thing to do would be to take samples 0->1023, then 1024->2047, and so forth. This unfortunately doesn’t give the best results. Ideally you want to overlap the windows to some degree so that you get a smoother frequency change over time. Most commonly people slide the window on by half a window size. ie your first window will be 0->1023 the second 512->1535 and so on and so forth.

Now this then brings up one further problem. While this information provides for perfect inverse FFT signal reconstruction it leaves you with a problem that frequencies leak into surround bins to some extent. To solve this issue some mathematicians (far more intelligent than me) came up with the concept of a window function. The window function provides for far better frequency isolation in the frequency domain though leads to a loss of information in the time domain (ie its impossible to perfectly re-construct the signal after you have used a window function, AFAIK).

Now there are various types of window function ranging from the rectangular window (effectively doing nothing to the signal) to various functions that provide far better frequency isolation (though some may also kill surrounding frequencies that may be of interest to you!!). There is, alas, no one size fits all but I’m a big fan (for spectrograms) of the blackmann-harris window function. I think it gives the best looking results!

However as I mentioned earlier the FFT provides you with an unnormalised spectrum. To normalise the spectrum (after the euclidean distance calculation) you need to divide all the values by a normalisation factor (I go into more detail here).

this normalisation will provide you with a value between 0 and 1. So you could easily multiple this value by 100 to get your 0 to 100 scale.

This, however, is not where it ends. The spectrum you get from this is rather unsatisfying. This is because you are looking at the magnitude using a linear scale. Unfortunately the human ear hears using a logarithmic scale. This rather causes issues with how a spectrogram/spectrum looks.

To get round this you need to convert these 0 to 1 values (I’ll call it ‘x’) to the decibel scale. The standard transformation is 20.0f * log10f( x ). This will then provide you a value whereby 1 has converted to 0 and 0 has converted to -infinity. your magnitudes are now in the appropriate logarithmic scale. However its not always that helpful.

At this point you need to look into the original sample bit depth. At 16-bit sampling you get a value that is between 32767 and -32768. This means your dynamic range is fabsf( 20.0f * log10f( 1.0f / 65536.0f ) ) or ~96.33dB. So now we have this value.

Take the values we’ve got from the dB calculation above. Add this 96.33 value to it. Obviously the maximum amplitude (0) is now 96.33. Now didivde by that same value and you nowhave a value ranging from -infinity to 1.0f. Clamp the lower end to 0 and you now have a range from 0 to 1 and multiply that by 100 and you have your final 0 to 100 range.

And that is much more of a monster post than I had originally intended but should give you a good grounding in how to generate a good spectrum/spectrogram for an input signal.

and breathe

Further reading

As an aside I found kiss FFT far easier to use, my code to perform a forward fft is as follows:

    CFFT::CFFT( unsigned int fftOrder ) :
        BaseFFT( fftOrder )
    {
    	mFFTSetupFwd	= kiss_fftr_alloc( 1 << fftOrder, 0, NULL, NULL );
    }

    bool CFFT::ForwardFFT( std::complex< float >* pOut, const float* pIn, unsigned int num )
    {
    	kiss_fftr( mFFTSetupFwd, pIn, (kiss_fft_cpx*)pOut );
    	return true;
    }

Startup company! :D

January 10th, 2011

So its been a while since I posted anything up here *cough*.

Over the past year things have gone a bit strange. As a result I have no left my former employer and am going into business with the chap who was my boss.

We will now be available for contract work.

Primarily we are both very experienced on statistical analysis of audio streams (For acoustic event recognition for one) and would like to do a lot more more research in this area. We will also be looking to extend our research capabilities into video as I have a fair bit of experience at dealing with 2D signals (ie Video).

On top of that we will also be available for contract app development for PC, Mac, Linux, iPhone, Android and Windows mobile 7.

Anyone who wants more information about my skills should check my linkedin page at:

http://uk.linkedin.com/pub/oscar-forth/8/857/22

And/Or Anil’s at:

http://uk.linkedin.com/in/anilalexander

Bloody Dell

May 10th, 2010

Well 2 weeks of trying to fins bugs in a piece of software I’ve written at work only to discover that only the Dell machine was getting the problem.

This morning, thanks to this wonderful blog post from Mark Russinovich we finally tracked the problem to a non-paged pool leak. On closer inspection (ie going through the steps Mark lists) we realised that there was a leak in WavXDocMgr.sys. After much hunting around we discovered that this is in the Dell Control Point Security Manager. We uninstalled it and the leak has gone.

Couple that with the multiple hardware failure we had on this machine and I can honestly say “Cheers Dell”. You’ve caused us to truly f**k off a customer and absorbed waaaay more of my time on one problem than I would have ever thought possible.

Simplifying the Wavelet transform

March 26th, 2010

The code I’ve already given for doing wavelet transforms is very simple but I treat horizontal and vertical passes as seperate functions. It occurred to me as I looked at those functions that they are, essentially, the same. The only real difference is how many pixels you step over to get to the next or previous one. This was a massive realisation to me. It meant that I could do all my wavelet transforms through a single function.

Here is the Foward transform function:

const int32_t postDwtNum	= dwtMax / 2;

const int32_t nextReadOffset2		= nextReadOffset * 2;
//const int32_t nextWriteOffset2		= nextWriteOffset * 2;

int32_t readOffset	= startOffset;

int32_t s			= readOffset;
int32_t d			= readOffset	+ (postDwtNum * nextWriteOffset);

const DataType d1	= m_SignalData[readOffset + nextReadOffset]	- (((m_SignalData[readOffset]	+ m_SignalData[readOffset + nextReadOffset2])	/ (DataType)2));
const DataType s1	= m_SignalData[readOffset]					+ (((d1							+ d1)											/ (DataType)4));

out.m_SignalData[d]	= d1;
out.m_SignalData[s]	= s1;

s	+= nextWriteOffset;
d	+= nextWriteOffset;

readOffset	+= nextReadOffset2;

int dwt = 2;
while( dwt < dwtMax - 2 )
{
	const DataType d1	= m_SignalData[readOffset + nextReadOffset]	- (((m_SignalData[readOffset]				+ m_SignalData[readOffset + nextReadOffset2])	/ (DataType)2));
	const DataType s1	= m_SignalData[readOffset]					+ (((out.m_SignalData[d - nextWriteOffset]	+ d1)											/ (DataType)4));

	out.m_SignalData[d]	= d1;
	out.m_SignalData[s]	= s1;

	s	+= nextWriteOffset;
	d	+= nextWriteOffset;

	readOffset	+= nextReadOffset2;

	dwt += 2;
}
{
	const DataType d1	= m_SignalData[readOffset + nextReadOffset]	- (((m_SignalData[readOffset]				+ m_SignalData[readOffset])	/ (DataType)2));
	const DataType s1	= m_SignalData[readOffset]					+ (((out.m_SignalData[d - nextWriteOffset]	+ d1)						/ (DataType)4));

	out.m_SignalData[d]	= d1;
	out.m_SignalData[s]	= s1;
}
return true;

What a relisation this was! This meant that I could now generalise the algorith to handle multi dimension signals. A 1D signal is pretty simple and is a single call to this function per wavelet "level"

int32_t level	= 0;
while( level < numLevels )
{
	Signal< DataType >::FDWT( 0, 1, 1, GetWidth() >> level, out );
	level++;
}

A 2D signal is a little more complicated as you need to do it for each horizontal line of an image and then each vertical line of an image. The code however still goes through that one function and looks something like this:

// Now run DWT.
int32_t level	= 0;
while( level < numLevels )
{	
	int32_t y		= 0;
	int32_t yMax	= GetHeight() >> level;
	while( y < yMax )
	{
		out.Signal< DataType >::FDWT( y * GetWidth(), 1, 1, GetWidth() >> level, out2 );			// Horizontals
		y++;
	}
	
	int32_t x		= 0;
	int32_t xMax	= GetWidth() >> level;
	while( x < xMax )
	{
		out2.Signal< DataType >::FDWT( x, GetWidth(), GetWidth(), (GetHeight() >> level), out );	// Verticals
		x++;
	}
	level++;
}

Now this then led me on to thinking about a 3D signal. To do the forward transform a 3D signal would now mean for each depth slice performing the 2D transform as above. After that you would then perform the transform a long a depth line or constant row and column. Thats a hell of a lot of calculations but I "think" (Though I have not tested this at all) that the code would look something like this:

int32_t level	= 0;
while( level < numLevels )
{
	int32_t z		= 0;
	int32_t zMax	= GetDepth();
	while( z < zMax )
	{
		int32_t y		= 0;
		int32_t yMax	= GetHeight();
		while( y < xMax )
		{
			out2.Signal< DataType >::FDWT( (y * GetWidth()) + (z * widthHeight), 1, 1, GetWidth() >> level, out );	// Horizontals
			x++;
		}

		int32_t x		= 0;
		int32_t xMax	= GetWidth();
		while( x < xMax )
		{
			out.Signal< DataType >::FDWT( x, GetWidth(), GetWidth(), GetHeight() >> level, out2 ); // Verticals
			x++;
		}
		z++;
	}

	int32_t y		= 0;
	int32_t yMax	= GetHeight();
	while( y < yMax )
	{
		int32_t x		= 0;
		int32_t xMax	= GetWidth();
		while( x < xMax )
		{
			out2.Signal< DataType >::FDWT( x + (y * GetWidth()), GetWidth() * GetHeight(),  GetWidth() * GetHeight(), GetDepth() >> level, out ); // "Depthicals"
			x++;
		}
		y++;
	}
	

	level++;
}

This 3D method gives a potential compression schemes for video. Though I'd hate to think the kind of horsepower you'd need to have available to process a video real-time.

Allin, I cannot believe I didn't spot this simplification before hand ...

WDR Refinement pass and compression.

March 26th, 2010

So it turned out that the refinement pass was actually pretty easy to do.

Lets say I’m compressing a pixel with a value of -129 and I’m using a threshold of 128. I push into a vector the value |129| – 128. ie the absoloute value minus the threshold. I subsequently do this for every pixel above my threshold.

pixelsForRefinement.push_back( absVal - threshold );

I then halve the threshold and do the same sort of processing again. Now I run through the values that were pushed into my vector on the pass before hand and I do a simple check. Is the value in the vector (in the example case 1) greater than 128? No so I write a 0 to the refinement stream. If it IS greater than I write a 1 to the refinement stream and subtract the threshold value from the value stored in the vector.

// We now need to refine the pixels that were added on the previous pass
uint32_t refinement		= 0;
uint32_t refinementMax	= lastNumPixelsForRefinement;
while( refinement < refinementMax )
{
	const int16_t val				= pixelsForRefinement[refinement];
	const BitEncoding::Bit bit		= (val >= threshold) ? BitEncoding::Bit_One : BitEncoding::Bit_Zero;
	pixelsForRefinement[refinement]	-= (threshold * (uint32_t)bit);
	
	bitStreamOut.refineBitStream	+= bit;

	refinement++;
}
lastNumPixelsForRefinement	= pixelsForRefinement.size();

Effectively this means that for each pass through I refine the already WDR’d pixels by the current threshold.

The result is a perfect re-construction from the wavelet information (I shan’t bother posting up a before and after picture as they look the same ;)).

Now to get some form of compression I need to be able to early out from the algorithm and, basically, only store high frequency information (as well as the low frequency image in the first place). This posed a slight problem, however as I need to be able to perform the compression on each wavelet step as I go along. The system I had built thus far performed the compression on all the LH etc segments for each wavelet decomposition level. I needed to do some code re-organisation to get it do them in the correct order (ie LH, HL and HH). It wasn’t too difficult.

At this point though I realised I could MASSIVELY simplify the wavelet transforms. This I will explain in the next article.

Starting wavelet difference reduction

February 26th, 2010

Its been slightly under a month since my last post. In the interim I have done much battling with a teething baby that has resulted in a lack of motivation to attack a complicated problem. The past couple of nights, though, I have set back to the WDR compressor with the intention of getting it working.

I now, fortunately, have a limited compressor/decompressor going.

In the previous article I described the basics behind a Daub 5/3 forward wavelet transform. In this article I will be describing how to do the significance pass of the WDR algorithm.

The basic idea is to run through the high pass sections of the image (LH, HL and HH) with a specific threshold. At each threshold level the algorithm needs to find the number of pixels to the next pixel that is greater than (or equal to) the threshold.

For example, take an image that has these values stored in its high pass:

0     0   -2    0     0    8    0   0

0     9     0  -3     0    4    0   0

0     0     1    7     0    0   6    0

0     5     0    0    0   -8   0    1

0     0     0    0    0    0   2    0

0     0   -4     0    6    0   0    0

0     3     0    0    0    7   0    0

0   -8     0    0    0    0    5    0

0    0     0    0   -2    0    0   0

First start with an arbitrary threshold. In this example I will choose 8. We now start from the beginning and scan through for the first value greater-than-or-equal-to 8. Pixel 6 is the first, pixel 10 is the next, pixel 30 the 3rd and so on. In WDR we want the difference between those pixels. So We would store 6, 4 and 20 for the 3 pixels above. The other obvious part is the sign. In fact we’d want to store those first 3 pixels as +6, +4 and -20. When you have found all the pixels in the image that are above your threshold then you write a + to indicate the end of a run (Though I’m pretty sure it would be quite simple to eliminate the trailing +). You then halve the threshold and process again (Though ignoring the pixels you have already processed).

Thats the basics. Of course its far more complicated than that. For one each high pass section of the image is read in a different way and not as simply as I explained above. This is explained in many papers on WDR (An example). The scan orders are as follows:

For LH:

1   2    3    4

8   7   6    5

9   10 11 12

16 15 14 13

(ie You go form left to right on odd lines and right to left on even forming a weaving pattern)

For HL:

1 8   9   16

2 7  10 15

3 6  11  14

4 5  12  13

(ie Similar to LH but vertical scans instead of horizontal)

For HH:

1    2   6    7

3    5   8   13

4    9  12  14

10 11 15  16

(ie This one is performed by zig-zagging across the texture).

The reason the image is processed in this way is to try and maximise the runs of pixels that you will get. Again this is all explained well in various papers (Though it may well be interesting to play with these to see if you get better compressions by running across the pixels in different ways).

So now you have your list of differences and signs for a set of thresholds. The final part of the WDR algorithm is to bit encode your data. This is something that is poorly explained in many of the articles that I have read, I would truly have been lost if it was not this paper. If we look at the bit patterns of each of the numbers we retrieved earlier (ie 6 = 110, 4 = 100 and 20 = 10100) we can see that the highest bit is always 1 this is the case for all our numbers (If this doesn’t seem obvious now scribble out some binary numbers and you will see why its quite obvious!). Thus when bit encoding we can ignore the leading 1 and store the subsequent bits (ie 10, 00, 0100). Brilliant. We do, however, need to store the sign along side this number. For this reason we use a 2 bit per digit scheme. – = 11, + = 10, 1 = 01, 0 = 00. Armed with this information we can store the numbers in a bit stream as 100100 — 100000 — 11000100. When we build this up over each pass for each threshold we end up saving quite a few bits. This is the main meat of the compression.

Of course writing the bits is not the most fun of things. For this reason I wrote my own templated bitstream class for handling easy addition and lookup of the bit patterns I was after.

My code looked something like this:

 int count	= 0;
int max	 = (width >> 1) * (height >> 1);
while( count < max ) { const int pixelStart	= buffStart	+ iter->GetX() + (iter->GetY() * pitch);
const int16_t val	 = plane.pixels[pixelStart];
const int16_t absVal	= abs( val );
if ( absVal > threshold && absVal < (threshold << 1))
{
	// Push sign.
	bitStreamOut.wdrBitStream += (val < 0) ? BitEncoding::Bit_Minus : BitEncoding::Bit_Plus;
	// Push numbers into bit stream.
	uint8_t bitNum	 = 0;//HighestBitSet( pixelPos );;
	uint8_t bitMax	 = HighestBitSet( pixelPos );
	while( bitNum < bitMax )
	{
		bitStreamOut.wdrBitStream	+= (((pixelPos & (1 << bitNum)) >> bitNum) == 0) ? BitEncoding::Bit_Zero : BitEncoding::Bit_One;
		bitNum++;
	}
	pixelPos	= 1;
}
pixelPos++;
iter->Next();
count++;
}
//Emit +remainingPixels+ to indicate end …
bitStreamOut.wdrBitStream += BitEncoding::Bit_Plus;
uint8_t bitNum	 = 0;//HighestBitSet( pixelPos );;
uint8_t bitMax	 = HighestBitSet( pixelPos );
while( bitNum < bitMax )
{
	bitStreamOut.wdrBitStream	+= (((pixelPos & (1 << bitNum)) >> bitNum) == 0) ? BitEncoding::Bit_Zero : BitEncoding::Bit_One;
	bitNum++;
}
bitStreamOut.wdrBitStream += BitEncoding::Bit_Plus;

Something to note in the code ismy iter->Next() which is one of my 3 iterators that define the way we wind through each highpass section.
We now have a compressed bit stream. I used the now ubiquitous lena image for testing this out.
This is the result of the Wavelet transform:

The arrangement of this image is slightly backward. This is due to the way windows reverses images. You can however see the HL and HH passes on the top and the LL and LH passes on the bottom (instead of the other way round).
From there we need to decompress the bit stream and end up with something like the original image:

This decompression took me a while to get right. Basically, though, you need to go back over the bit stream dropping the appropriate bits in the correct places until you get back an image like the former. Running the inverse wavelet transform should result in the above image.
Easy for me to say. Firstly we need to drop the correct values in the correct places. To do this we need the original threshold value we started with. The bit stream will always start with a plus or minus. Following that we have the bits of the difference value. When we find another sign we are on to the next pixel. Don’t forget that the 1 needs to be added to the left of the number! When we get a number that takes us to the end of the image we will find an extra plus and we can go back to the beginning at half the threshold.
So the first value we get out is a 6 and we have a sign of +. This means we are writing a +8 at the 6th pixel position. We then read a +4 so we are writing a +8 at the 10th position. We next get that -20 which means we write a -8 at the 30th position and so on.
The quick (er than me) will noticed that this means that the value of 9 in our earlier example will get written back as an 8 (or a value of 7 written as a 4). This is where the refinement pass comes in. I will explain this in a further article (ie when I get it working ).
My code for doing this looked something like the following:

iter->SetToStart();

int pixelPos                = 0;
while( pixelPos < size )
{
	int sign                    = bitStream.wdrBitStream[bit];
	bit++;

	int pixelDiff               = 0;
	int bitPos                  = 0;
	BitEncoding::Bit readBit    = BitEncoding::Bit_Zero;
	while( readBit < BitEncoding::Bit_Plus )
	{
		readBit = bitStream.wdrBitStream[bit];
		switch( readBit )
		{
		case BitEncoding::Bit_One:
			pixelDiff   |= readBit << bitPos;
		case BitEncoding::Bit_Zero:
			bitPos++;
			break;

		case BitEncoding::Bit_Minus:
		case BitEncoding::Bit_Plus:
			break;

		}
		bit++;
	}
	pixelDiff   |= 1 << bitPos;   // Always a 1 at the top end.

	pixelPos    += (pixelDiff - 1);

	if ( pixelPos < size )
	{
		// Step forward through the iterator pixelPos times.
		uint32_t count  = 0;
		while( count < pixelDiff )           {               count++;               iter->Next();
		}

		const uint32_t writePos   = buffStart   + (iter->GetX() + iter->GetY() * pitch));
		planeOut.pixels[writePos] = (sign == BitEncoding::Bit_Plus) ? threshold : -threshold;

		bit--;
	}
}

Of course you can run the wavelet transform on the low-pass sub-image created from previous steps. This will give you something like the following:

Decompressing this image returns you to the following:

In this you can see some colour errors. I think this is caused by my lack of a refinement pass however its well worth noting the details on her hat brim and her boa. Both of these have returned and look good in the re-sized image.

Anyway … should anyone actually be reading my ramblings or have any questions then say hi in the comments!

Update:   I identified a bug whereby the software would get confused by a 0 and a 1 appearing in the WDR stream as both end up as the same value.  Zeros really shouldn’t be turning up so this has now been fixed.

The difficulties of Wavelets

January 31st, 2010

I’ve now spent a few weekends working on my WDR system. The one overriding thing that gets to me is just how bad the information is out there. There are lots of things that aren’t quite spelled out well enough for non-maths gods like me.

Anyway. On to the basics of the transform. Anyone who has dealt with wavelets knows that you decompose the image into 4 sections.

LL LH
HL HH

You then go on to to the same process to the LL portion of the image. Rinse and repeat this method for as long or as short as you like (You don’t have to even do more than the first decomposition). Effectively what you now have is high pass information and low pass information. The low pass is the LL section. The highpass sections give you the horizontal high pass, the vertical high pass and the diagonal high pass (Effectively edge detect in 3 different directions).   These are generated by first applying the wavelet transform horizontally and then applying it on the resulting image vertically.  The joy of the wavelet transform is that it is reversible. Exactly so. This means you can re-construct the original image from the various high and low passes losslessly.   Its worth bearing in mind that this can be trivially extended into 3 dimensions to compress video.   I keep wondering whether you’d be able to re-sample frame rates nicely using this sort of video transform but thats an aside.

Ok so lets get on to a wavelet transform.    I’m going to use the Daub5/3 wavelet because it can be done entirely with integers which means there are no rounding issues.  It is truly lossless.

The equation is as follows (Taken from: A Low-power, Low-memory System for Wavelet-based Image Compression by Walker, Nguyen & Chen):
Daub5/3 Equation

My first thought on seeing that was WTF!?  However I carried on reading through the excellent paper linked above.

It turns out that there is a more efficient (And simpler) form of this wavelet transform known as a lifting wavelet transform.  The transform looks like the following:

Daub 5/3 Lifting Equation

Brilliant.  This looks like a far more friendly bit of maths.  So lets take a look at what these things mean.
  • s0 is the Image we are performing the transform on.  It is a very standard image in the 8-bit 0->255 range.
  • s1 is the low passed output of the transform.
  • d1 is the high passed output of the transform
  • The whole n thing is a bit confusing.  2n is every other pixel in the source image.  It is not, however, every other pixel in the destination image.  The indication of 2 indicates a step of 1 pixel when dealing with s1 and d1.  This is massively confusing for the likes of me.

First thing to notice is that the calculation for s1 contains usage of d1.  Not just d1 but the  previous pixel’s d1.

(At this juncture I’d like to point out that the aforementioned paper gives another equation as an “integer-to-integer” equation.  Try as I might I could never get this working.  The equation I’ve just posted works perfectly).

I’m going to move on to some code now.

const int n = (x) + (y * pixelPitch);

const int n2 = (x / 2) + (y * pixelPitch);

const int s = n2;

const int d = n2 + widthDiv2;

const int16_t d1 = pixelsIn[n + 1] –

(((pixelsIn[n] + pixelsIn[n + 2]) >> 1));

const int16_t s1 = pixelsIn[n] +

(((pixelsOut[d – 1] + d1) >> 2));

pixelsOut[d] = d1;

pixelsOut[s] = s1;

The above is a simple code version of the wavelet transform.    There is a problem with it, however.  You’ll note that due to the fact it uses the previous high pass pixel outputed in the s1 calculation and the next pixel from the input in the d1 calculation.

This means that we get a break down at either end of a line.  Fortunately this is fairly easy to rectify (Though I can find no mention of it anywhere).  For the first pixel on a line you can just use d1 twice.  For the last pixel you use the current pixel twice.  Its an easy solution and works well.

When doing a vertical decomposition the equation is much the same but instead of doing -1, +1 and +2 by adding/subtracing the pixel pitch (The width of the entire original texture) instead of 1.  Doing this twice over performs the appropriate 2D forward wavelet transform.

Inverting the wavelet transform is pretty simple as well.  You can easily get the reverse of the transform by re-arranging the equations.  Bear in mind, however, that you need to step from the right hand side of the image towards the left (ie backwards) to re-construct the image.  The trick above for handling the special cases at either end of the line works just as well for the inverse transform.

The code for the central part of the inverse transform will look something like this:

const int n = (x) + (y * pixelPitch);

const int n2 = (x / 2) + (y * pixelPitch);

const int s = n2;

const int d = n2 + widthDiv2;

const int16_t s0 = pixelsIn[s] –

(((pixelsIn[d – 1] + pixelsIn[d]) >> 2));

const int16_t s1 = pixelsIn[d] +

(((s0 + pixelsOut[n + 2]) >> 1));

pixelsOut[n] = s0;

pixelsOut[n + 1] = s1;

Anyway I hope thats slightly useful.  It certainly will be to me when I inevitably forget what I’ve done 🙂

I’ve started blogging!

January 29th, 2010

I intend this blog to represent my various programming related interests.  I have been working on a Daub 5/3 based WDR compression scheme.   When I get the chance I will describe how exactly I got this working.    However I have re-started my alcohol drinking (post detox) so it may take longer than I intend 😉