It’s summer, which means: I’ve got time for some crazy experiments! As you may know, two years ago I played around with voice recognition. My original idea for this summer was to do some robotics stuff, but unfortunately I’ve had problems getting the serial connection to work. But don’t worry, because I’ve found something else to do. I noticed a nice red box lying around in my room, so I thought: «let’s see how difficult it is to get my computer to see it».

For extra fun, I also decided to impose two rules: no reading up on existing algorithms or «weird maths» (I’ll have enough time for this when I study Computer Vision at university), and no «magic» (i.e., implementing everything myself and not using existing functions for edge detection or whatever).

First steps

Of course, my first choice for trying this was Python, so I installed PIL and SciPy, and took a photo of the box. With just a function call, they give me access to each pixel’s RGB value, so it’s really easy to have it change all points that aren’t red into black and other funny stuff.
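
To give an idea of what that kind of per-pixel filtering looks like, here is a minimal sketch in plain Python. It is not the original script: the image is just a list of rows of (r, g, b) tuples rather than a PIL image, and both the function name `blacken_non_red` and the `margin` threshold are assumptions made up for the illustration.

```python
def blacken_non_red(pixels, margin=60):
    """Return a copy of the image with every non-red pixel set to black.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples (0-255).
    A pixel counts as red when its red channel exceeds both other channels
    by at least `margin` (an assumed threshold, chosen for illustration).
    """
    result = []
    for row in pixels:
        new_row = []
        for (r, g, b) in row:
            is_red = r - max(g, b) >= margin
            new_row.append((r, g, b) if is_red else (0, 0, 0))
        result.append(new_row)
    return result


# A tiny 2x2 "photo": two reddish pixels, one grey, one green.
image = [
    [(200, 30, 40), (90, 90, 90)],
    [(20, 180, 20), (255, 10, 10)],
]
filtered = blacken_non_red(image)
```

Looping over every pixel like this in pure Python is exactly the kind of thing that gets slow on a full-size photo.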

However, iterating through all the pixels in the image was quite slow (on the order of several seconds), so I decided to switch to C and OpenCV, which turned out to be really nice to work with. After a while I had written down the first algorithm I could think of, and it kinda worked (see video at right). I want to thank Michal for the idea of converting the image to the HSV colour space for colour checking.
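
The HSV idea can be sketched in a few lines. This is only an illustration in Python using the standard `colorsys` module, not the C/OpenCV code; the function name `is_red` and all three thresholds are assumptions, and the real values would need tuning against the actual photos.

```python
import colorsys


def is_red(r, g, b, max_hue_dist=0.05, min_sat=0.5, min_val=0.3):
    """Decide whether an RGB pixel (0-255 per channel) looks red.

    The thresholds are assumed values for illustration only.
    """
    # colorsys works on floats in [0, 1] and returns hue in [0, 1).
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # Red sits at both ends of the hue circle, so measure the
    # distance to 0 going either way round.
    hue_dist = min(h, 1.0 - h)
    return hue_dist <= max_hue_dist and s >= min_sat and v >= min_val


bright_red = is_red(200, 30, 40)
green = is_red(20, 180, 20)
grey = is_red(90, 90, 90)
dark_red = is_red(60, 10, 10)
dark_red_relaxed = is_red(60, 10, 10, min_val=0.2)
```

The nice thing about HSV is that "what colour is it" (hue) is separated from "how vivid" (saturation) and "how bright" (value), so a dark red and a bright red share nearly the same hue and only the minimum saturation and brightness need adjusting.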

Noise reduction

After that first success, I worked on some new multi-pass code with lots of loops to see if I could get some noise reduction. I started by turning everything which isn’t detected as red into black so that I get a better overview of what’s going on. Then I improved this to let it find non-matching pixels surrounded by red ones and turn them red (on the top and the left they must be adjacent, but on the right and bottom the nearest red may be some distance away; this works even for groups of pixels since the image is processed from left to right and from top to bottom). Finally, I added some more code to remove any little group of red pixels surrounded by non-matching ones. The effect of this can be seen further down.
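
The two passes described above could look roughly like the sketch below. It is a simplified Python illustration working on a boolean mask (True = red), not the actual C code. The names `fill_holes` and `remove_specks`, the `reach` and `min_size` parameters, and the use of a flood fill to find small groups are all assumptions; the original may well remove small groups differently.

```python
def fill_holes(mask, reach=3):
    """Turn non-red pixels red when they are enclosed by red ones.

    The pixel above and the pixel to the left must be red; to the right and
    below, a red pixel within `reach` steps is enough. Because the scan runs
    top to bottom and left to right, pixels filled earlier count as red for
    later ones, which is what lets whole groups get filled.
    """
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for y in range(h):
        for x in range(w):
            if out[y][x]:
                continue
            above = y > 0 and out[y - 1][x]
            left = x > 0 and out[y][x - 1]
            right = any(out[y][x2] for x2 in range(x + 1, min(w, x + 1 + reach)))
            below = any(out[y2][x] for y2 in range(y + 1, min(h, y + 1 + reach)))
            if above and left and right and below:
                out[y][x] = True
    return out


def remove_specks(mask, min_size=4):
    """Clear every 4-connected group of red pixels smaller than `min_size`."""
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not out[y][x] or seen[y][x]:
                continue
            # Collect the group containing (x, y) with an explicit stack.
            group, stack = [], [(x, y)]
            seen[y][x] = True
            while stack:
                cx, cy = stack.pop()
                group.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and out[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            if len(group) < min_size:
                for gx, gy in group:
                    out[gy][gx] = False
    return out


# '#' marks pixels detected as red: a box with a hole, plus one stray pixel.
rows = [
    "........",
    ".####...",
    ".#..#...",
    ".####.#.",
    "........",
]
mask = [[c == "#" for c in row] for row in rows]
cleaned = remove_specks(fill_holes(mask))
cleaned_rows = ["".join("#" if c else "." for c in row) for row in cleaned]
```

In this toy example the two missing pixels inside the box get filled and the lone pixel on the right disappears.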


For some reason the detection goes haywire under artificial light (or maybe it’s just that I haven’t found the right colour values to check for). No light at all (except for that of the computer screens) works better, once I reduced the minimum levels of saturation and brightness I was checking for, but it gives a lot of noise. My first program therefore fails miserably under such conditions.

However, now that I have the noise reduction, this is no longer a problem. See for yourself: image at right. (The really bright points are those which didn’t match but have been converted to red since they are surrounded.)

Second try at box detection

I could just have put my previous box detection algorithm on top of it, but I decided to make use of the fact that the noise reduction is successfully removing everything except for the box. Thus, instead of searching for red lines like before, I just locate the left-most, right-most, highest and lowest red points.

Using those four points, I calculate the centre of the box (and I could also reliably determine the position of the four corners, if I added some more code to find whether the box is horizontally aligned or not).
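
As a rough illustration of this step, here is a small Python sketch that finds the four extreme red points in a boolean mask and derives a centre from them. The function name `find_box` is made up, and taking the centre as the midpoint of the horizontal and vertical extremes is an assumption about how the calculation is done.

```python
def find_box(mask):
    """Locate the extreme red points of a boolean mask and their centre.

    Returns None when the mask contains no red pixel at all.
    Coordinates are (x, y) with y growing downwards, as in image space.
    """
    points = [(x, y)
              for y, row in enumerate(mask)
              for x, red in enumerate(row) if red]
    if not points:
        return None
    left = min(points, key=lambda p: p[0])
    right = max(points, key=lambda p: p[0])
    top = min(points, key=lambda p: p[1])
    bottom = max(points, key=lambda p: p[1])
    # Assumed definition of the centre: midpoint of the extremes.
    centre = ((left[0] + right[0]) / 2.0, (top[1] + bottom[1]) / 2.0)
    return {"left": left, "right": right, "top": top,
            "bottom": bottom, "centre": centre}


rows = [
    "........",
    ".####...",
    ".####...",
    ".####...",
    "........",
]
mask = [[c == "#" for c in row] for row in rows]
box = find_box(mask)
empty = find_box([[False, False], [False, False]])
```

This only works because the mask is already clean: a single stray red pixel far from the box would drag one of the extremes (and therefore the centre) away with it.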

Finally I calculate the relative movement of the centre point and this way I can control a pointer (bottom right of the video). Here we go:
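
The pointer control boils down to adding the frame-to-frame displacement of the centre to the pointer position. A minimal Python sketch, with a hypothetical `move_pointer` function; the `gain` factor and the clamping to an assumed 640x480 area are additions for the illustration, not something taken from the original code:

```python
def move_pointer(pointer, prev_centre, centre, gain=1.0, bounds=(640, 480)):
    """Shift the pointer by the movement of the box centre between two frames.

    `gain` scales the movement and `bounds` is the (width, height) of the
    area the pointer has to stay inside; both are assumed parameters.
    """
    dx = centre[0] - prev_centre[0]
    dy = centre[1] - prev_centre[1]
    x = min(max(pointer[0] + gain * dx, 0), bounds[0] - 1)
    y = min(max(pointer[1] + gain * dy, 0), bounds[1] - 1)
    return (x, y)


# Box centres detected in three consecutive frames.
centres = [(100, 100), (110, 95), (130, 95)]
pointer = (320, 240)
trail = []
for prev, cur in zip(centres, centres[1:]):
    pointer = move_pointer(pointer, prev, cur, gain=2.0)
    trail.append(pointer)
```

Using relative movement means the pointer does not jump when the box first enters the picture; it only follows how the box moves from one frame to the next.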


(And it works as well at night! – Code is here)