BIND 10 zones in memory

on 22 Sep 2012 by Mukund (@muks)

BIND 10 is getting close to a formal alpha release soon. At ISC, we are going to blog about its features, how administrators can put it to use on their networks, and how developers can use its libraries. As a warm-up exercise, I want to write about how BIND 10 stores zones and zone data in memory. You'll follow this article easily if you are a programmer who knows about red-black trees and have set up your own DNS server. Let's begin with an explanation anyway. :)

A DNS zone contains (for the majority of us) the DNS configuration of a domain. System administrators create and maintain zone files containing a variety of data about a domain that are served by an authoritative DNS server. These may include records of various types such as A, AAAA, MX, etc. Here is a real zone file for akira.org (for the new Akira project), so that you can see what data is in it:

$ORIGIN akira.org.
$TTL 3600
@       IN      SOA     ns1.banu.com.   webmaster.banu.com. (
                        2012092202 ; serial
                        21600      ; refresh after 6 hours (for slaves)
                        3600       ; retry after 1 hour (for slaves)
                        604800     ; expire after 1 week (for slaves)
                        3600 )     ; minimum TTL of 1 hour (for resolvers)

                        IN NS   ns1.banu.com.
                        IN NS   ns2.banu.com.

                        IN A    178.63.213.54
                        IN AAAA 2a01:4f8:110:54e1::14

                        IN MX   10 messaging.banu.com.

www                     IN A    178.63.213.54
www                     IN AAAA 2a01:4f8:110:54e1::14

; SPF record
akira.org.              IN SPF  "v=spf1 a mx -all"

Download akira.org.zone.

In this example, there are A and AAAA records for akira.org. and www.akira.org. that are used to find the IP address of a name. The MX record for akira.org. points to where email for this name should be delivered. There are also other record types. A typical zone file in a company may contain hundreds of such records. Note that some names have a trailing dot (.). This indicates that they are absolute to the root of the DNS namespace. Others don't and these are relative to the zone's origin (akira.org.) name. As an example, www by itself in this zone means the www.akira.org. name.

An authoritative DNS server has to be able to deal with numerous zones, all of which take a place in the DNS hierarchy. It has to be able to efficiently traverse its zone storage data structure for DNS names in order, and quickly locate a particular domain or sub-domain's records. It also has to be careful about how much space such a data structure would occupy in memory.

In BIND 10, we use a modified red-black tree which we call a DomainTree. I will not describe the properties and performance of red-black trees here; you can read about them in the linked article. DomainTree is based on the concept of a red-black tree to inherit its search-tree properties. The main change that was made to the red-black tree was the addition of a down pointer to every node, to make it a tree of trees. There are some additional operations on the DomainTree to maintain its properties, but we'll get to it later. DomainTrees are used to map DNS names to other data.

There are two types of DomainTrees that we use in BIND 10 (three actually, but we'll ignore the third which is used for storing NSEC3 data for now).

If this is confusing, ignore all of it. Just pick up ZoneTree which represents a single zone such as akira.org. above. A ZoneTree is a DomainTree where keys are names and values are record data (a pointer to a RdataSet).

Here is what the ZoneTree of the zone presented above looks like (just the keys are shown):

akira.org. zone

Dry, right? That's because it has records for only two names in that zone (akira.org. and www.akira.org.). It gets pretty when it gets full. :)

I'll explain the legend used and tree's structure in a moment, but first you have to see some examples. Here is the ZoneTree for a sample root (.) zone:

root zone

Download root.zone to see its records.

Here is the ZoneTree for a sample example.org. zone (click to zoom):

example.org. zone

Download example.org.zone to see its records.

Now, the properties of this tree:

Now let's look at ZoneTableTree. Whereas ZoneTree is a tree that contains everything in a single zone, ZoneTableTree is a singleton global tree where the key is the origin name and the value is a pointer to the corresponding ZoneTree (which contains that origin's zone). So when we want to lookup something, we first find the corresponding origin's node in the ZoneTableTree, go to its ZoneTree and find the RDATA (record data) there. The RDATA itself is stored in a linked-list class called RdataSet which is a straightforward list of RRs separated by their types (A, AAAA, MX, SOA and so on).

Having populated the root zone (.), akira.org. and example.org. with the zone files above, the ZoneTableTree looks like this:

the ZoneTableTree

The graphs in this post were generated by BIND 10 and rendered using GraphViz.

Unzix 0.4.0 is now available

on 7 Apr 2012 by Mukund (@muks)

Unzix version 0.4.0 is now available. This release fixes portability with some older operating systems. See the Unzix 0.4.0 NEWS file, which contains its release notes.

Help make GIMP operations better (GSoC 2012)

on 27 Mar 2012 by Mukund (@muks)

Just like last year, one of the tasks that we want to offer students in this year's GSoC is porting GIMP image processing code to GEGL. This is a very interesting task and if you are a student with experience in C programming and an interest in computer graphics, be sure to consider it.

GIMP is being modified to use a new image processing library called GEGL. This will bring much requested features to GIMP such as support for more precision in image manipulation operations, and better support for different color formats. For this, we have to rewrite a lot of the image processing code in GIMP and move it to GEGL. The resulting code will do more or less the same things, but will work with a different data format, and inside a different framework (GEGL).

There are many such operations (in the main application and in plug-ins), each doing something entirely different from another. Porting each operation is a relatively simple and straightforward task, but it requires a good command over the C language and ability to review and understand unelegant code.

As this is a large task, it could be sub-divided into smaller tasks (groups of operations) to be taken up by more than one student.

The student is asked to subscribe to the gimp-developer mailing list and send the following:

  1. Your background as it applies to this task (please don't send any personal information)
  2. A code review and algorithm description of some GIMP plug-ins (e.g., cubism, fractal trace, plasma)
  3. A code review and algorithm description of the following GEGL op: gaussian blur
  4. Sample implementation of a new GEGL op. This could be anything of your choice, even Hello World. Please send it as a patch against the GEGL master branch.

If you need any help with any of these tasks, please ask on the gimp-developer mailing list, or by chatting in the #gimp IRC channel.

OpenBSD bug in the random() function

on 20 Mar 2012 by Mukund (@muks)

I now work for ISC through Banu (which handles my payroll and some other stuff in India). These days, among other things, we're trying to get BIND 10 to work on OpenBSD. Yesterday, I stumbled upon a bug in random() on OpenBSD (tested with OpenBSD 5.0 on amd64). The bug is that if you call srandom(0) to initialize the RNG, random() will always return 0. Here is a testcase:

#include <stdio.h>
#include <stdlib.h>

int
main (int argc, char *argv[])
{
  int i;

  srandom (0);

  for (i = 0; i < 32; i++)
    printf ("%ld\n", random ());

  return 0;
}

Download random.c. This should print 32 zeroes on OpenBSD and 32 random numbers on correct implementations.

POSIX describes the output of random():

This does not seem to be deliberately introduced behavior. The OpenBSD code for srandom() looks like this:

void
srandom(unsigned int x)
{
  int i;
  int32_t test;
  div_t val;

  if (rand_type == TYPE_0)
    state[0] = x;
  else {
    state[0] = x;
    for (i = 1; i < rand_deg; i++) {
      /*
       * Implement the following, without overflowing 31 bits:
       *
       *	state[i] = (16807 * state[i - 1]) % 2147483647;
       *
       *	2^31-1 (prime) = 2147483647 = 127773*16807+2836
       */
      val = div(state[i-1], 127773);
      test = 16807 * val.rem - 2836 * val.quot;
      state[i] = test + (test < 0 ? 2147483647 : 0);
    }
    fptr = &state[rand_sep];
    rptr = &state[0];
    for (i = 0; i < 10 * rand_deg; i++)
      (void)random();
  }
}

(Taken from lib/libc/stdlib/random.c and modified to fit.)

In this scope, rand_type is TYPE_3 by default. When srandom(0) is called, state[0] is set to 0 and it follows that state[1]...state[rand_deg-1] are set to 0 inside the for loop. random() then will always return 0. The bug is technically in the srandom() implementation, but the output of random() is where it is visible to a libc user.

The FreeBSD implementation of srandom() looks like this:

01: static inline uint32_t good_rand (x)
02:      int32_t x;
03: {
04: #ifdef  USE_WEAK_SEEDING
05:   /*
06:    * Historic implementation compatibility.
07:    * The random sequences do not vary much with the seed,
08:    * even with overflowing.
09:    */
10:   return (1103515245 * x + 12345);
11: #else   /* !USE_WEAK_SEEDING */
12:   /*
13:    * Compute x = (7^5 * x) mod (2^31 - 1)
14:    * wihout overflowing 31 bits:
15:    *      (2^31 - 1) = 127773 * (7^5) + 2836
16:    * From "Random number generators: good ones are hard to find",
17:    * Park and Miller, Communications of the ACM, vol. 31, no. 10,
18:    * October 1988, p. 1195.
19:    */
20:   int32_t hi, lo;
21: 
22:   /* Can't be initialized with 0, so use another value. */
23:   if (x == 0)
24:     x = 123459876;
25:   hi = x / 127773;
26:   lo = x % 127773;
27:   x = 16807 * lo - 2836 * hi;
28:   if (x < 0)
29:     x += 0x7fffffff;
30:   return (x);
31: #endif  /* !USE_WEAK_SEEDING */
32: }
33: 
34: void
35: srandom(x)
36:      unsigned long x;
37: {
38:   int i, lim;
39: 
40:   state[0] = (uint32_t)x;
41:   if (rand_type == TYPE_0)
42:     lim = NSHUFF;
43:   else {
44:     for (i = 1; i < rand_deg; i++)
45:       state[i] = good_rand(state[i - 1]);
46:     fptr = &state[rand_sep];
47:     rptr = &state[0];
48:     lim = 10 * rand_deg;
49:   }
50:   for (i = 0; i < lim; i++)
51:     (void)random();
52: }
53:

(Taken from lib/libc/stdlib/random.c and modified to fit.)

Here, successive state[i]s in the for loop are set to the output of good_rand(), and good_rand() checks if a 0 is passed (line 23 above) and uses a different value if that's the case. Because of this check, FreeBSD doesn't have this bug.

The glibc implementation of __srandom_r(), which srandom() wraps, looks like this:

01: int
02: __srandom_r (seed, buf)
03:      unsigned int seed;
04:      struct random_data *buf;
05: {
06:   int type;
07:   int32_t *state;
08:   long int i;
09:   int32_t word;
10:   int32_t *dst;
11:   int kc;
12: 
13:   if (buf == NULL)
14:     goto fail;
15:   type = buf->rand_type;
16:   if ((unsigned int) type >= MAX_TYPES)
17:     goto fail;
18: 
19:   state = buf->state;
20:   /* We must make sure the seed is not 0.  Take arbitrarily 1 in this case.  */
21:   if (seed == 0)
22:     seed = 1;
23:   state[0] = seed;
24:   if (type == TYPE_0)
25:     goto done;
26: 
27:   dst = state;
28:   word = seed;
29:   kc = buf->rand_deg;
30:   for (i = 1; i < kc; ++i)
31:     {
32:       /* This does:
33:            state[i] = (16807 * state[i - 1]) % 2147483647;
34:          but avoids overflowing 31 bits.  */
35:       long int hi = word / 127773;
36:       long int lo = word % 127773;
37:       word = 16807 * lo - 2836 * hi;
38:       if (word < 0)
39:         word += 2147483647;
40:       *++dst = word;
41:     }
42: 
43:   buf->fptr = &state[buf->rand_sep];
44:   buf->rptr = &state[0];
45:   kc *= 10;
46:   while (--kc >= 0)
47:     {
48:       int32_t discard;
49:       (void) __random_r (buf, &discard);
50:     }
51: 
52:  done:
53:   return 0;
54: 
55:  fail:
56:   return -1;
57: }
58:

(Taken from stdlib/random_r.c and modified to fit.)

This code checks if seed is 0 (line 21 above) and uses a different value if that's the case. So, even glibc doesn't have this bug. srandom(0); and srandom(1); with glibc have the same effect. Note that srandom(0); and srandom(123459876); on FreeBSD don't have the same effect as state[0] is still 0 on FreeBSD when the seed is 0.

I don't think this bug is severe, but the OpenBSD developers will likely want it to be fixed for the sake of correctness. Yesterday, I mentioned it in the #openbsd IRC channel on Freenode. I've requested my colleague to report it to the OpenBSD mailing lists as he's already subscribed.

Licenses: OpenBSD and FreeBSD code above are covered by BSD licenses. glibc code is covered by the GNU Lesser General Public License version 2.1 or above. As these are just minor fragments for the sake of illustrating the bug, I'm not including the full licenses here.

Update: The OpenBSD developers have fixed the bug quickly! And it has also been assigned CVE-2012-1577.

Interview of ColorHug maker, Richard Hughes

on 30 Jan 2012 by Mukund (@muks)

For a while now, I've wanted Banu to do interviews of makers of things with free and open designs. Being a fan of PingMag MAKE, it was apparent that there was a lot of hard work, learning, fun and satisfaction to be had in making. It's too bad that PingMag shutdown, but they still inspire. So when the ColorHug comes along—an open hardware product related to graphics—there's no better time to start interviewing. Solder when the iron's hot!

The ColorHug is a colorimeter that can be used to calibrate computer displays. It was created by Richard Hughes (hughsie). It is a fully open hardware project, and the design, drivers and firmware are available on the Gitorious code hosting website. From the branches and commit logs it appears that others have taken an interest in its development too, and have begun to contribute to it.

ColorHug 1 ColorHug 2 ColorHug 3

Without further ado, here's Richard Hughes. :)

Question Who are you? What have you been doing so far? How did you get interested in computers, electronics and building stuff?

My name is Richard Hughes, and I'm a programmer in the desktop group at Red Hat. I created and now maintain the following projects: upower, PackageKit, colord, shared-color-profiles, shared-color-targets, gnome-color-manager, gnome-power-manager and gnome-packagekit. From right since I was a little child, I was always taking things apart and making new things, and so my degree choice of Electronic Engineering shocked none of my friends or family. Whilst at University I got into writing code, and specifically open source code. I started hacking on low level userspace bits and the kernel, and then after my masters had finished I took a job at a large UK defence contractor. It was pretty much the opposite environment to open source, and as soon as Red Hat asked if wanted to hack on cool stuff full time I jumped at the chance. Although I'm hugely privileged to spend all day writing free software, I've always missed making things, and I figured I could do something with open hardware as a hobby.

Question How did you end up in computer graphics / color management?

When I bought a digital SLR camera, my wife paid for me to go on a course to learn how to use the camera properly. During this course I used OSX for the first time, and came to the realisation that the color stuff just worked. No messing around on the command line, no technical jargon, it just worked. Color Management on Linux was in sorry state of affairs then, and I thought I could do something about that.

Question What is the ColorHug? What is a colorimeter?

A colorimeter is a device that attaches to the screen and measures the actual colors output by the computer. It allows us to see what color is actually being shown to the user, rather than what color the computer thinks it's showing the user. Using lots of these samples, we can apply a correction profile to the screen to make images look as they ought to look. As LCD panels get older, they get yellow and dull, and even CRT monitors have phosphors that degrade over time. This means you have to calibrate every few months to ensure that the colors match what they should.

The ColorHug is an open source colorimeter. It's designed from scratch, and every part is 100% open source. All the other colorimeters you can buy in shops have proprietary code that means we have to reverse engineer the hardware to make it work on Linux, and then we can't modify the hardware to do something else, or fix bugs and add features like you can with open source hardware.

Question What information is in color profiles?

A color profile is really just a binary blob of data that describes the color response of an input or output device. You can have color profiles that just encode the curves of colors as a set of matrices, or you can have color profiles that are made up of of huge lookup tables. You can also store arbitrary data in a profile, and so you normally store a lot of extra useful stuff there too like the model number of the device or the paper type for the printer.

Question What are the tech specs of the ColorHug? Does it use USB? Does it need a battery?

The device is a small USB peripheral that connects to the host computer and takes XYZ measurements of a specified accuracy. Longer measurements lead to a more accurate result, but take more time. The ColorHug has no battery and takes the few hundred milliamps it needs from the USB bus. The device has two LEDs that can either be switched on or off, or be programmed to do a repeated flashing with specified on and off durations. The device supports up to 64 calibration matrices, and by default 3 are provided which are mapped to LCD, CRT and projector, with 3 additional reserved types.

Question Tell us about the ColorHug's design.

The ColorHug is actually a PIC microcontroller that is interfaced with a TCS3200 light to frequency chip. The frequency is proportional to the amount of light, and so by enabling the red, green and blue photodiodes in the sensor we can combine these with a bit of clever maths into an XYZ color value. The PIC just controls the sensor and accurately counts pulses at high speed, and then responds to requests from the USB host to do different low level things.

Question What data structures / file formats do you work with?

Bit of an odd question, the answer is LOTS. :)

Question What are some of the unique algorithms used in the ColorHug?

I'm pretty sure there's nothing unique in the device, it's really simple hardware along with some school vector and matrix maths. The ColorHug is able to store custom CCMX matrices on the device, which is something that I was surprised that no other colorimeter vendor seems to do.

Question Can you walk us through the software path through various components of a Linux stack, when using typical ColorHug facilities?

Well, just taking a sample in gcm-picker is a good example.

gcm-picker opens the ColorHug device using colord, and asks colord for a number of XYZ samples. colord opens the device, and sets up some initial parameters (e.g. sample time) and then does a write to the USB device. When the write is complete, the host then reads the data back from the USB device and transports it back to the application over DBus.

Question What hardware tools do you need to build/debug/test this device?

To build and test the device, I need:

Question What software tools did you use to build this?

I used gEDA to design the schematic and PCB, and from someone that's used very expensive PCB design software like Mentor Graphics, it was super easy. The gEDA guys are doing a really good job. From a software side I use the "free" (but not open) Windows MPLAB compiler and then wrote all the firmware loader software myself.

Question What skills did you have to use in this project? Is this your first hardware project?

I used to work on designing test equipment for military flight computers, so I've got a fair bit of experience doing moderately clever things with very small components. That said, a few people have built the ColorHug board now who are just hobbyists, and it's refreshing to know the board is easy to build.

Question How did you go about building this project?

The "hello world" program is often the hardest part of a project. I probably spent about 3 weeks working with a PIC evaluation board just to turn on a LED from custom flashed firmware before I could be happy that the processor was suitable for the job. The other bits are easy, once you have that first flashing LED. I soldered the sensor onto a SOIC8DIP8 convertor board and then used some breadboard to wire it onto the evaluation board. From there I got the sensor working, and could sense the different colors.

Question What are the main components of the ColorHug? What are their roles?

ColorHug schematic

Question How does the main IC sensor work? How does light get transformed into values you can use?

The sensor is actually an array of 64 smaller sensors, which are arranged in a grid of Red, Green, Blue and White. When the PIC reads the frequency counts for a given sample, it converts then into 3 deviceRGB numbers, which are converted using a calibration 3×3 matrix into the XYZ values. The XYZ values can also undergo another 3×3 matrix on the device, which converts the numbers to a given set of display primaries. This lets the ColorHug be calibrated against an sRGB screen and used on wide-gamut displays.

TCS230D sensor

Question How do you calibrate ColorHugs?

Calibration involves taking about 500 samples of dRGB and then using the Argyll CMS program ccxxmake to crunch the numbers into the best calibration matrix. This adapts the dRGB color into an XYZ value.

Question How did you design the enclosure?

The enclosure is a standard black ABS box from Hammond, with two cut outs that I do with several wooden and metal jigs. When I've got a bit of profit, I'm hoping to buy some equipment to do the holes in a better and quicker way.

Question How does the hardware process commands sent via USB by the software?

From a device point of view it just sits and waits for an input packet (64-byte buffer) and then parses the input. It then does whatever is required and then writes out an output packet. Much more details about what kind of data is sent is available in the firmware specification header file ColorHug.h.

Here's a basic flow chart:

Client: set the sensor multiplier (the frequency divider) to 100%
Hardware: OK

Client: set the sensor integral time to 65000 processor cycles
Hardware: OK

Client: flash the green LED twice
Hardware: OK

Client: take an XYZ reading with the default XYZ matrix for an LCD display
Hardware: OK, 123.45 67.89 34.56

Client: set the sensor multiplier to 0% (to turn the sensor off)
Hardware: OK

Question Is ColorHug free hardware? What did you keep in mind when designing it so that it was easily hackable?

It's all 100% free. When designing the PCB, I had to keep in mind that people who wanted to build the PCB themselves were probably not experts in SMD rework. So, the PCB is larger than it could be, and I've deliberately used large 0805 components rather than the more standard 0603 size. The code is designed for accidental shorts, for instance the spare processor pins are wired up as inputs, and there's a watchdog timer to stop the hardware getting wedged. The LEDs help enormously when debugging, as they output Morse code in event of a hardware error.

ColorHug PCB neaar side ColorHug PCB far side

Question What is Hughski Limited? How did you decide to start it?

Hughski Limited is a tiny company I set up to reduce the amount of personal liability I had. I'm naturally quite a risk-averse person, and so if some crazy lawyer decided to sue the company because his ColorHug poisoned his cat and ran away with his wife then they could claim all £66 of profit in the business rather than risking my personal savings. It's also a good way to work out how much tax you've got to pay at the end of the year if you try to keep the business and private finances separate.

Question I see that it's a UK registered company. What challenges did you face in creating this company? How much work was it? Are there other employees than you?

Actually creating the company is very easy and only a few hundred pounds. The hard bit is setting up the tax and getting all the permits you need to start trading. There are technically no paid employees, although I solder the boards and my wife sticks on stickers and screws the units together.

Question How do you intend to attract more people to use the ColorHug?

At the moment, there is a waiting list to buy the ColorHug, and so I'm not intending to advertise at all. I'm hoping that when people get their ColorHug and tell their friends, that will be all the advertising I need.

Question How do you source your components?

Most of the components are from China and Taiwan. Most are bought using big companies such as Mouser and Farnell, but some (like the IR cut-off filter) are custom made in China, and these are hard to obtain.

Question Do you intend to do other devices in the future?

I'm playing with the idea of making an open source three axis CNC machine, as I've found the hardest part of making the ColorHug was actually drilling the case to any kind of precision.

Question How does it feel to work on free hardware?

It's a bit unreal really. I was only going to make 50 devices, and I hoped that I would not be left with devices spare. Now I've got a few hundred orders and the community is starting to grow. That bit feel great.

Question Are the software parts entirely free? Are there any closed tools which you have to use?

I use the closed source MPLAB to compile the bootloader and firmware, but only because the open source SDCC compiler didn't work for me. At some point I'll have to put in the hours and fix SDCC, but until then I just need to get hardware out of the door.

Question Now that you have completed this project, what observations did you make in retrospect?

Well, the amount of cash it took to make a professional looking product is huge. The main problem is that most companies have a minimum order quantity of about 10,000 units, which is a ton of money for me on a project with no precedence. I've been lucky to find local companies that will take on small orders at reasonable prices.

In hindsight, I should have also worked longer on the prototype unit before announcing it to the world, as it takes quite a long time to convert a design designed for one unit, to a design that can be made in batches of 100. In the amazon-one-click world we live in, people don't like it when you announce its going to be 12 weeks until they get hardware.

Question Did you have fun? Does the ColorHug rock?

I'm still having fun, and the hardware seems to work for people, which is the main thing. Hopefully soon I can afford to pay myself something for my time, as up until now I've been buillding the units for free.

ColorHug 4 ColorHug 5 ColorHug 6

Question I am a GIMP user. Assuming GIMP supports a color managed workflow, how does one configure and use ColorHug on a Linux desktop?

If you're new to all this color management stuff, just fire up gnome-control-center and click the color panel. Then click "Learn more" and read all the documentation I wrote for GNOME 3.2. The ColorHug is just another supported device, so there's really nothing special you need to do.

Richard's computer
Richard's computer
Where it all happens

Richard Hughes
Richard Hughes
GNOME developer and free hardware hacker

That is all, readers. :) We hope you enjoyed this interview. If you have someone in mind to be interviewed, or you yourself want to be interviewed about your free hardware or software project, please email me the details or tweet @banubears. The ColorHug is offered for purchase at £48 currently. If you use Linux and are into graphics or hardware hacking, buy one and support its development. Happy hacking!

Arithmetic fun with mod_rewrite

on 20 Dec 2011 by Mukund (@muks)

I've been coding for the Banu Shop which will be opened soon. We store product images after a bit of processing in filesystem directories of this syntax:

.../path/to/images/$int1/$int2/foo.jpg

where int2 is the product identifier (positive integer) and int1 = int2/10000. This two-level directory structure is so that we don't have tens of thousands of sub-directories in a single directory. There are other approaches such as using hashes and substrings of them, but our approach is fine for us. So sample image filepaths are:

.../path/to/images/0/1/purple.jpg
.../path/to/images/0/700/blue.jpg
.../path/to/images/0/1023/orange.jpg
.../path/to/images/2/20470/guava.jpg
.../path/to/images/15/150730/che.jpg

HTML product pages refer to their images using relative urls like "./images/978-81-250-3947-1.jpg". So we eventually need a URL like:

https://banu.com/shop/25/greek-myths/images/978-81-250-3947-1.jpg

to be rewritten to a passthrough URL:

https://banu.com/some/path/to/images/0/25/978-81-7371-382-8.jpg

We already use mod_rewrite for a lot of the rewrites, but it can't directly generate int1 in the URLs because there's arithmetic involved (int1 = int2/10000). After some very hacky solutions, I found that mod_rewrite does have a facility for custom URL mapping. The prg: style external program is a no-go as it would be too inefficient. But the internal compiled map function was very appealing.

This is just a dump of code for anyone else who wants to get something like this done, or any other arithmetic for that matter. The following Apache httpd module is loaded as a plug-in and generates int1 for the URLs above, which can be used in the rewrite rules.

#include <stdlib.h>
#include <apr_strings.h>
#include <httpd.h>
#include <http_protocol.h>
#include <http_config.h>
#include <http_core.h>
#include <http_log.h>

typedef char * (map_t) (request_rec *r, char *key);
APR_DECLARE_OPTIONAL_FN (void, ap_register_rewrite_mapfunc,
                         (char *name, map_t *func));

static char *
banu_product_id_map (request_rec *req,
                     char        *key)
{
  unsigned long d;

  d = atol (key);
  d /= 10000;

  return apr_ltoa (req->pool, d);
}

static int
pre_config (apr_pool_t *pool,
            apr_pool_t *plog,
            apr_pool_t *ptemp)
{
  APR_OPTIONAL_FN_TYPE (ap_register_rewrite_mapfunc) *fn;

  fn = APR_RETRIEVE_OPTIONAL_FN (ap_register_rewrite_mapfunc);
  if (!fn) {
    ap_log_error (APLOG_MARK, APLOG_CRIT, 0, 0,
                  "mod_banu: Error registering map function");
    return HTTP_INTERNAL_SERVER_ERROR;
  }

  fn ("banu_product_id_map", banu_product_id_map);

  return OK;
}


static void
banu_hooks (apr_pool_t *pool)
{
  static const char * const pre_modules[] = {
    "mod_rewrite.c",
    NULL
  };

  ap_hook_pre_config (pre_config, pre_modules, NULL, APR_HOOK_MIDDLE);
}

module AP_MODULE_DECLARE_DATA banu_module = {
  STANDARD20_MODULE_STUFF,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  banu_hooks
};

Download mod_banu.c.

Update: I've updated the code to divide the id by 10000 instead of 1000 before. Also, here's how to setup Apache httpd to use the module. First, compile and install the module (as root):

apxs -ci mod_banu.c

Then, add the following line to your main httpd.conf:

LoadModule banu_module modules/mod_banu.so

Then, configure the rewrite rule in your virtual host. For the example above, I use something like:

RewriteMap bm int:banu_product_id_map
RewriteRule ^/shop/([0-9]+)/([0-9a-zA-Z\.\_\-]+)/images/(.*)$ \
            /static/shop/products/${bm:$1}/$1/images/$3       \
            [last,passthrough]

Lunar eclipse of December 10, 2011

on 10 Dec 2011 by Mukund (@muks)

Photographs of today's lunar eclipse, taken in Chennai, India. Click to see the large images.

Lunar eclipse photo 1 Lunar eclipse photo 2 Lunar eclipse photo 3 Lunar eclipse photo 4 Lunar eclipse photo 5 Lunar eclipse photo 6

Drawing circles

on 14 Oct 2011 by Mukund (@muks)

How does one draw a circle? You wipe the dust off your Foley & van Dam book and look up the circle drawing algorithm, or simply use Cairo. But I had to explain to someone how to draw circles and going directly to the Bresenham method wasn't reasonable. I started with a basic implementation and we discussed ways to optimize it in #gimp. The code here was written a few days ago, and is posted for anyone who searches for it.

Circles are invisible. When we say we want to draw a circle on the screen, we want to set pixels (light up equally sized rectangular bits of the screen) at integer coordinates to represent an approximation of the circle. Let's start off by writing some code which implements a main(), handles creating an image, setting pixels in it and writing it to a Portable graymap (.pgm) file (which can be viewed with programs like Eye of GNOME, GIMP, ImageMagick's display, etc.). With that out of the way, we can work on draw_circle() alone.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

typedef struct {
  size_t         width;
  size_t         height;
  unsigned char *data;
} Image;

static Image *
image_new (size_t width,
           size_t height)
{
  Image *image;

  image = malloc (sizeof *image);
  image->width = width;
  image->height = height;
  image->data = malloc (width * height);

  return image;
}

static void
image_free (Image *image)
{
  free (image->data);
  free (image);
}

static void
image_fill (Image         *image,
            unsigned char  value)
{
  memset (image->data, value, image->width * image->height);
}

/**
 * image_set_pixel:
 *
 * Sets a pixel passed in signed (x, y) coordinates, where (0,0) is at
 * the center of the image.
 **/
static void
image_set_pixel (Image         *image,
                 ssize_t        x,
                 ssize_t        y,
                 unsigned char  value)
{
  size_t tx, ty;
  unsigned char *p;

  tx = (image->width / 2) + x;
  ty = (image->height / 2) + y;

  p = image->data + (ty * image->width) + tx;

  *p = value;
}

static void
image_save (const Image *image,
            const char  *filename)
{
  FILE *out;

  out = fopen (filename, "wb");
  if (!out)
    return;

  fprintf (out, "P5\n");
  fprintf (out, "%zu %zu\n", image->width, image->height);
  fprintf (out, "255\n");

  fwrite (image->data, 1, image->width * image->height, out);

  fclose (out);
}

static void
draw_circle (Image *image, int radius, unsigned char value);

int
main (int argc, char *argv[])
{
  Image *image;

  image = image_new (600, 600);

  image_fill (image, 0xff);
  draw_circle (image, 200, 0);
  image_save (image, "circle.pgm");

  image_free (image);

  return 0;
}

main() basically creates a 600×600 pixels image, fills it with white, and asks draw_circle() to draw a black circle of radius 200 pixels centered in the image. Note that image_set_pixel() takes signed coordinates, where (x=0, y=0) is at the center of the image. So for an image created with image_new (400, 400), calling image_set_pixel (image, 0, 0, 0x80) will set the pixel at (200, 200) to 0x80.

So let's implement various versions of draw_circle() now. The well-known equation of a circle is \(x^2+y^2=r^2\). The variables are shown relative to each other in the figure below:

Circle equation

This type of an equation is called an implicit equation. You can substitute values for \(x\), \(y\) and \(r\) to see if the equation holds true. If it does, then the point \((x,y)\) is on the circle. So the implicit equation lets you determine if a point is on the circle or not.

Scanning a circle

In the figure above, you can see that for every line parallel to the \(y\)-axis through some \(x \in (-r,r)\), the circle intersects it in two points. If we scan \(x\) from \(-r\) to \(r\) and plot the two intersection points \((x,y_1)\) and \((x,y_2)\) in every vertical scanline, at the end of the scan we'll have drawn a circle. For this, we have to determine \(y_1\) and \(y_2\) for every \(x\). The implicit equation is no good for it, because it is only useful in checking values and not for determining them. So let's rearrange terms to find \(y\):

$$ \begin{eqnarray*} x^2+y^2 & = & r^2 \\ y^2 & = & r^2 - x^2 \\ y & = & \sqrt{r^2 - x^2} \end{eqnarray*} $$

The square-root returns a positive and negative value for \(y\), so \(y_1=y\) and \(y_2=-y\), and the points at which the circle intersects a scanline through \(x\) are \((x,y)\) and \((x,-y)\). So let's write a draw_circle() based on this:

static void
draw_circle (Image         *image,
             int            radius,
             unsigned char  value)
{
  int x, y;

  for (x = -radius; x <= radius; x++) {
    y = (int) sqrt ((double) (radius * radius) - (x * x));

    image_set_pixel (image, x, y, value);
    image_set_pixel (image, x, -y, value);
  }
}

Download circle1.c. This program generates the following image:

Circle #1

This looks like a circle, but it is discontinuous near the \(x\)-axis. At those places, for every new scanline \(x_{n+1}=x_{n}+1\), \(y\) changes by more than \(1\) (\(y_{n+1}-y_n > 1\)).. sometimes far more than \(1\). Notice that there are discontinuities between angles 0 and 45 degrees, and no discontinuities between angles 45 and 90 degrees in the drawn circle.

In a circle, all points are at the same distance from the center. In the code above, we only compute a semi-circle and mirror it by plotting at \((x,y)\) and \((x,-y)\). In the same way, we can only compute points between the angles 45 degrees and 90 degrees and mirror it 8 ways by plotting \((x,y)\), \((x,-y)\), \((-x,y)\), \((-x,-y)\), \((y,x)\), \((y,-x)\), \((-y,x)\) and \((-y,-x)\). It plots 8 arcs, each of 45 degrees. So let's modify draw_circle() to do this:

static void
draw_circle (Image         *image,
             int            radius,
             unsigned char  value)
{
  int x, y;
  int l;

  l = (int) radius * cos (M_PI / 4);

  for (x = 0; x <= l; x++) {
    y = (int) sqrt ((double) (radius * radius) - (x * x));

    image_set_pixel (image, x, y, value);
    image_set_pixel (image, x, -y, value);
    image_set_pixel (image, -x, y, value);
    image_set_pixel (image, -x, -y, value);

    image_set_pixel (image, y, x, value);
    image_set_pixel (image, y, -x, value);
    image_set_pixel (image, -y, x, value);
    image_set_pixel (image, -y, -x, value);
  }
}

Download circle2.c. This program generates the following image:

Circle #2

So we're done, right? If code to draw a circle is all we need, then we are done. On the other hand, if you want to optimize it a bit so it performs acceptably on very slow processors, keep reading.

At this point I posted a link to the code in #gimp. pippin noted that filling the disk (the area inside the circle) can be much faster than drawing the circle. The square root computation is not needed as you're merely checking every pixel's coordinates to see if it lies inside the disk or outside. This just needs testing against the implicit equation of the disk \(x^2+y^2 \leq r^2\). Note how it is very similar to the implicit equation of the circle which just checks the boundary of the disk. As the implicit equation is checked for every pixel, filling the disk can be done in a ridiculously parallel manner. Here's a modified version of draw_circle() which plots a filled disk:

static void
draw_circle (Image         *image,
             int            radius,
             unsigned char  value)
{
  int x, y;

  for (y = -radius; y <= radius; y++)
    for (x = -radius; x <= radius; x++)
      if ((x * x) + (y * y) <= (radius * radius))
        image_set_pixel (image, x, y, value);
}

Download circle3.c. This program generates the following image:

Circle #3

Ideally you'd rewrite such code to inline the image_set_pixel() as the pixels are filled in an ordered manner. A good C compiler should also eliminate redundant subexpressions and move invariants out of loops, but some embedded compilers are not good. So you'd want to add temporaries in such cases:

static void
draw_circle (Image         *image,
             int            radius,
             unsigned char  value)
{
  unsigned char *buffer;
  size_t pad;
  int r2;
  int x, y;

  buffer = image->data + (((image->height / 2) - radius) * image->width);
  pad = (image->width / 2) - radius;
  r2 = radius * radius;

  for (y = -radius; y <= radius; y++) {
    int y2;
    unsigned char *p;

    y2 = y * y;
    p = buffer + pad;

    for (x = -radius; x <= radius; x++) {
      int x2 = x * x;

      if (x2 + y2 <= r2)
        *p = value;

      p++;
    }

    buffer += image->width;
  }
}

Download circle4.c which generates the same filled disk. Then the conversation in #gimp dropped away that night. I couldn't think of a way to avoid that sqrt(), and resigned I went to bed.

The next morning I woke up to the news that Steve Jobs had died. Reading the Slashdot story, I saw a comment about rounded rectangles. The linked article mentioned how Bill Atkinson had used the arithmetic progression \(n^2=1+3+5+7+\cdots+(2n-1)\). Of course! We don't need exact square roots, only the nearest integer root. And between the angles 45 and 90 degrees where we work, \(y_{n}-y_{n+1} \leq 1\). So we can estimate whether \(y_{n+1}\) is a decrement by comparing \(y_{n+1}^2\) with \(y_n^2-(2y_n-1)\). Awesome, eh? This progression \(n^2=1+3+5+7+\cdots+(2n-1)\) was in the Knuth book, but somehow this and the circle case both didn't take their clothes off and jump into the same pool to get me all excited. Rather than give up that something can't be solved further, it's better to think that there's always a solution. It was there from before we were born just waiting to be found.

Squares progression

Also, let's see what happens to \(y=\sqrt{r^2-x^2}\) when \(x_{n+1}=x_n+1\) (the next vertical scanline).

$$ \begin{eqnarray*} y_{n+1}^2 & = & r^2 - x_{n+1}^2 \\ & = & r^2 - (x_n+1)^2 \\ & = & r^2 - (x^2_n+2x_n+1) \\ & = & r^2 - x_n^2 - 2x_n - 1 \\ & = & y_n^2 - 2x_n - 1. \end{eqnarray*} $$

So let's update draw_circle() to implement this:

static void
draw_circle (Image         *image,
             int            radius,
             unsigned char  value)
{
  int x, y;
  int l;
  int r2, y2;
  int y2_new;
  int ty;

  /* cos pi/4 = 185363 / 2^18 (approx) */
  l = (radius * 185363) >> 18;

  /* At x=0, y=radius */
  y = radius;

  r2 = y2 = y * y;
  ty = (2 * y) - 1;
  y2_new = r2 + 3;

  for (x = 0; x <= l; x++) {
    y2_new -= (2 * x) - 3;

    if ((y2 - y2_new) >= ty) {
      y2 -= ty;
      y -= 1;
      ty -= 2;
    }

    image_set_pixel (image, x, y, value);
    image_set_pixel (image, x, -y, value);
    image_set_pixel (image, -x, y, value);
    image_set_pixel (image, -x, -y, value);

    image_set_pixel (image, y, x, value);
    image_set_pixel (image, y, -x, value);
    image_set_pixel (image, -y, x, value);
    image_set_pixel (image, -y, -x, value);
  }
}

Download circle5.c which generates the same circle. Notice how only additions, subtractions and shifts are enough in the loop. image_set_pixel() can be inlined here too, but I'm not going to do it.

So how do we draw rectangles with rounded corners? We currently draw the circle as 8 arcs, right? With the appropriate border radius, we simply translate and draw 2 arcs each at the 4 corners of the rectangle. :)