by: Matt Simerson
Saturday 23 Nov 24

I have an interesting situation. I'm using dnscache as a network-wide cache. It's a large network with lots of mail servers, so we generate a lot of DNS traffic. To manage the load generated by all our servers, I've taken to doing some very interesting things. The best way to explain this is probably a historical explanation.

We started with each unix/M$ host on the network knowing to resolve by looking at one of two caches (via resolv.conf or TCP/IP properties).

mail ----> cache1 ---> internet
     ----> cache2 ---> internet

This works well if two machines can handle all the recursive lookups your network generates. Two PIII 700MHz machines with 512MB each can't. Dnscache runs on the machines and happily consumes all the CPU as the machines hum along. I have logging completely disabled, so multilog uses no CPU or disk. Two machines aren't enough.

An interesting side note is that we do a lot of domain hosting, so we have a very active DNS state. We add and remove domains often, hundreds per day. When a new customer arrives, we want all the machines on our network to be able to resolve that new domain before NetSol gets it pointed to us. This is pretty easy with dnscache: just create the domain.com file with our authoritative servers' IPs in it and restart dnscache. Ouch, that nukes all the cached data we've been collecting. :-(

So we have two problems: two servers can't handle the load, and reloading the caches to learn about our new zones is painful because it empties them.

At this point a couple options spring to mind. The first is drop a load balancer in front of the caches and stick more machines into the cache pool. The cost is $15k for a set of load balancers and more machines. That solves the load issue but does nothing to resolve the cache dumping problem. I could use the patches that dump the cache to disk and read it back but that's expensive. Having to dump a 500MB cache every so often and read it back can't be fast. :-(

The second option is to simply add more machines. The question becomes: if you simply add more machines, what's the best configuration? After a considerable amount of thought, I've implemented it like this:

mail ---> cache1 ---> cache-master1
                        /\
     ---> cache2 ---> cache-master2
                        /\
                 ---> cache-master3

Cache1 and cache2 are both running as forwarders with the IPs of the cache masters as the roots. They have a 10MB cache and forward any non-cached requests to the cache-masters. They also have a script that takes a list of new domains and creates the appropriate files so that dnscache knows to look to our internal authoritative servers for new zones. Any zones in root/servers/ over a week old get deleted (excepting @).
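
For anyone wanting to try the same thing, the zone-file script can be sketched roughly like this. It is a minimal Python sketch and not the script actually in use: the function names add_zones and prune_zones, the directory argument, and the example IPs are all assumptions made up for illustration. It writes one file per new domain into the servers directory and removes files older than a week, leaving @ alone. dnscache still has to be restarted afterwards to pick up the changes.

```python
import time
from pathlib import Path

WEEK_SECONDS = 7 * 24 * 60 * 60


def add_zones(servers_dir, domains, auth_ips):
    """Write one file per domain listing the authoritative server IPs."""
    servers_dir = Path(servers_dir)
    body = "".join(ip + "\n" for ip in auth_ips)
    for domain in domains:
        name = domain.strip().lower().rstrip(".")
        # Skip empty names, the forwarder list itself, and anything path-like.
        if not name or name == "@" or "/" in name:
            continue
        (servers_dir / name).write_text(body)


def prune_zones(servers_dir, max_age=WEEK_SECONDS, now=None):
    """Delete zone files older than max_age seconds, never touching '@'."""
    now = time.time() if now is None else now
    removed = []
    for path in Path(servers_dir).iterdir():
        if path.name == "@" or not path.is_file():
            continue
        if now - path.stat().st_mtime > max_age:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Calling add_zones("/service/dnscache/root/servers", new_domains, ["10.0.0.10"]) followed by prune_zones on the same directory would cover both halves of the job; the path and the IP here are placeholders, so adjust them to your own install.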

Cache-masters are just standard dnscaches with gobs (512+MB) of RAM. This allows us to restart the caches on cache1 & cache2 at any time without it costing us our cached records. Cache1 and cache2 are largely inactive. Even with thousands of requests per second, they're running with almost no load on the system. The cache masters are handling all the grunt work. This makes it very easy to scale our DNS system: build another cache master, put its IP into root/servers/@ on cache1 & cache2, restart dnscache and you're good to go.
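
Adding a master to the pool is a small enough edit to script. The following is a rough Python sketch under the assumption that the @ file holds nothing but one IP per line; add_master and its weight parameter are hypothetical names, not part of djbdns.

```python
from pathlib import Path


def add_master(at_file, ip, weight=1):
    """Append ip to the @ file `weight` times and return the resulting IP list."""
    path = Path(at_file)
    # Assumes the file contains only whitespace-separated IP addresses.
    lines = path.read_text().split() if path.exists() else []
    lines.extend([ip] * weight)
    path.write_text("".join(line + "\n" for line in lines))
    # dnscache reads this file at startup, so a restart is still needed
    # after the edit (how you restart depends on your setup).
    return lines
```

Listing the new IP more than once (weight=2, say) is the same trick described in observation 4 below.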

Interesting observations:

1. Cache1 and cache2 use almost no CPU. Apparently it takes very little effort to merely forward a query and cache a response.
2. The "random" selection of a cache master (from servers/@) appears to do a good job of load balancing.
3. Because of observations 1 & 2, it's reasonable to have a cache and a "cache-master" running on the same machine. The master is the one doing the grunt work.
4. Because dnscache on cache1 & cache2 selects a cache-master in a "random" fashion, you can do some ad-hoc load balancing by adjusting the IPs listed in the @ file. I'm using an @ file like this:

127.0.0.1 (local master cache)
127.0.0.1
10.0.0.4 (fastest master cache)
10.0.0.4
10.0.0.5 (slower master cache)

By putting in an IP multiple times you'll increase its likelihood of being chosen. It's still a random selection, but upon initial observation about 40% of traffic gets directed to the local master cache, 40% to my fast master cache and the last 20% to my slower cache.
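
Those percentages are what you'd expect if each line in the @ file is equally likely to be picked. Here is a small Python sketch of that arithmetic; the uniform-pick model is an assumption based on what I observed, not something taken from the dnscache source, and parse_at_file and expected_share are illustrative names.

```python
from collections import Counter


def parse_at_file(text):
    """Return the IPs in an @ file, ignoring blank lines and trailing notes."""
    ips = []
    for line in text.splitlines():
        fields = line.split()
        if fields:
            ips.append(fields[0])
    return ips


def expected_share(ips):
    """Fraction of queries each IP would get if every line is equally likely."""
    counts = Counter(ips)
    total = len(ips)
    return {ip: n / total for ip, n in counts.items()}


AT_FILE = """\
127.0.0.1
127.0.0.1
10.0.0.4
10.0.0.4
10.0.0.5
"""

share = expected_share(parse_at_file(AT_FILE))
for ip, fraction in share.items():
    print(f"{ip}: {fraction:.0%}")
```

With two entries each for the local and fast masters and one for the slow one, the split works out to 2/5, 2/5 and 1/5, which matches the roughly 40/40/20 I'm seeing.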

Ultimately, cache1 and cache2 end up being simple DNS load balancers that are smart enough to forward requests for certain domains to other places. Since that task leaves them with gobs of idle CPU, I've installed another dnscache on them and am also using them in my cache-master pool.

Since my caches get restarted often, I can tweak the @ file on the two caches to spread the load around as desired. I think it's pretty slick and has all the features we required. Best of all, it only required one more machine to be added to handle our current load. My DNS topology now looks like this:

hosts ---> cache1         ---> cache-master-local ---> roots
                   random 
      ---> cache2         ---> cache-master-local ---> roots

                          ---> cache-master       ---> roots


When I max out these three machines' ability to gracefully handle all the traffic, it's a piece of cake to install cache-master2, update the @ files on cache1 & 2, and have a bunch more horsepower.

So, have I missed an easy solution or spent way too much time over-engineering this one?

Matt