by: Matt Simerson
Friday 22 Nov 24

A permanent fix for the BIND 8 crashing problem.

In a post to the BSDI-Users mailing list, a person asked the following very good question:

> OK, We finally got "bit" by the named crash bug... I looked thru the
> archives and never found a post on how to CORRECT the problem.

Having used BIND for many years, I have not only seen this problem myself but have also watched many other people suffer the same frustrations with it. I took this question as an opportunity to kill two birds with one stone: first, to answer the poster's question, and second, to contribute to the documentation of djbdns. The following is a compilation of the ensuing emails.

---------------------------------------------------------------------------

Well, you did say "CORRECT" now didn't you? :-) Well, that leads us to two possible solutions:

1) Buggery - As others have suggested, accept the fact that BIND will grow as big as it darned well pleases and you just have to keep raising the limits to compensate. This is NOT the Unix way. Process limits are good things and I don't like general purpose daemons like BIND eating as much RAM as they want.

or

2) Solution - Edit named.conf and insert these three lines into your options section:

options {
    recursion no;
    fetch-glue no;
    listen-on port 53 { ip.ad.dr.es; };
};

UPDATE: Depending on your version of BIND, your listen-on directive might have to look like this: "listen-on { xx.xx.xx.xx; };"
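
As a quick sanity check before reloading BIND, something like the following could confirm that the directives above actually made it into the file. This is only a sketch: check_named_conf is a made-up helper (not part of BIND), it does a line-based grep rather than parsing the config, and it assumes each directive sits on its own line.

```shell
# check_named_conf FILE
# Hypothetical helper: verifies that recursion and fetch-glue are set to
# "no" and that some listen-on directive is present. Prints one line per
# problem and returns non-zero if anything is missing.
check_named_conf() {
    conf="$1"
    status=0
    for name in recursion fetch-glue; do
        if ! grep -Eq "^[[:space:]]*${name}[[:space:]]+no[[:space:]]*;" "$conf"; then
            echo "not disabled: $name"
            status=1
        fi
    done
    if ! grep -Eq '^[[:space:]]*listen-on[[:space:]{]' "$conf"; then
        echo "missing: listen-on"
        status=1
    fi
    return "$status"
}

# Usage (the path is an assumption; use wherever your named.conf lives):
#   check_named_conf /etc/named.conf && ndc reload
```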

After you reload BIND with "ndc reload", it will only bind to ip.ad.dr.es and will only answer queries for which it is authoritative. Assign another IP address (ip.ad.dr.es2) to the NIC in your name server and install dnscache on it. I wrote a little shell script that fetches the sources from the author's site, compiles and installs them, sets up a dnscache instance, starts it, installs a control script called "services", tweaks resolv.conf, and adds itself to /etc/rc.local.

You can use the services script to control the daemon(s): "services start" will fire it up and "services stop" will shut it down. By default the script installs dnscache listening on 127.0.0.1, so you'll want to change the IP (in two places) in the script to ip.ad.dr.es2. Run the script and you're good to go.

http://matt.simerson.net/computing/dns/install-dnscache-bsdi.txt

I wrote the script because I needed to install dnscache on several hundred BSDI machines. We did things a tiny bit differently on our machines, though. I started out by building a dnscache for the entire network: a 300MB dnscache on a Quad Xeon 550 with 2GB of RAM. It's a bit overkill but it's what I had. After the last BIND exploit came out, our Unix system admins tested dnscache and then just took BIND off the servers and pointed them all at my (experimental) dnscache. This, as one might suspect, put one heck of a load on that box. It got immediately upgraded from test to production server. :-)

The dnscache server settled in to consistently soaking up 45% of one CPU, so I did a bit more research and learned that it has some very reasonable limits enabled by default: 250 open file descriptors (network ports) and a compile-time limit of 200 outgoing UDP requests (MAXUDP). So I cranked both up to 1000 and recompiled, and now I'm eating up 80% of a CPU and handling an average of 1270 DNS queries per second. Pretty darned impressive compared to the 80/sec average we see on our i386 Solaris name servers (running BIND).
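
Raising those two limits might look roughly like this. The sketch assumes the stock djbdns layout, where the file descriptor cap is the "-o250" argument to softlimit in the dnscache run script and the UDP cap is the line "#define MAXUDP 200" in dnscache.c. Check your own copies before relying on either pattern. The function names and file paths are made up for illustration.

```shell
# raise_open_files N
# Filter: rewrites the assumed "-o250" softlimit argument to "-oN".
raise_open_files() {
    sed "s/-o250/-o$1/"
}

# raise_maxudp N
# Filter: rewrites the assumed "#define MAXUDP 200" line to use N.
raise_maxudp() {
    sed "s/^#define MAXUDP 200\$/#define MAXUDP $1/"
}

# Usage (paths are assumptions; review the output before installing it):
#   raise_open_files 1000 < /usr/local/dnscache/run > run.new
#   raise_maxudp 1000 < dnscache.c > dnscache.c.new
```

Both helpers only transform text on standard input, so nothing is changed until you choose to replace the original files, recompile, and restart dnscache.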

Recently, while monitoring the traffic (packet sniffing), I saw an inordinate amount of the load coming from some of our mail servers. I took the worst offender and installed dnscache on it with a 1MB cache, acting as a stub resolver that pointed to my network-wide dnscache. It worked like a charm: that server dropped from 50 requests per second against my network cache to a little less than one per second. I wrote that script to automatically install dnscache on the rest of the mail servers.

The beauty is that dnscache runs chrooted and consumes a mere 1.4MB of RAM, of which 1MB is used for cache. The next time an exploit comes out for BIND, you can just smile, nod, and go back to work. :-)

> Matt,
> This sounds like the answer to our question. Now, if I understand
> correctly, this can be used WITH BIND.

Yes.

> I assume that you are running the djbdns daemon?

Djbdns is actually a package of three DNS daemons. One is "dnscache" and it's exactly that, a caching DNS server. It's absolutely perfect for all those applications where you don't need to serve authoritative data but want the benefits of DNS caching. Most hosts running BIND are doing so because they want the caching benefits.

BIND is both a caching name server AND an authoritative name server all rolled into one rather sizable daemon. Unfortunately, despite being a reasonably good and speedy cache, it doesn't allocate resources (particularly RAM) in a very intelligent fashion. This is why sysadmins get to experience the "named crash" feature.

What happens with BIND is that at startup it loads all the records it's authoritative for (if any) into its cache. It then keeps adding every record it looks up recursively to that same pool until it exhausts its limits. At that point it crashes. To prevent that, just turn off recursive lookups in BIND and use something more sensible like dnscache (from the djbdns package), which lets you assign limits and gracefully obeys them.

> I do have a couple of questions. Can I place the dnscache on several
> PCs?

Assuming those PCs are running a Unix-like OS, then yes. However, in the interest of bandwidth savings, you want your DNS servers arranged in a hierarchy. Most network managers should set theirs up like this:

                     BIG DNSCACHE
                 BIND 8 on ip.ad.dr.es
           50-500MB dnscache on ip.ad.dr.es2
                          |
                          |
                          V
   mail server        www server      <service> server
   1MB dnscache       1MB dnscache    1MB dnscache
   127.0.0.1          127.0.0.1       127.0.0.1

The top dnscache will be configured to allow queries from any machine on your network, including all your clients, dialups, PCs, and anything else that wants resolution. This is really easy to do: "touch /usr/local/dnscache/root/ip/192.168.0" allows the class C "192.168.0.0/24" addresses to use the cache. Do that for each block of addresses you own.
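
If you own several blocks, the touch step can be wrapped in a small loop. A minimal sketch, assuming the /usr/local/dnscache layout used in this article; allow_prefixes is a hypothetical helper name and the prefixes in the usage line are placeholders.

```shell
# allow_prefixes DNSCACHE_DIR PREFIX...
# Creates one empty file per address prefix under DNSCACHE_DIR/root/ip,
# which is how dnscache is told which clients may query it.
allow_prefixes() {
    ipdir="$1/root/ip"
    shift
    mkdir -p "$ipdir" || return 1
    for prefix in "$@"; do
        touch "$ipdir/$prefix" || return 1
    done
}

# Usage (placeholder prefixes):
#   allow_prefixes /usr/local/dnscache 192.168.0 192.168.1
```

A shorter prefix covers a larger block, so a file named "10" would admit every address starting with 10.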

What you want is to make sure that you never make more than one connection to the Internet for any given resource record (for the duration of that record's TTL). So, on each of the other servers, you want the dnscaches to point at BIG DNSCACHE. You do this by installing them exactly as I described before, except that you install the cache on 127.0.0.1 and do the following:

echo "1" > /usr/local/dnscache/env/FORWARDONLY
echo "ip.ad.dr.es2" > /usr/local/dnscache/root/servers/@

What that'll do is tell your three little dnscaches that they're stub resolvers, and they'll forward all their requests off to ip.ad.dr.es2 (the IP address that dnscache listens on at BIG DNSCACHE).
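
When the same two files have to be written on many machines, the step can be wrapped in a function. This is a sketch only: make_forwarder is a hypothetical name, and the directory layout is the /usr/local/dnscache one assumed throughout this article.

```shell
# make_forwarder DNSCACHE_DIR UPSTREAM_IP
# Writes the two files that turn a local dnscache into a forwarder:
# env/FORWARDONLY switches forwarding on, and root/servers/@ names the
# upstream cache that every query is sent to.
make_forwarder() {
    dir="$1"
    upstream="$2"
    mkdir -p "$dir/env" "$dir/root/servers" || return 1
    echo "1" > "$dir/env/FORWARDONLY"
    echo "$upstream" > "$dir/root/servers/@"
}

# Usage (replace the placeholder with the real address of BIG DNSCACHE):
#   make_forwarder /usr/local/dnscache ip.ad.dr.es2
```

The running dnscache has to be restarted before it picks up the new settings.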

> I assume from your comments below that this is what you did.

Basically yes. We're using dnscache now with some very impressive results. It's taking a little more time to convert a half million zones into a manageable format and then, most likely, we'll be converting our authoritative name servers from BIND to tinydns (from the djbdns package).

> I have
> a main server running BIND w/256M of ram and lots of DRIVE space.

Dnscache doesn't need more than a few MBs of drive space. Although it's overkill, if that machine is only serving DNS, you could give dnscache any RAM that's not needed by BIND. A more reasonable solution would be to look at how much memory BIND is using and then do an "ndc reload" to see how much memory BIND needs to load up all your zone data. Subtract the second number from the first and you'll know roughly how much data BIND had cached. Multiply that number by 1.2 to size your new cache.

As an example, let's say you look at BIND and it's using 64MB of memory, which is fairly likely since that's where you saw it hit resource limits and crash. After the ndc reload you see that it's happily running at 20MB. You'll then know that you've got ~20MB of zone data and the other 44 megs in use were entries BIND cached from doing recursive lookups.

When we multiply 44 by 1.2 we get about 53, which we round to a 50MB cache. This is a reasonable sounding number, so we set dnscache on BIG DNSCACHE to use that much RAM:

echo "51200000" > /usr/local/dnscache/env/CACHESIZE
echo "55296000" > /usr/local/dnscache/env/DATALIMIT
svc -h /usr/local/dnscache
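
The sizing arithmetic above can be sketched as a pair of shell functions. The function names are made up for illustration. Note that the figures used above work out to 1,024,000 bytes per "MB" (50 x 1,024,000 = 51200000 and 54 x 1,024,000 = 55296000), so the sketch uses the same factor. Treat that as a convention read off this example rather than anything dnscache requires.

```shell
# cache_mb BEFORE_MB AFTER_MB
# BEFORE_MB: BIND's memory use before "ndc reload"
# AFTER_MB:  BIND's memory use right after the reload (zone data only)
# Prints the difference padded by 20%, using integer arithmetic.
cache_mb() {
    echo $(( ($1 - $2) * 12 / 10 ))
}

# mb_to_setting MB
# Converts a size in "MB" to the byte count written to env/CACHESIZE or
# env/DATALIMIT, using the 1,024,000 factor described above.
mb_to_setting() {
    echo $(( $1 * 1024000 ))
}

cache_mb 64 20      # prints 52
mb_to_setting 50    # prints 51200000
mb_to_setting 54    # prints 55296000
```

DATALIMIT is deliberately a little larger than CACHESIZE so the process has headroom beyond the cache itself.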

At this point, "ps ax | grep dnscache" should show a dnscache process using about 50MB of memory. Dnscache will happily (and quickly) cache all DNS requests for your network. Once the cache fills up it'll begin tossing out the oldest entries. It's in your best interests to size the cache so that queries stay cached until their TTL expires. The FAQ on cr.yp.to/djbdns/djbdns.html explains how to tune the cache size.

> I also
> run a mail server that has 128M of ram on it. I would like to add the
> dnscache to both servers. The mail server is still running BIND 4.9

First, get dnscache up and running on BIG DNSCACHE. Then put the address of BIG DNSCACHE's cache (ip.ad.dr.es2) into /etc/resolv.conf on the mail server. Then shut down BIND on the mail server and edit the system startup files so that it doesn't get restarted at boot time. Install dnscache as suggested above (stub resolver) with the default 1MB cache. Once installed, put "nameserver 127.0.0.1" into /etc/resolv.conf, test a couple of lookups, and you're done.
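
The two resolv.conf edits in that sequence can be generated rather than typed by hand. A minimal sketch; resolv_conf is a hypothetical helper that only prints the lines, so you decide when to overwrite the real file.

```shell
# resolv_conf IP...
# Prints one "nameserver" line per address, in the order given.
resolv_conf() {
    for ip in "$@"; do
        printf 'nameserver %s\n' "$ip"
    done
}

# Usage:
#   while BIND is being retired, point at the big cache:
#     resolv_conf ip.ad.dr.es2 > /etc/resolv.conf
#   once the local dnscache is running:
#     resolv_conf 127.0.0.1 > /etc/resolv.conf
```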

> Also, the line to add in the named.conf file for bind 8.2 Is that just
> the primary address of my named server or do I need it for my virtual
> sites I host too?

It depends on how your domains' NS records are set up. If you've set up "virtual DNS" where each domain has "nsX.domain.com" instead of just using "nsX.isp.com", then yes, you'll need to add a listen-on entry for each IP. The rule of thumb is to add to the named options every IP address that's registered as a name server through Network Solutions.

> I assume that we can leave the DNS server running while we install the
> dnscache. There is no problem assigning a second address to the nic. I
> already have 20 or so assigned for my virtual sites

Yes, leave BIND up and running and acting as your authoritative DNS server. It's pretty good at that (bugs and security problems notwithstanding). You don't want to convert completely to djbdns until you're familiar enough with djbdns and its data format. After using dnscache, you'll be wanting to switch. ;-)

> Thanks! I really look forward to getting this all put together! May
> even look at djbdns!

A couple final things: dnscache has logging on by default. A busy dnscache process will keep a hard drive very busy just writing logs so I disable all logging. I do so by editing /usr/local/dnscache/log/run and replacing the logging line with:

exec setuidgid bin multilog '-*'

While tuning the cache, it's helpful to log stats. You can easily do that like so:

exec setuidgid bin multilog t '-*' +'*stats *' ./main

Essentially, that discards everything except lines that start with the word stats. Of course, you have to restart the logging process after making changes to the run file. That's pretty easy too:

svc -h /usr/local/dnscache/log
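
Once stats lines are being logged, a small filter can pull out the most recent one while you tune. This is a sketch under two assumptions: the "t" option shown above is in use, so each line begins with a timestamp followed by the word stats, and the log lives in the main directory under /usr/local/dnscache/log. latest_stats is a made-up name.

```shell
# latest_stats
# Reads multilog output on standard input and prints the newest line
# whose second field is "stats" (field 1 is the timestamp added by "t").
# Prints nothing if no stats line is found.
latest_stats() {
    awk '$2 == "stats" { last = $0 } END { if (last != "") print last }'
}

# Usage (path is an assumption):
#   latest_stats < /usr/local/dnscache/log/main/current
```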

Other than the MAXUDP that I mentioned earlier, that's pretty much everything you'll ever need to know about dnscache.

Enjoy!
Matt