Validating Recursive Resolver Appliance

This paper describes how to configure and use a Small Office/Home Office (SOHO) class wireless router as a DNSSEC Validating Recursive Resolver (DVRR). In most SOHO situations, DNSmasq, the DNS forwarder running on the ISP-provided gateway, serves as the DNS resolver for computers on the SOHO’s LAN. This paper suggests that DNSmasq is not as good a solution as a DVRR for DNS resolution and shows how to build a DVRR that provides better throughput than DNSmasq and that additionally provides cryptographic validation of DNS lookups using the Domain Name System Security Extensions (DNSSEC).

The ultimate goal is for this improved resolver to replace DNSmasq on ISP-provided equipment. Currently this is not possible, so an additional inexpensive SOHO class wireless router must be acquired to host the DVRR. Once configured as described in this paper, the DVRR runs unattended on the wireless router and provides validating DNS resolution to other computers on the LAN. This paper describes the operation of the device and how to build, install and configure the software necessary for operation of the DVRR.

Motivation

The goal of this project is to create a SOHO-class DNS resolver solution that, without requiring any human intervention, provides DNS resolution with DNSSEC validation when possible and falls back to DNS resolution without validation when validation is not possible. The problem being solved is that when conditions don’t support DNSSEC validation, the DNS resolver configuration must be changed to disable validation or all resolution will fail. Conversely, when DNSSEC validation is disabled and conditions change such that validation would be possible, the DNS resolver configuration should be changed to enable validation, or no validation will take place.
Roughly speaking, we want the system to correctly and automatically handle 99% of the situations where the DNSSEC validation setting of the DNS resolver (i.e., validation enabled or disabled) must change. The primary goal of this project is to provide tools that detect whether DNSSEC validation is possible, and that react to changes in the environment to correctly set the state of validation on the DNS resolver. A secondary goal is to demonstrate that running DNS resolution with DNSSEC validation in embedded environments is possible and in fact superior to previous methods.

Most home networks have an inexpensive low power gateway device provided by the ISP. By adding DNSSEC support to these devices we can provide DNSSEC validation to the whole home network without having to touch any of the other devices on the network and without having to depend on the ISP. These same devices, or more expensive ones, are used in small offices as well. What these devices have in common is that there is no need to manage them, as they are simply “plug-and-play”. Many users do minimal configuration on the devices when they are set up, such as changing the wireless name, selecting a security key, changing the password, etc. While it is possible that the user could detect DNSSEC validation support at the same time and enable DNSSEC validation by the click of a button, the fact remains that the capabilities of the network change over time. Consequently, we want the device to continually detect and condition itself to enable or disable validation as required by the current environment.

As mentioned above, we cannot yet install the DVRR on the ISP-provided gateway, so we used low cost routers with embedded Linux operating systems to host the DVRR software.

Problems encountered

We quickly realized that simply running a validating recursive resolver on a router device is problematic.
Power interruption, DHCP re-lease, and simply the passage of time can interrupt the validator’s function. Another challenge is acquiring the root trust anchor that validation requires to bootstrap trust. The worst behavior that a validating recursive resolver can exhibit is to refuse to answer queries because validation failed and, furthermore, to remain stuck in that state forever. Because a router is packaged at a factory and unpackaged by a customer at some indeterminate later time, it is likely that time-sensitive DNSSEC information has expired by then. Thus the router needs to start up by acquiring the root trust anchor in a secure manner.

Embedded systems from different manufacturers have many variations, including choice of CPU chipset, with and without a clock battery backup, with and without writeable file systems, etc. For this reason we selected two different embedded Linux distributions – DD-WRT and OpenWRT – that offer two very different operating environments. If we can get DNSSEC validation to work in both environments, we hope that DNSSEC validation will make its way to other embedded systems. We also chose two different routers to test with: a NETGEAR WNR3500L and a Buffalo WZR-HP-G300NH2.

DD-WRT and NETGEAR WNR3500L

DD-WRT is a “minimalist” Linux-based distribution with a small non-volatile read-only file system. End-user configuration changes are stored in non-volatile random access memory (“nvram”) and written to temporary files in a volatile file system; these files are lost when the device reboots. The challenge here is to get our tools to operate correctly in this environment by reading or writing configuration information in the right places and creating links from the non-volatile file system to these temporary files when required.

We run the DD-WRT firmware on a NETGEAR WNR3500L router because DD-WRT employs a “squashfs” file system that compresses the disk image using LZMA compression.
This allows the Linux OS and additional software to fit on the limited flash memory available. The NETGEAR WNR3500L router is based on the Broadcom 4716 chip, which contains a 453MHz MIPS architecture CPU. This router has 8MB of non-volatile flash memory and 64MB of volatile random access memory. It provides 802.11n wireless access and a 4 port gigabit switch.

OpenWRT and Buffalo WZR-HP-G300NH2 Router

Compared to DD-WRT, OpenWRT is more like a regular Linux system in that it has a writeable non-volatile file system. As writes to this file system are slow, log files and other frequently written files are located on a separate volatile file system that is mapped to the volatile memory. The challenge here is to provide configuration that is stable at boot time but does not interfere with device operation until the current time has been established.

We run the OpenWRT firmware on a Buffalo WZR-HP-G300NH2 router. This router has significantly more flash memory than the NETGEAR router and so provides more room for OpenWRT to fit. The Buffalo router is based on the Atheros AR7242 chip, which is a 400MHz MIPS architecture CPU. This router has 32MB of non-volatile flash memory and 64MB of volatile random access memory. It provides 802.11n wireless access and a 4 port gigabit switch.

Network Configuration

Figure 1 shows a typical configuration of a DVRR in a SOHO network. The ISP-provided gateway establishes a NAT’d LAN labeled “Home LAN” to which the DNSSEC router is connected. The DNSSEC router creates a second NAT’d LAN, labeled “Home LAN 2”, on a different subnet. DNSmasq runs on both routers to provide DHCP to the respective subnet. Unbound, running on the DNSSEC router, provides validating DNS resolution to both subnets.
Figure 1 – Typical SOHO Network configuration with a DVRR (diagram labels: ISP provided cable gateway running DNSmasq (DHCP), connected to the Internet; DNSSEC Validating Recursive Resolver in gateway mode running DD-WRT + DNSmasq (DHCP) + unbound (DNS) + dnssec-trigger; Home LAN; Home LAN 2; CAT5 cable)

The advantage of using unbound instead of DNSmasq for DNS resolution is shown in Table 1. Unbound’s performance equals or exceeds that of DNSmasq at recursively resolving queries. Unbound is slower when performing DNSSEC validation; however, unbound caches validated answers and, when answers to queries are in cache, can provide validated resolution from cache at a rate that exceeds DNSmasq’s non-validated resolution performance. We judge this level of validation performance (i.e., 200 q/s) adequate for use in a SOHO environment.

Table 1 – Typical Performance, DNSMasq versus Unbound

| Resolution | DNSMasq | Unbound |
| --- | --- | --- |
| Recursive resolution, cold cache | 700 q/s | 700 q/s |
| Recursive resolution, warm cache | 700 q/s | 4,000 q/s |
| DNSSEC validation, cold cache | X | 200 q/s |

Concept of Operation for NETGEAR WNR3500 U/L with DD-WRT

The NETGEAR DVRR runs on DD-WRT Linux-based firmware. DD-WRT primarily runs on a read-only file system called squashfs. Squashfs is a compressed read-only file system that allows a rather complete Linux distribution to fit on an 8MB flash memory. Certain files that must be writable are symbolically linked from the read-only squashfs to a read-write file system mounted on RAM. DD-WRT runtime configuration depends on name/value pairs written to non-volatile RAM (NVRAM) and accessed by scripts using the nvram shell command.

Figure 2 depicts the DVRR in the DD-WRT environment. This figure shows only the files that are important for operation of the DVRR. Starting at the upper left hand corner of the figure, on-mount-opt.sh is the first DVRR file invoked during boot.
This file runs because it is symbolically linked from /etc/config/start-dns.startup. Files with the .startup extension are run during boot by DD-WRT. Continuing along the top of the figure, udhcpc is a DD-WRT process that monitors DHCP events and invokes the file /tmp/udhcpc when a DHCP event occurs. The process marked “*/5” is a cron job that runs every 5 minutes and invokes dnssec-cron-job.sh. Finally, the user can invoke javascript to control the DVRR from cgi-bin web pages served by lighttpd. These processes are described in more detail below.

Figure 2 – DNSSEC Validating Router DD-WRT Operating Environment (diagram labels include: On boot, udhcpc, */5, user; on-mount-opt.sh, udhcpc-script.sh, dnssec-cron-job.sh, Html/javascript, myPageN.sh, query_functions.cgi; unbound-anchor, /sbin/rc, dnssec-trigger-control, dnssec-triggerd, lighttpd, Unbound Recursive resolver; /tmp/unbound, /opt/www, /opt/etc, /var/log; root.key, unbound-mod.conf, unbound_forwards_block.conf, validationenabled, named.cache, unbound.conf, unbound_control.key, unbound_control.pem, unbound_server.key, unbound_server.pem, lighttpd.conf, conf.d/01-default.conf, conf.d/010-php-fcgi.conf; access.log, dnssec-cron-job.log, lighttpd.log, messages, udhcpc.log, unbound.log)

unbound recursive resolver

The unbound recursive resolver, in the center of Figure 2, is the focal point of DVRR operation. All of the remaining software exists solely to configure unbound so that it can do resolution with DNSSEC validation. Depending on circumstances, unbound runs in one of the following modes:

1. In the best case, unbound forwards queries to locally available resolvers for resolution, provided they are at least security aware and so able to supply the data that unbound needs to perform DNSSEC validation on their answers.
2. Unbound may run as a local recursive resolver, starting from the DNS root authorities, if no suitable forwarders are available. In this situation, unbound can still do DNSSEC validation.
3. Unbound may forward queries to certain designated resolvers using port 80 or port 443 if no other forwarders are available and these ports can be reached. The port 80 and port 443 resolvers are by default provided by nlnetlabs.nl. Unbound can do DNSSEC validation in this situation as well.
4. Unbound can do recursive resolution or forward to a local resolver with no DNSSEC validation if none of the above validating scenarios work. In this case, the benefits of validated DNS resolution are lost.

To reiterate, the purpose of the remainder of the DVRR software is to configure unbound to run in the best mode possible, hopefully one of the first 3 modes described.

/opt/sbin/on-mount-opt.sh

On-mount-opt.sh (actually, /opt/sbin/on-mount-opt.sh) checks for a number of values in nvram and initializes them if they do not exist. It then initializes the volatile file system for DVRR operation by creating folders and files on the /tmp writable file system:

- It creates the initial /tmp/unbound/unbound-mod.conf file specifying that unbound only do iteration, not validation, of queries.
- It sets up the forwarders for unbound to use in /tmp/unbound/unbound_forwards_block.conf.
- It sets up or updates the unbound root trust anchor in /tmp/unbound/root.key.
- It starts unbound, dnssec-triggerd and lighttpd.
- It intercepts DHCP events from udhcpc by replacing the symbolic link /tmp/udhcpc with a symbolic link to /opt/sbin/udhcpc-script.sh. udhcpc-script.sh is described below.

Using DHCP obtained address on the WAN

If the router is using DHCP to request its WAN address, on-mount-opt.sh synthesizes a DHCP event (because the first DHCP event occurred before on-mount-opt.sh could hook it) by running udhcpc-script.sh, passing ‘bound’ as the argument and the address of the WAN DNS server (e.g., DNSmasq) in the “dns” environment variable.
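The start-up steps described above can be sketched in shell as follows. This is a minimal illustration under stated assumptions, not the actual on-mount-opt.sh: all function names and their arguments are hypothetical, the forwarder syntax (domain/ipaddress pairs held in the nvram unbound_forwards variable) is the one documented in Table 2, and we assume that /etc/unbound.conf includes unbound-mod.conf from inside its server: clause. Fetching the root trust anchor and starting the daemons are left out.

```shell
#!/bin/sh
# Sketch only; helper names and arguments are hypothetical.

# Turn "domain/ipaddress domain/ipaddress" into unbound forward-zone clauses.
forwards_block() {
    for pair in $1; do
        printf 'forward-zone:\n'
        printf '    name: "%s"\n' "${pair%%/*}"
        printf '    forward-addr: %s\n' "${pair#*/}"
    done
}

# Prepare the volatile working directory with validation initially off.
#   $1 = working directory (/tmp/unbound on the router)
#   $2 = forwarder list (on the router: "$(nvram get unbound_forwards)")
init_dvrr_dir() {
    mkdir -p "$1" || return 1
    # "iterator" alone means resolve without validating (assumed to be
    # included from the server: clause of unbound.conf).
    printf 'module-config: "iterator"\n' > "$1/unbound-mod.conf"
    forwards_block "$2" > "$1/unbound_forwards_block.conf"
}

# Redirect the udhcpc event hook ($1, /tmp/udhcpc on the router) to script $2.
hook_udhcpc() {
    rm -f "$1" && ln -s "$2" "$1"
}

# Replay the missed first DHCP event: "bound" argument, DNS server in $dns.
#   $1 = event script, $2 = address of the WAN DNS server
synthesize_bound() {
    dns="$2" "$1" bound
}
```

Passing the working directory and hook path as arguments, rather than hard-coding /tmp/unbound and /tmp/udhcpc, keeps the sketch runnable on a workstation; on the router the calls would use the paths named in the text.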
udhcpc-script.sh passes the DVRR’s LAN address (i.e., the address of unbound running on the DVRR) to /sbin/rc along with the “bound” argument. This causes DNSmasq running on the DVRR to pass the address of the DVRR router, and hence unbound, as the address of the DNS server to any DHCP clients that request an IP address on the LAN, which is what we want. udhcpc-script.sh then calls dnssec-trigger-control, passing the “submit” verb and the address of the WAN’s DNS server, to try to forward DNS requests from the LAN to the WAN’s DNS server. What happens next depends on the capabilities of that server:

- In most cases we’ve seen so far, the WAN’s DNS server (e.g., DNSmasq running on the ISP’s gateway) does not support validation (it is neither validating nor security aware), and so dnssec-trigger tells unbound not to use this DNS server as a forwarder. In such cases, the DVRR unbound goes directly to the root authorities.
- If the WAN’s DNS server is validating, the DVRR unbound will forward queries to it and let it do validation as well (i.e., unbound will use the AD bit in DNS responses from the forwarder). This is probably the “best case” for performance, and is in fact the case in Shinkuro’s office because we have set up the WAN’s DHCP server to publish validating DNS resolvers.
- If the WAN’s DNS server is DNSSEC security aware but not validating, the DVRR unbound will forward queries to it for recursive resolution but will do validation itself (i.e., unbound will set the CD bit in queries to the resolver to get all the DNSSEC data and will do the validation using that).

Finally, on-mount-opt.sh runs /opt/sbin/dnssec-cron-job.sh once directly, before its first scheduled cron run, to try to start validation as soon as possible. Once on-mount-opt.sh has finished running, the DVRR software, consisting of unbound, dnssec-triggerd and lighttpd, is set up and running.

Using a Static address on the WAN

If the router is set up to use a static WAN address, things are slightly different.
on-mount-opt.sh also runs the udhcpc-script.sh script and passes “bound” as the argument, but in this case passes any DNS forwarder addresses designated for the static configuration as the value of the “dns” environment variable. udhcpc-script.sh again passes the DVRR’s LAN address (i.e., the address of unbound running on the DVRR) to /sbin/rc along with the “bound” argument. This causes DNSmasq running on the DVRR to pass the address of the DVRR router, and hence unbound, as the address of the DNS server to any DHCP clients that request an IP address on the LAN, which is what we want. udhcpc-script.sh then calls dnssec-trigger-control, passing the “submit” verb and the forwarder addresses configured for the WAN as given by the static DNS server configuration.

When we do static configuration in the Shinkuro office, we explicitly configure the static DNS servers as validating resolvers. In this case, the DVRR unbound will forward queries to one of these validating resolvers and let it do resolution and validation. As described above, if a statically configured WAN DNS server does not support validation (being neither validating nor security aware), dnssec-trigger tells unbound not to use this DNS server as a forwarder, and unbound instead goes directly to the root authorities. If a statically configured WAN DNS server is DNSSEC security aware but not validating, the DVRR unbound will forward queries to it for recursive resolution but will do validation itself (i.e., unbound will set the CD bit in queries to the resolver to get all the DNSSEC data and will do the validation using that).

As above, on-mount-opt.sh runs /opt/sbin/dnssec-cron-job.sh once directly, before its first scheduled cron run, to try to start validation as soon as possible.

/opt/sbin/udhcpc-script.sh

The udhcpc-script.sh script is run by the udhcpc process whenever a DHCP event occurs.
Udhcpc calls the script with a single argument that has one of the following values:

- deconfig – when udhcpc starts and when a DHCP lease is lost
- bound – when a DHCP lease is granted
- renew – when a DHCP lease is renewed
- nak – when udhcpc receives a nak from the DHCP server

When a DHCP lease is in effect, udhcpc also passes the DNS server configuration identified in the lease in the “dns” environment variable. The udhcpc-script.sh script remembers the original value of the “dns” environment variable and changes its value to the address of the local unbound DNS server, which it gets from the nvram “lan_ipaddr” variable. This is the IP address of the router on the LAN, typically something like 192.168.5.1. The script then invokes the /sbin/rc program, passing the single argument that was originally passed to it and the new value for the “dns” environment variable. Finally, the script sets the “dns” environment variable back to its original value and, if the value of the input argument was “bound”, submits the DNS servers provided by DHCP to dnssec-trigger using the dnssec-trigger-control program and the submit verb. Otherwise, if the value of the input argument was not “bound”, the script calls dnssec-trigger-control with the reprobe verb to tell dnssec-trigger to reprobe the DNS configuration. The udhcpc-script.sh script writes a log file, /var/log/udhcpc.log, that traces its activity.

/opt/sbin/dnssec-cron-job.sh

The dnssec-cron-job.sh script runs every 5 minutes as a cron job. It is also run at startup by on-mount-opt.sh. dnssec-cron-job.sh runs dnssec-trigger-control with the status verb to obtain a list of resolvers and forwarders that dnssec-triggerd is using. If any of them is marked “OK”, then dnssec-cron-job.sh checks to see if unbound is currently not validating, and if so, turns validation on.
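The decision the cron job makes, including the opposite case described next, can be sketched as below. This is an illustration rather than the actual dnssec-cron-job.sh: the function names are hypothetical, and the only property of the status output that the sketch relies on is the assumption that a usable resolver is reported on a line ending in "OK". The flag file corresponds to /tmp/unbound/validationenabled from Table 2, which exists only while validation is enabled.

```shell
#!/bin/sh
# Sketch only; function names are hypothetical.

# Succeeds if any line of the status text on stdin is marked OK
# (assumed format: "<resolver description>: OK").
any_resolver_ok() {
    grep -q ': *OK *$'
}

# Print the action needed: "enable", "disable" or "none".
#   $1    = flag file that exists only while validation is enabled
#   stdin = output of "dnssec-trigger-control status"
validation_action() {
    if any_resolver_ok; then
        if [ -e "$1" ]; then echo none; else echo enable; fi
    else
        if [ -e "$1" ]; then echo disable; else echo none; fi
    fi
}
```

On the router this might be driven by piping the output of dnssec-trigger-control status into validation_action /tmp/unbound/validationenabled, acting on the printed result, and then running dnssec-trigger-control reprobe. Checking the flag file first means the validation state is only touched when it actually has to change.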
Conversely, if none of the reported resolvers and forwarders is marked “OK”, dnssec-cron-job.sh checks to see if unbound is currently validating, and if so, turns validation off. dnssec-cron-job.sh then runs dnssec-trigger-control with the reprobe verb so that a new status will be available the next time it is run.

/www/myPage13.sh and /www/cgi-bin/query_functions.cgi

The DD-WRT distribution has an optional myPages facility that allows augmenting the DD-WRT web pages. Unfortunately, the httpd web server would not support cgi-bin, so in order both to provide status for the unbound DNS server and dnssec-triggerd and to allow the user to control these features, we had to add lighttpd to implement cgi-bin. The file /www/myPage13.sh is a script that writes html and javascript when that page is invoked. The generated page invokes cgi-bin functions in the query_functions.cgi script to manually enable and disable DNSSEC validation and to manually force dnssec-triggerd to reprobe the DNS servers. The query_functions.cgi script is run by lighttpd; running it is lighttpd’s only function. Figure 3 shows a screenshot of the DNS control and status page.

Figure 3 - Screenshot of DNS Control and Status Page

DD-WRT Operational Considerations

Files that are useful to review during operation of the router are listed in Table 2. These files exist on the volatile read-write file system in order to allow for configuration and dynamic reconfiguration of the system within DD-WRT’s primarily read-only file system environment.

Table 2 – Configuration files on the volatile read-write file system

| File name | Description |
| --- | --- |
| /tmp/cron.d/cron-jobs | Written by on-mount-opt.sh. Specifies that /opt/sbin/dnssec-cron-job.sh is to run every 5 minutes. |
| /tmp/udhcpc | A symbolic link overwritten by on-mount-opt.sh to point to /opt/sbin/udhcpc-script.sh instead of /sbin/rc. |
| /tmp/unbound/root.key | root.key is the auto-trust trust anchor file. It is initially written by unbound-anchor, which is run by on-mount-opt.sh at startup. root.key is subsequently updated daily by unbound for as long as the router continues to run. |
| /tmp/unbound/unbound-mod.conf | /etc/unbound.conf includes this file to set unbound’s module-config, which determines the state of validation. This file is initially written by on-mount-opt.sh to disable validation. It is subsequently updated by dnssec-cron-job.sh to enable or disable validation, and similarly by code associated with the myPage13.sh user page. |
| /tmp/unbound/unbound_forwards_block.conf | /etc/unbound.conf includes this file to set forward-zone. This file is written by on-mount-opt.sh at startup from the nvram unbound_forwards variable. The user must set the nvram unbound_forwards variable from a console logged into the router, using the following syntax: nvram set unbound_forwards="domain/ipaddress domain/ipaddress". For example: nvram set unbound_forwards="shinkuro.com/12.168.168.15" |
| /tmp/unbound/validationenabled | This file is written whenever validation is enabled and deleted when validation is disabled. It serves as a flag so that the validation state is not changed unnecessarily. |
| /tmp/www/* | These files are copied from /www/* by on-mount-opt.sh at startup. |
| /var/log/access.log | Logs accesses to lighttpd, which serves the DNS page user interface. |
| /var/log/dnssec-cron-job.log | Logs executions of the dnssec-cron-job.sh script every 5 minutes. |
| /var/log/lighttpd.log | Logs runtime output of lighttpd. |
| /var/log/messages | The DD-WRT syslog output. |
| /var/log/udhcpc.log | Logs executions of the udhcpc-script.sh script. |
| /var/unbound.log | Logs runtime output of unbound. |

Obtaining and Flashing a NETGEAR WNR3500L with DVRR Software

The WNR3500L was available from Amazon for $65 U.S. at the time of this writing. The software flash file is available on request from the authors.
A WNR3500L with the factory software loaded must first be flashed with a special file to allow DD-WRT format files to be loaded. This file, which must be flashed only once in order to install a DD-WRT compatible flash loader, is downloadable from this link.

Building the NETGEAR DD-WRT DVRR Software

This is an involved process that we will briefly describe here. We do this work on Ubuntu workstations running in VirtualBox VMs. First, you must obtain a toolchain suitable for the target platform. We do this using Optware, using the instructions given in this link. You need to build the ldns, unbound and dnssec-trigger packages from the Optware repository (along with dependencies such as openssl). Once these package builds are in hand, you need to obtain the DD-WRT firmware modification kit from this link. Finally, you need to obtain a DD-WRT firmware image from this link.

We have written some bash scripts that automate the rest of the process; you may obtain them from one of the authors. You’ll also need a folder containing the scripts described above, again available from one of the authors. The scripts are step-1-copy-bits-to-opt and (step 2) copy-bits-to-firmware.sh. Basically, the step 1 script copies files from the ldns, unbound and dnssec-trigger build folders built using Optware to a folder structure provided by us. The step 2 script unpacks the DD-WRT firmware image you obtained above, copies the files from the folder we provide and the ldns, unbound, dnssec-trigger and other binaries that you built in Optware into the right places in the unpacked DD-WRT image, and then packages it all back up into a new firmware image.

Authors

Olafur Gudmundsson – Olafur at Shinkuro dot com
Bob Novas – Bob at Shinkuro dot com