Validating Recursive Resolver Appliance
This paper describes how to configure and use a Small Office/Home Office (SOHO) class wireless router
as a DNSSEC Validating Recursive Resolver (DVRR). In most SOHO situations, DNSmasq, the DNS
forwarder running on the ISP-provided gateway, serves as the DNS resolver for computers on the
SOHO’s LAN. This paper suggests that DNSmasq is not as good a solution as a DVRR for DNS resolution
and shows how to build a DVRR that provides better throughput than DNSmasq and that additionally
provides cryptographic validation of DNS lookups using the Domain Name System Security Extensions
(DNSSEC). The ultimate goal is for this improved resolver to replace DNSmasq on ISP-provided
equipment. Currently this is not possible, and so an additional inexpensive SOHO class wireless router
must be acquired to host the DVRR. Once configured as described in this paper, the DVRR runs unattended
on the wireless router and provides validating DNS resolution to other computers on the LAN. This paper
describes the operation of the device and explains how to build, install and configure the software
necessary for operation of the DVRR.
Motivation
The goal of this project is to create a SOHO-class DNS resolver solution that provides DNS resolution
with DNSSEC validation when possible and, without requiring any human intervention, falls back to DNS
resolution without validation when validation is not possible. The problem being solved is that
when conditions don’t support DNSSEC validation, the DNS resolver configuration must be changed to
disable validation or all resolution will fail. Conversely, when DNSSEC validation is disabled and
conditions change such that validation would be possible, the DNS resolver configuration should be
changed to enable validation, or validation won’t be accomplished. Roughly speaking, we want the
system to correctly and automatically handle 99% of the situations where the DNSSEC validation setting
of the DNS resolver (i.e., validation enabled or disabled) must change. The primary goal of this project is to
provide tools that detect whether DNSSEC validation is possible, and that react to changes in the
environment to correctly set the state of validation on the DNS resolver. A secondary goal is to
demonstrate that running DNS resolution with a DNSSEC validator in embedded environments is possible
and in fact superior to previous methods.
Most home networks have an inexpensive low power gateway device provided by the ISP. By adding
DNSSEC support to these devices we can provide DNSSEC validation to the whole home network without
having to touch any of the other devices on the network and without having to depend on the ISP. These
same devices or more expensive ones are used in small offices as well. What these devices have in
common is that there is no need to manage them as they are simply “plug-and-play”. Many users do
minimal configuration on the devices when they are set up, such as changing the wireless name,
selecting a security key, changing the password, etc. While it is possible that the user could detect
DNSSEC validation support at the same time and enable DNSSEC validation by the click of a button, the
fact remains that the capabilities of the network change over time. Consequently, we want the device to
continually detect and condition itself to enable or disable validation as required by the current
environment. As mentioned above, we cannot yet install the DVRR on the ISP-provided gateway, so we
used low cost routers with embedded Linux operating systems to host the DVRR software.
Problems encountered
We quickly realized that simply running a validating recursive resolver on a router device is problematic.
Power interruption, DHCP re-lease, and simply the passage of time can interrupt the validator’s
function. Another challenge is acquiring the root trust anchor that validation requires to bootstrap
trust. The worst behavior that a validating recursive resolver can exhibit is to refuse to answer queries
because validation failed and, furthermore, to remain stuck in that situation forever. Because a router is
packaged at a factory and unpackaged by a customer at some indeterminate later time, it is likely
that time-sensitive DNSSEC information has expired by then. Thus the router needs to start up by acquiring the
root trust anchor in a secure manner.
Embedded systems from different manufacturers have many variations, including choice of CPU chipset,
with and without a clock battery backup, with and without writeable file systems, etc. For this reason we
selected two different embedded system Linux distributions – DD-WRT and OpenWRT – that offer two
very different operating environments. If we can get DNSSEC validation to work in both environments, we
hope that DNSSEC validation will make its way to other embedded systems. We also chose two different
routers to test with: a NETGEAR WNR3500L and a Buffalo WZR-HP-G300NH2.
DD-WRT and NETGEAR WNR3500L
DD-WRT is a “minimalist” Linux-based distribution with a small non-volatile read-only file system. End-user configuration changes are stored in non-volatile random access memory (“nvram”) and written to
temporary files in a volatile file system that are lost when the device reboots. The challenge here is to
get our tools to operate correctly in this environment by reading or writing configuration information in
the right places and creating links from the non-volatile file system to these temporary files when
required.
We run the DD-WRT firmware on a NETGEAR WNR3500L router. DD-WRT employs a “squashfs” file system
that compresses the disk image using LZMA compression, which allows the Linux OS
and additional software to fit in the limited flash memory available.
The NETGEAR WNR3500L router is based on the Broadcom 4716 chip, which contains a 453MHz MIPS
architecture CPU. This router has an 8MB nonvolatile flash memory and a 64MB volatile random access
memory. This router provides 802.11n wireless access and a 4 port gigabit switch.
OpenWRT and Buffalo WZR-HP-G300NH2 Router
As compared to DD-WRT, OpenWRT is more like a regular Linux system in that it has a writeable non-volatile file system. As writes to this file system are slow, log files and other frequently written files are
located on a separate volatile file system that is mapped to the volatile memory. The challenge here is to
provide configuration that is stable at boot time but does not interfere with device operation until
current time has been established.
We run the OpenWRT firmware on a Buffalo WZR-HP-G300NH2 router. This router has significantly
more flash memory than the NETGEAR router and so provides more room for OpenWRT to fit.
The Buffalo router is based on the Atheros AR7242 chip, which is a 400MHz MIPS architecture CPU. This
router has a 32MB nonvolatile flash memory and a 64MB volatile random access memory. This router
provides 802.11n and a 4 port gigabit switch.
Network Configuration
Figure 1 shows a typical configuration of a DVRR in a SOHO network. The ISP provided gateway
establishes a NAT’d LAN labeled “Home LAN” to which the DNSSEC router is connected. The DNSSEC
router creates a second NAT’d LAN, labeled “Home LAN 2” on a different subnet. DNSmasq runs on both
routers to provide DHCP to the respective subnet. Unbound, running on the DNSSEC router, provides
validating DNS resolution to both subnets.
[Figure 1 diagram. Labels recoverable from the diagram: ISP provided cable gateway; DNSSEC Validating Recursive Resolver; Gateway Mode; Home LAN; Home LAN 2; LAN 1 to LAN 5; CAT5; Cable; To Internet; DNSmasq (DHCP); DD-WRT + DNSmasq (DHCP) + unbound (DNS) + dnssec-trigger.]
Figure 1 – Typical SOHO Network configuration with a DVRR
The advantage of using unbound instead of DNSmasq for DNS resolution is shown in Table 1. Unbound’s
performance equals or exceeds that of DNSmasq at recursively resolving queries. Unbound is slower
when performing DNSSEC validation. However, unbound caches validated answers, so when the answer
to a query is in cache it provides validated resolution faster than DNSmasq provides non-validated
resolution. We judge this level of validation performance (i.e., 200 q/s) adequate
for use in a SOHO environment.
Table 1 – Typical Performance DNSMasq versus Unbound
Resolution | DNSMasq | Unbound
Recursive resolution, cold cache | 700 q/s | 700 q/s
Recursive resolution, warm cache | 700 q/s | 4,000 q/s
DNSSEC validation, cold cache | X | 200 q/s
Concept of Operation for NETGEAR WNR3500 U/L with DD-WRT
The NETGEAR DVRR runs on DD-WRT Linux based firmware. DD-WRT primarily runs on a read-only file
system called squashfs. Squashfs is a compressed read-only file system that allows a rather complete
Linux distribution to fit on an 8MB flash memory. Certain files that must be writable are symbolically
linked from the read-only squashfs to a read-write file system mounted on RAM. DD-WRT runtime
configuration depends on name/value pairs written to non-volatile RAM (NVRAM) and accessed by
scripts using the nvram shell command.
Figure 2 depicts the DVRR in the DD-WRT environment. This figure shows only the files that are
important for operation of the DVRR. Starting at the upper left hand corner of the figure, on-mount-opt.sh is the first DVRR file invoked during boot. This file runs because it is symbolically linked from
/etc/config/start-dns.startup. Files with the .startup extension are run during boot by DD-WRT.
Continuing along the top of the figure, udhcpc is a DD-WRT process that monitors DHCP events and
invokes the file /tmp/udhcpc when a DHCP event occurs. The process marked “*/5” is a cron job that
runs every 5 minutes and invokes dnssec-cron-job.sh. And finally, the user can invoke javascript to
control DVRR from cgi-bin web pages served by lighttpd. These processes are described in more detail
below.
[Figure 2 diagram. Labels recoverable from the diagram: the triggers On boot, udhcpc, */5 and user; the programs on-mount-opt.sh, udhcpc-script.sh, dnssec-cron-job.sh, dnssec-trigger-control, dnssec-triggerd, unbound-anchor, /sbin/rc, lighttpd, myPageN.sh, query_functions.cgi and the unbound recursive resolver; and configuration and log files under /tmp/unbound, /opt/etc, /opt/www and /var/log.]
Figure 2 – DNSSEC Validating Router DD-WRT Operating Environment
unbound recursive resolver
The unbound recursive resolver, in the center of Figure 2, is the focal point of DVRR operation. All of the
remaining software exists solely to configure unbound to allow unbound to do resolution with DNSSEC
validation. Depending on circumstances, unbound runs in one of the following modes:




• In the best case, unbound forwards queries to locally available resolvers for resolution, if they
are at least security aware and so able to provide the data that unbound needs to perform
DNSSEC validation on their answers.
• Unbound may run as a local recursive resolver, querying the DNS root authorities directly, if no
suitable forwarders are available. In this situation, unbound can still do DNSSEC validation.
• Unbound may forward queries to certain designated resolvers using port 80 or port 443 if no
other forwarders are available and these ports can be reached. The port 80 and port 443
resolvers are by default provided by nlnetlabs.nl. Unbound can do DNSSEC validation in this
situation as well.
• Unbound can do recursive resolution or forward to a local resolver with no DNSSEC validation if
none of the above validating scenarios work. In this case, the benefits of validated DNS
resolution are lost.
To reiterate, the purpose of the remainder of the DVRR software is to configure unbound to run in the
best mode possible, hopefully one of the first 3 modes described.
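The order of preference among these modes can be sketched as a small shell function. This is only an illustration: the name choose_mode, its three yes/no inputs and the mode labels it prints are hypothetical, and in the DVRR the probing that would produce such inputs is done by dnssec-trigger, not by a script like this one.

```shell
#!/bin/sh
# Illustrative sketch only: choose_mode, its arguments and the labels it
# prints are hypothetical names, not part of unbound or dnssec-trigger.
# Each argument is "yes" or "no":
#   $1  a local resolver is at least DNSSEC security aware
#   $2  the root authorities can be queried directly
#   $3  a designated resolver answers on port 80 or port 443
choose_mode() {
    if [ "$1" = yes ]; then
        echo forward-local         # validate using answers from local resolvers
    elif [ "$2" = yes ]; then
        echo recurse-from-roots    # unbound recurses itself and validates
    elif [ "$3" = yes ]; then
        echo forward-port-80-443   # validate via the designated resolvers
    else
        echo no-validation         # last resort: resolve without DNSSEC
    fi
}

# Example: no usable local forwarder, but the roots are reachable.
choose_mode no yes no
```

Only the last branch gives up validation, which is why the remaining software tries to keep unbound in one of the first three.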
/opt/sbin/on-mount-opt.sh
On-mount-opt.sh (actually, /opt/sbin/on-mount-opt.sh) checks for a number of values in nvram and
initializes them if they do not exist. It then initializes the volatile file system for DVRR
operation by creating folders and files on the writable /tmp file system. It creates the initial
/tmp/unbound/unbound-mod.conf file specifying that unbound only do iteration, not validation, of
queries. It sets up the forwarders for unbound to use in /tmp/unbound/unbound_forwards_block.conf.
It sets up or updates the unbound root trust anchor in /tmp/unbound/root.key. It starts unbound,
dnssec-triggerd and lighttpd. It intercepts DHCP events from udhcpc by replacing the symbolic link
/tmp/udhcpc with a symbolic link to /opt/sbin/udhcpc-script.sh. udhcpc-script.sh is described below.
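Two of these startup steps, writing the initial module configuration and redirecting the udhcpc hook, might look roughly like the sketch below. It is not the authors' script. The helper name init_volatile and its root-prefix argument are hypothetical (the prefix lets the sketch run in a scratch directory), and the sketch assumes that the include in unbound.conf sits inside the server: clause, so the included file needs only the module-config line.

```shell
#!/bin/sh
# Illustrative sketch only: init_volatile is a hypothetical helper, not the
# authors' on-mount-opt.sh. $1 is a root prefix so that the sketch can be
# tried in a scratch directory instead of on a router.
init_volatile() {
    root=$1
    mkdir -p "$root/tmp/unbound"

    # Start with iteration only; validation is switched on later.
    # Assumption: unbound.conf includes this file inside its server: clause.
    printf 'module-config: "iterator"\n' > "$root/tmp/unbound/unbound-mod.conf"

    # Redirect DHCP events to the DVRR script by replacing the symbolic link.
    ln -sf /opt/sbin/udhcpc-script.sh "$root/tmp/udhcpc"
}

# Example run in a throwaway directory.
scratch=$(mktemp -d)
init_volatile "$scratch"
cat "$scratch/tmp/unbound/unbound-mod.conf"
rm -rf "$scratch"
```

On a router the prefix would be empty, so the files land in /tmp as the text describes.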
Using DHCP obtained address on the WAN
If the router is using DHCP to request its WAN address, on-mount-opt.sh synthesizes a DHCP event
(because the first DHCP event occurred before on-mount-opt.sh could hook it) by running udhcpc-script.sh, passing “bound” as the argument and the address of the WAN DNS server (e.g.,
DNSmasq) in the “dns” environment variable. udhcpc-script.sh passes the DVRR’s LAN address (i.e., the
address of unbound running on the DVRR) to /sbin/rc along with the “bound” argument. This causes
DNSmasq running on the DVRR to pass the address of the DVRR router, and hence unbound, as the
address of the DNS server to any DHCP clients that request an IP address on the LAN, which is what we want.
udhcpc-script.sh then calls dnssec-trigger-control, passing the “submit” verb and the address of the
WAN’s DNS server, to try to forward DNS requests from the LAN to the WAN’s DNS server. In most cases
we’ve seen currently, the WAN’s DNS server (e.g., DNSmasq running on the ISP’s gateway) does not
support verification (it is neither validating nor security aware), and so dnssec-trigger tells unbound not
to use this DNS server as a forwarder. In such cases, the DVRR unbound goes directly to root authorities.
If the WAN’s DNS server is validating (this is probably the “best case” for performance, and is in fact
the case in Shinkuro’s office because we have deliberately set up the WAN’s DHCP server to publish validating
DNS resolvers), the DVRR unbound will forward queries to it and let it do validation as well (i.e.,
unbound will use the AD bit in DNS responses from the forwarder). If the WAN’s DNS server is DNSSEC
security aware but not validating, the DVRR unbound will forward queries to it for recursive resolution
but will do validation itself (i.e., unbound will set the CD bit in queries to the resolver to get all the
DNSSEC data and will do the validation using that).
Finally, on-mount-opt.sh runs /opt/sbin/dnssec-cron-job.sh before it runs as a cron job to try to start
validation as soon as possible.
Once on-mount-opt.sh has finished running, the DVRR software, consisting of unbound, dnssec-triggerd
and lighttpd, is set up and running.
Using a Static address on the WAN
If the router is set up to use a static WAN address, things are slightly different. on-mount-opt.sh also runs
the udhcpc-script.sh script and passes “bound” as the argument, but in this case passes any DNS
forwarder addresses designated for the static configuration as the value of the “dns” environment
variable. udhcpc-script.sh again passes the DVRR’s LAN address (i.e., the address of unbound
running on the DVRR) to /sbin/rc along with the “bound” argument. This causes DNSmasq running on
the DVRR to pass the address of the DVRR router, and hence unbound, as the address of the DNS server
to any DHCP clients that request an IP address on the LAN, which is what we want. udhcpc-script.sh then calls
dnssec-trigger-control, passing the “submit” verb and the forwarder addresses configured for the WAN
as given by the static DNS server configuration. When we do static configuration in the Shinkuro office,
we explicitly configure the static DNS servers as validating resolvers. In this case, the DVRR unbound will
forward queries to one of these validating resolvers and let it do resolution and validation. As described
above, if a statically configured WAN DNS server does not support verification (being neither validating
nor security aware), dnssec-trigger tells unbound not to use this DNS server as a forwarder, and unbound instead
goes directly to root authorities. If a statically configured WAN DNS server is DNSSEC security aware
but not validating, the DVRR unbound will forward queries for recursive resolution but will do validation
itself (i.e., unbound will set the CD bit in queries to the resolver to get all the DNSSEC data and will do
the validation using that).
As above, on-mount-opt.sh runs /opt/sbin/dnssec-cron-job.sh before it runs as a cron job to try to start
validation as soon as possible.
/opt/sbin/udhcpc-script.sh
The udhcpc-script.sh script is run by the udhcpc process whenever a DHCP event occurs. Udhcpc calls
the script with a single argument that has one of the following values:


• deconfig – when udhcpc starts and when a DHCP lease is lost
• bound – when a DHCP lease is granted
• renew – when a DHCP lease is renewed
• nak – when udhcpc receives a nak from the DHCP server
In the event that a DHCP lease is in effect, udhcpc also passes the DNS server configuration identified in
the lease in the “dns” environment variable.
The udhcpc-script.sh script remembers the original value of the “dns” environment variable and changes
the value of the “dns” environment variable to the local unbound dns server, which it gets from the
nvram “lan_ipaddr” variable. This is the IP Address of the router on the LAN, typically something like
192.168.5.1. The udhcpc-script.sh file then invokes the /sbin/rc program, passing the single argument
that was originally passed to it and the new value for the “dns” environment variable. Finally, the script
sets the “dns” environment variable back to its original value and, if the value of the input argument was
“bound”, submits the DNS Servers provided by DHCP to dnssec-trigger using the dnssec-trigger-control
program and the submit verb. Otherwise, if the value of the input argument was not “bound”, the script
calls dnssec-trigger-control with the reprobe verb to tell dnssec-trigger to reprobe the DNS
configuration.
The udhcpc-script.sh file creates a log file in /var/log/udhcpc.log that traces its activity.
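The event handling described in this section can be sketched as follows. This is a simplified illustration, not the authors' udhcpc-script.sh: the function name handle_dhcp_event and the RUN dry-run switch are hypothetical, the LAN address is passed in as an argument rather than read from nvram, and logging is omitted. Instead of overwriting and restoring “dns”, the sketch scopes the replacement value to the /sbin/rc call with env, which has the same effect.

```shell
#!/bin/sh
# Illustrative sketch only, not the authors' udhcpc-script.sh.
# RUN=echo prints the commands instead of executing them (dry run);
# set RUN to the empty string to really execute them on a router.
RUN=${RUN-echo}

# $1 is the udhcpc event (deconfig, bound, renew or nak).
# $2 is the router's LAN address, which the real script reads from the
#    nvram lan_ipaddr variable.
# The DHCP-supplied DNS servers arrive in the "dns" variable.
handle_dhcp_event() {
    event=$1
    lan_ip=$2

    # Hand the event to /sbin/rc with unbound's address in place of the
    # DHCP-supplied servers. Using env leaves our own $dns untouched.
    $RUN env dns="$lan_ip" /sbin/rc "$event"

    if [ "$event" = bound ]; then
        # Let dnssec-trigger probe the DHCP-supplied servers.
        # $dns is deliberately unquoted so each address is its own argument.
        $RUN dnssec-trigger-control submit $dns
    else
        $RUN dnssec-trigger-control reprobe
    fi
}

# Example dry run with documentation-range addresses.
dns="192.0.2.53 192.0.2.54"
handle_dhcp_event bound 192.168.5.1
```

The dry-run default makes it possible to inspect the command sequence on a workstation before trying anything on the router.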
/opt/sbin/dnssec-cron-job.sh
The dnssec-cron-job.sh script runs every 5 minutes as a cron job. It is also run at startup by
on-mount-opt.sh. dnssec-cron-job.sh runs dnssec-trigger-control with the status verb to obtain a list of
resolvers and forwarders that dnssec-triggerd is using. If any of them is marked “OK”, then dnssec-cron-job.sh checks to see if unbound is currently not validating, and if so, turns validation on. Conversely, if none
of the reported resolvers and forwarders is marked “OK”, dnssec-cron-job.sh checks to see if unbound is
currently validating, and if so, turns validation off. dnssec-cron-job.sh then runs dnssec-trigger-control
with the reprobe verb so that a new status will be available the next time it is run.
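The decision made on each run can be sketched as below. This is an illustration under stated assumptions, not the authors' dnssec-cron-job.sh: the function names decide_action and module_config_line are hypothetical, the sample status text is made up (the sketch relies only on the word OK appearing in it), and the module-config values shown are the usual unbound settings for validation on and off.

```shell
#!/bin/sh
# Illustrative sketch only, not the authors' dnssec-cron-job.sh.

# decide_action reads the text printed by "dnssec-trigger-control status" on
# standard input. $1 is "yes" if unbound is currently validating, else "no".
# It prints enable, disable or keep.
decide_action() {
    validating=$1
    if grep -q 'OK'; then
        usable=yes      # at least one resolver or forwarder is marked OK
    else
        usable=no
    fi
    if [ "$usable" = yes ] && [ "$validating" = no ]; then
        echo enable
    elif [ "$usable" = no ] && [ "$validating" = yes ]; then
        echo disable
    else
        echo keep       # the current state already matches the probe result
    fi
}

# module_config_line prints the unbound.conf line for the requested state.
module_config_line() {
    if [ "$1" = enable ]; then
        echo 'module-config: "validator iterator"'
    else
        echo 'module-config: "iterator"'
    fi
}

# Example with made-up status text; only the word OK matters to the sketch.
action=$(printf 'cache 192.0.2.53: error\nauthority: OK\n' | decide_action no)
echo "$action"
[ "$action" = keep ] || module_config_line "$action"
```

On the router, the current state could be taken from the presence of the /tmp/unbound/validationenabled flag file, and the printed line would be written to /tmp/unbound/unbound-mod.conf. How unbound is then made to pick up the change is not shown here.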
/www/myPage13.sh and /www/cgi-bin/query_functions.cgi
The DD-WRT distribution has an optional myPages distribution that allows augmenting the DD-WRT web
page. Unfortunately, the httpd web server would not support cgi-bin, so in order to both provide status
for the unbound DNS server and the dnssec-triggerd and to allow the user to control these features, we
had to add lighttpd to implement cgi-bin. The file /www/myPage13.sh is a script that writes html and
javascript when that page is invoked. The myPage13.sh file invokes cgi-bin script functions on the
query_functions.cgi script to manually enable and disable DNSSEC validation and to manually force
dnssec-triggerd to reprobe the DNS servers. The query_functions.cgi script is run by lighttpd, and is that
process’s only function. Figure 3 shows a screenshot of the DNS control and status page.
Figure 3 - Screenshot of DNS Control and Status Page
DD-WRT Operational Considerations
Files that are useful to review during operation of the router are listed in Table 2. These files exist on the
volatile read-write file system in order to allow for configuration and dynamic reconfiguration of the
system within DD-WRT’s primarily read-only file system environment.
Table 2 – Configuration files on the volatile read-write file system
File name | Description
/tmp/cron.d/cron-jobs | Written by on-mount-opt.sh. Specifies that /opt/sbin/dnssec-cron-job.sh is to run every 5 minutes.
/tmp/udhcpc | A symbolic link overwritten by on-mount-opt.sh to point to /opt/sbin/udhcpc-script.sh instead of /sbin/rc.
/tmp/unbound/root.key | The auto-trust trust anchor file. It is initially written by unbound-anchor, which is run by on-mount-opt.sh at startup. It is subsequently updated daily by unbound for as long as the router continues to run.
/tmp/unbound/unbound-mod.conf | /etc/unbound.conf includes this file to set unbound’s module-config, which determines the state of validation. This file is initially written by on-mount-opt.sh to disable validation. It is subsequently updated by dnssec-cron-job.sh to enable or disable validation, and similarly by code associated with the myPage13.sh user page.
/tmp/unbound/unbound_forwards_block.conf | /etc/unbound.conf includes this file to set forward-zone. This file is written by on-mount-opt.sh at startup from the nvram unbound_forwards variable. The user must set the nvram unbound_forwards variable from a console logged into the router, using the following syntax: nvram set unbound_forwards="domain/ipaddress domain/ipaddress". For example: nvram set unbound_forwards="shinkuro.com/12.168.168.15"
/tmp/unbound/validationenabled | This file is written whenever validation is enabled and deleted when validation is disabled. It serves as a flag so that the validation state is not changed unnecessarily.
/tmp/www/* | These files are copied from /www/* by on-mount-opt.sh at startup.
/var/log/access.log | Logs accesses to lighttpd, which serves the DNS page user interface.
/var/log/dnssec-cron-job.log | Logs executions of the dnssec-cron-job.sh script every 5 minutes.
/var/log/lighttpd.log | Logs runtime output of lighttpd.
/var/log/messages | The DD-WRT syslog output.
/var/log/udhcpc.log | Logs executions of the udhcpc-script.sh script.
/var/unbound.log | Logs runtime output of unbound.
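As an illustration of how the nvram unbound_forwards value listed in Table 2 could be turned into forward-zone entries, consider the sketch below. The helper name forwards_to_conf is hypothetical and this is not the authors' code; it shows only the mapping from the “domain/ipaddress” pairs to unbound's forward-zone syntax.

```shell
#!/bin/sh
# Illustrative sketch only: forwards_to_conf is a hypothetical helper, not
# the authors' code. $1 is a value in the nvram unbound_forwards format:
#   "domain/ipaddress domain/ipaddress"
forwards_to_conf() {
    for pair in $1; do
        domain=${pair%%/*}     # text before the first slash
        addr=${pair#*/}        # text after the first slash
        printf 'forward-zone:\n'
        printf '    name: "%s"\n' "$domain"
        printf '    forward-addr: %s\n' "$addr"
    done
}

# Example using the value given in Table 2.
forwards_to_conf "shinkuro.com/12.168.168.15"
```

On the router, the argument would come from nvram get unbound_forwards and the output would be redirected to /tmp/unbound/unbound_forwards_block.conf.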
Obtaining and Flashing a NETGEAR WNR3500L with DVRR Software
The WNR3500L was available from Amazon for $65 U.S. at the time of this writing. The software flash
file is available on request from the authors. A WNR3500L with the factory software loaded must first be
flashed with a special file that installs a DD-WRT compatible flash loader and so allows DD-WRT format
files to be loaded. This file needs to be flashed only once and is downloadable from this link.
Building the NETGEAR DD-WRT DVRR Software
This is an involved process that we will briefly describe here. We do this work on Ubuntu workstations
running in VirtualBox VMs. First, you must obtain a toolchain suitable for the target platform. We do
this using Optware, using the instructions given in this link. You need to build the ldns, unbound and
dnssec-trigger packages from the Optware repository (along with dependencies such as openssl). Once
these package builds are in hand, you need to obtain the DD-WRT firmware modification kit from this
link. Finally, you need to obtain a DD-WRT firmware image from this link.
We have written some bash scripts that automate the rest of the process that you may obtain from one
of the authors. You’ll also need a folder containing the scripts described above, again, available from
one of the authors.
The scripts are step-1-copy-bits-to-opt and (step 2) copy-bits-to-firmware.sh. Basically, the step 1 script
copies files from the ldns, unbound and dnssec-trigger build folders built using Optware to a folder
structure provided by us. The step 2 script unpackages the DD-WRT firmware image you obtained
above, copies the files from the folder we provide and the ldns, unbound, dnssec-trigger and other
binaries that you built in Optware into the right places in the unpacked DD-WRT image, and then
packages it all back up into a new firmware image.
Authors
Olafur Gudmundsson – Olafur at Shinkuro dot com
Bob Novas – Bob at Shinkuro dot com