Debugging DNS Resolution Issues on Linux

DNS resolution seems simple — you call getaddrinfo(), you get an IP back. But when it breaks, the failure can manifest in confusing ways: "it works in curl but not in my app", "sometimes it resolves, sometimes it doesn't", or "docker containers can't reach external services but the host can".

This guide walks through a repeatable debugging workflow I've used across dozens of DNS incidents.

1. Rule Out the Obvious

Before diving deep, check the basics:

# Can you reach the DNS server at all?
ping -c 2 8.8.8.8

# Is port 53 open?
nc -zv 8.8.8.8 53

# What DNS servers is the system configured to use?
resolvectl status

If these fail, your network connectivity or DNS server configuration is the problem — fix that first.

2. Trace Resolution with dig + trace

dig is the single most useful DNS debugging tool. The +trace flag follows the entire resolution chain from root servers down:

dig +trace example.com

# Query a specific nameserver
dig @8.8.8.8 example.com A

# Show only the answer section
dig +short example.com

What to look for in +trace output:

  • Root zone — do you get a response from root servers? If not, your resolver's root hints may be stale.
  • TLD servers — do .com/.org/etc. nameservers respond authoritatively?
  • Authoritative servers — do the domain's nameservers return records?
  • Timeouts at any hop — firewall or connectivity issue at that level.

3. Check the Resolver Stack

Modern Linux has a layered resolver architecture. Understanding which layer is failing saves hours:

# systemd-resolved
resolvectl query example.com
resolvectl statistics

# Check if resolved is running
systemctl status systemd-resolved

# Skip resolved and query directly
/etc/nsswitch.conf controls the lookup order:
  hosts: files dns          # check /etc/hosts first, then DNS
  hosts: files mdns4_minimal [NOTFOUND=return] dns  # mDNS before unicast DNS

Common pitfall: Applications using libcurl may use c-ares (a separate DNS library) instead of the system resolver. This means dig works but your app fails. Test with:

# Force your app to use the system resolver
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libnss_dns.so.2 your_app

4. Inspect /etc/resolv.conf

This file is often managed by systemd-resolved or NetworkManager. Don't edit it directly unless you know the management chain:

ls -la /etc/resolv.conf
# If it's a symlink, follow it
readlink -f /etc/resolv.conf

Typical configurations:

  • /run/systemd/resolve/stub-resolv.conf — points to 127.0.0.53 (systemd-resolved stub)
  • /run/systemd/resolve/resolv.conf — actual upstream DNS servers
  • Manually configured static file

5. Packet-Level Inspection

When nothing else makes sense, capture the packets:

# Capture DNS traffic on loopback (for systemd-resolved)
sudo tcpdump -i lo -n port 53

# Capture on physical interface
sudo tcpdump -i eth0 -n port 53

What to look for:

  • Query sent → no response? Firewall dropping outbound or DNS server not responding.
  • Response with NXDOMAIN? The domain genuinely doesn't exist or you're hitting the wrong nameserver.
  • Response with SERVFAIL? The authoritative server had an internal error.
  • Truncated response (TC flag set)? The response is too large for 512-byte UDP — the resolver should retry over TCP.

6. DNS in Docker Containers

Container DNS deserves its own checklist:

# Inside a container
cat /etc/resolv.conf
# Docker usually sets this to 127.0.0.11 (embedded DNS proxy)

# Check the embedded DNS proxy
docker exec <container> nslookup example.com

# If the embedded proxy fails, try bypassing it
docker run --dns 8.8.8.8 alpine nslookup example.com

# Check Docker's DNS configuration
docker info | grep -i dns

Common Docker DNS issues:

  • Custom networks with dns_search domains that shadow public domains
  • /etc/resolv.conf inside containers being overwritten by the Docker embedded DNS on restart
  • IPv6 DNS queries failing when the container has no IPv6 connectivity (add --sysctl net.ipv6.conf.all.disable_ipv6=1)

Summary Debugging Flow

1. ping/telnet to check basic connectivity
2. dig +trace to isolate the failing layer
3. resolvectl status / systemd-resolved check
4. nsswitch.conf and /etc/resolv.conf inspection
5. tcpdump for packet-level confirmation
6. Container: bypass embedded DNS proxy to isolate

Each layer narrows the search space. Start broad, rule out the obvious, then drill down.