Anycast DNS at small scale: lessons from running our own
April 23, 2026 · lab notes · dns · anycast · networking
Anycast DNS, where the same IP is advertised from multiple points of presence (PoPs) and BGP steers each client to the topologically nearest one, is the operational pattern behind 1.1.1.1, 8.8.8.8, and most major DNS-as-a-service providers. It’s also occasionally the right answer for a small operator.
We ran anycast on three PoPs for about eighteen months. Here’s what that looked like.
The motivation
Three reasons we considered anycast:
- Latency. Our authoritative server was in Frankfurt. Queries from the US East Coast took 100ms+. Single-PoP DNS made our service feel slow on the first request.
- Reliability. A single PoP means a single failure domain. If the Frankfurt VM has a bad day, all of our DNS goes down. With anycast, BGP withdraws the route from a failed PoP and traffic shifts to the remaining ones.
- DDoS absorption. A volumetric attack on a single PoP saturates that PoP. Anycast distributes the load, so an attacker has to overwhelm all PoPs simultaneously.
The cost
Running anycast meant:
- Three PoPs, each with public BGP peering. We rented from a hosting provider that supported BGP customer announcements (most don’t, at small scale).
- An IP block we owned. You can’t anycast someone else’s address space. We had a /24 from a regional registry, which we acquired specifically for this.
- An ASN. Same registry, same project.
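- BGP configuration. Each PoP runs FRR or BIRD, peers with the upstream, and announces our /24 and the more-specific /32 of our anycast IP. Most networks filter IPv4 prefixes longer than /24, so the /24 is the announcement that carries the anycast address globally.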
- Health checks at each PoP that withdraw the BGP announcement when the local DNS server is unhealthy.
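The health-check item is the piece most worth sketching. Below is a minimal Python sketch, not our production script: it probes the local DNS server with an SOA query and uses rise/fall counters so a single dropped probe does not flap the route. The names `build_soa_query`, `probe`, and `AnnounceState`, and the thresholds of 3 failures and 2 successes, are assumptions made for illustration. How the resulting "announce" or "withdraw" action is applied depends on the routing daemon and is deliberately left out.

```python
import socket
import struct
from dataclasses import dataclass
from typing import Optional


def build_soa_query(zone: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet asking for the zone's SOA record."""
    # Header: ID, flags (standard query), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in zone.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 6 = SOA, QCLASS 1 = IN.
    return header + qname + struct.pack("!HH", 6, 1)


def probe(server: str, zone: str, timeout: float = 1.0) -> bool:
    """Return True if the server replies to the SOA query with RCODE 0."""
    query = build_soa_query(zone)
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
    except OSError:  # covers timeouts and unreachable-server errors
        return False
    if len(reply) < 12 or reply[:2] != query[:2]:
        return False  # too short to be a DNS header, or wrong query ID
    flags = struct.unpack("!H", reply[2:4])[0]
    return bool(flags & 0x8000) and (flags & 0x000F) == 0


@dataclass
class AnnounceState:
    """Rise/fall hysteresis; the thresholds here are illustrative."""
    fall: int = 3   # consecutive failed probes before withdrawing
    rise: int = 2   # consecutive good probes before announcing again
    announced: bool = False
    _streak: int = 0

    def update(self, healthy: bool) -> Optional[str]:
        """Feed one probe result; return "announce", "withdraw", or None."""
        if healthy == self.announced:
            self._streak = 0  # probe agrees with current state
            return None
        self._streak += 1
        if self._streak >= (self.rise if healthy else self.fall):
            self.announced = healthy
            self._streak = 0
            return "announce" if healthy else "withdraw"
        return None
```

A loop at each PoP would call `probe("127.0.0.1", zone)` on a fixed interval, pass the result to `update`, and act only when it returns a non-`None` action. The probe interval multiplied by `fall` sets the detection time.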
The setup cost was real. Allocations, paperwork, and getting BGP peering at three providers took us about six weeks of calendar time, maybe forty hours of work.
What we got
Latency dropped meaningfully. North American clients went from 110ms to 18ms (Ashburn PoP). EU clients stayed roughly the same (still routing to Frankfurt). Asian clients improved less than we hoped, because we didn’t have an Asian PoP.
Reliability improved. We had two real incidents during the eighteen months: a hosting provider had a network problem at one PoP, and BGP withdrew us automatically. Other PoPs kept serving. End users saw nothing.
DDoS resilience improved in theory but was never tested. We weren’t a target.
What we’d do differently
A few things we got wrong:
Health check timing. Our initial config took up to 30 seconds to notice an unhealthy PoP and withdraw its announcement (BGP routes carry no TTL; the delay came from failure detection), plus another 30+ seconds for routes to converge globally. That’s about a minute of bad answers during a failure. We tightened detection to 5 seconds with BFD, which shortened the window.
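The arithmetic behind that window is simple enough to write down. This is a rough sketch using only the figures above; it assumes global convergence stays at about 30 seconds after the change, which we did not measure separately, and the function name `worst_case_window` is made up for illustration.

```python
def worst_case_window(detect_s: float, converge_s: float) -> float:
    """Worst-case seconds of bad answers: detection delay plus convergence."""
    return detect_s + converge_s


# Figures from the text; convergence is assumed unchanged at ~30 s.
before = worst_case_window(detect_s=30, converge_s=30)
after = worst_case_window(detect_s=5, converge_s=30)

print(f"before: {before:.0f}s, after: {after:.0f}s")
# prints "before: 60s, after: 35s"
```

Under that assumption, convergence rather than detection becomes the dominant term once detection is tightened.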
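Asymmetric routing surprises. With anycast, the same client can reach different PoPs for different queries. Anycast has no session affinity: routers forward each packet independently, so a route change can deliver a client’s later packets to a different PoP. For single-packet UDP queries that is fine. For TCP it is not: a session that starts at PoP-A and gets rerouted to PoP-B mid-session breaks, because PoP-B has no state for that connection. We worked around it by ensuring all PoPs had identical zone data and that TCP fallback worked correctly at each one, so a retried query succeeds wherever it lands. The first time we saw this in production, we spent an evening confused.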
Zone synchronization. We assumed AXFR/IXFR would be fast. Mostly they are. Occasionally they aren’t — a transfer can take 30+ seconds for a large zone, and during that window one PoP has stale data. We monitor zone serial numbers across PoPs now and alert when they drift.
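A serial-drift check along those lines can be sketched as follows. This is an illustration under assumptions rather than our actual monitor: the PoP labels, the 60-second grace period, and the names `serial_newer`, `lagging_pops`, and `DriftAlert` are all hypothetical, and fetching each PoP’s SOA serial is left to whatever DNS client is at hand. Serials are compared with RFC 1982 serial arithmetic so that a 32-bit wraparound is not mistaken for a rollback, and the grace period keeps an in-progress transfer from paging anyone.

```python
from typing import Dict, List

MOD = 1 << 32    # SOA serials are 32-bit unsigned integers
HALF = 1 << 31


def serial_newer(a: int, b: int) -> bool:
    """True if serial a is newer than b under RFC 1982 serial arithmetic."""
    return a != b and ((a - b) % MOD) < HALF


def lagging_pops(serials: Dict[str, int]) -> List[str]:
    """Return the PoPs whose serial differs from the newest one observed."""
    if not serials:
        return []
    newest = next(iter(serials.values()))
    for s in serials.values():
        if serial_newer(s, newest):
            newest = s
    return sorted(pop for pop, s in serials.items() if s != newest)


class DriftAlert:
    """Report a PoP only after it has lagged for at least grace_s seconds."""

    def __init__(self, grace_s: float = 60.0):
        self.grace_s = grace_s
        self._since: Dict[str, float] = {}  # PoP -> time first seen lagging

    def check(self, serials: Dict[str, int], now: float) -> List[str]:
        lagging = set(lagging_pops(serials))
        # Forget PoPs that have caught up since the last check.
        self._since = {p: t for p, t in self._since.items() if p in lagging}
        for pop in lagging:
            self._since.setdefault(pop, now)
        return sorted(p for p, t in self._since.items()
                      if now - t >= self.grace_s)
```

For example, with `{"pop-a": 2026042301, "pop-b": 2026042301, "pop-c": 2026042300}`, `lagging_pops` returns `["pop-c"]`, and `DriftAlert.check` starts reporting it only once the lag has lasted the full grace period.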
When anycast is worth it
Roughly: when a single PoP’s failure becomes business-critical, and when latency-sensitive clients are geographically distributed. For a single-region service, single-PoP DNS is fine. Add anycast when “DNS is down for some users” stops being acceptable.
For us, the threshold turned out to be around half a million queries per day with a globally distributed user base. Below that, single-PoP plus a good cloud DNS provider as a secondary was sufficient. Above that, we wanted control.
The operational complexity is real but bounded. Once running, anycast-DNS adds maybe one alert per quarter — usually upstream peering changes, occasionally a zone transfer hiccup.
It’s not a starter project. It’s a thing you graduate to.