Tier-1’s break Anycast DNS Zhihao Li, Neil Spring
D-Root: 199.7.91.13
• 111 anycast replicas:
• 19 global (red): advertised without restriction
• 92 local (black): advertised one hop in BGP
Anycast
• Mental model:
• Packets sent to an anycast address travel to the nearest* replica, subject to global/local constraints.
• More replicas should mean lower latency, better distribution, reliability against denial-of-service attacks.
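The mental model can be sketched as a toy catchment computation. This is only an illustration under stated assumptions: the AS names and the line topology are hypothetical, "one hop" is taken to mean the host AS plus its direct neighbors, and selection is by hop count alone, which real BGP policy does not guarantee.

```python
from collections import deque

def catchments(graph, replicas):
    """Map each AS to the replica it reaches under a hop-count-only model.

    graph: dict AS -> list of neighbor ASes (hypothetical topology).
    replicas: list of (name, host_as, scope), scope is "global" or "local".
    """
    best = {}  # AS -> (hops, replica name)
    for name, host, scope in replicas:
        # Assumption: a local replica is heard only one hop from its host AS.
        limit = 1 if scope == "local" else None
        hops = {host: 0}
        queue = deque([host])
        while queue:
            node = queue.popleft()
            if limit is not None and hops[node] >= limit:
                continue  # do not propagate the announcement any further
            for nbr in graph[node]:
                if nbr not in hops:
                    hops[nbr] = hops[node] + 1
                    queue.append(nbr)
        for asn, h in hops.items():
            if asn not in best or (h, name) < best[asn]:
                best[asn] = (h, name)
    return {asn: name for asn, (h, name) in best.items()}

# Hypothetical line topology: A - B - C - D - E - F
graph = {
    "A": ["B"], "B": ["A", "C"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["D", "F"], "F": ["E"],
}
replicas = [("global-1", "A", "global"), ("local-1", "F", "local")]
result = catchments(graph, replicas)
```

In this toy, D sits two hops from the local replica but never hears its announcement, so it travels three hops to the global one. That is the "subject to global/local constraints" caveat in the model.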
Reality
[Figure: average miles per query traveled, by month, 2015-2016 (y-axis 0 to 2000). Series: actual average distance, distance to nearest global replica, distance to nearest replica.]
• Queries see 4-5x the optimal delay (to the nearest replica, a local) and 2x the expected delay (to the nearest global replica)
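A minimal sketch of how the three distances could be computed for one query, assuming great-circle (haversine) distance between a geolocated resolver and the replica sites. The site names, coordinates, and the `query_distances` helper are hypothetical and not taken from the measurement code behind the plot.

```python
import math

EARTH_RADIUS_MILES = 3958.8

def great_circle_miles(a, b):
    """Haversine distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))

def query_distances(resolver, served_by, replicas):
    """Return (actual, nearest_global, nearest_any) in miles for one query.

    replicas: dict name -> ((lat, lon), is_global).
    """
    actual = great_circle_miles(resolver, replicas[served_by][0])
    nearest_any = min(great_circle_miles(resolver, loc)
                      for loc, _ in replicas.values())
    nearest_global = min(great_circle_miles(resolver, loc)
                         for loc, is_global in replicas.values() if is_global)
    return actual, nearest_global, nearest_any

# Hypothetical sites with approximate coordinates.
replicas = {
    "mclean": ((38.93, -77.18), True),
    "frankfurt": ((50.11, 8.68), True),
    "paris-local": ((48.86, 2.35), False),
}
resolver = (48.85, 2.29)  # a resolver in Paris, served from McLean
actual, nearest_global, nearest_any = query_distances(resolver, "mclean", replicas)
```

Averaging each of the three values over all queries in a month would give one point per curve.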
Reality
• Despite doubling the number of (local) replicas
[Figure: average miles per query traveled, by month, 2015-2016 (actual average distance, distance to nearest global replica, distance to nearest replica), shown with the number of replicas over the same months (All, Global; y-axis 0 to 100).]
Reality
• 80% of queries should travel under 1000 miles (16ms RTT)
• 50% are traveling farther.
[Figure: CDF of miles per query traveled, Oct 1 2016 (x-axis 0 to 10000). Series: distance to nearest replica, distance to nearest global replica, actual average distance.]
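The 16 ms figure for 1000 miles is consistent with propagation at roughly two-thirds the speed of light, as in fiber. A small worked conversion, where the 200 km/ms constant is an assumption and the result is a lower bound that ignores queuing and indirect paths:

```python
KM_PER_MILE = 1.609344
FIBER_KM_PER_MS = 200.0  # assumed: about 2/3 of c (c is roughly 300 km/ms)

def min_rtt_ms(miles):
    """Lower bound on round-trip time for a one-way path of `miles`."""
    one_way_ms = miles * KM_PER_MILE / FIBER_KM_PER_MS
    return 2 * one_way_ms

rtt_1000 = min_rtt_ms(1000)  # about 16 ms
```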
Reality
• Same data, first week in Oct 2016, log scale x-axis.
• Even when there’s a global replica in your city…
[Figure: CDF of miles per query traveled, Oct 1 2016 (log-scale x-axis, 1 to 10000). Series: distance to nearest replica, distance to nearest global replica, actual average distance.]
How do we fix it?
• More sites?
• More peerings?
• Better policies?
• Make local replicas global?
• What if ISPs chose cleverly from their providers?
• Pathological behavior must be atypical, right?
• Is it even broken?
Similar observations
• Anycast Latency: How Many Sites Are Enough? Schmidt, Heidemann, Kuipers
• Used Atlas probes (not traces) to look at the C, F, K, and L roots.
• More sites don’t correlate with lower latency
• Making local sites global didn’t help K
It’s the tier-1’s (I think)
Source (resolver) location
• For addresses originated by Tier 1’s, what is their nearest replica? Intensity by query volume.
[Figure: heatmap of tier-1 networks (UUNET, QWEST, TELIANET, COGENT, OPENTRANSIT, DTAG, LEVEL3, SEABONE, KPN, TELEFONICA, ATT, XO, GTT, ZAYO, SPRINTLINK, TATA, NTT) against D-Root global replicas (sewa, paca, bbca, laca, chil, atga, mifl, abva, mcva, cpmd, nyny, louk, zuch, ffde, viat, sgsg, hkcn, tojp); color scale 0 to 1, client portion.]
Request destination
• For addresses originated by Tier 1’s, what is their chosen replica? Intensity by query volume.
[Figure: heatmap of tier-1 networks (UUNET, QWEST, TELIANET, COGENT, OPENTRANSIT, DTAG, LEVEL3, SEABONE, KPN, TELEFONICA, ATT, XO, GTT, ZAYO, SPRINTLINK, TATA, NTT) against D-Root global replicas (sewa, paca, bbca, laca, chil, atga, mifl, abva, mcva, cpmd, nyny, louk, zuch, ffde, viat, sgsg, hkcn, tojp); color scale 0 to 1, query portion.]
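A sketch of how a "query portion" matrix like this could be built from a query log. It assumes each query has already been attributed to a tier-1 (by source address origin) and to the replica that received it, and that intensities are normalized per tier-1 so each row sums to 1. The normalization and the `portion_matrix` helper are assumptions, not the authors' code.

```python
from collections import Counter, defaultdict

def portion_matrix(queries):
    """Fraction of each tier-1's queries that landed on each replica.

    queries: iterable of (tier1_name, replica) pairs, one per query.
    Returns dict tier1 -> dict replica -> fraction in [0, 1].
    """
    counts = defaultdict(Counter)
    for tier1, replica in queries:
        counts[tier1][replica] += 1
    return {
        tier1: {rep: n / sum(per_rep.values()) for rep, n in per_rep.items()}
        for tier1, per_rep in counts.items()
    }

# Made-up sample, for illustration only.
sample = [("LEVEL3", "mcva")] * 3 + [("LEVEL3", "ffde")] + [("NTT", "tojp")]
matrix = portion_matrix(sample)
```

The same function gives the "nearest replica" view if the second element of each pair is the geographically nearest replica instead of the one that answered.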
Would you like to see them again?
Often McLean, VA.
• Traffic from tier-1 address space can arrive on other replicas, but generally does not.
[Figure: both heatmaps of tier-1 networks against D-Root global replicas again, one shaded by client portion and one by query portion; color scale 0 to 1.]
Could just be us. No.
This time using RIPE Atlas data, same Oct 1, 2016. Now counting vantage points whose queries transit a tier-1 (since we have traceroutes) instead of queries received.
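A sketch of the counting step, assuming each traceroute has already been reduced to the list of AS numbers it crosses (the IP-to-AS mapping is not shown) and paired with the replica that answered. The input tuple layout and the `count_vantage_points` helper are hypothetical, and the ASN table is only an illustrative subset of the tier-1s.

```python
# Illustrative subset of tier-1 AS numbers.
TIER1_ASNS = {3356: "LEVEL3", 174: "COGENT", 2914: "NTT", 1299: "TELIANET"}

def count_vantage_points(measurements):
    """Count distinct probes per (tier-1, replica) pair.

    measurements: iterable of (probe_id, as_path, replica), where as_path
    is the list of ASNs seen on the traceroute toward the root address.
    """
    seen = set()
    for probe_id, as_path, replica in measurements:
        for asn in as_path:
            if asn in TIER1_ASNS:
                # A set, so repeated measurements from one probe count once.
                seen.add((TIER1_ASNS[asn], replica, probe_id))
    counts = {}
    for tier1, replica, _probe in seen:
        counts[(tier1, replica)] = counts.get((tier1, replica), 0) + 1
    return counts

# Made-up measurements; 64500-64502 are documentation ASNs.
measurements = [
    (1, [64500, 3356, 64501], "lon"),
    (2, [64500, 3356], "lon"),
    (2, [64500, 3356], "lon"),      # duplicate from the same probe
    (3, [64502, 174, 2914], "fra"),  # transits two tier-1s
]
counts = count_vantage_points(measurements)
```

A probe whose path crosses two tier-1s is counted once for each, which is one possible reading of "transit a tier-1".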
A-Root
• Better. Notably, DTAG sends to London, not Frankfurt.
[Figure: heatmaps of tier-1 networks against A-Root global replicas (lax, nyc, lon, fra, hkg); color scale 0 to 1, query portion.]
C-Root
• The best at matching tier-1-carried queries to a nearby site.
[Figure: heatmaps of tier-1 networks against C-Root global replicas (lax, ord, iad, jfk, mad, par, fra, bts); color scale 0 to 1, query portion.]
E-Root
• Similar to D in that northern Virginia is preferred, despite Paris, Frankfurt, London query sources.
[Figure: heatmaps of tier-1 networks against E-Root global replicas (pao, sfo, bur, ord, atl, mia, iad, lga, lhr, cdg, fra, qpg, syd); color scale 0 to 1, query portion.]
F-Root
• Mostly European RIPE probes served by Chicago despite an Amsterdam replica.
[Figure: heatmaps of tier-1 networks against F-Root global replicas (pao, ord, atl, lga, ams); color scale 0 to 1, query portion.]
i-Root
• Still picking just one server, not typically the server with the most clients.
[Figure: heatmaps of tier-1 networks against i-Root global replicas; color scale 0 to 1, query portion.]
J-Root
• Fairly good, although preference for “tpe” despite no clients.
[Figure: heatmaps of tier-1 networks against J-Root global replicas; color scale 0 to 1, query portion.]
K-Root
• Looks a bit like D.
[Figure: heatmaps of tier-1 networks against K-Root global replicas; color scale 0 to 1, query portion.]
L-Root
• Many global replicas (like i), not often choosing nearby replicas
[Figure: heatmaps of tier-1 networks against L-Root global replicas; color scale 0 to 1, query portion.]
Why is D-Root not distributed?
• ‘mcva’ and ‘cpmd’ are announced through UMD / MAX-Gigapop, which peers with Qwest, Telia, Level3. Other replicas are announced by Packet Clearing House (PCH).
• Some Tier-1 ISPs peer only with UMD, and thus route queries only to ‘mcva’ and ‘cpmd’.
[Figure: heatmap of tier-1 networks against D-Root global replicas (sewa, paca, bbca, laca, chil, atga, mifl, abva, mcva, cpmd, nyny, louk, zuch, ffde, viat, sgsg, hkcn, tojp); color scale 0 to 1, query portion.]
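A toy model of the explanation above: a tier-1 can only deliver to sites behind an announcer it actually hears the route from. The two announcers come from the slide. The tier-1 names, their peering sets, the "other-global-sites" placeholder, and the assumption that routes are heard only from direct peers are hypothetical simplifications.

```python
# Replica sites fronted by each announcing network (per the slide).
SITES_BEHIND = {
    "UMD/MAX": {"mcva", "cpmd"},
    "PCH": {"other-global-sites"},  # placeholder for the remaining replicas
}

def reachable_sites(announcers_heard):
    """Sites a tier-1 can deliver to, given the announcers it hears D from."""
    sites = set()
    for announcer in announcers_heard:
        sites |= SITES_BEHIND.get(announcer, set())
    return sites

# Hypothetical tier-1s: one hears D only via UMD/MAX, one via both.
only_umd = reachable_sites(["UMD/MAX"])
both = reachable_sites(["UMD/MAX", "PCH"])
```

Under this model, the first tier-1 sends every query to ‘mcva’ or ‘cpmd’ no matter where the resolver is.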
Why is C-Root so good?
• C is operated by Cogent, another tier-1
• Expect other tier-1’s to peer with Cogent widely
• Expect their early-exited queries to go immediately to Cogent, and reach the nearest replica
[Figure: heatmaps of tier-1 networks against C-Root global replicas (lax, ord, iad, jfk, mad, par, fra, bts); color scale 0 to 1, query portion.]
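The early-exit argument can be sketched in a few lines. This assumes a one-dimensional "map" with positions as numbers, so distance is just the absolute difference. The positions and the `early_exit_replica` helper are hypothetical.

```python
def early_exit_replica(source, peering_points, replicas, dist):
    """Hot-potato sketch: hand off at the peering point nearest the source,
    then the replica operator delivers to the replica nearest the hand-off."""
    handoff = min(peering_points, key=lambda p: dist(source, p))
    return min(replicas, key=lambda r: dist(handoff, r))

def line_distance(a, b):
    return abs(a - b)

replicas = [150, 880]  # hypothetical replica positions
# Wide peering: a hand-off point exists near the source at 100.
wide = early_exit_replica(100, [0, 120, 900], replicas, line_distance)
# Narrow peering: the only hand-off point is far away.
narrow = early_exit_replica(100, [900], replicas, line_distance)
```

With wide peering the query from position 100 reaches the replica at 150. With a single distant peering point it ends up at 880, although 150 is closer.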
So how can anycast improve?
• Do we bug tier-1 operators?
• Do we assume it’s no big deal since PowerDNS will pick among the 13?
• Do we spend resources elsewhere?
(Pretending that my affiliation with Maryland makes me vaguely responsible for administering this resource)