Inferring Geography from BGP raw data - Isolario
Post on 31-Aug-2019
Inferring Geography from BGP raw data
Luca Sani
Enrico Gregori, Alessandro Improta, Luciano Lenzini, Lorenzo Rossi
Luca Sani Inferring Geography from BGP raw data 1 / 18
The Internet
The Internet: a huge set of interconnected Autonomous Systems (ASes)
ASes are owned by organizations with different geographic distribution and economic purposes
Motivation
Internet AS-level topology inferred from BGP data:
1 node = 1 AS
1 edge = 1 or more BGP connections between two ASes
This global view hides the Internet heterogeneity
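The node and edge convention above can be sketched as follows. This is a minimal illustration, assuming AS paths are already available as lists of AS numbers; the function name `build_topology` and the sample paths are hypothetical and are not taken from the authors' tooling.

```python
from itertools import groupby


def build_topology(as_paths):
    """Return (nodes, edges) of the AS-level graph implied by a set of AS paths."""
    nodes, edges = set(), set()
    for path in as_paths:
        # Collapse AS-path prepending (consecutive repeats of the same AS).
        hops = [asn for asn, _ in groupby(path)]
        nodes.update(hops)
        for a, b in zip(hops, hops[1:]):
            # One undirected edge per AS pair, however many BGP sessions exist.
            edges.add(frozenset((a, b)))
    return nodes, edges


# Hypothetical paths, using AS numbers reserved for documentation.
paths = [[64496, 64497, 64498], [64499, 64497, 64497, 64498]]
nodes, edges = build_topology(paths)
```

Storing each edge as a `frozenset` makes the pair unordered, which matches the "1 edge = 1 or more BGP connections" rule.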
Goals
1 Infer regional AS-level topologies from BGP data
2 Analyze graph and economic properties . . .
. . . at continental granularity:
Africa
Asia Pacific (Asia and Oceania)
Europe
Latin America (the Caribbean, Central America, Mexico and South America)
North America (Bermuda, Canada, Greenland, Saint Pierre and Miquelon, USA)
What is “BGP data”?
We use BGP data provided by the University of Oregon Route Views and the RIPE RIS projects (October 2011)
They deployed route collectors around the world
Route collectors gather routes from cooperating ASes (feeders)
What is “BGP data”? (cont.)
Each route carries three pieces of information relevant to our work:
Set of AS paths ⇒ Global Topology
39,974 ASes
139,944 connections
First Step - AS geolocation
“An AS is a connected group of one or more IP prefixes run by one or more IP network operators which has a single and clearly defined routing policy” (RFC 1930)
For each AS
1 We collect its IP prefixes from BGP data
2 We geolocate it by geolocating its prefixes (MaxMind GeoLite database)
96% of the 39,974 ASes are located in only one region
88% of the 139,944 connections involve at least one AS located in only one region
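The two-step AS geolocation above can be sketched as follows. This is a minimal illustration under stated assumptions: `prefix_origin` (prefix to announcing AS) and `prefix_regions` (prefix to set of regions) are hypothetical inputs standing in for the BGP data and for lookups in a geolocation database, and the sample values are invented.

```python
from collections import defaultdict


def locate_ases(prefix_origin, prefix_regions):
    """Map each AS to the union of the regions of the prefixes it announces."""
    as_regions = defaultdict(set)
    for prefix, asn in prefix_origin.items():
        # Prefixes missing from the geolocation data contribute no region.
        as_regions[asn] |= prefix_regions.get(prefix, set())
    return dict(as_regions)


# Hypothetical sample built from documentation prefixes and AS numbers.
prefix_origin = {
    "192.0.2.0/24": 64496,
    "198.51.100.0/24": 64496,
    "203.0.113.0/24": 64497,
}
prefix_regions = {
    "192.0.2.0/24": {"Europe"},
    "198.51.100.0/24": {"Europe"},
    "203.0.113.0/24": {"Europe", "North America"},
}

as_regions = locate_ases(prefix_origin, prefix_regions)
# ASes located in only one region.
single_region = {asn for asn, regions in as_regions.items() if len(regions) == 1}
```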
Second Step - Single Region ASes
The connection A-B is geolocated in North America
BGP requires the interfaces of the connection to share the same IP subnet (exception: BGP multihop)
The subnet S belongs either to AS A or to AS B
Second Step - Single Region ASes (cont.)
A does not own any IP address outside North America!
Second Step - Single Region ASes (cont.)
In any case IP A must be in North America
⇒ The connection is geolocated in the single common region
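The single-region rule can be sketched as follows. This is a minimal illustration, assuming each AS is described by the set of regions in which it is located; the function name `locate_connection` is hypothetical, and requiring the two ASes to share the region is an assumption of this sketch.

```python
def locate_connection(regions_a, regions_b):
    """Return the region of connection A-B if the single-region rule applies, else None."""
    for single, other in ((regions_a, regions_b), (regions_b, regions_a)):
        # One endpoint lives in exactly one region, and the other endpoint is
        # also present there: the shared subnet must sit in that region.
        if len(single) == 1 and single <= other:
            (region,) = single
            return region
    return None  # left to the other cases
```

For example, `locate_connection({"North America"}, {"North America", "Europe"})` yields `"North America"`, while two multi-region ASes yield `None`.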
Second Step - Other cases
We exploit single region ASes or SOURCE and DESTINATION regions
We geolocate the connection in North America (regional principle)
Regional Topologies: EU and NA case
              Europe         North America    World
ASes          17,101         15,894           39,974
Connections   72,581         42,610           139,944
Avg. Degree   8.49           5.36             6.97
Max Degree    1,818 (RETN)   2,542 (Level3)   3,418 (Cogent)
[Figure: P(X > x) versus x = k, log-log axes, for Europe, North America and World]
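A curve of the form P(X > x) with x = k can be computed from a list of node degrees as sketched below. This is a minimal illustration, assuming k denotes the degree of an AS in the topology; the function name `degree_ccdf` and the sample degrees are hypothetical.

```python
from collections import Counter


def degree_ccdf(degrees):
    """Return sorted (k, P(X > k)) pairs for the observed degree values."""
    n = len(degrees)
    counts = Counter(degrees)
    ccdf, remaining = [], n
    for k in sorted(counts):
        remaining -= counts[k]  # nodes with degree strictly greater than k
        ccdf.append((k, remaining / n))
    return ccdf


# Hypothetical degree sequence of a four-node graph.
ccdf = degree_ccdf([1, 1, 2, 3])
```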
Regional Topologies: EU and NA case
[Figure: P(X > x) versus x = kNN/max(k), for Europe, North America and World]
Economic Analysis
In order to get better insights we investigate the economic nature of the connections
Classic Economic Tags
Provider-to-Customer (P2C), Peer-to-Peer (P2P), Sibling-to-Sibling (S2S)
We adapted an economic tagging algorithm* to deal with geographic information
*Enrico Gregori, Alessandro Improta, Luciano Lenzini, Lorenzo Rossi, Luca Sani: BGP and inter-AS Economic Relationships, IFIP Networking ’11
Economic Analysis - Results
       Europe    North America   World
P2C    32,471    31,820          80,095
P2P    39,813    10,230          58,040
S2S    297       560             1,743
P2C = Provider-to-Customer, P2P = Peer-to-Peer, S2S = Sibling-to-Sibling
Europe vs North America case
They have a similar number of P2C connections
Europe has many more P2P connections
IXPs play a fundamental role in this difference
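The gap can be made concrete by computing the share of P2P connections per region from the counts reported in the table on this slide. This is only a small arithmetic sketch; the variable and function names are hypothetical.

```python
# Counts taken from the Europe and North America columns of the slide's table.
tags = {
    "Europe": {"P2C": 32_471, "P2P": 39_813, "S2S": 297},
    "North America": {"P2C": 31_820, "P2P": 10_230, "S2S": 560},
}


def p2p_share(counts):
    """Fraction of tagged connections that are peer-to-peer."""
    return counts["P2P"] / sum(counts.values())


shares = {region: p2p_share(counts) for region, counts in tags.items()}
```

With these counts, a bit more than half of the European connections are P2P, against roughly a quarter in North America.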
Conclusion
We developed a methodology to infer continental AS-level topologies
We analyzed their graph and economic properties
We highlighted structural differences otherwise hidden in the global topology
Future
Fine-grained analysis (requires a high-precision geolocation tool)
Sensitivity of the results with respect to geolocation databases
Influence of current BGP feeders distribution on the results
The End
Thank you for your attention!
Questions
luca.sani@imtlucca.it
Backup
Backup Slides
Active Measurement
Active measurement to solve geolocation of particular connections
Step 2 - FROM & NEXT HOP
The FROM field identifies the BGP neighbor
The NEXT HOP field identifies the IP address of the neighbor
Economic Analysis - Tag Changes
Tag changes from the worldwide to the regional scenarios
                     Africa   Asia Pacific   Europe   Latin America   North America
Peering to transit   12       86             325      36              219
Transit to peering   165      824            2,304    361             1,136
Geolocation issues
a) 145 ASes are not geolocated at all
b) pairs of ASes that do not share any region (partial geolocation or multihop)
Do not appear in any regional topology
6,141 out of 139,944 connections
199 out of 39,974 ASes ⇐ 145 because of a), 44 because of b)
Tagging algorithm
step A: Inference of all the possible economic relationships for each directed AS connection
directed means that (A,B) ≠ (B,A)
It is based on the approach proposed by Oliveira et al. in [2]
The list of Tier-1 ASes provided by Wikipedia is used
For each tag, the lifespan of the AS path used is maintained
At the end of this step we have multiple (tag, lifespan) pairs for each connection
Tagging algorithm
step B: Inference of a single economic relationship for each directed AS connection
All (tag, lifespan) pairs related to the same directed connection have to be merged
Find the max lifespan among all pairs
Merge only those pairs that have a lifespan comparable with the max, i.e. those that do not differ by more than N orders of magnitude from the max
Record the largest lifespan as the lifespan of the resulting tag
[A, B] \ [A, B]   p2c   p2p   c2p   s2s
p2c               p2c   p2c   s2s   s2s
p2p               p2c   p2p   c2p   s2s
c2p               s2s   c2p   c2p   s2s
s2s               s2s   s2s   s2s   s2s
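Step B can be sketched as follows. This is a minimal illustration under stated assumptions: "comparable" is read as being within N powers of ten of the longest lifespan, the default `n=1` is arbitrary (the slides do not give a value for N), and the names `MERGE` and `merge_pairs` are hypothetical.

```python
from functools import reduce

TAGS = ("p2c", "p2p", "c2p", "s2s")
# Merge table from the slide: MERGE[row][column] is the resulting tag.
MERGE = {
    "p2c": dict(zip(TAGS, ("p2c", "p2c", "s2s", "s2s"))),
    "p2p": dict(zip(TAGS, ("p2c", "p2p", "c2p", "s2s"))),
    "c2p": dict(zip(TAGS, ("s2s", "c2p", "c2p", "s2s"))),
    "s2s": dict(zip(TAGS, ("s2s", "s2s", "s2s", "s2s"))),
}


def merge_pairs(pairs, n=1):
    """Merge the (tag, lifespan) pairs of one connection into a single pair."""
    max_life = max(life for _, life in pairs)
    # Keep only tags whose lifespan is within n orders of magnitude of the max.
    comparable = [tag for tag, life in pairs if life * 10 ** n >= max_life]
    tag = reduce(lambda a, b: MERGE[a][b], comparable)
    # The resulting tag inherits the largest lifespan.
    return tag, max_life
```

For instance, `merge_pairs([("p2c", 1000), ("c2p", 500), ("p2p", 5)])` ignores the short-lived p2p observation and merges p2c with c2p into s2s.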
Tagging algorithm
step C: Final tagging and two-way validation
In order to obtain the economic relationship existing between AS A and AS B, the tags inferred for the (A,B) and (B,A) connections have to be merged
The approach used is the same as in Step B, taking into account the different direction of the connections, e.g. (A,B) = p2c and (B,A) = c2p have the same meaning
The merge is still based on lifespan, thus if the lifespans are not comparable, only the longer-lasting tag affects the final tag
If there is a tag for both (A,B) and (B,A) and their lifespans are comparable, then the tag is said to be two-way validated
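Step C can be sketched as follows. This is a minimal illustration under the same assumptions as Step B: "comparable" means within N powers of ten, the default `n=1` is arbitrary, and the names `FLIP`, `merge_tags` and `final_tag` are hypothetical. The `merge_tags` helper restates the Step B merge table as rules.

```python
# The same relationship seen from the opposite direction.
FLIP = {"p2c": "c2p", "c2p": "p2c", "p2p": "p2p", "s2s": "s2s"}


def merge_tags(a, b):
    """Step B merge table as rules: p2p is neutral, any other mismatch gives s2s."""
    if a == b or b == "p2p":
        return a
    if a == "p2p":
        return b
    return "s2s"


def final_tag(ab, ba, n=1):
    """ab, ba: (tag, lifespan) for (A,B) and (B,A), or None if never observed.

    Returns (tag from A's point of view, two_way_validated).
    """
    if ba is not None:
        ba = (FLIP[ba[0]], ba[1])  # express (B,A) from A's point of view
    if ab is None or ba is None:
        return (ab or ba)[0], False  # only one direction was observed
    (tag_ab, life_ab), (tag_ba, life_ba) = ab, ba
    longer, shorter = max(life_ab, life_ba), min(life_ab, life_ba)
    if shorter * 10 ** n >= longer:
        # Comparable lifespans: merge both directions, two-way validated.
        return merge_tags(tag_ab, tag_ba), True
    # Otherwise only the longer-lasting tag counts.
    return (tag_ab if life_ab > life_ba else tag_ba), False
```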
Step 1 - Inferring Enhanced Routes from BGP data
IP Geolocation Database: Maxmind GeoLiteCity
The geolocation of the IP feeder is trivial
The geolocation of a /X prefix requires geolocating 2^(32−X) IP addresses
What about AS geolocation?
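The 2^(32−X) count for an IPv4 /X prefix can be checked with Python's standard `ipaddress` module. This is a small sketch; the function name `addresses_in_prefix` and the sample prefixes are illustrative only.

```python
import ipaddress


def addresses_in_prefix(prefix):
    """Number of IPv4 addresses covered by a /X prefix, i.e. 2**(32 - X)."""
    network = ipaddress.ip_network(prefix)
    return network.num_addresses


# A /24 covers 2**8 addresses, a /8 covers 2**24.
size_24 = addresses_in_prefix("192.0.2.0/24")
size_8 = addresses_in_prefix("10.0.0.0/8")
```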
Step 2 - Detection of SRLTPs inside enhanced Routes
SRLTP = Single Region Located Transit Point
In each enhanced route we find regions from which the traffic has to transit:
1 SOURCE REGION, DEST REGION and one-region located ASes
2 ASes with only one region in common with neighbors
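One possible reading of the two rules above is sketched below. This is an interpretation, not the authors' algorithm: rule 2 is taken to mean that the AS shares exactly one region with the union of its neighbors' regions along the path, the SOURCE and DEST regions are left out for brevity, and the names `find_srltps` and `as_regions` are hypothetical.

```python
def find_srltps(as_path, as_regions):
    """Return {position in path: region} for ASes pinned to a single region."""
    srltps = {}
    for i, asn in enumerate(as_path):
        regions = as_regions[asn]
        if len(regions) == 1:
            # Rule 1: the AS is located in one region only.
            (srltps[i],) = regions
            continue
        # Rule 2 (assumed reading): exactly one region shared with the neighbors.
        neighbors = [as_path[j] for j in (i - 1, i + 1) if 0 <= j < len(as_path)]
        shared = regions & set().union(*(as_regions[n] for n in neighbors))
        if len(shared) == 1:
            (srltps[i],) = shared
    return srltps


# Hypothetical route using AS numbers reserved for documentation.
as_regions = {
    64496: {"Europe"},
    64497: {"Europe", "North America"},
    64498: {"North America", "Asia Pacific"},
}
srltps = find_srltps([64496, 64497, 64498], as_regions)
```

In this example the middle AS shares two regions with its neighbors, so it is not pinned to either of them.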
Step 2 - Detection of SRLTP - Examples
Step 3 - Inferring Geographic AS paths
For each enhanced route we analyze each SRLTP
Given the region of an SRLTP we try to expand the set of connections in that region
Step 3 - Inferring Geographic AS paths
Regional Topologies - Results
                       Africa      Asia Pacific   Europe      Latin America   North America   World
ASes                   815         6,427          17,101      2,453           15,894          39,974
Connections            2,002       18,040         72,581      8,329           42,610          139,944
Avg. Overlap (Conns)   0.03±0.01   0.05±0.02      0.03±0.02   0.03±0.01       0.05±0.02       -
Avg. Degree            4.90        5.61           8.49        6.79            5.36            6.68
Economic Analysis - Results
       Africa   Asia Pacific   Europe   Latin America   North America   World
P2C    1,456    12,808         32,471   4,514           31,820          80,095
P2P    492      5,012          39,747   3,719           10,164          58,040
S2S    21       102            297      37              350             1,743
P2C = Provider-to-Customer, P2P = Peer-to-Peer, S2S = Sibling-to-Sibling
Economic Analysis
In order to get better insights we investigate the economic nature of the connections
          Original*                         Adapted
Input:    AS paths + Lifespans              Geographic AS paths + Lifespans
Output:   Global Economic Tagged Topology   Regional Economic Tagged Topologies
Classic Economic Tags
Provider-to-Customer (P2C), Peer-to-Peer (P2P), Sibling-to-Sibling (S2S)
*Enrico Gregori, Alessandro Improta, Luciano Lenzini, Lorenzo Rossi, Luca Sani: BGP and inter-AS Economic Relationships, IFIP Networking ’11
Economic Relationships
provider-to-customer: the customer pays the provider to reach all ASes that it cannot reach in other ways
peer-to-peer: the two ASes exploit each other to reach their customer cones (typically free of charge)
sibling-to-sibling: each AS acts as a provider for the other