How the Internet Works
Steven M. Bellovin
Department of Computer Science, Columbia University
https://www.cs.columbia.edu/~smb
Disclaimer
All of the statements, opinions, facts, myths, errors, etc., in this talk are mine and mine alone, and do not represent the opinions of Columbia University or of any agency of the US government.
What is the Internet Made of?
• Computers
  • Servers
  • Clients
  • Phones
  • “Things”
• Routers: specialized computers that forward “packets”
  • Packets are fragments of messages
• Links: WiFi, Ethernet, fiber, etc. The Internet was designed to run over anything
Fibers
• Each cable has many pairs of strands
• Each strand carries many wavelengths (aka “colors” or “lambdas”)
  • A new trans-Pacific fiber has six pairs of strands
  • Each strand carries 100 wavelengths
  • Each wavelength has a bandwidth of 100 Gbps
  • Total capacity: 60 terabits/second
• Each wavelength can carry many different circuits
• Each Internet circuit carries packets for many different conversations
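The capacity figure on the Fibers slide can be checked with a few lines of arithmetic. This is a minimal sketch using the slide's numbers; the reading that the total is per direction (one strand of each pair carrying traffic each way) is an assumption, as are the function and constant names.

```python
# Figures taken from the slide; the per-direction interpretation is an assumption.
STRAND_PAIRS = 6
WAVELENGTHS_PER_STRAND = 100
GBPS_PER_WAVELENGTH = 100

def cable_capacity_tbps(pairs, wavelengths, gbps_per_wavelength):
    """Capacity in one direction, assuming one strand of each pair per direction."""
    total_gbps = pairs * wavelengths * gbps_per_wavelength
    return total_gbps / 1000  # 1 terabit = 1000 gigabits

capacity = cable_capacity_tbps(STRAND_PAIRS, WAVELENGTHS_PER_STRAND, GBPS_PER_WAVELENGTH)
print(f"{capacity:g} Tbps")  # prints "60 Tbps"
```

Six pairs times 100 wavelengths times 100 Gbps gives 60,000 Gbps, which matches the 60 terabits/second quoted above.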
WiFi
• Used in public spaces and private residences
• Some use in business, but wired Ethernet is more common for desktops
• Range: about 100 meters
• Security: WEP is obsolete and insecure; WPA2 is quite good; in public, all bets are off.
A Look at Common Applications
• Web browsing
• Email
• The Cloud
• Caution: all of this is simplified, and arguably oversimplified
How the Web Appears to Users
[Diagram: a Web Browser and a Web Server connected through the Internet]
The Internet Has Structure: Multiple ISPs
[Diagram: ISP A and ISP B, with locations labeled LON and NYC]
Routing Between ISPs
[Diagram labels: Verizon, Sprint, IIJ, GoJ, Amazon, Sakura. Callouts: Big ISPs ‘Peering’; Customers buy ‘Transit’]
Each ISP Has Structure: Many Routers
Hosting Services
[Diagram labels: Internet, Web Browser, Hosting Company, Company A, Company B, Company C]
Content Distribution Network
[Diagram labels: CDN A, CDN B, CDN C, CDN D, Web Server]
CDN Example: www.supremecourt.gov
    Location         IP address
    New York         24.143.200.48
    Ashburn, Va      23.15.9.144
    Atlanta          208.44.23.57
    San Francisco    216.156.149.106
    Boston           207.86.164.89
www.supremecourt.gov is an alias for a1042.b.akamai.net; Akamai is a prominent CDN operator
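The effect shown in the CDN example (the same name yielding a different address depending on where the query comes from) can be mimicked with a lookup table. This is only an illustrative sketch: the addresses are the ones listed on the slide, while the `resolve` function is hypothetical and says nothing about how Akamai actually chooses a replica.

```python
# Addresses observed from each location, copied from the slide's table.
ANSWERS_BY_LOCATION = {
    "New York": "24.143.200.48",
    "Ashburn, Va": "23.15.9.144",
    "Atlanta": "208.44.23.57",
    "San Francisco": "216.156.149.106",
    "Boston": "207.86.164.89",
}

def resolve(name, querier_location):
    """Hypothetical CDN-style resolver: the answer depends on who is asking."""
    if name != "www.supremecourt.gov":
        raise KeyError(name)
    return ANSWERS_BY_LOCATION[querier_location]

print(resolve("www.supremecourt.gov", "Boston"))
print(resolve("www.supremecourt.gov", "Atlanta"))
```

A real CDN typically bases the choice on the location of the querier's DNS resolver, server load, and network conditions, so answers can also change over time.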
Which is the Browser; Which is the Server?
[Diagram labels: Internet, Web Browser, Web Server]
Architecturally, They’re the Same: What Matters is the Software They Run
[Diagram labels: Internet, Web Server, Web Browser]
“Smart Hosts, Dumb Network”
• The phone network was built for dumb phones; nothing else was technically or economically feasible.
• All intelligence is in the network: conference calls, call forwarding, even many voice menus
• Internet routers are very dumb; all intelligence is in end systems
  • Consequence: service providers are not necessarily the same as network providers
  • A person’s mail provider may be in another country
The Phone Network: A Few Large Switches, Serving Phones
The Internet: Many Routers, Very Many Types of Devices
Circuit Switching versus Packet Switching
• Circuits: traditional telephony model
  • Path through the network selected at “call setup time”
    • Very small number of call setups; process can be heavyweight
  • Each “phone switch” needs to know the destination of the call, not the source; return traffic takes the reverse path
• Packets: Internet model
  • Every “packet” (a fragment of a message) is routed independently
    • No call setup
    • Routing must be very, very fast; it’s done for each packet
  • Robustness: if a “router” fails, packets can take a different path
  • Every packet must have a source and destination address, to enable replies
  • Reply traffic may take a very different path
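The robustness point (packets can take a different path when a router fails) can be illustrated with a small path search. This is a sketch under stated assumptions: the topology and router names are hypothetical, and breadth-first search stands in for what real routers do, which is to consult forwarding tables built in advance by routing protocols rather than search per packet.

```python
from collections import deque

# Hypothetical topology: routers R1..R4 between hosts "src" and "dst".
LINKS = {
    "src": ["R1"],
    "R1": ["src", "R2", "R3"],
    "R2": ["R1", "R4"],
    "R3": ["R1", "R4"],
    "R4": ["R2", "R3", "dst"],
    "dst": ["R4"],
}

def find_path(links, source, destination, failed=frozenset()):
    """Breadth-first search for a shortest path that avoids failed routers."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return path
        for neighbor in links[node]:
            if neighbor not in visited and neighbor not in failed:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no route available

print(find_path(LINKS, "src", "dst"))                 # goes via R2
print(find_path(LINKS, "src", "dst", failed={"R2"}))  # reroutes via R3
```

Because no circuit was set up in advance, nothing has to be torn down when R2 fails; later packets simply follow the surviving path.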
IP Addresses
• A user types a name such as www.dni.gov.
• The Domain Name System (DNS) translates that to an Internet Protocol (IP) Address such as 23.213.38.42
  • IP addresses are four bytes long; each of those numbers is in the range 0-255
  • www.dni.gov actually uses a CDN, so different queriers may get different answers
• IP addresses are what appear in packets
• Routers talk to each other (via Routing Protocols) to learn where each IP address is
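The "four bytes, each in the range 0-255" description can be seen directly with Python's standard `ipaddress` module. A minimal sketch, using the example address from the slide; the variable names are illustrative only.

```python
import ipaddress

addr = ipaddress.IPv4Address("23.213.38.42")  # example address from the slide
octets = list(addr.packed)   # the four bytes as they appear in a packet header
as_integer = int(addr)       # the same 32 bits viewed as one number

print(octets)                # [23, 213, 38, 42]
print(len(addr.packed))      # 4
print(ipaddress.IPv4Address(as_integer))  # converts back to 23.213.38.42
```

The dotted notation is only a human-friendly rendering; in a packet the address is just those four bytes.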
IP Addressing
• Roughly 4 billion possible IP addresses today
  • IPv6, a newer version of IP being deployed now, has many more addresses
• IP addresses are handed out in blocks to big ISPs. Big ISPs give pieces of their allocations to smaller ISPs or to end customers
• Unless you’re a very large enterprise, the only way to get IP addresses is from your ISP, and if you switch ISPs, you have to renumber your computers
• There is no analog to “local number portability” on the Internet, and there can’t be; there’s no time to do that many lookups
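The "roughly 4 billion" figure and the idea of an ISP carving its block into smaller pieces can both be shown with the standard `ipaddress` module. A sketch under assumptions: the block 198.51.100.0/24 is a range reserved for documentation and stands in for an ISP's allocation, and the split into four customer blocks is hypothetical.

```python
import ipaddress

ipv4_total = 2 ** 32    # four bytes of address
ipv6_total = 2 ** 128   # sixteen bytes of address

# Hypothetical allocation: 198.51.100.0/24 is a block reserved for documentation.
isp_block = ipaddress.ip_network("198.51.100.0/24")
customer_blocks = list(isp_block.subnets(new_prefix=26))

print(f"IPv4 addresses: {ipv4_total:,}")  # 4,294,967,296
for block in customer_blocks:
    print(block, block.num_addresses)     # four blocks of 64 addresses each
```

Because a customer's addresses are a slice of the ISP's block, routers elsewhere only need one entry for the whole block, which is also why switching ISPs means renumbering.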
Address Space Assignment
• IP addresses are handed out by Regional Internet Registries (RIRs), such as ARIN
• They get their addresses from ICANN, an international non-profit which gets its authority from the U.S. Department of Commerce (controversial abroad)
• Addresses are allocated based on demonstrated short-term need and evidence of efficient use of previously-allocated addresses
• Addresses may not be sold, even as part of a bankruptcy, merger, or acquisition, except with ARIN’s approval and in accordance with ARIN’s policies
  • This assertion of authority has never been contested in court, and some addresses have been transferred by order of a bankruptcy court
  • Some ISPs have (very valuable) pre-ARIN addresses, called “legacy space”. Legacy address holders don’t have to renumber when switching ISPs (among other advantages)
Port Numbers
• When one computer contacts another, is it trying to talk to a Web server or trying to send mail?
  • Remember that architecturally, all machines on the Internet are alike
  • It’s perfectly legal to run a Web server and a mail server on a single computer
• Packets contain not just an IP address but a port number
  • Port 25 is the mail server, port 80 is the Web server, 443 is encrypted Web, etc.
• If an IP address is like a street address, a port number is the room number in the building
  • Room 25 is the mail room, room 80 is the library, etc.
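The street-address analogy amounts to a lookup keyed on the pair (IP address, port). The sketch below is illustrative only: the port numbers come from the slide, while the machine address 192.0.2.10 (a documentation address), the `deliver` function, and the listener table are hypothetical.

```python
# Port assignments named on the slide.
WELL_KNOWN_PORTS = {25: "mail (SMTP)", 80: "Web (HTTP)", 443: "encrypted Web (HTTPS)"}

def deliver(dest_ip, dest_port, listeners):
    """Pick the program that should receive a packet, given (IP address, port)."""
    return listeners.get((dest_ip, dest_port), "no listener: connection refused")

# One hypothetical machine (192.0.2.10, a documentation address) running two servers.
listeners = {
    ("192.0.2.10", 25): "mail server",
    ("192.0.2.10", 80): "Web server",
}

print(deliver("192.0.2.10", 25, listeners))   # mail server
print(deliver("192.0.2.10", 80, listeners))   # Web server
print(deliver("192.0.2.10", 443, listeners))  # no listener: connection refused
```

The same machine answers differently on each port, which is the sense in which all machines are architecturally alike and only the software differs.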
The Network Stack
• The Internet uses a layered architecture
• Applications (email, web, etc.) are what we care about
• TCP (which has port numbers) transports the data; it is end-to-end
• IP (the network layer) is processed by every router along the path
• The link layer is things like WiFi, Ethernet, etc.
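Layering can be pictured as each layer wrapping the data from the layer above in its own header. The following is a toy sketch, not the real wire format: the text headers, the addresses (documentation ranges), and both function names are assumptions made for illustration.

```python
def encapsulate(payload: bytes, src_ip, dst_ip, src_port, dst_port) -> bytes:
    """Wrap application data in toy TCP, IP, and link headers (not real formats)."""
    segment = f"TCP {src_port}->{dst_port}|".encode() + payload  # transport layer
    packet = f"IP {src_ip}->{dst_ip}|".encode() + segment        # network layer
    frame = b"LINK|" + packet                                     # link layer
    return frame

def router_forward(frame: bytes) -> str:
    """A router strips the link header and reads only the IP header."""
    packet = frame.split(b"|", 1)[1]
    ip_header = packet.split(b"|", 1)[0].decode()
    return ip_header.split("->")[1]  # the destination address decides the next hop

frame = encapsulate(b"GET / HTTP/1.1", "192.0.2.1", "192.0.2.10", 50000, 80)
print(frame)
print(router_forward(frame))  # 192.0.2.10
```

Note that `router_forward` never looks at the TCP header or the application data; only the end systems do, which is what "end-to-end" means here.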
Sending Email
[Diagram labels: Outbound Mail Server, Inbound Mail Server, Access Links, and four ISPs]
Sending Myself Email: An SMTP Transcript
    220 machshav.com ESMTP Exim 4.82 Tue, 11 Mar 2014 19:43:03 +0000
    HELO eloi.cs.columbia.edu
    250 machshav.com Hello eloi.cs.columbia.edu [2001:18d8:ffff:16:12dd:b1ff:feef:8868]
    MAIL FROM:<[email protected]>
    250 OK
    RCPT TO:<[email protected]>
    250 Accepted
    DATA
    354 Enter message, ending with "." on a line by itself
    From: Barack Obama <[email protected]>
    To: <[email protected]>
    Subject: Test

    This is a test
    .
    250 OK id=1WNSaS-0001z5-1d
    QUIT
    221 machshav.com closing connection
[Slide callout: “Message”]
Conversation With A Third Party
    220 machshav.com ESMTP Exim 4.82 Tue, 11 Mar 2014 19:43:03 +0000
    HELO eloi.cs.columbia.edu
    250 machshav.com Hello eloi.cs.columbia.edu [2001:18d8:ffff:16:12dd:b1ff:feef:8868]
    MAIL FROM:<[email protected]>
    250 OK
    RCPT TO:<[email protected]>
    250 Accepted
    DATA
    354 Enter message, ending with "." on a line by itself
    From: Barack Obama <[email protected]>
    To: <[email protected]>
    Subject: Test

    This is a test
    .
    250 OK id=1WNSaS-0001z5-1d
    QUIT
    221 machshav.com closing connection
[Slide callout: “Message”]
What the Recipient Sees
    220 machshav.com ESMTP Exim 4.82 Tue, 11 Mar 2014 19:43:03 +0000
    HELO eloi.cs.columbia.edu
    250 machshav.com Hello eloi.cs.columbia.edu [2001:18d8:ffff:16:12dd:b1ff:feef:8868]
    MAIL FROM:<[email protected]>
    250 OK
    RCPT TO:<[email protected]>
    250 Accepted
    DATA
    354 Enter message, ending with "." on a line by itself
    From: Barack Obama <[email protected]>
    To: <[email protected]>
    Subject: Test

    This is a test
    .
    250 OK id=1WNSaS-0001z5-1d
    QUIT
    221 machshav.com closing connection
[Slide callout: “Message”]
A Letter from Eleanor Roosevelt to Lorena Hickok (March 1933)
It begins “Hick my dearest”. (excerpt from Amazon.com)
Things to Note
• The SMTP envelope (that’s the technical term!) can have different information than the message headers
• Unlike the phone network, anyone can run their own mail servers
  • I personally run two, one personal and one professional
  • This complicates third party doctrine analysis
• The reality of email is far more complex than I’ve outlined here
  • Example: many people read their email via a Web browser, and the NSA has stated that even for them, picking out just the From/To information from a Webmail session is very difficult
• I haven’t even begun to address server-resident email, virus scanning, spam filtering, and the like, let alone all of the other metadata that’s present
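The gap between the SMTP envelope and the message headers is easy to reproduce with Python's standard `smtplib` and `email` modules. This is a sketch, not a record of how the transcript in the talk was produced: every address below is hypothetical (example.com, example.org, and example.net are reserved for documentation), and the `send` helper is left uncalled because it needs a reachable mail server.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical addresses; these domains are reserved for documentation.
ENVELOPE_FROM = "sender@example.com"
ENVELOPE_TO = ["recipient@example.org"]

msg = EmailMessage()
msg["From"] = "Someone Else <someone.else@example.net>"  # header: what the recipient sees
msg["To"] = "recipient@example.org"
msg["Subject"] = "Test"
msg.set_content("This is a test.")

def send(message, host="localhost", port=25):
    """Send with an explicit envelope; nothing forces it to match the headers."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(message, from_addr=ENVELOPE_FROM, to_addrs=ENVELOPE_TO)

# send(msg)  # needs a reachable SMTP server, so it is left commented out
print(msg["From"], "vs envelope sender", ENVELOPE_FROM)
```

The envelope values drive the MAIL FROM and RCPT TO commands, while the `From:` and `To:` headers travel inside the DATA portion; the protocol itself does not require the two to agree.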
Encryption on the Internet
Anything Can be Encrypted
• Links, though mostly used on WiFi
• Virtual Private Networks (VPNs)
• Simple connections (Web, email, etc.), generally via Transport Layer Security (TLS)
• Data, especially the body of email messages
VPNs
• Used by corporate employees for telecommuting or while traveling
• Also used to connect multiple corporate locations
• Sometimes used to spoof location
  • Cover tracks
  • Fool geographic restrictions on content, e.g., streaming movies and music
• A recently published academic paper concluded that the NSA could cryptanalyze a lot of VPN sessions
TLS
• Used for all secure Web traffic
• Widely (and increasingly) used when sending and retrieving email
  • But TLS does not protect email “at rest”, i.e., while on disk on the various servers
• Used for many other point-to-point connections, e.g., Dropbox
• Older versions of TLS have cryptographic weaknesses; these are (believed to be) fixed in the newest versions
• The most common implementations of TLS have a long history of serious security flaws
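One practical response to the weaknesses in older TLS versions is for a client to refuse them. The sketch below uses Python's standard `ssl` module; choosing TLS 1.2 as the floor is an assumption made for illustration, and `fetch_negotiated_version` is a hypothetical helper that is defined but not called, since it needs network access.

```python
import socket
import ssl

context = ssl.create_default_context()            # verifies certificates and hostnames
context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions

def fetch_negotiated_version(host, port=443):
    """Connect to host and report the negotiated TLS version (needs network access)."""
    with socket.create_connection((host, port), timeout=10) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()

print(context.minimum_version)
```

This protects only the connection itself; as noted above, the data is decrypted at each end and is not protected at rest.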
Email Encryption
• Two different standards, S/MIME and PGP
  • S/MIME is widely supported but rarely used
  • PGP requires less infrastructure support, and hence is used by enthusiasts
• Protects email at rest, but hinders searching
• Does not protect email headers or other metadata
Tor: The Onion Router
• Computer A picks a sequence of Tor relays (C ➝ E ➝ D)
  • D is the exit node, and passes the traffic to destination host G
  • All of these hops are encrypted
• B picks relays F ➝ C ➝ D
  • G can’t tell which is from A and which from B
• Neither can anyone else monitoring G’s traffic
• Many use Tor for anonymity: police, human rights workers, spies, and criminals (e.g., Ross Ulbricht of Silk Road fame)
• Mental model: nested, sealed envelopes
[Diagram: hosts A and B, relays C, D, E, and F, destination G]
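The "nested, sealed envelopes" model can be sketched in code: the sender wraps the message once per relay, and each relay removes exactly one layer. This is a toy under loud assumptions: the XOR keystream cipher is NOT secure and is not what Tor uses, the relay keys are hypothetical, and real onion layers also carry next-hop information that is omitted here.

```python
import hashlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher (NOT secure): XOR data with a SHA-256-derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# Hypothetical per-relay keys for A's circuit C -> E -> D.
RELAY_KEYS = {"C": b"key-C", "E": b"key-E", "D": b"key-D"}
CIRCUIT = ["C", "E", "D"]

def wrap(message: bytes, circuit, keys) -> bytes:
    """Seal the innermost envelope for the exit node first, the outermost last."""
    for relay in reversed(circuit):
        message = keystream_xor(keys[relay], message)
    return message

def relay_unwrap(relay: str, data: bytes, keys) -> bytes:
    """Each relay removes exactly one layer with its own key."""
    return keystream_xor(keys[relay], data)

onion = wrap(b"hello G", CIRCUIT, RELAY_KEYS)
for relay in CIRCUIT:
    onion = relay_unwrap(relay, onion, RELAY_KEYS)
print(onion)  # b'hello G' once D, the exit node, removes the last layer
```

Until the final layer is removed, no relay sees the plaintext, and only the first relay talks directly to the sender; that separation is what the envelope analogy is meant to convey.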
Cloud Computing
What’s a Cloud?
• A cloud is a traditional way to represent a network
• This “three-cloud network” picture is from 1982
• But today, “cloud” refers to computing services provided via the Internet by an outside party.
• (The modern usage seems to date to 1996: http://www.technologyreview.com/news/425970/who-coined-cloud-computing/)
“Via the Internet”
• The service is not provided on-premises
• An Internet link is necessary
• This link provides an opportunity for interception, lawful or otherwise
“Outside Party”
• By definition, cloud services are provided by an outside party
• Similar in spirit to the computing and time-sharing service bureaus, which date back to the 1960s
• Not the same as a company’s own remote computing facility
  • Organizations can have a “private cloud”, but the legal issues may be very different
Computing Services
• Many different types of services
  • Storage
  • Computing
  • Applications
  • Virtual machines
  • More
Storage
• Disk space in a remote location
• Easily shared (and outside the corporate firewall)
• Often replicated for reliability
  • Replicas can be on different power grids, earthquake zones, countries, continents, etc.
  • Data can be moved, or move “by itself”, to be closer to its users
• Expandable
• Someone else can worry about disk space, backups, security, and more
• Examples: Dropbox, Google Drive, Carbonite (for backups), Amazon S3
• Mental model: secure, self-storage warehouse
Computing
• Rent computing cycles as you need them
• Pay only for what you use
• Often used in conjunction with the provider’s cloud storage service
• Examples: Amazon EC2, Microsoft Azure, Google Cloud
  • Dropbox is a cloud service that uses a different provider’s cloud storage
• Mental model: calling up a temp agency for seasonal employees
Applications
• Provider runs particular applications for clients
• Common types: web sites, email services
• Less common types: shared word processing, payrolls
• Well-known providers: Google’s Gmail and Docs, Microsoft’s Outlook and Office 365, Dreamhost (web hosting)
• Mental model: engaging a contractor for specific tasks
Playing an Active Part: Google Docs
• Someone, using a Web browser, creates a document
• Standard formatting buttons: font, italics or bold, copy and paste, etc.
• Others who have the proper authorization (sometimes just a special URL) can edit the document via their own Web browsers
• The changes made by one user show up in real time in all other users’ browser windows
• In other words, Google is not just a passive repository; it is noticing changes and sending them out immediately
Virtual Machines
• Normal desktops: an operating system (e.g., Microsoft Windows) runs the computer; applications run on top of the operating system
• Virtual machines: a hypervisor running on a single computer emulates multiple real computers. A different operating system can run on each of these emulated computers, and each one is independent of the others and is protected from them
• Net effect: many computers that consume the space and power requirements of a single computer
• Mental model: rented office space
Location of Cloud Servers
• Responsiveness of and effective bandwidth to a server is limited by how far away it is
  • The problem is the speed of light, and not even Silicon Valley can overcome that limit!
• It takes a minimum of a quarter-second to set up a secure connection from Washington to Paris, and twice that to New Delhi
• For performance reasons (and independent of political and legal considerations), large cloud providers therefore place server complexes in many places around the world
  • Also: take advantage of cheap power and cooling
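The speed-of-light limit can be estimated from distance alone. This is a rough sketch, and every constant in it is an assumption: light in fiber is taken to travel at about two thirds of its vacuum speed, connection setup is taken to cost three round trips (one for TCP, two for an older TLS handshake), the city coordinates are rounded, and the cable is assumed to follow the great-circle route, which real cables do not.

```python
import math

FIBER_KM_PER_S = 200_000  # assumption: light in glass travels at about 2/3 of c
ROUND_TRIPS = 3           # assumption: 1 for TCP setup + 2 for an older TLS handshake

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two points on a spherical Earth."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def min_setup_seconds(distance_km):
    """Lower bound: every round trip covers the distance twice."""
    return ROUND_TRIPS * 2 * distance_km / FIBER_KM_PER_S

# Approximate city coordinates (assumed, rounded).
WASHINGTON = (38.9, -77.0)
PARIS = (48.9, 2.4)
NEW_DELHI = (28.6, 77.2)

for name, city in (("Paris", PARIS), ("New Delhi", NEW_DELHI)):
    km = great_circle_km(*WASHINGTON, *city)
    print(f"Washington to {name}: {km:,.0f} km, at least {min_setup_seconds(km):.2f} s")
```

Under these assumptions the lower bound comes out somewhat below the quarter-second quoted above, which is expected since real fiber paths are longer than the great circle and routers add delay; the New Delhi figure is roughly double the Paris one, in line with the slide.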
Where is Data Stored?
• Modern email: on the server and on one or more devices
• Users can’t easily tell what’s on their device (e.g., phone or laptop) versus what is retrieved from the server on demand
• It differs for different devices at different times, and may depend on the user’s recent activity
• What if the device and server are in different jurisdictions?
• (A bad fit for the assumed behavior model of the Stored Communications Act)
Security and Privacy Issues
• Gmail: Google applications scan email and serve up appropriate ads
• Dropbox: uses Amazon S3 for actual storage; encrypts data so that Amazon can’t read it, but Dropbox can
• SpiderOak: data is encrypted with the user’s password; SpiderOak can’t read it
• Outlook.com: blocks file attachments that frequently contain viruses
• Many: check pictures for known child pornography
• Many: spam filtering