1 CSE 524: Lecture 16 Application layer (Part 3)
Dec 31, 2015
2
Administrative
• Programming assignment
– Due Monday, 12/1
– Send to Kiran
• Thanksgiving holiday
– Lecture on Wednesday 11/26 optional
• Security issues not covered in CSE 506
• DoS, Traceback, Intrusion detection, etc.
3
Application Layer
• So far
– Application layer functions
– Specific applications
• HTTP
• DNS
• SMTP/POP
• Today
– More applications
• FTP
• P2P
– Internet challenges
4
AL: ftp: the file transfer protocol
• transfer file to/from remote host
• client/server model
– client: side that initiates transfer (either to/from remote)
– server: remote host
• ftp: RFC 959
• ftp server: port 21
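[Figure: the user at a host works through an FTP user interface and FTP client backed by the local file system; files are transferred between the FTP client and an FTP server backed by the remote file system.]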
5
AL: ftp: separate control, data connections
• ftp client contacts ftp server at port 21, specifying TCP as transport protocol
• two parallel TCP connections opened:
– control: exchange commands, responses between client, server (“out of band control”)
– data: file data to/from server
• ftp server maintains “state”: current directory, earlier authentication
– Note the difference to HTTP authentication
• Protocol allows one ftp client to initiate a transfer between two ftp servers
[Figure: FTP client and FTP server joined by a TCP control connection (port 21) and a TCP data connection (port 20).]
6
AL: ftp commands, responses
Sample commands:
• sent as ASCII text over control channel
• USER username
• PASS password
• LIST return list of files in current directory
• RETR filename retrieves (gets) file
• STOR filename stores (puts) file onto remote host
Sample return codes
• status code and phrase (as in HTTP)
• 331 Username OK, password required
• 125 data connection already open; transfer starting
• 425 Can’t open data connection
• 452 Error writing file
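The control-channel format above (ASCII command lines going out, a status code plus phrase coming back) can be sketched as follows. This is a minimal illustration, not a full client: the function names are hypothetical, and multi-line replies are deliberately not handled.

```python
def build_command(verb: str, argument: str = "") -> bytes:
    """Format one FTP command as an ASCII line terminated by CRLF."""
    text = f"{verb} {argument}" if argument else verb
    return (text + "\r\n").encode("ascii")


def parse_reply(line: str) -> tuple[int, str]:
    """Split a single-line FTP reply into (status code, phrase)."""
    code, _, phrase = line.rstrip("\r\n").partition(" ")
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not a single-line FTP reply: {line!r}")
    return int(code), phrase


if __name__ == "__main__":
    print(build_command("USER", "alice"))
    print(parse_reply("331 Username OK, password required\r\n"))
```

A client would send the bytes from `build_command` over the control connection and feed each reply line it reads to `parse_reply`.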
7
AL: ftp, NAT and the PORT command
• Normal FTP mode
– Server has ports 20 and 21 reserved
– Client initiates control connection by connecting to port 21 on server
– Client allocates port X for data connection
– Client passes the data connection port (X) and its IP address in a PORT command to server
– Server parses PORT command and initiates connection from its own port 20 to the client on port X
– What if client is behind a NAT device?
• NAT must capture outgoing connections destined for port 21 and rewrite the address and port inside the PORT command
• What if NAT doesn’t parse PORT command correctly?
• What if ftp server is running on a different port than 21?
• http://www.practicallynetworked.com/support/linksys_ftp_port.htm
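To see why NAT interferes with normal-mode FTP, it helps to look at what the PORT command carries. In RFC 959 the argument is six decimal numbers: the four bytes of the client's IP address followed by the high and low bytes of the port. The sketch below builds that argument; the function name and the sample address are illustrative assumptions.

```python
def format_port_command(ip: str, port: int) -> str:
    """Build an FTP PORT command: four address bytes, then port high/low bytes."""
    octets = ip.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) < 256 for o in octets):
        raise ValueError(f"not a dotted-quad IPv4 address: {ip!r}")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    high, low = divmod(port, 256)
    return f"PORT {','.join(octets)},{high},{low}"


if __name__ == "__main__":
    # A client behind NAT would advertise its private address here,
    # which the server cannot reach unless the NAT rewrites the command.
    print(format_port_command("192.168.1.10", 5001))
```

Because the address travels inside the application data, a NAT that only rewrites IP and TCP headers leaves the private address in place.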
8
AL: ftp, NAT, and the PORT command
• Passive (PASV) mode
– Client initiates control connection by connecting to port 21 on server
– Client enables “Passive” mode by sending the PASV command
– Server replies with the IP address and port the client should use for the subsequent data connection (a port the server chooses to listen on, not its default data port 20)
– Client initiates data connection by connecting to specified port on server
– Most web browsers do PASV-mode ftp
– What if server is behind a NAT device?
• Same issues as for a client behind NAT in normal mode
– What if both client and server are behind NAT devices?
• Problem
• Similar to P2P transfers
9
AL: P2P file sharing
Example
• Alice runs P2P client application on her notebook computer
• Intermittently connects to Internet; gets new IP address for each connection
• Asks for “Hey Jude”
• Application displays other peers that have copy of Hey Jude.
• Alice chooses one of the peers, Bob.
• File is copied from Bob’s PC to Alice’s notebook using HTTP
• While Alice downloads, other users may download from Alice.
• Alice’s peer is both a Web client and a transient Web server.
All peers are servers = highly scalable!
10
AL: P2P: centralized directory
original “Napster” design
1) when peer connects, it informs central server:
– IP address
– content
2) Alice queries for “Hey Jude”
3) Alice requests file from Bob
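[Figure: peers, including Alice and Bob, around a centralized directory server; arrows are numbered 1 (peers inform the server), 2 (Alice's query), and 3 (Alice's request to Bob).]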
11
AL: P2P: problems with centralized directory
• Single point of failure
• Performance bottleneck
• Copyright infringement
file transfer is decentralized, but locating content is highly centralized
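The three steps of the centralized design can be sketched with a single in-memory index. This is only an illustration under assumptions: the class name, method names, and peer address strings are hypothetical, and a real server would also handle peers that disappear without unregistering.

```python
from collections import defaultdict


class DirectoryServer:
    """Central index mapping each title to the peers that hold it."""

    def __init__(self) -> None:
        self._holders: dict[str, set[str]] = defaultdict(set)

    def register(self, peer: str, titles: list[str]) -> None:
        """Step 1: a connecting peer reports its address and content."""
        for title in titles:
            self._holders[title].add(peer)

    def unregister(self, peer: str) -> None:
        """Remove a departing peer from every entry."""
        for holders in self._holders.values():
            holders.discard(peer)

    def query(self, title: str) -> list[str]:
        """Step 2: return the peers that hold the requested title."""
        return sorted(self._holders.get(title, ()))


if __name__ == "__main__":
    directory = DirectoryServer()
    directory.register("bob:6699", ["Hey Jude"])
    print(directory.query("Hey Jude"))  # step 3 then happens peer to peer
```

Every lookup passes through this one object, which is exactly why it becomes the single point of failure and the performance bottleneck.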
12
AL: P2P: decentralized directory
• Each peer is either a group leader or assigned to a group leader.
• Group leader tracks the content in all its children.
• Peer queries group leader; group leader may query other group leaders.
[Figure: overlay network of ordinary peers and group-leader peers; edges show neighboring relationships in the overlay network.]
13
AL: More about decentralized directory
overlay network
• peers are nodes
• edges between peers and their group leaders
• edges between some pairs of group leaders
• virtual neighbors
bootstrap node
• connecting peer is either assigned to a group leader or designated as leader
advantages of approach
• no centralized directory server
– location service distributed over peers
– more difficult to shut down
disadvantages of approach
• bootstrap node needed
• group leaders can get overloaded
14
AL: P2P: Decentralized, flat search via flooding
• Gnutella
• no hierarchy
• use bootstrap node to learn about others
• join message
• Send query to neighbors
• Neighbors forward query
• If queried peer has object, it sends message back to querying peer
15
AL: P2P: more on query flooding
Pros
• peers have similar responsibilities
• no central registry, no group leaders, entirely flat
• highly decentralized
• no peer maintains directory info
Cons
• excessive query traffic
• query radius: a query only travels a limited number of hops, so it may not find content that is present in the network
• bootstrap node
• maintenance of overlay network
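Query flooding and the effect of a limited query radius can be sketched as a breadth-first search with a hop limit. The overlay, the peer names, and the `ttl` parameter below are illustrative assumptions rather than Gnutella's actual message format.

```python
from collections import deque


def flood_query(overlay, content, start, wanted, ttl):
    """Return peers holding `wanted` that lie within `ttl` hops of `start`.

    overlay: dict mapping each peer to a list of its neighbors.
    content: dict mapping each peer to the set of objects it holds.
    """
    seen = {start}
    frontier = deque([(start, 0)])
    hits = []
    while frontier:
        peer, hops = frontier.popleft()
        if peer != start and wanted in content.get(peer, ()):
            hits.append(peer)
        if hops == ttl:
            continue  # query radius reached: do not forward further
        for neighbor in overlay.get(peer, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return hits


if __name__ == "__main__":
    overlay = {"alice": ["p1"], "p1": ["alice", "p2"],
               "p2": ["p1", "bob"], "bob": ["p2"]}
    content = {"bob": {"Hey Jude"}}
    print(flood_query(overlay, content, "alice", "Hey Jude", ttl=2))
    print(flood_query(overlay, content, "alice", "Hey Jude", ttl=3))
```

With a radius of 2 the query stops one hop short of Bob, so the search fails even though the content exists in the overlay; raising the radius finds it at the cost of more query traffic.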
16
AL: P2P: BitTorrent
• Previous systems
– Sources/servers
• Have entire copy of content
• Keep content available for others to download
– Clients
• Connect to sources/servers with full copy
– Problems
• Must force clients to become sources/servers for P2P application to work
• Can’t redistribute data blocks of a file until it is received in its entirety (large files problematic)
• No control of content (must rely on naming and popularity to infer integrity)
– “URL-based P2P?”
• Slides courtesy of Karthik Tamilmani
17
AL: Philosophy
• Author: Bram Cohen
• Based on Tit-for-tat
– Incentive - Uploading while downloading
– Get preference in downloading if you supply good upload
• Pieces of files
• Components
– Ordinary web server to serve up metainfo file (.torrent)
– Client web browser
– BitTorrent tracker (location of which is specified as a URL in .torrent)
– Original downloader (the first “seed”)
– Client downloader
18
AL: Overall Architecture
[Figure: a web page links to the .torrent file on the web server. The other components are the tracker, the origin downloader (seed), peers A, B, and C (two leeches and one seed), and the new downloader (“US”).]
19
AL: Overall Architecture
[Figure: the same components (web page with link to .torrent, web server, tracker, origin downloader [seed], peers A, B, C, downloader “US”), with a “Get-announce” message between the downloader and the tracker.]
20
AL: Overall Architecture
[Figure: the same components (web page with link to .torrent, web server, tracker, origin downloader [seed], peers A, B, C, downloader “US”), with a “Response-peer list” message between the tracker and the downloader.]
21
AL: Overall Architecture
[Figure: the same components (web page with link to .torrent, web server, tracker, origin downloader [seed], peers A, B, C, downloader “US”), with three “Shake-hand” exchanges between the downloader and other peers.]
22
AL: Overall Architecture
[Figure: the same components (web page with link to .torrent, web server, tracker, origin downloader [seed], peers A, B, C, downloader “US”), with three “pieces” transfers between the downloader and other peers.]
23
AL: Overall Architecture
[Figure: the same components (web page with link to .torrent, web server, tracker, origin downloader [seed], peers A, B, C, downloader “US”), with four “pieces” transfers among the downloader and other peers.]
24
AL: Overall Architecture
[Figure: the same components (web page with link to .torrent, web server, tracker, origin downloader [seed], peers A, B, C, downloader “US”), showing the “Get-announce” and “Response-peer list” exchange with the tracker together with the “pieces” transfers among peers.]
25
AL: Messages
• Peer – Peer messages
– TCP Sockets
• Peer – Tracker messages
– HTTP Request/Response
• B-encoding
• http://bitconjurer.org/BitTorrent/protocol.html
26
AL: .torrent
• URL of the tracker
• Dictionary keys for B-encoding scheme
• Pieces <hash1,hash2,…,hashn>
• Piece length
• Name
• Length
• Files
– Path
– length
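B-encoding, the serialization used for the .torrent file, has four types: integers (`i42e`), length-prefixed byte strings (`4:spam`), lists (`l...e`), and dictionaries with sorted keys (`d...e`). The encoder below is a minimal sketch of those rules; the function name is an assumption, and the sample dictionary uses made-up values rather than a real torrent.

```python
def bencode(value) -> bytes:
    """Encode ints, strings/bytes, lists, and dicts using B-encoding."""
    if isinstance(value, bool):
        raise TypeError("booleans have no B-encoding")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda pair: pair[0])
        body = b"".join(bencode(k) + bencode(v) for k, v in items)
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


if __name__ == "__main__":
    # Illustrative values only.
    print(bencode({"name": "a.txt", "length": 3}))
```

Sorting the keys makes the encoding of a given dictionary unique, which matters when the encoded bytes are hashed to identify a torrent.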
27
AL: Tracker
• Peer cache
– IP, port, peer id
• State information
– Completed
– Downloading
• Returns random list of peers
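The tracker's job, as described above, is small: remember who announced and hand back a random subset of the other peers. A minimal in-memory sketch follows. The class name, the `max_peers` default of 50, and the tuple layout are assumptions for illustration; a real tracker speaks HTTP and returns a B-encoded response.

```python
import random


class Tracker:
    """Peer cache that answers each announce with a random list of other peers."""

    def __init__(self, max_peers: int = 50, rng=None) -> None:
        self._peers = {}  # peer id -> (ip, port, completed)
        self._max_peers = max_peers
        self._rng = rng or random.Random()

    def announce(self, peer_id, ip, port, completed=False):
        """Record the caller's state and return up to max_peers other peers."""
        self._peers[peer_id] = (ip, port, completed)
        others = [(pid, info[0], info[1])
                  for pid, info in self._peers.items() if pid != peer_id]
        count = min(self._max_peers, len(others))
        return self._rng.sample(others, count)


if __name__ == "__main__":
    tracker = Tracker(max_peers=2)
    tracker.announce("seed", "10.0.0.1", 6881, completed=True)
    tracker.announce("A", "10.0.0.2", 6881)
    print(tracker.announce("US", "10.0.0.9", 6881))
```

The random choice takes no account of locality or peer quality, which is the "naive" drawback raised later in the lecture.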
30
AL: Peer Operation
• Choking algorithm
– Choke/Unchoke
• Avoid large numbers of TCP connections
• Optimistic unchoke for a peer, rotates every 30 sec.
• Preferred peers do not get choked
– Snubbing behavior
• Prevented by Anti-snubbing.
• Upload to interested peers who are not choking.
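One way to picture the choking decision is as a periodic ranking: unchoke the interested peers that have recently given us the best download rates, and keep one extra slot for a randomly chosen "optimistic" peer. The sketch below is a simplified illustration, not the reference client's algorithm; the function name and the default of four slots are assumptions, and the 30-second rotation would be driven by whoever calls it.

```python
import random


def choose_unchoked(download_rate, interested, slots=4, rng=random):
    """Return (regular, optimistic): peers to unchoke in this round.

    download_rate: dict mapping peer -> recent rate at which it uploads to us.
    interested: peers that currently want data from us.
    slots: total upload slots, one of which is reserved for the optimistic unchoke.
    """
    ranked = sorted(interested,
                    key=lambda peer: download_rate.get(peer, 0.0),
                    reverse=True)
    regular = ranked[:slots - 1]        # tit-for-tat: reward the best uploaders
    remaining = ranked[slots - 1:]
    optimistic = rng.choice(remaining) if remaining else None
    return regular, optimistic


if __name__ == "__main__":
    rates = {"a": 50.0, "b": 10.0, "c": 30.0, "d": 0.0, "e": 5.0}
    print(choose_unchoked(rates, list(rates), slots=3))
```

The optimistic slot is what lets a newcomer with nothing to offer yet receive its first pieces and start reciprocating.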
31
AL: Peer Operation
• Verify on receiving complete piece
• Endgame Behavior
– Cancel
– To fix the “waiting” on the last slow peer
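Piece verification ties back to the hash list in the .torrent file: the original BitTorrent protocol stores one SHA-1 digest per piece, and a downloader recomputes the digest when a piece is complete. A minimal sketch follows; the helper names and the tiny piece length are illustrative assumptions.

```python
import hashlib


def piece_hashes(data: bytes, piece_length: int) -> list[bytes]:
    """Split data into fixed-size pieces and return the SHA-1 digest of each."""
    return [hashlib.sha1(data[i:i + piece_length]).digest()
            for i in range(0, len(data), piece_length)]


def verify_piece(index: int, piece: bytes, hashes: list[bytes]) -> bool:
    """Check a received piece against the digest published for that index."""
    return hashlib.sha1(piece).digest() == hashes[index]


if __name__ == "__main__":
    published = piece_hashes(b"abcdefghij", piece_length=4)
    print(verify_piece(0, b"abcd", published))
    print(verify_piece(0, b"abcx", published))
```

A piece that fails the check is discarded and requested again, so a corrupt or malicious peer cannot poison the download.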
34
AL: Strengths
• Better bandwidth utilization
– Download speeds not seen before.
• Up to 7 MB/s from the Internet.
• Limit free riding – tit-for-tat
• Limit leech attack – coupling upload & download
• Ability to resume a download
35
AL: Drawbacks
• Small files – latency, overhead
• Random list of peers - naive
• Scalability
– Millions of peers – Tracker behavior (uses 1/1000 of bandwidth)
– Single point of failure
• Robustness
– System progress dependent on altruistic nature of seeds (and peers)
– Malicious attacks and leeches.
36
AL: Interesting links
• Official site: http://bitconjurer.org/BitTorrent
• BitTorrent FAQ: http://btfaq.com
• Torrent sites
– http://f.scarywater.net
– http://www.suprnova.org
– http://tvtorrents.com
Remember
– leave your download windows open
– Big brother is watching!
37
End of application layer and all layers
• What’s next?
– Internet challenges
• Doing this one out-of-order because of holiday
– Lecture on security issues
– Putting it all together lecture (saving this until Monday)
38
Internet challenges
• Not a complete list
– Address depletion (IPv4, IPv6)
– NAT and the loss of transparency
– Routing infrastructure
– Quality of service
– Security
– DNS scaling
– Dealing with privatization
– Interplanetary Internet
39
Address depletion
• IPv4: 32-bit address (4.3 billion identifiers)
– 25% in use: 960 million addresses (advertised in BGP tables)
– http://www.caida.org/outreach/resources/learn/ipv4space/
– Inactive IP addresses advertised as well
– Estimated 86 million active (July 2000)
– http://www.netsizer.com/
– Do we need more addresses?
• IPv6: 128-bit address
• NAT: extend current space via hack
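The sizes quoted above are easy to check with a few lines of arithmetic. The sketch below uses only the figures from the slide (960 million advertised, 86 million active); the constant and function names are assumptions for illustration.

```python
IPV4_TOTAL = 2 ** 32     # 32-bit addresses
IPV6_TOTAL = 2 ** 128    # 128-bit addresses
ADVERTISED = 960_000_000  # from the slide: advertised in BGP tables
ACTIVE = 86_000_000       # from the slide: estimated active, July 2000


def share(count: int, total: int = IPV4_TOTAL) -> float:
    """Fraction of the address space that `count` addresses occupy."""
    return count / total


if __name__ == "__main__":
    print(f"IPv4 addresses:   {IPV4_TOTAL:,}")
    print(f"advertised share: {share(ADVERTISED):.1%}")
    print(f"active share:     {share(ACTIVE):.1%}")
    print(f"IPv6 / IPv4:      2**{(IPV6_TOTAL // IPV4_TOTAL).bit_length() - 1}")
```

The advertised share comes out slightly under a quarter, so the 25% on the slide is a rounded figure, and the active share is only a few percent of the space.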
41
NAT
• Network address translation
• Source and destination IP addresses and (sometimes) ports rewritten by device
• Rewritten without knowledge of end-hosts
• Translation typically performed only on IP address portion of packet, not on addresses within data
• Envelope analogy
– Return address on outside changed
– Return address on inside unchanged
– Application data must be rewritten to maintain consistency
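The envelope analogy can be made concrete with a toy translation table. The sketch below rewrites only the source address and port of an outgoing packet and leaves the payload alone, which is how a private address embedded in application data (such as an FTP PORT command) slips through unchanged. The class, its port numbering, and the addresses are hypothetical.

```python
class Nat:
    """Toy port-translating NAT: rewrites headers, never the payload."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self._next_port = first_port
        self._outbound = {}  # (private ip, private port) -> public port
        self._inbound = {}   # public port -> (private ip, private port)

    def translate_outbound(self, src_ip: str, src_port: int, payload: bytes):
        """Return the rewritten (source ip, source port, payload)."""
        key = (src_ip, src_port)
        if key not in self._outbound:
            self._outbound[key] = self._next_port
            self._inbound[self._next_port] = key
            self._next_port += 1
        return self.public_ip, self._outbound[key], payload

    def translate_inbound(self, dst_port: int):
        """Map a public port back to the private endpoint, or None if unknown."""
        return self._inbound.get(dst_port)


if __name__ == "__main__":
    nat = Nat("203.0.113.7")
    print(nat.translate_outbound("192.168.1.10", 5000,
                                 b"PORT 192,168,1,10,19,137\r\n"))
    print(nat.translate_inbound(5001))  # no mapping for the advertised data port
```

The outside of the envelope now carries the public address, but the letter inside still names 192.168.1.10, and an unsolicited connection to the advertised data port finds no mapping.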
42
NAT
• What’s bad about NAT?
– Breaks transparency of IP
– Breaks hourglass and end-to-end principles (network must be changed for new applications to be deployed)
– FTP, servers, P2P services and NAT
– SIP, conferencing applications
– Breaks IPsec
– Man-in-the-middle attacks
• What’s good about NAT?
– Renumbering easy
43
NAT
• Application writing before NAT
– New applications require no changes to be deployed on the Internet
– New applications require no changes in the Internet to be deployed
• Application writing after NAT
– All new applications must be written with explicit knowledge of intermediate devices which rewrite network and application information
44
Routing infrastructure
• http://www.telstra.net/ops/bgptable.html
• Backbone routers must keep table of all routes (75000 entries)
• Growth of table size
– Alleviated with CIDR aggregation and NAT
– Potentially exacerbated if portable addressing and multi-homing used
• Routing instability
– Frequency of updates increases with size
– Update damping occurring already
– Did not respond well under Code Red
• BGP sessions broke
• Update floods exacerbating load
• Potential for breakdown in connectivity
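CIDR aggregation reduces table size by replacing adjacent prefixes with a single shorter one. Python's standard `ipaddress` module can demonstrate the idea; the prefixes below are illustrative, and the sketch ignores the routing-policy conditions (such as a shared next hop) that real aggregation also depends on.

```python
import ipaddress


def aggregate(prefixes: list[str]) -> list[str]:
    """Collapse adjacent or overlapping prefixes into the fewest covering prefixes."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


if __name__ == "__main__":
    table = ["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24",
             "192.0.2.0/24"]
    print(aggregate(table))
```

Four contiguous /24 entries shrink to one /22, while the unrelated prefix stays as it is. Portable addressing and multi-homing work against this because they scatter prefixes that can no longer be merged.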
46
Routing infrastructure
• Reducing state in the network
– Global state at every backbone router
– Other non-global approaches?
• Ambulance routing
• Airplane routing
• Landmark routing
• Chess games
• Limited-distance look-ahead
• Better scaling properties
47
Routing infrastructure
• Non-adaptive routing on backbone
– Early-exit (hot-potato) routing
• Tier 1 ISPs route traffic solely on whether destination is within network
• Limited alternative paths
• Limited robustness and poor performance
48
Routing infrastructure
• Scaling routing performance
– Lambda switching, MPLS
• DWDM requires extremely fast forwarding
• At edges, map traffic based on IP address to wavelength or other non-IP label
• Wavelength or label switch across multiple hops to other edge
• Eliminate intermediate IP route lookups
– Faster IP lookups
• Data structures and algorithms for fast lookups
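The lookup that routers need to make fast is longest-prefix match: among all table entries that cover a destination, pick the most specific one. A binary trie is one of the simplest data structures for this. The sketch below is IPv4-only and written for clarity rather than speed; the class name and next-hop labels are assumptions.

```python
import ipaddress


class PrefixTrie:
    """Binary trie keyed on address bits, storing a next hop at prefix nodes."""

    def __init__(self) -> None:
        self._root = {}  # child nodes under keys 0 and 1; next hop under "hop"

    def insert(self, prefix: str, next_hop: str) -> None:
        network = ipaddress.ip_network(prefix)
        bits = int(network.network_address)
        node = self._root
        for i in range(network.prefixlen):
            bit = (bits >> (31 - i)) & 1
            node = node.setdefault(bit, {})
        node["hop"] = next_hop

    def lookup(self, address: str):
        """Return the next hop of the longest matching prefix, or None."""
        bits = int(ipaddress.ip_address(address))
        node = self._root
        best = node.get("hop")
        for i in range(32):
            node = node.get((bits >> (31 - i)) & 1)
            if node is None:
                break
            best = node.get("hop", best)
        return best


if __name__ == "__main__":
    table = PrefixTrie()
    table.insert("10.0.0.0/8", "A")
    table.insert("10.1.0.0/16", "B")
    print(table.lookup("10.1.2.3"), table.lookup("10.2.0.1"))
```

This version walks one bit per step, up to 32 steps per packet; faster schemes reduce the number of memory accesses per lookup, which is the goal the slide refers to.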
49
Routing Infrastructure
• Other challenges
– Policy-based routing, packet classification
– Non-destination-based routing
– Route-pinning for QoS
50
Quality of service
• Predictable performance
• “Weak-link” phenomenon
• Requires
– ISP agreements
– Global support for QoS
• Applications
• OS
• All devices in the network (routing failures, updates, queuing)
– Problem
• Unpredictable media: Ethernet, 802.11, etc.
51
Security
• Anonymity of IP
– Sender fills in its address
– Connectivity over security
• Spoofing and DDoS
• IP traceback
– http://www.acm.org/sigs/sigcomm/sigcomm2001/p1.html
• Ingress filtering
– http://www.ietf.org/rfc/rfc2827.txt
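Ingress filtering counters spoofing at the network edge: a provider forwards a packet from a customer link only if its source address falls inside the prefixes assigned to that customer. A minimal sketch of that check follows; the function name and the prefixes are illustrative assumptions.

```python
import ipaddress


def permit_source(source: str, customer_prefixes: list[str]) -> bool:
    """Return True if the packet's source address belongs to the customer."""
    address = ipaddress.ip_address(source)
    return any(address in ipaddress.ip_network(prefix)
               for prefix in customer_prefixes)


if __name__ == "__main__":
    assigned = ["198.51.100.0/24"]
    print(permit_source("198.51.100.9", assigned))  # legitimate source
    print(permit_source("10.9.9.9", assigned))      # spoofed source, dropped
```

The check is cheap, but it only helps if it is deployed close to the sender, since the sender is the one who fills in the source address.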
52
Security
• DNS is weak because it is centralized
– 13 root name servers
– Limited due to packet size constraints
• Routing is weak because it is decentralized
– Rogue source sending updates (anecdotal evidence)
• What if I advertise all possible routes through my modem?
• S-BGP
– Convergence problems
• L0pht
– May 1998: 30 min to shut down Internet
53
DNS scaling
• Relatively flat structure
• 13 centralized TLD name servers
• .com servers overloaded
• DNS used as a directory service
• Internet directory service?
– RealNames
– AOL Keywords
54
Dealing with Privatization
• Improving routing instability, traffic characterization, security, etc. difficult
• Finding sources of disruption (software, hardware, users) difficult
• Problems are hidden, not shared
• Open standards in the face of commercial interests
– Patents on protocols
– Closed protocols
• ICQ, AIM, Hotmail
– Potential for closed networks
• Cable network consolidation, ISP consolidation
• D. Clark, J. Wroclawski, K. Sollins, R. Braden, “Tussle in Cyberspace: Defining Tomorrow’s Internet”
55
Interplanetary Internet, Sensor networks
• Interplanetary:
– Extremely long round-trip times, large feedback delays
– Protocols designed with terrestrial timeout parameters
– See Vint Cerf’s web page
• Sensor networks
– Extremely lossy links
– Disconnected operation
– See SenSys 2003 program
– K. Fall, “A Delay Tolerant Networking Architecture for Challenged Internets”, SIGCOMM 2003
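To get a feel for the interplanetary round-trip times mentioned above, the propagation delay can be computed directly from distance and the speed of light. The Earth to Mars distances below are rough approximations used only for illustration (the true distance varies continuously with orbital position), and the names are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458

# Approximate Earth-Mars distances in meters (assumed round figures).
EARTH_MARS_NEAR_M = 55e9
EARTH_MARS_FAR_M = 400e9


def round_trip_seconds(distance_m: float) -> float:
    """Propagation-only round-trip time over a one-way distance in meters."""
    return 2 * distance_m / SPEED_OF_LIGHT_M_PER_S


if __name__ == "__main__":
    for label, distance in (("near", EARTH_MARS_NEAR_M), ("far", EARTH_MARS_FAR_M)):
        minutes = round_trip_seconds(distance) / 60
        print(f"Mars ({label}): about {minutes:.0f} minutes round trip")
```

Round trips measured in minutes, before any queuing or retransmission, are far outside the range that retransmission timers tuned for terrestrial paths expect.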