1
Simulating the Internet: challenges & methods
Kevin Fall
Network Research Group,
Lawrence Berkeley National Laboratory
Berkeley, CA
USA
2
LBNL’s Network Research Group
Members:
Van Jacobson, group leader
Kevin Fall
Sally Floyd *
Craig Leres
Vern Paxson *
http://www-nrg.ee.lbl.gov
3
Outline
• Simulating the Internet is not easy
• The VINT project: an effort in Internet-style simulation
4
Simulations for Network Research
• Models of interesting behavior
• Easily-varied parameters
• Controlled environment, reproducible results
5
Problems in Characterizing the Internet
• Large Scale:
– even a small fraction of misbehaving entities is non-negligible
– scale stresses assumptions in protocol design and implementation
• Drastic Change:
– will the rate of change continue?
– predominant use not obvious (e.g. the web, continuous media, ?)
• Heterogeneity everywhere!
6
Link and Topology Heterogeneity
• Delay and bandwidth span 5 to 6 orders of magnitude!
– 20 µs to 2 s round-trip propagation delay
– 10 Kb/s to 10 Gb/s bandwidth range
• Topology
– hierarchy and clustering chosen by ISPs
– performance tied to which path packets take in network
– paths may change dynamically
– IP routes are frequently asymmetric
7
Protocol Heterogeneity
• Adaptive and non-adaptive Internet protocols
– react to congestion (TCP)
– nonreactive (UDP)
• Application Dynamics
– multi-protocol interactions
– user activity
– application mix varies greatly by site
• Implementations may not be consistent
8
Traffic
• Internet traffic not easily characterized
– no commonly accepted model
– traffic may be shaped by congestion response
• Dependent on source behavior
– application protocol limitations
– new applications
– pricing policies
9
So, what can be done in simulation?
• Strategy
– 1: Look for invariants
– 2: Explore the parameter space
– 3: Understand the limits of simulation
10
1: Searching for Invariants
• What do we really know about Internet dynamics?
• How to characterize statistically?
– traffic
– users
– sessions
– congestion, etc.
• Mathematical simplicity does not imply accuracy
11
The Self-Similar Nature of Traffic
• packet interarrival times not exponentially distributed
– thus, arrival process is not Poisson
– bursts over multiple time-scales
– they exhibit long-range dependence
– suggests self-similar models
– (there is still contention on this point)
• Implications
– aggregation does not “smooth out” variation
– traffic synthesis more difficult
– network buffering may be much less effective than thought based on Markovian models
12
User-generated Sessions look Poisson
• user-generated session arrivals look Poisson (machine-generated connection arrivals are not)
• distribution is invariant, parameterized only by a (fixed, hourly) rate
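As an illustration of the point above, session arrivals with a fixed rate per hour can be synthesized by drawing exponential interarrival times within each hour. This is a minimal sketch, not part of the VINT tools; the function names and the example rates are hypothetical.

```python
import random


def poisson_arrivals(rate_per_hour, duration_hours, seed=0):
    """Arrival times (in hours) of a homogeneous Poisson process.

    Interarrival gaps are exponential with mean 1 / rate_per_hour.
    rate_per_hour must be positive.
    """
    rng = random.Random(seed)
    t = 0.0
    arrivals = []
    while True:
        t += rng.expovariate(rate_per_hour)
        if t >= duration_hours:
            return arrivals
        arrivals.append(t)


def hourly_sessions(hourly_rates, seed=0):
    """Session arrival times over consecutive hours, one fixed rate per hour."""
    times = []
    for hour, rate in enumerate(hourly_rates):
        # Each hour is an independent Poisson process shifted to its start time.
        times.extend(hour + t for t in poisson_arrivals(rate, 1.0, seed + hour))
    return times


# Hypothetical rates: a quiet hour followed by a busy hour.
sessions = hourly_sessions([100.0, 400.0], seed=1)
print(len(sessions), "session arrivals in 2 hours")
```

Only the hourly rate changes between hours; the form of the distribution stays the same, which is what makes it usable as an invariant.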
13
Network Activity tends to have a heavy-tailed distribution
• Examples: packets in a user’s TELNET session; bytes in FTP-DATA transfers
• distribution looks Pareto with shape β, where 0.9 < β < …
• Pareto distribution with shape β has:
– infinite mean if β ≤ 1
– infinite variance if β ≤ 2
• This type of Pareto (β ≤ 1) has infinite mean and variance (and is very unlike an exponential)
• burstiness remains across aggregation
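To make the contrast with an exponential concrete, the sketch below draws Pareto samples by inverting the CDF and measures how much of the total is contributed by the single largest sample. The shape 0.95, the scale 1.0, and the sample count are assumptions picked for illustration (a shape at or below 1 puts the distribution in the infinite-mean regime); they are not values taken from measurements.

```python
import random


def pareto_sample(shape, scale, rng):
    """Draw one Pareto(shape, scale) value by inverse-transform sampling.

    The CDF is F(x) = 1 - (scale / x) ** shape for x >= scale.
    """
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
    return scale / u ** (1.0 / shape)


def max_share(samples):
    """Fraction of the total contributed by the single largest sample."""
    return max(samples) / sum(samples)


rng = random.Random(42)
heavy = [pareto_sample(0.95, 1.0, rng) for _ in range(10000)]  # assumed shape
light = [rng.expovariate(1.0) for _ in range(10000)]

# With a heavy tail, one huge sample can account for a visible share of the
# total; with an exponential, no single sample matters much.
print("Pareto max share:     ", max_share(heavy))
print("exponential max share:", max_share(light))
```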
14
2: Exploring the Parameter Space
• Consider a large range for parameters
– recall, 5-6 orders of magnitude range in bandwidth and delay
– note that behavior is often non-linear in parameter values
• Repeat, repeat, repeat
– topology generators
– randomness
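One way to organize such an exploration is to space parameter values logarithmically and repeat each configuration with several random seeds. The sketch below is hypothetical: `toy_run` merely stands in for a real simulation, the delay endpoints and the 64 KB window are assumptions, and only the 10 Kb/s to 10 Gb/s bandwidth range comes from the slides.

```python
import itertools
import random


def log_spaced(low, high, points):
    """Return `points` values spaced evenly on a log scale from low to high."""
    ratio = (high / low) ** (1.0 / (points - 1))
    return [low * ratio ** i for i in range(points)]


def sweep(bandwidths_bps, delays_s, seeds, run):
    """Call run(bandwidth, delay, seed) for every combination of inputs."""
    results = {}
    for bw, delay, seed in itertools.product(bandwidths_bps, delays_s, seeds):
        results[(bw, delay, seed)] = run(bw, delay, seed)
    return results


def toy_run(bandwidth_bps, delay_s, seed):
    """Placeholder for a simulation: window-limited throughput plus noise."""
    rng = random.Random(seed)
    window_bits = 64 * 1024 * 8  # assumed fixed 64 KB window
    return min(bandwidth_bps, window_bits / delay_s) * rng.uniform(0.9, 1.0)


bandwidths = log_spaced(10e3, 10e9, 7)  # 10 Kb/s to 10 Gb/s, one per decade
delays = log_spaced(1e-3, 2.0, 4)       # assumed delay endpoints, in seconds
results = sweep(bandwidths, delays, seeds=range(5), run=toy_run)
print(len(results), "runs")
```

Even this toy model is non-linear in its parameters: throughput follows bandwidth until the window limit takes over, which a linear sweep over a narrow range would miss.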
15
3: The Limits of Simulation
• Simplified Models
– useful for gaining intuition and exploring parameters
– danger of oversimplification
• Need for a Reality Check
– compare simulation results with measurement
– Internet measurements often offer “surprises”
16
The VINT Project (Virtual InterNet Testbed)
• USC/ISI: Deborah Estrin, Mark Handley, John Heidemann, Ahmed Helmy, Polly Huang, Satish Kumar, Kannan Varadhan, Daniel Zappala
• LBNL: Kevin Fall, Sally Floyd
• UC Berkeley: Elan Amir, Steven McCanne
• Xerox PARC: Lee Breslau, Scott Shenker
• VINT is currently funded by DARPA through mid-1999
17
VINT Goals
• provide common platform for network research
• explore issues of scale and multi-protocol interaction
• Specific Areas:
– multicast, end-to-end transport
– simulation scaling
– traffic management
– emulation
18
Multicast Research
• Reliable Multicast Transport
– Large Scale
– “SRM”: Scalable Reliable Multicast
• Multicast Congestion Management
– Group formation
– (still ongoing)
• Layered Transmission
– layered encoding
– dynamic multi-group join/leave
19
Simulation Scaling
• Simulator capable of 1000s of nodes
• Want 100,000s of nodes (or more)
• “Session” Abstraction
– abstract away some simulation details
– trade detail for time/space
– scales simulation by about 10X
20
Traffic Management
• Active Buffer Management
– Random Early Detection (RED) Gateways
– Explicit Congestion Notification (ECN)
• Packet Scheduling
– Class-Based Queuing (CBQ)
– Round-Robin and Fair Queuing Variants
• Differentiated Services
– Admission Control
– Reservation Support
21
Emulation
• Interface Simulator with Live Network
• Live Traffic Passes through Simulated Topology
• Special “Real-Time” Scheduler
– may not keep synchronized under load
22
The VINT Simulation Environment
• Components: ns2 and nam
• NS2 (network simulator, version 2):
– Discrete-event C++ simulation engine: scheduling, timers, packets
– Split OTcl/C++ object “library”: protocol agents, links, nodes, classifiers, routing, error generators, traces, queuing, math support (random variables, integrals, etc.)
• Nam (network animator)
– Tcl/Tk application for animating simulator traces
• available on UNIX and Windows 95/NT
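For readers unfamiliar with the term, the core of a discrete-event engine is a time-ordered event queue: the simulator repeatedly pops the earliest event, advances the clock to its timestamp, and runs its handler, which may schedule further events. The sketch below shows only that general idea in Python. It is not ns2 code (the ns2 engine is written in C++), and every name in it is hypothetical.

```python
import heapq
import itertools


class Scheduler:
    """Minimal discrete-event scheduler built on a binary heap."""

    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._ids = itertools.count()  # tie-breaker: FIFO order at equal times

    def schedule(self, delay, handler, *args):
        """Run handler(*args) `delay` time units after the current time."""
        if delay < 0:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(
            self._queue, (self.now + delay, next(self._ids), handler, args)
        )

    def run(self, until=float("inf")):
        """Process events in time order until the queue empties or `until`."""
        while self._queue and self._queue[0][0] <= until:
            self.now, _, handler, args = heapq.heappop(self._queue)
            handler(*args)


# Usage: two packets sent over a link with a fixed propagation delay.
sim = Scheduler()
log = []


def send(packet_id, link_delay):
    log.append((sim.now, "send", packet_id))
    sim.schedule(link_delay, receive, packet_id)


def receive(packet_id):
    log.append((sim.now, "recv", packet_id))


sim.schedule(0.0, send, 1, 0.05)
sim.schedule(0.01, send, 2, 0.05)
sim.run()
for entry in log:
    print(entry)
```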
23
NS Supported Components
• Protocols:
– TCP (2 modes + variants), UDP, IP, RTP/RTCP, SRM, 802.3 MAC, 802.11 MAC
• Routing
– global topology map, classifiers
– static unicast, dynamic unicast (distance-vector), multicast
• Queuing and packet scheduling
– FIFO/drop-tail, RED, CBQ, WRR, DRR, SFQ
• Topology: nodes, links
• Failures: link errors/failures
• Emulation: interface to a live network
24
TCP Animation
25
SRM Animation
26
Benefits
• Common simulation environment
– simulations expressed in scripting language
– separate visualization tool
– topology and “scenario” generators
– modular structure is extensible; sources provided
• Unique Features
– Rich Protocol Set
– “Session” abstraction: scales simulations by a factor of 4
– Visualization and Emulation capabilities: separate Network Animator (nam) tool; low-level interface to system’s protocols
27
The NS Architecture
• Simulator is an Object Tcl (OTcl) “shell”
• Split Objects
– fine-grain, easily composed
– objects exist in both C++ and Tcl contexts
– library handles object consistency
28
Work in Progress
• Adaptive Web Caching (LBNL, UCLA)
• Nam Improvements (USC, ISI)
• Simulator Scaling (USC, ISI)
• Simulator Addressing Hierarchy (USC, ISI)
• Protocol Robustness (USC, ISI)
• Emulation (LBNL, UCB)
• Quality of Service (Xerox PARC)
• Router-Based Congestion Control (LBNL)
• Topology and “Scenario” Generation
29
Router-Based Congestion Control
• Two main classes of traffic on Internet:
– TCP (reduces sending rate in face of loss)
– UDP (application decides when and how much to send)
• Internet stability due in large part to TCP’s congestion response
• Danger with growing use of UDP-based applications
– UDP will “steal” bandwidth from TCP
– currently no incentives to prevent this behavior
30
Encouraging Congestion Control
• Combine RED Gateway with analysis and regulation
• RED (Random Early Detection) Gateways:
– keep smoothed average queue size measure
– when measure exceeds threshold, drop or mark packets with increasing probability
– a flow’s fraction of the aggregate random packet drop rate is roughly equal to its fraction of the aggregate arrival rate
• Select candidate “bad” flows with high drop rate
31
“Bad” Flow Selection Criteria
• Flow is not “TCP-friendly”
– throughput exceeds factor times analytic model:
$$\frac{1.5\sqrt{2/3}\,B}{R\sqrt{p}}$$
– where B = packet size, R = path round-trip time, p = packet drop probability
• Flow is not responsive
– does not alter arrival rate with increased packet drops
• Flow is “high-bandwidth”
– uses more than its “fair share”
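As a worked example of the TCP-friendly test, the analytic model on this slide evaluates to 1.5 · sqrt(2/3) · B / (R · sqrt(p)), with B the packet size, R the path round-trip time, and p the packet drop probability. The sketch below computes that bound and compares a flow's measured throughput against it. The function names, the example numbers, and the value of `factor` are assumptions for illustration; the slide does not state the factor.

```python
import math


def tcp_friendly_bound(packet_size_bytes, rtt_s, drop_prob):
    """Analytic TCP throughput model, in bytes per second.

    Evaluates 1.5 * sqrt(2/3) * B / (R * sqrt(p)).
    """
    if rtt_s <= 0 or not 0 < drop_prob <= 1:
        raise ValueError("need rtt_s > 0 and 0 < drop_prob <= 1")
    return 1.5 * math.sqrt(2.0 / 3.0) * packet_size_bytes / (
        rtt_s * math.sqrt(drop_prob)
    )


def is_tcp_friendly(throughput_bps, packet_size_bytes, rtt_s, drop_prob, factor):
    """True if measured throughput stays within factor times the model."""
    bound = tcp_friendly_bound(packet_size_bytes, rtt_s, drop_prob)
    return throughput_bps <= factor * bound


# Hypothetical flow: 1500-byte packets, 100 ms round trip, 1% drops.
bound = tcp_friendly_bound(1500, 0.1, 0.01)
print("bound (bytes/s):", bound)
print(is_tcp_friendly(500_000, 1500, 0.1, 0.01, factor=1.0))
```

Note that the bound falls as the drop rate rises, so a flow that keeps its rate constant under increasing drops will eventually exceed it.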
32
Flow Regulation
• Need bandwidth-regulating packet scheduler
– CBQ
– others
• Use “good” and “bad” scheduling partitions
• Bad partition gets allocation below current usage
– decays over time with continued offered load
– flows may be reclassified as “ok” if they adapt
33
Conclusion
• Simulating the Internet is difficult
• Simulation is useful, but must be used carefully
• The VINT project provides a common simulation framework that addresses many of these issues
34
Additional Information
• Web pages:
– http://www-nrg.ee.lbl.gov/
– http://www-mash.cs.berkeley.edu/ns
– http://netweb.usc.edu/vint
– http://www.ito.darpa.mil/Summaries97/E243_0.html
• NS Users Mailing list:
– [email protected]
– “subscribe ns-users”