University of Washington Computing & Communications
Networking Update
Terry Gray
Director, Networks & Distributed Computing
University of Washington
UW Medicine IT Steering Committee
16 January 2004
20 February 2004
Outline
• In our last episode…
  – Context
  – Expanded Partnership
  – Recent Problems
• Today
  – Systemic Problems and Progress
  – Network Security Chronology
  – Design Issues
Context: A Perfect Storm
• Increased dependency on network apps
• Decreased tolerance for outages
• Decades of deferred maintenance...
• Inadequate infrastructure investment
• Some old/unfortunate design decisions
• Some extraordinarily fragile applications
• Fragmented host management
• Increasingly hostile security environment
• Increasing legal/regulatory liability
• Importance of research/clinical leverage
Key Elements of the Partnership
• Changed: C&C now responsible for...
  – In-building network implementation and operational support for med ctrs, clinics
  – Med center network design “for real”
• Not Changed: C&C still responsible for...
  – Network backbone, routers
  – Regional and Internet connectivity
  – SoM and Health Sciences networking
Why the Partnership Makes Sense
• Consistency, interoperability, manageability
• Leverage C&C networking expertise
• Clinical/research hi-performance network needs
• 24x7 Network Operations Center (NOC)
• Advanced network management tools
• Avoid design/build organizational conflicts
• Beyond the network... hope to share distributed system architecture and network computing expertise
Recent Problems
• Oct 29: Partial router failure reveals escalation procedure problems
• Oct 30: Security breach triggers connectivity and server problems
• Nov 12: 13 minute power outage triggers extended server outage
• Dec 12: Router upgrade uncovers wiring error, which triggers multicast storm
(None of these were related to the network transition, save perhaps timing of #4)
System Elements
• Environmentals (Power, A/C, Physical Security)
• Network
• Client Workstations
• Servers
• Applications
• Personnel, Procedures, Policy, and Architecture
Failures at one level can trigger problems at another level; need Total System perspective
Reasonable Questions
• What’s up with C&C’s alarm system vendor?
• If power was out for only 14 minutes, why was service out for multiple hours?
• What can we say about an app so fragile that a net interruption of a few seconds requires a server reboot?
• What can we say about thin clients built on top of thick (WinXP) operating systems?
• What can we say about a network where one wiring fault can disable most of the net?
Systemic Problems and Progress
Systemic Network Problems (NB: these pre-date Tom et al)
• Old infrastructure (e.g. cat 3 wire)
• Non-supportable technologies (e.g. FDDI)
• Non-supportable (non-geographic) topology
• Expensive shortcuts (e.g. cat5 mis-terminated)
• Security based on individual IP addresses
• Subnets with clients and critical servers
• Documentation deficiency
  – Contact database
  – Device location database
  – Critical device registry
Systemic General Problems
• Ever-increasing system complexity, dependencies
• Departmental autonomy
• Un-controlled hosts
• Un-reliable power and A/C in equipment rooms
• No net-oriented application procurement standards
• Are HA and DRBR expectations realistic?
• Are backup plans workable?
Some Numbers
           UW Total              Health Sciences   Medical
           (incl UW Medicine)    (incl SoM)        Centers
Subnets    1022                  52                145
Devices    70,000                >8,000            10,000
Network Device Growth
Note: Most dips reflect lower summer use; last one is a measurement anomaly
Network Traffic Growth (linear)
Network Traffic Growth (log)
Near-term Progress and Plans
• Agreement on standard maintenance window
• Created “Top 10” list --creeping to Top 20 :)
• Static addressing work-around (success!)
• FDDI, VLAN elimination
• Subnet splits/upgrades (1500 computers)
• Equipment upgrades
• Router consolidation, dedicated subnets, separate med center backbone
• Equipment, outlet location database updates
• Initial wireless deployment
Design Review and Cost Estimates
• Biggest cost: physical infrastructure & wireplant upgrades
• NetVersant engaged for cost estimation project
• Cisco engaged for network architecture review
• We recommend similar reliability/design assessment for servers, apps & procedures
Design Issues
Design Tradeoffs
• Networks = Connectivity; Security = Isolation
• Fault Zone size vs. Economy/Simplicity
• Reliability vs. Complexity
• Prevention vs. (Fast) Remediation
• Security vs. Supportability vs. Functionality
• Differences in NetSec approaches relate to:
  – Balancing priorities (security vs. ops vs. function)
  – Local technical and institutional feasibility
Tradeoff Examples
• Defense-in-depth conjecture (for N layers)
  – Security: MTTE (exploit) ∝ N**2
  – Functionality: MTTI (innovation) ∝ N**2
  – Supportability: MTTR (repair) ∝ N**2
• Perimeter Protection Paradox (for D devices)
  – Firewall value ∝ D
  – Firewall effectiveness ∝ 1 / D
• Border blocking criteria
  – Threat can’t reasonably be addressed at edge
  – Won’t harm network (performance, stateless block)
  – Widespread consensus to do it
• Security by IP address
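The two scaling claims on this slide can be made concrete with a small sketch. It reads each relation as a simple proportionality with a constant of 1, which is an assumption: the slide gives only the growth terms, not constants or units. The function names are hypothetical and used only for illustration.

```python
def defense_in_depth_scaling(n_layers: int) -> dict:
    """Relative scaling under the defense-in-depth conjecture.

    Assumption: each quantity grows as N**2, normalized to 1 for a
    single layer. More layers lengthen the mean time to exploit (good),
    but also the mean time to innovate and to repair (bad).
    """
    if n_layers < 1:
        raise ValueError("need at least one layer")
    factor = n_layers ** 2
    return {"MTTE": factor, "MTTI": factor, "MTTR": factor}


def perimeter_paradox(n_devices: int) -> dict:
    """Relative firewall value and effectiveness for D devices behind it.

    Assumption: value grows linearly with D while effectiveness falls
    as 1/D, both with a proportionality constant of 1.
    """
    if n_devices < 1:
        raise ValueError("need at least one device")
    return {"value": float(n_devices), "effectiveness": 1.0 / n_devices}


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "layers:", defense_in_depth_scaling(n))
    for d in (10, 1000):
        print(d, "devices:", perimeter_paradox(d))
```

Under these assumptions, going from one layer to three multiplies the time to exploit by nine, but does the same to the time to innovate and the time to repair. That is the tradeoff the conjecture points at.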
Network Security Credo
• Focus first on the edge (Perimeter Protection Paradox)
• Add defense-in-depth as needed
• Keep it simple (e.g. Network Utility Model)
• But not too simple (e.g. offer some policy choice)
• Avoid
  – one-size-fits-all policies
  – cost-shifting from “guilty” to “innocent”
  – confusing users and techs (“broken by design”)
Preserving the Net Utility Model
• What is it?
• Why important?
• Incompatible with perimeter security?
• Too late to save?
• NUM-preserving perimeter defense
  – Logical Firewalls
  – Project 172
• Foiled by static IP addressing…
  – Requires all hosts be reconfigured
Lines of Defense
• Network isolation for critical services.
• Host integrity. (Make the OS net-safe.)
• Host perimeter. (Add host firewalling.)
• Server sanctuary perimeter.
• Network perimeter defense.
• Real-time attack detection and containment.
Network Security Chronology
• 1990: Five anti-interoperable networks
• 1994: Nebula shows network utility model viable
• 1998: Defined border blocking policy
• 2000: Published Network Security Credo
• 2000: Added source address spoof filters
• 2000: Proposed med ctr network zone
• 2000: Proposed server sanctuaries
• 2001: Ban clear-text passwords on C&C systems
• 2001: Proposed pervasive host firewalls
• 2001: Developed logical firewall solution
• 2002: Developed Project-172 solution
• 2003: Slammer, Blaster… death of the Internet
• 2003: Developed flex-net architecture
Next-Gen Network Architecture
• Parallel networks; more redundancy
• Supportable (geographic) topology
• Med center subnets = separate backbone zone
• Perimeter, sanctuary, and end-point defense
• Higher performance
• High-availability strategies
  – Workstations spread across independent nets
  – Redundant routers
  – Dual-homed servers
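As a rough illustration of why redundant routers and dual-homed servers appear under high-availability strategies, the sketch below applies the standard formula for parallel redundancy. It assumes the redundant components fail independently and that any single survivor keeps service up. Real deployments often violate that assumption (shared power, shared wiring, a common software fault), so the result is an upper bound at best. The function name and the 99% figure are hypothetical.

```python
def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components in parallel.

    Assumption: failures are independent and one working copy is
    enough, so the system is down only when every copy is down.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("need at least one copy")
    return 1.0 - (1.0 - component_availability) ** copies


if __name__ == "__main__":
    single = 0.99  # hypothetical availability of one router
    for copies in (1, 2, 3):
        print(copies, "router(s):", parallel_availability(single, copies))
```

With these illustrative numbers, pairing two 99% routers gives roughly 99.99% on paper. A single shared fault, such as a wiring error that affects both paths, removes that benefit.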
Success Metrics
• Tom’s
  – Nobody gets hurt
  – Nobody goes to jail
• Terry’s
  – “Works fine, lasts a long time”
  – Low ROI (Risk Of Interruption)
• Steve’s
  – Four Nines or bust!
Success Metrics II
• We all want:
  – High MTTF, Performance and Function
  – Low MTTR and support cost
• The art is to balance those conflicting goals
  – we are jugglers and technology actuaries
Success Metrics III
• How many nines?
• Problem one: what to measure?
  – How do you reduce behavior of a complex net to a single number?
  – Difficult for either uptime or utilization metrics
• Problem two: data networks are not like phone or power services…
  – Imagine if phones could assume anyone’s number
  – Or place a million calls per second!
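To put "how many nines" in concrete terms, the sketch below converts an availability target into the downtime it allows per year. It assumes availability is measured as simple uptime over a 365-day year, which is exactly the single-number reduction the slide questions. The function name is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def allowed_downtime_minutes(nines: int) -> float:
    """Downtime per year permitted by an availability of N nines.

    For example, 4 nines means 99.99% availability, i.e. an
    unavailability of 10**-4 of the year.
    """
    if nines < 1:
        raise ValueError("need at least one nine")
    unavailability = 10.0 ** (-nines)
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"{n} nines: {allowed_downtime_minutes(n):.1f} minutes/year")
```

Four nines leaves about 52.6 minutes of downtime per year and five nines about 5.3 minutes, so a single multi-hour outage uses up several years of budget at either target.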
Concerns, Future Challenges
• Mitigating impact of closed networking:
  – Needs of the many vs. needs of the few
  – Pressure to make network topology match administrative boundaries
  – Complex access lists
  – False sense of security
  – Increased MTTR
• Next-generation threats: firewalls won’t help
• Security vs. High-Performance Wireless
• Balancing innovation, operations, & security
Lessons
• Five 9s is hard (unless we only attach phones?)
• Even host firewalls don’t guarantee safety
• Perimeter firewalls may increase user confusion, MTTR
• Nebula existence proof: security in an open network
• Even so… defense-in-depth is a Good Thing
• It only takes one compromise inside to defeat a firewall
• Controlling net devices is hard --hublets, wireless
• The cost of static IP configuration is very high
• Net reliability & host security are inextricably linked
• Never underestimate non-technical barriers to progress
Questions? Comments?