Network Management Issues, Approach & A Candidate Solution William J. Burkhard Associate Director The Center for Electronic Design, Communications and Computing James F. Carras Network Coordinator The Center for Electronic Design, Communications and Computing Thomas G. Long Senior Systems Analyst The Center for Electronic Design, Communications and Computing 09 September 2002 College of Engineering The Pennsylvania State University
Page 1: PSU COE Network Management

Network Management

Issues, Approach & A Candidate Solution

William J. Burkhard

Associate Director

The Center for Electronic Design, Communications and Computing

James F. Carras

Network Coordinator

The Center for Electronic Design, Communications and Computing

Thomas G. Long

Senior Systems Analyst

The Center for Electronic Design, Communications and Computing

09 September 2002

College of Engineering

The Pennsylvania State University

Page 2: PSU COE Network Management

Network Management Topics

Issues

Network Management Elements
  Monitoring Resources
  Resource Condition Reporting
  Monitoring Traffic Loads
    Benefits of Monitoring Traffic
  Traffic Analysis & Filtering
    Importance of Traffic Analysis & Filtering
  Resource Control
    Importance of Resource Control

Network Management Configuration

Operational Schema
  Concept of Operations Flow Diagram
  Concept of Operations Description

System Recommendation & Cost

Summary and Conclusion

Page 3: PSU COE Network Management

Network Management

Issues

By May 2002, the College’s network architecture migrated from a centrally manageable “Star” configuration to a completely distributed architecture.

The distributed architecture increased the College’s dependency on the backbone architecture of the University’s Information Technology Services managed systems.

A single point of failure exists in the new distributed network management schema.

Loss of network connectivity to the College’s server farm in Hammond Building results in reports that all College network resources are in a failure mode.

Adequately managing the College’s new network architecture requires a philosophical change in the approach to network management.

Page 4: PSU COE Network Management

Network Management

Monitoring Resources

Nagios: Host & Services Monitoring

Runs intermittent checks on hosts and services.

Sends notifications when problems are encountered.

Web browser accessible hardware status, historical logs and reports.

Network services monitored: SMTP, POP3, HTTP, NNTP, Ping…

Functionality:

Ability to define network host hierarchy.
Ability to send contact notification of problems via email and pager.
Ability to define event handlers.
Ability to apply on-the-fly command interface modifications.

Page 5: PSU COE Network Management

Network Management

Resource Condition Reporting

Equipment Operational Status

Nagios provides the capability to establish control parameters and rules for monitoring and reporting the up or down status of network closet routers and Ethernet switches.

When a system’s performance falls outside the parameters of its rule sets, an alert is generated. The alert can take one or multiple forms; the software can alert one or more individuals via email, pager, cell phone message, instant message, SMS, etc.

The software’s configurability supports network host hierarchy for the detection of and distinction between hosts that are down and those that are unreachable.

The web interface permits the viewing of network element status and facilitates on-the-fly configurations.

A web accessible external command interface enables the application of system monitoring and notification behaviors; these behaviors can also be configured via third-party applications.
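The distinction between hosts that are down and hosts that are merely unreachable can be illustrated with a short Python sketch. This is not Nagios code; the function name, the host names and the parent map are hypothetical, and the host hierarchy is assumed to be a simple tree rooted at the monitoring server.

```python
from typing import Dict, Optional


def classify_hosts(parents: Dict[str, Optional[str]],
                   responds: Dict[str, bool]) -> Dict[str, str]:
    """Label each host UP, DOWN or UNREACHABLE.

    parents maps a host to the host it is reached through (None for a
    host attached directly to the monitoring server). The map is assumed
    to be a tree; a cycle would recurse forever.
    """
    status: Dict[str, str] = {}

    def resolve(host: str) -> str:
        if host in status:
            return status[host]
        if responds.get(host, False):
            status[host] = "UP"
        else:
            parent = parents.get(host)
            # A silent host is DOWN only when the path to it is intact;
            # otherwise the fault lies upstream and it is UNREACHABLE.
            if parent is None or resolve(parent) == "UP":
                status[host] = "DOWN"
            else:
                status[host] = "UNREACHABLE"
        return status[host]

    for host in parents:
        resolve(host)
    return status


# Hypothetical hierarchy: router -> closet switch -> lab server.
parents = {"building-router": None,
           "closet-switch": "building-router",
           "lab-server": "closet-switch"}
responds = {"building-router": True,
            "closet-switch": False,
            "lab-server": False}
print(classify_hosts(parents, responds))
```

In this example only the closet switch is reported as down; the lab server behind it is reported as unreachable, so staff are pointed at one fault instead of two.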

Page 6: PSU COE Network Management

Network Management

Monitoring Traffic Loads

Multi Router Traffic Grapher (MRTG)

Feature:

Monitors traffic loads on network links.

Functionality:

Provides live visual representation of port data traffic loads.
Provides histories of port data traffic loads on a daily, weekly, monthly and annual basis.
Provides a web interface for viewing port traffic loads.

[Figure: MRTG traffic graph, Willard–2 to 211 Hammond, In & Out]

Page 7: PSU COE Network Management

Network Management

Benefits of Monitoring Traffic

Aside from a catastrophic network hardware failure reported by the resource monitoring application, traffic monitoring is a critical indicator of when a real or potential problem may exist in the network.

Monitoring data traffic rates down to the Ethernet switch level shows when rates exceed the expected norm; this becomes the first indicator that a system is misbehaving, has been hacked and is broadcasting high outbound data rates over the network, or that a subnet is under attack from outside the College’s network.

Indications of high data rates allow the staff to efficiently, effectively and swiftly analyze conditions and respond to these unpredictable but certain events.

As demands on the network’s bandwidth increase, traffic monitoring will also indicate when available bandwidth needs to be increased to support faculty usage demands.
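One way to make "rates exceed the expected norm" concrete is a simple statistical threshold. The Python sketch below is an illustration only: the function name, the sample rates and the choice of mean plus three standard deviations are assumptions, not part of the proposed system.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def flag_high_rates(baseline: Sequence[float],
                    recent: Sequence[float],
                    k: float = 3.0) -> List[int]:
    """Return the indexes of recent samples above the expected norm.

    The norm is taken as mean + k * standard deviation of the baseline
    samples; k = 3 is an assumed, adjustable sensitivity.
    """
    threshold = mean(baseline) + k * pstdev(baseline)
    return [i for i, rate in enumerate(recent) if rate > threshold]


# Hypothetical port rates in Mbit/s: a quiet baseline, then a burst.
baseline = [10, 12, 11, 9, 10, 12, 11, 10]
recent = [11, 13, 48, 12]
print(flag_high_rates(baseline, recent))  # -> [2]
```

Only the third recent sample is flagged; it is the kind of reading that would prompt staff to move on to traffic analysis.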

Page 8: PSU COE Network Management

Network Management

Traffic Analysis & Filtering

Ethereal

Features:

Network protocol analysis.
Filters for refining displayed packet summary information.
Maintaining saved copies of network trace information.

Functionality:

Live network data capture.
Editing of capture files.
Multi-protocol filtering of 289 protocols.
GUI browsing of network data.

Page 9: PSU COE Network Management

Network Management

Importance of Traffic Analysis

Traffic analysis is a necessary element in the network management structure because it complements traffic load monitoring.

Load monitoring indicates that something is potentially wrong; traffic analysis answers the question of what is going wrong.

Traffic analysis monitors packet flow into and out of a network segment.

Traffic analysis alerts the network management team when the packet analysis process identifies data traffic that may match known hacker data signatures.

The configurability, monitoring flexibility and filtering capabilities all facilitate efficient threat analysis and identification.

Knowing the nature and identity of an attack enables the network management team to react swiftly and appropriately to these events; router filters can then be employed to block subsequent hacker attacks.

Accurate information about an attack signature can be forwarded to the University Computer & Network Security Office for appropriate actions.
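As a rough illustration of how captured traffic can point to an offending system, the Python sketch below totals bytes by source address and lists the heaviest senders. It does not use Ethereal itself; the record layout, the function name and the addresses (taken from ranges reserved for documentation) are assumptions.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# One packet summary: (source address, destination address, size in bytes).
Packet = Tuple[str, str, int]


def top_talkers(packets: Iterable[Packet], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n source addresses that sent the most bytes."""
    totals: Counter = Counter()
    for src, _dst, size in packets:
        totals[src] += size
    return totals.most_common(n)


# Hypothetical capture summary.
packets = [
    ("192.0.2.10", "198.51.100.5", 1500),
    ("192.0.2.10", "198.51.100.6", 1500),
    ("192.0.2.20", "198.51.100.5", 400),
    ("192.0.2.10", "198.51.100.7", 1500),
]
print(top_talkers(packets, n=1))  # -> [('192.0.2.10', 4500)]
```

A source that dominates the list during a period of abnormal load is a candidate for closer inspection, repair or blocking.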

Page 10: PSU COE Network Management

Network Management

Resource Control

iBoot

Features:

Web Addressable & Configurable
Remote Manual Control
Automatic Failure Detection
Password Protected Output Power Reset
Switches up to 12 Amps @ 115 Volts

Functionality:

Remote Reboot of any Device
Automatic Reboot on Loss of Ping Response

Page 11: PSU COE Network Management

Network Management

Importance of Resource Control

Resource control facilitates monitoring and power control of the most critical networking closet resource: the building router.

Provides a mechanism to quickly recover from possible “hung” system conditions.

In the automatic (ping) mode, the iBoot can detect the router’s failure to respond to pings; it then assumes that the router needs to be rebooted and cycles the device’s power.

If router functionality is not restored after an automatic reboot, the network management team can remotely access the iBoot to force another reboot attempt.

The iBoot can also be used to remotely force a router reboot if the network management team determines that a reboot may clear an anomaly or other router problem.

Prevents costly “down time” by possibly eliminating the need for a site visit by the network management team.
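The automatic (ping) mode described above can be sketched as a small watchdog in Python. This is not the iBoot's firmware logic, which the slides do not describe; the function name, the miss threshold and the power-cycle callback are hypothetical.

```python
from typing import Callable


def watchdog_step(ping_ok: bool,
                  misses: int,
                  max_misses: int,
                  power_cycle: Callable[[], None]) -> int:
    """Process one ping result and return the new consecutive-miss count.

    After max_misses failed pings in a row the device is assumed to be
    hung, its power is cycled, and the count starts again from zero.
    """
    if ping_ok:
        return 0
    misses += 1
    if misses >= max_misses:
        power_cycle()
        return 0
    return misses


# Simulated run: three missed pings in a row trigger one power cycle.
events = []
misses = 0
for ok in [True, False, False, False, True]:
    misses = watchdog_step(ok, misses, max_misses=3,
                           power_cycle=lambda: events.append("power cycled"))
print(events)  # -> ['power cycled']
```

Requiring several consecutive misses before acting is a design choice that avoids rebooting a router because of one dropped packet.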

Page 12: PSU COE Network Management

Network Management

Network Management Configuration

[Diagram: network management configuration. Elements shown: iBoot; Alcatel OmniCore OC-5022 Router; Internal Modem; POTS Line; Dell Management Server; Applications.]

Page 13: PSU COE Network Management

Network Management

Operating Schema

[Diagram: operating schema. Elements shown: Dell Master Management Server with Applications; Network Operations Control Center, Room 151D Hammond (Status, Loads, Analysis); Web Based Desktop & Remote Monitoring; COE Network Connections; Remote User; Building 1 through Building N.]

Page 14: PSU COE Network Management

Network Management

Operating Schema Flow Diagram

[Flow diagram]

Customer Call or Resource Monitoring Alert

Review Nagios Status

Hardware Response Problem?

Yes: Repair

No: Review MRTG Data Rates; COE or External Problems?

COE: Review Traffic Analysis Data to Find Offending Sys; Find & Repair or Block Sys

External: Review Traffic Analysis Data to Identify Attack Signature & Source; Place Filter on Router or Server Firewalls; Gather Forensic Data; Notify PSU Security

Page 15: PSU COE Network Management

Network Management

Concept of Operation Description

Data communications problems usually manifest themselves in one of two ways:

System hardware failures, which can be either user or network based.
Telephone inquiries by Technical Contacts or users.

The appropriate immediate response is to determine whether a network failure has occurred in a building; this is accomplished by referring to the network status plots provided by Nagios.

Network system hardware failures are always followed almost immediately by visual and audible alerts from Nagios. Nagios also provides pager and email alerts to notify staff during non-working hours.

Customer calls usually signify localized problems attributable to computer configuration issues, a computer-to-faceplate connection issue or another problem that is not a network hardware failure.

The absence of any immediate indication of a hardware problem can indicate that computer systems within or external to the College are participating in hacker activities such as Denial of Service attacks. The response to an internal attack differs from the response to an external attack; both rely on the analysis of data traffic.

A College system that traffic analysis identifies as misbehaving on the network is removed or blocked from network access; once a system repair is verified, the system is again given network access privileges.

A system external to the College caught spamming or creating other problems for the network is filtered at a building router to prevent its access to and disruption of College computing.

Forensic data on attacks is conveyed to the University Network and Computing Security Office for analysis and any appropriate follow-on actions.

The remaining element not shown on the flow diagram is the iBoot. This device is used to remotely cycle closet router AC power as a step in attempting to remotely clear an observed problem with the router.

The main console in 151D Hammond provides continuous monitoring of all aspects of network management. Initial alerts can be received and initial corrective actions taken locally on a desktop system; however, detailed troubleshooting, analysis and corrective actions are coordinated and accomplished in this facility.
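The response procedure described on this slide can be summarized as a small decision function. The Python sketch below is one reading of the procedure, not an official runbook; the function name, its two yes/no inputs and the exact ordering of steps are assumptions.

```python
from typing import List


def triage(hardware_problem: bool, offender_is_internal: bool) -> List[str]:
    """Return the ordered response steps for a reported network problem."""
    steps = ["Review Nagios status"]
    if hardware_problem:
        steps.append("Repair hardware")
        return steps
    # No hardware fault: look at load first, then at the traffic itself.
    steps.append("Review MRTG data rates")
    steps.append("Review traffic analysis data")
    if offender_is_internal:
        steps.append("Find and repair or block the offending College system")
    else:
        steps.extend([
            "Identify attack signature and source",
            "Place filter on router or server firewalls",
            "Gather forensic data",
            "Notify University security office",
        ])
    return steps


for step in triage(hardware_problem=False, offender_is_internal=False):
    print(step)
```

Writing the procedure down in this form makes it easy to see that every path begins with the Nagios status check, and that only external problems lead to a security office notification.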

Page 16: PSU COE Network Management

Network Management

System Recommendation & Cost

Item                  | Cost / Closet          | Total Quantity | Cost / Month | Total 1st Year Cost | Recurring Annual Cost
Closet Servers        | $1,285                 | 18             |              | $23,130             |
1 Gig Memory Upgrade  | $239                   | 18             |              | $4,302              |
Data Probe iBoot      | $275                   | 15             |              | $4,125              |
USB Cameras           | $29                    | 17             |              | $439                |
TNS POTS Lines        | $125 install, $18/mo.  | 17             | $306         | $5,797              | $3,672
Pagers                | $1,155                 | 3              | $7.50        | $1,177.50           | $22.50
OR Cell Phones        | $105                   | 3              | $90          | $1,185              | $1,080
Totals (pagers)       | $3,126                 |                | $313.50      | $38,970.50          | $3,694.50
Totals (cell phones)  | $2,076                 |                | $396         | $38,978             | $4,752

Software Applications are currently provided as freeware.
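The totals in the cost table can be re-derived from its line items. The Python sketch below copies the first-year line totals as stated on the slide and recomputes the POTS line costs and both grand totals; the variable and function names are illustrative only.

```python
from decimal import Decimal

# First-year line totals, copied as stated in the cost table.
COMMON_FIRST_YEAR = {
    "Closet servers": Decimal("23130"),
    "1 Gig memory upgrade": Decimal("4302"),
    "Data Probe iBoot": Decimal("4125"),
    "USB cameras": Decimal("439"),
}

POTS_LINES = 17
POTS_INSTALL = Decimal("125")   # one-time charge per line
POTS_MONTHLY = Decimal("18")    # monthly charge per line

NOTIFICATION = {
    "pagers": {"first_year": Decimal("1177.50"), "recurring": Decimal("22.50")},
    "cell phones": {"first_year": Decimal("1185"), "recurring": Decimal("1080")},
}


def pots_costs():
    """Return (first-year cost, recurring annual cost) for the POTS lines."""
    recurring = POTS_LINES * POTS_MONTHLY * 12
    first_year = POTS_LINES * POTS_INSTALL + recurring
    return first_year, recurring


def totals(option):
    """Return (total first-year cost, recurring annual cost) for one option."""
    pots_first, pots_recurring = pots_costs()
    common = sum(COMMON_FIRST_YEAR.values(), Decimal("0"))
    choice = NOTIFICATION[option]
    return (common + pots_first + choice["first_year"],
            pots_recurring + choice["recurring"])


for option in NOTIFICATION:
    first_year, recurring = totals(option)
    print(option, first_year, recurring)
```

The results match the table: $38,970.50 and $3,694.50 with pagers, $38,978 and $4,752 with cell phones.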

Page 17: PSU COE Network Management

Network Management

Summary & Conclusion

Summary

The distribution of COE networking resources requires a change in network management philosophy and architecture.

Loss of network connectivity to TNS’ backbone from Hammond west results in an indication that the entire COE network is down.

A new COE network management architecture includes hardware and software solutions in all building closets where routers connect to TNS’ backbone.

A minimal hardware architecture design and applications are presented.

Applications for critical network hardware status monitoring, data traffic monitoring and protocol/packet analysis are imperative elements in network management.

A concept for operational procedures ties together the elements of integrated COE network management.

Analysis and control are remotely accessible via web or Telnet sessions.

Conclusion

The Center is in need of a network management structure that provides accurate early notification of hardware problems and data traffic loads.

A minimal and cost-effective network management architecture is proposed.

The proposed solution addresses all the basic needs for sound, quality network management.

The proposed network management architecture eliminates a single point of failure.

The proposed operational concept provides a formal structure for network management.

The proposed operational structure enables timely and accurate use of personnel resources to resolve problems.

Cell phones for three primary individuals are preferable to pagers; the initial cost is lower but recurring costs are higher. However, cell phones provide greater versatility.

Page 18: PSU COE Network Management

Network Management

Addendum

Additional Reporting Details

16 September 2002

Page 19: PSU COE Network Management

Network Management - Nagios

Resource Monitoring Reports

“Tactical” Overview of Resource Availability

Status Reports - Instantaneous & Historical
  Availability Overview
  Summary
  Network Grid Availability
  Network Mapping
  3-D Mapping

Troubleshooting Reports
  Identification & Alerting to Service Problems
  Identification & Alerting to Network Outages
  Notification of Data Traffic Trends

Others
  Availability of each Monitored Element
  History of Alerts Related to each Monitored Element
  Anomaly Notifications
  Host Resource Monitoring – Processor Loads, Disk & Memory Usage, Running Processes & Log Files
  Hierarchical Detection & Notification and Distinction of Services that are Down vs. Unreachable
  Escalation of Host and Services Notifications to Different Contact Groups

Page 20: PSU COE Network Management

Network Management - MRTG

Monitoring Traffic Reports

“The Multi Router Traffic Grapher (MRTG) is a tool to monitor the traffic load on network-links. MRTG generates HTML pages containing graphical images which provide a LIVE visual representation of this traffic.”

Instantaneous monitoring of port daily “in” and “out” data traffic loads viewable with web interface.

Histogram reports of cumulative port “in” and “out” data traffic loads viewable with web interface: weekly, monthly and yearly.

MRTG logfile information is available for use in user-developed analysis programs.
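A user-developed analysis program of the kind mentioned above might start by parsing the MRTG logfile. The Python sketch below assumes the plain-text log layout as commonly documented for MRTG: a first line holding a timestamp and the two raw counters, followed by lines of five integers (timestamp, average in, average out, maximum in, maximum out). The sample data and all names are made up for illustration, and the layout should be checked against the installed MRTG version.

```python
from typing import List, NamedTuple


class Sample(NamedTuple):
    timestamp: int   # seconds since the Unix epoch
    avg_in: int      # average inbound rate over the interval
    avg_out: int     # average outbound rate over the interval
    max_in: int      # peak inbound rate over the interval
    max_out: int     # peak outbound rate over the interval


def parse_mrtg_log(text: str) -> List[Sample]:
    """Parse log text into samples, skipping the counter-state first line."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    samples = []
    for fields in rows[1:]:
        if len(fields) == 5:  # ignore anything that is not a data row
            samples.append(Sample(*(int(f) for f in fields)))
    return samples


def peak_sample(samples: List[Sample]) -> Sample:
    """Return the sample with the highest peak rate in either direction."""
    return max(samples, key=lambda s: max(s.max_in, s.max_out))


# Made-up log excerpt in the assumed layout.
SAMPLE_LOG = """\
1031580000 123456789 987654321
1031580000 1200 3400 1500 4000
1031579700 1100 90000 1300 120000
1031579400 1000 3000 1200 3500
"""

samples = parse_mrtg_log(SAMPLE_LOG)
print(peak_sample(samples))
```

From here the parsed samples could feed threshold checks or longer-term bandwidth planning reports.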

Page 21: PSU COE Network Management

Network Management - Ethereal

Traffic Analysis & Filtering Reports

Ethereal Supports Data Capturing and Analysis of 289 Protocols.

When Set Up to Capture Packets of Selected Protocols, the Software Captures and Analyzes Content Packet by Packet.

Captured Data is Retained in a Database for Periodic Review or Tracking Down Protocols that Contain Problem Packets.

Format and Content of Captured Data Depends on which of the 289 Protocols are being Analyzed by Ethereal.

Ethereal is a Continuously Running Process Whose Output can be Saved or Printed for Human Analysis of Detected Anomalies.

Protocols not of Interest can be Filtered to Prevent Capturing Excess Data that Would Cloud the Analysis Process.

Manual Data Analysis of Packets Tagged as Having Bogus Signatures Leads to Identification of Denial of Service or Hacker Systems’ Addresses for Router Filters & Reporting to the University Security Office.