SURFnet6 Network Monitoring and Reporting Hans Trompert, SURFnet.

Post on 23-Dec-2015



Information needs

Connected organizations

NOC / SURFnet / research

Annual report

Information detail

Monitoring versus Reporting

- Monitoring
  - real-time
  - status
  - alarms
- Reporting
  - afterwards
  - over a specific time period (day, week, month, year)

Information source and destination

Avici SSR

Nortel ERS8600

Nortel OM5200

Nortel OME6500

Nortel OME1060

SURFnet6 operations

Real-time customer reporting

Security

Equipment and interface

Optical devices
- CPL: TL1
- OM5200: TL1 (+ SNMP)
- OME6500: TL1 (+ SNMP)
- OME1060: SNMP

Data devices
- ERS8600: SNMP
- Avici SSR: SNMP + Netflow

Reporting: SNMP metrics

SNMP metrics:
- Interface in/out octet counters
- Interface in/out packet counters (unicast/broadcast/multicast)
- Interface input/output errors
- Interface availability
- Temperature
- Memory
- CPU
- Device uptime
- and more …
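
As an illustration of how octet counters become a traffic figure, here is a minimal sketch that turns two successive counter readings into an average bit rate. The function names are hypothetical and the readings are made up; the only assumption about SNMP itself is that interface octet counters are 32-bit or 64-bit values that wrap around to zero.

```python
def counter_delta(previous, current, bits=64):
    """Difference between two counter readings, tolerating a single wrap."""
    modulus = 1 << bits
    return (current - previous) % modulus


def bit_rate(prev_octets, curr_octets, interval_seconds, bits=64):
    """Average rate in bits per second over one polling interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return counter_delta(prev_octets, curr_octets, bits) * 8 / interval_seconds


if __name__ == "__main__":
    # Two made-up octet counter readings taken 60 seconds apart.
    print(bit_rate(1_000_000, 1_750_000, 60))  # prints 100000.0
```

The modulo arithmetic only recovers the right delta if the counter wrapped at most once between polls, which is one reason to poll fast links often or to use 64-bit counters.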

Reporting: TL1 metrics

TL1 metrics:
- Input/Output Frames
- Errored frames
- Discarded frames
- Transmit and receive power levels
- Errored Seconds - number of seconds that have had CRC errors
- Severely Errored Seconds - after 10 seconds of ES we start counting SES
- UnAvailable Seconds - seconds where we had no sync
- and more …
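
The three second-based counters can be sketched from per-second samples as follows. This follows the simplified wording of the slide, not a formal standard definition, and it makes several assumptions that the slide does not state: the "10 seconds of ES" are consecutive, SES seconds are also counted as ES, and a clean or unsynchronised second resets the run. The function name and the input format are hypothetical.

```python
def count_error_seconds(samples, ses_after=10):
    """Count ES, SES and UAS from per-second samples.

    samples: iterable of (crc_errors, in_sync) pairs, one pair per second.
    """
    es = ses = uas = 0
    run = 0  # length of the current run of consecutive errored seconds
    for crc_errors, in_sync in samples:
        if not in_sync:
            uas += 1      # no sync: unavailable second
            run = 0
        elif crc_errors > 0:
            es += 1       # at least one CRC error in this second
            run += 1
            if run > ses_after:
                ses += 1  # assumption: SES starts after `ses_after` ES in a row
        else:
            run = 0
    return {"ES": es, "SES": ses, "UAS": uas}


if __name__ == "__main__":
    # 12 errored seconds, 1 clean second, 2 seconds without sync.
    samples = [(3, True)] * 12 + [(0, True)] + [(0, False)] * 2
    print(count_error_seconds(samples))
```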

Monitoring: SNMP traps

SNMP traps
- Fan
- Temperature
- Voltage
- Link Up/Down
- Bay Controller
- Module
- PIM + MSDP
- BGP
- VRRP
- ISIS
- and more …

Monitoring: TL1 events

TL1 Events
- Equipment
  - Circuit pack missing/mismatch/failed
  - Fan failed/missing
  - Power failure A or B
  - High temperature
- Shelf
  - Software upgrade failed/mismatch/…
  - Database integrity fail/restore in progress/…
- Amplifier
  - input/output loss of signal
  - automatic shutoff
- and many, many more

SNMP based volume reporting

Internet

Connected organizations

Border router Amsterdam1 (SARA)

Border router Amsterdam2 (TeleCity II)

Core router Amsterdam2 (TeleCity II)

Core router Amsterdam1 (SARA)

- Total external traffic
- Per traffic class (AMS-IX, Global, private peers)
- Per provider/peer

- Total SURFnet internal traffic
- Per connected organization

SURFnet external traffic volume

- SURFnet external traffic volume
  - AMS-IX
  - Private peers (via AMS-IX), including:
    - Chello, Tiscali, @Home, Planet, XS4all
    - Garnier Projects, Abovenet, UUnet, Cogent
  - NREN
    - Geant2
    - SINET
    - Abilene
  - Global
    - Global Crossing
    - Cable & Wireless

SURFnet external traffic volume

[Chart: SURFnet external traffic, January 1999 through December 2006. Vertical axis: TiB (0 to 2,500); horizontal axis: time, ticks every six months. Series: TiB in, TiB out.]

SURFstat: Real-time connected organization traffic volume reporting

- Software
  - Net-SNMP
  - Python
  - RRDtool
- Features
  - Easy administration by labeling connections with keywords in the interface description on the router
  - Different graph resolutions: day, week, month, year, decade
  - 1 minute measurement interval
- Reports on
  - volume (bits in/out)
  - packets (unicast/multicast/broadcast)
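
The keyword-labeling idea can be illustrated with a short sketch. The slides do not show the keyword syntax SURFstat actually uses, so the `[key=value]` format, the `org` key, and the function names below are all hypothetical; the point is only that interfaces can be selected for graphing purely from their descriptions.

```python
import re

# Hypothetical tag syntax: "[key=value]" anywhere in the interface description.
TAG = re.compile(r"\[(\w+)=([^\]]+)\]")


def parse_tags(description):
    """Return the key/value tags found in one interface description."""
    return dict(TAG.findall(description))


def interfaces_to_graph(descriptions, required_key="org"):
    """Map interface index to its tags, keeping only labeled interfaces."""
    selected = {}
    for if_index, description in descriptions.items():
        tags = parse_tags(description)
        if required_key in tags:
            selected[if_index] = tags
    return selected


if __name__ == "__main__":
    # Made-up descriptions as they might be read from a router.
    descriptions = {
        1: "uplink to campus [org=UvA] [class=customer]",
        2: "spare port",
        3: "access link [org=CWI]",
    }
    print(interfaces_to_graph(descriptions))
```

With an approach like this, adding a connection to the reports only takes an edit to the interface description on the router, with no separate configuration on the measurement host.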

SURFstat: UvA (many users)

SURFstat: CWI (few users)

Netflow – flow information

- Netflow uses the common 5-tuple definition, where a flow is a unidirectional sequence of packets that all share the following 5 values:
  1. Source IP address
  2. Destination IP address
  3. Source port (TCP or UDP)
  4. Destination port (TCP or UDP)
  5. IP protocol
- Most common fields in a Netflow record:
  - 5-tuple information
  - Input and output SNMP interface index
  - Timestamps for the flow start and finish time
  - Number of bytes and packets observed in the flow
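
To make the flow definition concrete, the sketch below groups packet observations into unidirectional flows keyed by the 5-tuple and keeps the start time, end time, packet count and byte count of each flow. It is a toy model of what a router's flow cache does, not an exporter: the names `FlowKey` and `aggregate` are hypothetical, and real implementations also expire flows on timeouts, which is left out here.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # IP protocol number, e.g. 6 = TCP, 17 = UDP


def aggregate(packets):
    """Group (timestamp, FlowKey, size_in_bytes) observations into flows."""
    flows = {}
    for timestamp, key, size in packets:
        record = flows.get(key)
        if record is None:
            flows[key] = {"start": timestamp, "end": timestamp,
                          "packets": 1, "bytes": size}
        else:
            record["start"] = min(record["start"], timestamp)
            record["end"] = max(record["end"], timestamp)
            record["packets"] += 1
            record["bytes"] += size
    return flows


if __name__ == "__main__":
    request = FlowKey("192.0.2.10", "198.51.100.5", 40000, 80, 6)
    # The reply has source and destination swapped, so it is a separate flow.
    reply = FlowKey("198.51.100.5", "192.0.2.10", 80, 40000, 6)
    packets = [(0.0, request, 60), (0.1, reply, 1500), (0.2, request, 52)]
    for key, record in aggregate(packets).items():
        print(key, record)
```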

Netflow – versions

v1 First try
v5 Most used version
v6 Encapsulation information
v7 Switch information
v8 Several aggregation forms
v9 Template based, allowing many combinations, supports IPv6
IPFIX aka v10; IETF standardized NetFlow 9 with Enterprise fields and other community input
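
A collector that receives several of these formats typically has to look at the export version before it can decode anything else. The sketch below relies on the fact that the export packet header starts with a 16-bit version number in network byte order (10 for IPFIX); the function names are hypothetical and the descriptions are simply the ones from the list above.

```python
import struct

# Descriptions taken from the version list above.
VERSION_NOTES = {
    1: "First try",
    5: "Most used version",
    6: "Encapsulation information",
    7: "Switch information",
    8: "Several aggregation forms",
    9: "Template based, supports IPv6",
    10: "IPFIX",
}


def export_version(datagram):
    """Return the version number from the first two bytes of an export packet."""
    if len(datagram) < 2:
        raise ValueError("datagram too short to contain a version field")
    (version,) = struct.unpack_from("!H", datagram)
    return version


def describe(datagram):
    """Return a short description of the export format, or 'unknown'."""
    return VERSION_NOTES.get(export_version(datagram), "unknown")


if __name__ == "__main__":
    # First four bytes of a made-up v5 packet: version 5, record count 30.
    print(describe(struct.pack("!HH", 5, 30)))  # prints "Most used version"
```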

Netflow setup

Internet

Connected organizations

Border router Amsterdam1 (SARA)

Border router Amsterdam2 (TeleCity II)

Core router Amsterdam2 (TeleCity II)

Core router Amsterdam1 (SARA)

FLOWmon

perfSONAR

test

NFSEN

PeakFlow

Fan out

Netflow applications

- Connected organizations:
  - FLOWmon: detailed traffic reporting
  - SURFflow (Arbor Peakflow / NFSEN): suspicious traffic pattern reporting
- SURFnet-CERT:
  - NFSEN
    - suspicious traffic pattern reporting
    - historical flow data queries
    - profiles for custom reports
- Geant2 JRA1 perfSONAR probes
  - Flow Subscription Measurement Point
  - Flow Selection and Aggregation Measurement Archive

FLOWmon

Detailed traffic reporting:
- total traffic
- prefix-based flow grouping
- reports on:
  - IP version (v4/v6)
  - IP protocol (TCP, UDP, ICMP, GRE, …)
  - TCP port (HTTP, SMTP, NNTP, FTP, SSH, …)
  - UDP port (domain, RTSP, VPN, …)
- top N connected organizations
- destination AS traffic
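
Prefix-based flow grouping can be sketched with Python's standard `ipaddress` module. This is not FLOWmon's implementation; the organization names, the documentation prefixes and the `(source address, bytes)` flow format are assumptions chosen for illustration.

```python
import ipaddress
from collections import Counter


def group_by_prefix(flows, prefixes):
    """Sum flow bytes per organization, matching source addresses to prefixes.

    flows:    iterable of (source_ip, byte_count)
    prefixes: iterable of (prefix, organization_name)
    """
    networks = [(ipaddress.ip_network(p), name) for p, name in prefixes]
    totals = Counter()
    for src_ip, byte_count in flows:
        address = ipaddress.ip_address(src_ip)
        for network, name in networks:
            # Compare versions first so IPv4 and IPv6 are never mixed.
            if address.version == network.version and address in network:
                totals[name] += byte_count
                break
        else:
            totals["other"] += byte_count
    return totals


if __name__ == "__main__":
    prefixes = [("192.0.2.0/24", "org-a"), ("2001:db8::/32", "org-b")]
    flows = [("192.0.2.7", 1000), ("2001:db8::1", 500),
             ("192.0.2.99", 250), ("203.0.113.4", 40)]
    # most_common(n) gives a "top N connected organizations" style ranking.
    print(group_by_prefix(flows, prefixes).most_common(2))
```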

UvA traffic by IP protocol

Connected organization to world traffic by TCP destination port

SURFflow

Reports on suspicious traffic patterns like:
- Unusual number of flows → DoS attack
- Flows from one host to many ports on another host → port scan
- From one host to the same port on many hosts → break-in attempt making use of a known bug
- From many hosts to a specific (set of) port(s) on many other hosts → virus/worm
- etc …
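
Two of these patterns, the port scan and the same-port sweep across many hosts, reduce to counting distinct values per source in the flow data. The sketch below shows that counting idea only; it is not how Arbor Peakflow or NFSEN detect anomalies, and the function name, the flow format and the thresholds are hypothetical.

```python
from collections import defaultdict


def find_scans(flows, port_threshold=100, host_threshold=100):
    """Flag port scans and host sweeps in (src, dst, dst_port) flow tuples.

    Returns (port_scans, host_sweeps):
      port_scans:  sorted (src, dst) pairs touching many ports on one host
      host_sweeps: sorted (src, dst_port) pairs touching one port on many hosts
    """
    ports_seen = defaultdict(set)  # (src, dst) -> destination ports
    hosts_seen = defaultdict(set)  # (src, dst_port) -> destination hosts
    for src, dst, dst_port in flows:
        ports_seen[(src, dst)].add(dst_port)
        hosts_seen[(src, dst_port)].add(dst)
    port_scans = sorted(k for k, v in ports_seen.items()
                        if len(v) >= port_threshold)
    host_sweeps = sorted(k for k, v in hosts_seen.items()
                         if len(v) >= host_threshold)
    return port_scans, host_sweeps


if __name__ == "__main__":
    flows = [("10.0.0.1", "10.0.0.2", port) for port in range(1, 6)]
    flows += [("10.0.0.9", f"10.0.1.{host}", 22) for host in range(1, 6)]
    print(find_scans(flows, port_threshold=5, host_threshold=5))
```

Fixed thresholds like these produce false positives for busy servers and proxies, so a production system would need baselines per host or per organization.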

Active measurements: RTTPL

Round Trip Time and Packet Loss monitoring
- measurement probes throughout the network
- central storage of results
- active measurements by injecting ICMP echo request packets
- measuring min/max/avg RTT and jitter
  - both IPv4 and IPv6
  - both unicast and multicast (under development)
- measuring packet loss
- 20 pings per minute
- report matrices per minute/hour/day/month
- results between two probes in graphs
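
The per-minute figures can be derived from one batch of ping results as in the sketch below. The slides do not say how RTTPL defines jitter, so the mean absolute difference between consecutive received RTTs is an assumption here, as are the function name and the convention that a lost ping is represented by `None`.

```python
def summarize(rtts_ms):
    """Summarize one measurement interval of ping results.

    rtts_ms: list of round trip times in milliseconds, None for a lost ping.
    """
    sent = len(rtts_ms)
    if sent == 0:
        raise ValueError("no measurements in this interval")
    received = [rtt for rtt in rtts_ms if rtt is not None]
    summary = {"loss_pct": 100.0 * (sent - len(received)) / sent,
               "min": None, "max": None, "avg": None, "jitter": None}
    if received:
        # Assumed jitter definition: mean absolute change between
        # consecutive received samples.
        diffs = [abs(b - a) for a, b in zip(received, received[1:])]
        summary.update(
            min=min(received),
            max=max(received),
            avg=sum(received) / len(received),
            jitter=sum(diffs) / len(diffs) if diffs else 0.0,
        )
    return summary


if __name__ == "__main__":
    # Four made-up samples; the slides describe 20 pings per minute.
    print(summarize([10.0, 12.0, None, 11.0]))
```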

RTTPL report matrices

RTTPL Nijmegen - Amsterdam

Active measurements: Connected organization availability

- measuring availability by sending ICMP Echo Requests to connected organization router

- measurement includes last mile to connected organization plus connected organization router port (unlike commercial providers)

- Cisco routers with Service Assurance Agent software on both Amsterdam1 and Amsterdam2

- results stored in database and reported monthly

- redundancy in measurements by ORing results from Amsterdam1 and Amsterdam2
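
The effect of ORing the two measurement series can be shown in a few lines. This is a sketch under stated assumptions: each probe yields one boolean per measurement interval (True when the connected organization's router answered), both series cover the same intervals, and the function name is hypothetical.

```python
def combined_availability(from_amsterdam1, from_amsterdam2):
    """Availability in percent after ORing two per-interval result series."""
    if len(from_amsterdam1) != len(from_amsterdam2):
        raise ValueError("both series must cover the same intervals")
    if not from_amsterdam1:
        raise ValueError("no measurements")
    # An interval counts as available if either probe got a reply.
    combined = [a or b for a, b in zip(from_amsterdam1, from_amsterdam2)]
    return 100.0 * sum(combined) / len(combined)


if __name__ == "__main__":
    amsterdam1 = [True, False, True, False]
    amsterdam2 = [True, True, False, False]
    print(combined_availability(amsterdam1, amsterdam2))  # prints 75.0
```

ORing means that a failure of one probe, or of the path from one probe, is not counted against the connected organization; only intervals in which both probes saw no reply reduce the reported availability.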

Thank you
