Bill P. Fanelli, Principal Architect, Allen Corporation of America

©2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Lessons learned from an HP Network Automation and Network Node Manager i integrated deployment with TelAlert notification in an MPLS environment

Jan 15, 2015


This case study demonstrates how the integration of HP Network Automation, Network Node Manager i (NNMi), and TelAlert Urgent Messaging System reduced costs, improved configuration standards, and helped an energy company through a major acquisition. The implementation team will discuss the benefits of migrating to NNMi, particularly configuration ease. They will also give configuration tips on obtaining full map functionality in an MPLS environment. They’ll report on improved standardization and dramatically reduced MTTR with existing personnel achieved by deploying Network Automation to a network spread across 125 sites including such diverse elements as radio transmission towers and SCADA devices. And they’ll focus particularly on maximizing the shared nodes between NA and NNMi. To close, the team will illustrate the benefits and process of integrating TelAlert Urgent Messaging System to deliver paging notification of essential root cause incidents to both the core network management team and the responsible technical team at affected sites.
Transcript


Allen Corporation of America, Inc.

• Headquarters: Fairfax, VA

• Organization

— Training Systems Division

— Integrated Technologies Division

— CyberSecurity Division

— Logistics Services Division

• Regional Offices: Colonial Heights, VA; Ithaca, NY; Myrtle Beach, SC; The Hague, Netherlands

• Sites in 22 States, with Worldwide Operations

• 250+ employees

• Private Corporation - Small business

• Secret Facilities Clearance


Complete Life-Cycle

Support

Security

Management

Enterprise

Notification

Solutions

Cyber Security Division


Cyber Security, Enterprise Management Services


Agenda


• Integrating NA with NNMi

– Benefits of integration

– Implementation Tips

• Monitoring MPLS with NNMi

– Issues with virtual networks

– How to best match the map to your environment

• Stabilizing staffing using Notification with TelAlert

– Taming the workload with automation


Case Study


• Large Energy company

– Diverse network – includes radio transmission towers and SCADA devices

– Growth by acquisition – reserves grew by a factor of 50 over 15 years

• Issues in IT

– Assimilation of acquired infrastructure

• NNMi & NA

– MTTR for field outages was 2½ to 3 days

• NA

– Network staff could not grow linearly with company

• Reserves doubled every four months

• NNMi on MPLS

• TelAlert


NA and NNMi Selection Drivers


• See what is running

• Assimilate acquired infrastructure

– Technology

• Cisco

– Process

• Standardize configurations with NA

• Centralize monitoring with NNMi

– People

• Automated notification from NNMi to TelAlert


Let’s Get Started


• Integrating NA with NNMi

• Monitoring MPLS with NNMi

• Stabilizing staffing using Notification with TelAlert


Benefits of Integrated NA/NNMi Process

• High percentage of outages due to changes

– Coordinate changes

– Ability to roll back changes, both authorized and unauthorized

• Standardize and Automate

– SNMP community string change

• Add new string

• Confirm all nodes are configured and working

• Remove old string

• Expedite Field Replacements

– Drop ship replacement devices to field location

– Push configuration over the wire
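
The three-step SNMP community string change listed above can be sketched as a small orchestration routine. This is a minimal illustration of the ordering only, not NA's own mechanism: the `add_string`, `verify`, and `remove_string` callables are hypothetical stand-ins for whatever pushes configuration and polls devices in a given environment.

```python
from typing import Callable, Iterable, List


def rotate_community(
    nodes: Iterable[str],
    add_string: Callable[[str], None],
    verify: Callable[[str], bool],
    remove_string: Callable[[str], None],
) -> List[str]:
    """Add the new string everywhere, verify, then remove the old one.

    Returns the nodes that failed verification. The old string is removed
    only when that list is empty, so a partial rollout never cuts off access.
    """
    node_list = list(nodes)

    # Step 1: add the new community string alongside the old one.
    for node in node_list:
        add_string(node)

    # Step 2: confirm every node answers with the new string.
    failed = [node for node in node_list if not verify(node)]
    if failed:
        return failed  # stop here; the old string is still in place

    # Step 3: all nodes verified, so the old string can be removed.
    for node in node_list:
        remove_string(node)
    return []
```

Keeping the removal step behind an all-nodes check is the point of the sequence: a device that missed the first push stays reachable with the old string.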


Features of NA/NNMi Integration


• GUI integration

– Cross launch with context

– Telnet or SSH access to devices

– Bring NA diagnostics to NNMi

• Data integration

– Import NNMi devices into NA

– Secret Ingredient

• NA must have NNMi Node UUID to make the match


Linking NA with NNMi


• Run the Connector installer on the NA machine

– Connects to NNMi and installs components there as well

• Dependence on whether NA and NNMi are co-resident

– Some default ports are the same

• Install NNMi first, then NA installer will accommodate

– Separate Connector installers as well

• Learn from us

– Initially co-resident and then moved NA

– Many extra steps involved

• Not worth a "try and see" approach

– Think your way through impact of co-residency

• NNMi has huge memory requirement


Import NNMi Devices to NA


• On NNMi, run nnmimport

• Queries NA for a list of supported OIDs

• Dumps nodes from NNMi database matching supported OIDs only

• Pushes node information, particularly the NNMi Node UUID, to NA

• Wanted All Devices from NNMi to NA

– Even Unsupported


Adding Devices from NNMi to NA

• On the NA server, add the OIDs to {NA_DIR}/jre/adjustable_options.rcx

• Format:

    <array name="drivers/custom_sysoids">
        <value> <!-- sys oid --> </value>
        <value> <!-- another sys oid --> </value>
        <value> <!-- etc. --> </value>
    </array>

• For example:

    <array name="drivers/custom_sysoids">
        <value>1.3.6.1.4.1.9.1.479</value>
    </array>

• Save and restart NAS


Finding Supported OIDs in NA

• telnet or ssh to NA box

• Login as an NA User

• Run the command: list sys oids all

• All OIDs supported by NA will be listed


Finding OIDs in Use in NNMi

• On the NNMi server, run the command

    nnmtopodump.ovpl -legacy long -type node

• Pipe this to find "SNMP OBJECT ID:" (Windows) or grep "SNMP OBJECT ID:" (UNIX or Linux), and redirect to a file, such as OIDs_in_use.out

• For example:

    nnmtopodump.ovpl -legacy long -type node | find "SNMP OBJECT ID:" > OIDs_in_use.out
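
Once the filtered dump exists, the distinct OIDs can be pulled out with a few lines of Python. This is a sketch under an assumption: each saved line is taken to contain `SNMP OBJECT ID:` followed by a dotted OID, with or without a leading dot. Check a few lines of the real file before relying on the pattern.

```python
import re
from typing import Iterable, Set

# Assumed line shape: "SNMP OBJECT ID: .1.3.6.1.4.1.9.1.479"
OID_LINE = re.compile(r"SNMP OBJECT ID:\s*\.?(\d+(?:\.\d+)+)")


def oids_in_use(lines: Iterable[str]) -> Set[str]:
    """Collect the distinct sysObjectIDs found in the dump, without a leading dot."""
    found = set()
    for line in lines:
        match = OID_LINE.search(line)
        if match:
            found.add(match.group(1))
    return found
```

A typical call would open `OIDs_in_use.out` and pass the file object directly, since iterating a file yields its lines.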


Determine OIDs to Add to NA

• Sort, cut and compare these two lists

• Generate a list of OIDs

– from the NNMi "OIDs in use" list

– that are not in the NA "supported OIDs" list

• Add these to the adjustable_options.rcx file

• The next time nnmimport is run on the NNMi box

– NA will respond that the added OIDs are supported

– therefore nnmimport will include them in the push to NA

• Warning

– nnmimport has the tendency to create duplicate entries in NA

– This is not due to modifying adjustable_options.rcx

– Use nnmimport carefully until you understand the impact on NA in your environment
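
The "sort, cut and compare" step amounts to a set difference. The sketch below assumes both lists have already been reduced to plain dotted OIDs, one per entry; the function names are illustrative and not part of NA or NNMi. The second function renders the result in the array format used for adjustable_options.rcx.

```python
from typing import Iterable, List
from xml.sax.saxutils import escape


def _normalize(oid: str) -> str:
    """Strip whitespace and any leading dot so both lists compare equal."""
    return oid.strip().lstrip(".")


def missing_oids(in_use: Iterable[str], supported: Iterable[str]) -> List[str]:
    """OIDs present in NNMi but absent from NA's supported list, sorted numerically."""
    supported_set = {_normalize(o) for o in supported if o.strip()}
    wanted = {_normalize(o) for o in in_use if o.strip()} - supported_set
    return sorted(wanted, key=lambda oid: [int(part) for part in oid.split(".")])


def render_custom_sysoids(oids: Iterable[str]) -> str:
    """Render the <array> fragment in the format shown for adjustable_options.rcx."""
    rows = "\n".join(f"    <value>{escape(oid)}</value>" for oid in oids)
    return f'<array name="drivers/custom_sysoids">\n{rows}\n</array>'
```

Review the rendered fragment by hand before pasting it into the file. Given the duplicate-entry warning above, it is worth adding OIDs in small batches.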


Restart NAS You Say…


Where Are We


• Integrating NA with NNMi

• Monitoring MPLS with NNMi

• Stabilizing staffing using Notification with TelAlert


Monitoring MPLS with NNMi

• Discovery across virtual boundaries is inherently difficult

– Contiguous map

– Downstream suppression


Contiguous Map

• NNMi has Subnet Connection Rules

• NNMi can create Layer 2 Connections for subnets at the edge of subnetworks that are directly connected via Wide Area Networks (WANs).

• Define rules to control which subnets and interfaces NNMi uses to create additional Layer 2 connections.
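
The idea behind a subnet connection rule can be illustrated in a few lines: when exactly two routers have interfaces in the same small subnet, a link between them is inferred. This is only a sketch of the concept, not NNMi's implementation. The `min_prefix` threshold of 30 and the sample node names are assumptions chosen for the example.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def infer_wan_links(
    interfaces: Iterable[Tuple[str, str]], min_prefix: int = 30
) -> List[Tuple[str, str]]:
    """Pair nodes whose interfaces share a small IPv4 subnet.

    interfaces: (node_name, "address/prefix") tuples.
    min_prefix: subnets with a shorter prefix (more hosts) are ignored.
    """
    by_subnet: Dict[object, List[str]] = defaultdict(list)
    for node, cidr in interfaces:
        network = ipaddress.ip_interface(cidr).network
        if network.version == 4 and network.prefixlen >= min_prefix:
            by_subnet[network].append(node)

    # A connection is inferred only when exactly two distinct nodes share the subnet.
    links = []
    for nodes in by_subnet.values():
        if len(nodes) == 2 and nodes[0] != nodes[1]:
            links.append((nodes[0], nodes[1]))
    return links
```

For instance, `hq-rtr` on 10.0.0.1/30 and `site-rtr` on 10.0.0.2/30 would be linked, while two devices sharing a /24 LAN would not.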


Small Subnets Rule

• All rules are on by default


Discovery Islands


Discovery Islands

• Good – not perfect

• Remember that we do not manage large networks by Maps

– Manage by events

• Topology that NNMi knows about that is represented by these maps is most important

• Status representation on maps is also important

– Maintain user confidence

• Issue with map status display with MPLS connected sites

– Downstream suppression rule prevents nodes and containers from representing MPLS outage


Downstream Suppression: The Situation

• NNMi analyses the Layer 2 information and determines when a set of nodes are not connected at layer 2 as far as it can discover.

• This applies to MPLS connected sites

• NNMi puts these nodes into NNMi defined node groups named Island nnnn, where nnnn is a unique number for each set of layer 2 connected nodes that are not connected to the NNMi server.

• When an island is isolated by an MPLS failure, all the nodes in the island are put into a warning or unknown state.
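
Conceptually, an island is a connected component of the discovered layer 2 graph that does not contain the management server. The sketch below shows that grouping with a depth-first walk. It is an illustration of the concept only, with hypothetical node names, and makes no claim about how NNMi computes its Island node groups.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def find_islands(
    nodes: Iterable[str], links: Iterable[Tuple[str, str]], server: str
) -> List[Set[str]]:
    """Return groups of layer 2 connected nodes that exclude the management server."""
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen: Set[str] = set()
    islands: List[Set[str]] = []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first walk to collect one connected component.
        component: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        if server not in component:
            islands.append(component)
    return islands
```

With an MPLS cloud between sites, no layer 2 link is discovered across the provider network, so each remote site ends up as its own component.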


Downstream Suppression—The Fix

• If a node is added to the Important Nodes node group and it goes down or becomes isolated, it will be set to critical status. This overrides the island logic which sets it to warning or unknown.

• Added filter rules to the Important Nodes node group on NNMi server as follows:

– Device Filters

• Device = Gateway or Router

– Additional Filters

• Island = not null

• Automatically populates the Important Nodes node group with the routers in the islands
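
The two filter rules combine as a logical AND: a node qualifies only if it is a gateway or router and it belongs to some island. A minimal sketch of that predicate follows. The `Node` record and its field names are hypothetical, chosen for the example rather than taken from NNMi's data model.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Node:
    name: str
    device_category: str          # e.g. "Router", "Gateway", "Switch"
    island: Optional[str] = None  # island group name, or None if not in an island


def important_nodes(nodes: Iterable[Node]) -> List[str]:
    """Apply both filters: device is a gateway or router AND island is not null."""
    return [
        n.name
        for n in nodes
        if n.device_category in {"Gateway", "Router"} and n.island is not None
    ]
```

Because membership is computed from attributes, newly discovered island routers join the group without manual edits, which is the benefit the last bullet describes.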


Downstream Suppression—Outcome

• When MPLS site is isolated

– All routers go critical

• Could be further filtered

• NNMi does produce a Critical Event

– Without adding nodes to Important Nodes Node Group, the node and containers do not reflect outage


Home Stretch


• Integrating NA with NNMi

• Monitoring MPLS with NNMi

• Stabilizing staffing using Notification with TelAlert


The Case for Notification


• Text or Text-to-Speech messaging has a lower barrier to entry since almost everyone now carries a cell phone

• Normal Hours

– Get someone’s attention at their desk or away from it

• Off Hours

– Staffing for 7 x 24 monitoring is cost prohibitive for most organizations

• Rule of 13/8

– Need for 7 x 24 monitoring is growing as companies become more network dependent


Desired Workflow


• Immediate Notification

– Core network team only

– SNMP IFdown Trap

• Root Cause Event

– District and Site where event occurred

– Could be:

• Node Down

• Remote site containing node is unreachable

• Node or Connection Down

• Interface Down

– Typically delayed three minutes

• Reminder on open incidents

– Core network team after one hour


NNMi Actions


• Trigger on Lifecycle States

– Registered, In Progress, Completed, and Closed

– Typically use Registered and Closed

• Large number of parameters for configuring incident actions plus Custom Incident Attributes

– By pairing Lifecycle States, Message ID stays the same

– Node Down Registered is cleared by Node Down Closed

• Instead of separate Node Up event

• Effect in TelAlert

– When Registered

    telalertc -g NetCore -m "Node $sourceNodeName Down" -ticket $id

– When Closed

    telalertc -ack -ticket $id
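
The pairing of lifecycle states can be sketched as a function that maps a state to the matching telalertc argument list. It uses only the flags shown on the slide (`-g`, `-m`, `-ticket`, `-ack`). The function name and the idea of wrapping the call in Python are assumptions for illustration, since NNMi incident actions can invoke the command line directly.

```python
from typing import List


def telalert_args(lifecycle_state: str, incident_id: str, node_name: str,
                  group: str = "NetCore") -> List[str]:
    """Build the telalertc argument list for an NNMi lifecycle transition."""
    if lifecycle_state == "Registered":
        # Open the alert; the incident id doubles as the TelAlert ticket.
        return ["telalertc", "-g", group, "-m", f"Node {node_name} Down",
                "-ticket", incident_id]
    if lifecycle_state == "Closed":
        # Acknowledge the alert that was opened under the same ticket.
        return ["telalertc", "-ack", "-ticket", incident_id]
    raise ValueError(f"no action configured for state {lifecycle_state!r}")
```

Building a list rather than a single string keeps a node name containing spaces intact as one argument if the list is later handed to `subprocess.run`.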


Implemented Workflow


• Immediate Notification

– When SNMP Trap Incident enters Registered State

– Send message now to core network group

– telalertc -g NetCore -subject "$severity fault on $sourceNodeName" -m "Fault: $name on $sourceObjectName on node $sourceNodeName at $lastOccurrenceTime"

• Notify Site and District

– When Root Cause Incident enters Registered State

– Send final message to core network, site and district groups

– telalertc -g NetAll -ticket $id -delay 3m -subject "$severity fault on $sourceNodeName" -m "Fault: $name on $sourceObjectName on node $sourceNodeName at $lastOccurrenceTime"


Implemented Workflow


• Reminder on open incidents

– When Root Cause Incident enters Registered State

– telalertc -g NetCore -delay 60m -ticket $id -subject "Reminder message on $sourceNodeName" -m "Reminder message on $sourceNodeName"

• Recovery

– When "Down" Incident enters Closed State

– telalertc -ack -ticket $id


Typical Scenario


• Router loses power

• SNMP IFdown Trap from upstream router

– NNMi sends message to NetCore group for immediate delivery

• Causal engine posts Interface Down Root Cause Incident

– NNMi sends message to NetAll group with three minute delay

– NNMi sends reminder to NetCore group with one hour delay

• Causal engine posts Node or Connection Down Incident

– Interface Down Incident is closed

• NNMi sends -ack to clear Interface Down message and reminder

– NNMi sends message to NetAll group with three minute delay

– NNMi sends reminder to NetCore group with one hour delay


Typical Scenario


• Causal engine posts Node Down Incident

– Node or Connection Down Incident is closed

– NNMi sends -ack to clear Node or Connection Down message and reminder

– NNMi sends message to NetAll group with three minute delay

– NNMi sends reminder to NetCore group with one hour delay

• Three minute delay timer expires

– Node Down message delivered to all groups

• One hour delay timer expires

– Reminder message delivered to NetCore group
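
The effect of the delay timers and acknowledgements in this scenario can be modeled with a toy queue: delayed messages wait under their ticket, and an ack for that ticket discards them before delivery. This is a simplified model for reasoning about the workflow, not TelAlert's internals. The class and method names are hypothetical, and times are whole minutes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PendingQueue:
    """Toy model of delayed messages keyed by ticket."""
    pending: Dict[str, List[Tuple[int, str, str]]] = field(default_factory=dict)

    def send(self, now: int, delay_min: int, ticket: str, group: str, text: str) -> None:
        # Queue a message for delivery at now + delay_min minutes.
        self.pending.setdefault(ticket, []).append((now + delay_min, group, text))

    def ack(self, ticket: str) -> None:
        # An ack drops everything still queued under that ticket.
        self.pending.pop(ticket, None)

    def delivered_by(self, now: int) -> List[Tuple[int, str, str]]:
        # Messages whose timer has expired and were never acknowledged.
        due = [m for msgs in self.pending.values() for m in msgs if m[0] <= now]
        return sorted(due)
```

Replaying the scenario, with the Interface Down and Node or Connection Down tickets acknowledged within the first couple of minutes, leaves only the Node Down message and its reminder to be delivered. That is why site and district staff receive a single root cause page rather than three.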


Conclusion


• Integrating NA with NNMi

– Consistency of configurations

– Same nodes in both tools

• Monitoring MPLS with NNMi

– Monitor by Incidents

– Map status should reflect real world status

• Stabilizing staffing using Notification with TelAlert

– Demands on staff are growing faster than the staff headcount

– Automation is the key to survival



Allen Corporation

Allen Corporation of America, Inc.

10400 Eaton Place, Suite 450

Fairfax, VA 22030

(866) HQ - ALLEN (866) 472-5536

www.allencorp.com

Bill [email protected]

571.321.1648 Voice


Questions or Comments?

*******

Thank you for your time



To learn more on this topic, and to connect with your peers after the conference, visit the HP Software Solutions Community: www.hp.com/go/swcommunity
