Event Management & ITIL V3
Event Management & ITIL V3. Service Desk Service Operation Processes Technical Support Groups Incident Management Problem Management Access Management.

Dec 31, 2015

Ruby Nelson
Transcript
Page 1: Event Management & ITIL V3. Service Desk Service Operation Processes Technical Support Groups Incident Management Problem Management Access Management.

Event Management & ITIL V3

Service Desk

Service Operation Processes

Technical Support Groups

Incident Management

Problem Management

Access Management

Event Management

Request Fulfillment

Service Operation

Event Management

Event Management

The process that monitors all events that occur throughout the IT infrastructure, to allow for normal service operation and to detect and escalate exceptions.

Effective service support relies on knowing the status of the infrastructure and detecting any deviation from normal or expected operation.

This can be provided by good monitoring and control systems, which are based on two types of tools:

• Active monitoring tools that monitor key CIs (Configuration Items) to determine their status and availability.
• Passive monitoring tools that detect and correlate operational alerts or communications generated by CIs.

Terminology

• Event
• Alert
• Trigger

Scope

Event Management can be applied to any aspect of Service Management that needs to be controlled and can be automated, for example:

• Configuration Items
• Environmental conditions
• Software license monitoring
• Security
• Normal activity

Different types of event

There are different types of event:

Events that signify regular operation
• Notification that a scheduled workload has completed
• A user has logged in to use an application
• An email has reached its intended recipient

Events that signify an exception
• A user attempts to log on to an application with the incorrect password
• An unusual situation has occurred in a business process that may indicate an exception requiring further business investigation
• A device’s CPU is above the acceptable utilization rate

Events that signify unusual, but not exceptional, operation
• A server’s memory utilization reaches within 5% of its highest acceptable performance level
• The completion time of a transaction is 100% longer than normal
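
The three event types above can be told apart with simple threshold checks. The following is a minimal Python sketch, not a prescribed method: the threshold values and function names are hypothetical, "within 5%" is read as five percentage points, and treating memory use above the acceptable level as an exception is an assumption made by analogy with the CPU example.

```python
# Hypothetical thresholds; real values come from an organization's own monitoring policy.
CPU_MAX_ACCEPTABLE = 90.0      # percent; above this the event is an exception
MEMORY_MAX_ACCEPTABLE = 80.0   # percent; highest acceptable level
MEMORY_WARNING_MARGIN = 5.0    # "within 5%" read as five percentage points (assumption)
SLOW_TRANSACTION_FACTOR = 2.0  # "100% longer than normal" means twice the normal time


def classify_cpu(utilization: float) -> str:
    """CPU above the acceptable utilization rate signifies an exception."""
    return "exception" if utilization > CPU_MAX_ACCEPTABLE else "regular"


def classify_memory(utilization: float) -> str:
    """Memory close to, but not above, the acceptable level is unusual."""
    if utilization > MEMORY_MAX_ACCEPTABLE:
        return "exception"  # assumption, by analogy with the CPU case
    if utilization >= MEMORY_MAX_ACCEPTABLE - MEMORY_WARNING_MARGIN:
        return "unusual"
    return "regular"


def classify_transaction(duration_s: float, normal_s: float) -> str:
    """A transaction taking at least twice its normal time is unusual."""
    if duration_s >= normal_s * SLOW_TRANSACTION_FACTOR:
        return "unusual"
    return "regular"
```

With these values, `classify_memory(76.0)` falls in the "unusual" band because 76% is within five points of the 80% limit.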

Event Management - Activities

• Event Occurs
• Event Notification
• Alert / Event Detection
• Event Filtering
• Significance of Events
• Event Correlation
• Trigger
• Response Selection
• Review Actions
• Close Event

Event Occurs

Events occur continuously, but not all of them are detected or registered. It is therefore important that everybody involved in designing, developing, managing and supporting IT services and the IT infrastructure that they run on understands what types of event need to be detected.

Event notification

Most CIs are designed to communicate certain information about themselves in one of two ways:

• A device is interrogated by a management tool, which collects certain targeted data. This is often referred to as ‘polling’.

• The CI generates a notification when certain conditions are met. The ability to produce these notifications has to be designed and built into the CI.
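
The difference between the two styles can be sketched in Python. This is an illustration only: the `ConfigurationItem` class, its methods, the disk-usage condition and the 90% threshold are all hypothetical, and no real monitoring protocol is implied.

```python
from typing import Callable, Dict, List


class ConfigurationItem:
    """Hypothetical CI that supports both communication styles."""

    def __init__(self, name: str, disk_used_pct: float) -> None:
        self.name = name
        self.disk_used_pct = disk_used_pct
        self._listeners: List[Callable[[Dict], None]] = []

    # Style 1: polling. The management tool asks and the CI only answers.
    def read_status(self) -> Dict:
        return {"ci": self.name, "disk_used_pct": self.disk_used_pct}

    # Style 2: notification. The CI decides when a condition is met.
    def subscribe(self, listener: Callable[[Dict], None]) -> None:
        self._listeners.append(listener)

    def set_disk_used(self, pct: float, threshold: float = 90.0) -> None:
        self.disk_used_pct = pct
        if pct >= threshold:  # the condition is built into the CI itself
            event = {"ci": self.name, "condition": "disk_nearly_full",
                     "disk_used_pct": pct}
            for listener in self._listeners:
                listener(event)


def poll(cis: List[ConfigurationItem]) -> List[Dict]:
    """Management-tool side of polling: interrogate every CI in turn."""
    return [ci.read_status() for ci in cis]


# Usage: one CI, one subscriber that simply collects notifications.
server = ConfigurationItem("srv-01", 40.0)
received: List[Dict] = []
server.subscribe(received.append)
server.set_disk_used(95.0)   # crosses the threshold, so a notification is sent
polled = poll([server])      # the tool can still ask for status at any time
```

Polling puts the schedule in the hands of the management tool, while notification requires the condition logic to live inside the CI, which is why the slide stresses that it has to be designed in.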

Alert / Event detection

Once an Event notification has been generated, it will be detected by an agent running on the same system, or transmitted directly to a management tool specifically designed to read and interpret the meaning of the event.

Event Filtering

The purpose of filtering is to decide on the best course of action to take, e.g.

• Communicate the event to a management tool.
• Ignore it. In this case the event still needs to be recorded.

Events need to be filtered because it is not always possible to turn event notifications off. During the filtering activity, the first level of correlation is performed.
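
A filtering step along these lines might look as follows in Python. The event types, the `IGNORED_TYPES` set and the significance lookup are hypothetical. Reading "first level of correlation" as tagging each forwarded event with a significance category is an assumption of this sketch, as is defaulting unknown event types to "warning".

```python
from typing import Dict, List, Set, Tuple

# Hypothetical event types the management tool does not need to see.
IGNORED_TYPES: Set[str] = {"heartbeat"}

# Hypothetical lookup used for the first level of correlation.
SIGNIFICANCE_BY_TYPE: Dict[str, str] = {
    "job_completed": "informational",
    "disk_nearly_full": "warning",
    "service_down": "exception",
}


def filter_events(events: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Split events into (forwarded, ignored_log).

    Ignored events are not dropped: they are kept in a log, as the
    filtering rule requires.
    """
    forwarded: List[Dict] = []
    ignored_log: List[Dict] = []
    for event in events:
        if event["type"] in IGNORED_TYPES:
            ignored_log.append(event)  # ignored, but still recorded
            continue
        # Unknown types default to "warning" (assumption) so they get looked at.
        significance = SIGNIFICANCE_BY_TYPE.get(event["type"], "warning")
        forwarded.append({**event, "significance": significance})
    return forwarded, ignored_log
```

Setting the ignore list too wide hides real problems and setting it too narrow floods the management tool, which is the filtering-level trade-off discussed later under challenges and CSFs.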

Significance of events

Every organization will have its own method and criteria for categorizing the significance of an event. The following three broad categories are suggested:

• Informational
• Warning
• Exception

Event correlation

Correlation is normally done by a ‘Correlation Engine’, part of a management tool that compares the event with a set of criteria and rules in a prescribed order.

These criteria are often referred to as Business Rules, but are generally quite technical.

The idea is that the event may represent some impact on the business and the rules can be used to determine the level and type of business impact.
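
The idea of checking rules in a prescribed order can be sketched in Python. The `Rule` structure, the rule names, the event fields and the impact levels below are all hypothetical; a real correlation engine would be considerably richer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Rule:
    name: str
    matches: Callable[[Dict], bool]  # the criterion the event is compared with
    business_impact: str             # level of impact if the rule matches


# Rules are evaluated in list order and the first match wins,
# so the most specific rules come first.
RULES: List[Rule] = [
    Rule("payment service down",
         lambda e: e.get("service") == "payments" and e.get("type") == "service_down",
         "high"),
    Rule("any service down",
         lambda e: e.get("type") == "service_down",
         "medium"),
    Rule("threshold warning",
         lambda e: e.get("significance") == "warning",
         "low"),
]


def correlate(event: Dict, rules: List[Rule]) -> Optional[Rule]:
    """Return the first rule that recognizes the event, or None."""
    for rule in rules:
        if rule.matches(event):
            return rule
    return None
```

Because the order is prescribed, an outage of the hypothetical payments service is rated "high" by the first rule and never reaches the more general "any service down" rule.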

Trigger

If the correlation activity recognizes an event, a response will be required. The mechanism used to initiate that response is called a trigger. There are many different types of triggers, each designed specifically for the task it has to initiate. E.g.

• Incident triggers.
• Change triggers.
• A trigger resulting from an approved RFC that has been implemented but caused the event, or from an unauthorized change that has been detected.
• Paging systems that will notify a person of the event by mobile phone.
• Database triggers that restrict access of a user to specific records.
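
One way to picture triggers is as a lookup from a trigger kind to the handler that initiates the response. The Python sketch below is illustrative only: the handler names, the returned messages and the `fire` function are hypothetical, and real triggers would call into the incident, change or paging systems rather than return strings.

```python
from typing import Callable, Dict


def incident_trigger(event: Dict) -> str:
    # Stand-in for creating a record in an incident management system.
    return f"incident record opened for {event['ci']}"


def change_trigger(event: Dict) -> str:
    # Stand-in for raising a request for change (RFC).
    return f"RFC raised for {event['ci']}"


def paging_trigger(event: Dict) -> str:
    # Stand-in for notifying a person by mobile phone.
    return f"on-call engineer paged about {event['ci']}"


# Each trigger is designed for the one task it has to initiate.
TRIGGERS: Dict[str, Callable[[Dict], str]] = {
    "incident": incident_trigger,
    "change": change_trigger,
    "page": paging_trigger,
}


def fire(kind: str, event: Dict) -> str:
    """Initiate the response for a recognized event."""
    try:
        handler = TRIGGERS[kind]
    except KeyError:
        raise ValueError(f"no trigger registered for {kind!r}") from None
    return handler(event)
```

Keeping the triggers in a table makes it explicit which responses exist, and an unrecognized kind fails loudly instead of being silently dropped.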

Response selection

At this point in the process, a number of response options are available. Different organizations will have different options, and there will be a range of responses for each different technology.

Review actions

With so many events occurring on a daily basis, it is not possible to review each one individually. However, it is essential to check that any significant events or exceptions have been handled correctly, and to track trends. In most cases this can be done automatically.

Close Event

Some events remain open until specific actions take place, e.g. an event that is linked to an open incident. However, most events are not opened or closed.

• Informational events are logged.
• Auto-response events will typically be closed by the generation of a second event.
• Events that have generated an incident, problem or change will be formally closed with a link to the appropriate record from the other processes.
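
These closure rules can be expressed as a small decision function. The Python sketch below rests on assumptions: the `EventRecord` fields, the status strings and the record reference format are hypothetical, and an auto-response event is simply treated as closed, on the assumption that the confirming second event has already arrived.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EventRecord:
    event_id: str
    significance: str                    # "informational", "warning" or "exception"
    linked_record: Optional[str] = None  # hypothetical reference, e.g. "INC-1042"


def closure_status(event: EventRecord, linked_record_is_open: bool = False) -> str:
    """Decide how an event record ends up, following the rules above."""
    if event.linked_record is not None:
        if linked_record_is_open:
            return "open"  # stays open until the linked record is resolved
        # Formally closed, with a link to the record in the other process.
        return f"closed, see {event.linked_record}"
    if event.significance == "informational":
        return "logged"  # never opened or closed, only logged
    # Assumption: an auto-response event whose second event has been received.
    return "closed"
```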

Information Management

• SNMP messages, which are a standard way of communicating technical information about the status of components of an IT infrastructure.

• Management Information Bases of IT devices

• Vendor's monitoring tools agent software

• Correlation Engines

• Event Records.

Value to the business

The value to the business of implementing the Event Management process is generally indirect, but it is possible to determine the basis for its value. E.g.

• It provides mechanisms for early detection of incidents.
• It enables some types of automated activity to be monitored by exception.
• It signals status changes or exceptions, allowing the appropriate person or team to respond early.
• It provides a basis for automated operations, thus increasing efficiency and allowing human resources to be better utilized.

Metrics

• # of events by category and significance.
• # and % of events that required human intervention and whether this was performed.
• # and % of events that resulted in incidents and changes.
• # and % of events caused by existing problems and known errors.
• # and % of repeated or duplicated events.
• # and % of events indicating performance issues and potential availability issues.
• # and % of each type of event per platform or application.
• # and ratio of events compared with the number of incidents.
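
Several of these metrics are simple counts and percentages over the event records. The Python sketch below computes a few of them. The record fields (`significance`, `needed_human`, `raised_incident`, `duplicate`) are hypothetical, and the remaining metrics would follow the same pattern.

```python
from collections import Counter
from typing import Dict, List


def event_metrics(events: List[Dict], incident_count: int) -> Dict:
    """Compute a subset of the metrics above from a list of event records."""
    total = len(events)

    def pct(n: int) -> float:
        # Percentage of all events, rounded to one decimal place.
        return round(100.0 * n / total, 1) if total else 0.0

    human = sum(1 for e in events if e.get("needed_human"))
    to_incident = sum(1 for e in events if e.get("raised_incident"))
    duplicates = sum(1 for e in events if e.get("duplicate"))

    return {
        "by_significance": dict(Counter(e["significance"] for e in events)),
        "human_intervention": (human, pct(human)),          # (# , %)
        "resulted_in_incident": (to_incident, pct(to_incident)),
        "duplicates": (duplicates, pct(duplicates)),
        # Ratio of events to incidents; None when there were no incidents.
        "events_per_incident": total / incident_count if incident_count else None,
    }
```

A high share of duplicates or a very large events-per-incident ratio would suggest that the filtering level needs attention.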

Challenges

• Initial challenge in obtaining funding for necessary tools and effort required.

• Setting the correct level of filtering.

• Rolling out the necessary monitoring agents across the entire IT infrastructure can be difficult and time-consuming, and requires ongoing commitment.

• Acquiring the necessary skills.

CSFs

In order to obtain the necessary funding, a compelling Business Case should be prepared showing how the benefits of effective Event Management can far outweigh the costs, giving a positive return on investment.

One of the most important CSFs is achieving the correct level of filtering. This is complicated by the fact that the significance of events changes. E.g. a user logging into a system today is normal, but if that user leaves the organization and then tries to log in, it is a security breach.
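
The login example shows why a filter cannot rate an event type once and for all: the same event needs context. A minimal Python sketch, assuming a hypothetical set of currently active user names as that context:

```python
from typing import Set


def login_significance(user: str, active_users: Set[str]) -> str:
    """Rate a login event against who is currently part of the organization."""
    if user in active_users:
        return "informational"  # a normal login
    return "exception"          # login by someone who has left: possible breach


# Usage: the same event changes significance once the user leaves.
active = {"alice", "bob"}
before = login_significance("alice", active)
active.discard("alice")         # alice leaves the organization
after = login_significance("alice", active)
```

Here `before` is "informational" and `after` is "exception", although the login event itself is identical in both cases.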

Risks

• Failure to obtain adequate funding

• Failure to achieve the correct level of filtering

• Failure to maintain momentum in rolling out the necessary monitoring agents across the IT infrastructure.