The Industry Standard In IT Infrastructure Monitoring

Nagios, Getting Started.

Jun 26, 2015

Hitesh Bhatia

Nagios is a world standard for monitoring IT infrastructure.

This presentation will help you get started with Nagios.
Transcript
Page 1: Nagios, Getting Started.

The Industry Standard In IT Infrastructure Monitoring

Page 2: Nagios, Getting Started.

Who are using Nagios

Page 3: Nagios, Getting Started.

Agenda

• What is Nagios

• What can you do with Nagios

• Features

• Basics
  o Architecture
  o Terminology

• Monitoring

• State Types

• Active / Passive Checks

• Reports

Page 4: Nagios, Getting Started.

What is Nagios Core

Open Source system and network monitoring application

Page 5: Nagios, Getting Started.

With Nagios you can

• Monitor your entire IT infrastructure

• Spot problems before they occur

• Know immediately when problems arise

• Share availability data with stakeholders

• Detect security breaches

• Plan and budget for IT upgrades

• Reduce downtime and business losses

Page 6: Nagios, Getting Started.

Features

• Monitoring of network services
  • SMTP
  • POP3
  • HTTP
  • PING and more

• Monitoring of host resources
  • Processor load
  • Disk usage and more

• Simple plugin design that allows users to easily develop their own service checks

• Parallelized service checks
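As a rough illustration of the "simple plugin design" point: a plugin is just a program that prints one status line and exits with a conventional return code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below is not from the slides; the function name `evaluate_disk_usage` and the 80/90 percent thresholds are assumptions chosen for the example.

```python
# Conventional plugin return codes understood by Nagios.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate_disk_usage(used_percent, warn=80.0, crit=90.0):
    """Map a disk usage percentage to a (return code, status line) pair.

    The thresholds are illustrative defaults, not values from the slides.
    """
    if used_percent is None or not 0.0 <= used_percent <= 100.0:
        return UNKNOWN, "DISK UNKNOWN - could not read usage"
    if used_percent >= crit:
        code = CRITICAL
    elif used_percent >= warn:
        code = WARNING
    else:
        code = OK
    # Text after "|" is performance data in the usual plugin output convention.
    line = "DISK %s - %.1f%% used | used=%.1f%%;%.1f;%.1f" % (
        STATE_NAMES[code], used_percent, used_percent, warn, crit)
    return code, line
```

A real plugin would print the returned line and pass the code to `sys.exit()`, so that Nagios can read the state from the process exit status.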

Page 7: Nagios, Getting Started.

Features

• Ability to define network host hierarchy/groups

• Detection of, and distinction between, hosts that are down and hosts that are unreachable

• Contact notifications when service or host problems occur and get resolved, via
  • Email
  • Pager
  • User-defined methods

• Ability to define event handlers to be run during service or host events for proactive problem resolution
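The down versus unreachable distinction follows from the host hierarchy: a host that fails its check is DOWN if something upstream of it still answers, and only UNREACHABLE if every parent is itself down or unreachable. The following is a simplified sketch of that rule, assuming an acyclic parent map; `classify_hosts` and the host names in the usage note are hypothetical.

```python
def classify_hosts(parents, responds):
    """Return {host: "UP" | "DOWN" | "UNREACHABLE"}.

    parents:  {host: [parent hosts]}; an empty list means the host sits
              next to the monitoring server. Assumed to contain no cycles.
    responds: {host: bool} result of a direct check from the monitoring server.
    """
    states = {}

    def state_of(host):
        if host in states:
            return states[host]
        if responds[host]:
            states[host] = "UP"
        else:
            host_parents = parents.get(host, [])
            # A failed host is DOWN if it has no parents or at least one
            # parent is still UP; otherwise it is only UNREACHABLE.
            if not host_parents or any(state_of(p) == "UP" for p in host_parents):
                states[host] = "DOWN"
            else:
                states[host] = "UNREACHABLE"
        return states[host]

    for host in responds:
        state_of(host)
    return states
```

For a chain switch -> router -> web where only the switch responds, this yields the router as DOWN and the web server behind it as UNREACHABLE, which is the case where only the router problem deserves attention.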

Page 8: Nagios, Getting Started.

Features

• Automatic log file rotation

• Support for implementing redundant monitoring hosts

• Optional web interface for viewing
  • Current network status
  • Notification
  • Problem history
  • Log file and more

Page 9: Nagios, Getting Started.

Basics

Page 10: Nagios, Getting Started.

Basics

Page 11: Nagios, Getting Started.

Basics

Page 12: Nagios, Getting Started.

Basics

Page 13: Nagios, Getting Started.

Basics

Page 14: Nagios, Getting Started.

Basics

Definitions

• Host

• Service

• Contacts

• Commands

• TimePeriod

• Eventhandlers

Page 15: Nagios, Getting Started.

Basics

Host

Defines a physical server, workstation, device, etc. that resides on your network.

Page 16: Nagios, Getting Started.

Basics

Host

define host{
    host_name               remotehost
    alias                   Some Remote Host
    address                 192.168.1.50
    contacts                admin
    max_check_attempts      3
    check_period            24x7
    notification_interval   60
    notification_period     24x7
}

Page 17: Nagios, Getting Started.

Basics

Service

• It's a service that runs on the host.

• Actual service on the host (POP, SMTP, HTTP, etc.)

• Metric associated with the host (response to a ping, number of logged in users, free disk space, etc.)

Page 18: Nagios, Getting Started.

Basics

Service

Page 19: Nagios, Getting Started.

Basics

Service

define service {
    host_name               linux-server
    service_description     check-disk-sda1
    check_command           check-disk!/dev/sda1
    max_check_attempts      5
    check_interval          5
    retry_interval          3
    check_period            24x7
    notification_interval   30
    notification_period     24x7
    notification_options    w,c,r
    contact_groups          admins
}

Page 20: Nagios, Getting Started.

Basics

Contacts

Identify someone who should be contacted in the event of a problem.

define contact{
    contact_name                    admin
    alias                           admin
    host_notifications_enabled      1
    service_notifications_enabled   1
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,u,r
    service_notification_commands   notify-by-email
    host_notification_commands      host-notify-by-email
    email                           [email protected] [email protected]
}

Page 21: Nagios, Getting Started.

Basics

Commands

define command{
    name            check_http
    command_name    check_http
    command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
}

define host {
    ..
    address         192.168.1.50
    ..
}

define service {
    ..
    check_command   check-disk!/dev/sda1
    ..
}
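The three definitions fit together through macro substitution: the text after each "!" in check_command becomes $ARG1$, $ARG2$ and so on, and host values such as the address fill $HOSTADDRESS$. The sketch below imitates that expansion in a simplified form; `expand_command` is a hypothetical helper, not part of Nagios, and the `$USER1$` path used in the usage note is an assumed plugin directory.

```python
import re


def expand_command(check_command, command_lines, macros):
    """Expand a check_command value such as "check_http!-p 8080" into a command line.

    command_lines: {command_name: command_line} taken from command definitions.
    macros:        {"$MACRO$": value} for host and resource macros.
    """
    name, *args = check_command.split("!")
    line = command_lines[name]
    values = dict(macros)
    # Arguments after each "!" are numbered from $ARG1$.
    for index, arg in enumerate(args, start=1):
        values["$ARG%d$" % index] = arg
    for macro, value in values.items():
        line = line.replace(macro, value)
    # Any $ARGn$ that was not supplied expands to nothing.
    return re.sub(r"\$ARG\d+\$", "", line).strip()
```

With the check_http command line above, `$USER1$` set to `/usr/local/nagios/libexec` (an assumption) and the host address 192.168.1.50, the value `check_http!-p 8080` expands to `/usr/local/nagios/libexec/check_http -I 192.168.1.50 -p 8080`.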

Page 22: Nagios, Getting Started.

Basics

Time Period

Valid times for notifications and service checks.

define timeperiod{
    timeperiod_name     nonworkhours
    alias               Non-Work Hours
    sunday              00:00-24:00
    monday              00:00-09:00,17:00-24:00
    tuesday             00:00-09:00,17:00-24:00
    wednesday           00:00-09:00,17:00-24:00
    thursday            00:00-09:00,17:00-24:00
    friday              00:00-09:00,17:00-24:00
    saturday            00:00-24:00
}
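To make the time period definition concrete, the sketch below checks whether a given moment falls inside a weekday's comma-separated ranges. It only handles plain weekday entries like the ones in the nonworkhours example; `in_timeperiod` and the dictionary layout are assumptions for illustration, not how Nagios stores time periods internally.

```python
from datetime import datetime

# datetime.weekday() returns 0 for Monday through 6 for Sunday.
DAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]


def _minutes(hhmm):
    """Convert "HH:MM" to minutes since midnight ("24:00" becomes 1440)."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def in_timeperiod(timeperiod, when):
    """Return True if `when` falls inside one of the ranges for its weekday."""
    ranges = timeperiod.get(DAYS[when.weekday()], "")
    now = when.hour * 60 + when.minute
    for span in filter(None, ranges.split(",")):
        start, end = span.split("-")
        # Ranges are treated as half-open: start <= now < end.
        if _minutes(start) <= now < _minutes(end):
            return True
    return False


NONWORKHOURS = {
    "sunday": "00:00-24:00",
    "monday": "00:00-09:00,17:00-24:00",
    "tuesday": "00:00-09:00,17:00-24:00",
    "wednesday": "00:00-09:00,17:00-24:00",
    "thursday": "00:00-09:00,17:00-24:00",
    "friday": "00:00-09:00,17:00-24:00",
    "saturday": "00:00-24:00",
}
```

For example, a Friday at 10:00 is outside nonworkhours, while the same Friday at 18:30 or any time on Saturday is inside it.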

Page 23: Nagios, Getting Started.

Event Handlers

Event handlers are optional system commands (scripts or executables) that are run whenever a host or service state change occurs.

• Restarting a failed service

• Entering a trouble ticket into a helpdesk system

• Logging event information to a database

• Cycling power on a host

Page 24: Nagios, Getting Started.

Event Handlers

Event handlers are executed when a service or host:

• Is in a SOFT problem state

• Initially goes into a HARD problem state

• Initially recovers from a SOFT or HARD problem state

define service {
    ..
    event_handler           command_name
    event_handler_enabled   [0/1]
    ..
}
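Because the handler runs on SOFT as well as HARD state changes, the script itself has to decide when to act. A minimal sketch of that decision is shown below, assuming the command passes the values of the $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ macros as arguments; the function name, the returned action strings and the "restart on the third soft attempt" policy are illustrative choices, not something the slides prescribe.

```python
def choose_action(state, state_type, attempt, restart_on_soft_attempt=3):
    """Decide what an event handler should do for one service state change.

    The arguments correspond to the values Nagios can pass through the
    $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ macros.
    """
    if state != "CRITICAL":
        return "none"       # OK, WARNING and UNKNOWN are left alone here
    if state_type == "HARD":
        return "restart"    # the problem is confirmed, try a restart
    if state_type == "SOFT" and attempt == restart_on_soft_attempt:
        return "restart"    # try to fix it before contacts are notified
    return "wait"           # early soft failure, let the rechecks continue
```

Acting on a late SOFT attempt gives the handler a chance to repair the service before it reaches a HARD state and notifications go out.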

Page 25: Nagios, Getting Started.

Basics

Other Blocks

• contactgroup

• servicegroup

• servicedependency

• serviceescalation

• serviceextinfo

• hostdependency

• hostescalation

• hostextinfo

Page 26: Nagios, Getting Started.

Monitoring Services

Nagios can be used to monitor Public and Private Services

• Private Services
  • CPU load
  • Memory usage
  • Disk usage
  • Logged in users
  • Running processes

• Publicly available services that are provided by Linux servers
  • HTTP
  • FTP
  • SSH
  • SMTP

Page 27: Nagios, Getting Started.

Monitoring Private Services

• Plugins/Addons are mostly used for monitoring private services.

• The NRPE (Nagios Remote Plugin Executor) addon is installed on the target servers

• It is an addon that allows you to execute plugins on remote Linux/Unix hosts

Page 28: Nagios, Getting Started.

Monitoring Private Services

• NSCA addon (Nagios Service Check Acceptor)

• Allows you to send passive check results from remote Linux/Unix hosts to the Nagios daemon running on the monitoring server.

• This is very useful in distributed and redundant/failover monitoring setups.

Page 29: Nagios, Getting Started.

Monitoring Public Services

• Check for existing plugins first at Nagios Exchange

• Walk through
  • Create the host in a file within the cfg dir
  • Define a service for each process/service that needs to be monitored.
  • Services use pre-defined or custom-defined commands.
  • Define contacts who will receive notifications and take action.

Page 30: Nagios, Getting Started.

State Types

• Based on the max_check_attempts directive

• A SOFT state is logged when
  • A check returns a problem state but has not yet been rechecked max_check_attempts times
  • A service or host recovers from a soft error. This is considered a soft recovery.

Page 31: Nagios, Getting Started.

State Types

• A HARD state is logged when
  • A check returns a problem state and has been rechecked max_check_attempts times (e.g. a host going from UP to DOWN)
  • A host or service transitions from one hard error state to another error state (e.g. WARNING to CRITICAL)
  • A service check results in a non-OK state and its corresponding host is either DOWN or UNREACHABLE
  • A host or service recovers from a hard error state. This is considered a hard recovery.

• On a hard state change, contacts are notified of the host or service problem or recovery.
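The interplay between max_check_attempts and the SOFT and HARD state types can be traced with a small simulation. This is a simplified model written for this transcript, not Nagios source code: it tracks a single service, ignores the rule about the host being DOWN or UNREACHABLE, and the name `track_states` is hypothetical.

```python
def track_states(results, max_check_attempts=3):
    """Return (state, state_type, attempt) for each check result in order."""
    history = []
    attempt = 1
    prev_state, prev_type = "OK", "HARD"
    for state in results:
        if state == "OK":
            # Recovering from a SOFT problem is a soft recovery;
            # anything else leaves the service in a HARD OK state.
            soft_recovery = prev_state != "OK" and prev_type == "SOFT"
            state_type = "SOFT" if soft_recovery else "HARD"
            attempt = 1
        elif prev_state == "OK":
            # First failed check: start counting attempts.
            attempt = 1
            state_type = "HARD" if max_check_attempts <= 1 else "SOFT"
        elif prev_type == "HARD":
            # Already a confirmed problem; it stays HARD (e.g. WARNING to CRITICAL).
            attempt = max_check_attempts
            state_type = "HARD"
        else:
            # Recheck of a SOFT problem.
            attempt += 1
            state_type = "HARD" if attempt >= max_check_attempts else "SOFT"
        history.append((state, state_type, attempt))
        prev_state, prev_type = state, state_type
    return history
```

With max_check_attempts set to 3, three consecutive CRITICAL results produce SOFT, SOFT, HARD, and only the third would trigger notifications; a single CRITICAL followed by OK is logged as a soft error and a soft recovery.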

Page 32: Nagios, Getting Started.

Active / Passive Checks

Active Checks

● Initiated by the Nagios process

● Run on a regularly scheduled basis

Page 33: Nagios, Getting Started.

Active / Passive Checks

Passive Checks

● Passive checks are initiated and performed by external applications/processes

● Passive check results are submitted to Nagios for processing

● Used for
  ● Checks that are asynchronous in nature
  ● Hosts or services located behind a firewall that cannot be checked actively from the monitoring host
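Submitting a passive result usually means handing Nagios a PROCESS_SERVICE_CHECK_RESULT external command. The sketch below only builds that line; the helper name `format_passive_result` is hypothetical, and where the line is written (the external command file, or an addon such as NSCA) depends on the installation, so that part is left out.

```python
import time


def format_passive_result(host, service, return_code, output, timestamp=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN)")
    if timestamp is None:
        timestamp = int(time.time())
    # Fields are separated by semicolons; the plugin output comes last.
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, output)
```

The host and service names must match an existing service definition, for example `linux-server` and `check-disk-sda1` from the earlier slide, otherwise Nagios has nothing to attach the result to.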

Page 34: Nagios, Getting Started.

Nagios in Action

Demo Time : http://nagioscore.demos.nagios.com/

Page 35: Nagios, Getting Started.

Reports

• Availability Report: report for uptime and services

• Trends Report: graphical breakdown of the state of a particular host or service

Page 36: Nagios, Getting Started.

Reports

• Alert History Report: record of historical alerts

Page 37: Nagios, Getting Started.

Reports

• Alert Summary Report

Page 38: Nagios, Getting Started.

Reports

• Alert Histogram Report: frequency graph of host and service alerts

Page 39: Nagios, Getting Started.

Reports

• Notification Report: provides a historical record of notifications sent to contacts

Page 40: Nagios, Getting Started.

Summary

• Infra monitoring

• Anomaly Outage detection

• Automatic Problem remedy

• Schedule Downtime

• Outage Alerts

• Alert Escalations

• Historical Reporting

• Maintenance Planning

Page 41: Nagios, Getting Started.

Advice for Beginners

• Relax - it's going to take some time.

• Use the quickstart instructions.

• Read the documentation.

• Visit the Nagios Support Forum at http://support.nagios.com/forum/.

Page 42: Nagios, Getting Started.

Next Steps

• Get your hands dirty

• Get training
  • Live / Self paced training

• Get certified
  • Nagios Certified Professional
  • Nagios Certified Administrator

• Use it to Monitor your infra.

Page 44: Nagios, Getting Started.

Thank You