Introduction To Nagios
A Linux-based Monitoring System
What Is Nagios?
Nagios is a system that monitors the availability of network resources, such as hosts and services.
It enables you to identify and resolve IT infrastructure problems before they affect critical processes.
Brief History
Nagios was originally created under the name NetSaint. It was written and is maintained by Ethan Galstad, along with a group of plugin developers.
History cont.
Launched in March 1999 under the GNU General Public License.
In March 2002, due to trademark issues with the name “NetSaint”, Ethan decided to rename the project to Nagios, a recursive acronym that stands for “Nagios ain’t gonna insist on sainthood”.
Requirements
* A machine running Linux or a Unix variant
* C compiler, e.g. gcc
TCP/IP configured
CGIs (optional): Apache web server and Thomas Boutell’s gd library version 1.6.3 or higher (used by the statusmap and trends CGIs)
* Must have
What Can Nagios Monitor?
Applications (Tomcat servers)
Host resources (CPU load, disk space)
Infrastructure components (routers, switches)
Database servers (MySQL, Oracle)
Network services (http, ssh, ping)
Web servers
Mail servers
Nagios Configuration
nagios.cfg
cgi.cfg
resource.cfg
Object Definition Files
    Commands
    Hosts and Services
    Contacts and contact groups
Plugins
    Homemade Plugins
Commands and Plugins
A plugin is an executable or script that can be run from the command line and returns an exit code of 0=OK, 1=WARNING, 2=CRITICAL or 3=UNKNOWN.
A command consists of a plugin plus macros and is used to perform the host or service check.
define command {
    command_name    check-host-alive
    command_line    $USER1$/check_ping $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1
}
define host {
    host_name       glastlnx19.slac.stanford.edu
    check_command   check-host-alive
}
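The exit-code contract described above can be sketched as a small homemade plugin. This is only an illustration, not one of the plugins used in this deployment: the name check_disk_pct.py, its argument order, and the output wording are all assumptions.

```python
#!/usr/bin/env python3
"""Minimal sketch of a homemade Nagios plugin (hypothetical check_disk_pct.py)."""
import shutil
import sys

# Exit codes Nagios expects from every plugin.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(used_pct, warn_pct, crit_pct):
    """Map a usage percentage onto a plugin exit code."""
    if used_pct >= crit_pct:
        return CRITICAL
    if used_pct >= warn_pct:
        return WARNING
    return OK


def main(argv):
    # Assumed usage: check_disk_pct.py PATH WARN_PCT CRIT_PCT
    try:
        path, warn_pct, crit_pct = argv[1], float(argv[2]), float(argv[3])
        usage = shutil.disk_usage(path)
        used_pct = 100.0 * usage.used / usage.total
    except (IndexError, ValueError, OSError, ZeroDivisionError) as exc:
        # Anything that prevents a measurement is reported as UNKNOWN.
        print(f"DISK UNKNOWN - {exc}")
        return UNKNOWN
    code = classify(used_pct, warn_pct, crit_pct)
    # Nagios displays the first line of output as the status text.
    print(f"DISK {LABELS[code]} - {used_pct:.1f}% used on {path}")
    return code


# When installed as a real plugin, the script would end with:
#     sys.exit(main(sys.argv))
```

A command definition could then reference such a script through $USER1$ in the same way check_ping is referenced above, passing the thresholds as arguments.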
Homemade Plugins
Host and Service Definitions
define host {
    use                     generic-host    ; Name of host template to use
    host_name               glastlnx19.slac.stanford.edu
    alias                   glastlnx19
    address                 134.79.200.39
    check_command           check-host-alive
    max_check_attempts      10
    check_period            24x7
    notification_interval   120
    notification_period     24x7
    contact_groups          core
}
define service {
    use                     generic-service
    host_name               glastlnx19.slac.stanford.edu
    service_description     Web App Telemetry Trending – tomcat 12
    is_volatile             0
    check_period            24x7
    max_check_attempts      4
    normal_check_interval   5
    retry_check_interval    1
    contact_groups          core
    notification_options    w,u,c,r
    notification_interval   960
    notification_period     24x7
    check_command           check_jmx!-uservice:jmx:rmi://jndi/rmi://glast-tomcat12.slac.stanford.edu:8081/jmxrmi!-mCatalina:j2eeType=WebModule,name=//localhost/TelemetryTrending,J2EEApplication=none,J2EEServer=none!-astate!-e1
}
Nagios File Structure For Fermi Monitoring
Nagios Remote Plugin Executor
Chronological Progression Of Service State
Notifications
Nagios Web Interface
http://glast-nagios.slac.stanford.edu/nagios/
Contacts and Contact Groups
define contact {
    contact_name                    Brian
    alias                           Brian Van Klaveren
    service_notification_options    w,u,c,r
    service_notification_period     24x7
    service_notification_commands   notify_by_email
    host_notification_commands      notify_by_email
    email                           [email protected]
}
define contactgroup {
    contactgroup_name   oracle_load_group
    alias               Oracle Load Group
    members             Brian, Tony
}
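As a rough illustration of how a contact group and each contact's notification options work together, the sketch below picks the members of a group who would accept a notification for a given state flag (w = warning, u = unknown, c = critical, r = recovery). It is a simplification: it ignores notification periods and the service's own notification_options, and the options shown for Tony are hypothetical because the slides define only Brian.

```python
def recipients(group, contactgroups, contacts, state_flag):
    """Return the group members whose notification options include state_flag."""
    members = [m.strip() for m in contactgroups[group].split(",")]
    return [m for m in members if state_flag in contacts[m].split(",")]


# "members" value copied from the contactgroup definition above.
contactgroups = {"oracle_load_group": "Brian, Tony"}

# service_notification_options per contact; Tony's value is hypothetical.
contacts = {"Brian": "w,u,c,r", "Tony": "c,r"}

print(recipients("oracle_load_group", contactgroups, contacts, "w"))
print(recipients("oracle_load_group", contactgroups, contacts, "c"))
```

With these assumed values, a warning would reach only Brian, while a critical state would reach both members.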
Host and Service Definition
define host {
    use                     generic-host    ; Name of host template to use
    host_name               glast-astro-db1.slac.stanford.edu
    alias                   glast-astro-db1
    address                 134.79.200.16
    check_command           check-host-alive
    max_check_attempts      10
    check_period            24x7
    notification_interval   120
    notification_period     24x7
    notification_options    d,r
    contact_groups          oracle_load_group
}
define service {
    use                     generic-service ; Name of service template to use
    host_name               glast-astro-db1.slac.stanford.edu
    service_description     Oracle Astro Pass 7
    is_volatile             0
    check_period            24x7
    max_check_attempts      4
    normal_check_interval   5
    retry_check_interval    1
    contact_groups          oracle_load_group
    notification_options    w,u,c,r
    notification_interval   1800
    notification_period     24x7
    check_command           check_oracle2!/@astro_pass7
}
Plugins
Plugins are executables or scripts that can be run from a command line and return an exit code.
Homemade plugins (aka commands) are built from plugins and macros; Nagios can call external programs using these commands.
define service {
    use                     generic-service ; Name of service template
    host_name               glastlnx19.slac.stanford.edu
    service_description     Ping
    is_volatile             0
    check_period            24x7
    max_check_attempts      4
    normal_check_interval   5
    retry_check_interval    1
    contact_groups          core
    notification_options    w,u,c,r
    notification_interval   960
    notification_period     24x7
    check_command           check_ping!100.0,20%!500.0,60%
}
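In a check_command value, the text before the first "!" names the command and each following "!"-separated field is passed to it as $ARG1$, $ARG2$, and so on. The sketch below imitates that substitution for the Ping service above. The check_ping command_line and the $USER1$ directory used here are assumptions, since the slides do not show the definition actually in use.

```python
def expand_check_command(check_command, command_lines, macros):
    """Expand a check_command value into the command line that would be run."""
    name, *args = check_command.split("!")
    line = command_lines[name]
    # Fields after each "!" become $ARG1$, $ARG2$, ... in order.
    for i, arg in enumerate(args, start=1):
        line = line.replace(f"$ARG{i}$", arg)
    # Remaining macros such as $USER1$ and $HOSTADDRESS$ are filled in last.
    for macro, value in macros.items():
        line = line.replace(f"${macro}$", value)
    return line


# Assumed command definition for check_ping (not shown in the slides).
command_lines = {
    "check_ping": "$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5",
}
macros = {
    "USER1": "/usr/local/nagios/libexec",  # assumed plugin directory
    "HOSTADDRESS": "134.79.200.39",        # address of glastlnx19 from the host definition
}

print(expand_check_command("check_ping!100.0,20%!500.0,60%", command_lines, macros))
```

Under these assumptions, 100.0,20% becomes the warning threshold and 500.0,60% the critical threshold passed to check_ping.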