Integrating kdump into oVirt 3.5, Martin Peřina, Software Engineer at Red Hat, August 26th 2014

Feb 08, 2017

Integrating kdump into oVirt 3.5

Martin Peřina, Software Engineer at Red Hat

August 26th 2014

Agenda

● Motivation

● What is kdump?

● What is fence_kdump?

● How is it all coupled together?

● Configuration

● Future features

Motivation

Host kernel crash on oVirt <= 3.4:

1. the host kernel crashes and the process that gathers crash information starts (this process can take a long time)

2. after some time, the engine detects the host as non-responsive and executes fencing on it

3. if the host is fenced during crash gathering, all crash information is lost

Goal for oVirt 3.5

● Try to detect whether the host is in kdump flow prior to fence execution

● If the host is in kdump flow, do not execute fencing and wait for the host to gather its crash information successfully

What is kdump?

● kexec-based kernel crash dumping mechanism (when the standard kernel crashes, a capture kernel is booted)

● dumps the memory content of the crashed kernel into a file on a local or remote target

● dumping is executed from the capture kernel, so the crashed kernel's memory is preserved

● the capture kernel needs memory reserved in the standard kernel

Standard and capture kernel

How does kdump work?

1. Standard kernel crashes

2. Kexec boots capture kernel

3. Memory dump is executed in capture kernel

4. Memory dump file is stored to specified target

5. Host is rebooted

Kdump configuration

kdump configuration is stored in:

● /etc/kdump.conf

  ● static configuration that can be changed by administrator

● capture kernel initial ramdisk file

  ● created from /etc/kdump.conf on kdump service restart

Sample kdump.conf

path /var/crash

core_collector makedumpfile -l --message-level 1 -d 31

Kdump requirements

● kexec-tools package, which contains the tools to set up and execute kdump

● crashkernel=MEM_SIZE command line parameter needs to be configured for the standard kernel (on RHEL/CentOS enabled by default, on Fedora the administrator has to enable it)

● kdump service has to be enabled
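
The crashkernel requirement above can be checked from the running kernel's command line. The following is a minimal sketch, assuming a Linux host where the command line is readable from /proc/cmdline; the function names are illustrative and are not part of kexec-tools or oVirt.

```python
from typing import Optional


def crashkernel_value(cmdline: str) -> Optional[str]:
    """Return the value of the last crashkernel= parameter, or None if absent."""
    value = None
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            # Keep everything after the first "=" (e.g. "auto" or "256M").
            value = token.split("=", 1)[1]
    return value


def read_kernel_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line of the running kernel (Linux only)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip()
```

On a host, `crashkernel_value(read_kernel_cmdline())` returning `None` would indicate that no memory is reserved for the capture kernel.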

What is fence_kdump?

● set of command line tools for receiving messages from a dumping host on another, predefined host

● part of the fence-agents-kdump package

● uses the UDP protocol for messaging

● uses port 7410 (can be changed)

● sends a message every 10 seconds (can be changed)
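
To make the transport concrete, here is a minimal sketch of a UDP receiver on the default port. It only records which source address sent a datagram and when; it does not parse or validate the real fence_kdump message format, which is not described in these slides. The function names are hypothetical.

```python
import socket
import time
from typing import Dict


def open_listener(address: str = "0.0.0.0", port: int = 7410) -> socket.socket:
    """Open a UDP socket bound to the given address and port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((address, port))
    return sock


def receive_once(sock: socket.socket, last_seen: Dict[str, float],
                 timeout: float = 1.0) -> bool:
    """Wait up to `timeout` seconds for one datagram.

    Records the arrival time per source IP and returns True if a datagram
    arrived. The payload is ignored in this sketch.
    """
    sock.settimeout(timeout)
    try:
        _payload, (host, _port) = sock.recvfrom(4096)
    except socket.timeout:
        return False
    last_seen[host] = time.monotonic()
    return True
```

Because the sender is identified only by its source IP address, the dictionary is keyed by that address.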

Kdump and fence_kdump

/etc/kdump.conf contains two options to set up fence_kdump:

● fence_kdump_nodes

  ● list of hosts to send messages to

● fence_kdump_args

  ● additional parameters for fence_kdump_send

kdump.conf with fence_kdump

path /var/crash

core_collector makedumpfile -l --message-level 1 -d 31

fence_kdump_nodes mperina.brq.redhat.com

fence_kdump_args -p 7410 -i 5

fence_kdump limitations

● fence_kdump destination host(s) have to be predefined, and they are stored in the capture kernel's initial ramdisk

● the fence_kdump receiver can watch only one host at a time to determine whether it is kdumping, and it cannot determine whether the host has finished kdumping

● fence_kdump messages are sent unencrypted over UDP

● fence_kdump messages are not signed; the sender can be identified only by its source IP address

How is it coupled together?

oVirt kdump integration

New fence_kdump listener

● a new standalone fence_kdump listener was implemented as part of the oVirt kdump integration

● it can receive messages from multiple kdumping hosts at once

● it can determine that a host has finished kdumping using a timeout since the last received message

● it communicates with the engine through the engine database

● it runs as a service on the same host as the engine
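
The timeout-based detection of a finished kdump can be sketched as follows. This is an illustration of the idea only, not the listener's actual code: the class name and methods are hypothetical, timestamps are passed in explicitly, and the 30-second default mirrors the KDUMP_FINISHED_TIMEOUT value from the sample listener config.

```python
from typing import Dict, List


class KdumpSessions:
    """Tracks the last message time per host and reports hosts that went silent."""

    def __init__(self, finished_timeout: float = 30.0) -> None:
        self.finished_timeout = finished_timeout
        self._last_message: Dict[str, float] = {}

    def message_received(self, host: str, now: float) -> None:
        """Start or refresh the kdumping session of a host."""
        self._last_message[host] = now

    def dumping_hosts(self) -> List[str]:
        """Hosts that are currently considered to be kdumping."""
        return sorted(self._last_message)

    def collect_finished(self, now: float) -> List[str]:
        """Return and forget hosts silent for longer than the timeout."""
        finished = sorted(
            host for host, seen in self._last_message.items()
            if now - seen > self.finished_timeout
        )
        for host in finished:
            del self._last_message[host]
        return finished
```

Keeping one entry per host is what allows several kdumping hosts to be tracked at once.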

Integration – host deploy 1/3

● kdump integration can be enabled for each host by setting an option in the Power Management tab of the Host detail popup in webadmin

● the host needs to be redeployed after kdump integration has been enabled

● kdump integration is not bound to cluster level; it can be enabled even for cluster levels < 3.5

Integration – host deploy 2/3

● during host deploy, checks are executed to verify that kdump integration can be enabled:

  ● host kernel has the crashkernel=MEM_SIZE option set

  ● correct version of kexec-tools is available

  ● kdump destination address (engine FQDN) can be resolved

● if any of these checks fails, host deploy still finishes successfully, but kdump integration is not configured and a warning is displayed
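
The "warn but do not fail" behaviour described above can be sketched like this. It is not the host deploy implementation: `run_kdump_checks`, `address_resolves`, and the warning text are hypothetical, and the individual checks are passed in as callables so that the sketch stays independent of how each one is really performed.

```python
import socket
from typing import Callable, Dict, List, Tuple


def address_resolves(hostname: str) -> bool:
    """Return True if the name (or IP literal) can be resolved."""
    try:
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False
    return True


def run_kdump_checks(checks: Dict[str, Callable[[], bool]]) -> Tuple[bool, List[str]]:
    """Run all checks; return (configure_kdump, warnings).

    A failed check never raises: the deploy continues, kdump integration is
    just left unconfigured and a warning is collected.
    """
    warnings = [
        f"kdump integration check failed: {name}"
        for name, check in checks.items()
        if not check()
    ]
    return (not warnings, warnings)
```

A caller would configure fence_kdump only when the first element of the result is true, and show the collected warnings otherwise.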

Integration – host deploy 3/3

● if all checks are successful:

  ● fence_kdump options are updated in /etc/kdump.conf

  ● kdump service is restarted

● if kdump integration was not successfully configured during host deploy, the administrator can fix the issues manually and then redeploy the host

UI: New Host popup

UI: Host Detail

Host deploy part limitations

● host deploy updates only the fence_kdump options in kdump.conf; other options are left untouched

● the administrator is responsible for manually setting the correct kdump target
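
An update that touches only the fence_kdump options could look like the sketch below. It works on the file content as a string and is an illustration under assumptions, not the host deploy code: the function name is hypothetical, and the new options are simply appended at the end of the file.

```python
from typing import List

MANAGED_OPTIONS = ("fence_kdump_nodes", "fence_kdump_args")


def update_fence_kdump_options(conf_text: str, nodes: List[str], args: str) -> str:
    """Replace fence_kdump_nodes/fence_kdump_args, keep every other line as-is."""
    kept = []
    for line in conf_text.splitlines():
        tokens = line.split()
        if tokens and tokens[0] in MANAGED_OPTIONS:
            continue  # drop the old value of a managed option
        kept.append(line)
    kept.append("fence_kdump_nodes " + " ".join(nodes))
    kept.append("fence_kdump_args " + args)
    return "\n".join(kept) + "\n"
```

Lines such as `path` and `core_collector`, including the kdump target, pass through unchanged, which matches the limitation that the target stays the administrator's responsibility.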

Integration – kdumping 1/2

Integration – kdumping 2/2

UI: Host start dumping

UI: Host finished dumping

Configuration

fence_kdump listener config

Listener configuration is stored in text files:

● They need to have the .conf suffix

● They have to be located under the /etc/ovirt-engine/ovirt-fence-kdump-listener.d directory

● They are simple property-based text files

A service restart is needed after config files are changed:

systemctl restart ovirt-fence-kdump-listener
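
Reading such a directory of property files can be sketched as below. Several details are assumptions made for illustration and are not stated in the slides: files are read in sorted name order, a later file overrides an earlier one, and lines starting with `#` are treated as comments. The function name is hypothetical.

```python
from pathlib import Path
from typing import Dict


def load_listener_config(directory: str) -> Dict[str, str]:
    """Merge KEY=VALUE pairs from all *.conf files in a directory."""
    settings: Dict[str, str] = {}
    for conf_file in sorted(Path(directory).glob("*.conf")):
        for raw_line in conf_file.read_text(encoding="utf-8").splitlines():
            line = raw_line.strip()
            # Skip blanks, comments (assumed "#" syntax) and malformed lines.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings
```

Values are returned as strings; converting options such as LISTENER_PORT to integers is left to the caller.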

Listener config file sample

LISTENER_ADDRESS=0.0.0.0

LISTENER_PORT=7410

HEARTBEAT_INTERVAL=30

SESSION_SYNC_INTERVAL=5

REOPEN_DB_CONNECTION_INTERVAL=30

KDUMP_FINISHED_TIMEOUT=30

fence_kdump listener options 1/3

LISTENER_ADDRESS

● IP address(es) that the fence_kdump listener listens on

● It can contain either 0.0.0.0 (default) or one specific IP address

LISTENER_PORT

● port that fence_kdump listener listens on (default 7410)

fence_kdump listener options 2/3

HEARTBEAT_INTERVAL

● Defines the interval in seconds (default 30) of listener's heartbeat updates to database

SESSION_SYNC_INTERVAL

● Defines the interval in seconds (default 5) to synchronize listener's host kdumping sessions in memory to database

fence_kdump listener options 3/3

REOPEN_DB_CONNECTION_INTERVAL

● Defines the interval in seconds (default 30) to reopen a database connection that was previously unavailable

KDUMP_FINISHED_TIMEOUT

● Defines the maximum time in seconds since the last message received from a kdumping host, after which the host's kdump flow is marked as FINISHED

fence_kdump engine config 1/4

● fence_kdump options that are not related to the listener are stored in the database and can be changed using the engine-config tool

● ovirt-engine has to be restarted (and sometimes hosts also have to be redeployed) after these values are changed

fence_kdump engine config 2/4

FenceKdumpDestinationAddress

● Defines the hostname(s) or IP address(es) to send fence_kdump messages to

● If empty (default), engine FQDN is used

FenceKdumpDestinationPort

● Defines the port (default 7410) to send fence_kdump messages to

fence_kdump engine config 3/4

FenceKdumpMessageInterval

● Defines interval in seconds (default 5) between messages sent by fence_kdump

FenceKdumpListenerTimeout

● Defines max timeout in seconds (default 90) since last heartbeat to consider fence_kdump listener alive.

fence_kdump engine config 4/4

KdumpStartedTimeout

● Defines the maximum time in seconds (default 30) to wait for the first message from a kdumping host (to detect that the host's kdump flow has started)
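
How the two engine-side timeouts might combine into a fencing decision is sketched below. The flow diagrams of the original slides are not part of this transcript, so this is a reconstruction under assumptions rather than the engine's logic: in particular, fencing immediately when the listener is not alive is an assumption, and the names `FenceDecision` and `decide` are hypothetical. The defaults mirror FenceKdumpListenerTimeout (90) and KdumpStartedTimeout (30).

```python
from enum import Enum
from typing import Optional


class FenceDecision(Enum):
    FENCE = "fence"
    WAIT_FOR_KDUMP = "wait_for_kdump"


def decide(now: float,
           unresponsive_since: float,
           listener_heartbeat: Optional[float],
           first_kdump_message: Optional[float],
           listener_timeout: float = 90.0,
           kdump_started_timeout: float = 30.0) -> Optional[FenceDecision]:
    """Return a decision, or None while the engine should keep waiting."""
    # Assumption: without a live listener kdump cannot be detected, so fence.
    if listener_heartbeat is None or now - listener_heartbeat > listener_timeout:
        return FenceDecision.FENCE
    # A message from the host means it is kdumping: do not fence it.
    if first_kdump_message is not None:
        return FenceDecision.WAIT_FOR_KDUMP
    # No message within the started timeout: treat the host as not kdumping.
    if now - unresponsive_since >= kdump_started_timeout:
        return FenceDecision.FENCE
    return None
```

All times are plain seconds on a shared clock, which keeps the function free of side effects and easy to reason about.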

Future features

● Extend kdump to send its flow status as part of the fence_kdump message (starting, dumping, finished, error, ...)

● Extend the fence_kdump protocol to:

  ● use a message sequence number

  ● include a unique host id (not to rely just on the IP address)

  ● include an HMAC signature for the message
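
To illustrate what such an extension could look like, here is a sketch of a signed message carrying a sequence number and a host id. The layout (4-byte sequence number, 16-byte host id, HMAC-SHA256 signature) and the pre-shared key are purely hypothetical; this is not the fence_kdump wire format and not a proposed one from the slides.

```python
import hashlib
import hmac
import struct
from typing import Optional, Tuple

# Hypothetical header: unsigned 32-bit sequence number + 16-byte host id.
_HEADER = struct.Struct("!I16s")
_DIGEST_SIZE = hashlib.sha256().digest_size


def sign_message(key: bytes, sequence: int, host_id: bytes) -> bytes:
    """Build header + HMAC-SHA256 signature over the header."""
    header = _HEADER.pack(sequence, host_id)
    return header + hmac.new(key, header, hashlib.sha256).digest()


def verify_message(key: bytes, message: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (sequence, host_id) if the signature is valid, else None."""
    if len(message) != _HEADER.size + _DIGEST_SIZE:
        return None
    header, signature = message[:_HEADER.size], message[_HEADER.size:]
    expected = hmac.new(key, header, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    return _HEADER.unpack(header)
```

A receiver could additionally reject sequence numbers it has already seen from the same host id, which is what the sequence number would be useful for.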

THANK YOU !

[email protected] at #ovirt (irc.oftc.net)