Page 1
Open Standards and Open Source in
Datacenter Management
蔡鎮宇 Chen-Yu Tsai <[email protected] >
2014/4/11 OSDC 2014 1
Page 2
Who am I?
• Software Engineer @ CloudMosa, Inc.
• System Administrator for 10+ years starting in college
• Skills: breaking and fixing things
Page 3
Overview
• Monitoring
• Management
• Provisioning
Page 4
- Monitoring -
Page 5
Log Everything!
Page 6
Where to start?
Page 7
MRTG
Page 8
Based on SNMP
Supported by most network devices
Page 9
Exports data and metrics
Page 10
Network traffic counters – used by MRTG
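
SNMP interface counters such as `ifInOctets` only ever increase and wrap around when they overflow, so a grapher has to turn two samples into a rate. The following is a minimal sketch of that calculation, not MRTG's actual code; the function name and the sample values are made up for illustration.

```python
def counter_rate(prev, curr, interval_s, bits=32):
    """Average per-second rate between two samples of a wrapping counter.

    Assumes the counter wrapped at most once between the samples.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    modulus = 1 << bits
    # Modular subtraction gives the right delta even across one wrap.
    delta = (curr - prev) % modulus
    return delta / interval_s


# Two hypothetical octet-counter samples taken 300 seconds apart,
# with a 32-bit wrap in between.
octets_per_second = counter_rate(4294967000, 704, 300)
bits_per_second = octets_per_second * 8
```

On fast links a 32-bit counter can wrap more than once per polling interval; the 64-bit `ifHCInOctets` / `ifHCOutOctets` counters avoid that, and `bits=64` covers them here.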
Page 11
Known MAC addresses
- Map the network
Page 14
Whatever the device supports
Look up vendor specific MIBs
Page 15
RRDTool
Time Series Database
Page 16
MRTG uses it
Page 17
Munin uses it
Page 18
… uses it
Page 19
Write your own!
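
One way to write your own collector is to drive the `rrdtool` command line from a script. The sketch below only builds the argument lists for `rrdtool create` and `rrdtool update`; the helper names, file name, and data-source name are assumptions for illustration, and nothing is executed.

```python
def rrd_create_cmd(path, ds_name, step=300, rows=288):
    """Build an `rrdtool create` command for one GAUGE data source.

    The heartbeat is set to twice the step, and a single AVERAGE
    archive keeps `rows` unconsolidated samples (288 x 300 s = 1 day).
    """
    heartbeat = step * 2
    return [
        "rrdtool", "create", path,
        "--step", str(step),
        f"DS:{ds_name}:GAUGE:{heartbeat}:0:U",
        f"RRA:AVERAGE:0.5:1:{rows}",
    ]


def rrd_update_cmd(path, value, timestamp="N"):
    """Build an `rrdtool update` command; "N" means "now"."""
    return ["rrdtool", "update", path, f"{timestamp}:{value}"]


# Hypothetical database and data-source names.
create = rrd_create_cmd("temperature.rrd", "temp")
update = rrd_update_cmd("temperature.rrd", 23.5)
```

Passing either list to `subprocess.run` would run the command on a host with RRDTool installed.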
Page 22
Munin – Resource Monitoring
Page 23
System is slow…
Page 24
CPU usage?
Page 26
Memory usage?
Page 28
Disk I/O?
Page 30
Web requests?
Page 32
Use plugins from standard set
Page 33
Or write your own!
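
A Munin plugin is just an executable: called with the argument `config` it prints the graph description, and called with no argument it prints the current values. Below is a minimal sketch of such a plugin; the field name `load` and the graph title are arbitrary choices for this example, and Munin already ships its own load plugin.

```python
#!/usr/bin/env python3
import os
import sys


def config_lines():
    """Graph metadata printed when Munin calls the plugin with 'config'."""
    return [
        "graph_title Load average (example)",
        "graph_vlabel load",
        "graph_category system",
        "load.label 1-minute load",
    ]


def value_line(load):
    """Current value, in the '<field>.value <number>' form Munin expects."""
    return f"load.value {load}"


def main(argv):
    if len(argv) > 1 and argv[1] == "config":
        print("\n".join(config_lines()))
    else:
        # os.getloadavg() is available on Unix-like systems only.
        print(value_line(os.getloadavg()[0]))


if __name__ == "__main__":
    main(sys.argv)
```

Making the file executable and linking it into the node's plugin directory is typically enough for `munin-node` to pick it up.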
Page 35
Aggregate Data
Manual configuration for now
Page 37
Others
• Monitoring
  • Xymon (Hobbit)
  • Nagios
  • Cacti
• Data collection / Graphing
  • Graphite
  • ZipKin (Twitter)
• Log collection
  • Scribe (Facebook)
Page 38
Management
Page 39
IPMI
Intelligent Platform Management Interface
Page 40
Image from Wikipedia
Page 41
Built into most BMCs
Page 42
Out-of-Band vs
Side-band
Page 43
Power Control
On, Off, Reset
Page 44
Serial over LAN
Console Access
Page 45
Boot Order
Force PXE boot?
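
Power control, Serial over LAN, and boot-device overrides can all be reached through the `ipmitool` command-line client. The sketch below builds the command lines without running them; the BMC host name and user are hypothetical, and the talk does not say which IPMI client was actually used.

```python
def ipmitool_cmd(host, user, *action):
    """Build an ipmitool command line for a remote BMC.

    -I lanplus selects the IPMI v2.0 LAN interface; -E makes ipmitool
    read the password from the IPMI_PASSWORD environment variable, so
    it does not show up in the process list.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *action]


# Hypothetical BMC address and account.
bmc, user = "bmc01.example.com", "admin"

power_status = ipmitool_cmd(bmc, user, "chassis", "power", "status")
force_pxe = ipmitool_cmd(bmc, user, "chassis", "bootdev", "pxe")
reset = ipmitool_cmd(bmc, user, "chassis", "power", "reset")
console = ipmitool_cmd(bmc, user, "sol", "activate")
```

Running `force_pxe` followed by `reset` is one way to send a machine back through network provisioning without touching it.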
Page 46
SSH
Secure Shell
Page 47
SSH Public Key Authentication
No need to type a password every time.
Page 48
OmniTTY
Console-based interactive SSH multiplexer
Page 49
Parallel-SSH (pssh)
Parallel versions of OpenSSH tools
Page 50
Fabric
Scriptable, Parallel SSH
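
The idea behind scriptable, parallel SSH can be sketched with nothing but the standard library. This is not Fabric's or pssh's API; `ssh_run` and `run_everywhere` are hypothetical helpers, and the sketch assumes public key authentication is already set up so that no password prompt appears.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_run(host, command):
    """Run one command on one host through the system ssh client."""
    proc = subprocess.run(
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        ["ssh", "-o", "BatchMode=yes", host, command],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout


def run_everywhere(hosts, command, runner=ssh_run, max_workers=16):
    """Run the same command on every host in parallel.

    Returns {host: (return code, stdout)}. `hosts` must be a list,
    and `runner` can be swapped out for testing.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda host: runner(host, command), hosts)
        return dict(zip(hosts, results))
```

For example, `run_everywhere(["node01", "node02"], "uptime")` would collect the uptime of two (hypothetical) nodes; tools like Fabric add host roles, error handling, and file transfer on top of this pattern.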
Page 51
Provisioning
Page 52
DHCP
Network Provisioning
Page 53
PXE Boot
Boot over Network
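
DHCP and PXE meet in the DHCP server configuration: the server hands each node its address plus the TFTP server and boot file to load. The sketch below renders a host entry in ISC dhcpd syntax from an inventory record; the talk does not name a DHCP server, so ISC dhcpd is an assumption, and the host name, MAC, and addresses are hypothetical.

```python
def dhcpd_host_entry(name, mac, ip, tftp_server, bootfile="pxelinux.0"):
    """Render one ISC dhcpd `host` block that pins an address to a MAC
    and points the node at a PXE boot file on a TFTP server."""
    return "\n".join([
        f"host {name} {{",
        f"  hardware ethernet {mac};",
        f"  fixed-address {ip};",
        f"  next-server {tftp_server};",
        f'  filename "{bootfile}";',
        "}",
    ])


# Hypothetical inventory record.
print(dhcpd_host_entry("node01", "00:11:22:33:44:55", "10.0.0.11", "10.0.0.1"))
```

Generating these entries from a single inventory keeps the DHCP configuration in step with what is actually racked.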
Page 54
Auto-configuration via DHCP
Network Switches
Page 55
Kickstart/Preseed
Automatic Install
Page 56
Chef
Puppet
Disclaimer: We don’t use them.
Page 57
Custom Packages
Put programs/services/settings into native packages.
Page 58
Apt-cacher-ng
Web cache for package files
Page 59
Put It All Together
Page 61
With the proper hardware/software
Page 62
Datacenters Become Manageable
Page 63
2~3 People
2k+ Nodes in
4 Datacenters
Page 64
Hands free after
racking and cabling
Page 66
10k nodes?
Page 67
100k nodes?
Page 68
Evolve!
Page 69
We are Hiring!
Page 70
Thank You