CIT 470: Advanced Network and System Administration
Change and Configuration Management
Jan 27, 2016
Topics
1. Change Management
2. Change Processes
3. Revision Control
4. Configuration Management
5. cfengine
Images from Pro Git
Change Management
Effective planning and implementation of changes to systems.
Changes should be
1. Well documented.
2. Backed by a backout plan.
3. Reproducible.
Why do we need Change Management?
March 26-29, 2006: BART trains halted to avoid running into each other when computer systems crashed.
• Crashes on Monday/Tuesday resulted from software maintenance upgrades.
• Crash on Wednesday resulted from installing a backup system to avoid future crashes.
• Thousands of passengers stranded for several hours each time.
Change Management
1. Plan change.
2. Test change on single system.
3. Test change on multiple systems.
4. File a change request.
5. Change committee approves request.
6. Schedule change.
7. Communicate with users/admins.
8. Change systems at scheduled time.
9. Post-event analysis.
Testing Changes
• Automated checks.
  – Sanity checks like Samba testparm.
  – Reboot system.
• Test on one system first.
• Then test on set of systems.
  – Dedicated test systems.
  – System admin workstations.
  – Virtual machines.
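The automated check step can be sketched as a small wrapper that runs a service's own validator and reports whether it exited successfully. This is a minimal sketch, not a definitive implementation: `config_is_valid` is a hypothetical helper, and the Samba invocation mentioned in the docstring assumes `testparm` is installed and signals a bad configuration through a nonzero exit status.

```python
import subprocess
import sys


def config_is_valid(check_cmd, timeout=30):
    """Return True if the validator command exits with status 0.

    check_cmd is an argument list, for example ["testparm", "-s"] for
    Samba (assumed to be installed; it is not run in this sketch).
    """
    try:
        result = subprocess.run(
            check_cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # A missing validator or a hung check counts as a failed check.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Stand-in validators so the sketch runs anywhere Python does.
    passing = [sys.executable, "-c", "raise SystemExit(0)"]
    failing = [sys.executable, "-c", "raise SystemExit(1)"]
    print(config_is_valid(passing))
    print(config_is_valid(failing))
```

Running the same wrapper first on one test system and then on a set of systems keeps the check identical at every stage.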
When do you need a Change Proposal?
Does the change impact critical services?
Critical machines/services
– Business critical: e-commerce server, etc.
– Essential services: routers, DNS, NFS, auth.
Non-critical machines/services
– Individual desktops
– Internal news web server
Change Proposal
1. Description of the change.
2. Systems impacted by change.
3. Why the change is being made.
4. Risks presented by the change.
5. Test procedure.
6. Backout plans.
7. How long the change will require.
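One way to keep proposals consistent is to record the seven items as a structured record and reject incomplete ones before they reach the change committee. The sketch below is an assumption about how that might look: the class name, field names, and the sample host names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass
class ChangeProposal:
    """One field per item in the change proposal list."""
    description: str
    systems_impacted: List[str]
    reason: str
    risks: List[str]
    test_procedure: str
    backout_plan: str
    duration: timedelta

    def missing_fields(self):
        """Names of fields left empty, so incomplete proposals can be rejected."""
        missing = []
        for name, value in vars(self).items():
            if isinstance(value, timedelta):
                # A zero or negative duration means no estimate was given.
                if value <= timedelta(0):
                    missing.append(name)
            elif not value:
                missing.append(name)
        return missing


if __name__ == "__main__":
    # Hypothetical proposal with the backout plan left blank.
    proposal = ChangeProposal(
        description="Upgrade DNS server software",
        systems_impacted=["ns1", "ns2"],
        reason="Security fixes",
        risks=["Name resolution outage"],
        test_procedure="Query a test zone on a staging server",
        backout_plan="",
        duration=timedelta(hours=1),
    )
    print(proposal.missing_fields())  # ['backout_plan']
```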
Communication
Communicate change to impacted people.
– What change is being made (nontechnical).
– Which services will be unavailable.
– When and how long will they be unavailable.
– What actions do they need to take (if any).
Communication issues
– If you send too many notes, they’ll be ignored.
– Send notices only to those impacted.
– Push critical notices; use pull for non-critical.
Scheduling
Type       Scope                              When      Notification
Routine    Single host or user                Anytime   Personal
Major      Many hosts or users                Off-peak  Push
Sensitive  None, but major impact on failure  Off-peak  Pull
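The scheduling table can be treated as a simple lookup from change type to timing and notification style. In this sketch the dictionary is a direct transcription of the table, while `classify_change` and its two parameters are a hypothetical rule of thumb for picking the type, not something the table itself prescribes.

```python
from typing import NamedTuple


class Schedule(NamedTuple):
    when: str
    notification: str


# Direct transcription of the scheduling table.
SCHEDULE_BY_TYPE = {
    "routine": Schedule(when="anytime", notification="personal"),
    "major": Schedule(when="off-peak", notification="push"),
    "sensitive": Schedule(when="off-peak", notification="pull"),
}


def classify_change(hosts_or_users_affected, major_impact_on_failure):
    """Hypothetical rule of thumb for choosing a change type."""
    if hosts_or_users_affected > 1:
        return "major"
    if major_impact_on_failure:
        # Little visible scope, but a failure would hurt badly.
        return "sensitive"
    return "routine"


if __name__ == "__main__":
    change_type = classify_change(hosts_or_users_affected=40,
                                  major_impact_on_failure=False)
    print(change_type, SCHEDULE_BY_TYPE[change_type])
```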
Change Freezes
Time when only minor updates can be done.
– End of quarter or year.
– “Crunch time” for projects.
Backing Out
Decide back-out conditions before downtime
– Avoid the “just 5 more minutes” problem.
– Be sure that someone is keeping track of time.
Questions:
– How much time is required for back out?
– When is the latest time you can successfully back out?
– Will backing out this change prevent other changes from being committed?
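The "latest time you can successfully back out" is simple arithmetic: the end of the maintenance window minus the time the back-out itself takes. The sketch below works that out ahead of time so the decision is not made under pressure. The function names, the optional safety margin, and the sample times are illustrative assumptions.

```python
from datetime import datetime, timedelta


def latest_backout_start(window_end, backout_duration,
                         safety_margin=timedelta(0)):
    """Latest moment back-out can begin and still finish inside the window."""
    return window_end - backout_duration - safety_margin


def must_back_out(now, window_end, backout_duration, change_complete):
    """True once the change is unfinished and the back-out deadline has arrived."""
    deadline = latest_backout_start(window_end, backout_duration)
    return (not change_complete) and now >= deadline


if __name__ == "__main__":
    # Illustrative window ending at 06:00 with a 45-minute back-out.
    window_end = datetime(2016, 1, 27, 6, 0)
    backout = timedelta(minutes=45)
    print(latest_backout_start(window_end, backout))  # 2016-01-27 05:15:00
```

Fixing the deadline before the downtime starts is what avoids the "just 5 more minutes" problem: whoever is tracking time only has to compare the clock against one precomputed value.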
Backing Out: How to do it?
Service-level changes
– Use revision control system to revert config.
– Restart service.
Machine-level changes
– Soft cutover: Old service is still running.
– Hard cutover: Power up old server or restore from backups.
Issues
– Data migration.
– Compatibility.
Automatic Checks
Check integrity of critical files before use.
– Some services provide checks: LDAP, SMB.
– Check startup files by rebooting machine.
– Write your own checks for other files.
• Most people only do this after they have a problem.
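For files whose service ships no checker, a home-grown check can be quite small. The sketch below validates a hosts-style file (IP address followed by at least one host name) before it is installed. The function name and the exact rules enforced are assumptions for illustration; a real check would follow whatever format the target file actually requires.

```python
import ipaddress


def check_hosts_file(text):
    """Return a list of (line_number, problem) for a hosts-style file.

    Each non-comment line must be: <IP address> <hostname> [aliases...]
    """
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        entry = line.split("#", 1)[0].strip()  # drop comments
        if not entry:
            continue
        fields = entry.split()
        try:
            ipaddress.ip_address(fields[0])
        except ValueError:
            problems.append((lineno, "invalid IP address: " + fields[0]))
            continue
        if len(fields) < 2:
            problems.append((lineno, "missing hostname"))
    return problems


if __name__ == "__main__":
    candidate = "127.0.0.1 localhost\n10.0.0.300 fileserver\n10.0.0.5\n"
    for lineno, problem in check_hosts_file(candidate):
        print("line %d: %s" % (lineno, problem))
```

An empty result means the candidate file passed, so the check can gate the step that copies the file into place.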
Revision Control
Revision control systems provide
– Conflict management: prevents multiple people from modifying a file at once and corrupting it.
– Change history: records who modified the file, when, and why the change was made.
Revision control paradigms
– Lock-Modify-Unlock: rcs
– Copy-Modify-Merge: cvs, subversion, etc.
– Distributed: darcs, git, mercurial
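The lock-modify-unlock paradigm can be illustrated with an advisory lock file: whoever creates the lock first may edit, and everyone else is refused until it is released. This is a sketch of the idea only, not of how rcs is implemented; `exclusive_lock` and the `.lock` suffix are hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Hold an advisory lock file next to `path` while editing it."""
    lock_path = path + ".lock"
    # O_CREAT | O_EXCL makes creation fail if the lock file already exists.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


if __name__ == "__main__":
    config = os.path.join(tempfile.mkdtemp(), "example.conf")
    with exclusive_lock(config):
        try:
            # A second editor arriving now is turned away.
            with exclusive_lock(config):
                pass
        except FileExistsError:
            print("file is locked by another editor")
```

Copy-modify-merge and distributed systems drop the lock and instead reconcile concurrent edits when changes are merged.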
[Figure: Local Version Control]
[Figure: Centralized Version Control]
[Figure: Distributed Version Control]
[Figure: Local Git Operations]
[Figure: Git File Lifecycle]
[Figure: Gitk history visualizer]
References
1. Mark Burgess, Principles of Network and System Administration, 2nd edition, Wiley, 2004.
2. Aeleen Frisch, Essential System Administration, 3rd edition, O’Reilly, 2002.
3. Thomas A. Limoncelli and Christine Hogan, The Practice of System and Network Administration, Addison-Wesley, 2002.
4. Evi Nemeth et al., UNIX System Administration Handbook, 3rd edition, Prentice Hall, 2001.
5. Todd R. Weiss, “IT upgrades slow BART trains in San Francisco,” http://www.computerworld.com/printthis/2006/0,4814,110107,00.html, ComputerWorld, March 31, 2006.