What to Do When… CPTE 433 Chapter 1 Adapted by John Beckett from The Practice of System & Network Administration by Limoncelli, Hogan, & Chalup
Dec 16, 2015

Noah Daniel
What to Do When…

CPTE 433 Chapter 1
Adapted by John Beckett from The Practice of System & Network Administration by Limoncelli, Hogan, & Chalup

Building a Site from Scratch

• Organizational structure
• Business priorities
• Namespaces
• Solid data center
• Solid network
• Scalable services
• Software depot

Core application services:
• Authentication & Authorization
• Desktop life-cycle
• Email
• File service & Backups
• Printing
• Remote access

Growing a Small Site

• Helpdesk
• Checklists for new hires, desktops, servers
• Build a NOC
• Organizational structure
  – Statistics
• Dashboard
• Prepare to scale up

Replacing Services

• Know the process
• Factor in dependencies
  – Network
  – Service
• Server names versus service aliases
• DHCP lease times
• DNS TTL (time to live)
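The lease-time and TTL bullets come down to scheduling arithmetic. A client that cached an answer just before you changed it may keep using it for one full old TTL (or lease), so the shorter value has to be published at least that long before the cutover. The sketch below is only an illustration: the function names and the sample dates are assumptions, not part of the slides, and it ignores resolvers or clients that do not honor TTLs.

```python
from datetime import datetime, timedelta

def latest_time_to_lower(cutover: datetime, old_ttl: timedelta) -> datetime:
    """Latest moment to publish the shorter TTL (or lease time).

    A client that fetched the record just before the change can hold it
    for one full old TTL, so publish at least that long before cutover.
    """
    return cutover - old_ttl

def worst_case_stale(new_ttl: timedelta) -> timedelta:
    """Longest a well-behaved client keeps the old address after cutover."""
    return new_ttl

# Hypothetical maintenance window and TTL values, for illustration only.
cutover = datetime(2024, 6, 1, 2, 0)
print(latest_time_to_lower(cutover, timedelta(days=1)))
print(worst_case_stale(timedelta(minutes=5)))
```

Pointing clients at a service alias rather than a server name means only one record has to change at cutover, which is what makes this kind of planning practical.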

Moving a Data Center

Before
• Schedule windows versus redundancy
• Design for current use & expansion
• Back up everything just before moving it
• “Fire drill” on backup
• Test cases
  – Before and after
• Label every cable and device

Moving In
• Establish minimal services at new site
• Test new environment
  – Networking, power, UPS, HVAC…
• Identify vanguard customers
• Run HVAC 48-72 hrs, replace filters
• Dress rehearsal

Moving to / Opening a New Building

• 4 weeks: Get access
• Radios or walkie-talkies
• PDAs
• Order WAN & ISP 2-3 months in advance
• Prewire offices during construction
• Get a moving company to help planning
• “Inventory” person to keep track of everything
  – Label fanatic
• Preprinted labels for each person’s workstation stuff
• Plastic bag for all PC cables
• More boxes than you think
  – Use plastic crates

Handling a High Rate of Office Moves

• Work to limit moves to one day a week
  – Develop a routine
• Establish procedure & form to capture all info
  – Have SA find nonstandard gear
• Connect & test network ahead of time
• Customer powers down & collects stuff
• Can users help?
• Moving company moves equipment
• Helpdesk is prepared
  – Communicating about common problems related to move
• Formalize the entire process

Assessing a Site (Due Diligence)

• Use book headings as an outline for investigation
• Reassure existing people
  – Looking for ways to improve process
  – Input welcome
• Private document repository for your team
• Document physical equipment
• Document services & security
• Analyze ticket-system ratios & trends
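For the ticket-system bullet, one possible reading of "ratios & trends" is the share of tickets per category and the volume per month. The sketch below assumes the ticket system can export rows of (month opened, category). That export format, the function names, and the sample rows are all made up for illustration.

```python
from collections import Counter

# Hypothetical ticket export: (month opened, category). Made-up sample rows.
tickets = [
    ("2024-01", "password"), ("2024-01", "printing"), ("2024-01", "password"),
    ("2024-02", "password"), ("2024-02", "email"),
    ("2024-03", "password"), ("2024-03", "password"), ("2024-03", "printing"),
]

def category_ratios(rows):
    """Share of all tickets that fall in each category."""
    counts = Counter(category for _, category in rows)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

def monthly_volume(rows):
    """Tickets opened per month, in chronological order."""
    counts = Counter(month for month, _ in rows)
    return sorted(counts.items())

print(category_ratios(tickets))
print(monthly_volume(tickets))
```

A category that dominates the ratios, or a month-over-month climb in volume, is a pointer to where the site's recurring problems are.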

Dealing with Mergers & Acquisitions - 1

• Get into the early part of the loop
• Connect expectations with legal situation
• CIO should be involved before announcement
• SA: Who at the other company can make big decisions?
• Clear, final decision process
• One designated go-to lead per company
• Dialog with SAs at other company
  – Establish informal relationships
  – Begin with a face-to-face meeting
  – Get to technical details, starting with namespace

Dealing with Mergers & Acquisitions - 2

• Adopt best practices of both, not just the bigger company

• Be sensitive to corporate culture differences

• Both teams need a high-level overview diagram and detailed map of their area

• Determine how the new network architecture should work

• Find out corporate identity issues
  – Account names
  – Email
  – Address format
  – Domain name
  – Separate identity or merged?
• Customer issues with merge

Dealing with Mergers & Acquisitions - 3

• Security issues:
  – Privacy policy
  – Security policy
  – Interconnection with business partners
• Check router tables
  – Do you use the same off-net address space?
• Consider a firewall between companies
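The address-space question can be checked mechanically once each company has listed its networks. The sketch below uses Python's standard ipaddress module; the CIDR blocks and the function name are hypothetical examples, not values from either company.

```python
import ipaddress
from itertools import product

def overlapping_ranges(ours, theirs):
    """Return every pair of networks (ours, theirs) whose address ranges overlap."""
    ours_nets = [ipaddress.ip_network(cidr) for cidr in ours]
    their_nets = [ipaddress.ip_network(cidr) for cidr in theirs]
    return [
        (str(a), str(b))
        for a, b in product(ours_nets, their_nets)
        if a.overlaps(b)
    ]

# Hypothetical address plans for the two companies.
company_a = ["10.1.0.0/16", "192.168.10.0/24"]
company_b = ["10.1.128.0/20", "172.16.0.0/12"]

print(overlapping_ranges(company_a, company_b))
# → [('10.1.0.0/16', '10.1.128.0/20')]
```

Any pair reported here would need renumbering or address translation before the two networks are joined.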

Frequent Machine Crashes

• Establish temporary workaround
  – Tell users it is temporary
• Find the real cause
• Fix the real cause, not symptoms
  – If hardware, buy better hardware
  – If environment, fix the environment
  – Replace the system if necessary
• Give SAs better training on diagnostic tools
• Get production back into production quickly
  – Get a backup if you don’t have one
  – Don’t do diagnostic games on production!

Surviving Major Outage

• Use ICS model
  – Define escalation before
• Notify customers on channel they use to contact you
• Form “tiger team”
  – Short meeting to establish goals
• Establish costs of fallback versus downtime
• Let businesspeople determine how much time to spend attempting a fix
• One hour gathering info
• Hourly updates on progress
• Stay with stakeholders to know when they see success
• Beware of loose cannons
• Feed your team

What Tools Should SAs have?

• Laptop with network sniffing etc.
• Terminal software & serial cable
• Spare PC or server for trying new configs
• Portable label printer
• PDA
• Screwdrivers
• Cable tester
• Splicing scissors
• Patch cables of various lengths incl. 100’
• Big USB disk drive
• Radios
• Cabinet w/tools & spares
• Library of ref books
• Memberships
• Headache medicines
• SA Code of Ethics
• Snacks

Getting Your Tools Back

• Label them
• Open a helpdesk ticket if one is borrowed
• Some tools won’t be returned
• Team toolbox
  – Rotate responsibility to check and get things back
• Give screwdriver kits out free
  – Normal size
  – Eyeglass repair
• Don’t give a software person a screwdriver

Why Document Systems & Procedures?

• Good docs say why and how to
• When you do things right, they work right
• You need a vacation
• You want to move on to other projects
• You will be viewed as an asset
• Save yourself scrambling when investors or auditors ask for it

Why Document Policies?

• Comply with federal health & business regs
• Avoid appearing arbitrary
• People can’t read your mind
• Communicating expectations for your own team
• Avoid being unethical
• Avoid needless punishment
• Let people change their ways before trouble happens

Identifying Fundamental Problems in the Environment

• Look at Basics of each chapter
• Ask the management chain that funds you
• Ask 2-3 customers who use your services
• Ask all customers
• What consumes your time the most?
• Ask helpdesk people what is most common
• Ask field people what is most common
• Use the “whiteboard” complexity test

Getting More Money for Projects

• Establish the need (for project) in minds of managers

• Find out what management wants, and show how your projects will serve that goal

• Become part of the budget process

• Do more with less – time-management is prime

• Manage your boss
• Learn how your management communicates with you, and work with that method

• Don’t manage by crisis – show “real cost” of policies & decisions

Getting Projects Done - 1

• Get out of fire-fighting mode

• Get a management sponsor

• Do SAs have resources?

• Hold staff accountable for milestones

• Communicate priorities to SAs, move resources to high-impact projects

• Make sure people have good time-management skills

• Designate dedicated project times
  – Shield from interruptions
• Reduce the number of projects
• Don’t spend time on projects that don’t matter

Getting Projects Done - 2

• Consider out-sourcing

• Hire junior staff for mundane tasks

• Hire short-term contract programmers to write code to spec

Keeping Customers Happy

• Make a good impression on new customers

• Communicate more with existing customers

• Go to lunch and listen
• Create System Status Web page
• Create local Enterprise Portal
• Terminate worst performers
• Move up the chain of command for your most costly customers

Keeping Management Happy

• Meet with them in person and listen
  – Don’t try to do it via email

• Find out the manager’s priorities, and adopt them as your own

• Understand how your manager communicates with you

• Make sure specialized people understand their roles

Keeping SAs Happy

• Learn how to manage people
• Make sure executives support the management of SAs
• Make sure they take care of themselves
• Good fit for their roles?
• Overloaded? Improve time management, or get more people and divide the work
• Fire dissenters
• Hire people with positive attitudes

Keeping Systems From Being Too Slow

• Define slow
• Use monitoring to find bottlenecks
• Look at performance-tuning info
• Recommend a solution
• Know what the real problem is before you try to fix it
• Do you understand latency versus bandwidth?
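The latency-versus-bandwidth question is easier to see with numbers. A rough model, offered here as an assumption rather than anything from the slides, is that moving data costs one round-trip delay per sequential request plus the time the bytes spend on the wire. The workload figures below are hypothetical.

```python
def transfer_seconds(size_bytes, bandwidth_bps, rtt_s, round_trips=1):
    """Rough transfer time: waiting on round trips plus time on the wire."""
    return round_trips * rtt_s + (size_bytes * 8) / bandwidth_bps

# Hypothetical chatty workload: 100 sequential requests moving 100,000 bytes in total.
baseline = transfer_seconds(100_000, 100e6, rtt_s=0.050, round_trips=100)
ten_times_bandwidth = transfer_seconds(100_000, 1e9, rtt_s=0.050, round_trips=100)
half_the_latency = transfer_seconds(100_000, 100e6, rtt_s=0.025, round_trips=100)

print(round(baseline, 4), round(ten_times_bandwidth, 4), round(half_the_latency, 4))
```

In this example, ten times the bandwidth barely changes the total, while halving the round-trip time roughly halves it. That is the reason to find the real bottleneck before recommending a fix.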

Coping with a Big Influx of Computers

• Know desktop versus server hardware

• Establish small number of standard hardware configs

• Automate host install, config, updates

• Check power, space, HVAC capacity

• Small computer closets should have cooling

• New employees – see next page
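Checking power and HVAC capacity is mostly unit conversion. The sketch below assumes that essentially all electrical power drawn by the equipment ends up as heat, and uses the standard conversions of 3.412 BTU/hr per watt and 12,000 BTU/hr per ton of cooling. The machine count and the per-machine wattage are hypothetical.

```python
BTU_PER_HR_PER_WATT = 3.412   # 1 watt dissipated is about 3.412 BTU/hr of heat
BTU_PER_HR_PER_TON = 12_000   # 1 ton of cooling removes 12,000 BTU/hr

def heat_btu_per_hr(total_watts):
    """Heat load produced by equipment drawing total_watts."""
    return total_watts * BTU_PER_HR_PER_WATT

def cooling_tons(total_watts):
    """Cooling capacity needed to remove that heat, in tons."""
    return heat_btu_per_hr(total_watts) / BTU_PER_HR_PER_TON

# Hypothetical influx: 40 new machines drawing about 300 W each.
new_load_watts = 40 * 300
print(heat_btu_per_hr(new_load_watts), cooling_tons(new_load_watts))
```

Measured draw is usually lower than nameplate ratings, so treat a figure like this as an upper bound to compare against the room's spare capacity.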

Coping with a Big Influx of New Users

• Make sure hiring process includes computer provisioning
  – Workstations
  – Networking
  – User accounts

• Stock standard desktops

• Automate installs

• Games on new computers?

• Do you have enough power?

• Start them in waves so you can batch processes

Coping with a Big Influx of New SAs

• Assign mentors
• Hold orientations
  – Key processes
  – Where to go for help
• Documentation (wiki is good)
• Get them reference books
• Bulk-order tools

Handling a High SA Team Attrition Rate

• When an SA leaves, lock them out of all systems

• Be sure HR performs exit interviews

• Listen to complaints in private

• Anonymous “upward feedback” path

• Find out what you’re doing wrong

• Do morale-increasing things

• Have them all read about Being Happy

• If a bad apple is making people miserable, get rid of him
  – Good reason to avoid nepotism

Handling User Attrition

• Establish a pipeline from management
  – So you can lock out the right people
• Make sure they return all company-owned equipment
  – Need a procedure for this
• Take measures against theft of stuff
• Take measures against theft of intellectual property
  – Restrict access
  – Termination agreements

Being New to a Group

• Before you comment, ask questions
• Meet all coworkers one-on-one
• Meet customers informally & formally
• Make a good first impression
• Give credence to coworkers re problems, don’t reject out of hand
  – “That can be a problem”

• Don’t blindly believe complainers – verify

Being the New Manager of a Group - 2

• Establish weekly group staff meetings
• Meet your manager & peers one-on-one
• Show team members you have faith in them
• Meet with customers informally & formally
• Ask everyone what the problems are
  – Listen, verify, make up your mind
• Ask before you call a shot
• Underperforming group: “Postpone major high-risk projects until team is fixed”
  – Boss probably hired you to do that project!
  – Maybe that project is how to fix the team

Looking for a New Job

• Why are you looking?
• What role do you want to play?
• What type of organization do you enjoy?
• Meet future coworkers
• Never accept the first offer – negotiate
• Negotiate in writing what’s important
• Interview your future boss
• Have a lawyer look at contract

Hiring Many New SAs Quickly

• Use as many recruiting methods as possible

• Make sure your recruiter knows what a good SA is

• Determine the number and skill sets you need
  – See SAGE level classifications

• Move quickly when you find a good candidate

• After hiring one, refine other job descriptions

Increasing Total System Reliability

• Determine your target and how far you are from it
• Set up monitoring
• Deploy end-to-end monitoring for key apps
• Reduce dependencies
  – Nothing in the data center should rely on anything outside the data center
  – Dependencies should be clearly documented
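Two pieces of arithmetic sit behind these bullets: turning an availability target into a downtime budget, and seeing how dependencies multiply. The sketch below assumes availability is expressed as a fraction of the year and that a service is up only when every component it depends on is up, with independent failures. The 99.9% figures are hypothetical examples.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def allowed_downtime_minutes(target_availability):
    """Downtime budget per year implied by an availability target."""
    return (1 - target_availability) * MINUTES_PER_YEAR

def chain_availability(component_availabilities):
    """Availability of a service that needs every listed component up.

    Assumes components fail independently, so availabilities multiply.
    """
    return math.prod(component_availabilities)

# Hypothetical target and a service depending on three 99.9% components.
print(allowed_downtime_minutes(0.999))
print(chain_availability([0.999, 0.999, 0.999]))
```

Three 99.9% components in a chain deliver roughly 99.7% overall, so every dependency removed moves the service closer to its target.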

Decreasing Costs

• Centralize some services

• Review maintenance contracts

• Review datacomm bills

• Reduce running costs through outsourcing

• Can you use standards or automation to reduce costs?

• Improve support docs & training

• Distribute costs more directly to those who incur them

• If people aren’t willing to pay the cost, maybe it isn’t important

• Take control of ordering & inventory

Adding Features

• Interview customers re needs
• Know the requirements
• Maintain existing service
• Altering service: have back-out plan
• Consider new system versus altering current
• Do you need a maintenance window?
• Decentralize so local needs can be served
• Test
• Document