CIT 470: Advanced Network and System Administration

Jan 25, 2016

sharis

CIT 470: Advanced Network and System Administration

Servers and Services

Topics

SERVERS

1. Servers vs Desktops

2. Server Hardware

3. Different Approaches to Servers

SERVICES

1. Service Requirements

2. Open Architecture

3. Service Design Principles

How are Servers different?

• 100s or 1000s of clients depend on server.

• Requires high reliability.

• Requires tighter security.

• Often expected to last longer.

• Investment amortized over many clients, longer lifetime.

Vendor Product Lines

Home
– Cheapest purchase price.
– Components change regularly based on cost.

Business
– Focuses on Total Cost of Ownership (TCO).
– Slower hardware changes, longer lifetime.

Server
– Lowest cost per performance metric (nfs, web)
– Easy to service rack-mountable chassis.
– Higher quality (MIL-SPEC) components.

Server Hardware

• More internal space.
• More CPU/Memory.
  – More / high-end CPUs.
  – More / faster memory.
• High performance I/O.
  – PCIe vs PCI
  – SCSI/FC-AL vs. IDE
• Rack mounted.
• Redundancy
  – RAID
  – Hot-swappable hardware.

Rack Mounting

• Efficient space utilization.
• Simple, rectangular shape measured in rack units (RUs).
• Repair and upgrade while mounted in rack.
• No side access required.

Requirements
– Cooling through back, not sides.
– Drives in front, cables in back.
– Remote management (serial console, hardware sensors).

Server Memory

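Servers need more memory than desktops.
– 32-bit x86 supports up to 64 GB of physical memory with PAE (36-bit addresses).
– x86-64 defines physical addresses of up to 52 bits (4 PB); actual CPUs and chipsets support less.

Servers need faster memory than desktops.
– Higher memory speeds.
– Multiple DIMMs accessed in parallel.
– Larger CPU caches.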
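The memory limits on this slide follow from the width of a physical address: n address bits reach 2^n bytes. A minimal sketch of that arithmetic is below. The function names are made up for illustration, and the widths other than the 36-bit PAE case are chosen only for comparison.

```python
def addressable_bytes(address_bits: int) -> int:
    """Bytes reachable with a byte-addressed physical address of this width."""
    return 2 ** address_bits


def format_binary_size(n_bytes: int) -> str:
    """Render a byte count in binary units (exact for powers of two, floors otherwise)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    value = n_bytes
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value} {unit}"
        value //= 1024
    return f"{value} {units[-1]}"  # not reached; keeps the return type explicit


# 32 = classic x86, 36 = x86 with PAE, 52 = x86-64 architectural maximum.
for bits in (32, 36, 52):
    print(bits, "bits ->", format_binary_size(addressable_bytes(bits)))
```

The 36-bit line reproduces the 64 GB PAE figure quoted above.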

Server CPUs

Enterprise Processors
– Intel Xeon (x86)
– AMD Opteron (x86)
– Itanium 2
– Sun UltraSPARC T1
  • 4, 6, or 8 cores.
  • Each with 4 threads.
– IBM POWER 5
  • dual-core
  • Each with 2 threads.

POWER 5 MCM with 4 dual-core HT CPUs + 4 36MB L3 cache chips.

Xeon vs Pentium

Xeon improvements
– Faster L2 cache (Pentium-II/III)
– Multiprocessing support (or >2 MP support)
– Hyperthreading (before Pentium-4 could)
– x86-64 support (before Pentium-4 could)
– Larger L2 cache (Pentium-4)
– Faster FSB (Pentium-4)

System Buses

Servers need high I/O throughput.
– Fast peripherals: SCSI-3, Gigabit ethernet
– Often use multiple and/or faster buses.

PCI
– Desktop: 32-bit 33 MHz, 133 MB/s
– Server: 64-bit 66 MHz, 533 MB/s

PCI-X (backward compatible)
– v1.0: 64-bit 133 MHz, 1.06 GB/s
– v2.0: 64-bit 533 MHz, 4.3 GB/s

PCI Express (PCIe)
– Serial architecture, v2.0 up to 16 GB/s
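The bus figures above can be reproduced with simple arithmetic. For the parallel buses (PCI, PCI-X), peak bandwidth is bus width in bytes times the transfer rate. For PCIe 2.0, the sketch assumes 5 GT/s per lane with 8b/10b encoding, and reads the 16 GB/s figure as an x16 link counting both directions. The function names are illustrative only.

```python
def parallel_bus_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Peak MB/s of a parallel bus: (width in bytes) x (million transfers per second)."""
    return width_bits / 8 * clock_mhz


def pcie2_mb_per_s(lanes: int, both_directions: bool = False) -> float:
    """Peak MB/s of a PCIe 2.0 link, assuming 5 GT/s per lane and 8b/10b encoding."""
    per_lane = 5000 * 8 / 10 / 8  # 5000 MT/s, 8 data bits per 10 line bits, 8 bits per byte
    total = per_lane * lanes
    return total * 2 if both_directions else total


# Nominal "33 MHz" and "66 MHz" PCI clocks are 33.33 and 66.67 MHz.
print(round(parallel_bus_mb_per_s(32, 100 / 3)))   # desktop PCI
print(round(parallel_bus_mb_per_s(64, 200 / 3)))   # server PCI
print(round(parallel_bus_mb_per_s(64, 400 / 3)))   # PCI-X 1.0
print(round(pcie2_mb_per_s(16, both_directions=True)))  # PCIe 2.0 x16
```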

Hardware Redundancy

Disks are most likely component to fail.
– Use RAID for disk redundancy.
– Cover in detail in Disks lecture.

Power supplies second most likely to fail.
– Use redundant power supplies.
– Many servers need 2 power supplies normally.
– Need 3 power supplies for redundancy.
– Use separate power cord and UPS for each power supply.

Full and n+1 Redundancy

n+1 Redundancy: One component can fail, but the system is still functional.
– Ex: RAID 5, dual NICs with failover

Full Redundancy: Two complete sets of hardware configured with failover mechanism.
– Manual: SA switches to the second system on noticing a failure.
– Automatic: The second system monitors the first and switches over automatically on failure.
– Load-sharing: Both systems serve users, sharing load, but each has capacity to handle entire load on its own. When one fails, other automatically handles entire load.
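Both n+1 redundancy and load-sharing come down to the same check: after any single component fails, the survivors must still carry the whole load. A minimal sketch of that check follows. The function name and the wattage numbers are hypothetical, picked to mirror the "2 power supplies normally, 3 for redundancy" case.

```python
def survives_single_failure(capacities: list[float], load: float) -> bool:
    """True if the remaining components can carry `load` after any one of them fails."""
    if len(capacities) < 2:
        return False  # a lone component has no redundancy at all
    total = sum(capacities)
    return all(total - failed >= load for failed in capacities)


# Hypothetical server drawing 900 W, fed by 500 W power supplies.
print(survives_single_failure([500, 500], 900))       # two supplies: normal operation only
print(survives_single_failure([500, 500, 500], 900))  # three supplies: n+1

# Load-sharing pair: each host must be able to take the entire load alone.
print(survives_single_failure([1000, 1000], 1000))
```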

Hot-swap Components

Hot-swap components
– Components can be replaced while running.
– Need n+1 redundancy for this to be useful.
– Don’t need to schedule a downtime.

Issues
– Which parts are hot-swappable?
– May require a few seconds to reconfigure.
– Be sure components are hot-swap, not hot-plug.

Hot Plug and Hot Spare

Hot Plug
– Electrically safe to replace component.
– Part may not be recognized until next reboot.
– Requires downtime, unlike hot swap.

Hot Spare
– Spare component already plugged into system.
– System automatically uses hot spare when disk/CPU board etc. fails.
– Provides n+2 redundancy.

Separate Administrative Network

Reliability
– Allows access to machines even when network is down.

Performance
– Backups require so much bandwidth that they’re often done over their own network.

Security
– Network security monitoring data and logs sent across network should be secured.

Maintenance Contracts

• All machines eventually break.
• Vendors offer a variety of maintenance contracts.
• Non-critical: Next-day or 2-day contract.
• Clusters: If you have many similar hosts (CPU or web farm), then on-site spares may be cheaper than a maintenance contract.
• Controlled Model: Use a small number of machine types for all servers, so you can afford a spares kit.
• Critical Host: Same-day response or on-site spares.
• Highly Critical: On-site technician + duplicate machine.

Data Protection

• Avoid desktop backups by storing data on servers. Easy on UNIX, harder on Windows.
• Use RAID for server hardware failures.
  – Mirror root disk, higher RAID levels for data.
  – Some servers use 16GB Flash drives for root disk.
  – Doesn’t protect against software mistakes.
• Server backups
  – Use specialized admin network to keep load off main network.
  – Use specialized tape jukeboxes to fully automate backups of large data servers (DBs, fileservers).

Keep Servers in Data Center

Data center necessary for server reliability.
– Power (enough power, UPS)
– Climate control (temperature, humidity)
– Fire protection
– High-speed network
– Physical security

Server OS

Need greater reliability, security than desktop.
– Remove unnecessary OS components.
– Configure for best security & performance.

Install and config specialized server software.
– Server software: web, db, nfs, dns, ldap, etc.
– May need monitoring software too.
– Configuration: disk space, networking

Server OS install should be automated too.

Remote Administration

Servers must be accessible remotely.
– Allows SA to fix problems quickly at 3am.
– Allows SA to work outside machine room.

Remote Administration
– Serial console and concentrator (UNIX)
– Networked KVM (Windows)
– Remote power control.
– Important to secure remote admin facilities.

Server Appliances

Dedicated hardware + software
– Fileserver (NetApp, Auspex)
– Print servers
– Routers

Advantages
– Performance
– Reliability
– Easy to setup
– Extra capabilities

Disadvantages
– Cost

Many Inexpensive Workstations

Why buy server hardware?
– Buy two cheap rack-mount PCs + failover software.
– Works if two PCs are cheaper than one server.
– Google’s approach with ~450,000 servers.

Blade Servers

• High-density servers on a board.
  – CPU
  – Memory
  – Disk

• Each blade lives in a blade chassis.

Blade Chassis

• Blade chassis provides power, network, and remote management.

• Typically hot-swappable, hot-spare.

• Conventional rack-mount servers fit at most 1 server per RU.

• Blades are higher density, but also require more power and cooling.

Servers vs Services

A server is a piece of hardware.

A service is the function that is provided by one or more servers.

Services

• Distinguish structured computing environment from some standalone PCs.

• Large orgs linked through shared services to ease communication and optimize resources.

• Typical environments have many services
  – Fundamental: net, DNS, email, auth, printing.
  – Typical: DHCP, backup, directory, file, license.
• Services often depend on other services
  – Almost everything depends on DNS.
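Service dependencies form a graph, and that graph answers two practical questions: in what order should services come up, and what breaks when one goes down? The sketch below uses the standard-library graphlib module (Python 3.9+). The specific dependency edges are assumptions made for illustration, not taken from the slides.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each service maps to the services it needs.
DEPENDS_ON = {
    "net": set(),
    "dns": {"net"},
    "auth": {"net", "dns"},
    "email": {"net", "dns", "auth"},
    "printing": {"net", "dns", "auth"},
}


def startup_order(depends_on: dict[str, set[str]]) -> list[str]:
    """One valid start order: every service appears after all of its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def affected_by(failed: str, depends_on: dict[str, set[str]]) -> set[str]:
    """Services that depend on `failed`, directly or through other services."""
    affected: set[str] = set()
    changed = True
    while changed:
        changed = False
        for service, deps in depends_on.items():
            if service not in affected and (failed in deps or deps & affected):
                affected.add(service)
                changed = True
    return affected


print(startup_order(DEPENDS_ON))
print(sorted(affected_by("dns", DEPENDS_ON)))
```

With these edges, a DNS outage reaches every service except the network itself, which is the point of "almost everything depends on DNS".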

Providing a Service

• A service is more than hardware+software.

• A service must be
  – Reliable.
  – Scalable.
  – Monitored.
  – Maintained.
  – Supported.

Servers and Services

For a service to be reliable, servers
– Should be as simple as possible.
– Should have the minimum software needed to run the service.
– Should depend on as few other services as possible.
– Should depend only on services that are at least as reliable as the service running on the server.
– Should have access restricted to SAs.
– Should be as few as needed for performance and reliability.
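The advice to depend on few services, and only on reliable ones, has a simple numeric basis. If a service needs all of its dependencies to be up, and failures are assumed independent, the availabilities multiply. The sketch below illustrates this; the availability values are invented.

```python
import math


def combined_availability(own: float, dependencies: list[float]) -> float:
    """Availability of a service that needs itself and every dependency to be up.

    Assumes failures are independent, which is a simplification.
    """
    return own * math.prod(dependencies)


# Hypothetical: a 99.9% service depending on 99.9% DNS and 99% authentication.
print(combined_availability(0.999, [0.999, 0.99]))
# Same service with only the DNS dependency.
print(combined_availability(0.999, [0.999]))
```

The combined figure is always lower than that of the weakest dependency, so one unreliable dependency caps the reliability of everything built on it.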

Customer Requirements

Customers are the reason for the service.

– How do they intend to use it?

– What features do they need?

– What features would they like to have?

– How critical is the service?

– What levels of availability and support are needed?

Service Level Agreement (SLA)

– Enumerates services.

– Defines level of support.

– Commits to response times for problem types.
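When discussing availability levels for an SLA, it helps to translate a percentage into the downtime it permits. A minimal sketch of that conversion is below. The function name and the sample percentages are illustrative, and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(availability_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime permitted in a period at the given availability level."""
    return period_hours * (1 - availability_percent / 100)


for level in (99.0, 99.9, 99.99):
    print(f"{level}% -> {allowed_downtime_hours(level):.2f} hours/year")
```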

Operational Requirements

Essential to designing a reliable service
– What services does it depend upon?
– What other services will depend upon it?
– How does it interoperate with other services?
– How can it be integrated with auth/dir services?
– How does the service scale?
– How can the service be upgraded?
  • Downtime requirements.
  • What systems are affected?

Open Architecture

Service should be built around open standards

– Check IETF RFCs to see if it’s an open protocol.

– Example service: SMTP

– Example products: exim, postfix, qmail, sendmail.

– Open standards don’t require open source.

Allows vendors to make interoperable products.

– Avoids vendor lock-in.

– Allows vendor competition (cheaper prices for you.)

– Decouples client selection from server selection.

– Avoids need for protocol gateways.

Requests for Comments (RFCs)

• Documentation for Internet protocols, technologies, and methodologies.
  – Standards track RFCs describe Internet standards (TCP, IP, SMTP) and must be approved by IETF.
  – Experimental RFCs may become standards.
  – Best Current Practice (BCP) RFCs describe how to run services or use protocols.
  – Informational RFCs are a catch-all including proprietary protocols, April Fool’s jokes, etc.

• Available from http://www.rfc-editor.org/

Principles for Designing a Reliable Service

Simplicity
– The more features, the more bugs.
– Simplicity increases reliability, ease of maintenance.

Vendor Relations
– Can be helpful about configuring service.
– Let vendors compete for your business.
– Stick to vendors who develop for your platform.

Machine Independence

Will eventually move service to new host.
– Want to avoid having a downtime.
– Want to avoid reconfiguring every desktop.

Use generic DNS alias for machine
– Mail server has name romero
– DNS alias is smtp

Use virtual IP addresses for non-name services
– Machine has usual IP address: 192.168.1.54
– Virtual: ifconfig eth0:0 192.168.1.5
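The alias idea can be modeled in a few lines: clients only ever look up the service name, so moving the service means changing one alias record rather than every desktop. This is a toy in-memory model, not real DNS. The host romero, the alias smtp, and the address 192.168.1.54 come from the slide. The replacement host name and address are hypothetical.

```python
# Records from the slide: host "romero" at 192.168.1.54, service alias "smtp".
A_RECORDS = {"romero": "192.168.1.54"}
CNAME_RECORDS = {"smtp": "romero"}


def resolve(name: str, a_records: dict[str, str],
            cname_records: dict[str, str], max_hops: int = 8) -> str:
    """Follow alias records until an address record is found."""
    for _ in range(max_hops):
        if name in a_records:
            return a_records[name]
        if name not in cname_records:
            raise KeyError(f"no record for {name!r}")
        name = cname_records[name]
    raise RuntimeError("alias chain too long")


def move_service(alias: str, new_host: str, new_ip: str,
                 a_records: dict[str, str], cname_records: dict[str, str]):
    """Return updated copies of the records with `alias` pointing at the new host."""
    new_a = dict(a_records)
    new_a[new_host] = new_ip
    new_cname = dict(cname_records)
    new_cname[alias] = new_host
    return new_a, new_cname


print(resolve("smtp", A_RECORDS, CNAME_RECORDS))
# Hypothetical replacement host; clients still ask for "smtp".
a2, c2 = move_service("smtp", "newhost", "192.168.1.60", A_RECORDS, CNAME_RECORDS)
print(resolve("smtp", a2, c2))
```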

Dedicated Machines

Put each service on its own machine(s).
– If a server crashes, only impacts one service.
– Easier to debug if only one service running.
– Performance tuning easier with one service.
– If you can’t afford a new machine, use a VM.

Environment

Safe environment
– Improves reliability: AC, UPS, physical security.
– Data center usually provides faster network too.
– Only rely on services provided by data center.

Restricted access
– Customers should not need to login to servers.
– More logins decrease stability, performance.
– Even Windows can be stable w/o user logins.

Principles for Designing a Reliable Service

Service components should be tightly coupled.
– Other than redundant components.
– Share same power source, network.
– Reduces service dependencies (single points of failure.)

Centralize management of service
– Managed by one set of SAs.
– Support for service by single helpdesk.
– Document service.

Performance

Latency vs throughput
– Latency is the delay before data is received.
– Throughput is how much data is sent per second.
– A performance problem typically involves only one of the two.
– Improving the other will not solve your problem.

Remote sites
– May have high latency to main site.
– Do you need secondary servers at remote sites?
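A simple model shows why improving the wrong one does not help: the time for one transfer is roughly latency plus size divided by throughput. The sketch below applies that model to a small request and a bulk transfer. It ignores protocol effects such as handshakes and windowing, and the link numbers are invented.

```python
def transfer_seconds(size_bytes: float, latency_s: float,
                     throughput_bytes_per_s: float) -> float:
    """Simplified time for one transfer: fixed delay plus time on the wire."""
    return latency_s + size_bytes / throughput_bytes_per_s


KB, MB = 1_000, 1_000_000

# Hypothetical remote-site link: 100 ms latency, 10 MB/s throughput.
small = transfer_seconds(1 * KB, 0.100, 10 * MB)
small_fast_link = transfer_seconds(1 * KB, 0.100, 20 * MB)     # double throughput
small_low_latency = transfer_seconds(1 * KB, 0.050, 10 * MB)   # halve latency
bulk = transfer_seconds(500 * MB, 0.100, 10 * MB)
bulk_low_latency = transfer_seconds(500 * MB, 0.050, 10 * MB)

print(f"1 KB request:  {small:.4f}s, 2x throughput {small_fast_link:.4f}s, "
      f"half latency {small_low_latency:.4f}s")
print(f"500 MB backup: {bulk:.2f}s, half latency {bulk_low_latency:.2f}s")
```

Small requests are dominated by latency and bulk transfers by throughput, which is why a secondary server at a remote site helps interactive services more than extra bandwidth does.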

Capacity Planning

Estimate capacity from testing.
– Test server at 100 qps, 200 qps, until slow.
– Identify resources used by each query
  • RAM
  • Disk
  • Network
  • CPU

Can service be split onto multiple servers?
– Can it be done w/o users noticing?
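The test-until-slow procedure can be turned into numbers once the load test results are recorded. The sketch below finds the highest tested rate that stayed under a latency limit, then estimates how many servers a peak load would need. The measurements, the latency limit, and the 70% headroom factor are all invented for illustration.

```python
import math


def max_sustainable_qps(measurements: list[tuple[int, float]],
                        latency_limit_ms: float):
    """Highest tested rate before latency first exceeds the limit (None if none pass)."""
    best = None
    for qps, latency_ms in sorted(measurements):
        if latency_ms > latency_limit_ms:
            break
        best = qps
    return best


def servers_needed(peak_qps: float, per_server_qps: float,
                   headroom: float = 0.7) -> int:
    """Servers required if each one is run at `headroom` of its tested maximum."""
    return math.ceil(peak_qps / (per_server_qps * headroom))


# Hypothetical load test results: (queries per second, median latency in ms).
RESULTS = [(100, 12.0), (200, 13.5), (300, 18.0), (400, 95.0), (500, 400.0)]

limit = max_sustainable_qps(RESULTS, latency_limit_ms=50.0)
print("per-server capacity:", limit, "qps")
print("servers for 1000 qps peak:", servers_needed(1000, limit))
```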

Principles for Designing a Reliable Service

Monitoring

– Availability, problems, performance.

– Auto-alert front line support.

– Customers shouldn’t discover problems before SA.

– Capacity planning: CPU, mem, disk, network, licenses.
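The auto-alert idea reduces to comparing collected metrics against thresholds and notifying support when one is crossed. The sketch below shows only that comparison. The metric names and limits are hypothetical, and a real deployment would use a monitoring system rather than a hand-written script.

```python
# Hypothetical alert thresholds, in percent of capacity.
THRESHOLDS = {"cpu_percent": 90.0, "mem_percent": 90.0, "disk_percent": 85.0}


def find_alerts(metrics: dict[str, float],
                thresholds: dict[str, float]) -> list[str]:
    """Return one message per metric that has reached or passed its threshold."""
    return [
        f"{name} at {value:.0f}% (limit {thresholds[name]:.0f}%)"
        for name, value in sorted(metrics.items())
        if name in thresholds and value >= thresholds[name]
    ]


# Hypothetical sample from one server.
sample = {"cpu_percent": 95, "mem_percent": 60, "disk_percent": 40}
for message in find_alerts(sample, THRESHOLDS):
    print("ALERT:", message)
```

Setting limits below 100% means the SA hears about a filling disk before customers see failures.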

Service Rollout

– First impressions are difficult to change.

– Be ready for support: docs, trained helpdesk.

– Use one, some, many technique.
