
Usenet Training

May 4-7, 2004

Orlando, FL

At Disney’s Coronado Springs Resort

Day Two

Building an Enterprise Usenet Environment

Questions to ask

• How many concurrent NNTP reading sessions?

– The usual answer is "I don't know"

– Estimate from the size of the user base

– 5-10% or more of users may access news

• How much retention?

– What content?

– Full feed, internal discussion, partial feed

• Binaries/Text

– 97-99% of feed volume comes from binaries

– Full feed is ~1.1 TB/day and doubles roughly every 9 months (see the growth sketch after this list)

• Redundancy desired

– Full data replication

– Disaster recovery

– Failover

• Feeding architecture to provide/receive

• Authentication and classes of service

• Performance

– Honda, Audi, Porsche
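
The growth figure above drives most of the sizing that follows. A minimal sketch of the projection, assuming steady doubling every 9 months from ~1.1 TB/day (both figures come from the slide; the code itself is only illustrative):

    # Project daily full-feed volume, assuming it keeps doubling every 9 months
    # from ~1.1 TB/day (both figures come from the slide above).
    def projected_feed_tb_per_day(months_out, today_tb=1.1, doubling_months=9):
        return today_tb * 2 ** (months_out / doubling_months)

    for months in (0, 9, 18, 36):
        print(f"{months:2d} months out: ~{projected_feed_tb_per_day(months):.1f} TB/day")
    # 0 -> 1.1, 9 -> 2.2, 18 -> 4.4, 36 -> 17.6 TB/day

Three years out, the same assumption puts a full feed near 18 TB/day, which is why retention policy dominates the storage budget.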

Sample Architecture Criteria

• 8,000 Concurrent Connections

• 14 days binary, 180 days text

• Two data centers

• Full redundancy: with one data center down, the remaining site can still support 4,000 connections

• Sell 10 full feeds, peer with 50 providers

• Authentication by IP address

Calculations

• Note about these calculations

– They change over time

– Not hard and fast

– Change with experience and new architectures

Storage

• 14 days binary and 180 days text

• ~17 TB (worked out in the sketch below)

• x2 with full redundancy

• Adaptive spooling
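
One plausible way to reproduce the 17 TB figure from numbers already on these slides (a back-of-envelope sketch, not the presenter's exact arithmetic): keep 14 days of the binary portion of a ~1.1 TB/day feed plus 180 days of the text portion, taking text as ~1% of volume per the earlier 97-99% binaries figure.

    # Back-of-envelope spool sizing from earlier figures:
    # full feed ~1.1 TB/day, 97-99% of it binaries (so text is ~1-3%).
    FEED_TB_PER_DAY = 1.1
    TEXT_FRACTION = 0.01                 # assume the low end of the 1-3% text share

    binary_tb = FEED_TB_PER_DAY * (1 - TEXT_FRACTION) * 14   # 14 days of binaries
    text_tb = FEED_TB_PER_DAY * TEXT_FRACTION * 180          # 180 days of text
    total_tb = binary_tb + text_tb

    print(f"binaries {binary_tb:.1f} TB + text {text_tb:.1f} TB = {total_tb:.1f} TB")
    print(f"with full redundancy (x2): {2 * total_tb:.1f} TB")
    # binaries 15.2 TB + text 2.0 TB = ~17 TB; ~34 TB with full redundancy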

Machine Class Choices

• Problems we're solving

– Totally random I/O

– Significant network volume

– We're not NASA

– No Complex Number Theory modelling here

– WOPR is misplaced

– Disk mechanics

– Slowest evolving technology

– Physical limitations

• Cost

– Administration

– Deployment

Machine Classes

• 2000 connections per host - 4 hosts

• 1000 connections per host - 8 hosts

• 400 connections per host - 20 hosts (host counts worked out at the end of this slide)

• Machine Class dictates OS or vice versa

• Summary

– We're looking for an efficient I/O mover

– Many paths, independent busses

– Carl Lewis, not Andre the Giant.
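
For reference, the host counts above follow directly from the 8,000 concurrent connections in the sample criteria; a trivial sketch:

    import math

    # The host counts on this slide are just the 8,000 concurrent connections
    # from the sample criteria divided by per-host capacity.
    TARGET_CONNECTIONS = 8000
    for per_host in (2000, 1000, 400):
        print(f"{per_host} connections/host -> {math.ceil(TARGET_CONNECTIONS / per_host)} hosts")
    # 2000 -> 4 hosts, 1000 -> 8 hosts, 400 -> 20 hosts

Fewer, larger hosts mean less administration and deployment work; more, smaller hosts mean a smaller unit of failure.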

Architecture Diagramming

• Start with Tornado Front Ends

– At least 100 GB storage

• At least 1 "fast" spindle per 400 connections

• At least 2 "slow" spindles per 300 connections

• Sun 280Rs, 2 CPUs, 4 GB RAM, Sun 3510 attached array (shared)

• Module to support 2000 connections (spindle counts worked out below)
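
Putting the spindle rules of thumb together for the 2000-connection module (a sketch using only the ratios on this slide; the function name is illustrative):

    import math

    # Minimum spindle counts for a Front End module, using this slide's ratios:
    # at least 1 "fast" spindle per 400 connections and at least 2 "slow"
    # spindles per 300 connections.
    def fe_spindles(connections):
        fast = math.ceil(connections / 400)
        slow = 2 * math.ceil(connections / 300)
        return fast, slow

    fast, slow = fe_spindles(2000)       # the 2000-connection module on this slide
    print(f"2000 connections -> at least {fast} fast and {slow} slow spindles")
    # -> at least 5 fast and 14 slow spindles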

Tornado Back Ends

• 4000 connections accessing articles

• 1 BE with one array can be enough

• BEs can be smaller than an FE

– Direct spool sharing only

– Not quite a Commodore 64… but close

Tornado Back Ends

• Storage Array

– Very rough estimate: 1 drive per 75 connections

– Optimize for random I/O

– Benchmark precautions

• Vendor benchmarks are usually sequential I/O; push for random-access numbers

• Plan on performance of 10-20% of the spec benchmark, or even less

– 48 x 72 GB 10,000 RPM drives (3,456 GB)

• FCAL Switching

– Writes – approx 120 Mbps ~ 13 MB/sec

– Reads

• 4000 connections

– Average? 1 Mbps – many “constant downloaders”

– 0.11 MB/sec x 4000 ≈ 450 MB/sec

– 450 MB/sec x ~6 (truly random adjustment)

– ~2.6 GB/sec switching spec (reproduced in the sketch below)

» Start with one 2GB/sec switch with plans to add another.
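
The switching estimate above, reproduced step by step (the ~1 Mbps average per connection and the ~6x random-access adjustment are the slide's rules of thumb, not measured values):

    # Step-by-step version of the read estimate on this slide.
    connections = 4000
    mb_per_sec_per_conn = 0.11           # ~1 Mbps average, as the slide rounds it

    aggregate_mb_s = connections * mb_per_sec_per_conn       # the slide's ~450 MB/sec
    random_adjusted_gb_s = aggregate_mb_s * 6 / 1000          # ~6x truly-random adjustment

    print(f"aggregate reads ~{aggregate_mb_s:.0f} MB/sec")
    print(f"switching spec  ~{random_adjusted_gb_s:.1f} GB/sec")
    # ~440 MB/sec and ~2.6 GB/sec -> start with one 2 GB/sec switch, plan for a second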

• We only have 3 TB and we need 17?!

Enter Storm Cellar

• 17 TB total, 3 TB deployed - need 14 TB more

• 3 Storm Cellars with ~6 TB usable each (capacity check below)

• Split cascade feed from Tornado Back Ends

• 4U NAS appliance, 24 x 300 GB ATA drives (7.03 TB)

• Linux 2.6.2+
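
A quick check that three Storm Cellars cover the gap, assuming roughly 6 TB usable per appliance once RAID and filesystem overhead come out of the ~7 TB of raw disk (the overhead reasoning is an assumption, not from the slides):

    # Do three Storm Cellars cover the remaining retention?
    needed_tb = 17 - 3                   # 17 TB total retention, 3 TB already on the BE arrays
    cellars = 3
    usable_per_cellar_tb = 6             # ~7 TB of ATA disk less RAID/filesystem overhead (assumed)

    total_usable_tb = cellars * usable_per_cellar_tb
    print(f"need {needed_tb} TB, have {total_usable_tb} TB usable:",
          "enough" if total_usable_tb >= needed_tb else "short")
    # need 14 TB, have 18 TB usable: enough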

Feeding

• 10 full feeds @ ~120 Mbps each ≈ 3-5 feeds per box (see the sketch below)

• 2 Peering Cyclones per site

• 1 Master Cyclone

• 1 Hurricane
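
Why roughly 3-5 feeds per box: each full feed is ~120 Mbps and a single gigabit interface tops out around 1,000 Mbps; the utilization ceilings below are assumptions, not figures from the slides.

    # Each full feed is ~120 Mbps (this slide); a gigabit NIC is ~1,000 Mbps raw.
    # The 40-60% utilization ceiling is an assumption.
    FEED_MBPS = 120
    NIC_MBPS = 1000

    for utilization in (0.4, 0.6):
        feeds = int(NIC_MBPS * utilization // FEED_MBPS)
        print(f"at {utilization:.0%} NIC utilization: {feeds} full feeds per box")
    # at 40%: 3 feeds per box; at 60%: 5 feeds per box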

Data Center Architecture

Full Architecture

What could we do with Linux?

• Features we can use

– FE adaptive caching

– Cyclone Split Feeding

– Newest Linux kernels

• FCAL sharing probably not feasible

– IP spool sharing with adaptive caching

– Hardware choices

• 1 Day’s worth of disk on each FE for cache

• BEs have 4+ TB each

Linux Architecture

• Local spool cache on Tornado FEs serves recent articles that are actually read (8 copies – 16 for 2 Data Centers)

• Spools are aggregated across all FEs

• Back Ends contain only partial data – spreading I/O over many BEs (read path sketched below)
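
An illustrative sketch of the read path described above (this is not actual Tornado/Cyclone code; the names and data structures are invented for illustration): the FE answers from its local cache when it can, otherwise it fetches from the Back End holding that slice of the spool and caches the result.

    # Illustrative read path only -- not actual Tornado/Cyclone code.
    local_cache = {}                     # FE spool cache: ~1 day of disk, per earlier slide
    backends = [{}, {}, {}, {}]          # each BE holds only part of the spool

    def fetch_article(msgid):
        if msgid in local_cache:
            return local_cache[msgid]                  # served straight off the FE
        be = backends[hash(msgid) % len(backends)]     # pick the BE responsible for this article
        article = be.get(msgid)
        if article is not None:
            local_cache[msgid] = article               # adaptive caching: keep what gets read
        return article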

Quiz

1. Trace the path of locally posted articles.

2. Trace the path of articles received from peers

3. Where and how are articles numbered?

4. If a Front End is down, what happens?

5. If a Back End is down what happens?

6. If the storage array is out, what happens?

7. What do we do if the Master Cyclone goes down?

8. What do we do if the Hurricane goes down?

9. How do we add a feed from another data center to only fill in articles we’re missing?

10. How do we add a feed to only turn on if our Master Cyclone goes down?

11. Who should we peer with?

Questions
