Page 1: Petascale  Data Intensive Computing for eScience

Petascale Data Intensive Computing for eScience

Alex Szalay, Maria Nieto-Santisteban, Ani Thakar, Jan Vandenberg, Alainna Wonders, Gordon Bell, Dan Fay, Tony Hey, Catherine Van Ingen, Jim Heasley

Page 2: Petascale  Data Intensive Computing for eScience

Gray’s Laws of Data Engineering

Jim Gray:
Scientific computing is increasingly revolving around data
Need scale-out solution for analysis
Take the analysis to the data!
Start with the “20 queries”
Go from “working to working”

DISSC: Data Intensive Scalable Scientific Computing

Page 3: Petascale  Data Intensive Computing for eScience

Amdahl’s Laws

Gene Amdahl (1965): laws for a balanced system

i. Parallelism: maximum speedup is (S+P)/S
ii. One bit of IO/sec per instruction/sec (BW)
iii. One byte of memory per one instruction/sec (MEM)
iv. One IO per 50,000 instructions (IO)

Modern multi-core systems move farther away from Amdahl’s Laws (Bell, Gray and Szalay 2006)

For a Blue Gene the BW=0.001, MEM=0.12. For the JHU GrayWulf cluster BW=0.5, MEM=1.04.
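
As a reading aid (not from the original slide), Amdahl's parallelism law and the two balance ratios quoted above can be written out explicitly; S and P are the serial and parallel parts of the work and N the number of processors:

```latex
% Amdahl's parallelism law: serial part S, parallel part P spread over N processors.
% The N -> infinity limit gives the maximum speedup quoted above.
\mathrm{speedup}(N) = \frac{S+P}{\,S+P/N\,},
\qquad
\lim_{N\to\infty}\mathrm{speedup}(N) = \frac{S+P}{S}

% Amdahl balance ratios ("Amdahl numbers") as used on the slide:
\mathrm{BW}  = \frac{\text{IO rate [bits/sec]}}{\text{instruction rate [instr/sec]}},
\qquad
\mathrm{MEM} = \frac{\text{memory size [bytes]}}{\text{instruction rate [instr/sec]}}
```

A balanced system in this sense has BW and MEM near 1, which is why the Blue Gene values (0.001, 0.12) fall far short while the GrayWulf values (0.5, 1.04) come close.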

Page 4: Petascale  Data Intensive Computing for eScience

Typical Amdahl Numbers

Page 5: Petascale  Data Intensive Computing for eScience

Commonalities of DISSC

Huge amounts of data, aggregates needed
◦ Also we must keep the raw data
◦ Need for parallelism
Requests benefit from indexing
Very few predefined query patterns
◦ Everything goes… search for the unknown!
◦ Rapidly extract small subsets of large data sets
◦ Geospatial everywhere
Limited by sequential IO
Fits DB quite well, but no need for transactions
Simulations generate even more data

Page 6: Petascale  Data Intensive Computing for eScience

Total GrayWulf Hardware

46 servers with 416 cores
1PB+ disk space
1.1TB total memory
Cost <$700K

[Diagram: cluster tiers and interconnect. One tier: 320 CPU, 640GB memory, 900TB disk; another: 96 CPU, 512GB memory, 158TB disk. Interconnect: Infiniband 20 Gbits/s and 10 Gbits/s links.]


Page 7: Petascale  Data Intensive Computing for eScience

Data Layout

7.6TB database partitioned 4 ways
◦ 4 data files (D1..D4), 4 log files (L1..L4)
Replicated twice to each server (2x12)
◦ IB copy at 400MB/s over 4 threads
Files interleaved across controllers
Only one data file per volume
All servers linked to head node
Distributed Partitioned Views (a minimal SQL sketch follows below)

Example file placement on node GW01 (ctrl = controller, vol = volume; columns 82P and 82Q are the two database copies):

GW01  ctrl  vol  82P  82Q
      1     E    D1   L4
      1     F    D2   L3
      1     G    L1   D4
      1     I    L2   D3
      2     J    D4   L1
      2     K    D3   L2
      2     L    L3   D2
      2     M    L4   D1
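
The Distributed Partitioned Views mentioned above are a standard SQL Server construct; below is a minimal sketch, assuming hypothetical linked-server, database, and table names (GW01..GW03, Stripe82, Detections) that are not taken from the actual GrayWulf schema:

```sql
-- Minimal distributed partitioned view (DPV) sketch on the head node.
-- Each linked server holds one partition of the detections; the view
-- stitches the partitions together with UNION ALL.
CREATE VIEW AllDetections
AS
SELECT * FROM GW01.Stripe82.dbo.Detections   -- partition on node GW01
UNION ALL
SELECT * FROM GW02.Stripe82.dbo.Detections   -- partition on node GW02
UNION ALL
SELECT * FROM GW03.Stripe82.dbo.Detections;  -- ...and so on for the other nodes
GO

-- A query against the view fans out to the member servers:
SELECT COUNT_BIG(*) FROM AllDetections;
```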

Page 8: Petascale  Data Intensive Computing for eScience

Software Used

Windows Server 2008 Enterprise Edition
SQL Server 2008 Enterprise RTM
SQLIO test suite
PerfMon + SQL Performance Counters
Built-in Monitoring Data Warehouse
SQL batch scripts for testing
DPV for looking at results
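
As a side note, the SQL Performance Counters listed above are also visible from inside the server through the sys.dm_os_performance_counters view; a small illustrative query (the choice of counters here is an assumption, not the project's actual monitoring script):

```sql
-- Read a few buffer-manager counters from inside SQL Server;
-- PerfMon samples the same counters on the OS side.
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Page reads/sec', 'Page writes/sec', 'Page life expectancy');
```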

Page 9: Petascale  Data Intensive Computing for eScience

Performance Tests

Low level SQLIO
◦ Measure the “speed of light”
◦ Aggregate and per-volume tests (R, some W)
Simple queries (a sketch follows below)
◦ How does SQL Server perform on large scans?
Porting a real-life astronomy problem
◦ Finding time series of quasars
◦ Complex workflow with billions of objects
◦ Well suited for parallelism
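
A hedged sketch of the kind of “simple query” scan referred to above, written against the hypothetical AllDetections view from the earlier DPV sketch; the column names follow SDSS conventions (ra, psfMag_r) but are assumptions, not the actual test queries:

```sql
-- Scan-style aggregate: each member server reads its local partition
-- sequentially and only the aggregate travels back to the head node.
SELECT COUNT_BIG(*)  AS n_detections,
       AVG(psfMag_r) AS mean_r_mag
FROM AllDetections
WHERE ra BETWEEN 300.0 AND 310.0;   -- illustrative RA slice of Stripe 82
```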

Page 10: Petascale  Data Intensive Computing for eScience

SQLIO Aggregate (12 nodes)

[Chart: aggregate IO rate (MB/sec, 0-20,000) vs. time (sec, 0-2,500) for the Read and Write tests across the 12 nodes.]

Page 11: Petascale  Data Intensive Computing for eScience

Aggregate IO Per Volume

[Chart: aggregate IO per volume over time; one series per volume (E, F, G, I, J, K, L, M).]

Page 12: Petascale  Data Intensive Computing for eScience

IO Per Disk (Node/Volume)

[Chart: per-disk IO rate (y axis 0-90) for each node (GW01-GW08, GW17-GW20) and volume (E, F, G, I, J, K, L, M). Annotations: “Test file on inner tracks, plus 4K block format”; “2 ctrl volume”.]

Page 13: Petascale  Data Intensive Computing for eScience

Astronomy Application Data

SDSS Stripe82 (time-domain) x 24
◦ 300 square degrees, multiple scans (~100)
◦ (7.6TB data volume) x 24 = 182.4TB
◦ (851M object detections) x 24 = 20.4B objects
◦ 70 tables with additional info
Very little existing indexing
Precursor to similar, but much bigger data from Pan-STARRS (2009) & LSST (2014)

Page 14: Petascale  Data Intensive Computing for eScience

Simple SQL Query

[Chart: aggregate throughput (MB/s, y axis 11,000-13,000) over time for three simple queries (2a, 2b, 2c); mean throughput (harmonic/arithmetic): 12,109 MB/s and 12,081 MB/s.]

Page 15: Petascale  Data Intensive Computing for eScience

Finding QSO Time-Series

Goal: find QSO candidates in the SDSS Stripe82 data and study their temporal behavior
Unprecedented sample size (1.14M time series)!
Find matching detections (100+) from positions
Build table of detections collected/sorted by the common coadd object for fast analyses (see the sketch below)
Extract/add timing information from the Field table
Original script written by Brian Yanny (FNAL) and Gordon Richards (Drexel)
Ran in 13 days in the SDSS database at FNAL
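
A minimal sketch of the “build a table of detections collected/sorted by the common coadd object” step described above, assuming hypothetical table and column names (Match, QsoDetections, coadd_id, mjd_r) modeled loosely on the SDSS schema rather than the actual FNAL script:

```sql
-- Collect every epoch detection of each coadded object into one table and
-- cluster it on (coadd_id, mjd) so a full light curve is one contiguous read.
SELECT m.coadd_id,
       f.mjd_r       AS mjd,        -- assumed per-band epoch column in Field
       d.psfMag_r,
       d.psfMagErr_r
INTO   QsoDetections
FROM   Match       AS m             -- output of the positional crossmatch
JOIN   PhotoObjAll AS d ON d.objID   = m.objID
JOIN   Field       AS f ON f.fieldID = d.fieldID;

CREATE CLUSTERED INDEX ix_QsoDetections ON QsoDetections (coadd_id, mjd);
```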

Page 16: Petascale  Data Intensive Computing for eScience

CrossMatch Workflow

[Workflow diagram: PhotoObjAll and the coadd catalog are filtered into zone tables (zone1, zone2), cross-matched (xmatch) into a neighbors table, and joined with the Field table to produce the Match table; the steps are annotated with timings of 10 min, 2 min, and 1 min.]
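
The xmatch step above is a zone-based positional crossmatch; a deliberately simplified sketch of the idea follows (column names and the match radius are assumptions, and a real implementation also scales the RA window by cos(dec) and handles zone-boundary wrap-around):

```sql
-- Zone crossmatch sketch: objects are bucketed into declination zones,
-- zoneID = FLOOR(dec / zone height), so candidate pairs can only live in the
-- same or an adjacent zone and the all-pairs match becomes an indexable join.
DECLARE @radius float = 1.0 / 3600.0;   -- assumed 1 arcsec match radius (degrees)

SELECT z1.objID AS detectionID,
       z2.objID AS coaddID
FROM   zone1 AS z1
JOIN   zone2 AS z2
  ON   z2.zoneID BETWEEN z1.zoneID - 1 AND z1.zoneID + 1
 AND   z2.ra  BETWEEN z1.ra  - @radius AND z1.ra  + @radius
 AND   z2.dec BETWEEN z1.dec - @radius AND z1.dec + @radius;
```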

Page 17: Petascale  Data Intensive Computing for eScience

Xmatch Perf Counters

Page 18: Petascale  Data Intensive Computing for eScience

Crossmatch Results

Partition the queries spatially
◦ Each server gets part of the sky
Runs in ~13 minutes!
Nice scaling behavior
Resulting data indexed
Very fast posterior analysis
◦ Aggregates in seconds over 0.5B detections (see the sketch after the charts below)

[Charts: histogram of the frequency of the number of detections per object (1-100 detections, counts up to ~50,000), and crossmatch time [s] versus number of objects [M] showing the scaling behavior.]
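
A sketch of the kind of seconds-scale posterior aggregate mentioned above (the chart's detections-per-object histogram), using the hypothetical QsoDetections table from the earlier sketch:

```sql
-- Histogram of the number of detections per coadd object; with the table
-- clustered on coadd_id this is a single ordered scan plus two group-bys.
SELECT n_detections, COUNT_BIG(*) AS n_objects
FROM (
    SELECT coadd_id, COUNT_BIG(*) AS n_detections
    FROM   QsoDetections
    GROUP BY coadd_id
) AS per_object
GROUP BY n_detections
ORDER BY n_detections;
```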

Page 19: Petascale  Data Intensive Computing for eScience

Conclusions

Demonstrated large scale computations involving ~200TB of DB data
DB speeds close to the “speed of light” (72%)
Scale-out over a SQL Server cluster
Aggregate I/O over 12 nodes
◦ 17GB/s for raw IO, 12.5GB/s with SQL
Very cost efficient: $10K/(GB/s)
Excellent Amdahl number >0.5

Page 20: Petascale  Data Intensive Computing for eScience
Page 22: Petascale  Data Intensive Computing for eScience

Test Hardware Layout

Dell 2950 servers
◦ 8 cores, 16GB memory
◦ 2x PERC/6 disk controller
◦ 2x (MD1000 + 15x750GB SATA)
◦ SilverStorm IB controller (20Gbits/s)
12 units = (4 per rack) x 3
1x Dell R900 (head node)
QLogic SilverStorm 9240
◦ 288-port IB switch