NSF CC-NIE Integration: Bridging, Transferring and Analyzing Big Data over 10Gbps Campus-Wide Software Defined Networks. BIC-LSU (Big Data Research Integration with Cyberinfrastructure for LSU). Seung-Jong (Jay) Park, Associate Professor, Computer Science, Center for Computation & Technology, Louisiana State University

May 29, 2020


dariahiddleston
Transcript
Page 1

NSF CC-NIE Integration: Bridging, Transferring and Analyzing Big Data over 10Gbps Campus-Wide Software Defined Networks

BIC-LSU
(Big Data Research Integration with Cyberinfrastructure for LSU)

Seung-Jong (Jay) Park
Associate Professor, Computer Science
Center for Computation & Technology, Louisiana State University

Page 2

Big Data Research at LSU

• Biology & Veterinary
  – Genome Sequencing
• Chemistry
  – Experiment & Simulation
• Computer Science
  – Data Mining & Visualization
• Coastal Science
  – Hazard Simulation & Modeling
• Physics & Astronomy
  – LIGO

Big Data requires a fast supercomputer, large storage, and a high-speed network.

Page 3

Challenges @ LSU

• How to Store
  – Each research lab is at a remote location
  – Lab storage is slow: HDD speed < 1Gbps
• How to Transfer
  – Network between a lab and the HPC clusters: bandwidth < 1Gbps
• How to Process
  – Message Passing Interface (MPI): hard to program

Page 4

3 Objectives @ LSU

1. How to Store: develop 8 SSD storage servers = 12TB & 20Gbps I/O bandwidth
2. How to Transfer: network between labs and the HPC clusters, bandwidth = 20Gbps
3. How to Process: develop a virtual Hadoop cluster
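To make the jump from a sub-1Gbps lab link to the 20Gbps target concrete, the following is a back-of-the-envelope sketch of ideal transfer times. It assumes decimal units, a fully utilized link, and no protocol or storage overhead, so real transfers will be slower. The 470 GB figure is the raw data size quoted in the genome case study; the function name `transfer_seconds` is made up for this illustration.

```python
def transfer_seconds(size_bytes: float, link_gbps: float) -> float:
    """Ideal transfer time: total bits divided by the link rate in bits/s."""
    return size_bytes * 8 / (link_gbps * 1e9)


# 470 GB (decimal) of raw sequencing data, as quoted in the case study.
DATASET_BYTES = 470e9

for gbps in (1, 10, 20):
    secs = transfer_seconds(DATASET_BYTES, gbps)
    print(f"{gbps:>2} Gbps: {secs:7.0f} s  (~{secs / 60:.1f} min)")
```

Under these idealized assumptions the same data set needs roughly an hour at 1Gbps and about three minutes at 20Gbps, which is only achievable if the storage servers can also sustain that I/O rate.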

Page 5

LSU Cyberinfrastructure for Big Data

[Network diagram; the labels below are recovered from the figure, the connections between components are not]

• Storage servers: @Vet School, @Chemistry, @CCT, @Biology, @Coastal, @EECS
• OpenFlow (OF) switches: Edge OF Switch Pronto 3290; Aggregation OF Switch Pronto 3780; Pluribus Core OF Switch @D Boyd; Pluribus Core OF Switch @Frey
• Compute: Hadoop On Demand SuperMike II @Frey; Hadoop Cluster @Frey
• Data source: Gene Sequencer
• Routers and external networks: 100Gbps Router @Frey; Cisco AS9000; LONI; Internet2 10Gbps
• Link speeds shown: 10Gbps, 2 X 10Gbps, 40Gbps

Collaboration with Samsung for SSD storage servers

Page 6

Case Study: Genome Sequence Analysis

• Human Genome Sequencing
  – An NIH standard human genome sequence set has 470 GB of raw data and requires more than TB memory for assembly
• Hadoop/Giraph-based software framework
  – Assembling billions of short reads into one 3-billion-base-pair sequence
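As a rough illustration of why a Hadoop-style framework suits this workload, the sketch below mimics the map, shuffle, and reduce phases in plain Python to count k-mers (length-k substrings) across short reads. It is not the Hadoop or Giraph API and not the project's code; the function names and the toy reads are invented for this example, and a real job would distribute each phase across the cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_kmers(read: str, k: int) -> Iterator[Tuple[str, int]]:
    """Map phase: emit (k-mer, 1) for every length-k window of one read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle phase: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce phase: sum the values collected for each k-mer."""
    return {key: sum(values) for key, values in groups.items()}


def count_kmers(reads: Iterable[str], k: int) -> Dict[str, int]:
    """Run the three phases in-process on a collection of reads."""
    pairs = (pair for read in reads for pair in map_kmers(read, k))
    return reduce_counts(shuffle(pairs))


# Toy input: two overlapping reads (hypothetical data).
reads = ["ACGTAC", "GTACGT"]
counts = count_kmers(reads, 3)
print(counts)
```

Because each read is mapped independently and each k-mer is reduced independently, both phases parallelize naturally, which is the property that makes the model easier to scale than hand-written message passing.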

Page 7

Case Study: De novo Assembly

• Developing a Giraph/Hadoop-based de novo assembler
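The slide does not say which algorithm the assembler uses. One common graph-based approach to de novo assembly, shown here purely as an assumption, is a de Bruijn graph: each distinct k-mer becomes an edge between its (k-1)-mer prefix and suffix, and a path that uses every edge once spells out the sequence. The single-machine sketch below uses hypothetical function names and toy reads, and assumes error-free, single-strand reads with no repeats of length k-1 or longer; production assemblers must handle all of those cases.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def build_de_bruijn(reads: Iterable[str], k: int) -> Dict[str, List[str]]:
    """Each distinct k-mer is an edge from its (k-1)-mer prefix to its suffix."""
    graph: Dict[str, List[str]] = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer not in seen:  # keep each distinct k-mer once
                seen.add(kmer)
                graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph: Dict[str, List[str]]) -> List[str]:
    """Hierholzer's algorithm: walk every edge exactly once."""
    in_deg: Dict[str, int] = defaultdict(int)
    for targets in graph.values():
        for node in targets:
            in_deg[node] += 1
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(
        (n for n in graph if len(graph[n]) - in_deg.get(n, 0) == 1),
        next(iter(graph)),
    )
    edges = {node: list(targets) for node, targets in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if edges.get(node):
            stack.append(edges[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(reads: Iterable[str], k: int) -> str:
    """Spell the sequence: first node plus the last base of each later node."""
    path = eulerian_path(build_de_bruijn(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


# Toy reads (hypothetical) that overlap to cover a 10-base sequence.
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
assembled = assemble(reads, 4)
print(assembled)
```

A vertex-centric framework such as Giraph would hold this graph partitioned across workers rather than in one dictionary, which is what allows it to exceed the memory of a single machine.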

Page 8

BIC-LSU: Milestones

• 1st year:
  – 2013 Sept: Project start
  – 2013 Dec: Constructed fibers at 2 sites (CCT, CS)
  – 2013 Mar: SSD storage servers by Samsung
  – 2013 Apr: Testing OpenFlow switches (PICA8, HP, Pluribus)
  – 2013 May: Shipping SSD servers from Samsung
  – 2013 July: Finish fibers at 4 sites (Bio, Vet, Chem, Coastal)
• 2nd year:
  – 2013 Aug: Deploy OF switches
  – 2013 Dec: Develop a POX-based OF controller
  – 2014 Feb: Develop web-based Gateway
  – 2014 May: Demonstrate Genome Assembly over BIC-LSU