Dec 14, 2015
Clemson NextNet
SDN Use Cases for Life Sciences Research
Kuang-Ching “KC” Wang
Associate Professor, Clemson University
Sponsored by NSF grant OCI-1245936
KC Wang, Clemson University, July 17 2013
Clemson NextNet: An NSF CC-NIE Project
Objectives:
• Direct access to I2 100G Innovation Platform
• Science DMZ from anywhere, w/o manual plumbing
• Campus production, end-to-end support
• Flexible, optimized 10~40G access to resources on campus and other universities
• Software defined network (SDN)
What is the Fuss About SDN?
Network Researchers:
Industry:
Traditional networks are getting unmanageable (not about bandwidth)!
Traditional Network SDN
What Do Our (Life Sciences) Folks Need?
Real-time medical imaging
Two Clemson life sciences researchers in attendance today:
• Alex Feltus
– Associate Professor in Genetics & Biochemistry
– Faculty Consultant in Clemson University Genomics Institute
– Research: Rapid crop design with massive gene interaction networks
• David Kwartowitz
– Assistant Professor in Bioengineering
– Research: Rapid processing of stereo laparoscopic data for real-time pre- and intra-surgery support
Palmetto HPC Cluster
Data Store
N…
The Feltus Lab Builds Massive Gene Interaction Networks Using RNA Expression Profiles From Next-Generation Sequence (NGS) and Microarray Experiments.
Rice (Oryza sativa)
Goal: Rapidly design new crop varieties for a specific environment including “old” environments with a changed climate…
Personalized Agriculture
Slide prepared by Alex Feltus
Massive amounts of DNA/RNA/Genetic Data in Databases
1.64 Quadrillion base pairs in 5 yrs!
http://www.ncbi.nlm.nih.gov/Traces/sra/
Slide prepared by Alex Feltus
An NGS Biomarker Example Dataset
RAW DATA (uncompressed)
5.7G Sample_Feltus1_L006_R1.cat.fastq
5.7G Sample_Feltus1_L006_R2.cat.fastq
5.8G Sample_Feltus1_L007_R1.cat.fastq
5.8G Sample_Feltus1_L007_R2.cat.fastq
6.7G Sample_Feltus2_L006_R1.cat.fastq
6.7G Sample_Feltus2_L006_R2.cat.fastq
6.8G Sample_Feltus2_L007_R1.cat.fastq
6.8G Sample_Feltus2_L007_R2.cat.fastq
6.5G Sample_Feltus3_L006_R1.cat.fastq
6.5G Sample_Feltus3_L006_R2.cat.fastq
6.6G Sample_Feltus3_L007_R1.cat.fastq
6.6G Sample_Feltus3_L007_R2.cat.fastq
7.3G Sample_Feltus4_L006_R1.cat.fastq
7.3G Sample_Feltus4_L006_R2.cat.fastq
7.4G Sample_Feltus4_L007_R1.cat.fastq
7.4G Sample_Feltus4_L007_R2.cat.fastq
5.6G Sample_Feltus5_L006_R1.cat.fastq
5.6G Sample_Feltus5_L006_R2.cat.fastq
5.7G Sample_Feltus5_L007_R1.cat.fastq
5.7G Sample_Feltus5_L007_R2.cat.fastq
8.8G Sample_Feltus6_L006_R1.cat.fastq
8.8G Sample_Feltus6_L006_R2.cat.fastq
8.9G Sample_Feltus6_L007_R1.cat.fastq
8.9G Sample_Feltus6_L007_R2.cat.fastq

PROCESSED DATA (compressed)
2.4G Sample_Feltus1_L007_R1.MERGED.BAM
2.4G Sample_Feltus1_L007_R1.MERGED.BAM
2.7G Sample_Feltus2_L006_R1.MERGED.BAM
2.7G Sample_Feltus2_L007_R1.MERGED.BAM
2.6G Sample_Feltus3_L006_R1.MERGED.BAM
2.6G Sample_Feltus3_L007_R1.MERGED.BAM
3.0G Sample_Feltus4_L006_R1.MERGED.BAM
3.0G Sample_Feltus4_L007_R1.MERGED.BAM
2.2G Sample_Feltus5_L006_R1.MERGED.BAM
2.2G Sample_Feltus5_L006_R1.MERGED.BAM
2.9G Sample_Feltus6_L006_R1.MERGED.BAM
2.9G Sample_Feltus6_L007_R1.MERGED.BAM

6 RNA Samples in Duplicate
163.6 GB (raw) + 31.8 GB (processed) = 195.4 GB of critical data files (<6 hours to process on cluster)
Does not include:
• Intermediate processing files
• Reference genome (0.72 GB)
Slide prepared by Alex Feltus
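To put the 195.4 GB total in network terms, here is a minimal sketch of the ideal transfer time at a few line rates. It assumes 1 GB = 10^9 bytes and a fully utilized link with no protocol overhead. The function name `transfer_seconds` is hypothetical, and the 10, 40, and 100 Gbps rates simply echo the access speeds named in the project objectives; none of these are measurements.

```python
# Ideal transfer time for the dataset on this slide at several link rates.
# Assumptions (not from the slides): 1 GB = 1e9 bytes, the link is fully
# utilized, and protocol overhead is ignored.

DATASET_GB = 195.4  # raw + processed total stated on the slide


def transfer_seconds(size_gb: float, rate_bps: float) -> float:
    """Return the ideal time in seconds to move size_gb gigabytes at rate_bps."""
    bits = size_gb * 1e9 * 8
    return bits / rate_bps


if __name__ == "__main__":
    for label, rate in [("10 Gbps", 10e9), ("40 Gbps", 40e9), ("100 Gbps", 100e9)]:
        secs = transfer_seconds(DATASET_GB, rate)
        print(f"{label:>8}: {secs:.1f} s ({secs / 60:.2f} min)")
```

Real transfers are usually slower than this bound, since disk I/O, TCP behavior, and shared links all reduce the achieved rate.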
The CUTTERS (Kwartowitz) lab is working to enable remote processing of stereo laparoscopic data for real-time feedback with surgical robot systems on partner sites (Vanderbilt, Mayo Clinic)
Clemson, SC
Vanderbilt, TN
Mayo Clinic, MN
Palmetto HPC Cluster
How Does It Work Today
Diagram labels: ISP 1 (Internet), ISP 2 (Internet), R&E net, ……, Data Center, Campus Network, Research Network, R&E net 1
Down the road
• Compliances
• User-specific privileges
• Access control
Porting GENI Research Prototype to Production
SOS: Seamless Large Data Transport
Diagram labels: Perceived point-to-point or multi-point connection; SOS-enabled switch (two); SOS Controller; SOS agent (two); SOS pipe; TCP on either side; steps 1, 2, 3.1, 3.2, 4.1, 4.2
Sites: SOS UW-Madison, SOS Clemson, SOS Stanford, SOS SCinet, GENI core
Steroid OpenFlow Service (SOS)
by Aaron Rosen and KC Wang
• Seamless TCP throughput upgrade, e.g., 2.5 Mbps → 120 Mbps
• Multipath support
• Automatic site agent detection
Upcoming demos of SOS:
• NSF 12th GENI conference, Kansas City, MO.
• Supercomputing 2011, Seattle, WA.
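One common reason a single TCP connection is slow over a long path is that its throughput is bounded by the window size divided by the round-trip time. Relaying over several connections in parallel raises that bound. The sketch below illustrates only this general arithmetic. The window size, round-trip time, and stream counts are hypothetical values, and the code does not describe how SOS itself is implemented.

```python
# Window-limited TCP throughput: at most one window of data per round trip.
# All numeric inputs below are hypothetical, chosen only for illustration.


def window_limited_bps(window_bytes: int, rtt_seconds: float, streams: int = 1) -> float:
    """Upper bound on aggregate throughput (bits/s) for `streams` parallel
    connections, each limited to one window per round trip."""
    return streams * window_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    window = 64 * 1024   # 64 KiB window (assumed)
    rtt = 0.1            # 100 ms round-trip time (assumed)
    for n in (1, 8, 32):
        mbps = window_limited_bps(window, rtt, n) / 1e6
        print(f"{n:>2} stream(s): {mbps:.1f} Mbps upper bound")
```

Under these assumed values the bound scales linearly with the number of parallel streams, which is the general idea behind raising end-to-end throughput without changing the end hosts.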
Significance of IT Support Team to Bootstrap Researcher Use of HPC and SDN
May 2010: Galen joins CITI and begins recruiting & training users
Chart: New Palmetto Cluster Users (y-axis: Number of Users)
And to Create a Transformative University
• A unique coalition among academy, IT, and industrial partners within and beyond Clemson.
• Synergy with other university research centers: Cyberinstitute, ICAR, and Watts Innovation Center
Synergy with Cross-Communities Momentum
Research Communities
Companies
Open Source Communities
IT Communities
Universities
. . .