ExoGENI Federated Private NIaaS - Internet2 · 4/2/2015
ExoGENI Federated Private NIaaS Infrastructure
Chris Heermann, ckh@renci.org
Overview
• ExoGENI architecture and implementation
• ExoGENI Science use-cases
– Urgent Computing: Storm Surge Predictions on GENI
– ScienceDMZ as a Service: Creating Science Super-Facilities with GENI
• Support for SDN in ExoGENI
IaaS: clouds and network virtualization
• Cloud providers offer virtual compute and storage infrastructure through cloud APIs (Amazon EC2, ...)
• Transport network providers offer virtual network infrastructure through dynamic circuit APIs (NLR Sherpa, DOE OSCARS, I2 ION, OGF NSI, ...)
• A controller orchestrates both across the Breakable Experimental Network
• ORCA is a “wrapper” for off-the-shelf clouds, circuit networks, etc., enabling federated orchestration:
– Resource brokering
– VM image distribution
– Topology embedding
– Stitching
– Federated authorization
• GENI, DOE, NSF SDCI+TC • http://geni-orca.renci.org • http://networkedclouds.org
Open Resource Control Architecture
• Actors: slice manager (SM), broker (B, the coordinator), aggregate manager (AM, fronting an aggregate)
The APIs
• Simple API, complex description language
– createSlice(sliceName, Term, SliceTopology, Credentials)
– deleteSlice(sliceName)
– sliceStatus(sliceName)
– modifySlice(sliceName, TopologyUpdate)
– extendSlice(sliceName, NewTerm)
• Supports topology management, debugging, elasticity, and agility
• Description language: NDL-OWL, an OWL-based ontology that describes
– User: resource requests
– Provider: resource description, public resource advertisement, manifest
• Participating in US-EU effort to standardize the IaaS ontology
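The slice lifecycle above can be sketched as a toy client. The OrcaClient class and its in-memory bookkeeping are assumptions for illustration only; the real ORCA API exchanges NDL-OWL topology descriptions with remote actors rather than mutating a local dictionary.

```python
# Toy sketch of the slice lifecycle API named on this slide. OrcaClient and
# its in-memory state are illustrative assumptions, not the ORCA bindings.
class OrcaClient:
    def __init__(self):
        self.slices = {}

    def create_slice(self, name, term, topology, credentials):
        # createSlice(sliceName, Term, SliceTopology, Credentials)
        self.slices[name] = {"term": term, "topology": dict(topology),
                             "status": "active"}

    def slice_status(self, name):            # sliceStatus(sliceName)
        return self.slices[name]["status"]

    def modify_slice(self, name, update):    # modifySlice(sliceName, TopologyUpdate)
        self.slices[name]["topology"].update(update)

    def extend_slice(self, name, new_term):  # extendSlice(sliceName, NewTerm)
        self.slices[name]["term"] = new_term

    def delete_slice(self, name):            # deleteSlice(sliceName)
        del self.slices[name]

client = OrcaClient()
client.create_slice("demo", term=90, topology={"vms": 4}, credentials="cert.pem")
client.modify_slice("demo", {"vlans": 2})   # grow the topology (elasticity)
client.extend_slice("demo", 180)            # renew the lease term
print(client.slice_status("demo"))          # active
client.delete_slice("demo")
```

The point of the small surface area is that all the complexity lives in the topology description, not in the call signatures.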
GENI Federation
• Federated identity
– InCommon
– X.509 identity certificates
• Common APIs
– Aggregate Manager: ExoGENI has a compatibility API layer supporting AM API v2
– Clearinghouse
• Federated access policies
– ABAC
• Agreed-upon resource description language
– RSpec: ExoGENI translates relevant portions from NDL-OWL to RSpec and back as needed
• Several major portions
– ExoGENI, InstaGENI, WiMax, Internet2 AL2S
• Federation with EU
– Amsterdam XO rack was part of the SDX demo at GEC21 with iMinds
Building network topologies
• Virtual network exchange
• Virtual colo: campus network to circuit fabric
• Cloud hosts with network control
• Computed embedding: slice owner may deploy an IP network into a slice (OSPF)
• OpenFlow-enabled L2 topology within a slice
ExoGENI
• Every Infrastructure as a Service, All Connected.
– Substrate may be volunteered or rented
– E.g., public or private clouds, HPC, instruments, and transport providers
– Contribution size is dynamically adjustable
• ExoGENI Principles:
– Open substrate
– Off-the-shelf back-ends (OSCARS, NSI, EC2, etc.)
– Provider autonomy
– Federated coordination
– Dynamic contracts
– Resource visibility
Current topology (Breakable Experimental Network)
An ExoGENI cloud “rack site”
• Management switch and OpenFlow-enabled L2 dataplane switch
• Sliverable storage
• 10 worker nodes and a management node
• 2x10Gbps dataplane links
• 4x1Gbps management and iSCSI storage links (bonded)
• Uplink to campus Layer 3 network
• Dataplane to dynamic circuit backbone (10/40/100 Gbps); static VLAN tunnels provisioned to the backbone
• Optional dataplane to campus network for stitchable VLANs
• Direct L2 peering with the backbone: option 1, tunnels; option 2, fiber uplink
ExoGENI software structure
Current deployments
• xCAT
– Operator node provisioning
– User-initiated bare-metal provisioning
• OpenStack Essex++ (RedHat/CentOS version)
– Custom Quantum plugin to support multiple dataplanes
– Working on Juno port
• iSCSI user slivering
– IBM DS3512 appliance (NetApp iSCSI support in the works)
– Linux iSCSI stack
– Backend support for LVM, Gluster, ZFS
Tools
• ORCA native tools (native APIs, resource descriptions)
– Flukes
– More flexibility
• Federation tools (federation APIs, resource descriptions)
– Jacks, omni, jFed
– Compatibility
Tools (continued)
ExoGENI – a federation of private clouds
• Each site is a micro-cloud
– Adding support for HPC batch schedulers
• Owners decide what portion of resources to contribute
• Owners are free to continue using native IaaS interfaces
• Owners have the opportunity to take advantage of federated identity and inter-provider orchestration mechanisms
• What is it good for?
– A foundation for future institutional collaborative science CI
ExoGENI Science Use-cases
Computing Storm Surge
• ADCIRC Storm Surge Model
– FEMA-approved for Coastal Flood Insurance Studies
– Very high spatial resolution (millions of triangles)
– Typically uses 256-1024 cores for real time (one simulation!)
ADCIRC grid for coastal North Carolina
Tackling Uncertainty
• One simulation is NOT enough: probabilistic assessment of hurricanes
• A “few” likely hurricanes; fully dynamic atmosphere (WRF)
• Research ensemble: NSF Hazards SEES project, 22 members, H. Floyd (1999)
Why GENI?
• Current limitations
– Large demands for real-time compute resources during storms
– Not enough demand to dedicate a cluster year-round
• GENI enables
– Federation of resources
– Cloud bursting: urgent, on-demand compute
– High-speed data transfers to/from/between remote resources
– Replicating data/compute across geographic areas for resiliency and performance
Storm Surge Workflow
• Whole workflow is 22 ensemble members, dispatched by an ensemble scheduler and gathered by a collector
• Pegasus workflow management system
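The scheduler/collector pattern above can be sketched in a few lines. This is a stand-in, not the Pegasus workflow itself: run_member fakes one ADCIRC simulation, and the worker count mimics spreading members across compute sites.

```python
# Sketch of the ensemble pattern: fan out N members, collect the results.
# run_member is a placeholder for one ADCIRC run; Pegasus and the real
# slice resources are not modeled here.
from concurrent.futures import ThreadPoolExecutor

def run_member(member_id):
    # Placeholder for one storm-surge simulation (would run on a remote site).
    return {"member": member_id, "peak_surge_m": 1.0 + 0.1 * member_id}

def run_ensemble(n_members=22, workers=10):
    # workers=10 mirrors the 10 compute sites: up to 10 members in flight.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_member, range(n_members)))
    # Collector step: reduce member outputs to an ensemble summary.
    peaks = [r["peak_surge_m"] for r in results]
    return {"members": len(results), "max_peak_m": max(peaks)}

summary = run_ensemble()
print(summary["members"])  # 22
```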
Slice Topology
• 11 GENI sites (1 ensemble manager, 10 compute sites)
• Topology: 92 VMs (368 cores), 10 inter-domain VLANs, 1 TB iSCSI storage
• HPC compute nodes: 80 compute nodes (320 cores) from 10 sites
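The slide's numbers are internally consistent, which a two-line check makes visible. The 4-cores-per-VM and 8-nodes-per-site figures below are inferred from the stated totals, not given on the slide.

```python
# Sanity check of the slice-size arithmetic above. cores_per_vm and
# nodes_per_site are inferred from the stated totals (an assumption).
sites = 10
vms, total_vm_cores = 92, 368
cores_per_vm = total_vm_cores // vms            # 368 / 92 = 4
hpc_nodes = 80
nodes_per_site = hpc_nodes // sites             # 80 / 10 = 8
hpc_cores = hpc_nodes * cores_per_vm            # 80 * 4 = 320
print(cores_per_vm, nodes_per_site, hpc_cores)  # 4 8 320
```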
Representative Science DMZ
Dedicated vs. Virtual resources
• GENI provides a distributed software-defined infrastructure (SDI)
– Compute + Storage + Network
Emerging Trend: Super Facilities, Coupled by Networks
Experimental facilities are being transformed by new detectors, advanced mathematics, robotics, automation, and advanced networks.
Today’s Demonstration: Real-time data processing and visualization workflow
http://portal.nersc.gov/project/als/sc14/
Demo data path:
• Data from ALS experiment
• SPADE instance @ server at Argonne
• ExoGENI SPADE VM @ StarLight, Chicago (via ESnet)
• ExoGENI SPADE VM @ Oakland, California (via AL2S, ESnet)
• Compute cluster @ NERSC, LBL
• WAN-optimized data transfer nodes and a network slice created programmatically (Science DMZ as a service)
• Application workflow instantiated to stage data at the GENI rack on the Science DMZ slice
• Data is moved optimally across the WAN¹
¹ Earlier work, like Phoebus, has demonstrated the value of this approach
Dedicated vs. Virtual resources
• GENI provides a distributed software-defined infrastructure (SDI)
– Compute + Storage + Network
• GENI racks may be deployed on-campus or in provider networks close to the campus
• ‘Science DMZ as a service’
– Applications can provision a virtual ‘Science DMZ’ as and when needed
Programmable infrastructure enables end-users to create dynamic ‘friction-free’ infrastructures without advanced knowledge/training.
Microtomography of High Temperature Materials under Stress
Set collected by materials scientist Rob Ritchie, LBNL/UCB
What constitutes programmable network behavior? (i.e., what is SDN?)
• Control over virtual topology
– A link in one layer is represented by a path in another
– Examples: Layer 1/2/3 VPNs via explicit signaling (MPLS, GMPLS); bandwidth-on-demand services (OSCARS, NSI)
• Control over packet forwarding
– Making decisions about which interface a packet/frame should be placed on
– Examples: FlowVisor; OpenFlow 1.0; Nicira Open vSwitch; Cisco ONE; OpenDaylight; Juniper Contrail
• Queue management and arbitration
– Defining packet queues and associated service and scheduling policies
– Examples: numerous vendor-proprietary APIs; OpenFlow 1.3
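"Control over packet forwarding" boils down to a match-action table: the controller installs rules mapping header fields to output ports. The toy table below mimics that idea in plain Python; it is not OpenFlow itself, and the field names and ports are made up for illustration.

```python
# Toy match-action table in the spirit of an OpenFlow flow table.
# Rules are (match predicate, output port), highest priority first.
# Field names and port numbers are illustrative assumptions.
flow_table = [
    ({"dst": "10.0.0.2", "vlan": 100}, 2),
    ({"dst": "10.0.0.3"}, 3),
]
DROP = None

def forward(packet):
    # First matching rule wins; all fields in the predicate must match.
    for match, port in flow_table:
        if all(packet.get(k) == v for k, v in match.items()):
            return port
    return DROP  # table miss: drop (a real switch might punt to the controller)

print(forward({"dst": "10.0.0.2", "vlan": 100}))  # 2
print(forward({"dst": "10.0.0.9"}))               # None
```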
ExoGENI and OpenFlow (now)
• OpenFlow experiments using embedded topologies with OVS spanning one or more sites
– E.g., HotSDN ’14, “A Resource Delegation Framework for Software Defined Networks,” Baldin, Huang, Gopidi
• Experiments with OF 1.0 in rack switches
– Described in the ExoBlog (www.exogeni.net)
ExoGENI and OpenFlow (near future)
• OpenFlow service on BEN (ben.renci.org)
– 40G wave using Juniper EX switches
– FSFW, OF 1.0, multiple controllers
– Topology embedding/VNE for ExoGENI, path service for other projects
• Slice on AL2S with own controller
– Topology embedding for ExoGENI, value-add experimenter services with ExoGENI resources
• Application-specific topology embedding
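The core of the topology embedding (VNE) mentioned above is mapping each virtual link onto a substrate path. A minimal sketch, assuming a made-up substrate of rack sites and ignoring capacity, cost, and node placement that real embedders must weigh:

```python
# Minimal virtual-network-embedding sketch: map each virtual link to a
# shortest path (BFS) in the substrate. Site names and the substrate
# topology are illustrative assumptions.
from collections import deque

substrate = {
    "RENCI": ["StarLight"],
    "StarLight": ["RENCI", "Oakland"],
    "Oakland": ["StarLight", "NERSC"],
    "NERSC": ["Oakland"],
}

def shortest_path(graph, src, dst):
    # Standard BFS with predecessor tracking.
    prev, seen, q = {}, {src}, deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return path[::-1]
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                prev[v] = u
                q.append(v)
    return None  # no embedding possible for this link

def embed(virtual_links):
    # One substrate path per requested virtual link.
    return {vl: shortest_path(substrate, *vl) for vl in virtual_links}

mapping = embed([("RENCI", "NERSC")])
print(mapping[("RENCI", "NERSC")])  # ['RENCI', 'StarLight', 'Oakland', 'NERSC']
```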
Where are we going?
• More sites
– Georgia Tech [Atlanta, GA], PUCP [Lima, Peru], Ciena [Hanover, MD]
• Updated OpenStack
• Better compute isolation
– Take NUMA into account for placement decisions
• Better storage isolation
– Provision storage VLANs/channels with QoS properties for predictable performance
• Better network isolation and performance
– Enable SR-IOV
• More complex topology management/embedding
– Fully dynamic slices
• More diverse substrates
– Integration with batch schedulers (SLURM)
– VMware, other cloud stacks
– Public clouds
Thank you!
• http://www.exogeni.net