Everything is a Kit in UniCloud
[Figure: kit catalog. Kits shown: SGE, UniPortal, UniCloud, Ganglia, UniSight, Cacti, Nagios, OEL, OVM, UniPlan, Amazon EC2, Rackspace, GoGrid, OEL 5.X, CentOS 5.X, RHEL 5.X, Base.]

Tortuga Framework for UniCloud
[Figure: framework components. Labels shown: DHCP Server, MySQL DB, CherryPy WS, Python CLI, CFM, SQLAlchemy, Apache httpd, Apache Tomcat, Log Server, Kit Util, rsync, OS Util, System Imager. Resource Adapters for UniCloud: Tortuga Resource Adapter (Default, OVM, EC2, Rackspace, GoGrid), Tortuga Storage Adapter (Nexenta, NetApp), Tortuga Network Adapter (OpenVSwitch, Cisco, VMware).]

Kit types
• OS Kits, created from standard media: OEL 5.X, CentOS 5.X, RHEL 5.X, Win 7, Win Server
• Middleware Kits: Base Kit, UniCloud Kit, Cacti Kit, Nagios Kit, HPC Kit
• Cloud Kits: Amazon EC2 Kit, Rackspace Kit, GoGrid Kit, VMware
• Application Kits, from the user or Univa: Application Kit 1, Application Kit 2
• Kit Repo (Yum Repo)

Software Profiles
Software profiles define a collection of an OS and Kits to create a whole OS and application stack (distribution, middleware software, application software). Profiles shown:
• Win Server + Application Kit 1
• Win 7 + Application Kit 2
• OEL 5.X + Nagios Kit + Application Kit 1
• OEL 5.X + Base Kit + UniCloud Kit
• RHEL 5.X + Base Kit + Amazon EC2 Kit

Real or Virtual Hardware
Amazon EC2, GoGrid, VMware, Rackspace, Physical HW

Software Profile + Kits + Hardware Profile = the same stack (distribution, middleware software, application software) on Amazon EC2, Rackspace, or VMware.

Kits and software profiles form the complete software stack, which can be run on different hardware profiles. The same software stack is used for all hardware profiles. Applications, and the whole stack of software, can be moved from one real or virtual environment to another.
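The relationship between software profiles and hardware profiles can be sketched in a few lines of Python. This is an illustrative model only, not the Tortuga or UniCloud API: the class names `SoftwareProfile` and `HardwareProfile`, the `deploy` function, and the profile name `compute` are all hypothetical. The kit and target names are taken from the poster.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SoftwareProfile:
    """Hypothetical model: an OS kit plus the kits layered on top of it."""
    name: str
    os_kit: str
    kits: Tuple[str, ...] = ()

    def stack(self) -> Tuple[str, ...]:
        # The full software stack, OS first, then the remaining kits in order.
        return (self.os_kit,) + self.kits


@dataclass(frozen=True)
class HardwareProfile:
    """Hypothetical model: a real or virtual target for a software stack."""
    name: str
    virtual: bool


def deploy(software: SoftwareProfile, hardware: HardwareProfile) -> dict:
    # Illustrative pairing step: the stack is carried over unchanged,
    # only the hardware target differs.
    return {"target": hardware.name, "stack": software.stack()}


compute = SoftwareProfile("compute", "OEL 5.X", ("Base Kit", "UniCloud Kit"))
targets = [
    HardwareProfile("Physical HW", virtual=False),
    HardwareProfile("VMware", virtual=True),
    HardwareProfile("Amazon EC2", virtual=True),
]
plans = [deploy(compute, hw) for hw in targets]

for plan in plans:
    print(plan["target"], "->", " + ".join(plan["stack"]))
```

Every plan carries an identical stack, which is the point the poster makes: the software profile is defined once and paired with whichever hardware profile is available.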
[Figure: Virtual Data Center layout. Labels: Main UniCloud; VDC A, VDC B, VDC C; Hardware; Xen; Head Node; Main Head Node; Compute Node; OS; Applications; Configuration; Switch; VLAN Trunking; Trunked VLANs; Networking.]

Optimization of Electronic Design Automation (EDA) Infrastructure with UniCloud and Virtualization
Bill Bryce (Univa), Scott Clark (Deopli)

Standard Job Layout in Batch Cluster
[Figure: job layout over Time t (1 to 11) and Hosts/Cores h (h1 to h12). Job labels: P1, P2, P3, M1, M2, B, C, c, s. Callouts: "Gap, pre-emption must be used"; "Small jobs must be pre-empted".]

In infrastructure today:
• Pre-emption must be used to move lower-priority workload so that high-priority workload can run.
• Gaps and waste occur in the host allocation (unless pre-emption is used).
• Many applications fail to use all of the cores on multi-core machines.
• Pre-empted jobs are 'run from the beginning', so computation is lost.
• Suspending jobs rarely works, since high-priority jobs need the memory on the machine.
• Multi-core machines are underutilized.

Dynamic Resource Optimization with UniCloud
[Figure: the same Time t and Hosts/Cores h axes and job labels as the standard job layout.]

• Pre-emption rarely needs to be used.
• Physical machines are sliced into smaller virtual machines (v).
• Small jobs are packed onto collections of virtual machines.
• Virtual machines can be live migrated or suspended to disk.
• Large high-priority workload runs on the cluster.
• Work is *not* lost due to pre-emption.

Physical hosts 'h' are split into virtual hosts 'v'. These virtual hosts allow a better packing of workload onto the available physical hosts. Applications that did not take advantage of the larger memory and core counts of modern machines can be split across multiple virtual machines, with smaller jobs run on those machines. All those cores you bought can now be used!

Result: better utilization of multi-core machines. Runtime is not lost for pre-empted jobs!

[Figure: Hosts split into Virtual Machines (v).]

Benchmark Results for Common EDA Applications in Virtual Machines

Bare Metal    VM        Slowdown
1172          1126      -4.09%
5663          5774      1.92%
53,263        53,670    0.76%
151           153       1.31%

[Application labels for these results are not recoverable, apart from "Design Compiler", which appears between the first and second results. The source also carries the unlabeled values 153, 151, 1%.]

Synopsys® HSIM: Hierarchical Full-chip Circuit Simulation and Analysis

Hsim      Bare Metal    VM (CPU Affinity)    Slowdown    VM (No Affinity)    Slowdown
1 job     28.78         28.21                -2.02%      28.4                -1.34%
2 jobs    28.48         28.95                1.62%       28.32               -0.56%
4 jobs    29.1          29.11                0.03%       31.03               6.22%
6 jobs    31.78         32.31                1.64%       32.19               1.27%
8 jobs    34.31         34.67                1.04%       34.99               1.94%

[Other panel titles: EDA Cluster Configuration with Virtualization; UniCloud 2.1 Architecture; Kits, Software & Hardware Profiles; Application Packaging into Kits.]

Simulia Benchmark on Amazon EC2
[Figure: Simulia benchmark charts plotting Execution Time vs. Number of Cores. Tick values and legend entries are not recoverable.]
Compute Node Type              CPU                                      CPU Cores   Memory [GB]   Swap [GB]   Interconnect
EC2 - m2.4xlarge               Intel Xeon X5550 @ 2.67GHz, 8MB Cache    8           68.4          0           1Gbps Ethernet
EC2 - cc1.4xlarge              Intel Xeon X5570 @ 2.93GHz, 8MB Cache    16          22            10          10Gbps Ethernet
Simulia Intel Xeon Woodcrest   Intel Xeon X5160 @ 3.0 GHz, 4MB Cache    4           16            Unknown     InfiniBand

• UniCloud provides a framework for packaging software into 'Kits'.
• Kits can be automatically installed and configured onto physical nodes, virtual nodes, and even cloud instances in public cloud infrastructures.
• Kits are 'Meta-RPMs' and contain the software, meta-data describing the software, and scripts for installing and configuring the software.
• UniCloud automates many typical installation and configuration tasks in a cluster environment, including storage, networking, and node configuration.
• UniCloud can be used to create Virtual Data Centers (VDCs) on the same physical compute infrastructure.
• Each VDC is isolated from the others using VLANs.
• Users can request their own cluster environment, and UniCloud can automatically create that environment.
• EDA environments are concerned with optimizing license usage and overall cluster throughput.
• Current schedulers cannot optimize license usage without pre-emption and re-running workload from the beginning.
• UniCloud + Virtualization can live migrate or suspend to disk entire machines running applications.
• Applications are not killed, so very few CPU cycles are wasted.
• Virtualization overhead is in the range of 2-4%.
• Project-specific and customer-specific virtual clusters can be created on demand to ensure compute resources meet project deadlines and goals.
• Virtualization allows for better use of multi-core and many-core machines by running multiple applications in small virtual machines.
• UniCloud + Virtualization can 'pack' more applications onto the existing compute resources, ensuring maximum usage of software licenses and CPU hardware.
• Running applications in virtual machines provides mobility for the entire application, similar to a checkpoint but far more robust.

A negative slowdown indicates that the application ran slightly faster in a virtual machine than on physical hardware.
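The poster does not state how the slowdown percentages were computed. The reported figures appear consistent with (VM - bare metal) / VM, that is, the difference taken relative to the VM runtime. The short Python sketch below checks that assumed definition against the HSIM rows for VMs with CPU affinity; the function name `slowdown_percent` is hypothetical, and the formula is an inference from the numbers rather than the authors' stated method.

```python
def slowdown_percent(bare_metal: float, vm: float) -> float:
    """Assumed definition: runtime difference relative to the VM runtime, in percent."""
    return (vm - bare_metal) / vm * 100.0


# HSIM runtimes from the poster: jobs -> (bare metal, VM with CPU affinity)
hsim_cpu_affinity = {
    1: (28.78, 28.21),
    2: (28.48, 28.95),
    4: (29.1, 29.11),
    6: (31.78, 32.31),
    8: (34.31, 34.67),
}

# Slowdown per job count, rounded to two decimals as on the poster.
computed = {
    jobs: round(slowdown_percent(bm, vm), 2)
    for jobs, (bm, vm) in hsim_cpu_affinity.items()
}

for jobs, pct in computed.items():
    sign = "faster in VM" if pct < 0 else "slower in VM"
    print(f"{jobs} job(s): {pct:+.2f}% ({sign})")
```

Under this definition the computed values match the poster's slowdown column for these rows, and a negative result corresponds to the VM run finishing sooner than the bare-metal run.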