Aurora energy efficiency
Transcript
Page 1: Aurora hpc energy efficiency

Aurora energy efficiency

Page 2: Aurora hpc energy efficiency

Motivations for energy efficiency

Energy Efficiency and SuperMUC: Motivation

• Academic and governmental institutions in Bavaria use electrical energy from renewable sources

• We currently pay 15.8 cents per kWh

• We already know that we will have to pay at least 17.8 cents per kWh in 2013

Quote from Meijer Huber, LRZ

Page 3: Aurora hpc energy efficiency

Motivations for energy efficiency

Motivation
• Data centers are highly energy-intensive facilities
• 10-100x more energy intensive than an office
• Server racks well in excess of 30 kW
• Surging demand for data storage
• ~3% of U.S. electricity consumption
• Projected to double in the next 5 years
• Power and cooling constraints in existing facilities

Sustainable Computing: why should we care?
• Carbon footprint
• Water usage
• Mega$ per MW-year
• Cost: OpEx > IT CapEx!

Thus, we need a holistic approach to sustainability and TCO for the entire computing enterprise, not just the HPC system.

Quote from Steve Hammond, NREL

Page 4: Aurora hpc energy efficiency

PUEs in various data centers (PUE < 1.8 requires free cooling or liquid cooling; see the worked example after the table)

Global bank's best data center (of more than 100): 2.25 – air
EPA Energy Star average: 1.91 – air/liquid
Intel average: >1.80 – air
ORNL: 1.25 – liquid
Google: 1.16 – liquid coils, evaporative tower, hot aisle containment
Leibniz Supercomputing Centre (LRZ): 1.15 – direct liquid
National Center for Atmospheric Research (NCAR): 1.10 – liquid
Yahoo Lockport (PUE declared in project): 1.08 – free air cooling + evaporative cooling
Facebook Prineville: 1.07 – free cooling, evaporative
National Renewable Energy Laboratory (NREL): 1.06 – direct liquid + evaporative tower
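For reference, PUE (Power Usage Effectiveness) is simply the total facility power divided by the power delivered to the IT equipment. A minimal Python sketch of the arithmetic; the IT load and overhead figures below are assumed purely for illustration, not taken from any of the facilities in the table:

def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power Usage Effectiveness = total facility power / IT equipment power."""
    return total_facility_kw / it_load_kw

# Hypothetical facility: 1000 kW of IT load plus cooling and power-distribution overhead.
it_load_kw = 1000.0
overhead_kw = 160.0   # chillers/dry coolers, pumps, UPS and distribution losses (assumed)

print(f"PUE = {pue(it_load_kw + overhead_kw, it_load_kw):.2f}")   # 1.16
# Every 0.1 of PUE above 1.0 means an extra 100 kW drawn for each 1000 kW of IT load.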

Page 5: Aurora hpc energy efficiency

WAYS TO IMPROVE PUE AND ENERGY EFFICIENCY

Page 6: Aurora hpc energy efficiency

Energy efficiency - methods

1. IT equipment – maximize efficiency (Flops / Watt)
• Increased work per Watt
• Eliminate fans
• Component-level heat exchange
• Newest processors are more efficient
• Liquid cooling
• Energy-aware design

2. Data center – facility PUE
• Optimize air cooling
• Free cooling
• Liquid cooling
• Direct liquid cooling
• Optimization of power conversion

3. Data center or ecosystem – reuse thermal energy
• Direct liquid cooling
• Maximize outlet temperature
• Holistic view of data center planning

Page 7: Aurora hpc energy efficiency

MAXIMIZE EFFICIENCY (FLOPS/WATT)

Page 8: Aurora hpc energy efficiency

Eurotech energy-efficient design
• Aurora supercomputers have been designed using standard components, but making the choices that give the best possible energy efficiency
• Aurora HPCs benefit from Eurotech's experience in progressively increasing the efficiency of the power conversion chain from 89% to 97% (see the sketch after this list)

The approach is:
• Choice of the most efficient components on the market, that is, components (processors, accelerators, voltage regulators, memories, minor electronic parts) that minimize energy consumption for the same functionality and performance
• Choice of the best «working points» to sit at the top of the power converters' efficiency curves
• Water cooling to lower the working temperature of components, maximize their efficiency and eliminate fans
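A rough illustration of what the 89% to 97% conversion-chain improvement is worth. The 1 MW IT load below is an assumed figure for the sake of the example, not a value from the slides:

# Input power needed to deliver a given IT load through a conversion chain of efficiency eta.
def input_power_mw(load_mw: float, eta: float) -> float:
    return load_mw / eta

load_mw = 1.0                        # assumed IT load for illustration
old = input_power_mw(load_mw, 0.89)  # ~1.124 MW drawn at 89% chain efficiency
new = input_power_mw(load_mw, 0.97)  # ~1.031 MW drawn at 97% chain efficiency

print(f"Saved: {(old - new) * 1000:.0f} kW per MW of IT load")   # ~93 kW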

Page 9: Aurora hpc energy efficiency

Gain in DC/DC conversion efficiency
• The new DC/DC choice gains over 2% in efficiency, from 95.5% to 98%
• Choice of the optimal current (I) to operate at the top of the conversion efficiency curves (see the sketch after this slide)

[Charts: existing DC/DC conversion vs. new upgraded DC/DC conversion]
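A minimal sketch of what "choosing the optimal current" means: a converter has roughly fixed losses plus ohmic losses that grow with I², so efficiency peaks at some intermediate load current. The loss parameters below are hypothetical, chosen only to show the shape of the curve, and are not Eurotech's converter data:

# Hypothetical DC/DC converter model: fixed losses + ohmic losses growing with current.
V_OUT = 1.0        # output voltage (V), illustrative
P_FIXED = 0.1      # fixed losses (W), e.g. gate drive and control (assumed)
R_SERIES = 0.0005  # equivalent series resistance (ohm), assumed

def efficiency(i_amps: float) -> float:
    p_out = V_OUT * i_amps
    p_loss = P_FIXED + R_SERIES * i_amps ** 2
    return p_out / (p_out + p_loss)

# Scan the load current and pick the point at the top of the efficiency curve.
currents = [i * 0.5 for i in range(1, 201)]   # 0.5 A .. 100 A
best_i = max(currents, key=efficiency)
print(f"Peak efficiency {efficiency(best_i):.3f} at about {best_i:.1f} A")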

Page 10: Aurora hpc energy efficiency

Water cooling and efficiency
178 nodes – AMD Opteron 6128HE CPUs (Magny Cours), 16 GB RAM. Measurements taken by LRZ.

• With air cooling, the CPUs operate at about 5°C below the maximum case temperature

• Normal operation of a water-cooled server is with water at 20°C, which is about 40°C below the maximum case temperature

Page 11: Aurora hpc energy efficiency

Water cooling = no fans, low noise

• Fans consume about 20 Watt per node in «normal» operation!
• This is roughly 5% of peak power – per se a small contribution, but the SUM of all the contributions described gives a considerable positive delta in energy efficiency (see the sketch after this list)
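A back-of-the-envelope sum of that contribution. The 20 W per node comes from this slide; the 128-node rack size is reused from the scalability slide later in the deck, and the 15.8 cents/kWh price from the LRZ quote at the start, so the rack size and price are borrowed assumptions rather than figures from this slide:

fan_power_per_node_w = 20    # from this slide
nodes_per_rack = 128         # rack size used in the scalability slide
hours_per_year = 24 * 365
price_per_kwh = 0.158        # EUR, from the LRZ quote at the start of the deck

fan_energy_kwh = fan_power_per_node_w * nodes_per_rack * hours_per_year / 1000
print(f"Fan energy per rack-year: {fan_energy_kwh:,.0f} kWh")                  # ~22,400 kWh
print(f"Electricity avoided: {fan_energy_kwh * price_per_kwh:,.0f} EUR/year")  # ~3,500 EUR per rack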

Page 12: Aurora hpc energy efficiency

Some efficiency results
Measurements taken on a single Aurora Tigon node card

Intel Xeon E5 + Nvidia K20x: peak efficiency 3.57 Gflop/s per Watt

Page 13: Aurora hpc energy efficiency

Some efficiency results
Measurements taken on a single Aurora Tigon node card

Intel Xeon E5 + Nvidia K20x: peak efficiency 3.57 Gflop/s per Watt

Page 14: Aurora hpc energy efficiency

Some efficiency results
Measurements taken on a single Aurora Tigon node card

Intel Xeon E5 + Nvidia K20x: peak efficiency 3.63 Gflop/s per Watt

Page 15: Aurora hpc energy efficiency

Some efficiency results: scalability
• Single-node analysis gives the settings to find the most efficient working point

• Scalability over a rack (128 nodes) is 90%

• Rmax measured over a 128-node system: 215 Tflop/s

• Efficiency over a 128-node system: 3.2 Gflop/s per Watt (see the check after this list)

Configuration: 128 node cards, each with 2x Nvidia K20 GPUs, 2x Intel Xeon E5-2687W CPUs and 16 GB of RAM. Direct water cooling, water temperature 19 ± 1 °C.
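A quick consistency check of the rack-level numbers: Rmax, the rack efficiency and the scaling figure are from this slide, and the single-node peak efficiency from the preceding slides; the implied power draw is simply derived from them:

rmax_tflops = 215          # measured over the 128-node system
rack_gflops_per_w = 3.2    # measured rack-level efficiency
node_gflops_per_w = 3.57   # single-node peak efficiency from the earlier slides

# Implied power draw of the 128-node system during the Rmax run.
power_kw = rmax_tflops * 1e12 / (rack_gflops_per_w * 1e9) / 1e3
print(f"Implied system power: {power_kw:.0f} kW")                 # ~67 kW

# Rack efficiency relative to a single node, i.e. the ~90% scalability figure.
print(f"Scaling: {rack_gflops_per_w / node_gflops_per_w:.0%}")    # ~90%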

Page 16: Aurora hpc energy efficiency

ENERGY EFFICIENCY: GREEN 500 TOP POSITION

Page 17: Aurora hpc energy efficiency

Number 1 in Green 500 list

The Eurora supercomputer, built on the same Aurora Tigon architecture, is installed at Cineca.

The energy efficiency measured on Eurora following the Green500 methodology was 3.21 GFlops/Watt. This result placed Eurora in first place on the Green500 list.

Eurora supercomputer

Page 18: Aurora hpc energy efficiency

The world's most efficient architecture

The measurements were taken with a calibrated power meter while the system was running a customized version of LINPACK

Eurora supercomputer

System: Eurora supercomputer – 64 nodes, 128 CPUs, 128 GPUs
Node card: Intel Xeon E5-2687W (150 W), 2x NVIDIA Tesla K20, 1x InfiniBand QDR
Ambient temperature: 20°C ± 1°C
Coolant temperature: 19°C ± 1°C
Coolant: water
Flow rate: 120 l/h ± 7 l/h per Eurora board

Page 19: Aurora hpc energy efficiency

DATA CENTER EFFICIENCY: REDUCE HOUSE LOAD

Page 20: Aurora hpc energy efficiency

Efficiency and economics - Energy use in data centers

Data from APC

Page 21: Aurora hpc energy efficiency

Efficiency and economics - “typical” power breakdown in datacenters

Data from APC

Page 22: Aurora hpc energy efficiency

Reducing cooling energy

Ways to reduce cooling energy consumption:
• Air cooling optimization (hot and cold aisle containment, …)
• Free cooling: avoid compressor-based cooling (chillers) by using cold air coming from outside the data center. Possible only in cold climates or seasonally
• Free cooling with heat exchangers (dry coolers). Dry coolers consume much less energy than chillers!
• Liquid cooling to increase the cooling efficacy and reduce the power absorbed by chillers
• Liquid cooling with free cooling: the liquid is cooled not by chillers but by dry coolers
• Hot liquid cooling allows the use of dry coolers all year round, also in warm climates
• Liquid cooling using a natural source of …
• Alternative approaches: spray cooling, oil submersion cooling

Eurotech Aurora approach:
• Direct hot water cooling, with no chillers but only dry coolers

Page 23: Aurora hpc energy efficiency

Aurora liquid cooling infrastructure

[Diagram: internal cooling loops (#1 … #12) connected through a heat exchanger to an external circuit with pump, filter and dry cooler]

Page 24: Aurora hpc energy efficiency

[Diagram: loop #1 (internal) and loop #2 (external) with bypass heater, chillers and dry coolers]

Pumps consume energy, but they can control the flow rate. Increasing the flow rate is much less energy demanding than switching on a chiller.

Page 25: Aurora hpc energy efficiency

Advantages of the Eurotech approach

Hot liquid cooling, no chillers: save energy
• Avoid/limit expensive and power-hungry chillers with the only cooling method that almost always requires only dry coolers
• Minimize PUE and hence maximize energy cost savings
• Reuse thermal energy for heating, air conditioning, electrical energy or industrial processes
• «Clean» free cooling: no dust, no filters needed to filter external air

Direct liquid cooling via cold plates: effective cooling
• Allows very limited heat spillage
• Maximizes the effectiveness of cooling, allowing hot water to be used (up to 55 °C inlet water)

Comprehensive: more efficiency
• Cools any source of heat in the server (including the power supply)

Page 26: Aurora hpc energy efficiency

Optimize power conversion

Data from Intel

Standard power distribution steps

Page 27: Aurora hpc energy efficiency

Moving towards DC reduces steps in power conversion

Data from Intel

Page 28: Aurora hpc energy efficiency

Aurora power distribution

[Diagram: 230 V mains (optional UPS) → 48 Vdc distribution (97% conversion efficiency) → 10 V on the node card (98% conversion efficiency)]

Page 29: Aurora hpc energy efficiency

Aurora results

• Use of telecom technology: low cost, high reliability, low level of maintenance, easy to control

• Power conversion with a high level of redundancy and very high efficiency at EVERY load of the computer

• The AC/DC conversion efficiency increased from 96% to close to 97%

• 11 kW telecom rectifier shelves of 1U, digitally controlled

• Power distributed over the backplane at 54 V, minimizing Ohmic losses (see the sketch after this list)
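A minimal sketch of why distributing at 54 V keeps Ohmic losses small: for a given power, current scales as 1/V and resistive loss as I²R, so losses fall with the square of the distribution voltage. The backplane power, resistance and the 12 V comparison point below are hypothetical values chosen only to show the scaling:

def backplane_loss_w(power_w: float, volts: float, resistance_ohm: float) -> float:
    """Resistive loss P = I^2 * R for a given delivered power and distribution voltage."""
    current = power_w / volts
    return current ** 2 * resistance_ohm

POWER_W = 10_000   # power delivered over the backplane (assumed)
R_OHM = 0.001      # backplane + connector resistance (assumed)

for volts in (12, 54):
    loss = backplane_loss_w(POWER_W, volts, R_OHM)
    print(f"{volts:>2} V distribution: {loss:6.1f} W lost ({loss / POWER_W:.1%})")
# 12 V: ~694 W (6.9%)   54 V: ~34 W (0.3%)  ->  (54/12)^2, about 20x lower loss at 54 V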

Page 30: Aurora hpc energy efficiency

ADDITIONAL GREEN! THERMAL ENERGY RECOVERY

Page 31: Aurora hpc energy efficiency

Three-stage cooling + heat recovery

[Diagram: computing system racks 1 … #n on an internal water loop (20°C in, 30°C out, 25°C intermediate), coupled through liquid-to-liquid heat exchangers; of 1 MW of IT heat, 0.87 MW is delivered at 55°C for thermal energy re-use and 0.13 MW is rejected at 30°C]

Minimize waste: thermal energy re-use

PUE < 1 !!
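The "PUE < 1" claim makes sense once the recovered heat is credited back; formally this is captured by the Green Grid's ERE (Energy Reuse Effectiveness) metric rather than PUE itself. A sketch using the 1 MW and 0.87 MW figures from the diagram above, with the cooling-and-distribution overhead below assumed purely for illustration:

it_power_mw = 1.0    # IT heat load from the diagram
reused_mw = 0.87     # heat recovered at 55 degC for re-use (from the diagram)
overhead_mw = 0.06   # pumps, dry coolers and distribution losses (assumed, ~PUE 1.06)

pue = (it_power_mw + overhead_mw) / it_power_mw
ere = (it_power_mw + overhead_mw - reused_mw) / it_power_mw

print(f"PUE = {pue:.2f}")   # 1.06
print(f"ERE = {ere:.2f}")   # 0.19 -- effectively "PUE < 1" once heat re-use is credited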

Page 32: Aurora hpc energy efficiency

Minimize waste: thermal energy re-use

• The ability to effectively re-use the waste heat from the outlets increases with higher temperatures.

• Outlet temperatures starting from 45°C can be used to heat buildings, temperatures starting from 55°C can be used to drive adsorption chillers.

• Higher temperatures may even allow for trigeneration, the combined production of electricity, heating and cooling

• Warm water can also be used in industrial processes

Page 33: Aurora hpc energy efficiency

Thermal energy recovery and swimming pools

Swimming pool: 50 m long, 4 lanes, 2 m deep, which loses 2°C per day if not heated. The heat exchange system has 90% efficiency.

Volume of water = 2.50 m x 4 x 50 m x 2 m = 1000 m^3 = 10^6 litres = 10^6 kg
Water specific heat c = 4186 J/(kg K)
Water target temperature = 28°C

How much power do we need to keep the swimming pool at 28°C?

P(W) = Q(J)/t(s) = m(kg) x c(J/(kg K)) x ΔT(K) / t(s) = 10^6 kg x 4186 J/(kg K) x 2 K / (24 x 60 x 60 s) = 96,900 W = 96.9 kW

So we need a supercomputer generating roughly 110 kW of heat (96.9 kW divided by the 90% heat-exchange efficiency). Assuming an energy efficiency of 900 Mflop/s per Watt, to heat the swimming pool we would need to install a roughly 100 Tflop/s system. This is, for instance, one Eurotech Aurora HPC 10-10 rack.
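The same arithmetic as a small Python check, using only the figures from this slide (pool dimensions, 2°C/day, 90% heat-exchange efficiency, 900 Mflop/s per Watt):

# Rough check of the swimming-pool heating example (all figures from the slide).
volume_m3 = 2.5 * 4 * 50 * 2   # lane width x lanes x length x depth = 1000 m^3
mass_kg = volume_m3 * 1000     # 1 m^3 of water is about 1000 kg
c_water = 4186                 # J/(kg K), specific heat of water
delta_t = 2                    # K lost per day if not heated
seconds_per_day = 24 * 60 * 60

heating_power_w = mass_kg * c_water * delta_t / seconds_per_day
print(f"Heating power needed: {heating_power_w / 1e3:.1f} kW")     # ~96.9 kW

hx_efficiency = 0.9            # heat-exchange system efficiency
waste_heat_needed_w = heating_power_w / hx_efficiency
print(f"Waste heat required: {waste_heat_needed_w / 1e3:.0f} kW")  # ~108 kW, i.e. roughly 110 kW

flops_per_watt = 900e6         # 900 Mflop/s per W
system_flops = waste_heat_needed_w * flops_per_watt
print(f"System size: {system_flops / 1e12:.0f} Tflop/s")           # ~97, i.e. roughly 100 Tflop/s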

Page 34: Aurora hpc energy efficiency

IMPACT ON TOTAL COST OF OWNERSHIP

Page 35: Aurora hpc energy efficiency

Comparison - investment

Investment (K US dollars)            Datacenter A   Datacenter B   Datacenter C
Servers                              $6,200         $6,200         $6,200
Network and other IT                 $440           $440           $440
Building                             $1,260         $540           $360
Racks                                $280           $120           $60
Cooling                              $2,670         $3,060         $1,660
Electrical                           $3,570         $3,570         $2,420
TOTAL INVESTMENT                     $14,420        $13,930        $11,140

Data center A – PUE 2.2: medium density (20 kW per rack), air cooled
Data center B – PUE 1.6: high density (50 kW per rack), optimized air cooling, rear-door liquid cooling
Data center C – PUE 1.05: high density (87 kW per rack), direct hot liquid cooling, floating Tamb

Page 36: Aurora hpc energy efficiency

Comparison – annualized TCO

Annual cost (K US dollars)                        Datacenter A   Datacenter B   Datacenter C
Cost of energy                                    $1,970         $1,430         $640
Retuning and additional CFD                       $6             $3             $0
Total outage cost                                 $270           $270           $230
Preventive maintenance                            $150           $150           $150
Annual facility and infrastructure maintenance    $310           $290           $140
Lighting                                          $5             $3             $2
Annualized 3-year capital costs                   $2,040         $2,000         $1,980
Annualized 10-year capital costs                  $880           $940           $540
Annualized 15-year capital costs                  $130           $60            $40
ANNUALIZED TCO                                    $5,761         $5,146         $3,722

Data center A – PUE 2.2: medium density (20 kW per rack), air cooled
Data center B – PUE 1.6: high density (50 kW per rack), optimized air cooling, rear-door liquid cooling
Data center C – PUE 1.05: high density (87 kW per rack), direct hot liquid cooling, floating Tamb
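The "cost of energy" row above comes from a full TCO model; a minimal sketch of just the part that PUE drives is shown below. The IT load and electricity price are assumed values for illustration and are not taken from the comparison, so the outputs only indicate how the energy bill scales with PUE:

def annual_energy_cost_kusd(it_load_kw: float, pue: float, usd_per_kwh: float) -> float:
    """Annual electricity bill, in thousands of USD, for a given IT load and PUE."""
    kwh_per_year = it_load_kw * pue * 24 * 365
    return kwh_per_year * usd_per_kwh / 1000

IT_LOAD_KW = 1000    # assumed IT load
USD_PER_KWH = 0.10   # assumed electricity price

for name, pue in (("A", 2.2), ("B", 1.6), ("C", 1.05)):
    cost = annual_energy_cost_kusd(IT_LOAD_KW, pue, USD_PER_KWH)
    print(f"Datacenter {name} (PUE {pue}): ${cost:,.0f}K per year")
# A: ~$1,927K  B: ~$1,402K  C: ~$920K (the table's own figures also fold in other model assumptions)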

Page 37: Aurora hpc energy efficiency

Main areas of impact on TCO and links to sustainability

Energy savings: lower cost due to less energy consumed – sustainability impact: high
Space savings: savings in real estate, racks, electrical, cooling and network – sustainability impact: high
Reliability: savings in downtime indirect cost and maintenance – sustainability impact: medium

Page 38: Aurora hpc energy efficiency

Green effects

1 Petaflop/s installation – CO2 savings with water cooling compared to air

~28,000 tons of CO2 saved in 5 years. Equivalent to:
• 3,800 cars that do not circulate for 1 year
• 30,100 adult trees saved
• 40 km² of rain forest left untouched