Transcript
National Institute of Advanced Industrial Science and Technology (AIST)
Special session on challenges and opportunities of integrated photonics in future datacenters
ACSI 2015, Tsukuba, 27 Jan 2015
Expectations for Optical Networks from the Viewpoint of System Software Research
OUTLINE
• Trends in datacenter research and development
• AIST IMPULSE Project
• Workload analysis
• Proposed architecture: dataflow-centric computing
INTRODUCTION
• BigData is a killer app in datacenters; it requires a clean-slate architecture such as "disaggregation" or a "datacenter in a box"
• OPTICAL NETWORKS ARE KEY
• Optical path network: an all-optical path between end and end in a datacenter. Pros: huge bandwidth, energy efficiency. Cons: path-switching latency, utilization
• To take advantage of optical path networks, a new datacenter OS is essential. Key idea: control/data plane separation
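The pros and cons above can be put into a toy break-even model: an optical path pays a one-time switching cost but then moves data at much higher bandwidth. This sketch is illustrative, not from the talk; all numbers (100 Gb/s electrical links, 5 hops, 1 ms optical path setup, 1 Tb/s optical bandwidth) are assumptions.

```python
# Toy model: electrical packet network vs. all-optical path (assumed numbers).

def packet_time(size_bytes, bw_bps=100e9, per_hop_latency=1e-6, hops=5):
    """Packet-switched network: per-hop latency accumulates, modest bandwidth."""
    return hops * per_hop_latency + size_bytes * 8 / bw_bps

def optical_time(size_bytes, bw_bps=1e12, setup_latency=1e-3):
    """Optical path: one-time path-switching cost, then huge bandwidth."""
    return setup_latency + size_bytes * 8 / bw_bps

for size in (1e3, 1e6, 1e9):  # 1 KB, 1 MB, 1 GB transfers
    print(f"{size:>10.0e} B  packet {packet_time(size):.2e} s  "
          f"optical {optical_time(size):.2e} s")
```

Under these assumptions small transfers favor the packet network (the path-setup latency dominates) while bulk transfers favor the optical path, which is exactly the utilization trade-off the slide names.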
OPTICAL NETWORKS IN DCs
• Similar concepts: several projects have been launched recently: Open Compute Project (Facebook), Rack Scale Computing (Intel), Extremely Shrinking Computing (IBM), The Machine (HP), FireBox (UCB), CTR Consortium (MIT)
• Optical networks, including photonic-electronic convergence and short-reach interconnects, are key to driving innovation in future datacenters
Architecture (Facebook cluster design)
• Front-End Cluster: Web (250 racks), Ads (30 racks), Cache (~144 TB), Multifeed (9 racks), other small services
• Service Cluster: Search, Photos, Msg, others
• Back-End Cluster: UDB, ADS-DB, Tao Leader
("Flash at Facebook", Flash Summit 2013)
Standard Systems: Five Standard Servers
• I Web: CPU high (2× E5-2670); memory low; disk low; services: Web, Chat
• III Database: CPU high (2× E5-2660); memory high (144 GB); disk high IOPS (3.2 TB flash); services: Database
• IV Hadoop: CPU high (2× E5-2660); memory medium (64 GB); disk high (15× 4 TB SATA); services: Hadoop (big data)
• V Photos: CPU low; memory low; disk high (15× 4 TB SATA); services: Photos, Video
• VI Feed: CPU high (2× E5-2660); memory high (144 GB); disk medium; services: Multifeed, Search, Ads
OPEN COMPUTE PROJECT
• OCP was founded by Facebook in April 2011 to openly share designs of datacenter products
• Shift from commodity products to USER-DRIVEN DESIGN to improve the energy efficiency of large-scale datacenters. Industry standard: PUE 1.9; Open Compute Project: PUE 1.07
• Specifications: server, storage, rack, network switch, etc.
• Products: Quanta Rackgo X, GIGABYTE datacenter solutions
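PUE (Power Usage Effectiveness) is simply total facility power divided by IT equipment power. A minimal sketch using the slide's figures (1.9 typical vs. 1.07 for OCP); the absolute kW values are illustrative assumptions, only their ratio matters.

```python
# PUE = total facility power / IT equipment power.

def pue(total_kw, it_kw):
    """Power Usage Effectiveness; 1.0 would mean zero cooling/power overhead."""
    return total_kw / it_kw

typical = pue(1900, 1000)  # 900 kW of overhead per 1 MW of IT load (assumed scale)
ocp     = pue(1070, 1000)  #  70 kW of overhead for the same IT load
print(typical, ocp)  # 1.9 1.07
```

At datacenter scale the difference is enormous: for the same IT load, a PUE-1.9 facility burns roughly 12× more overhead power than a PUE-1.07 one.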
● Dense integration w/o miniaturization
● Reduction of the wiring length for power saving
● Introduction of Ge and III-V channels by a simple stacking process
● Innovative circuits by using the Z direction
Simulations: first-principle simulation and TCAD; large-scale simulation for 3D structures, novel devices, and the latest materials; anisotropic magnetic materials
● Large-scale silicon-photonics-based cluster switches
● DWDM, multi-level modulation, highly integrated "elastic" optical interconnects
● Ultra-low-energy-consumption network by making use of optical switches
● Ultra-compact switches based on silicon photonics
● 3D integration by amorphous silicon
● A new server architecture
[Figure: current state of the art vs. current electrical switches, per-port and aggregate data rates]
Optical Network Technology for Future Datacenters
Architecture for Big Data and Extreme-scale Computing
Real-time Big Data
[Figure: datacenter OS spanning server modules and storage; optimal arrangement of the data flow; resource management / monitoring; data flow: Input → Conv. → Ana. → Output]
Data-flow-centric warehouse-scale computing:
1 - A single OS controls the entire data center
2 - Split the datacenter OS into a data plane and a control plane to guarantee real-time data processing
3 - Connect universal processors / hardware and storage by using an optical network
PERFORMANCE ESTIMATION
• Estimate the performance of typical HPC and BigData workloads on a future datacenter system
• SIMGRID SIMULATOR: a simulator of large-scale distributed systems such as grids, clouds, HPC, and P2P
http://simgrid.gforge.inria.fr
SimGrid Overview
[Figure: SimGrid stack: user code on top of the user APIs, which run on the SURF virtual platform simulator, grounded on XBT]
• MSG: simple application-level simulator
• SimDag: framework for DAGs of parallel tasks
• SMPI: library to run MPI applications on top of a virtual environment
• SURF: virtual platform simulator
• TRACE: tracing simulation
• XBT: grounding features (logging, etc.), data structures (lists, etc.) and portability
SimGrid user APIs
• If your application is a DAG of (parallel) tasks → use SimDag
• To study an existing MPI code → use SMPI
• In any other case → use MSG (easily study concurrent processes and prototype distributed applications)
(The SimGrid Team, "SimGrid User 101", slide 8/28)
SimGrid is not a Simulator
[Figure: a scenario (platform: topology, availability changes; application: deployment; applicative workload; parameters) is fed as input to an application plus simulation kernel, which together form the simulator and produce the outcomes: logs, stats, visualization]
That's a Generic Simulation Framework
WORKLOAD 1: SIMPLE MESSAGE PASSING
• Iteration of neighbor communication (bottom left)
• Big impact of increasing the link bandwidth if an application is network-intensive
[Figure: relative execution time vs. data size (bytes) and vs. compute power (FLOP); #nodes: 10,000; link bandwidth: 0.1, 1, 10 Tbps; link latency: 100 ns; CPU power: 10 TFLOPS; at large data sizes the 10 Tbps link cuts execution time to ~1/100]
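The bandwidth effect in this figure follows from a simple per-step cost model: each neighbor exchange costs latency plus size over bandwidth. A minimal sketch using the slide's parameters (100 ns latency; 0.1, 1, 10 Tbps links); the 1 GB per-step message size is an assumption chosen to reach the bandwidth-dominated regime.

```python
# Workload 1 cost model: per-step time = link latency + message size / bandwidth.

LATENCY = 100e-9  # 100 ns, per the slide

def step_time(msg_bytes, bw_bps):
    """Time for one neighbor-exchange step over a single link."""
    return LATENCY + msg_bytes * 8 / bw_bps

for bw in (0.1e12, 1e12, 10e12):
    t = step_time(1e9, bw)  # assume 1 GB exchanged per step
    print(f"{bw / 1e12:>4.1f} Tbps: {t * 1e3:.3f} ms/step")
```

With messages this large the latency term is negligible, so the 100× bandwidth spread (0.1 → 10 Tbps) translates almost directly into the ~1/100 execution time seen in the figure.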
WORKLOAD 2: HPC APPLICATION
• NAS Parallel Benchmarks, 256 procs, class C
– Low latency is more important than huge bandwidth
– The problem size is too small to utilize huge bandwidth
[Figure: relative execution time; effect of reducing the link latency (CPU power 1 TFLOPS, link bandwidth 1 Tbps) vs. effect of increasing the link bandwidth (CPU power 1 TFLOPS, link latency 0.1 µs)]
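Why workload 2 is latency-bound can be seen with the same cost model: for small messages the latency term dwarfs the bandwidth term, so raising bandwidth barely helps while cutting latency helps almost linearly. The 8 KB message size below is an assumption standing in for the benchmark's small exchanges; latencies and bandwidths follow the figure.

```python
# Small-message regime: latency dominates size/bandwidth.

def msg_time(size_bytes, latency_s, bw_bps):
    return latency_s + size_bytes * 8 / bw_bps

small = 8 * 1024  # 8 KB message (assumed typical of the benchmark)

base     = msg_time(small, 1e-6,   1e12)   # 1 us latency, 1 Tbps
more_bw  = msg_time(small, 1e-6,   10e12)  # 10x the bandwidth: little gain
less_lat = msg_time(small, 0.1e-6, 1e12)   # 1/10 the latency: big gain
print(base, more_bw, less_lat)
```

Here a 10× bandwidth boost shaves only a few percent off the message time, while a 10× latency reduction cuts it several-fold, matching the slide's conclusion.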
WORKLOAD 3: MAPREDUCE
• KDD Cup 2012 Track 2: predict the click-through rate of ads using Hadoop
• Machine learning is CPU-intensive
• The effect of huge bandwidth is limited because:
– The concurrency of the model used is not enough
– Hadoop is optimized to make jobs run faster on current I/O devices
[Figure: execution time (seconds) vs. disk I/O bandwidth (Mbps) and vs. relative CPU power (base: 172 GFLOPS); CPU power: 17.2 TFLOPS; disk bandwidth: 200 Mbps; network bandwidth: 10 Gbps]
DATA MOVEMENT PROBLEMS
• Current (AIST Super Green Cloud, ASGC): 128 GB memory at 14.9 GB/s (DDR3, bottleneck); logic 224×2 GFLOPS with 1 TB/s on chip; electrical I/O 8 GB/s (PCIe Gen3, bottleneck); 7 GB/s (IB FDR) off the board/chassis
• Future: 10 TB memory at 1 TB/s; logic 100 TFLOPS; optical I/O 1 TB/s in the package and 100 GB/s off the board/chassis
PARALLELIZATION INSIDE A PACKAGE
[Figure: memory (10 TB, 1 TB/s) and logic (100 TFLOPS) in one package]
• Need an efficiently parallelized structure
• On-chip network
• Interconnection
BOTTLENECKS IN MAPREDUCE
[Figure: Map tasks feed Reduce tasks through a shuffle]
1. Disk throughput
2. Network congestion
3. Serialized reduce
IN-STORAGE NETWORK PROCESSING
[Figure: mappers; shuffle and reduce]
• Hierarchical and partial reduce processing in each network node to avoid network congestion and serialized reduce
• Compute modules are attached in storage to maximize the read throughput from storage
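The hierarchical partial reduce can be sketched in a few lines: each node combines its children's partial results before forwarding, so no single reducer ever sees all raw records. This is an illustrative sketch, not the authors' code; the fan-in and the use of addition as the reduce operator are assumptions.

```python
# Hedged sketch of hierarchical partial reduce in the network fabric.

def partial_reduce(values):
    """Local combine at one node; here the reduce operator is addition."""
    return sum(values)

def hierarchical_reduce(leaves, fan_in=4):
    """Reduce level by level with a fixed fan-in, as a network switch would:
    each round shrinks the data volume by the fan-in factor, avoiding both
    congestion at the root and a single serialized reduce."""
    level = list(leaves)
    while len(level) > 1:
        level = [partial_reduce(level[i:i + fan_in])
                 for i in range(0, len(level), fan_in)]
    return level[0]

print(hierarchical_reduce(range(100)))  # same result as sum(range(100))
```

Because the operator is associative, the tree of partial reduces produces exactly the result of one flat reduce, but the traffic converging on any node is bounded by the fan-in.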
IN-STORAGE NETWORK PROCESSING (cont.)
[Figure: mappers; shuffle and reduce performed by processing components linked by an optical interconnect]
NVRAM
[Figure: grid of processing-unit + memory pairs]
• Direct optical I/O connection to non-volatile memory modules distributed on a chip
• Communication over DWDM
IN-STORAGE NETWORK PROCESSING: HARDWARE DESIGN
DIRECT MEMORY OPTICAL I/O OVER DWDM
• Assume a processor-memory embedded package with a WDM interconnect
• To fully utilize the huge I/O bandwidth realized by DWDM
• Multiple memory blocks can be sent/received simultaneously using multiple wavelengths
• Memory-centric networks are a similar idea [PACT13]
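The wavelength-parallelism claim reduces to a simple model: with W wavelengths, N memory blocks move in ceil(N/W) rounds instead of N. A toy sketch; the 25 GB/s per-wavelength rate and block counts are assumptions, not figures from the slides.

```python
import math

# Toy model of DWDM memory I/O: W wavelengths carry W blocks in parallel.

def transfer_time(n_blocks, block_bytes, wavelengths, per_lambda_bps=25e9):
    """Time to move n_blocks, one block per wavelength per round.
    per_lambda_bps is bytes/s per wavelength (assumed value)."""
    rounds = math.ceil(n_blocks / wavelengths)
    return rounds * block_bytes / per_lambda_bps

serial   = transfer_time(64, 1 << 20, wavelengths=1)   # one wavelength
parallel = transfer_time(64, 1 << 20, wavelengths=16)  # 16 wavelengths
print(serial, parallel)  # 16x fewer rounds with 16 wavelengths
```

The model makes the design point concrete: DWDM turns the memory link into a parallel bus in the wavelength dimension, so aggregate bandwidth scales with the number of wavelengths rather than with pin count.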
[Figure: single-package compute node: processor cores with cache/MMU connected to memory blocks (memory banks) through a WDM interconnect driven from a wavelength bank]
Roadmap: separated packages (2014) → 2.5D stacked package (2020) → 3D stacked package (2030), connected by an optical network
DATACENTER OS
• A single OS for datacenter-wide optimization of energy efficiency and performance
• Separation of the data plane and the control plane:
– The data plane is an application-specific library OS
– The control plane manages resources (servers, network, etc.)
[Figure: datacenter OS hosting multiple App + data plane pairs on top of the control plane]
DATACENTER OS: DATA PLANE
• Application-specific library OS
– e.g. machine learning, data store, etc.
• Mitigate the OS overhead to fully utilize high-performance devices
[Figure: App + data plane pairs running on CPU/GPU nodes, each with memory and I/O, over the control plane]
CONTROL PLANE
• Resource management
• Logical and secure resource partitioning for the data planes
• Running on the firmware
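The split between the two planes can be sketched as two small classes: the control plane owns the resource pool and sets up optical paths, while each data plane is an application-specific library OS that only touches what it was granted. All class, method, and server names here are illustrative, not from the talk.

```python
# Hedged sketch of control/data plane separation (illustrative names).

class ControlPlane:
    """Owns all resources; partitions them and establishes optical paths."""
    def __init__(self, servers):
        self.free = set(servers)
        self.paths = []

    def allocate(self, n):
        """Grant n servers to a data plane (secure partitioning)."""
        return {self.free.pop() for _ in range(n)}

    def connect(self, a, b):
        """Establish an end-to-end optical path between two components."""
        self.paths.append((a, b))

class DataPlane:
    """Application-specific library OS: runs only on its granted servers."""
    def __init__(self, name, servers):
        self.name, self.servers = name, servers

cp = ControlPlane(servers={f"srv{i}" for i in range(8)})
ml = DataPlane("machine-learning", cp.allocate(4))
store = DataPlane("data-store", cp.allocate(2))
cp.connect("machine-learning", "data-store")
print(len(cp.free))  # 2 servers left unallocated
```

The key property is that data planes never negotiate with each other: all sharing decisions and path setup go through the control plane, which matches the slide's claim that it alone manages resources.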
RELATED WORK
• Datacenter-wide resource management
– OpenStack, Apache CloudStack, Kubernetes
– Hadoop YARN, Apache Mesos
• Dataflow processing engines
– Google Cloud Dataflow
– Lambda architecture
• OS designs with separated control and data planes
– Arrakis (U. Washington)
– IX (Stanford)
SUMMARY
• New visions of future datacenters: "disaggregation" and "datacenter in a box"
• OPTICAL NETWORKS ARE KEY to realizing them
• Hardware and software CO-DESIGN is critical
• The optical path network encourages SEPARATION in a datacenter OS
– The CONTROL PLANE manages resources and establishes a path between data-processing components
Optical Interconnect: As the bandwidth demand for traditionally electrical wireline interconnects has accelerated, optics has become an increasingly attractive alternative for interconnects within computing systems. Optical communication offers clear benefits for high-speed and long-distance interconnects; relative to electrical interconnects, optics provides lower channel loss. Circuit design and packaging techniques that have traditionally been used for electrical wireline are being adapted to enable integrated optics with extremely low power. This trend has resulted in rapid progress in optical ICs for Ethernet, backplane, and chip-to-chip optical communication. ISSCC 2014 includes a 2-dimensional (12×5) optical array achieving an aggregate data rate of 600Gb/s [8.2]. Pre-emphasis using group-delay filtering extends the useful data rate of a 25Gb/s VCSEL to 40Gb/s [8.9]. Additional examples of low-power linear and non-linear equalizers tackle electronic dispersion compensation in multi-mode and long-haul cables [8.1, 8.3].
Concluding Remarks: Continuing to aggressively scale I/O bandwidth is both essential for the industry and extremely challenging. Innovations that provide higher performance and lower power will continue to be made in order to sustain this trend. Advances in circuit architecture, interconnect topologies, and transistor scaling are together changing how I/O will be done over the next decade. The most exciting and most promising of these emerging technologies for wireline I/O will be highlighted at ISSCC 2014.
Per-pin data-rate vs. year for a variety of common I/O standards.
Non-Volatile Memories (NVMs): In the past decade, significant investment has been put into emerging memories to find an alternative to floating-gate-based non-volatile memory. The emerging NVMs, such as phase-change memory (PRAM), ferroelectric RAM (FeRAM), magnetic spin-torque-transfer RAM (STT-MRAM), and resistive RAM (ReRAM), are showing potential to achieve high cycling capability and lower power per bit in read/write operations. Some commercial applications, such as cellular phones, have recently started to use PRAM, demonstrating that reliability and cost competitiveness in emerging memories are becoming a reality. Fast write speed and low read-access time are the potential benefits of these emerging memories. At ISSCC 2014, a high-density ReRAM with a buried WL access device is introduced to improve the write performance and area. The next Figure highlights how MLC NAND Flash write throughput continues to improve. While the Figure following shows no increase in NAND Flash density over the past year, recent devices are built with finer dimensions or more sophisticated 3-dimensional vertical bit cells.
PER-PIN DATA RATE OF COMMON I/O STANDARDS
High-Bandwidth Memory
(ISSCC 2014 Trends)
PROCESSOR SCALING TRENDS (2× per 1.5 years)
Data source: http://cpudb.stanford.edu
http://www.linuxfoundation.org/collaborate/workgroups/networking/kernel_flow