
Microelectronic Systems Laboratory (LSM)
Swiss Federal Institute of Technology (EPFL)
CH-1015 Lausanne, Switzerland
http://lsm.epfl.ch

3D STACKED MULTI-CORE PROCESSOR PLATFORM WITH IMPROVED TESTABILITY

DATE 2012

P. Giovannini, G. Beanato, A. Cevrero, P. Athanasopoulos, Y. Leblebici

tel: +41 21 693 6955
fax: +41 21 693 6959


First homogeneous architecture for a 3D-integrated multi-core processor formed by identical known-good-die (KGD) chips

ABSTRACT

An innovative modular 3D stacked multi-processor architecture is presented. The platform is composed of identical stacked dies connected by TSVs. Each die features four 32-bit processors and associated memory modules, interconnected by a 3D NoC capable of routing packets in the vertical direction. Homogeneous integration minimizes design effort and manufacturing costs while ensuring high flexibility and reconfigurability. By selecting the appropriate number of layers, the platform can target different market segments, being usable as a stand-alone chip or in a 3D stacked fashion. Fully functional samples have been fabricated using a conventional UMC 90 nm CMOS process and stacked using a Via-Last Cu-TSV process developed in-house at EPFL-LSM. Initial results show a target operating frequency of 400 MHz, supporting a vertical data bandwidth of 3.2 Gbps.

3D MODULAR MULTI-CORE ARCHITECTURE

• Increased core count.
• Reduced die area and form factor.
• Improved core-to-core communication.
• No additional design effort.
• Re-usability of the platform.
• Simple pre-bond testing.
• On-chip TSV yield measurements.

Processing Element:
• LEON III 32-bit RISC processor from Gaisler;
• 8 KB I-Cache, 32 KB RAM/ROM.

Peripheral Subsystem with a 32 KB shared data memory regulated by a semaphore system.

Network-on-Chip (switch and Network Interfaces, NIs).

One individual clock domain per layer (PLLs).

Data synchronization at the interface (dual-clock FIFOs).

Layer identification signal generated post-manufacturing.

Platform adaptable for stand-alone 2D-CMP, homogeneous 3D-CMP and further heterogeneous stacking on demand.

TSV redundancy:
• 2 TSVs for data signals;
• 3 TSVs for Clock, Layer-ID, Reset.

Total number of TSVs    120
Power TSVs              54
Signal TSVs             66
TSV size                40 × 50 µm
TSV capacitance         1 pF
TSV resistance          0.7 Ω
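As an illustration of why the redundancy above helps, the sketch below computes the probability that a signal keeps at least one working TSV. It reads the redundancy figures as the number of parallel TSVs per signal and assumes that TSV failures are independent; the per-TSV yield value is hypothetical and is not a measurement from this work.

```python
def signal_yield(tsv_yield: float, copies: int) -> float:
    """Probability that at least one of `copies` redundant TSVs works.

    Assumes independent TSV failures (an illustrative assumption,
    not a model taken from the poster).
    """
    if not 0.0 <= tsv_yield <= 1.0:
        raise ValueError("tsv_yield must be in [0, 1]")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The signal fails only if every redundant TSV fails.
    return 1.0 - (1.0 - tsv_yield) ** copies


if __name__ == "__main__":
    p = 0.95  # hypothetical per-TSV yield, chosen only for illustration
    for name, copies in (("data signal", 2), ("clock / layer-ID / reset", 3)):
        print(f"{name}: {copies} TSVs -> signal yield {signal_yield(p, copies):.6f}")
```

Under these assumptions, the third TSV on clock, layer-ID and reset buys an extra order of magnitude of protection for the signals whose failure would disable a whole layer.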

TEST CHIP AND EXPERIMENTAL RESULTS

Process technology          UMC 90 nm CMOS
Die size                    4 mm²
Core footprint              800 × 1650 µm
Max. operating frequency    400 MHz
Vertical data bandwidth     3.2 Gbps
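The frequency and bandwidth figures can be cross-checked with a short calculation. The sketch below divides the reported vertical bandwidth by the reported clock frequency; the resulting number of bits per cycle is an inference from these two numbers, not a link width stated on the poster.

```python
def bits_per_cycle(bandwidth_bps: float, clock_hz: float) -> float:
    """Bits transferred per clock cycle for a given bandwidth and clock."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return bandwidth_bps / clock_hz


if __name__ == "__main__":
    bandwidth = 3_200_000_000  # 3.2 Gbps, as reported
    clock = 400_000_000        # 400 MHz, as reported
    print(bits_per_cycle(bandwidth, clock))
```

If the 3.2 Gbps figure refers to a single vertical link clocked at 400 MHz, this corresponds to 8 bits transferred per cycle.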

3D TESTABILITY

Pre- and post-bonding testability is ensured by the homogeneous architecture:
• Private JTAG modules for each Processing Element and Peripheral;
• JTAG multiplexer interface for debug signal management.

Parallel JTAG access to all cores of a layer before stacking, directly through the I/O pads.

Serial JTAG testing (boundary scan chain) of the cores of the 3D system that are no longer directly accessible after stacking (bottom layers).
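To give a feel for the cost of serial access, the sketch below estimates the length of a data-register scan when one TAP in a daisy chain is addressed and all other TAPs are placed in BYPASS, which contributes one bit per TAP under IEEE 1149.1. The number of TAPs per layer and the number of layers are hypothetical parameters; the poster does not state the exact chain composition.

```python
IDCODE_BITS = 32  # length of the IEEE 1149.1 device identification register


def taps_in_stack(layers: int, taps_per_layer: int) -> int:
    """Total number of TAPs in a chain spanning all layers (assumed layout)."""
    if layers < 1 or taps_per_layer < 1:
        raise ValueError("layers and taps_per_layer must be at least 1")
    return layers * taps_per_layer


def dr_scan_length(taps_in_chain: int, target_dr_bits: int = IDCODE_BITS) -> int:
    """Bits shifted in one DR scan when a single TAP is selected.

    Every other TAP in the chain sits in BYPASS and adds exactly one bit.
    """
    if taps_in_chain < 1:
        raise ValueError("taps_in_chain must be at least 1")
    return target_dr_bits + (taps_in_chain - 1)


if __name__ == "__main__":
    # Hypothetical example: 2 layers, 4 processor TAPs per layer.
    chain = taps_in_stack(layers=2, taps_per_layer=4)
    print(dr_scan_length(chain))
```

The overhead grows by only one bit per additional TAP, which is why a single serial chain remains practical for reaching the buried layers.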

COPPER TSVs AND REDUNDANCY LOGIC

TSV macro:
• 4 pads for each signal;
• ESD protection;
• Logic for boot test and yield statistics collection.

[Figure: copper TSV matrix fabricated on the test chip (in-house process)]

Testing protocol definition:
• Check core ID codes
• Check private ROM content
• R/W private and shared RAM
• Binary download & execution
• Establish scan chain of multiple cores
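The protocol steps listed above can be sketched as an ordered sequence of checks. The runner below is a minimal illustration, not the authors' test software: the step callables are hypothetical placeholders, and stopping at the first failure is an assumed policy, motivated by the fact that later steps (such as binary download) rely on earlier ones (such as working RAM).

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]

# Step names follow the poster; the checks themselves are not specified there.
STEP_NAMES = [
    "Check core ID codes",
    "Check private ROM content",
    "R/W private and shared RAM",
    "Binary download & execution",
    "Establish scan chain of multiple cores",
]


def run_protocol(steps: List[Step]) -> List[Tuple[str, bool]]:
    """Run the steps in order and stop after the first failing step."""
    results: List[Tuple[str, bool]] = []
    for name, check in steps:
        ok = bool(check())
        results.append((name, ok))
        if not ok:
            break  # later steps depend on this one, so do not continue
    return results


if __name__ == "__main__":
    # Placeholder checks that always pass; real checks would talk to the debugger.
    demo_steps = [(name, lambda: True) for name in STEP_NAMES]
    for name, ok in run_protocol(demo_steps):
        print(f"{'PASS' if ok else 'FAIL'}  {name}")
```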

Complete 3D system validation on FPGA emulation.

Pre-bond verification on single fabricated dies using:
• OpenOCD as software debugger;
• A complex PCB and FPGA setup integrated on a probe station.