Architectural tricks to maximize Memory Bandwidth
Deepak Shankar, CEO, Mirabilis Design

Apr 13, 2017

Deepak Shankar
Transcript
Page 1: Architectural tricks to maximize memory bandwidth

Architectural tricks to maximize Memory Bandwidth

Deepak Shankar, CEO, Mirabilis Design

Page 2: Architectural tricks to maximize memory bandwidth

Why Focus on Memory Sub-System

• Processors have a huge number of cycles and bandwidth
  – How do you take advantage of this?
• Memory access is a major bottleneck
  – Especially in high-performance systems like multimedia and networking
• Memory access is the largest contributor to power consumption
  – Too many ACT commands (RAS, RP and RCD) will dramatically increase the power
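
To make the ACT point concrete, here is a minimal Python sketch (not from the slides) that counts row activations for a stream of (bank, row) accesses. It assumes an open-page policy. The function name `count_activates`, the example traces and the per-activate energy figure are all hypothetical placeholders; real values would come from the DRAM datasheet.

```python
def count_activates(accesses):
    """Count ACT commands for a sequence of (bank, row) accesses.

    Assumes an open-page policy: a row stays open in its bank until a
    different row in the same bank is requested.
    """
    open_row = {}  # bank -> currently open row
    activates = 0
    for bank, row in accesses:
        if open_row.get(bank) != row:
            activates += 1  # row miss: the new row must be activated
            open_row[bank] = row
    return activates


# The same eight accesses in two orderings (illustrative data only).
interleaved = [(0, 1), (0, 2)] * 4
grouped = sorted(interleaved)

ENERGY_PER_ACT_NJ = 2.0  # hypothetical placeholder, not a datasheet value

for name, trace in (("interleaved", interleaved), ("grouped", grouped)):
    n = count_activates(trace)
    print(name, n, "ACTs,", n * ENERGY_PER_ACT_NJ, "nJ")
```

Under these assumptions the grouped ordering needs far fewer activations than the interleaved one for the same data, which is the effect the slide warns about.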

Page 3: Architectural tricks to maximize memory bandwidth

Reports

Page 4: Architectural tricks to maximize memory bandwidth

Introduction

• Importance of improving Memory Performance

• Addressing challenges with Architecture Level Memory explorations

• Need for Performance vs. Power trade-off analysis

• Effect of memory addressing scheme on performance

Page 5: Architectural tricks to maximize memory bandwidth

About Mirabilis Design

• Provider of system-level architecture exploration solutions for electronics and semiconductors

• Platform to conduct power-performance trade-offs, hardware-software partitioning and topology design

• VisualSim: modeling and simulation software

• Based in Silicon Valley with experts in system modeling and architectures

• Largest source of system modeling library with embedded timing, functionality and power

Page 6: Architectural tricks to maximize memory bandwidth

Explore/Simulate a Memory System

• Key attributes
  – DRAM datasheet
  – Memory Controller attributes
  – Connected Bus topology
  – Workloads including rate, size, command and back pressure
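
As a rough illustration of the workload attributes listed above, the sketch below describes each traffic source by rate, size and command mix, then checks whether the combined demand exceeds what the memory can sustain. The `Workload` class, its field names and all numbers are assumptions made for this example, not VisualSim parameters.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """One traffic source: request rate, request size and command mix."""
    name: str
    requests_per_sec: float
    bytes_per_request: int
    read_fraction: float  # share of requests that are reads (0.0 to 1.0)


def offered_load(workloads):
    """Total bytes per second the workloads ask the memory to move."""
    return sum(w.requests_per_sec * w.bytes_per_request for w in workloads)


def back_pressure(workloads, sustained_bytes_per_sec):
    """True when the offered load exceeds what the memory can sustain."""
    return offered_load(workloads) > sustained_bytes_per_sec


# Hypothetical traffic sources, for illustration only.
workloads = [
    Workload("video", 2e6, 64, 0.5),
    Workload("network", 10e6, 64, 0.7),
]
print(offered_load(workloads))          # bytes per second requested
print(back_pressure(workloads, 1e9))    # memory keeps up
print(back_pressure(workloads, 0.5e9))  # requests start to queue
```

A statistical model would add queueing and latency on top of this, but even the simple demand-versus-capacity check shows which workloads push the system into back pressure.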

Page 7: Architectural tricks to maximize memory bandwidth

Statistical Memory Model for Performance Analysis

Page 8: Architectural tricks to maximize memory bandwidth

Challenges in Memory Usage

• Product
  – Multimedia, Networking, HPC, Avionics
• Situation
  – Using an off-the-shelf Processor, FPGA or SoC
• Challenge
  – What will be the performance and power consumption for my use-cases?
• Metrics
  – Power per frame or packet
  – Latency from sensor input to HDMI output
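
The two metrics can be written down as simple arithmetic. In the sketch below, "power per frame" is interpreted as energy per frame (average power divided by frame rate), and end-to-end latency as the sum of per-stage latencies. That interpretation, the function names, the stage names and every number are assumptions for illustration.

```python
def energy_per_frame_joules(avg_power_watts, frames_per_sec):
    """Energy spent on one frame: average power divided by frame rate."""
    return avg_power_watts / frames_per_sec


def end_to_end_latency_us(stage_latencies_us):
    """Latency from first stage to last, assuming the stages run in series."""
    return sum(stage_latencies_us.values())


# Hypothetical pipeline from sensor input to HDMI output.
stages_us = {"sensor": 200, "memory": 350, "processing": 900, "hdmi": 150}

print(energy_per_frame_joules(1.5, 30))  # joules per frame
print(end_to_end_latency_us(stages_us))  # microseconds
```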

Page 9: Architectural tricks to maximize memory bandwidth

Opportunities in Memory Usage

• Vary the data sizes
• Memory configuration
• Ordering of tasks in the use-case
• Multiple Masters making asynchronous requests to memory: addresses
• Task and data distribution across multi-core
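
One way to see why addresses matter when several masters share a memory is to compare two address mappings on the same request stream. The sketch below is an assumption-laden illustration: the geometry (8 banks, 4096 rows, 1024 columns), both mapping functions and the two-master trace are hypothetical, and an open-page policy is assumed.

```python
BANKS, ROWS, COLS = 8, 4096, 1024  # hypothetical geometry, in words


def map_row_bank_col(addr):
    """Bank bits sit just above the column bits (row:bank:column)."""
    col = addr % COLS
    bank = (addr // COLS) % BANKS
    row = addr // (COLS * BANKS)
    return bank, row, col


def map_bank_row_col(addr):
    """Bank bits are the top address bits (bank:row:column)."""
    col = addr % COLS
    row = (addr // COLS) % ROWS
    bank = addr // (COLS * ROWS)
    return bank, row, col


def row_misses(addresses, mapping):
    """Count accesses that need a new row opened (open-page policy)."""
    open_row = {}
    misses = 0
    for addr in addresses:
        bank, row, _ = mapping(addr)
        if open_row.get(bank) != row:
            misses += 1
            open_row[bank] = row
    return misses


# Two masters stream through adjacent 1024-word buffers; requests alternate.
master_a = range(0, 16)
master_b = range(COLS, COLS + 16)
trace = [addr for pair in zip(master_a, master_b) for addr in pair]

print("row:bank:column misses:", row_misses(trace, map_row_bank_col))
print("bank:row:column misses:", row_misses(trace, map_bank_row_col))
```

With the first mapping the two buffers land in different banks and each row is opened once. With the second they share a bank and evict each other's row on every access.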

Page 10: Architectural tricks to maximize memory bandwidth

Full System Analysis

Page 11: Architectural tricks to maximize memory bandwidth

Processor Performance

Page 12: Architectural tricks to maximize memory bandwidth

Challenges in Memory System Design

• SoC interface to memory
• AXI bus and NoC topology to minimize the overhead for each Master
• Single vs. dual channels
• Memory controller algorithm
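
For the single vs. dual channel question, a first-order comparison is the theoretical peak bandwidth: transfers per second times bus width times channel count. The helper name `peak_bandwidth_bytes` is hypothetical, and the example uses a 64-bit interface at 1600 MT/s (DDR3-1600) purely as an illustration. Sustained bandwidth will be lower once refresh, row misses and bus turnaround are included.

```python
def peak_bandwidth_bytes(transfers_per_sec, bus_width_bits, channels=1):
    """Theoretical peak bandwidth in bytes per second."""
    return transfers_per_sec * bus_width_bits * channels // 8


single = peak_bandwidth_bytes(1_600_000_000, 64)
dual = peak_bandwidth_bytes(1_600_000_000, 64, channels=2)

print("single channel:", single / 1e9, "GB/s")
print("dual channel:  ", dual / 1e9, "GB/s")
```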

Page 13: Architectural tricks to maximize memory bandwidth

Opportunity and Advantage of Design

• Consolidate read and write
• Split transaction
• Group transaction
• Read re-ordering
• Transaction priority assignment
• Lower clock frequency vs. wider bus
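
Read re-ordering can be sketched as a scheduler that looks a few requests ahead and prefers one that hits an already-open row. The code below is a simplified illustration in the spirit of row-hit-first scheduling, not the algorithm of any particular controller. The function names, the window size and the trace are assumptions.

```python
def count_row_misses(order):
    """Count (bank, row) requests that need a new row opened."""
    open_row, misses = {}, 0
    for bank, row in order:
        if open_row.get(bank) != row:
            misses += 1
            open_row[bank] = row
    return misses


def schedule_row_hit_first(requests, window=4):
    """Reorder reads within a small window, preferring open-row hits."""
    pending = list(requests)
    open_row = {}
    order = []
    while pending:
        candidates = pending[:window]
        pick = next(
            (r for r in candidates if open_row.get(r[0]) == r[1]),
            candidates[0],  # no hit in the window: take the oldest
        )
        pending.remove(pick)
        open_row[pick[0]] = pick[1]
        order.append(pick)
    return order


# Reads alternating between two rows of bank 0 (illustrative data only).
trace = [(0, 1), (0, 2)] * 3
reordered = schedule_row_hit_first(trace)

print("in-order misses: ", count_row_misses(trace))
print("reordered misses:", count_row_misses(reordered))
```

A real controller would also bound how long a request can be bypassed, so that re-ordering does not starve older transactions. That is where transaction priority assignment comes in.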

Page 14: Architectural tricks to maximize memory bandwidth

Cycle-accurate Memory Model for Architecture Exploration

Page 15: Architectural tricks to maximize memory bandwidth

Power vs. Timing

Page 16: Architectural tricks to maximize memory bandwidth

About VisualSim

Diagram labels: Architecture Exploration, Performance Analysis, Power Analysis, HW-SW Partitioning; Software, Interfaces, RTOS, Hardware

• Graphical and hierarchical modeling

• Large library of stochastic and cycle-accurate components and IP blocks with embedded timing and power

• Library blocks are used to assemble hardware, software, network, traffic, reports and use-cases

Page 17: Architectural tricks to maximize memory bandwidth

System- vs. Pin-level Modeling

Mirabilis Design Inc.

Figure labels: One Router; abstraction levels: System Design, Transaction-level, Cycle-accurate, Signal-level; VisualSim

Schematics and RTL are very slow and too detailed for end-to-end metrics

Page 18: Architectural tricks to maximize memory bandwidth

System- vs. Pin-level Modeling

Similarity
• Hardware attributes: width, clock speed, buffer depths
• Timing
• Algorithms & arbitration
• Data & control flow logic
• Use addresses

Differences
• Data & control combined in transaction, not bits
• No pin definitions
• No signal handshaking
• Skip cycles with no change
• Flexible to make major changes
• 100-1000X faster

System-level model accuracy and simulation speed are sufficient for these explorations
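
The "skip cycles with no change" idea can be illustrated with a toy single-server memory that serves requests in arrival order. One loop advances a clock cycle by cycle; the other jumps straight to the next point where something changes. Both functions, the arrival times and the fixed service time are hypothetical, and the step counts only illustrate the mechanism. They are not a measurement of any tool's speed-up.

```python
def cycle_stepped(arrivals, service_cycles):
    """Advance one clock at a time; return (finish_times, loop_steps)."""
    arrivals = sorted(arrivals)
    finish_times, busy_until, next_req, cycle, steps = [], 0, 0, 0, 0
    while next_req < len(arrivals):
        steps += 1
        if cycle >= busy_until and cycle >= arrivals[next_req]:
            busy_until = cycle + service_cycles  # start serving this request
            finish_times.append(busy_until)
            next_req += 1
        cycle += 1
    return finish_times, steps


def event_skipping(arrivals, service_cycles):
    """Jump directly to the next change; return (finish_times, loop_steps)."""
    finish_times, busy_until, steps = [], 0, 0
    for arrival in sorted(arrivals):
        steps += 1
        start = max(arrival, busy_until)  # idle cycles are never visited
        busy_until = start + service_cycles
        finish_times.append(busy_until)
    return finish_times, steps


arrivals = [0, 5, 1000, 1002]  # request arrival times in cycles
slow = cycle_stepped(arrivals, service_cycles=10)
fast = event_skipping(arrivals, service_cycles=10)

print("finish times match:", slow[0] == fast[0])
print("loop steps:", slow[1], "vs", fast[1])
```

Both loops produce the same finish times; the second simply does not spend work on cycles in which nothing happens.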

Page 19: Architectural tricks to maximize memory bandwidth

How can System Level Explorations Help improve Memory Performance

• Evaluate performance and power advantages of different types of memory technologies.

• Early prediction of latency, throughput, power, and energy

• Evaluation of next-generation storage devices for high-bandwidth, low-latency requirements

• Spend more time on analysis and less time on implementation

Page 20: Architectural tricks to maximize memory bandwidth

Modeling Libraries - Semiconductors

SoC
• AMBA (AHB/APB/AXI)
• CoreConnect: PLB & OPB
• NoC, Virtual Channel
• USB

Memory
• SDR, DDR, DDR2, DDR3
• QDR, RDRAM
• LPDDR, LPDDR2, LPDDR3, LPDDR4
• HBM
• Flash

Processors
• ARM
• PowerPC: Freescale and IBM
• Intel and AMD
• TI
• MIPS
• Tensilica
• Renesas SH

Interfaces
• PCI, PCI-X, PCIe
• RapidIO
• NVMe
• Serial Switch
• Crossbar
• Ethernet
• Fibre Channel

Page 21: Architectural tricks to maximize memory bandwidth

Benefits

Feature: Facilitating transition from concept to design
• Creating realistic workload scenarios driving simulations
• Models enable experimentation and enhance innovation
• Simulations facilitate analysis and exchanges between teams

Feature: Increasing productivity
• Rapid exploration and analysis
• Graphics are better suited to handle complexity
• Graphics are 10x more efficient than C/C++ programming

Feature: Optimizing design
• HW footprint, buffers, timings, power

Feature: Facilitating implementation and validation
• Providing executable specifications for implementation
• Reusing test cases for validation

Page 22: Architectural tricks to maximize memory bandwidth

Deepak Shankar, CEO, Mirabilis Design

[email protected]/new/

Phone - 408-245-8992