Review of OS Controlled NoC from IMEC Jim Stevens RC Reading Group 01/30/2008

Review of OS Controlled NoC from IMEC

Jim Stevens

RC Reading Group

01/30/2008


Today’s Papers

• Operating-system controlled network on chip. Nollet, V.; Marescaux, T.; Verkest, D. Proceedings of the 41st Design Automation Conference (DAC), 2004, pp. 256-259.

• Centralized run-time resource management in a network-on-chip containing reconfigurable hardware tiles. Nollet, V.; Marescaux, T.; Avasare, P.; Verkest, D.; Mignolet, J.-Y. Proceedings of Design, Automation and Test in Europe (DATE), 7-11 March 2005, Vol. 1, pp. 234-239.


Operating-System Controlled Network on Chip

V. Nollet, T. Marescaux, and D. Verkest

DAC 04


Abstract

• Managing NoC is challenging

• OS needs to control NoC

• Tight integration allows for efficiency

• OS can:
  – Optimize communication resource usage
  – Reduce interference between applications


Introduction

• Future systems will consist of tiles of processing elements (PE)

• Tiles connected by NoC

• Mapping tasks onto tiles and dynamically managing communication is extremely challenging

• Goals:
  – Ensure that compute power matches communication needs
  – Provide required QoS


Multiprocessor Emulation

• System consists of a StrongARM processor in a Compaq iPAQ PDA connected to an FPGA using the iPAQ expansion port

• Two NoCs built in FPGA:
  – Packet-switched 3x3 bidirectional mesh called the data NoC
  – Another network for OS control messages

• Both networks run at 30 MHz

• StrongARM runs at 206 MHz


Transport Layer


Data Network Interface

• PEs connect to data NoC with data Network Interface Component (dNIC).

• dNIC responsibilities:
  – Buffer I/O messages for PE
  – Provide higher level interface to data router
  – Collect statistics

• Blocked message count: number of received messages that were blocked in the data router while waiting for the PE input buffer to be released.

• Injection rate control mechanism: throttles rate of messages being sent from PE


Control Network Interface

• PEs connect to the control network through a control Network Interface Component (cNIC)

• Provides OS with unified view of communication resources

• Collects stats from dNIC

• Allows OS to:
  – Dynamically set routing tables
  – Manage injection rate of dNIC


Operating System

• One PE is designated as master
  – Monitors the system and assigns tasks to slave PEs

• Slaves contain a basic RPC-like mechanism to execute OS functions for master

• Slaves can also call back to the OS using similar functionality for tasks such as synchronization


Operating System Diagram


NoC Control Tools

• Dynamic Statistics Collection
  – OS polls cNICs to get traffic stats
  – Collects the blocked message count to see if congestion is occurring

• Dynamic Injection Rate Control
  – Modifies the send window of PE to reduce congestion
  – Setting the send window is deterministic and fast (57 μs)

• OS-Controlled Adaptive Routing
  – Modify routing tables to reduce congestion
  – Complex operation: temporarily stop messages on a channel by sending sync messages, update routing tables using cNIC OS interface, and finally notify all relevant tasks to resume sending
  – Non-deterministic because it depends on network traffic and complexity of table update
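The statistics collection and injection rate control tools can be combined into a simple polling loop. The Python sketch below shows one polling round under stated assumptions: the `CNic` class, the threshold, the halving step, and the choice of which senders to throttle are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class CNic:
    """Hypothetical stand-in for one tile's control NIC as seen by the OS."""
    tile: int
    blocked_messages: int = 0   # statistic collected from the dNIC
    send_window: float = 1.0    # fraction of bandwidth the PE may inject


def poll_and_throttle(receiver: CNic, senders: list[CNic],
                      threshold: int = 10, step: float = 0.5,
                      floor: float = 0.01) -> bool:
    """One polling round: shrink sender windows if the receiver reports blocking.

    Returns True when throttling was applied. Threshold, step and floor are
    illustrative values, not figures from the paper.
    """
    congested = receiver.blocked_messages > threshold
    if congested:
        for sender in senders:
            sender.send_window = max(floor, sender.send_window * step)
    # Assumption: the counter is reset after each read.
    receiver.blocked_messages = 0
    return congested
```

In the case study that follows, the OS would run such a round every 20 ms; which sender to throttle is a policy decision left to the OS.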


Send Window Parameters


Case Study

• Tested system with MJPEG decoder

• Consists of four tasks running on PEs

• Two tasks run on StrongARM (tile 3)

• Two other tasks are hardware blocks:
  – Huffman decoder/dequantisation
  – 2D-IDCT and YUV to RGB converter

• Added message gen/sink modules to put traffic on the network to interfere with channel from node 7 to node 6.

• OS samples cNICs every 20 ms.


Decoder Communication

• Played same sequence with two different windowing techniques (window spreading and allocating continuous blocks) with no interference

• Decrease the window size from 100% to ~0.02%

• For window spreading, throughput of the video decoder does not decrease until effective window is less than 2% of bandwidth; half throughput occurs at 1.5% of bandwidth

• For continuous allocation, half throughput occurs at 75% of bandwidth

• When interference is enabled, window spreading helps reduce jitter because communication is more evenly spread.
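To illustrate the difference between the two windowing techniques, the following Python sketch builds both slot patterns for the same bandwidth share. The slot model is an assumption for illustration only; the paper's actual send window parameters are defined in a figure that is not reproduced in this transcript.

```python
def contiguous_window(period: int, allowed: int) -> list[bool]:
    """Open the first `allowed` slots of each period as one block."""
    return [slot < allowed for slot in range(period)]


def spread_window(period: int, allowed: int) -> list[bool]:
    """Open `allowed` slots spaced as evenly as possible across the period."""
    return [((slot + 1) * allowed) // period > (slot * allowed) // period
            for slot in range(period)]


def longest_closed_run(pattern: list[bool]) -> int:
    """Longest stretch of consecutive closed slots within one period."""
    longest = run = 0
    for is_open in pattern:
        run = 0 if is_open else run + 1
        longest = max(longest, run)
    return longest
```

For a period of 8 slots with 2 open, the contiguous pattern leaves a gap of 6 closed slots while the spread pattern never waits more than 3. Shorter gaps are one plausible reason why spreading tolerates much smaller windows and produces less jitter.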


Decoder with interference


Centralized Run-Time Resource Management in a Network-on-Chip Containing Reconfigurable Hardware Tiles

V. Nollet, T. Marescaux, P. Avasare, D. Verkest, J-Y. Mignolet

DATE 05


Introduction

• Same assumptions and system setup as previous paper

• This paper focuses on the task assignment heuristic and dynamic task migration

• Claims to be first paper to address run-time task migration in an NoC context


System Description

• Same as before

• Task mapping heuristic must find the best PE for each task

• Want to reduce internal fragmentation and optimize communication paths


Resource Management Heuristic

• Requires application specification, user requirements, and current resource usages as input

• Specification is given by a task graph that contains properties of each task such as computation and communication needs

• User requirements given by a simple QoS specification


Heuristic Steps

• Calculate requested resource load

• Calculate task execution variance

• Calculate task communication weight

• Sort tasks according to mapping importance

• Sort PEs for most important unmapped tasks

• Map task to best computing resource
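The steps above follow a greedy pattern: rank the tasks, then place each one on the PE that currently looks best. The Python sketch below is a simplified illustration of that pattern, not the authors' heuristic. It assumes that task importance is the total bandwidth on a task's edges, that PE cost is bandwidth times hop distance to already mapped neighbours, and that tiles are numbered row by row on a 3x3 mesh. The execution variance step is omitted, and all names are hypothetical.

```python
def hops(a: int, b: int, width: int = 3) -> int:
    """Manhattan distance between two tiles numbered row by row."""
    return abs(a % width - b % width) + abs(a // width - b // width)


def greedy_map(task_load, edges, pe_free):
    """Map tasks to PEs greedily; return {task: pe} or None if a task does not fit.

    task_load: {task: required load}
    edges:     {(src_task, dst_task): bandwidth}
    pe_free:   {pe: free capacity}
    """
    # Communication weight (assumed): total bandwidth on edges touching the task.
    weight = {task: 0 for task in task_load}
    for (src, dst), bw in edges.items():
        weight[src] += bw
        weight[dst] += bw
    order = sorted(task_load, key=lambda t: (-weight[t], t))

    free = dict(pe_free)
    mapping = {}
    for task in order:
        def cost(pe):
            # Bandwidth x hops to neighbours that are already placed.
            total = 0
            for (src, dst), bw in edges.items():
                other = dst if src == task else src if dst == task else None
                if other in mapping:
                    total += bw * hops(pe, mapping[other])
            return total

        candidates = sorted((pe for pe in free if free[pe] >= task_load[task]),
                            key=lambda pe: (cost(pe), pe))
        if not candidates:
            return None  # no valid mapping without backtracking
        mapping[task] = candidates[0]
        free[candidates[0]] -= task_load[task]
    return mapping
```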


Backtracking

• Some inputs will result in no valid mapping

• Use backtracking to attempt to find another mapping
  – Undo previous N steps, select second best PE instead of best PE, then remap remaining N-1 steps
  – If this fails, try again with N+1

• If backtracking fails, options are:
  – Use run-time task migration
  – Use hierarchical configuration
  – Restart heuristic with reduced user requirements
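The backtracking rule can be sketched as follows. This is a minimal Python illustration under stated assumptions: the PE ranking (most free capacity first) and the `max_steps` limit are hypothetical, and only the undo-N-steps-and-take-second-best structure comes from the slide.

```python
def rank_pes(task, load, free):
    """Hypothetical ranking: PEs that fit the task, most free capacity first."""
    return sorted((pe for pe in free if free[pe] >= load[task]),
                  key=lambda pe: (-free[pe], pe))


def map_from(order, start, load, mapping, free, use_second_best=False):
    """Greedily map order[start:]; return the failing index, or None on success."""
    for i in range(start, len(order)):
        ranked = rank_pes(order[i], load, free)
        pick = 1 if (use_second_best and i == start) else 0
        if len(ranked) <= pick:
            return i
        mapping[order[i]] = ranked[pick]
        free[ranked[pick]] -= load[order[i]]
    return None


def map_with_backtracking(order, load, pe_free, max_steps=3):
    """Return {task: pe}, or None if no mapping is found within max_steps."""
    mapping, free = {}, dict(pe_free)
    failed = map_from(order, 0, load, mapping, free)
    if failed is None:
        return mapping
    for n in range(1, max_steps + 1):
        restart = failed - n              # undo the previous n steps
        if restart < 0:
            break
        trial = {t: mapping[t] for t in order[:restart]}
        trial_free = dict(pe_free)
        for task, pe in trial.items():
            trial_free[pe] -= load[task]
        # Take the second best PE at the restart point, then remap the rest.
        if map_from(order, restart, load, trial, trial_free, True) is None:
            return trial
    return None
```

A `None` result corresponds to the case where backtracking fails and one of the fallback options listed above would have to be used.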


RH Add-ons

• For reconfigurable hardware (RH), must take into account internal fragmentation of reconfigurable area

• If both first and second best PEs are reconfigurable, then want to pick the one with lowest internal fragmentation

• Also consider if a regular PE could be used for this task instead of RH
  – Want to map only computationally intensive tasks to RH

• Can also create softcore processors on RH
  – They refer to this as “hierarchical configuration”


Heuristic Performance

• Compared to algorithm that explores full solution space for a multimedia pipeline application

• Defined LIGHT, MEDIUM, and HEAVY levels of prior computational load on the platform
  – If more than 50% of a PE’s resources are used, then the PE is considered used, otherwise free.

• Table 1 shows the success rate for the heuristic with varying number of backtracking steps with respect to searching the full mapping solution space.

• Demonstrates that RH add-ons to algorithm improve performance.

• Use hop-bandwidth product to show mapping quality.
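The hop-bandwidth product is commonly computed as the sum, over all communicating task pairs, of the required bandwidth multiplied by the number of hops between the tiles the two tasks are mapped to. The Python sketch below assumes minimal (Manhattan distance) routes on a 3x3 mesh with tiles numbered row by row; the task names and bandwidth values are made up for illustration.

```python
def mesh_hops(a: int, b: int, width: int = 3) -> int:
    """Hop count between two tiles, assuming minimal routes on a mesh."""
    return abs(a % width - b % width) + abs(a // width - b // width)


def hop_bandwidth_product(mapping, edges, width: int = 3) -> int:
    """Sum of bandwidth x hops over all task-graph edges (lower is better).

    mapping: {task: tile}
    edges:   {(src_task, dst_task): bandwidth}
    """
    return sum(bw * mesh_hops(mapping[src], mapping[dst], width)
               for (src, dst), bw in edges.items())


edges = {("A", "B"): 10, ("B", "C"): 5}
spread_out = hop_bandwidth_product({"A": 0, "B": 1, "C": 8}, edges)
compact = hop_bandwidth_product({"A": 0, "B": 1, "C": 2}, edges)
```

Here `spread_out` is 25 and `compact` is 15, so the second mapping keeps communicating tasks closer together and uses fewer network resources.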


Mapping Success


Hop-bandwidth Mapping Quality


Run-Time Task Migration

• Goal is to move a task from source tile to destination tile

• Can only migrate at predefined checkpoints in execution (we’ve seen this before)

• They cite a paper that discusses moving a task between a PE and RH

• Must assure communication consistency
  – Current methods (buffering or dropping) are not well suited to NoC for various reasons


Migration Process

• OS issues a migration request

• Wait until process reaches a checkpoint
  – OS does not know how long this will take

• When checkpoint is reached, OS is signaled

• OS tells other processes to stop sending to process
  – Last message sent by a process has a tag

• OS migrates the process, but does not delete the original process

• OS tells other processes, including the source tile, to update their routing tables (DLT) to contain the new task location

• When all tagged messages have been received at the new task location, the OS tells the other processes to start sending again and it frees the original location
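The ordering of these steps matters: the original task location is freed only after every tagged message has arrived at the new location. The Python sketch below traces that order on plain dictionaries. It is a sequential toy model under stated assumptions: message delivery is simulated rather than sent over a network, the DLT is taken to be a per-owner lookup table from task to tile, and all names are hypothetical.

```python
def migrate(task, src, dst, senders, dlt, placement):
    """Trace one migration; returns (steps, sending state of each sender).

    senders:   names of tasks that send messages to `task`
    dlt:       {owner: {task: tile}} lookup tables
    placement: {tile: set of tasks currently set up on that tile}
    """
    steps = ["migration requested", "checkpoint reached"]

    # Senders stop; the last message of each one carries a tag.
    sending = {name: False for name in senders}
    in_flight = [(name, "TAG") for name in senders]
    steps.append("senders stopped")

    # Set the task up on the destination but keep the original for now.
    placement[dst].add(task)
    steps.append("task set up on destination")

    # Senders and the source tile learn the new location.
    for owner in list(senders) + [f"tile{src}"]:
        dlt.setdefault(owner, {})[task] = dst
    steps.append("lookup tables updated")

    # Simulated delivery: wait until a tag from every sender has arrived.
    tags_received = {name for name, marker in in_flight if marker == "TAG"}
    if tags_received != set(senders):
        raise RuntimeError("tagged messages missing, cannot free source")

    # Only now resume the senders and free the original location.
    for name in senders:
        sending[name] = True
    placement[src].discard(task)
    steps.append("senders resumed, source freed")
    return steps, sending
```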


Migration Process for Pipelines

• Based on assumption that pipelined apps have stateless points (units of work in the pipeline are independent)

• To start migration, flush the pipeline with a tagged message.

• Move pipeline task that needs to be migrated

• Update routing tables for pipeline tasks and restart pipeline


Benchmarking Migration

• Reaction time: migration request to when task is ready to migrate (checkpoint or stateless point)

• Freeze time: amount of time the migrating task is suspended

• If free resources are available, can start migration during reaction time for pipelines


Conclusion (Both Papers)

• IMEC has developed an NoC prototype that allows the OS to control network traffic

• dNIC and cNIC provide network interface to PEs and RH

• OS can control injection rate, change routing tables, and migrate tasks at run-time to reduce congestion.

• Static task-mapping heuristic based on specification can find efficient ways to take advantage of both communication and computation resources.