M. S. Ramaiah School of Advanced Studies 1
M. Sc. (Engg.) in Electronics System Design Engineering
GREESHMA SCWB0913004, FT-2013
6th Module Presentation
Module code : ESE2511
Module name : Microcontrollers and Interfacing
Module leader : Mr. Nagananda S.N.
Presentation on : 07/05/2014
ARM Boards for DSP Applications
Overview
• INTRODUCTION
• ARM9E-S
• DM3730
• FUNCTIONAL BLOCK DIAGRAM
• BLOCK DIAGRAM
• SOFTWARE ARCHITECTURE
• CHARACTERISTICS OF DSP PROCESSORS
• FEATURES OF DM3730
• REPRESENTING A DIGITAL SIGNAL
• ADDITION AND SUBTRACTION OF FIXED-POINT SIGNAL
• MULTIPLICATION AND DIVISION OF FIXED-POINT SIGNAL
• SQUARE ROOT OF FIXED POINT SIGNAL
• DSP ON ARM9E
• DSP ON ARM10E
• FIR FILTER
• IIR FILTER
• THE DISCRETE AND FAST FOURIER TRANSFORM
• APPLICATIONS
• CONCLUSION
• REFERENCES
Introduction
Emerging algorithm standards in many application areas place further demands on processing platforms to deliver efficient control capability
ARM's approach has been to design RISC core architectures whose instruction sets efficiently support particular applications, with an optimal balance between hardware and software implementation
To accelerate signal-processing algorithms, ARM adds new DSP instructions to the ARM instruction set
The ARM DSP extensions broaden the suitability of the ARM CPU family to applications that require intensive signal processing, while retaining the power and efficiency of a high-performance RISC microcontroller
The ARM DSP extensions have already been implemented in the ARM926EJ-S, ARM946E-S, ARM966E-S and ARM9E-S
Processing digitized signals requires high memory bandwidth and fast multiply-accumulate operations
Traditionally, a microcontroller handles the user interface and a separate DSP processor manipulates digitized signals such as audio
A single-core design can reduce cost and power consumption compared with a two-core solution
The ARMv5TE extensions available in the ARM9E and later cores provide efficient multiply-accumulate operations
DSP applications are typically multiply and load-store intensive
Filtering is the most commonly used signal-processing operation
Another very common algorithm is the Discrete Fourier Transform
ARM9E-S
The ARM9E-S core implements the ARMv5TE architecture
This includes an enhanced multiplier design for improved DSP performance
It is a 32-bit processor core
It offers high performance for very low power consumption and gate count
The ARM architecture is based on Reduced Instruction Set Computer (RISC) principles
The reduced instruction set and related decode mechanism are much simpler than those of Complex Instruction Set Computer (CISC) designs
This simplicity gives
• a high instruction throughput
• an excellent real-time interrupt response
• a small, cost-effective processor macrocell
DM3730
Based on TI's enhanced device architecture
Integrated on TI's advanced 45-nm process technology
Supports high-level operating systems (HLOS) and real-time operating systems (RTOS)
Fully backward compatible with the OMAP3530
Functional Block Diagram
Figure 1 : DM3730 Functional Block Diagram
Block Diagram
Figure 2 : DM3730 Block Diagram
Benefits
• 2000 DMIPS for OSs such as Linux, Windows CE and RTOSs
• 3-D graphics up to 20M polygons per second for robust GUIs
• Backward compatible with OMAP3530
Applications
• Smart connected devices
• Patient monitoring
• Media players
Software Architecture
Figure 3 : Software Architecture of DM3730
Figure legend: industry-standard OS components, TI-provided components, open-source components
Characteristics of DSP Processors
Harvard architecture
High-performance MAC (multiply-accumulate) unit
Saturating math
SIMD instructions for parallel computation
Barrel shifters
Floating-point hardware
Features of DM3730
ARM microprocessor subsystem
Enhanced direct memory access controller
Video hardware accelerators
Tile-based graphics architecture delivering up to 20 MPoly/s
DSP instructions and data are little-endian
NEON multimedia architecture
Load-store architecture with non-aligned access support
64 32-bit general-purpose registers
Six ALUs, each supporting single 32-bit, dual 16-bit or quad 8-bit arithmetic per clock cycle
Representing a Digital Signal
Figure 4 : Digitalizing an Analogue Signal
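x is the signal value and t is time
In an analogue signal x[t], the index t and the value x are both continuous real variables
Digitizing samples the signal at discrete times t and quantizes each value x to a finite number of bits
On ARM, the sample values are usually stored in a fixed-point representation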
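As an illustration of a fixed-point representation, the sketch below stores a real sample x as the integer X = 2^15 x (commonly called Q15). The slides do not name a particular format, so the choice of Q15 and the helper names to_q15 and from_q15 are assumptions made only for this example.

```c
#include <stdint.h>

#define Q_BITS 15  /* assumed number of fractional bits (Q15) */

/* Convert a real value to Q15, rounding to nearest.
 * The caller must keep x in [-1, 1) so the result fits in 16 bits. */
static int16_t to_q15(double x)
{
    double scaled = x * (double)(1 << Q_BITS);
    return (int16_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Convert a Q15 integer back to the real value it represents. */
static double from_q15(int16_t X)
{
    return (double)X / (double)(1 << Q_BITS);
}
```

For instance, 0.5 maps to 16384 and -0.25 maps to -8192; values of 1.0 or above do not fit and must be scaled down first.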
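Addition and Subtraction of Fixed-Point Signals
X[t], C[t] and Y[t] are the integer Qn, Qm and Qd representations of x[t], c[t] and y[t], so X[t] = 2^n x[t], C[t] = 2^m c[t] and Y[t] = 2^d y[t]
The general case is to convert the signal equation
    y[t] = x[t] + c[t]
into fixed-point format
    Y[t] = 2^d (x[t] + c[t]) = 2^(d-n) X[t] + 2^(d-m) C[t]
or in integer C
    Y[t] = (X[t] << (d-n)) + (C[t] << (d-m));
where a negative left shift is read as a rounded right shift
In practice we usually arrange that n = m = d. Therefore normal integer addition gives a fixed-point addition
    Y[t] = X[t] + C[t];
Provided d = m or d = n, only one operand needs a shift, which the ARM barrel shifter applies as part of the add
    Y[t] = X[t] + (C[t] << (d-m));   /* d == n */
    Y[t] = C[t] + (X[t] << (d-n));   /* d == m */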
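The general fixed-point addition can be sketched in C as follows, taking X and C as the Qn and Qm integer representations of the two signals and Y as the Qd result. The formats chosen (n = 12, m = 15, d = 15) and the helper names shift_q and add_q are hypothetical, picked only to show both shift directions; the sketch also assumes that right-shifting a negative value is an arithmetic shift, as it is with common ARM compilers.

```c
#include <stdint.h>

/* Hypothetical Q formats used only in this sketch. */
enum { N_X = 12, M_C = 15, D_Y = 15 };

/* Shift left by s when s >= 0, otherwise perform a rounded right shift by -s.
 * The left shift is written as a multiply so that negative values are well defined. */
static int32_t shift_q(int32_t v, int s)
{
    if (s >= 0)
        return v * ((int32_t)1 << s);
    return (v + ((int32_t)1 << (-s - 1))) >> -s;
}

/* Y = 2^(d-n) X + 2^(d-m) C: bring both operands to Qd, then add. */
static int32_t add_q(int32_t X, int32_t C)
{
    return shift_q(X, D_Y - N_X) + shift_q(C, D_Y - M_C);
}
```

With these formats, 0.5 in Q12 (2048) plus 0.25 in Q15 (8192) gives 24576, which is 0.75 in Q15. The caller remains responsible for making sure neither the shifted operands nor the sum overflow.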
There are four common ways to prevent overflow in a fixed-point addition
• Ensure that the X[t] and C[t] representations each have one bit of spare headroom
• Use a larger container type for Y than for X and C
• Use a smaller Q representation for y[t]; for example, if d = n - 1 = m - 1, the operation becomes Y[t] = (X[t] + C[t]) >> 1
• Use saturation
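Two of these options can be sketched in portable C for 16-bit samples. The function names sat_add16 and half_add16 are hypothetical. Both compute the sum in a wider 32-bit container first; the first then clamps the result (saturation), while the second halves it, which corresponds to a result with one fewer fractional bit. ARMv5TE also provides the QADD instruction for 32-bit saturating addition in hardware.

```c
#include <stdint.h>

/* Saturating 16-bit add: clamp to the int16_t range instead of wrapping. */
static int16_t sat_add16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;  /* wider container, cannot overflow */
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}

/* Add with a smaller Q result (d = n - 1 = m - 1): Y = (X + C) >> 1.
 * The halved sum always fits in 16 bits.
 * Assumes right shift of a negative value is arithmetic. */
static int16_t half_add16(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a + (int32_t)b) >> 1);
}
```

Saturation keeps the scale of the result but distorts peaks that exceed the range; halving never clips but costs one bit of precision.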
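Multiplication of Fixed-Point Signals
X[t], C[t] and Y[t] are the Qn, Qm and Qd integer representations of x[t], c[t] and y[t]
The general case is to convert the signal equation
    y[t] = x[t] c[t]
into fixed-point format
    Y[t] = 2^d x[t] c[t] = 2^(d-n-m) X[t] C[t]
or in integer C
    Y[t] = (X[t] * C[t]) >> (n + m - d);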
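A minimal C sketch of the fixed-point multiply, with X in Qn, C in Qm and the result in Qd. The function name mul_q and the rounding step are additions for illustration; the sketch assumes n + m >= d and an arithmetic right shift for negative products.

```c
#include <stdint.h>

/* Multiply a Qn value by a Qm value and return the product in Qd.
 * The raw 16x16 product is in Q(n+m); shifting right by n + m - d
 * converts it to Qd. Half an output LSB is added first to round. */
static int32_t mul_q(int16_t X, int16_t C, int n, int m, int d)
{
    int32_t product = (int32_t)X * (int32_t)C;
    int shift = n + m - d;              /* assumed to be >= 0 */
    if (shift == 0)
        return product;
    return (product + ((int32_t)1 << (shift - 1))) >> shift;
}
```

For example, 0.5 times 0.5 in Q15 is mul_q(16384, 16384, 15, 15, 15), which yields 8192, that is 0.25 in Q15.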
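Division of Fixed-Point Signals
X[t], C[t] and Y[t] are the Qn, Qm and Qd integer representations of x[t], c[t] and y[t]
The general case is to convert the signal equation
    y[t] = x[t] / c[t]
into fixed-point format
    Y[t] = 2^d x[t] / c[t] = 2^(d-n+m) X[t] / C[t]
or in integer C
    Y[t] = (X[t] << (d - n + m)) / C[t];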
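The fixed-point divide can be sketched the same way, with X in Qn, C in Qm and the quotient in Qd. The name div_q is hypothetical. The sketch assumes d - n + m >= 0 and a nonzero divisor, and it widens the numerator to 64 bits so the pre-shift cannot overflow; that widening is a convenience of this example rather than something stated on the slides.

```c
#include <stdint.h>

/* Divide a Qn value by a Qm value and return the quotient in Qd.
 * The numerator is pre-scaled by 2^(d - n + m) so that the integer
 * quotient comes out in Qd. C division truncates toward zero. */
static int32_t div_q(int32_t X, int32_t C, int n, int m, int d)
{
    int shift = d - n + m;                         /* assumed to be >= 0 */
    int64_t numerator = (int64_t)X * ((int64_t)1 << shift);
    return (int32_t)(numerator / C);               /* caller ensures C != 0 */
}
```

For example, 0.5 divided by 0.25 with both inputs in Q15 and the result in Q13 is div_q(16384, 8192, 15, 15, 13), which yields 16384, that is 2.0 in Q13. Choosing a smaller d for the result leaves room for quotients larger than one.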
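Square Root of a Fixed-Point Signal
X[t] and Y[t] are the Qn and Qd integer representations of x[t] and y[t]
The general case is to convert the signal equation
    y[t] = sqrt(x[t])
into fixed-point format
    Y[t] = 2^d sqrt(x[t]) = sqrt(2^(2d-n) X[t])
or in integer C
    Y[t] = isqrt(X[t] << (2*d - n));
where isqrt() denotes an integer square root routine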
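A sketch of the fixed-point square root, with X in Qn and the result in Qd. The integer square root below uses the classic bit-by-bit method, chosen here for clarity; it is not necessarily the routine the presentation had in mind, and the names isqrt32 and sqrt_q are hypothetical. The sketch assumes 2d - n >= 0 and that the shifted input fits in 32 bits.

```c
#include <stdint.h>

/* Integer square root: returns floor(sqrt(v)) using the bit-by-bit method. */
static uint32_t isqrt32(uint32_t v)
{
    uint32_t root = 0;
    uint32_t bit = (uint32_t)1 << 30;   /* highest power of four in 32 bits */

    while (bit > v)
        bit >>= 2;

    while (bit != 0) {
        if (v >= root + bit) {
            v -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return root;
}

/* Square root of a Qn value, result in Qd: Y = isqrt(X << (2d - n)). */
static uint32_t sqrt_q(uint32_t X, int n, int d)
{
    return isqrt32(X << (2 * d - n));
}
```

For example, the square root of 0.25 in Q14 is sqrt_q(4096, 14, 14), which yields 8192, that is 0.5 in Q14.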
DSP on the ARM9E
The ARM9E core has a very fast pipelined multiplier array that performs a 32-bit by 16-bit multiply in a single issue cycle
Writing DSP Code for the ARM9E
The ARMv5TE multiply operations can unpack 16-bit halves from 32-bit words and multiply them
The multiply operations do not terminate early. Therefore use MUL and MLA only for multiplying 32-bit integers. For 16-bit values use SMULxy and SMLAxy
Multiply is the same speed as multiply-accumulate. Use the SMLAxy instruction rather than a separate multiply and add
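The guideline above can be illustrated with a 16-bit dot product written in portable C. In SMLAxy, x and y select the bottom (B) or top (T) 16-bit half of each source register. A compiler targeting ARMv5TE may map each multiply-accumulate in this loop to such an instruction, but that depends on the compiler and its options, so it is an assumption here rather than a guarantee. The name dot16 is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Dot product of two 16-bit sample arrays with a 32-bit accumulator.
 * Each step is a 16x16 multiply plus a 32-bit accumulate, the same
 * operation that SMLAxy performs in one instruction. The caller must
 * scale the data so the 32-bit sum cannot overflow. */
static int32_t dot16(const int16_t *x, const int16_t *c, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)x[i] * (int32_t)c[i];
    return acc;
}
```

Keeping the operands as 16-bit types and widening them only at the multiply is what allows the 16-bit multiply forms to be used instead of a full 32-bit MLA.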
DSP on the ARM10E
The ARM10E implements a background loading mechanism to accelerate load and store multiples
It uses a 64-bit-wide data path that can transfer two registers on every background cycle
Writing DSP Code for the ARM10E
Load and store multiples run in the background to give a high memory bandwidth
Ensure data arrays are 64-bit aligned so that load and store multiple operations can transfer two words per cycle
The multiply operations do not terminate early. Therefore use MUL and MLA only for multiplying 32-bit integers. For 16-bit values use SMULxy and SMLAxy
The SMLAxy instruction takes one cycle more than SMULxy
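One way to meet the 64-bit alignment guideline in C is sketched below. It relies on the C11 _Alignas specifier, so it assumes a C11-capable compiler; older toolchains offer vendor-specific alignment attributes instead. The buffer name, its length and the helper is_aligned64 are hypothetical.

```c
#include <stdint.h>

#define NUM_SAMPLES 64  /* hypothetical buffer length */

/* Sample buffer forced onto an 8-byte (64-bit) boundary. */
static _Alignas(8) int32_t samples[NUM_SAMPLES];

/* Return nonzero if the pointer lies on a 64-bit boundary. */
static int is_aligned64(const void *p)
{
    return ((uintptr_t)p & 7u) == 0;
}
```

A check such as is_aligned64(samples) can be placed in a debug assertion before a block transfer, since a buffer offset by one word would silently lose the two-words-per-cycle benefit.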
FIR filters
The finite impulse response (FIR) filter is a basic building block of many DSP applications
Use an FIR filter to remove unwanted frequency ranges, boost certain frequencies, or implement special effects
The FIR filter is the simplest type of digital filter
The filtered sample y[t] depends linearly on a fixed, finite number of unfiltered samples x[t]
For M coefficients c_0 ... c_(M-1), computing the filter means calculating accumulated values
    A[t] = c_0 x[t] + c_1 x[t-1] + ... + c_(M-1) x[t-M+1]
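A direct-form FIR filter can be sketched in C as below. For each output it forms the accumulated value A[t] = c_0 x[t] + c_1 x[t-1] + ... over the available taps and converts it back to a 16-bit sample. The Q15 coefficient format, the zero initial history, the 64-bit accumulator and the name fir_q15 are assumptions made for this example; a 32-bit accumulator, as used by SMLAxy, suffices only when the coefficients are scaled so the sum cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* FIR filter with Q15 coefficients: y[t] = sum over i of c[i] * x[t-i].
 * Samples before the start of the buffer are treated as zero. */
static void fir_q15(const int16_t *x, int16_t *y, size_t len,
                    const int16_t *c, size_t taps)
{
    for (size_t t = 0; t < len; t++) {
        int64_t acc = 1 << 14;                    /* rounding term for >> 15 */
        for (size_t i = 0; i < taps && i <= t; i++)
            acc += (int32_t)x[t - i] * (int32_t)c[i];
        acc >>= 15;                               /* Q15 product back to sample scale */
        if (acc > INT16_MAX) acc = INT16_MAX;     /* saturate the output */
        if (acc < INT16_MIN) acc = INT16_MIN;
        y[t] = (int16_t)acc;
    }
}
```

With the two coefficients {16384, 16384}, that is 0.5 and 0.5, the filter is a two-point moving average: the input {1000, 3000, 5000, 7000} produces {500, 2000, 4000, 6000}.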
IIR filters
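An infinite impulse response (IIR) filter is a digital filter whose output depends linearly on a finite number of input samples and a finite number of previous filter outputs
Mathematically, for feed-forward coefficients b_0 ... b_M and feedback coefficients a_1 ... a_L
    y[t] = (b_0 x[t] + ... + b_M x[t-M]) - (a_1 y[t-1] + ... + a_L y[t-L])
Factorize the filter into a series of biquads; a biquad is an IIR filter with M = L = 2
In the z-transform domain a biquad has the transfer function
    H(z) = (b_0 + b_1 z^-1 + b_2 z^-2) / (1 + a_1 z^-1 + a_2 z^-2)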
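A single biquad stage (an IIR filter with M = L = 2) can be sketched in C as follows. The sketch assumes the sign convention y[t] = b0 x[t] + b1 x[t-1] + b2 x[t-2] - a1 y[t-1] - a2 y[t-2]; other texts fold the minus signs into the a coefficients. The Q14 coefficient format (which allows magnitudes up to 2), the 64-bit accumulator and the names biquad_t and biquad_step are likewise assumptions for illustration.

```c
#include <stdint.h>

/* State and coefficients of one biquad section (direct form I). */
typedef struct {
    int16_t b0, b1, b2;   /* feed-forward coefficients, Q14 */
    int16_t a1, a2;       /* feedback coefficients, Q14 */
    int16_t x1, x2;       /* previous inputs  x[t-1], x[t-2] */
    int16_t y1, y2;       /* previous outputs y[t-1], y[t-2] */
} biquad_t;

/* Process one input sample and return the filtered output sample. */
static int16_t biquad_step(biquad_t *f, int16_t x)
{
    int64_t acc = 1 << 13;                     /* rounding term for >> 14 */
    acc += (int32_t)f->b0 * x;
    acc += (int32_t)f->b1 * f->x1;
    acc += (int32_t)f->b2 * f->x2;
    acc -= (int32_t)f->a1 * f->y1;
    acc -= (int32_t)f->a2 * f->y2;
    acc >>= 14;                                /* Q14 product back to sample scale */
    if (acc > INT16_MAX) acc = INT16_MAX;      /* saturate the output */
    if (acc < INT16_MIN) acc = INT16_MIN;

    f->x2 = f->x1;                             /* shift the delay lines */
    f->x1 = x;
    f->y2 = f->y1;
    f->y1 = (int16_t)acc;
    return (int16_t)acc;
}
```

A higher-order filter is then run by passing each sample through several such sections in series, the output of one stage feeding the next. With b0 = 8192 and a1 = -8192 (0.5 and -0.5 in Q14) and the other coefficients zero, a constant input of 1000 produces 500, 750, 875 and so on, approaching 1000.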
The Discrete and Fast Fourier Transform
The Discrete Fourier Transform (DFT) converts a time-domain signal to a frequency-domain signal
A Fast Fourier Transform (FFT) is an algorithm that computes the DFT and its inverse efficiently
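The sketch below computes an 8-point DFT directly from its definition, X[k] = sum over t of x[t] exp(-2 pi j k t / N), using a fixed-point twiddle table. It is the direct O(N^2) form, not an FFT; an FFT produces the same values with fewer operations. The transform size, the Q14 table format, the 64-bit accumulators and the name dft8 are assumptions chosen to keep the example short, and right shifts of negative values are assumed to be arithmetic.

```c
#include <stdint.h>

#define DFT_N 8

/* cos(2*pi*k/8) and sin(2*pi*k/8) for k = 0..7, in Q14. */
static const int16_t cos_q14[DFT_N] =
    {16384, 11585, 0, -11585, -16384, -11585, 0, 11585};
static const int16_t sin_q14[DFT_N] =
    {0, 11585, 16384, 11585, 0, -11585, -16384, -11585};

/* Direct 8-point DFT of real input x; writes real and imaginary parts. */
static void dft8(const int16_t x[DFT_N], int32_t re[DFT_N], int32_t im[DFT_N])
{
    for (int k = 0; k < DFT_N; k++) {
        int64_t sum_re = 0;
        int64_t sum_im = 0;
        for (int t = 0; t < DFT_N; t++) {
            int idx = (k * t) % DFT_N;           /* twiddle angle index */
            sum_re += (int32_t)x[t] * cos_q14[idx];
            sum_im -= (int32_t)x[t] * sin_q14[idx];
        }
        re[k] = (int32_t)((sum_re + (1 << 13)) >> 14);  /* round, drop Q14 scale */
        im[k] = (int32_t)((sum_im + (1 << 13)) >> 14);
    }
}
```

A constant input of 100 in every sample puts all the energy in bin 0 (re[0] = 800, every other bin zero), while a single impulse of 1000 at t = 0 gives a flat spectrum with re[k] = 1000 in every bin.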
Applications
Portable data terminals
Navigation
Auto Infotainment
Gaming
Medical Imaging
Home automation
Single-board computers
Conclusion
The DM3730 is cost-effective
It combines low power consumption with high performance
Relative to the previous-generation OMAP35x devices, the DM3730 delivers
• a nearly 40% increase in ARM performance
• an increase of over 50% in DSP performance
• twice the graphics capability, while reducing power consumption
Use a fixed-point representation for DSP applications where speed is critical and the dynamic range is moderate
References
1. DM3730, http://www.ti.com/lit/ds/symlink/dm3730.pdf
2. DM3730, http://www.ti.com/lit/ml/sprt571/sprt571.pdf
3. DM3730, http://media.digikey.com/pdf/ DM3730_AM3703TorpedoSOMBrief.pdf