All other trademarks are the property of their respective owners and are acknowledged
The focus of this paper – the central processor in the Digital Signal Controller
One of the biggest challenges of most systems requiring digital signal processing is to manage the
data flow through the system. The input and output data can represent different real world signals
including motor position, audio, video, RF (GPS, etc.) and sensor signals. Moreover,
there are many characteristics of a DSC including the CPU, the peripherals, memory, number of GPIO
pins, connectivity, etc., that contribute to its applicability for a particular design. For the purpose of
this paper, we will focus on the characteristics of the central processor, the architecture of which
significantly influences the software techniques employed for optimum signal processing
throughput.
3. The ARM Cortex-M4 processor – an excellent CPU for 32-bit DSCs
The ARM Cortex-M4 processor is designed specifically for DSC devices. This new
processor extends the ARM Cortex-M family of processors into signal processing markets through a
software-compatible migration path for Cortex-M0 and Cortex-M3 users.
Cortex-M4 - microcontroller characteristics
The Cortex-M family of processors has a set of common technologies that make them an excellent
candidate for microcontroller applications. These features have already gained a lot of popularity
through the success of the Cortex-M0 and Cortex-M3 and form a key reason for the high rate of
adoption of the Cortex-M processors in the microcontroller marketplace today.
FEATURE                                       CHARACTERISTICS
RISC processor core                           High performance 32-bit CPU; deterministic operation; low latency 3-stage pipeline
Thumb-2 technology                            Optimal blend of 16/32-bit instructions; very high code density; no compromise on performance
Low power modes                               Integrated sleep state support; multiple power domains; architected software control
Nested Vectored Interrupt Controller (NVIC)   Low latency, low jitter interrupt response; no need for assembly programming; interrupt service routines in pure C
Tools and RTOS support                        Broad 3rd party tools support; Cortex Microcontroller Software Interface Standard (CMSIS); maximizes software effort reuse
CoreSight debug and trace                     JTAG or 2-pin Serial Wire Debug (SWD) connection; support for multiple processors; support for real-time trace
Table 1 : Microcontroller characteristics of the Cortex-M4 processor
Single cycle dual 16-bit MAC
The Cortex-M4 processor can even perform two 16-bit MACs in parallel in a single cycle. This
effectively doubles the raw computational power of the core for 16-bit data and gives it a clear edge
compared to 16-bit devices.
OPERATION                            INSTRUCTION
(16 x 16) ± (16 x 16) = 32           Sum/difference of dual 16-bit signed multiply
(16 x 16) ± (16 x 16) + 32 = 32      Dual 16-bit signed multiply with single 32-bit accumulator
(16 x 16) ± (16 x 16) + 64 = 64      Dual 16-bit signed multiply with single 64-bit accumulator
Table 5 : Single cycle dual 16-bit MAC operations of the Cortex-M4 processor
Figure 3 shows the use of packed data for a dual 16-bit multiply operation with a single 64-bit
accumulator.
Figure 3 : Cortex-M4 packed data and dual 16-bit MAC
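The packed-data operation in the last row of Table 5 can be modelled in portable C. The sketch below is only a behavioural model for illustration, not the hardware implementation; the helper names `pack16x2`, `dual_mac64` and `dot_q15_packed` are hypothetical. On the Cortex-M4 itself, the additive form of this operation corresponds to the SMLALD instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack two signed 16-bit samples into one 32-bit word:
   'lo' goes to bits 15..0, 'hi' to bits 31..16. */
static uint32_t pack16x2(int16_t lo, int16_t hi)
{
    return (uint32_t)(uint16_t)lo | ((uint32_t)(uint16_t)hi << 16);
}

/* Behavioural model of (16 x 16) + (16 x 16) + 64 = 64.
   The casts back to int16_t rely on the usual two's complement
   conversion, which is implementation-defined in ISO C. */
static int64_t dual_mac64(uint32_t x, uint32_t y, int64_t acc)
{
    int16_t x_lo = (int16_t)(x & 0xFFFFu);
    int16_t x_hi = (int16_t)(x >> 16);
    int16_t y_lo = (int16_t)(y & 0xFFFFu);
    int16_t y_hi = (int16_t)(y >> 16);

    /* Each 16 x 16 product fits in 32 bits; the sum is kept in 64 bits. */
    return acc + (int32_t)x_lo * y_lo + (int32_t)x_hi * y_hi;
}

/* Dot product over packed data: two samples consumed per iteration. */
static int64_t dot_q15_packed(const uint32_t *x, const uint32_t *y,
                              size_t n_pairs)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n_pairs; ++i) {
        acc = dual_mac64(x[i], y[i], acc);
    }
    return acc;
}
```

Because each 32-bit word carries two samples, a dot product over N samples needs only N/2 loop iterations, which is where the doubling of 16-bit throughput comes from.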
Single precision floating point unit (FPU)
Floating point offers a much wider dynamic range than fixed point, because every value carries
its own exponent. It is also easier to program, since designers need not worry about the
scaling and overflow constraints imposed by fixed-point processing. So far, the availability of
floating point hardware in microcontrollers has been limited by its higher silicon area cost. The
low cost Cortex-M4 processor FPU now opens the path to a wide range of floating point enabled DSC devices.
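To make the fixed-point constraints concrete, the sketch below contrasts a multiply in Q15 format (a common 16-bit fixed-point convention, assumed here for illustration) with the same multiply in single precision. The function names are hypothetical.

```c
#include <stdint.h>

/* Q15 fixed point: value = raw / 32768, so the range is [-1, 1). */

/* Q15 multiply: the programmer must rescale and saturate by hand.
   The right shift of a negative value is arithmetic on common
   compilers, although ISO C leaves it implementation-defined. */
static int16_t q15_mul_sat(int16_t a, int16_t b)
{
    int32_t p = ((int32_t)a * b) >> 15;  /* Q30 product back to Q15 */
    if (p > INT16_MAX) {
        p = INT16_MAX;                   /* only (-1) x (-1) overflows */
    }
    return (int16_t)p;
}

/* Convert a Q15 raw value to its real-number meaning. */
static float q15_to_float(int16_t q)
{
    return (float)q / 32768.0f;
}

/* Floating point multiply: no scaling or saturation logic needed. */
static float float_mul(float a, float b)
{
    return a * b;
}
```

A gain of 4.0, for example, cannot be expressed in Q15 at all without choosing a different fixed-point format, whereas `float_mul(4.0f, 0.25f)` simply works.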
The Cortex-M4 FPU provides functionality compliant with the IEEE 754 standard. The FPU supports single-precision data-processing instructions and data types. Some of the floating point operations supported are shown in Table 6 below.
FLOATING POINT OPERATION      CYCLE COUNT
Add/Subtract                  1
Divide                        14
Multiply                      1
Multiply Accumulate (MAC)     3
Fused MAC                     3
Square Root                   14
Table 6 : Selected Cortex-M4 FPU operations and execution times
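The cycle counts in Table 6 allow a rough, back-of-envelope estimate of the arithmetic cost of a routine. The sketch below simply sums the table values for a given mix of operations; it ignores loads, stores, branches and any overlap between instructions, so it should be read as an approximation only. The names `fpu_op`, `fpu_cycles` and `estimate_cycles` are hypothetical.

```c
/* Operations listed in Table 6. */
enum fpu_op {
    FPU_ADD_SUB,
    FPU_DIVIDE,
    FPU_MULTIPLY,
    FPU_MAC,
    FPU_FUSED_MAC,
    FPU_SQRT,
    FPU_OP_COUNT
};

/* Cycle counts copied from Table 6. */
static const unsigned fpu_cycles[FPU_OP_COUNT] = {
    [FPU_ADD_SUB]   = 1,
    [FPU_DIVIDE]    = 14,
    [FPU_MULTIPLY]  = 1,
    [FPU_MAC]       = 3,
    [FPU_FUSED_MAC] = 3,
    [FPU_SQRT]      = 14
};

/* Sum the arithmetic cycles for counts[op] occurrences of each operation. */
static unsigned estimate_cycles(const unsigned counts[FPU_OP_COUNT])
{
    unsigned total = 0;
    for (int op = 0; op < FPU_OP_COUNT; ++op) {
        total += counts[op] * fpu_cycles[op];
    }
    return total;
}
```

For instance, a vector magnitude sqrt(x*x + y*y) written as one multiply, one MAC and one square root comes to 1 + 3 + 14 = 18 arithmetic cycles under these assumptions.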
Software tools
The integrated signal processing features of the Cortex-M4 simplify the development of application software by offering a single tool-chain and processing device, when compared to architectures containing separate applications processors coupled with programmable DSPs or fixed-function accelerators. The single tool-chain environment speeds time-to-market as software plays an increasingly important role in product development.
Many of the high performance signal processing instructions of the Cortex-M4 processor can be
taken advantage of through the compiler. When further optimization is required, C compilers
support intrinsic functions for low-level assembly operations. Intrinsics allow you to leverage the
power of assembly programming in a C development environment while hiding much of the
complexity of pure assembly language.
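As an illustration of the intrinsic approach, the sketch below wraps a dual 16-bit multiply accumulate in a C function. It assumes a toolchain that provides the ACLE header `arm_acle.h` and defines `__ARM_FEATURE_SIMD32` when the target supports these instructions; on any other target it falls back to plain C so the same source still builds. The wrapper name `dual_mac32` is hypothetical.

```c
#include <stdint.h>

#if defined(__ARM_FEATURE_SIMD32)
#include <arm_acle.h>  /* ACLE intrinsics, including __smlad */
#endif

/* Returns acc + x.lo * y.lo + x.hi * y.hi, where x and y each hold
   two packed signed 16-bit values (lo in bits 15..0, hi in 31..16). */
static int32_t dual_mac32(uint32_t x, uint32_t y, int32_t acc)
{
#if defined(__ARM_FEATURE_SIMD32)
    /* The compiler emits the dual 16-bit MAC instruction here. */
    return __smlad((int16x2_t)x, (int16x2_t)y, acc);
#else
    /* Portable fallback with the same wrap-around result. */
    int16_t x_lo = (int16_t)(x & 0xFFFFu);
    int16_t x_hi = (int16_t)(x >> 16);
    int16_t y_lo = (int16_t)(y & 0xFFFFu);
    int16_t y_hi = (int16_t)(y >> 16);
    uint32_t p_lo = (uint32_t)((int32_t)x_lo * y_lo);
    uint32_t p_hi = (uint32_t)((int32_t)x_hi * y_hi);

    /* Unsigned addition avoids undefined behaviour on overflow. */
    return (int32_t)((uint32_t)acc + p_lo + p_hi);
#endif
}
```

The calling code stays ordinary C in both cases, which is the point of intrinsics: the instruction selection is explicit, but register allocation and scheduling are still left to the compiler.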
Cortex Microcontroller Software Interface Standard (CMSIS)
Typically, industries use standards to improve product quality and enable component sharing across
projects. The electronics industry is full of such standards, but the microcontroller market has many
proprietary CPU architectures which prevent the introduction of efficient software standards. This
situation is rapidly changing primarily due to wide adoption of ARM Cortex-M processors. For the
first time ever, the embedded microcontroller industry has the ability to standardize on a single
popular hardware platform.
Figure 5 : CMSIS – a hardware abstraction layer providing consistent access to CPU and peripherals
ARM has created the Cortex Microcontroller Software Interface Standard (CMSIS) that enables
silicon vendors and middleware providers to create software that can be easily integrated. CMSIS
has been developed in close partnership with several key silicon and software vendors. CMSIS is a
vendor-independent hardware abstraction layer that provides a common approach to interfacing
peripherals, real-time operating systems, and middleware components. The standard is scalable to