7/29/2019 Altera Enabling High-Performance DSP Applications With Arria v or Cyclone v Variable-Precision DSP Blocks
May 2011 Altera Corporation
WP-01159-1.0 White Paper
© 2011 Altera Corporation. All rights reserved. ALTERA, ARRIA, CYCLONE, HARDCOPY, MAX, MEGACORE, NIOS, QUARTUS and STRATIX are Reg. U.S. Pat. & Tm. Off. and/or trademarks of Altera Corporation in the U.S. and other countries. All other trademarks and service marks are the property of their respective holders as described at www.altera.com/common/legal.html. Altera warrants performance of its semiconductor products to current specifications in accordance with Altera's standard warranty, but reserves the right to make changes to any products and services at any time without notice. Altera assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Altera. Altera customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services.
101 Innovation Drive
San Jose, CA 95134
www.altera.com
Enabling High-Performance DSP Applications with Arria V or Cyclone V
Variable-Precision DSP Blocks
This document highlights the benefits of the variable-precision digital signal processing (DSP) architecture in Altera's new Arria V and Cyclone V FPGAs. Altera's variable-precision DSP block allows designers to tailor the precision on a block-by-block basis, thereby saving resources and power while increasing performance.
Introduction
DSP designs use hundreds or thousands of multipliers as basic building blocks to implement filters, fast Fourier transforms (FFTs), and encoders that digitally process signals. Depending on the specific type of filter required, varying precision levels may be required within a design at each stage of FIR filters, FFTs, detection processing, adaptive algorithms, or other functions. In addition, DSP algorithms with varying precision levels often require precision higher than 18 bits. The following sections discuss the benefits of Altera's variable-precision DSP architecture available in Arria V and Cyclone V devices.
Key DSP Design Trends
The range of DSP precision requirements varies by application, as shown in Figure 1. Video applications use multipliers ranging from 9x9 to 18x18. Wireless and medical applications push precision requirements even further when implementing complex, multi-channel filters that must maintain data precision after each filter stage. Military, test, and high-performance computing also push the performance and precision requirements, sometimes requiring single- and double-precision floating-point calculations for implementing complex matrix operations and signal transforms.
Figure 1. Applications and Precision Range
(Applications moving to variable and higher precisions, on a scale from 9-bit precision at 100 GMACS to floating-point precision at TeraFLOPS: video surveillance, broadcast systems, wireless basestations, medical imaging, military radar, and high-performance computing.)
The DSP architecture of the 28-nm Arria V and Cyclone V FPGAs is optimized to support both high performance and variable data precision, which enables area- and power-efficient implementation of both fixed- and floating-point operations.
High-Precision DSP Applications
Many cutting-edge applications require high-performance DSP designs that support higher than 18-bit precision, as shown in Figure 2. Precision in this context means the size of a multiplier, for example 9x9, 12x12, 18x18, 27x27, and other sizes. More specifically, precision refers to the width of each operand applied to a multiplier.
Many traditional DSP functions such as FIR filters, FFTs, and custom signal processing datapaths have high-precision requirements. These functions are commonly implemented in military, medical, and wireless systems. When designs require precision higher than 18 bits, designers may implement floating-point signal processing to reach this precision level in high-end designs, such as military space-time adaptive radars and MIMO processing on LTE channel cards. Altera's 28-nm silicon architecture introduces the industry's first variable-precision DSP architecture that allows designers to tailor the precision of each DSP block to suit the application.
Variable-Precision DSP at 28nm
The variable-precision DSP block in Arria V and Cyclone V FPGAs allows designers to select from 9x9 precision to implement a video processing design, all the way up to the floating-point precision required for advanced radar designs. Designers can individually set each DSP block's precision to efficiently accommodate bit growth and required precision increases within the DSP datapath. In addition, the Arria V and Cyclone V DSP block is backward-compatible with all modes supported by Altera's previous-generation 65-nm and 40-nm device families. Figure 3 illustrates the precision ranges supported by a single Arria V or Cyclone V DSP block.
Figure 2. High-Performance Applications
(Military, medical, wireless, high-performance computing, and test and measurement applications, with these functions: high-precision multiply-accumulate, high-precision finite impulse response (FIR) filters, high-precision fast Fourier transforms (FFTs), floating-point FFTs, and floating-point matrix operations.)
Variable-Precision DSP Blocks
Figure 4 maps the multiplier precision required by various FPGA markets to the supported multiplier precisions in Arria V DSP blocks. The Arria V DSP block natively supports nearly all of the precision levels required by these applications. The following sections describe the full-precision, 18x18 with pre-adder mode that is effective in the wireless market.
Figure 3. Architecture with Selectable Precision
(Set the precision dial to match your application: 9x9, 18x18, 18x25, 27x27, 36x36, and 54x54, spanning video, wireless, and military designs.)
Figure 4. Precision Requirements and Arria V Precisions
Markets, left to right: industrial video, broadcast systems, wireless systems, medical imaging, military radar, high-performance computing.
Precision requirements, left to right: 9x9; 12x12; 16x16; 18x18; 18x18; 18x25; 18x36; 27x27; 18x18; 18x25; 18x36; 27x27; 27x27; 54x54.
Supported precisions, left to right: 9x9; 12x12; 16x16; 18x18; 18x18; 18x25; 18x36*; 27x27; 18x18; 18x25; 18x36*; 27x27; 27x27; 36x36*; 54x54*.
Supported precisions (competitive FPGAs): 18x25 in each of the six markets.
* Requires additional logic outside of the DSP block to implement
Variable-Precision Modes
The Arria V, Cyclone V, and Stratix V DSP blocks are the first to offer two native precision modes, as shown in Figure 5.
The available modes are 18-bit mode and high-precision mode for 27x27 multiplications. Figure 6 shows the various multiplier precision modes available in the Arria V (and Cyclone V) DSP block. Designers can implement an 18x36 multiplier by using one DSP block plus additional logic outside the DSP block. Similarly, designers can implement a 36x36 multiplier by using two DSP blocks and additional logic outside the DSP block, or a 54x54 multiplier by using four DSP blocks and additional logic outside the DSP block.
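As an illustration of why a wider multiplier needs "additional logic outside the DSP block", the sketch below builds a signed 18x36 product from two narrower partial products and one external shift-and-add. This is generic two's-complement arithmetic, not Altera's documented mapping; the function name `mul_18x36` and the way the 36-bit operand is split are assumptions made for the example.

```python
def mul_18x36(a: int, b: int) -> int:
    """Signed 18-bit x signed 36-bit multiply from two narrow partial products.

    Hypothetical decomposition for illustration only:
      b = b_hi * 2**18 + b_lo, with b_lo unsigned (18 bits) and b_hi signed.
    """
    assert -(1 << 17) <= a < (1 << 17), "a must fit in 18 signed bits"
    assert -(1 << 35) <= b < (1 << 35), "b must fit in 36 signed bits"

    b_lo = b & 0x3FFFF   # low 18 bits, treated as an unsigned value
    b_hi = b >> 18       # high 18 bits, signed (arithmetic shift)

    # Two partial products. a * b_lo multiplies a signed 18-bit value by an
    # unsigned 18-bit value, which fits a signed 18x19 multiplier.
    p_lo = a * b_lo
    p_hi = a * b_hi

    # The shift and final addition stand in for the logic outside the block.
    return (p_hi << 18) + p_lo


if __name__ == "__main__":
    print(mul_18x36(-5, 123_456_789))
```

The same split-multiply-recombine pattern extends to 36x36 and 54x54, with more partial products and a wider final adder.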
Variable-Precision Efficiency
While the key advantage of variable precision is the ability to take advantage of block-by-block implementation efficiencies, the Arria V variable-precision DSP block also provides the highest number of multipliers of different precisions compared to competing architectures, as shown in Figure 7.
Figure 5. Arria V and Cyclone V DSP Modes
(Block diagrams of the 18-bit precision mode, with two 18x19 multipliers, and the high-precision mode, with one 27x27 multiplier. Labeled elements in both modes: 108-bit input register, pre-adders, coefficient banks, intermediate multiplexer, 64-bit paths, feedback multiplexer, feedback register, output multiplexer, and 74-bit output register.)
Figure 6. Precisions Available in Arria V and Cyclone V FPGAs

Within 1 DSP block:
Quantity   Multiplier Mode
3          9x9
2          12x12
2          16x16
2          18x19
1          18x25
1          27x27
1          18x36*

Within 2, 3, or 4 DSP blocks (quantity 1 each): 18x18 complex multiply, 36x36* complex multiply, 18x25* complex multiply, 27x27 complex multiply, 54x54* complex multiply, 18x36* complex multiply (4 blocks).
* Requires additional logic outside of the DSP block to implement
DSP Multiplier Comparison
Variable-precision DSP blocks provide significant advantages when implementing multipliers of varying precision. Figure 8 compares an Arria V device of 363 KLEs and 1045 variable-precision DSP blocks against a Kintex-7 device of 356 KLCs and 1440 DSP blocks. When compared with the Kintex-7 XC7K355T device, the Arria V 5AGXB3 device's variable-precision DSP blocks provide a clear advantage when implementing multipliers of different precisions. Nearly across the board, variable-precision DSP blocks provide more multipliers per device.
Although competing solutions may offer a few more multipliers in the 18x25 mode, this mode accounts for only a small portion of actual user configurations. Figure 9 provides a comparison of Cyclone V FPGA multipliers against competitive solutions. In general, the Cyclone V device offers more multipliers of different precisions than the Artix-7. The only exception is in the case of 18x25 precision.
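The device-level counts in Figure 8 follow from multiplying the per-block capacities in Figure 7 by the number of DSP blocks in each device. The short sketch below reproduces that arithmetic; the names `PER_BLOCK` and `device_counts` are introduced here for illustration, and the per-block figures are copied from Figure 7.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # "0.5 per block": two blocks are needed per multiplier

# Multipliers per block: (variable-precision DSP block, 18x25 DSP48 block).
PER_BLOCK = {
    "9x9": (3, 1),
    "12x12": (2, 1),
    "16x16": (2, 1),
    "18x18": (2, 1),
    "18x25": (1, 1),
    "18x36": (1, HALF),
    "27x27": (1, HALF),
}


def device_counts(vp_blocks: int, dsp48_blocks: int) -> dict:
    """Return {precision: (variable-precision count, DSP48 count)} for a device pair."""
    return {
        precision: (int(vp_blocks * vp), int(dsp48_blocks * d48))
        for precision, (vp, d48) in PER_BLOCK.items()
    }


if __name__ == "__main__":
    # Block counts quoted in the text: 1045 (Arria V 5AGXB3), 1440 (Kintex-7 XC7K355T).
    for precision, counts in device_counts(1045, 1440).items():
        print(precision, counts)
```

Running the same function with 70 and 120 blocks, the counts implied by the 18x25 row of Figure 9, gives the Cyclone V and Artix-7 columns of that figure.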
Figure 7. Multiplier Precision Comparison
2X to 3X Number of Multipliers per Variable-Precision DSP Block Means Power Reduction

Multiplier Precision                          Arria V & Cyclone V             18x25 DSP48 Block
                                              Variable-Precision DSP Block
9x9 (industrial video)                        3 per block                     1 per block
12x12 (broadcast)                             2 per block                     1 per block
16x16 (broadcast, digital cinema)             2 per block                     1 per block
18x18 (wireless, medical, military)           2 per block                     1 per block
18x25 (wireless, medical, military)           1 per block                     1 per block
18x36 (medical, military)                     1 per block*                    0.5 per block
27x27 (military, high-performance computing)  1 per block                     0.5 per block
* Requires additional logic outside of the DSP block to implement

Figure 8. Arria V FPGA Multiplier Count Comparison
More DSP Resources vs. the Competition

Multiplier Precision                          Arria V FPGA 5AGXB3   Kintex-7 XC7K355T
9x9 (industrial video)                        3,135                 1,440
12x12 (broadcast)                             2,090                 1,440
16x16 (broadcast, digital cinema)             2,090                 1,440
18x18 (wireless, medical, military)           2,090                 1,440
18x25 (wireless, medical, military)           1,045                 1,440
18x36 (medical, military)                     1,045                 720
27x27 (military, high-performance computing)  1,045                 720
DSP Block Evolution
Altera's DSP block architecture has evolved at each process node over time, as illustrated in Figure 10. The fundamental theme of this evolution is backward compatibility and new features that support the next generation of DSP system designs.
Historically, the Arria device DSP block implemented four independent 18x18 multipliers. The Arria II device DSP block continues to support this mode and adds more efficient implementation of eight 18x18 multipliers in sum mode via a 44-bit cascade bus. Designers can effectively use this mode to implement common FIR filter structures.
The latest 28-nm, variable-precision DSP blocks in Arria V and Cyclone V devices maintain compatibility with previous-generation devices, while increasing capability for higher precision signal processing. The Arria V and Cyclone V DSP block architecture is enhanced to implement the highest performance and highest precision DSP application data paths.
Figure 9. Cyclone V FPGA Multiplier Count Comparison
More DSP Resources vs. the Competition

Multiplier Precision                          Cyclone V FPGA 5CGXC4   Artix-7 XC7A50T
9x9 (industrial video)                        210                     120
12x12 (broadcast)                             140                     120
16x16 (broadcast, digital cinema)             140                     120
18x18 (wireless, medical, military)           140                     120
18x25 (wireless, medical, military)           70                      120
18x36 (medical, military)                     70                      60
27x27 (military, high-performance computing)  70                      60
Figure 10. Evolution of DSP Blocks in Arria V FPGA
(Arria GX FPGA: one DSP block, four 18x18 multipliers (independent). Arria II FPGA: two DSP half blocks, eight 18x18 multipliers (sum) or four 18x18 multipliers (independent). Arria V FPGA: four variable-precision DSP blocks, eight 18x19 multipliers (sum), eight 18x19 multipliers (independent), or four high-precision mode blocks.)
Key DSP Enhancements
Arria V and Cyclone V DSP blocks include the following enhancements:
Pre-adders
18x19 Multipliers
Coefficient Banks
Feedback Registers
Independent Multipliers
The following sections discuss these enhancements in greater detail.
Pre-adders
The Arria V and Cyclone V DSP block is enhanced to include pre-adders to reduce multiplier count in symmetric FIR filters, as shown in Figure 11. These pre-adders accept full 18-bit operands, including sign bits. These pre-adders are referred to as hard pre-adders because they are implemented in dedicated hardware resources, rather than as FPGA logic gates.
Figure 12 provides a more detailed view of the hard pre-adders. The next section provides an example application that uses pre-adders in a FIR filter design.
Figure 11. Pre-Adders High Level View
(Block diagram of the DSP block in 18-bit mode with the pre-adders highlighted ahead of the two 18x19 multipliers.)
Figure 13 illustrates the use of pre-adders in a FIR filter. Typically, designers use pre-adders for building symmetric FIR filters. As the filter data is shifted across the coefficient set, two data samples can be multiplied by a common coefficient due to the symmetry. The pre-adder adds two samples prior to multiplication, which allows the use of one, rather than two, multipliers for every two data samples. Pre-adders reduce by half the number of required multipliers for symmetric FIR filters, and eliminate the need to implement such adders using the logic gates in the FPGA. This technique increases logic efficiency and performance. Designers can use this hard pre-adder as either a dual 18-bit pre-adder or as a single 27-bit pre-adder, depending on the required precision.
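The saving described above can be checked with a small software model of a 4-tap symmetric filter, matching the "4 to 2 multipliers" case in Figure 13. This is a behavioral sketch only; `fir_direct` and `fir_preadd` are names invented for the example, and each function returns the output sample together with the number of multiplications it performed.

```python
def fir_direct(window, coeffs):
    """One multiplication per tap: y = sum(c[i] * x[i])."""
    assert len(window) == len(coeffs)
    y = sum(c * x for c, x in zip(coeffs, window))
    return y, len(coeffs)


def fir_preadd(window, half_coeffs):
    """Symmetric FIR with an even tap count: add mirrored samples first.

    Because c[i] == c[N-1-i], y = sum(c[i] * (x[i] + x[N-1-i])) over the first
    N/2 taps, so only N/2 multiplications are needed.
    """
    n = len(window)
    assert n == 2 * len(half_coeffs)
    y = 0
    for i, c in enumerate(half_coeffs):
        pre_sum = window[i] + window[n - 1 - i]  # the pre-adder
        y += c * pre_sum                         # one multiplier per sample pair
    return y, len(half_coeffs)


if __name__ == "__main__":
    samples = [1, 2, 3, 4]   # D0..D3
    half = [5, 7]            # C0, C1; the full coefficient set is [5, 7, 7, 5]
    print(fir_direct(samples, half + half[::-1]))
    print(fir_preadd(samples, half))
```

Both functions produce the same output sample; only the multiplication count differs.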
Figure 12. Hard Pre-Adder Detail
(Enhanced pre-adders: pairs of 18-bit inputs are added or subtracted to form 19-bit results, which feed the 18x19 multipliers together with 18-bit coefficients C0 and C1.)
Figure 13. Usage of Pre-adders in Symmetric FIR Filter
(Symmetric FIR filter with data samples D0 to D3 and 18-bit coefficients C0 and C1, implemented with pre-adders and two 18x19 multipliers. Benefit: reduce from 4 to 2 multipliers.)
18x19 Multipliers
The Arria V and Cyclone V DSP block is enhanced to include an 18x19 multiplier, as shown in Figure 14.
Previous-generation devices included only an 18x18 multiplier. The 18x19 multiplier accepts 19-bit results from the output of the 18+18 pre-adder. Designers can use the extra bit in each pre-adder operand to represent the + or - sign of each operand. Figure 15 shows a close-up view of the 18x19 multiplier.
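The widths involved can be verified by brute reasoning over value ranges: the sum of two signed 18-bit values needs 19 bits, and an 18-bit by 19-bit signed product needs 37 bits. The sketch below computes both; `signed_bits` is a helper written for this example, assuming two's-complement operands.

```python
def signed_bits(lo: int, hi: int) -> int:
    """Smallest two's-complement width whose range contains [lo, hi]."""
    n = 1
    while not (-(1 << (n - 1)) <= lo and hi <= (1 << (n - 1)) - 1):
        n += 1
    return n


MIN18, MAX18 = -(1 << 17), (1 << 17) - 1   # signed 18-bit range
MIN19, MAX19 = -(1 << 18), (1 << 18) - 1   # signed 19-bit range

# Pre-adder: the extreme sums of two 18-bit operands.
PREADD_WIDTH = signed_bits(MIN18 + MIN18, MAX18 + MAX18)

# Multiplier: the extreme products of an 18-bit and a 19-bit operand.
corners = [a * b for a in (MIN18, MAX18) for b in (MIN19, MAX19)]
PRODUCT_WIDTH = signed_bits(min(corners), max(corners))

if __name__ == "__main__":
    print(PREADD_WIDTH, PRODUCT_WIDTH)
```

With an 18x18 multiplier, the 19-bit pre-adder result would have to be truncated or the inputs restricted to 17 bits, which is the limitation the extra multiplier bit removes.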
Figure 14. 18x19 Multipliers
(Block diagram of the DSP block in 18-bit mode with the two 18x19 multipliers highlighted.)
Figure 15. Close-up View of 18x19 Multiplier
(Each 18x19 multiplier takes a 19-bit pre-adder result and an 18-bit operand or coefficient, C0 or C1.)
Figure 16 illustrates an example application of the 18x19 multiplier.
Coefficient Banks
Arria V and Cyclone V DSP blocks include a coefficient storage bank that is dynamically selectable on each clock cycle, as illustrated by Figure 17.
This feature is especially helpful in DSP designs that include FIR filters implemented in hardware using a parallel or partially parallel structure, which often require only a small number of coefficients per multiplier. Altera's variable-precision DSP architecture provides an internal coefficient bank that designers can set to support 18-bit and higher precision signal processing. In 18-bit mode, the coefficient bank is configured as two 18-bit wide register banks, each capable of storing eight coefficients per multiplier. In the high-precision mode, the coefficient bank is configured as a single 27-bit wide register bank capable of storing eight coefficients per multiplier. The coefficient banks allow designers to select which of the eight registers should be used as a coefficient source for the multiplier for every clock cycle.
Use of the internal coefficient bank eases timing closure complexity and reduces on-chip memory and register resource usage, both of which are critical in DSP designs. Figure 18 shows the coefficient bank in the 18-bit mode and in the 27-bit mode.
Figure 16. Usage of 18x19 Multiplier
(DSP blocks with 18x19 multipliers fed by 18+18 pre-adders and coefficient banks, with MLABs providing N accumulator registers for N channels. Benefit: the 18x19 multiplier accepts 18+18 data, giving a full 18-bit add with a 19-bit result.)
* Use memory logic array block (MLAB) for cases where number of coefficients per multiplier > 8
Figure 17. Coefficient Blocks within the DSP Block
(Block diagram of the DSP block in 18-bit mode with the two coefficient banks highlighted.)
Figure 19 shows how a serial filter is implemented, making use of the two 18-bit coefficient banks. The DSP architecture of the Arria V and Cyclone V FPGA effectively supports this type of filter because the coefficient banks, the 18x19 multipliers, and the output register are all contained in one DSP block. In addition, the output can be cascaded to the next block in a sequential chain. Having the coefficient bank inside the DSP block reduces logic and routing utilization, thus improving filter performance.
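A behavioral sketch of one multiplier lane of such a serial filter is shown below: an eight-entry coefficient bank, a select value that changes every clock cycle, and an accumulating output register. The class and function names are hypothetical, and the model ignores pipelining and the second multiplier; it only illustrates the per-cycle coefficient selection described in the text.

```python
class CoefficientBank:
    """Eight-entry register bank, 18 or 27 bits wide, read by a select index."""

    DEPTH = 8

    def __init__(self, coeffs, width=18):
        if len(coeffs) > self.DEPTH:
            # More than eight coefficients per multiplier do not fit the bank.
            raise ValueError("at most 8 coefficients per multiplier")
        lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
        if any(not lo <= c <= hi for c in coeffs):
            raise ValueError(f"coefficient does not fit in {width} signed bits")
        self.regs = list(coeffs) + [0] * (self.DEPTH - len(coeffs))

    def read(self, select: int) -> int:
        return self.regs[select]


def serial_fir(bank: CoefficientBank, window) -> int:
    """One multiply-accumulate per clock cycle, one coefficient select per cycle."""
    output_register = 0
    for cycle, sample in enumerate(window):
        output_register += bank.read(cycle) * sample
    return output_register


if __name__ == "__main__":
    bank = CoefficientBank([1, 2, 3])
    print(serial_fir(bank, [4, 5, 6]))
```

In this model the coefficient never leaves the block, which mirrors the routing saving the text attributes to the internal bank.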
Figure 18. Structure of Coefficient Bank
(Eight registers, numbered 0 to 7, configured either as 18-bit coefficients feeding the 18x19 multipliers or as 27-bit coefficients feeding the 27x27 multiplier.)
Figure 19. Usage of Coefficient Bank in Filter
(Serial filter using two 18-bit coefficient banks, two 18x19 multipliers, a bias register, a feedback register, and the output register within one DSP block. Benefit: reduce logic and routing utilization, which improves filter performance.)
Feedback Registers
Arria V and Cyclone V DSP blocks include feedback registers that can serve as the second stage in a two-stage accumulator composed of the output register and feedback registers. The relative position of the feedback register in the DSP block is illustrated in Figure 20.
Figure 21 shows how a polyphase serial filter is implemented, with the feedback register enabled to provide a feedback path. This structure enables two independent serial-filter channels in a single DSP block. Each channel has its own set of inputs. The feedback path is time multiplexed, allowing processing of the real part and the imaginary part of a complex signal in alternating clock cycles. Only N/2 adders are needed because the Arria V and Cyclone V DSP block in 18-bit mode has two 18x19 multipliers per DSP block. This implementation is efficient and saves resources.
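One way to read the two-stage accumulator is as an adder whose feedback loop contains two registers, so each running sum returns to the adder every second cycle. The sketch below models that reading; the register update order and the function name `two_stage_accumulate` are assumptions for illustration, not a description of the actual datapath timing.

```python
def two_stage_accumulate(products):
    """Accumulate two interleaved streams through a two-register feedback loop.

    Each clock cycle:
      output_register   <- product + feedback_register
      feedback_register <- previous output_register
    With an even number of inputs, the feedback register ends up holding the
    sum of the even-indexed products and the output register the sum of the
    odd-indexed products.
    """
    output_register = 0
    feedback_register = 0
    for product in products:
        output_register, feedback_register = (
            product + feedback_register,
            output_register,
        )
    return feedback_register, output_register


if __name__ == "__main__":
    real = [1, 2, 3]
    imag = [10, 20, 30]
    # Real and imaginary products arrive on alternating clock cycles.
    interleaved = [p for pair in zip(real, imag) for p in pair]
    print(two_stage_accumulate(interleaved))
```

The single adder is shared by both channels, which is the resource saving the text describes.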
Figure 20. Feedback Register
(Block diagram of the DSP block in 18-bit mode with the feedback register highlighted.)
Figure 21. Feedback Register Usage
(Polyphase serial filter: an N:1 complex input data demultiplexer feeds M10K memories and DSP blocks with 18x19 multipliers and feedback registers, followed by an N/2:1 multistage adder using complex inputs and outputs.)
Independent Multipliers
Arria V and Cyclone V DSP blocks include independent multipliers. This means that the output(s) of the multiplier(s) can be routed to the output port of the DSP block directly, without going through any adder. Figure 22 shows two 18x19 multipliers, which can be configured to work in the sum mode or independent mode.
Each DSP block contains two 18x19 multipliers. These can be used as two completely independent multipliers with inputs fed from outside the DSP block, as shown on the left-hand side of Figure 23, or with each multiplier having one operand fed from a coefficient bank and the outputs of the multipliers delivered independently, as shown on the right-hand side of Figure 23.
The output port of the DSP block in Arria V and Cyclone V is 74 bits wide and therefore can accommodate the 37-bit outputs of both independent 18x19 multipliers. This means that all 37 bits from each multiplier are directly accessible on the output port.
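The 74-bit figure is simply two 37-bit products side by side. The sketch below packs two signed 37-bit results into one 74-bit word and recovers them; the bit placement (first product in the low half) and the function names are assumptions made for the example.

```python
WIDTH = 37
MASK = (1 << WIDTH) - 1


def pack_outputs(p0: int, p1: int) -> int:
    """Place two signed 37-bit products into one 74-bit word (p0 in the low half)."""
    for p in (p0, p1):
        assert -(1 << (WIDTH - 1)) <= p < (1 << (WIDTH - 1)), "product exceeds 37 bits"
    return ((p1 & MASK) << WIDTH) | (p0 & MASK)


def unpack_outputs(word: int):
    """Recover both signed products from the 74-bit word."""
    def sign_extend(value: int) -> int:
        return value - (1 << WIDTH) if value & (1 << (WIDTH - 1)) else value

    return sign_extend(word & MASK), sign_extend((word >> WIDTH) & MASK)


if __name__ == "__main__":
    # Largest-magnitude products of an 18-bit by 19-bit signed multiplier.
    p0 = (-(1 << 17)) * (-(1 << 18))
    p1 = ((1 << 17) - 1) * (-(1 << 18))
    word = pack_outputs(p0, p1)
    print(word.bit_length(), unpack_outputs(word) == (p0, p1))
```

Because both full-width products fit, no result bits have to be dropped when the multipliers run independently.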
Figure 22. Input/Output Ports
(Block diagram of the DSP block in 18-bit mode with the independent multipliers highlighted.)
Figure 23. Application Example
(Left: two 18x19 multipliers with both operands supplied from outside the DSP block, each producing a 37-bit output. Right: two 18x19 multipliers with one operand supplied from coefficients C0 and C1, each producing a 37-bit output.)
Altera Floating-Point Precision
Depending on the application, multiplications may need to be performed as single-precision or double-precision floating-point multiplications. The Arria V and Cyclone V DSP block is capable of both levels of precision, as described in the following sections.
IEEE Standard 754 floating point is the most common representation of floating-point numbers. In this format, single-precision floating point is 32 bits wide with a 24-bit mantissa, while double-precision floating point is 64 bits wide and has a 53-bit mantissa.
Floating-point computations involve mantissa multiplication and exponent addition. The Altera variable-precision DSP architecture can implement mantissa multiplication for a single-precision floating-point number using one block, or mantissa multiplication for a double-precision floating-point number using four cascaded blocks.
Single-Precision Floating-Point Multiplication
Using the high-precision mode, the variable-precision block is uniquely suited for implementing single-precision, floating-point operations. Mantissa multiplication can be implemented using only one variable-precision block configured in the high-precision mode. This resource efficiency is an FPGA industry first. Traditionally, designers had to cascade multiple blocks to implement this operation. The coefficients may be applied externally, as shown on the left-hand side of Figure 24, or internally, as shown on the right-hand side. Competing DSP architectures with 18x25-bit resolution require multiple blocks, as well as external logic, to implement a floating-point mantissa multiplication, resulting in a lower performance and higher power implementation.
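The sketch below illustrates the arithmetic: each IEEE 754 single-precision operand is split into sign, exponent, and 24-bit mantissa, the mantissas are multiplied as integers (a 24x24 product, which fits a 27x27 multiplier), and the exponents are added. It handles normal numbers only and returns the exact, unrounded product, so normalization and rounding back to 32 bits are deliberately left out. The helper names are invented for the example.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


def decompose(x: float):
    """Split a normal float32 into (sign, unbiased exponent, 24-bit mantissa)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if biased_exponent in (0, 0xFF):
        raise ValueError("zero, subnormal, infinity, and NaN are not handled")
    return sign, biased_exponent - 127, fraction | (1 << 23)  # restore hidden bit


def multiply_exact(a: float, b: float) -> float:
    """Exact product of two float32 values via integer mantissa multiplication."""
    sign_a, exp_a, man_a = decompose(a)
    sign_b, exp_b, man_b = decompose(b)
    assert man_a < (1 << 27) and man_b < (1 << 27)   # both fit a 27-bit operand
    mantissa_product = man_a * man_b                  # at most 48 bits
    exponent_sum = exp_a + exp_b
    # Each mantissa carries 23 fractional bits, hence the scale of 2**-46.
    magnitude = math.ldexp(mantissa_product, exponent_sum - 46)
    return -magnitude if sign_a ^ sign_b else magnitude


if __name__ == "__main__":
    a, b = to_float32(0.1), to_float32(-3.7)
    print(multiply_exact(a, b) == a * b)
```

The comparison against `a * b` works because the product of two 24-bit mantissas has at most 48 significant bits, which a Python double represents exactly.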
Figure 24. Single-Precision Floating-Point Multiplication
(High-precision mode block diagram with a 27x27 multiplier and 64-bit accumulator register, shown in two configurable forms: both 27-bit operands supplied through the input registers, or one 27-bit operand supplied from the 27-bit coefficient bank.)
Double-Precision Floating-Point Multiplication
Double-precision mantissa multiplication requires four DSP blocks, all cascaded by using the dedicated 64-bit cascade bus in the DSP block, as shown in Figure 25.
This technique is an FPGA industry first, because competing architectures require cascading two 18x25 blocks for single-precision, floating-point mantissa multiplication and up to nine blocks (with extra logic) to implement a 54x54 double-precision mantissa multiplier.
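The block count of four matches the number of 27x27 partial products in a 54x54 multiplication. The sketch below shows that decomposition for unsigned mantissas; it is a plausible arithmetic illustration under that assumption, not a description of how the cascade bus schedules the additions, and `mul_54x54` is a name chosen for the example.

```python
MASK27 = (1 << 27) - 1


def mul_54x54(a: int, b: int) -> int:
    """Unsigned 54x54 multiply built from four 27x27 partial products."""
    assert 0 <= a < (1 << 54) and 0 <= b < (1 << 54)
    a_hi, a_lo = a >> 27, a & MASK27
    b_hi, b_lo = b >> 27, b & MASK27

    # One 27x27 multiplication per partial product, four in total.
    hh = a_hi * b_hi
    hl = a_hi * b_lo
    lh = a_lo * b_hi
    ll = a_lo * b_lo

    # Shift-and-add recombination of the partial products.
    return (hh << 54) + ((hl + lh) << 27) + ll


if __name__ == "__main__":
    # Two 53-bit mantissas with the hidden bit set, as in double precision.
    m1 = (1 << 52) | 0x123456789ABCD
    m2 = (1 << 52) | 0xFEDCBA987654
    print(mul_54x54(m1, m2) == m1 * m2)
```

A 53-bit double-precision mantissa fits the 54-bit operand with one bit to spare.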
Competitive Summary
With the introduction of the variable-precision DSP architecture, Altera has opened a DSP technology gap against competing architectures, as summarized in Figure 26. Altera's latest 28-nm devices can natively, and within a single block, implement a 27x27 multiplier useful for high-precision, fixed-point DSP, or for emerging floating-point DSP applications. Variable precision means that designers set the DSP architecture precision to match the algorithm, not the other way around. Also, with a 64-bit cascade bus and accumulator, designers don't have to forgo precision when the algorithm implementation requires multiple DSP blocks.
Figure 25. Floating-Point Modes
(Single-precision mantissa multiplication in 27x27 mode uses one high-precision DSP block; double-precision mantissa multiplication in 54x54 mode uses four cascaded high-precision DSP blocks, each with a 27x27 multiplier.)
Conclusion
Altera's variable-precision DSP block allows the designer to tailor the precision on a block-by-block basis. For symmetric filters, hard pre-adders in the DSP block reduce the required multiplier count by 50%, thus saving resources and power. The 18x19 multipliers accommodate full 18+18 addition, including sign bits. Internal coefficient banks enable higher multiplier performance and save logic resources.
The Arria V and Cyclone V DSP block is optimized for FIR filters, and the feedback register allows implementation of two independent serial-filter channels per DSP block. The independent multipliers allow operands to be applied directly to the multipliers and allow the multiplier outputs to be observed directly on the DSP block output port. Finally, Altera offers the industry's first floating-point function in an FPGA architecture.
Further Information
Arria V FPGA Family Overview:
http://www.altera.com/products/devices/arria-fpgas/arria-v/overview/arrv-overview.html
Arria V Device Family Advance Information Brief:
http://www.altera.com/literature/hb/arria-v/av_51001.pdf
Cyclone V FPGA Family Overview:
http://www.altera.com/products/devices/cyclone-v-fpgas/overview/cyv-overview.html
Cyclone V Device Family Advance Information Brief:
http://www.altera.com/literature/hb/cyclone-v/cyv_51001.pdf
Acknowledgements
Pat Fasang, Senior Member of Technical Staff, DSP Marketing, Altera Corporation
Figure 26. Competitive Comparison

Feature                  Xilinx     Arria V and Cyclone V FPGAs
Accumulator size         48 bits    64 bits
Width of cascade bus     48 bits    64 bits

Other features compared: native support for 27x27 multiply mode; variable-precision multiplier size (27x27 or dual 18x19); efficient implementation of floating point; coefficient register banks within the DSP block; efficient 2-stage accumulator (feedback register); pre-adder support for symmetric filters; support for systolic FIR filters.