The World Leader in High Performance Signal Processing Solutions
Blackfin Presentation
ADI programmable processor Architecture
[Chart: ADI processor families plotted against performance]
TigerSHARC: High Performance
SHARC: Low Cost
Floating Point
2.5G/3G Infrastructure, Medical Imaging, Industrial Imaging, Multiprocessing
ADSP-21xx: Power efficient
Fixed Point
Blackfin ADSP-BF53x
Media enabled, Fixed Point
Wired Voice, Wireless Voice, VOIP/VON, Industrial Control
Image compression, 3G Terminals, Digital Still/Video Camera, MMOIP, Telematics, Biometrics
Audio, Infotainment, Industrial
What does Blackfin Enable?
[Diagram: Micro-Processing, Image Processing and Digital Signal Processing combined in one device]
Wireless Connectivity: Bluetooth, GSM, 3rd Generation
Digital Imaging CODECs: MPEG, JPEG, H.263, H.264
System Control / Applications Software
Wired Connectivity: USB, TCP/IP, MOST Network, H.323/MEGACO
Human Interface: Speech Recognition, Text To Speech, Handwriting, Audio
Operating Systems / RTOS
Designed for High Level Language
Control: Networking, RTC, Watchdog, RTOS, MMU, Byte addressable
Blackfin Processors Perform Signal Processing and Micro-controller Functions
Traditional model: an MCU, several signal processors, and an ASIC
  Interfaces to sensors
  Broad peripheral mix
  Memory
New model: Blackfin can perform all of these functions
Blackfin – Micro Signal Architecture
The Micro Signal Architecture was crafted with the requirements of a controller and a DSP in mind
Blackfin IS NOT just a DSP with an enhanced instruction set
Blackfin IS NOT just a processor with a couple of arithmetic units added
Blackfin IS an architecture that is optimized to perform equally well on both control and numeric algorithms
Blackfin CAN easily be programmed in assembler, C/C++, or mixed
Blackfin – A Convergent Processor
BLACKfin is a high performance dual MAC DSP with features more normally seen on a 32-bit RISC microprocessor
  Supervisor and User Modes
  Memory Protection
  Byte addressing
  8-, 16-, 32-bit math
  Multimedia processing extensions
Single Processor target for software development
  Single development tools environment
  Single Programming Model
  Single Instruction Set
  Simplified emulation of otherwise asynchronous cores
Blackfin: Native Hooks in core for Audio & Video processing
Video Compression
  Up to four 8-bit math operations in a single cycle
  ~300 cycle execution for an 8*8 DCT (the foundation of MPEG motion estimation)
  IEEE 1180 Rounding maximizes efficiency
Motion Estimation
  Executes four partial Sum Absolute Differences in a single cycle
Huffman Coding
  Field Deposit / Extract instruction
Audio
  Voice Codecs: On-The-Fly Saturation for 2G, 3G
  Extended precision for Dolby decoding
Other
  Instruction Set Support for Complex Math, Bit Interleaving, Population Count, Viterbi Dual Add-Compare-Select, and CRC
Blackfin – Dynamic Power Management lowers power consumption
Self contained, software programmable power management system that allows for independent control of either frequency or voltage
Variable Frequency
  Clock dividers (1x to 63x) enable low latency changes in system performance
Variable Voltage
  On-Chip Voltage Regulator generates accurate voltage from 2.25V to 3.6V input
  Core voltage programmable from 0.8V to 1.2V (50 mV increments)
[Chart: Power (mW) for audio processing and video processing at the operating points 600 MHz/1.2V, 500 MHz/1.2V, 500 MHz/1.0V, 200 MHz/1.2V and 200 MHz/0.8V, comparing the power savings of scaling frequency only with scaling voltage and frequency]
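As a rough illustration of why scaling voltage together with frequency saves more than scaling frequency alone, the sketch below uses the usual first-order CMOS model, in which dynamic power is proportional to f x V^2. The model ignores leakage and is an assumption of this example, not a figure from the slides; the function name is hypothetical.

```python
def relative_dynamic_power(freq_mhz, vdd, ref_freq_mhz=600.0, ref_vdd=1.2):
    """Dynamic power relative to the reference point, assuming P ~ f * V^2."""
    return (freq_mhz / ref_freq_mhz) * (vdd / ref_vdd) ** 2

# Operating points named on the slide, relative to 600 MHz at 1.2V.
freq_only = relative_dynamic_power(200, 1.2)       # frequency scaled, voltage unchanged
freq_and_volt = relative_dynamic_power(200, 0.8)   # frequency and voltage both scaled

print(f"200 MHz, 1.2V: {freq_only:.3f} of reference power")
print(f"200 MHz, 0.8V: {freq_and_volt:.3f} of reference power")
```

Under this model, dropping the core voltage from 1.2V to 0.8V at the same frequency cuts dynamic power by a further factor of (0.8/1.2)^2.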
Blackfin – MMU Protects You
Supervisor & User Protection of Memory and Registers
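[Diagram: processor modes. Supervisor mode runs system code and event handlers in a protected system environment; User mode runs application code, with peer-peer protection between user tasks; Emulation mode and the power-down states are also shown]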
Blackfin: Industry Standard RTOS and OS Support
[Diagram: Blackfin with VCSE, running real-time DSP code under an RTOS and control applications under an OS]
Operating Systems
  µCLinux - Now
Real Time Operating Systems
  VDK from ADI - Now
  Unicoi Fusion - Now
  Accelerated Technology Nucleus - Now
  Express Logic ThreadX - Now
  Quadros RTXC - Now
  Green Hills INTEGRITY - Now
  Green Hills VelOSity - Now
  uITRON (API) - Now
Networking Stacks
  Kadak Kwik-Net - Now
  Unicoi Fusion Net - Now
  Express Logic Net-X - Now
Architecture: Core
[Block diagram of the processor]
Core Clock (CCLK) Domain and System Clock (SCLK) Domain
Core blocks: Core Processor, Core Timer, JTAG/Debug, Performance Monitor, L1 Instruction Memory, L1 Data Memory, Event Controller, Power Management
Core buses: 64-bit Core I bus; 32-bit Core D0 and Core D1 buses; 32-bit Core DA0 and Core DA1 buses; 32-bit LD0, LD1 and SD buses; 32-bit DMA Mastered bus
System Bus Interface Unit
System blocks: DMA Controller, EBIU, 1KB internal Boot ROM, Watchdog and Timers, Real Time Clock, UART0/1 with IrDA, SPORTs, SPI, PPI, Programmable flags
System buses: Peripheral Access Bus (PAB), DMA Access Bus (DAB), External Access Bus (EAB), DMA Core Bus (DCB), DMA Ext Bus (DEB), External Port Bus (EPB)
External port signals: Data, Address, Control
Also shown: ETHERNET MAC (BF536/7 only), CAN, TWI, PORTS
Blackfin DSP : MSA Architecture
Fixed point DSP Math
  Dual 32/40 bit Data ALUs with 40-bit Accumulators
  Dual 16-bit MACs
Dual 32-bit Data Address Generators (DAG) ALUs
  Conventional addressing modes (with pointers) for C code
Modified Harvard Memory Architecture
  Two data ports and one code port for the unified 4G Byte addressable memory architecture
Configurable hierarchical data and instruction memories
  Caches or SRAM configured by the user
Instruction Set Optimizations
  Dual instruction lengths: 16 & 32 bits for “Control” and “DSP” operations
  Dual instruction dispatch forms (16/32 bit & 64-bit (1x32, 2x16))
Blackfin : MSA Architecture
Two 16-bit Multipliers
Two 32/40-bit ALUs
Four 8-bit video ALUs
Barrel Shifter
Sixteen 16-bit math registers / Eight 32-bit math registers
Two DAGs, with byte addressing support
Eight 32-bit pointer registers
Four sets of 32-bit index, modify, length and base registers
[Block diagram: Data Arithmetic Unit (registers R0-R7, two 16-bit multipliers, four 8-bit video ALUs, barrel shifter, 40-bit accumulators Acc0 and Acc1), Address Arithmetic Unit (DAG0, DAG1, I0-I3, L0-L3, B0-B3, M0-M3, P0-P5, FP, SP) and Sequencer]
Blackfin : Register Set
System Registers
  LC0, LT0, LB0 and LC1, LT1, LB1: Loop Counter, Loop Top, Loop Bottom
  ASTAT: Arithmetic Status
  RETS: Subroutine Return
  RETI: Interrupt Return
  RETX: Exception Return
  RETN: NMI Return
  RETE: Emulation Return
  SYSCFG: System Config
  SEQSTAT: Sequencer Status
Address Registers (32 bits each, bits 31-0)
  I0-I3, L0-L3, B0-B3, M0-M3
  P0-P5
  FP, SP, USP
Data Registers
  R0-R7 (32 bits each; each also accessible as 16-bit halves Rn.H, bits 31-16, and Rn.L, bits 15-0)
  A0 and A1, with extensions A0X and A1X
Blackfin : Data
Support for three data lengths.
  32-bit, 16-bit and 8-bit data
Integer and fractional data types for 32-bit/16-bit data.
  8-bit data is always integer
  Signed/Unsigned data support for integers
Little Endian data format.
ALU operands may be 16-bit or 32-bit.
Multiplier operand types are specified in the instruction.
Shifter operates on signed/unsigned operands
ASTAT FLAGs are updated upon ALU/MULT/SHIFT operations
Blackfin : ALU
2 ALUs exist in Blackfin MSA core
Support arithmetic and logical operations on fixed point data
ALU instructions operate on 16-bit, 32-bit and 40-bit integer (Accumulation) operands
Key ALU Instruction Categories
  Fixed point addition and subtraction (register based and with immediate values)
  Accumulation of multiplies
  Logical operations (OR, AND, XOR, NOR etc.)
  Functions: ABS, MAX, MIN, Division primitives etc.
Blackfin : Register View of Math
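[Diagram: the Register File supplies 32-bit operands (for example R2 and R3, each split into halves at bits 31-16 and 15-0) to MAC 0 / ALU 0 and MAC 1 / ALU 1; results go to the accumulators A0 and A1 or back to registers such as R6 and R7 over 32-bit paths]
Dual ALU / MAC functions are “Vector” functions
Two pairs of operands are available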
Blackfin : Multiplier & Accumulator
Blackfin supports 2 multipliers
  Can perform 2 Multiplies and 2 Accumulates (2 MACs) per cycle
  2 Accumulators, 40 bits each (8 guard bits)
Destination register may be an R register or an Accumulator.
Saturation to either 32 or 40 bits is possible in the Accumulator.
Optional rounding is possible for a destination R register.
Input formats can be
  FU: Fractional Unsigned
  IS: Integer Signed
  M: Mixed signed and unsigned operands
Multiplier overflow status bits exist in ASTAT
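The role of the 8 guard bits and of saturation can be sketched in a few lines. This is a behavioural model for signed integer operands only, not the hardware's exact arithmetic; the helper names `saturate` and `mac` are hypothetical.

```python
ACC_BITS = 40  # accumulator width from the slide: 32 bits plus 8 guard bits

def saturate(value, bits):
    """Clamp a signed integer to the range of a two's-complement word of `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))

def mac(acc, x, y, sat_bits=ACC_BITS):
    """One multiply-accumulate step on signed 16-bit integer operands."""
    assert -32768 <= x <= 32767 and -32768 <= y <= 32767
    return saturate(acc + x * y, sat_bits)

acc = 0
for _ in range(4):
    acc = mac(acc, 32767, 32767)   # each product is 1,073,676,289

acc40 = acc                  # still exact: the guard bits absorb the growth
acc32 = saturate(acc40, 32)  # saturating to 32 bits clamps to the largest positive value
print(acc40, acc32)
```

Four worst-case products already exceed the signed 32-bit range, which is why the accumulator carries guard bits and why saturation is applied when the result is narrowed.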
Blackfin : Data Address Generators
• Four sets of 32-bit Index, Base, Length Registers for DSP circular buffers
• Four Modify Registers, used with any of the Index Registers, access 16-bit and 32-bit aligned data
• Separate stack pointers for user and supervisor modes, aliased to SP
• Six 32-bit Pointer Registers for general use to access 8, 16 and 32-bit data
[Register diagram: I0-I3, L0-L3, B0-B3, M0-M3, P0-P5, FP, SP, USP, each 32 bits wide]
Addressing modes: indirect, auto-increment/decrement, post modify with non-unity stride, indexed with immediate offset, circular buffer and bit reverse
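The circular-buffer mode can be pictured as a post-modify that wraps the index back into the window defined by the base and length registers. The sketch below models that behaviour in software; it assumes the modify value is no larger in magnitude than the buffer length, and the function name is hypothetical.

```python
def circular_post_modify(index, modify, base, length):
    """Return the next index after a post-modify with wrap-around.

    Assumes abs(modify) <= length, so one correction is enough.
    """
    index += modify
    if index >= base + length:
        index -= length
    elif index < base:
        index += length
    return index

base, length = 0x1000, 16      # hypothetical 16-byte buffer
i0 = base
visited = []
for _ in range(6):
    visited.append(i0)
    i0 = circular_post_modify(i0, 4, base, length)   # step by one 32-bit word

print([hex(a) for a in visited])
```

After four 32-bit accesses the index returns to the base address without any explicit compare-and-branch in the loop body.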
Blackfin : Addressing Modes
Indexed Addressing with index register
  R0 = [I2] ;
  R0 = W[I2] ;
Indexed Addressing with pointer register
  R1 = [P0] ;
  B [P1++] = R3 ;
Auto increment and Auto decrement addressing
  R0 = W [P1++] (Z) ;
  R2 = [P2--] ;
Premodify Stack pointer
  [--SP] = R3 ;
Indexed Addressing with immediate offset
  P5 = [P1 + 0x100] ;
Post modify Addressing
  R5 = [P1 ++ P2] ;
  R6 = W[P4++P5](Z) ;
  R2 = [I2 ++ M1] ;
DAG and Pointer register modifications
  I1 += M2 ;
Alignment exception support
Program Sequencer
The Program Sequencer controls all program flow
Maintains:
  Loops
  Subroutines
  Jumps
  Idle
  Interrupts and Exceptions
Contains a 10-stage instruction pipeline
Pipeline is fully interlocked: all the data hazards are hidden from the programmer
Avoiding Pipeline Stalls
Most common numeric operations have no instruction latency
VisualDSP++ Pipeline Viewer highlights Stall, Kill conditions
Unconditional Branches in the Pipeline
[Pipeline diagram: instructions I1 to I5, the branch Br and the branch target BT moving through the stages IF1, IF2, IF3, DC, AC, EX1, EX2, EX3, EX4 and WB over cycles 1 to 13; the slots fetched behind the branch are filled with NOPs]
Branch target address calculation takes place in the AC stage
Branch target address is sent to the Fetch address bus at the EX1 stage
Latency for all unconditional branches is 4 cycles
I1: Instruction before the branch
Br: Branch instruction
BT: Instruction at the branch target
Conditional Branches (Jumps)
Conditional Branches are executed based on the CC bit.
A static prediction scheme (based on the BP qualifier in the instruction) is used.
Branch is handled in the AC stage.
In the EX4 stage, the true CC bit is compared to the predicted value.
Prediction      Outcome      Latency
Not Taken       Not taken    0 cycles
Not Taken       Taken        8 cycles
Taken (BP)      Taken        4 cycles
Taken (BP)      Not taken    8 cycles
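One way to read the latency figures is to ask when the BP qualifier pays off. The sketch below computes the expected latency for a branch that is taken with probability `p_taken`, using the slide's four latency values; the function names and the probability model are assumptions of this example.

```python
# Latency in cycles, keyed by (predicted_taken, actually_taken), from the slide.
LATENCY = {
    (False, False): 0,
    (False, True): 8,
    (True, True): 4,
    (True, False): 8,
}

def expected_latency(predict_taken, p_taken):
    """Average branch latency for a branch taken with probability p_taken."""
    taken = LATENCY[(predict_taken, True)]
    not_taken = LATENCY[(predict_taken, False)]
    return p_taken * taken + (1 - p_taken) * not_taken

def bp_is_better(p_taken):
    """True when predicting taken (BP) has the lower expected latency."""
    return expected_latency(True, p_taken) < expected_latency(False, p_taken)

for p in (0.5, 0.9):
    print(p, expected_latency(False, p), expected_latency(True, p), bp_is_better(p))
```

With these numbers the break-even point is a taken probability of 2/3, so under this model BP is worth using for loop-closing branches and other branches that are taken most of the time.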
Hardware Loop
Supports two zero-overhead nested loops
Load the Loop registers by using the Loop Setup (LSETUP) instruction
  Registers can also be set manually
If more than 2 nested loops are required, the stack must be used
Events (Interrupts / Exceptions)
The Event Controller manages 5 types of Events:
  Emulation
  Reset
  Non-Maskable Interrupt (NMI)
  Exception
  Interrupts
    Hardware Error
    Core Timer
    9 General-Purpose Interrupts for servicing peripherals
Blackfin : Interrupt Management
IMASK Register (bits 0-15)
  A set bit indicates that the corresponding interrupt is enabled
ILAT Register (bits 0-15)
  A set bit indicates an interrupt has been ‘latched’ to be serviced
IPEND Register (bits 0-15)
  A set bit indicates an interrupt service is ‘pending’; the highest-priority bit set indicates which service routine is being executed
RAISE n Instruction (n = 0-15)
  Forces a bit to be set in ILAT. It ‘raises’ the priority of the execution
EXCPT n Instruction (n = 0-15)
  Forces an exception to occur: the EVSW bit is set in ILAT and ‘n’ determines which exception routine to execute
Blackfin : Interrupts
Event                     Priority   Name
Emulator                  0          EMU      (Highest)
RESET                     1          RST
Non Maskable Interrupt    2          NMI
Exceptions                3          EVSW
-                         4          -
Hardware Error            5          IVHW
Timer                     6          IVTMR
General Purpose 7         7          IVG7
General Purpose 8         8          IVG8
General Purpose 9         9          IVG9
General Purpose 10        10         IVG10
General Purpose 11        11         IVG11
General Purpose 12        12         IVG12
General Purpose 13        13         IVG13
General Purpose 14        14         IVG14
General Purpose 15        15         IVG15    (Lowest)
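The priority ordering can be illustrated with a small software model that picks the next event to service from the latched and enabled bits. It is a deliberate simplification: every event is treated as maskable through the mask argument, and nesting against an event already in service is not modelled. The function name is hypothetical.

```python
# Event names by priority number, taken from the slide's table (4 is unused).
EVENT_NAMES = {0: "EMU", 1: "RST", 2: "NMI", 3: "EVSW", 5: "IVHW", 6: "IVTMR"}
EVENT_NAMES.update({n: f"IVG{n}" for n in range(7, 16)})

def next_event(ilat, imask):
    """Return the name of the highest-priority latched and enabled event, or None."""
    candidates = ilat & imask & 0xFFFF
    for bit in range(16):                      # bit 0 has the highest priority
        if candidates & (1 << bit) and bit in EVENT_NAMES:
            return EVENT_NAMES[bit]
    return None

latched = (1 << 6) | (1 << 11)                 # core timer and IVG11 both latched
print(next_event(latched, 0xFFFF))             # both enabled
print(next_event(latched, 1 << 11))            # only IVG11 enabled
```

The lower the bit number, the higher the priority, so the core timer wins over IVG11 whenever both are latched and enabled.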
Blackfin : Configurable Memory System
Supports a Cache Memory Model and an SRAM Memory Model
  Sustained Dual Data Accesses for DSP Applications
  Supports accesses of 8, 16, 32 bit data
  Separate Multi-ported L1 Instruction and Data Memories
Cache Memory for Microcontrollers & SRAM for DSP Applications
Memory management for Cache Protection
[Diagram: the Processor Core connects to L1 Instruction SRAM & Cache, L1 Data SRAM & Cache and Scratchpad SRAM; DMA and the L2 Instruction & Data SRAM sit behind the L1 memories]
Blackfin : Memory Architecture
Configurable Memory: Why?
  As processor speeds increase (300 MHz to 1 GHz), it becomes increasingly difficult to have large memories running at full speed; routing wire delays are very significant
The Solution
  The solution is hierarchical memory, with L1 memory not in the critical speed path
  Two methods can be used to fill the L1 memory, Caching and Dynamic Downloading, and Blackfin supports both
  Micro-controllers have typically used the caching method, as they have large programs often residing in external memory and determinism is not as important
  DSPs have typically used Dynamic Downloading, as they need direct control over which code runs in the fastest memory
MSA allows the programmer to choose one or both methods to optimize system performance
Blackfin : L1 Instruction Cache
16KB SRAM
  Four 4KB single-ported mini-banks
  Allows simultaneous DMA access to different banks
  Entire 16K configured as cache or SRAM
[Diagram: four 4KB mini-banks with ports for code to the core, DMA, and fill by L2]
• 16 KB cache
• 4-way set associative with arbitrary locking of ways
• True LRU
• No DMA access when configured as cache
Blackfin : L1 Data Memory
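[Diagram: 16KB super-bank A and 16KB super-bank B, each with its own fill port (Fill A, Fill B) and DMA port (DMA A, DMA B), plus a 4KB SRAM, connected to the core over the Data 0 and Data 1 buses]
Two 16KB super-banks
Each super-bank can be cache or SRAM
4KB scratch SRAM (stack can be located here for fast context switching)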
Blackfin : L1 Data Super bank Architecture
Four 4KB single-ported mini-banks
Multi-ported data access when using different mini-banks
[Diagram: four 4KB mini-banks with Data 0, Data 1, DMA and fill (from L2) ports]
• When Used as Cache
  – Each bank is 2-way set-associative
  – No DMA access
  – Entire 16K configured as cache/SRAM
• When Used as SRAM
  – Dual Data Access
  – DMA Access
Blackfin : L1 Memory Configurations
L1 Data memory can be configured in SRAM or Cache Modes
  32K SRAM
  16K SRAM & 16K Cache
  32K Cache
Additional 4K Byte of Scratch pad SRAM
  Stacks and heaps can be stored in scratch pad SRAM
L1 memories operate at Core Clock frequency
Write through and Write back modes are supported
L1 Instruction memory cannot be accessed directly through DAGs
  Use L2 memory for such accesses.
Core and DMA can access L1 memory banks simultaneously.
DMA controller has higher priority than core accesses.
Blackfin : MMU
Memory Management Unit
Caching and Protection Look-Aside Buffers (CPLBs)
  Cache/protection properties determined on a per memory page basis (1K, 4K, 1M, 4M byte sizes)
  User/supervisor, and task/task protection
  Future products will support address translation
MMU Property Descriptors
  Page Size
  Dirty/Clean
  Valid/Invalid
  Write-through/Write-back
  Cacheable/Non-cacheable
  Supervisor/User access protection bits
  Read/Write protection bits
System Clocking - Variable Frequency
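[Clock diagram: CLKIN feeds the PLL (1x to 63x, reset value 10; modification requires PLL sequencing). The PLL output is divided by 1, 2, 4 or 8 to give CCLK (reset value 1) and by 1 to 15 to give SCLK (reset value 5); the dividers can be modified on the fly]
CLKIN can be driven from an external oscillator or crystal
SCLK <= CCLK
SCLK <= 133 MHz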
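The clock relationships on this slide can be checked with a short calculation. In the sketch below the parameter names `msel`, `csel` and `ssel` are labels chosen for this example to stand for the PLL multiplier, the core divider and the system divider; the defaults are the reset values shown on the slide as read here (10, 1 and 5).

```python
def blackfin_clocks(clkin_hz, msel=10, csel=1, ssel=5):
    """Return (vco_hz, cclk_hz, sclk_hz) for a PLL multiplier and two dividers."""
    if not 1 <= msel <= 63:
        raise ValueError("PLL multiplier must be 1x to 63x")
    if csel not in (1, 2, 4, 8):
        raise ValueError("core divider must be 1, 2, 4 or 8")
    if not 1 <= ssel <= 15:
        raise ValueError("system divider must be 1 to 15")
    vco = clkin_hz * msel
    cclk, sclk = vco / csel, vco / ssel
    if sclk > cclk or sclk > 133e6:
        raise ValueError("SCLK must not exceed CCLK or 133 MHz")
    return vco, cclk, sclk

vco, cclk, sclk = blackfin_clocks(25e6)   # 25 MHz CLKIN with the reset values
print(f"VCO {vco / 1e6:.0f} MHz, CCLK {cclk / 1e6:.0f} MHz, SCLK {sclk / 1e6:.0f} MHz")
```

With a 25 MHz CLKIN the reset values give a 50 MHz SCLK, which is the figure the SPI boot slide uses for its baud rate example.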
On-chip Voltage Regulation
[Circuit diagram: inside the DSP, an amplifier compares VDDINT against VREF and drives VDDCTRL. The external components, supplied from VDDEXT (2.25V to 3.6V), include a 10 µH inductor, a 10 µF tantalum or electrolytic capacitor, a 0.1 µF ceramic capacitor, and a part labelled Uz = 4V]
Generates core voltage from external 2.25V to 3.6V input
Core voltage programmable in 50 mV increments from 0.8V to 1.2V
Optional bypass
Minimal external components required
External Memory Interface
[Diagram: the External Bus Interface drives four 1M Byte asynchronous banks and one 512M Byte synchronous region through the External Memory Interface, using 19 address lines and 16 data lines]
EBIU Overview
EBIU: External Bus Interface Unit
  16-bit parallel interface
Comprised of an Asynchronous Memory Controller (AMC) and a Synchronous DRAM Controller (SDC)
  The two controllers share some pins
  AMC supports devices such as SRAM, ROM, FIFOs, Flash memory, and ASIC/FPGA designs
  SDC supports PC-133 SDRAM devices
EBIU runs at the system clock rate (SCLK)
ADSP-BF537 External Memory Map
One memory region is dedicated to SDRAM
  Up to 512 MByte
  SDRAM start address is 0x0000 0000
Four ASYNC banks
  1 MByte each
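A quick way to get a feel for the external memory map is to classify addresses by region. The sketch below takes the SDRAM start address and sizes from this slide and the asynchronous bank 0 address (0x2000 0000) from the booting slides; that the four 1 MByte banks follow each other contiguously is an assumption of this example, and the function name is hypothetical.

```python
SDRAM_BASE, SDRAM_SIZE = 0x0000_0000, 512 * 1024 * 1024
ASYNC_BASE, ASYNC_BANK_SIZE, ASYNC_BANKS = 0x2000_0000, 1024 * 1024, 4

def classify(addr):
    """Name the external memory region an address falls into."""
    if SDRAM_BASE <= addr < SDRAM_BASE + SDRAM_SIZE:
        return "SDRAM"
    offset = addr - ASYNC_BASE
    if 0 <= offset < ASYNC_BANKS * ASYNC_BANK_SIZE:
        return f"ASYNC bank {offset // ASYNC_BANK_SIZE}"   # banks assumed contiguous
    return "other"

for addr in (0x0000_0000, 0x1FFF_FFFF, 0x2000_0000, 0x2030_0000, 0x2040_0000):
    print(f"0x{addr:08X}: {classify(addr)}")
```

The 512 MByte SDRAM region ends exactly where asynchronous bank 0 begins, so the two regions do not overlap.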
Blackfin DMA capabilities
DMA Setup
Two types of DMA transfers available
Register-based
  Program the DMA control registers directly
  Upon DMA completion, control registers are automatically updated with their original setup values in Autobuffer Mode (multiple transfers)
  The DMA channel can also be configured to gracefully shut off with Stop Mode (single transfer)
Descriptor-based
  Requires a set of parameters stored within memory to initiate a DMA sequence
  Supports chaining of multiple DMA transfers
Descriptor Array Mode
Offset   Field
Descriptor Block 1
0x0      Start_Addr[15:0]
0x2      Start_Addr[31:16]
0x4      DMA_Config
0x6      X_Count
0x8      X_Modify
0xA      Y_Modify
0xC      Y_Count
Descriptor Block 2
0xE      Start_Addr[15:0]
0x10     Start_Addr[31:16]
0x12     DMA_Config
0x14     X_Count
0x16     X_Modify
0x18     Y_Modify
0x1A     Y_Count
Descriptor Block 3
0x1C     Start_Addr[15:0]
0x1E     Start_Addr[31:16]
0x20     DMA_Config
...
Descriptor List (Small Model) Mode
  Each descriptor block contains: Next_Desc_Ptr[15:0], Start_Addr[15:0], Start_Addr[31:16], DMA_Config, X_Count, X_Modify, Y_Modify, Y_Count
  [Diagram: three descriptor blocks linked through Next_Desc_Ptr[15:0]]
Descriptor List (Large Model) Mode
  Each descriptor block contains the small model fields plus Next_Desc_Ptr[31:16]
  [Diagram: three descriptor blocks linked through the full 32-bit Next_Desc_Ptr]
2-D Direct Memory Access
[Diagram: data is captured and stored in linear L2 memory as elements A, B, C, ... P (rows A to H and I to P). A 2-D DMA with programmable X and Y count and stride values then moves the block A, B, I, J to L1 memory]
2-D DMA significantly decreases S/W overhead in video applications!
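The slide's example, picking A, B, I and J out of a linear buffer, can be reproduced with a small model of 2-D address generation. The sketch assumes that the Y modify value is applied in place of the X modify value after the last element of each row, and it counts in elements rather than bytes; the function name is hypothetical.

```python
def dma_2d_offsets(x_count, x_modify, y_count, y_modify):
    """List the element offsets a 2-D transfer visits, relative to the start address."""
    offsets, current = [], 0
    for _row in range(y_count):
        for col in range(x_count):
            offsets.append(current)
            last_in_row = col == x_count - 1
            # Assumed rule: the Y stride replaces the X stride at the end of a row.
            current += y_modify if last_in_row else x_modify
    return offsets

frame = "ABCDEFGHIJKLMNOP"     # two rows of eight elements, as on the slide
picked = [frame[i] for i in dma_2d_offsets(x_count=2, x_modify=1, y_count=2, y_modify=7)]
print(picked)
```

Here the Y stride of 7 is the row width of 8 minus the one X step already consumed within the row, which is what lets a single DMA setup extract a rectangular block with no core involvement.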
Blackfin Peripheral Interfaces
Parallel Peripheral Interface
[Diagram: the PPI connects to external appliances through up to 16 bits of parallel data, PPICLK and SYNC signals; the external clock runs at up to 66 MHz]
Bidirectional, half-duplex interface
PPI supports CCIR-656 Video Converter Interface
PPI - What is it?
Programmable bus width (8 to 16 bits)
Bidirectional (half-duplex) parallel interface
Synchronous Interface
  Driven by an external clock (PPI_CLK)
  Up to 66 MHz rate (SCLK/2)
  Asynchronous to SCLK
Includes three frame syncs to control the interface timing
Applications
  High speed data converters
  Video CODECs
Used in conjunction with a DMA channel
PPI I/O Modes
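CCIR-656 mode: a ‘656-compatible video source drives PPIx with 8- or 10-bit data with embedded control, and PPI_CLK with its clock (CLK)
GP mode: a video source drives PPIx with 8-16 bits of data and PPI_CLK with its clock (CLK); its HSYNC, VSYNC and FIELD signals connect to the frame syncs PPI_FS1, PPI_FS2 and PPI_FS3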
SPORTs
Two synchronous serial ports
Independent receive and transmit
Internal or external generated clocks and frame syncs
Built in hardware for u-law & A-law companding
Support for multi-channel TDM interfaces
Dedicated DMA channels
Generates optional interrupts
Operates up to 1/2 System bus clock rate (SCLK)
Transmit signals: TX Data 1, TX Data 2, Tx Clock, Tx Sync
Receive signals: RX Data 1, RX Data 2, Rx Clock, Rx Sync
Serial Peripheral Interface (SPI)
Shift registers simultaneously shift data in and out
[Diagram: SPI_TDBR and SPI_RDBR feed the shift register, which connects to MOSI and MISO; the SCK and PFx pins are also shown]
Full duplex synchronous serial interface
Master and Slave mode supported
Able to communicate with multiple devices
Up to SCLK/4 Operation
Universal Asynchronous Receiver/Transmitter
UART options
  5-8 data bits
  1, 1½ or 2 stop bits
  None, even or odd parity
  Baud rate = SCLK/(16*DIVISOR)
  Supports half-duplex IrDA (9.6/115.2 Kbps rate)
  Autobaud detection support through the use of the Timers
  Separate TX and RX DMA support
  Data is ALWAYS Transmitted/Received LSB First
ADSP-BF53x UART: Industry Standard 16450 Compatible
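The baud rate formula can be turned around to pick a divisor for a target rate. The sketch below rounds to the nearest integer divisor and reports the rate actually achieved; the 50 MHz SCLK and the 115200 baud target are example values, and the function name is hypothetical.

```python
def uart_divisor(sclk_hz, baud):
    """Return (divisor, actual_baud) using Baud rate = SCLK / (16 * DIVISOR)."""
    divisor = max(1, round(sclk_hz / (16 * baud)))
    return divisor, sclk_hz / (16 * divisor)

divisor, actual = uart_divisor(50_000_000, 115_200)
error_pct = 100 * (actual - 115_200) / 115_200
print(f"DIVISOR = {divisor}, actual rate = {actual:.0f} baud, error = {error_pct:.2f}%")
```

Because the divisor must be an integer, the achieved rate rarely matches the target exactly; the resulting error is what should be checked against the tolerance of the device at the other end of the link.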
Blackfin Timers
[Diagram: the Core Timer, the Watchdog Timer, the RTC (with its own RTC power and RTC clock) and the GP Timers; each can raise an interrupt, and the Watchdog Timer can also generate a reset or NMI]
Eleven timers:
  One Core Timer
  One Watchdog Timer
  One Real-time Clock (RTC)
  Eight general purpose timers
    PWM Mode
    Pulse Capture Mode
    External Clock Mode
Core Timer
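[Diagram: CCLK clocks the 8-bit TSCALE prescaler and the 32-bit TCOUNT counter, which is reloaded from the 32-bit TPERIOD register and raises IRQ 6]
Interrupt period = (TSCALE + 1) x TPERIOD CCLK cycles, so interrupt rate = CCLK / ((TSCALE + 1) x TPERIOD)
Use to generate interrupts at multiples of the CCLK period
32-bit tick timer
Dedicated Interrupt Priority 6 (fixed)
Autoreload is optional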
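To choose timer settings for a given tick, the relationship can be solved for TPERIOD. The sketch below assumes the timer counts (TSCALE + 1) x TPERIOD core clock cycles between interrupts; the 600 MHz CCLK and the 1 ms tick are example values, and the function name is hypothetical.

```python
def core_timer_period(cclk_hz, interval_s, tscale=0):
    """Return the TPERIOD value giving one interrupt every interval_s seconds."""
    if not 0 <= tscale <= 0xFF:
        raise ValueError("TSCALE is an 8-bit value")
    tperiod = round(cclk_hz * interval_s / (tscale + 1))
    if not 1 <= tperiod <= 0xFFFF_FFFF:
        raise ValueError("TPERIOD must fit in 32 bits")
    return tperiod

tperiod_1ms = core_timer_period(600e6, 1e-3)               # 1 ms tick, no prescaling
tperiod_scaled = core_timer_period(600e6, 1e-3, tscale=9)  # same tick, prescale by 10
print(tperiod_1ms, tperiod_scaled)
```

The prescaler matters mostly for long intervals: at 600 MHz a 32-bit count alone covers only about 7 seconds, so longer ticks need a non-zero TSCALE.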
Watchdog Timer
Generates an event when the timer expires.
The event can be programmed to be:
  a reset (software reset takes place)
  a non-maskable interrupt
  a general purpose interrupt
Clocked by the system clock (SCLK).
Must be periodically serviced by software
Real-Time Clock
Typically used to implement a real-time watch or “life counter”
  Time of day, alarm, stopwatch count-down, and elapsed time since last system reset
Maintains time/day with 4 counters: Seconds, Minutes, Hours, Days
  Current time (Seconds, Minutes, Hours, Days) is read/written in the RTC Status Register (RTC_STAT)
Equipped with two alarm features
  Daily and Day-And-Time
Uses a dedicated 32.768 kHz crystal connected to RTXI / RTXO
  By setting the prescaler (bit 0 in RTC_PREN), the RTC can be pre-scaled to 1 Hz to count time and days
Uses dedicated power supply pins
  Independent of any reset
Eight Peripheral Timers
Eight identical timers can be configured in 3 modes
  Pulse Width Modulation (PWM_OUT)
  Width and Period Capture (WDTH_CAP)
  External Event Counter (EXT_CLK)
Multiplexed pins TMR7-0
One programmable interrupt each
Three 32-bit registers each
  Width (TIMERx_WIDTH)
  Period (TIMERx_PERIOD)
  Counter (TIMERx_COUNTER) (read-only)
One 16-bit configuration register each (TIMERx_CONFIG)
Three common registers affect all 8 timers simultaneously
  Timer Enable
  Timer Disable
  Timer Status (interrupt requests, overflows, slave enables)
General Purpose I/O
• Features:
− The processor supports up to 48 bi-directional GPIOs (general purpose input/output)
− To simplify the programming model, the 48 GPIOs are managed by three different modules, associated with PORTF, PORTG, and PORTH respectively
− Each module independently controls 16 GPIOs.
− Each GPIO can be configured as either an input or output by using the GPIO Direction registers.
− For GPIO Input:
  − Level or edge sensitive trigger of input source
  − Rising or falling edge trigger of input source
  − Single edge or both edges trigger of input source
General Purpose I/O Pins
48 bi-directional GPIO pins available
Each can be configured as an output, input, or an interrupt pin
Two Interrupt Requests (FLAGA/FLAGB)
[Diagram: Port G pin multiplexing. PG0 to PG15 carry the PPI data signals PPI D0 to PPI D15; PG8 to PG15 are shared with the SPORT1 signals DR1SEC, DT1SEC, RSCLK1, RFS1, DR1PRI, TSCLK1, TFS1 and DT1PRI]
What’s new in ADSP-BF534/6/7?
ADSP-BF537/BF536/BF534
• The BLACKfin family offers a variety of pin- and code-compatible derivatives

Device        Performance                        On Chip RAM    Package Options
ADSP-BF537    500, 600 MHz; 1000, 1200 MMACs     132 KBytes     Sparse MBGA, MiniBGA
ADSP-BF536    500, 600 MHz; 1000, 1200 MMACs     100 KBytes     Sparse MBGA, MiniBGA
ADSP-BF534    400, 600 MHz; 1000, 1200 MMACs     132 KBytes     Sparse MBGA, MiniBGA
ADSP-BF537/BF536/BF534 Enhanced Blackfin Processors
High System Integration
Video I/O connects directly to ITU-R 656 encoders and decoders
SPORTs support 8 Channels of I2S Audio
Core Voltage Regulator
Microcontroller features include WDT, RTC, SDRAM controller
[Block diagram, with the new Blackfin features highlighted]
Blackfin Processor Core: up to 600 MHz
Memory: up to 80 KBytes PM, up to 64 KBytes DM, 4 KBytes, 32 KBytes PM ROM
Interfaces: SDRAM, FLASH/SRAM
System Peripherals: PLL, Dynamic Power Management, Switching Regulator, RTC, Watchdog, JTAG, Enhanced DMA
User Peripherals: SPI (1), UART (2), Enhanced SPORTs (2), PPI Video I/O, Timers (8), GPIO (48), TWI, CAN, Ethernet (BF536/7 only)
TWI: TWO WIRE INTERFACE
Fully compliant with the Philips I2C bus protocol
  See Philips I2C Bus Specification version 2.1
7-bit addressing
100 Kb/s (normal mode) and 400 Kb/s (fast mode) data rates
General call address support
Supports Master and Slave operation
Separate receive and transmit FIFO
SCCB (Serial Camera Control Bus) support
  Only in Master mode
  Slave mode cannot be used because the TWI controller always issues an Acknowledge in slave mode
DMA Enhancements
4 more DMA channels
  All twelve peripheral DMA channels can be assigned to any of the connected peripherals.
SYNC Bit
  Allows more control over the DMA interrupt generation process.
DMA controller enhanced to provide the MAC further control over the assigned DMA channels
  Example: the peripheral (MAC) detects an incorrect checksum condition on an incoming data stream. The peripheral can skip the data stream by issuing a RESTART command to the DMA channel. The DMA simply reloads its current registers again.
DMA Enhancements (cont’d)
Handshaking Memory DMA
  Good for asynchronous FIFOs or off-chip interface controllers, between Blackfin memory and hardware buffers
Two edge-sensitive DMA request inputs: DMAR0 and DMAR1
  Each is associated with one of the Memory DMA units: MDMA0 and MDMA1
  Enables Memory DMA units to be synchronized by external hardware
When Handshake operation is enabled, the Memory DMA no longer runs freely
  Reduces the need for core interaction and increases data throughput, since GPIO polling or GPIO-driven interrupts can be avoided
  Transfers can be done on a block or word basis
ADSP-BF537 CAN Overview
What Is Controller Area Network?
CAN is a serial field bus with multi-master capabilities
All CAN nodes are able to transmit data and several CAN nodes can request the bus simultaneously
Automotive and Industrial Electronics
Developed by Bosch
First deployed in 1986
Up to 1 Mbps
CAN – Low Layer Specification Only
3-Wire Half-duplex Field Bus
[Diagram: Node 1 to Node N (a generic controller, an ADSP-BF534 and an ADSP-BF537), each connected through RX and TX to its own transceiver, which attaches to the shared CAN_H, CAN_L and GND lines]
Multi-master capabilities
120 Ohm resistors are used between CAN_H and CAN_L
Transceiver must be as close as possible to the transmission line
Bit rate is limited by cable length and number of nodes
CAN Bit Rate vs Bus Length
Bus length    Max. bit rate (b/s)
40 m          1M
100 m         500k
200 m         250k
500 m         125k
6 km          10k
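The table lends itself to a simple lookup when planning a bus. The sketch below returns the highest tabulated bit rate whose bus length covers the requested length; it deliberately does not interpolate between rows, which is a conservative choice made for this example, and the function name is hypothetical.

```python
# (bus length in metres, maximum bit rate in bit/s), from the slide's table
CAN_LIMITS = [
    (40, 1_000_000),
    (100, 500_000),
    (200, 250_000),
    (500, 125_000),
    (6000, 10_000),
]

def max_bit_rate(bus_length_m):
    """Return the highest tabulated bit rate usable at the given bus length."""
    for length, rate in CAN_LIMITS:
        if bus_length_m <= length:
            return rate
    raise ValueError("bus is longer than the 6 km covered by the table")

for metres in (25, 150, 6000):
    print(f"{metres} m: up to {max_bit_rate(metres)} bit/s")
```

A 150 m bus, for instance, falls between two rows and is assigned the 200 m limit rather than an interpolated value.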
Full CAN vs Basic CAN Controllers
Basic CAN
  CAN controller features 1 receive buffer and 1 transmit buffer
  Software overhead
Full CAN
  CAN controller features dedicated buffers for individual messages
  Acceptance filtering done by hardware
Blackfin Provides Full CAN Implementation
  8 dedicated transmit mailboxes
  8 dedicated receive mailboxes
  16 configurable transmit/receive mailboxes
  Acceptance mask and data filtering
  Automatic response to remote requests
Note: DMA is not supported on the CAN peripheral
BF537 BOOTING
SUPPORTED BOOT MODES
BMODE   Description                                                      Pin muxing
000     Bypass boot ROM (execute from external memory 0x2000 0000)       -
001     8/16-bit Parallel Flash on /AMS0                                 -
010     Reserved                                                         -
011     SPI Master (8/16/24-bit SPI devices)                             PF11, PF12, PF13
100     SPI Slave                                                        PF11, PF12, PF13, PF14
101     TWI Master                                                       -
110     TWI Slave                                                        -
111     UART Slave                                                       PF0, PF1
Pin muxing: these pins are used by the respective peripheral during booting
Boot From 8/16-bit PROM/Flash
Physical connections:
  16-bit Flash/PROM: Blackfin /AMS(0), /AOE and /AWE connect to the memory's select, /OE and R/W (or /WR) pins; Blackfin ADDR[N+1:1] connects to memory ADDR[N:0]; DATA[15:0] connects to DATA[15:0]
  8-bit Flash/PROM: the same control and address connections; DATA[7:0] connects to DATA[7:0]
The Blackfin will boot from Asynchronous Bank 0 upon RESET, which maps to location 0x2000 0000 (DSP address).
SPI Master booting
Uses Slave Select 1, which maps to PF10
  i.e., PF10 is used as chip select
On-Chip Boot ROM sets the Baud Rate Register to 133 (0x85)
  Baud Rate = SCLK / (2 * SPI_BAUD)
  E.g., for a 25 MHz CLKIN: SCLK = 2 * 25 MHz = 50 MHz
  Baud Rate = 50 MHz / (2 * 133) = 188 kHz
Support for 8-, 16-, and 24-bit addressable parts
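The boot-time SPI clock follows directly from the formula on this slide. The sketch below reproduces the 25 MHz CLKIN example and adds an inverse helper for choosing a register value; both function names are hypothetical, and the 1 MHz target is an example value.

```python
import math

def spi_baud_rate(sclk_hz, spi_baud):
    """Baud Rate = SCLK / (2 * SPI_BAUD)."""
    if spi_baud <= 0:
        raise ValueError("SPI_BAUD must be positive")
    return sclk_hz / (2 * spi_baud)

def spi_baud_for(sclk_hz, max_rate_hz):
    """Smallest SPI_BAUD value whose rate does not exceed max_rate_hz."""
    return math.ceil(sclk_hz / (2 * max_rate_hz))

boot_rate = spi_baud_rate(50_000_000, 133)      # boot ROM setting, SCLK = 50 MHz
one_mhz_setting = spi_baud_for(50_000_000, 1_000_000)
print(f"boot rate = {boot_rate / 1e3:.0f} kHz, SPI_BAUD for 1 MHz = {one_mhz_setting}")
```

The boot ROM's value of 133 is deliberately slow; application code that reuses the SPI port after booting would normally pick a smaller divider.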
SPI Master booting
[Diagram: the ADSP-BF537 (master SPI device) connects SPICLK, MOSI and MISO to the SPI memory (slave SPI device); PF10 drives the memory's /CS input; a 10 kΩ resistor to VDDEXT is also shown]
Boot From A Host via SPI Slave Mode (BMODE = 100)
[Diagram: the host (master SPI device) connects SPICLK, /S_SEL, MOSI and MISO to the Blackfin (slave SPI device) pins SPICLK, /SPISS, MOSI and MISO; the Blackfin HWAIT signal connects to a FLAG/Interrupt input on the host]
HWAIT is used to hold off the host when the Blackfin processor is not able to consume any more data
  During the processing of Init or Zero fill blocks
  It can be any GPIO except PF11-14
TWI Master boot (BMODE = 101)
Boot from a device whose slave address is 1010000x
  1010: I2C EEPROM device identifier
  000: device “chip” select (A2, A1, A0)
  x: direction of transfer
Memory device needs to be 16-bit addressable
[Diagram: the ADSP-BF537 SDA and SCL pins connect to the TWI device's SDA and SCL pins; the device's A0, A1 and A2 pins are tied to VSS]
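The address byte 1010000x can be assembled from its three fields. In the sketch below the direction bit follows the I2C convention (0 for write, 1 for read); the function name is hypothetical.

```python
EEPROM_DEVICE_ID = 0b1010      # I2C EEPROM device identifier from the slide

def twi_address_byte(chip_select, read):
    """Build the 8-bit address byte: device id, A2..A0, then the direction bit."""
    if not 0 <= chip_select <= 0b111:
        raise ValueError("chip select is the 3-bit value A2, A1, A0")
    return (EEPROM_DEVICE_ID << 4) | (chip_select << 1) | int(read)

boot_write = twi_address_byte(0b000, read=False)
boot_read = twi_address_byte(0b000, read=True)
seven_bit = boot_write >> 1    # the 7-bit slave address without the direction bit
print(hex(boot_write), hex(boot_read), hex(seven_bit))
```

With A2, A1 and A0 tied low, as in the diagram, the boot device answers at 7-bit address 0x50.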
UART BOOTING (BMODE 111)
UART booting only possible through UART0
HWAIT is used to hold off the host when the Blackfin processor is not able to consume any more data
  During the processing of Init or Zero fill blocks
  It can be any GPIO except PF0 or PF1
[Diagram: the UART host's TX and RX connect to UART0_RX and UART0_TX on the ADSP-BF537 (UART slave); the Blackfin HWAIT signal connects to the host's CTS/Interrupt input]
Ethernet: ADSP-BF537 EMAC
OSI model - TCP/IP
OSI (Open Systems Interconnection)
• The standard description or "reference model" for how messages should be transmitted between any two points in a telecommunication network.
Comparison between the OSI and TCP/IP models
OSI model                  TCP/IP model
Application Layer (7)      Application
Presentation Layer (6)     Application
Session Layer (5)          Application
Transport Layer (4)        Transport (TCP)
Network Layer (3)          Internet (IP)
Data Link Layer (2)        Network
Physical Layer (1)         Network
TCP/IP application protocols: HTTP, FTP, SMTP, Name Server, NFS, XDR, RPC
Transport: Transmission Control Protocol (TCP), User Datagram Protocol (UDP)
Internet: Internet Protocol (IP)
Network: Ethernet (IEEE 802.3), Token Ring, DQDB
Media: twisted pair, optical fiber, coaxial cable
OSI, TCP/IP, and Blackfin BF536/7
OSI layers                                                  Component                  Example
Layers 5, 6 and 7 (Session, Presentation and Application)   Application                Web Server
                                                            Application Protocol       HTTP, FTP, Telnet etc.
Layers 3 and 4 (Network, Transport)                         TCP/IP Stack               lwIP, uIP, 3rd Party
Layer 2 (Data Link)                                         EMAC Device Driver         Driver for the ADSP-BF536/7 EMAC peripheral provided by ADI
                                                            BF536/7 EMAC Peripheral
Layer 1 (Physical)                                          PHY Transceiver            SMSC LAN83C185, Realtek RTL8201, etc.
TCP/IP Stack Header Structure
Application runs on top of the TCP/IP Stack
Header structure at each layer:
  application layer:   DATA
  transport layer:     TCP-Header or UDP-Header respectively, DATA
  network layer:       IP Header, TCP/UDP-Header, DATA
  data link layer:     MAC Header, IP Header, TCP/UDP-Header, DATA, Trailer
TCP/IP Stack supports the transport and network layer
ADSP-BF536/7 Ethernet MAC peripheral supports the data link layer
Layer 1 - Physical
This layer consists of the physical and electrical interface to the network: PHY Transceiver, 10BASE-T connector, 100BASE-TX, switches and routers
The PHY transceiver performs electrical signal conditioning for transmission over the medium (cable with RJ45 connectors)
The 2-wire MII Management Interface allows layer 2 devices such as the BF536/7 EMAC to monitor and control PHY operation
MII is a 4-bit bidirectional interface running at 25 MHz
RMII is a 2-bit bidirectional interface running at 50 MHz
100BASE-TX uses two pairs of twisted-pair cable (Cat5 UTP or STP) in a physical star topology (up to 100 m per segment)
[Diagram labels: MII, encoding]
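The two interface options trade width against clock rate. A one-line calculation, using only the widths and clock rates quoted on this slide, shows that both carry the same data rate in each direction; the function name is hypothetical.

```python
def interface_rate_mbps(width_bits, clock_mhz):
    """Raw data rate of a parallel interface clocked once per cycle, in Mbit/s."""
    return width_bits * clock_mhz

mii_rate = interface_rate_mbps(4, 25)    # MII: 4 bits at 25 MHz
rmii_rate = interface_rate_mbps(2, 50)   # RMII: 2 bits at 50 MHz
print(mii_rate, rmii_rate)
```

RMII halves the data pin count by doubling the clock, which is why it needs a 50 MHz reference while MII needs only 25 MHz.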
Blackfin Ethernet System Overview
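[Diagram: the ADSP-BF537 connects to an external PHY over the MII signals ERxD[3:0], ERxDV, ERxCLK, ERxER, ETxD[3:0], ETxEN, ETxCLK, COL and CRS, plus MDIO, MDC and PHYINT; a 25 MHz clock feeds the PHY; on the MDI side the PHY's TPO+, TPO-, TPI+ and TPI- lines pass through magnetics to the RJ45 connector]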
Layer 2 – Data Link
This layer enables the functional and procedural aspects of network data transfer as well as physical layer error checking
There are two sub-layers: the Media Access Control (MAC) sub-layer and the Logical Link Control (LLC) sub-layer
On BF536/7, the MAC sub-layer is implemented as the EMAC peripheral and the LLC sub-layer is implemented as the EMAC device driver
Ethernet MAC (EMAC) Architecture and Features
Independent DMA-driven RX and TX channels
MII/RMII interface
10 Mbit/s and 100 Mbit/s operation (Full or Half Duplex)
Automatic network monitoring statistics
Flexible address filtering
Flexible event detection for interrupt
Validation of IP and TCP (payload) checksums
Remote-wakeup Ethernet frames
Network-aware system power management
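The IP header checksum validated here is the standard Internet checksum (RFC 1071). The sketch below shows the calculation in software for reference. It is not the EMAC's register interface, and the function names are hypothetical. The TCP checksum uses the same sum but also covers a pseudo-header, which this sketch leaves out.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071): one's-complement sum of 16-bit
 * big-endian words, folded to 16 bits and inverted. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8; /* pad a trailing odd byte with zero */

    while (sum >> 16)                        /* fold the carries back in */
        sum = (sum & 0xFFFFu) + (sum >> 16);

    return (uint16_t)~sum;
}

/* A received header is valid when the checksum computed over all of it,
 * including the stored checksum field, comes out as zero. */
int checksum_ok(const uint8_t *hdr, size_t len)
{
    return inet_checksum(hdr, len) == 0;
}
```

A sender computes `inet_checksum` with the checksum field set to zero and stores the result in that field; the receiver then only needs `checksum_ok`.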
Clocking
PHYCLKOE (PHY Clock Output Enable)
25 MHz source for MII PHY
50 MHz source for RMII PHY*
The PHY can be clocked with an external clock, or with the CLKBUF buffered clock
The CLKBUF pin is enabled by the PHYCLKOE bit in the VR_CTL register
The default PLL multiplier is 10x, so a 50 MHz CLKIN is multiplied up to 500 MHz, which can violate the max CCLK frequency
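The clocking concern can be written as a simple check. This is a sketch under stated assumptions: the function names are hypothetical, no core-clock divider is applied, and the maximum CCLK is passed in as a parameter because the real limit depends on the part and speed grade given in the data sheet.

```c
/* Core clock in MHz produced by the PLL: CLKIN times the multiplier.
 * Assumes no further core-clock divider is applied. */
unsigned cclk_mhz(unsigned clkin_mhz, unsigned multiplier)
{
    return clkin_mhz * multiplier;
}

/* Nonzero when the resulting core clock exceeds the rated maximum.
 * max_cclk_mhz must come from the data sheet of the actual part. */
int cclk_exceeds_max(unsigned clkin_mhz, unsigned multiplier,
                     unsigned max_cclk_mhz)
{
    return cclk_mhz(clkin_mhz, multiplier) > max_cclk_mhz;
}
```

With the default 10x multiplier, a 25 MHz CLKIN gives 250 MHz while a 50 MHz CLKIN gives 500 MHz. Against an illustrative 400 MHz limit, only the second case is flagged.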
Ethernet Pins on Ports H and J
PH0 through PH15 are multiplexed
Port H provides most of the signals of the MII or alternate RMII interface.
MII uses 18 pins
RMII uses only 11 pins.
For MII and RMII operation, bits of the Function Enable register (PORTH_FER) must be set.
On PORT J, the two MII pins are not multiplexed at all.
Layer 2 – Data Link – EMAC Device Driver
The EMAC device driver configures and monitors the EMAC peripheral and DMA engine to handle the flow of Ethernet frames between memory buffers and the PHY
The device driver oversees the following core functions:
Ethernet address and frame filtering
Chained DMA transfers
Interrupt management
Collision detection
Two device driver models have been developed corresponding to the lwIP and uIP TCP/IP stacks
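To illustrate what destination address filtering means, here is a generic software sketch. It does not reproduce the EMAC peripheral's filter registers or either device driver; the function name and the `accept_multicast` option are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define MAC_ADDR_LEN 6

/* Returns nonzero if a frame addressed to dst should be accepted by a
 * station whose own address is own. Unicast frames for this station and
 * broadcast frames are always accepted; multicast frames only on request. */
int accept_frame(const uint8_t dst[MAC_ADDR_LEN],
                 const uint8_t own[MAC_ADDR_LEN],
                 int accept_multicast)
{
    static const uint8_t broadcast[MAC_ADDR_LEN] =
        {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};

    if (memcmp(dst, own, MAC_ADDR_LEN) == 0)
        return 1;
    if (memcmp(dst, broadcast, MAC_ADDR_LEN) == 0)
        return 1;
    /* Group (multicast) addresses have the lowest bit of the first byte set. */
    if ((dst[0] & 0x01) && accept_multicast)
        return 1;
    return 0;
}
```

Filtering in the peripheral keeps unwanted frames from being transferred to memory at all, which is why the driver configures it rather than checking every frame in software.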
Layers 3-6 – TCP/IP
There are currently two implementations of the TCP/IP protocol set for BF536/7:
lwIP – The ‘light-weight IP’ stack is a multi-threaded TCP/IP implementation using VDK or another RTOS
Compliant with the system services model
Built into VisualDSP++ 4.0
For further information, consult www.sics.se/~adam/lwip/
uIP – stand-alone TCP/IP implementation tailored to microcontroller requirements
Event-based hierarchy (no RTOS required)
Very small memory footprint (stack code size ≈ 6 Kbytes)
ADSP-BF536/7 implementation available at www.blackfin.org
For further information, consult www.sics.se/~adam/uip/
Additional implementations are available from 3rd parties
Quadros Systems RTXC Quadnet (www.quadros.com)
Unicoi Systems FusionNet (www.unicoi.com)
KADAK KwikNet TCP/IP Stack (www.kadak.com)
Layer 7 – Application
This layer contains the application protocol, high-level software application and user interface
lwIP provides an API based on BSD sockets and runs under VDK
uIP allows the processor to run a single networked application pointed to by UIP_APPCALL()
Example applications running on the ADSP-BF536/7 EZ-Kit:
HTTP server
HTTP client for streaming compressed audio
lwIP Stack Based on VDK
lwIP stack overall structure
The stack consists of three major components
The TCP/IP library itself
An interface library to the kernel that is being used
A driver library to connect the stack to the Ethernet controller
Currently we have a kernel interface library for VDK
Currently we have driver libraries for the BF537 and the USB-LAN
[Diagram: the Application sits on top of the lwIP stack library, which uses the kernel interface library (VDK, ThreadX) and the Ethernet driver library (BF537 EMAC, USB-LAN Extender)]
Folder structure
original example structure
TCP/IP STACK
Driver for ADSP-BF537 and SMSC LAN91C111
Documentation based on html
Examples for ADSP-BF533 and ADSP-BF537
Host programs and Source Code for the examples
Source Code of the TCP/IP STACK
lwIP project wizard
The project wizard generates a VDK based application which uses the stack
It can generate an application for either the BF537 or the USB-LAN extender
It provides the code needed to initialize and start the stack
If DHCP is being used, it will also wait until the IP address is obtained
Configuration plugin - General tab
Displays the name of the associated configuration file
Allows you to specify the protocols to be supported by the build stack
Eliminating a protocol will reduce the size of the stack
Controls the level of statistical data that the stack will accumulate
Configuration plugin - IP tab
Check IP Forward if you wish to have the ability to forward IP packets across network interfaces. If you have only one network interface, you should leave this box unchecked.
Check IP options if IP options are to be allowed. If this box is left unchecked then packets with unrecognized IP options are dropped.
If IP Fragmentation is checked then IP packets will be segmented appropriately.
If IP Reassembly is checked then support for re-assembling fragmented packets is provided.
The number of network interfaces must be specified
Configuration plugin - Network tab
If DHCP is not to be used by a network adapter then you must configure the appropriate setting for the IP Address of the adapter, its subnet mask and the IP address of the gateway for the subnet.
The number and size of the buffers to be supplied to the Ethernet driver must be specified as appropriate to the expected loading on the adapter.
If there is more than one network interface, each can be configured separately.
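When the address, subnet mask and gateway are entered by hand, a useful sanity check is that the gateway lies on the adapter's own subnet. A minimal sketch of that check follows; the helper names are hypothetical and are not part of the configuration plugin or of lwIP.

```c
#include <stdint.h>

/* Pack a dotted-quad address a.b.c.d into a 32-bit value, a in the top byte. */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Two addresses are on the same subnet when the bits selected by the
 * subnet mask (the network part) are identical. */
int same_subnet(uint32_t addr, uint32_t other, uint32_t mask)
{
    return (addr & mask) == (other & mask);
}
```

For example, with address 192.168.0.10 and mask 255.255.255.0, a gateway of 192.168.0.1 passes the check and 192.168.1.1 does not.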
Configuration plugin - TCP/UDP/ARP tab
Specify sufficient UDP ‘connections’ and TCP connections for the expected maximum number of simultaneous UDP receives and active connections.
Specify the maximum number of open sockets and incoming queued connections to be supported.
The MSS field specifies the maximum size of TCP segment that the stack will support.
The window size specifies the maximum TCP receive window that the stack will support.
The ARP table size specifies the maximum number of address resolution mapping entries that the stack will maintain.
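As a guide to choosing the MSS value, the largest segment that fits in one unfragmented packet is the link MTU minus the IP and TCP headers. The sketch below assumes IPv4 and TCP headers without options (20 bytes each); the function names are hypothetical and this is not the stack's own calculation.

```c
enum { IP_HDR_BYTES = 20, TCP_HDR_BYTES = 20 }; /* assumed: no options */

/* Largest TCP payload that fits in one packet of the given MTU.
 * Returns 0 if the MTU cannot even hold the two headers. */
unsigned mss_for_mtu(unsigned mtu)
{
    if (mtu <= IP_HDR_BYTES + TCP_HDR_BYTES)
        return 0;
    return mtu - IP_HDR_BYTES - TCP_HDR_BYTES;
}

/* Number of segments needed to carry payload_len bytes (rounded up). */
unsigned segments_needed(unsigned payload_len, unsigned mss)
{
    if (mss == 0)
        return 0;
    return (payload_len + mss - 1) / mss;
}
```

For the standard Ethernet MTU of 1500 bytes this gives an MSS of 1460, so a 4000-byte transfer takes three segments.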
Configuration plugin - Debug tab
Specify the level of debug checking and reporting the stack should provide
Specify which events the stack should provide
Configuration plugin - Memory tab
Specifies the number of pool buffers and the size of each pool buffer
Each frame is stored in a linked list of pool buffers
Setting the pool buffer size too low will increase processor overheads
Setting the pool buffer size too high will increase memory overheads
Specify the total size of memory heap that the stack can utilise for non pool buffer memory requests
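The trade-off between the two settings can be made concrete with a little arithmetic. In this sketch, which is not the stack's allocator and uses hypothetical function names, a frame occupies a chain of fixed-size pool buffers: small buffers mean long chains to walk, large buffers mean unused bytes in the last one.

```c
/* Number of pool buffers chained together to hold one frame (rounded up). */
unsigned buffers_per_frame(unsigned frame_bytes, unsigned pool_buf_size)
{
    if (pool_buf_size == 0)
        return 0;
    return (frame_bytes + pool_buf_size - 1) / pool_buf_size;
}

/* Bytes allocated to the frame's chain but left unused. */
unsigned wasted_bytes(unsigned frame_bytes, unsigned pool_buf_size)
{
    return buffers_per_frame(frame_bytes, pool_buf_size) * pool_buf_size
           - frame_bytes;
}
```

A 1514-byte frame needs 12 buffers of 128 bytes but only one of 1536 bytes, while a 64-byte frame in a 1536-byte buffer leaves 1472 bytes unused.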
Blackfin Processor Product Roadmap
[Roadmap chart: performance versus time (present to future), with parts marked as Advanced Technology or In Development, in three market segments: Mobile Handsets, Smart Phones and PDAs; Consumer Media; Automotive, Industrial and Instrumentation]
Parts shown: BF531, BF532, BF533, BF533-750, BF534, BF535, BF536/537, BF539, BF561, BF561-750, BF56x, msp500 SoftFone, msp5xx SoftFone
BF534: PPI, 5xSerial Ports, I2C, CAN, 48xGPIO
BF536/537: PPI, 5xSerial Ports, I2C, CAN, 10/100 Ethernet MAC
BF539: PPI, 10xSerial Ports, I2C, CAN
msp5xx SoftFone: Multimode, multimedia wireless handsets; low standby power
Introduction to VisualDSP++
VisualDSP++ 4.0
VisualDSP++ is an integrated development environment that enables efficient management of projects.
Key Features Include:
Editing
Building: Compiler, assembler, linker
Debugging: Simulation, Emulation, EZ-KIT
Run, Step, Halt
Breakpoints, Watchpoints
Advanced plotting and profiling capabilities
Pipeline and cache viewers
VisualDSP++
What comes with VisualDSP++?
Integrated Development and Debugger Environment (IDDE), C/C++ Compiler, Assembler, Linker, VDK, Emulation and Simulation Support, On-line help and documentation
Part #: VDSP-BLKFN-FULL
Floating License Part #: VDSP-BLKFN-PCFLOAT
VisualDSP++ is a common development environment for all ADI processor families
Blackfin: ADSP-BF5xx
TigerSHARC: ADSP-TSxxx
SHARC: ADSP-21xxx
Each processor family requires a separate license
Features of VisualDSP++ 4.0
Integrated Development and Debugger Environment (IDDE): Multiple workspaces, projects, project groups
Project Wizard: Create/configure a DSP project
High level language support including C and C++
Expert Linker: Graphical support for managing linker description files
Code profiling support
Easy to use Online Help
BTC (Background Telemetry Channel) Support: Data Streaming and Logging
Easy to test and verify applications with scripts (TCL, VB, Java)
VisualDSP++ RTOS/Kernel/Scheduler (VDK)
Integrated Source Code Control
Device Drivers and System Services
Software Development Flow
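[Flow chart]
Code Generation: Generate Assembly Source (.ASM) and/or Generate C/C++ Source (.C/.CPP). The C/C++ Compiler produces .S files, the Assembler produces .DOJ files, and the Linker, guided by the Linker Description File (.LDF), produces the .DXE executable.
Software Verification: the .DXE runs on the VisualDSP++ Simulator
Hardware Evaluation: the .DXE runs on the EZ-Kit Lite
Target Verification: the .DXE runs on the target through the ICE
ROM Production: the LOADER converts the .DXE into an .LDR file for the PROM Burner
System Verification: Working Code? NO returns to Code Generation; YES completes the flow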
Invoking the Software Tools
• Software tools may be configured and called by the IDDE
− Software tools are configured via property pages
− The IDDE calls the software tools it needs to complete the build
− GUI front end to a command line ‘make’ utility
• Software tools can be invoked from a Command line
− C Compiler: ccblkfn sourcefile -switch [-switch...]
− Assembler: easmblkfn sourcefile -switch [-switch...]
− Linker: linker object [object…] -switch [-switch…]
− Loader: elfloader executable -switch [-switches...]
• For the complete list of switches see the appropriate tools manual
Integrated Development and Debugger Environment (IDDE) Features
• IDDE allows one to manage the project build
• The user configures the project and the development tools via property pages
• Project Property pages configure the project
– Project Property Page
– General Property Page
– Pre Build Property Page
– Post Build Property Page
• Development Tools Property Pages are used to configure the development tools
– Assembler Property Page
– Compiler Property Page
– Linker Property Page
– Loader Property Page
Project Development
• Create a project
– All development in VisualDSP++ occurs within a project.
– The project file (.DPJ) stores your program’s build information: source files list and development tools option settings
– A project group file (.DPG) contains a list of projects that make up an application (e.g. an ADSP-BF561 dual core application)
Project Property Page
• Configure project options
– Define the target processor and set up your project options (or accept default settings) before adding files to the project.
– The Project Options dialog box provides access to project options, which enable the corresponding build tools to process the project’s files correctly
Enable building for a specific revision of silicon
- No need to specify ‘-si-revision’ switch
- Automatic will attempt to determine revision of the attached target
- or specify a specific rev level (e.g. 0.3)
Property Pages
C/C++ Compiler Property Page
Assembler Property Page
Linker Property Page
Loader Property Page
General Property Page
Post Build Property Page
Pre Build Property Page
Selecting VisualDSP++ Sessions
• Sessions define Debug Environments
• Select Sessions pull down menu
– Choose Sessions List
– Select Session to activate
• Define New Session from Session List
– Select New Session
– Configure session as required e.g.
Debug target : ADSP-BF53x Family Simulator
Platform : ADSP-BF53x Single Processor Simulator
Session name : ADSP-BF533 ADSP-BF53x Single Processor Simulator
• Click OK
– Session name will appear in Session List
• Click Activate
– IDDE session will open
Debug Features
Single Step, Run, Halt
Set Breakpoints
Register Viewing
Memory Viewing, Plotting, Dump/Fill
Code Optimization Utilities
Profiling, Pipeline Viewer, Cache Viewer
Compiled Simulation
High Level Language debug support
Mixed mode
Online Help
Fully searchable and indexed online help
Includes quick overviews on using VisualDSP++ and all of its features.
Excellent supplement to the manual for things that are better represented visually such as what various plot windows should look like.
Customizable by using the “Favorites” window
On Line Help Example