©1995 Xilinx Inc. For the latest revision of the specifications,
see the Xilinx WEBLINX at http://www.xilinx.com .
XC4000E Field Programmable Gate Array Family
September 1, 1995 (Version 1.03) Preliminary Product Specifications

Features
• Third Generation Field-Programmable Gate Arrays
  - Select-RAM(TM) memory: on-chip ultra-fast RAM with
    - synchronous write option
    - dual-port RAM option
  - Fully PCI compliant
  - Abundant flip-flops
  - Flexible function generators
  - Dedicated high-speed carry-propagation circuit
  - Wide edge decoders (four per edge)
  - Hierarchy of interconnect lines
  - Internal 3-state bus capability
  - 8 global low-skew clock or signal distribution networks
• Flexible Array Architecture
  - Programmable logic blocks and I/O blocks
  - Programmable interconnects and wide decoders
• Sub-micron CMOS Process
  - High-speed logic and interconnect
  - Low power consumption
• Systems-Oriented Features
  - IEEE 1149.1-compatible boundary scan logic support
  - Programmable output slew rate (2 modes)
  - Programmable input pull-up or pull-down resistors
  - 12-mA sink current per output
  - 24-mA sink current per output pair
• Configured by Loading Binary File
  - Unlimited reprogrammability
  - Six programming modes
  - Readback capability
• Backward Compatible with XC4000 Family
• XACTstep Development System runs on ’386/’486/Pentium-type PC,
  Sun-4, and Hewlett-Packard 700 series
  - Interfaces to popular design environments including VIEWlogic,
    Mentor Graphics and OrCAD
  - Fully automatic partitioning, placement and routing
  - Interactive design editor for design optimization
  - Unified Libraries, including 288 soft macros and 34 Relationally
    Placed Macros (RPMs)
  - RAM/ROM compiler
Introduction
The XC4000E family of high-performance, high-density Field
Programmable Gate Arrays (FPGAs) provides the benefits of custom
CMOS VLSI, while avoiding the initial cost, time delay, and inherent
risk of a conventional masked gate array.

The result of eleven years of FPGA design experience and feedback
from thousands of customers, the XC4000E family combines
architectural versatility, on-chip Select-RAM memory with
edge-triggered and dual-port modes, increased speed, abundant
routing resources, and new, sophisticated software to achieve fully
automated implementation of complex, high-performance designs.
Table 1: XC4000E Family of Field Programmable Gate Arrays

Device                      XC4003E  XC4005E  XC4006E  XC4008E  XC4010E  XC4013E  XC4020E  XC4025E
Approximate Gate Count        3,000    5,000    6,000    8,000   10,000   13,000   20,000   25,000
CLB Matrix                  10 x 10  14 x 14  16 x 16  18 x 18  20 x 20  24 x 24  28 x 28  32 x 32
Number of CLBs                  100      196      256      324      400      576      784    1,024
Number of Flip-Flops            360      616      768      936    1,120    1,536    2,016    2,560
Max Decode Inputs per side       30       42       48       54       60       72       84       96
Max RAM Bits                  3,200    6,272    8,192   10,368   12,800   18,432   25,088   32,768
Number of IOBs                   80      112      128      144      160      192      224      256
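
The rows of Table 1 follow simple patterns in the array dimension. The Python sketch below reproduces them under stated assumptions: an n x n device has n x n CLBs, two flip-flops per CLB and per IOB, and one 16x2 RAM (32 bits) per CLB, as described elsewhere in this document. The 8n IOB count and the 3n decode inputs per side are patterns read off the table rather than documented formulas, and `family_row` is a hypothetical helper name.

```python
# Sketch: derive the resource counts of Table 1 from the CLB matrix size.
# The formulas are patterns inferred from the table, not a vendor API.

def family_row(n):
    """Return derived resource counts for a device with an n x n CLB matrix."""
    clbs = n * n
    iobs = 8 * n  # assumption: pattern observed in Table 1
    return {
        "clbs": clbs,
        "iobs": iobs,
        "flip_flops": 2 * clbs + 2 * iobs,   # two per CLB plus two per IOB
        "decode_inputs_per_side": 3 * n,     # assumption: pattern in Table 1
        "max_ram_bits": 32 * clbs,           # one 16x2 RAM per CLB
    }


if __name__ == "__main__":
    for name, n in [("XC4003E", 10), ("XC4013E", 24), ("XC4025E", 32)]:
        print(name, family_row(n))
```

Comparing the computed values with the table is a quick way to spot transcription errors in any single cell.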
The XC4000E family has 8 members, ranging in density
from 3,000 to 25,000 gates, as shown in Table 1.

Description
XC4000E-family devices are implemented with a regular, flexible,
programmable architecture of Configurable Logic Blocks (CLBs),
interconnected by a powerful hierarchy of versatile routing
resources, and surrounded by a perimeter of programmable
Input/Output Blocks (IOBs). They have generous routing resources to
accommodate the most complex interconnect patterns.

The devices are customized by loading configuration data into the
internal memory cells. The FPGA can either actively read its
configuration data from an external serial or byte-parallel PROM
(master modes), or the configuration data can be written into the
FPGA (slave and peripheral modes).

The XC4000E family is supported by powerful and sophisticated
software, covering every aspect of design from schematic or
behavioral entry, floorplanning, simulation, automatic block
placement and routing of interconnects, to the creation,
downloading, and readback of the configuration bit stream.

Because Xilinx FPGAs can be reprogrammed an unlimited number of
times, they can be used in innovative designs where hardware is
changed dynamically, or where hardware must be adapted to
different user applications. FPGAs are ideal for shortening the
design and development cycle, but they also offer a cost-effective
solution for production rates well beyond 5000 systems per month.
For fastest time-to-high-volume, a design can first be implemented
in the XC4000E, then migrated to one of Xilinx’ 100%-compatible
HardWire mask-programmed devices.

Table 2 shows density and performance for a few common circuit
functions that can be implemented in XC4000E devices.

Taking Advantage of Reconfiguration
FPGA devices can be reconfigured to change logic function while
resident in the system. This capability gives the system designer
a new degree of freedom not available with any other type of
logic.

Hardware can be changed as easily as software. Design updates or
modifications are easy, and can be made to products already in the
field. An FPGA can even be reconfigured dynamically to perform
different functions at different times.

Reconfigurable logic can be used to implement system
self-diagnostics, create systems capable of being reconfigured for
different environments or operations, or implement dual-purpose
hardware for a given application. As an added benefit, use of
reconfigurable FPGA devices simplifies hardware design and
debugging and shortens product time-to-market.

XC4000E Compared to XC4000
Any XC4000E device is pin-out and bitstream compatible with the
corresponding XC4000 device. An existing XC4000 bitstream can be
used to program an XC4000E device. However, since the XC4000E
includes many new features, an XC4000E bitstream cannot always be
loaded into an XC4000 device.
Table 2: Density and Performance for Several Common Circuit
Functions

Design Class  Function                                      CLBs Used  -3 Speed  -2 Speed  Units
Memory        32 x 16 bit FIFO (simultaneous read/write)       48        61                MHz
              32 x 16 bit FIFO (MUXed read/write)              32        61                MHz
              256 x 8 Single Port                              72        66                MHz
Logic         16 bit Loadable Counter                           8        70                MHz
              16 bit Up/Down Counter                            8        70                MHz
              16 bit Pre-Scaled Counter                         8       154                MHz
              24 bit Accumulator                               13        58                MHz
              16 bit Address Decoder, XC4005E                   0        12.5              ns
                (pin-to-pin, edge decode)
              16 bit Address Decoder (internal decode)          3         4.7              ns
              9 bit Parity Checker                              1         3.6              ns
              9 bit Shift Register (with enable)                5       170                MHz
For those readers already familiar with the XC4000 family of
Xilinx Field Programmable Gate Arrays, the major new features in the
XC4000E family are listed in this section. The biggest advantages of
switching to an XC4000E device are the significantly increased
system speed and the new architectural features, particularly
Select-RAM memory.
Increased System Speed
Delays in FPGA-based designs are layout dependent. As a rule of
thumb, the system clock rate should not exceed one third to one
half of the specified toggle rate. Critical portions of a design,
such as shift registers and simple counters, can run faster, at
approximately two thirds of the specified toggle rate.
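
The rule of thumb above is plain arithmetic. As a minimal sketch, the Python function below turns a specified toggle rate into the suggested clock-rate bounds. The function name and the 150 MHz figure used in the demonstration are illustrative assumptions, not values taken from a speed-grade table.

```python
# Sketch of the rule of thumb: system clock at one third to one half of the
# specified toggle rate, critical sections at about two thirds.

def clock_rate_guideline(toggle_rate_mhz):
    """Return suggested clock-rate bounds (MHz) for a given toggle rate."""
    return {
        "system_clock_min_mhz": toggle_rate_mhz / 3,
        "system_clock_max_mhz": toggle_rate_mhz / 2,
        "critical_section_mhz": 2 * toggle_rate_mhz / 3,
    }


if __name__ == "__main__":
    # 150 MHz is a hypothetical toggle rate chosen only for illustration.
    print(clock_rate_guideline(150.0))
```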
The XC4000E family can run at synchronous system clock rates of
up to 70 MHz and internal performance in excess of 150 MHz. This
increase in performance over the previous families stems from
improvements in both device processing and system architecture.
XC4000E-family devices use a deep sub-micron triple-layer metal
process. In addition, many architectural improvements have been
made, as described below.
PCI Compliance
XC4000E-3 and faster speed grades are fully PCI compliant. The
XC4000E offers a one-chip PCI solution.

Carry Logic
The speed of the carry logic chain has increased dramatically.
Some parameters, such as the delay on the carry chain through a
single CLB (TBYP), have improved by as much as 50% from XC4000
values.
Select-RAM Memory: Edge-Triggered, Synchronous RAM Modes
The RAM in any CLB can be changed to synchronous, edge-triggered,
write operation. In this mode, the internal write operation is
controlled by the same clock that drives the flip-flops. The clock
polarity is programmable for the RAM (both F and G function
generators together), but is independent of the chosen flip-flop
polarity. Address, Data, and WE inputs are latched by this rising or
falling clock edge, and a short internal write pulse is generated
shortly after the clock edge. This self-timed write operation is
thus effectively edge-triggered.

The read operation is not affected by this change to an
edge-triggered write.
Dual-Port RAM
A separate option converts the 16x2 RAM in any CLB into a 16x1
dual-port RAM with simultaneous Read/Write. In this mode, any
operation that writes into the F-RAM automatically also writes
into the G-RAM, using the F address. The G address can only read
from the G-RAM; it cannot be used to write into the G-RAM.

The CLB can thus be used as an asymmetrical dual-port RAM, with F
being the read address for the F-RAM and the write address for both
F- and G-RAM, while G is the read address for the G-RAM. Note that F
and G can still be independent read addresses, as they are in
XC4000. The two RAMs together have one read/write port using the F
address, and one read-only port using the G address.

The function generators in each CLB can be configured as either
level-sensitive (asynchronous) single-port RAM, edge-triggered
(synchronous) single-port RAM, edge-triggered (synchronous)
dual-port RAM or as combinatorial logic.
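
The asymmetrical dual-port behavior described above can be illustrated with a small behavioral model. The Python sketch below is not a hardware description: the class name, method names, and the reduction of timing to a single `write` call are assumptions made for illustration. Only the data flow (writes go to both RAMs at the F address, while each RAM is read at its own address) comes from the text.

```python
# Behavioral sketch of a CLB used as a 16x1 dual-port RAM.
# F address: read/write port.  G address: read-only port.

class DualPortRam16x1:
    def __init__(self):
        self.f_ram = [0] * 16  # contents seen through the F address
        self.g_ram = [0] * 16  # mirror copy seen through the G address

    def write(self, f_addr, data):
        """A write always uses the F address and updates both RAMs."""
        self.f_ram[f_addr & 0xF] = data & 1
        self.g_ram[f_addr & 0xF] = data & 1

    def read_f(self, f_addr):
        """Read through the read/write port (F address)."""
        return self.f_ram[f_addr & 0xF]

    def read_g(self, g_addr):
        """Read through the read-only port (G address)."""
        return self.g_ram[g_addr & 0xF]


if __name__ == "__main__":
    ram = DualPortRam16x1()
    ram.write(5, 1)
    # Both ports observe the bit written through the F address.
    print(ram.read_f(5), ram.read_g(5), ram.read_g(6))
```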
Configurable RAM Content
The RAM content can now be configured, so that the RAM starts up
with user-defined data.

H Function Generator
In the XC4000E, the H function generator is more versatile. Its
inputs can come not only from the F and G function generators but
also from up to three control input lines. The H function generator
can be totally or partially independent of the other two function
generators.

IOB Clock Enable
The two flip-flops in each IOB have a common clock enable input,
which through configuration can be activated individually for the
input or output flip-flop or both. This clock enable operates
exactly like the EC pin on the XC4000 CLB. This new feature makes
the IOBs more versatile, and avoids the need for clock gating.
Output Drivers
The output pull-up structure defaults to a TTL-like totem-pole.
This driver is an n-channel pull-up transistor, pulling to a
voltage one threshold below Vcc, just like the XC4000 outputs.
Alternatively, the XC4000E can be globally configured with CMOS
outputs, with p-channel pull-up transistors pulling to Vcc. Also,
the configurable pull-up resistor in XC4000E is a p-channel
transistor that pulls to Vcc, whereas in the XC4000 it is an
n-channel transistor that pulls to a voltage one threshold below
Vcc.

Input Thresholds
The input thresholds can be globally configured for either TTL
(1.2 V threshold) or CMOS (2.5 V threshold), just like XC2000 and
XC3000 inputs. Note that the two global adjustments of input
threshold and output level are independent of each other.
Global Signal Access to Logic
There is additional access from global clocks to the F and G
function generator inputs.

Configuration Pin Pull-Up Resistors
During configuration, the three mode pins, M0, M1, and M2, have
weak pull-up resistors. For the most popular configuration mode,
Slave Serial, the mode pins can thus be left unconnected.
For user mode, the three mode inputs can individually be
configured with or without weak pull-up or pull-down resistors.

The PROGRAM input pin has a permanent weak pull-up.
Soft Startup
Like the XC3000A, XC4000E devices have “Soft Startup.” When the
configuration process is finished and the device starts up in user
mode, the first activation of the outputs is automatically
slew-rate limited. This feature avoids the potential ground bounce
when all outputs are turned on simultaneously. Immediately after
start-up, the slew rate of the individual outputs is, as in the
XC4000 family, determined by the individual configuration option.
XC4000 and XC4000A Compatibility
Existing XC4000 bitstreams can be used to configure an XC4000E
device. Although they are pin-for-pin compatible, XC4000A
bitstreams must be recompiled for use with the XC4000E, due to
improved routing resources.
Detailed Functional Description
The XC4000E family devices achieve high speed through advanced
semiconductor technology and improved architecture. The XC4000E
supports system clock rates of up to 70 MHz and internal performance
in excess of 150 MHz. Compared to older Xilinx FPGA families, the
XC4000E family is more powerful. It offers on-chip edge-triggered
and dual-port RAM, clock enables on I/O flip-flops, and wide-input
decoders. It is more versatile in many applications, especially
those involving RAM. Design cycles are faster due to a combination
of increased routing resources and more sophisticated software.
Table 3: CLB Count of Selected XC4000E Soft Macros

7400 Equivalents   CLBs
’138                  5
’139                  2
’147                  5
’148                  6
’150                  5
’151                  3
’152                  3
’153                  2
’154                 16
’157                  2
’158                  2
’160                  5
’161                  6
’162                  8
’163                  8
’164                  4
’165s                 9
’166                  5
’168                  7
’174                  3
’194                  5
’195                  3
’280                  3
’283                  8
’298                  2
’352                  2
’390                  3
’518                  3
’521                  3

Barrel Shifters    CLBs
brlshft4              4
brlshft8             13

Multiplexers       CLBs
m2-1e                 1
m4-1e                 1
m8-1e                 3
m16-1e                5

4-Bit Counters     CLBs
cd4cd                 3
cd4cle                5
cd4rle                6
cb4ce                 3
cb4cle                6
cb4re                 5

8- and 16-Bit Counters  CLBs
cb8ce                 6
cb8re                10
cc16ce                9
cc16cle               9
cc16cled             21

Registers          CLBs
rd4r                  2
rd8r                  4
rd16r                 8

Shift Registers    CLBs
sr8ce                 4
sr16re                8

Decoders           CLBs
d2-4e                 2
d3-8e                 4
d4-16e               16

Identity Comparators  CLBs
comp4                 1
comp8                 2
comp16                5

Magnitude Comparators CLBs
compm4                4
compm8                9
compm16              20

RAMs               CLBs
ram16x4               2
ram16x4s              2
ram16x4d              4

Explanation of counter nomenclature
cb = binary counter
cd = BCD counter
cc = cascadable binary counter
d = bidirectional
l = loadable
x = cascadable
e = clock enable
r = synchronous reset
c = asynchronous clear

Explanation of RAM nomenclature
s = single-port edge-triggered
d = dual-port edge-triggered
no extension = level-sensitive
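
The counter nomenclature given with Table 3 is regular enough to decode mechanically. The Python sketch below is one hypothetical way to do so. It assumes a counter name is a two-letter prefix, a bit width, and a string of one-letter feature codes. That structure is inferred from the names in the table, and `decode_counter_name` is not part of any Xilinx tool.

```python
import re

# Prefix and suffix meanings copied from the nomenclature legend of Table 3.
PREFIXES = {
    "cb": "binary counter",
    "cd": "BCD counter",
    "cc": "cascadable binary counter",
}
FEATURES = {
    "d": "bidirectional",
    "l": "loadable",
    "x": "cascadable",
    "e": "clock enable",
    "r": "synchronous reset",
    "c": "asynchronous clear",
}


def decode_counter_name(name):
    """Split a counter macro name into (kind, width, list of features)."""
    match = re.fullmatch(r"(cb|cd|cc)(\d+)([dlxerc]*)", name.lower())
    if match is None:
        raise ValueError(f"not a recognized counter name: {name!r}")
    prefix, width, codes = match.groups()
    return PREFIXES[prefix], int(width), [FEATURES[c] for c in codes]


if __name__ == "__main__":
    print(decode_counter_name("cc16cled"))
```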
Basic Building Blocks
Xilinx high-density user-programmable gate arrays include three
major configurable elements: configurable logic blocks (CLBs),
input/output blocks (IOBs), and interconnections.
• CLBs provide the functional elements for constructing the
  user’s logic.
• IOBs provide the interface between the package pins and
  internal signal lines.
• Programmable interconnect resources provide routing paths to
  connect the inputs and outputs of the CLBs and IOBs onto the
  appropriate networks.

Three other types of circuits are also available:
• 3-State buffers (TBUFs) driving horizontal Longlines are
  associated with each CLB.
• Wide edge decoders are available around the periphery of each
  device.
• An on-chip oscillator is provided.

The functionality of each circuit block is customized during
configuration by programming internal static memory cells. The
values stored in these memory cells determine the logic functions
and interconnections implemented in the FPGA.

Each of these available circuits is described in this section.
Figure 1: Simplified Block Diagram of XC4000E CLB (RAM and Carry
Logic functions not shown)
[Block diagram: function generators F (inputs F1-F4, output F’),
G (inputs G1-G4, output G’) and H (inputs F’, G’ and H1, output
H’); control inputs C1-C4 mapped to H1, DIN/H2, SR/H0 and EC; two
flip-flops with S/R control, clock K and outputs XQ and YQ;
combinatorial outputs X and Y; multiplexers controlled by the
configuration program]
Configurable Logic Blocks (CLBs)
Configurable Logic Blocks implement most of the logic in an FPGA.
The principal CLB elements are shown in Figure 1. The number of CLBs
needed to implement selected soft macros is shown in Table 3.

Two 4-input function generators (F and G) offer unrestricted
versatility. Most combinatorial logic functions need four or fewer
inputs. However, a third function generator (H) is provided. The H
function generator has three inputs. One or both of these inputs can
be the outputs of F and G; the other input(s) are from outside the
CLB. The CLB can therefore implement certain functions of up to nine
variables, like parity check or expandable-identity comparison of
two sets of four inputs.

Each CLB contains two flip-flops that can be used to store the
function generator outputs. However, the flip-flops and function
generators can also be used independently. DIN can be used as a
direct input to either of the two flip-flops. H1 can drive the other
flip-flop through the H function generator. Function generator
outputs can also be accessed from outside the CLB, using two outputs
independent of the flip-flop outputs. This versatility increases
logic density and simplifies routing.

Thirteen CLB inputs and four CLB outputs provide access to the
function generators and flip-flops. These inputs and outputs connect
to the programmable interconnect resources outside the block.
Function Generators
Four independent inputs are provided to each of two function
generators (F1 - F4 and G1 - G4). These function generators, whose
outputs are labeled F’ and G’, are each capable of implementing any
arbitrarily defined Boolean function of four inputs. The function
generators are implemented as memory look-up tables. The
propagation delay is therefore independent of the function
implemented.
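
Because a function generator is a look-up table, any 4-input Boolean function reduces to 16 stored bits, and evaluation is a single indexed read regardless of the function. The Python sketch below illustrates the idea. The helper names and the choice of the first input as the least significant address bit are assumptions made for illustration, not details taken from this document.

```python
# Sketch: a 4-input function generator modeled as a 16-entry look-up table.

def make_lut4(func):
    """Tabulate a 4-input Boolean function into 16 stored bits."""
    table = []
    for index in range(16):
        # Assumption: input 1 is address bit 0, input 4 is address bit 3.
        bits = [(index >> k) & 1 for k in range(4)]
        table.append(1 if func(*bits) else 0)
    return table


def read_lut4(table, x1, x2, x3, x4):
    """Evaluate the stored function: one indexed read, whatever the function."""
    return table[x1 | (x2 << 1) | (x3 << 2) | (x4 << 3)]


if __name__ == "__main__":
    parity = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)
    print(read_lut4(parity, 1, 0, 1, 1))
```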
A third function generator, labeled H’, can implement any Boolean
function of its three inputs. Two of these inputs can optionally be
the F’ and G’ function generator outputs. Alternatively, one or
both of these inputs can come from outside the CLB (H2, H0). The
third input must come from outside the block (H1).

Signals from the function generators can exit the CLB on two
outputs. F’ or H’ can be connected to the X output. G’ or H’ can be
connected to the Y output.
A CLB can be used to implement any of the following functions:
• any function of up to four variables, plus any second function
  of up to four unrelated variables, plus any third function of up
  to three unrelated variables (1)
• any single function of five variables
• any function of four variables together with some functions of
  six variables
• some functions of up to nine variables

Implementing wide functions in a single block reduces both the
number of blocks required and the delay in the signal path,
achieving both increased density and speed.

The versatility of the CLB function generators significantly
improves system speed. In addition, the design-software tools can
deal with each function generator independently. This flexibility
improves cell usage.
Flip-Flops
The CLB can pass the combinatorial output(s) to the interconnect
network, but can also store the combinatorial results or other
incoming data in one or two storage elements, and connect their
outputs to the interconnect network as well.

The two storage elements in the CLB are edge-triggered D-type
flip-flops with common clock (K) and clock enable (EC) inputs.
Flip-flop functionality is described in Table 4.
Table 4: CLB Flip-Flop Functionality (no optional inversions used)

Mode               K     EC    SR    D    Q
Power-Up or GSR    X     X     X     X    SR
Flip-Flop          X     X     1     X    SR
                   X     0     0*    X    Q
                   __/   1*    0*    D    D

LEGEND:
X      Don’t care
__/    Rising edge
SR     Set or Reset value specified with INIT property. Reset is
       default.
0*     Input is Low or unconnected (default value)
1*     Input is High or unconnected (default value)

Clock Input
Each flip-flop can be triggered on either the rising or falling
clock edge. The clock pin is shared by both flip-flops; however,
the clock is independently invertible for the two flip-flops. Any
inverter placed on the clock input is absorbed into the CLB.

1. When three separate functions are generated, one of the
function outputs must be captured in a flip-flop internal to the
CLB. Only two unregistered function generator outputs are
available from the CLB.
Clock Enable
The clock enable signal (EC) is active High. If unconnected, it
defaults to the active state. EC can be left unconnected for
either or both flip-flops; therefore, the control is independent.
However, the input is shared by both flip-flops in a CLB. EC is not
invertible within the CLB.
Set/Reset
An asynchronous flip-flop input (SR) can be configured as either
set or reset. This configuration option determines the state in
which the flip-flops become operational after configuration. It
also determines the effect of a Global Set/Reset pulse during
normal operation, and the effect of a pulse on the SR pin of the
CLB. All three set/reset functions for any single flip-flop are
controlled by the same data bit.

The set/reset state can be independently specified for each
flip-flop. This input can also be disabled for either flip-flop.

The set/reset state is specified by using the INIT attribute or
by placing the appropriate set or reset flip-flop primitive.

SR is active High and can affect both flip-flops. It is not
invertible within the CLB.
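
The flip-flop behavior summarized in Table 4 and in the Clock Enable and Set/Reset sections can be captured in a few lines. The Python sketch below is a behavioral model only. The class name and the `update` interface are assumptions, and the clock is reduced to a boolean indicating whether the active edge occurred.

```python
# Behavioral sketch of one CLB flip-flop, following Table 4.

class ClbFlipFlop:
    def __init__(self, init=0):
        # INIT selects the set/reset value: 0 = reset (default), 1 = set.
        self.init = init
        self.q = init  # state after power-up or a Global Set/Reset

    def update(self, active_edge, ec=1, sr=0, d=0):
        """Apply one set of inputs and return Q.

        ec and sr default to their unconnected values (1* and 0*).
        """
        if sr:
            self.q = self.init      # asynchronous set/reset overrides the clock
        elif active_edge and ec:
            self.q = d              # enabled clock edge captures D
        return self.q               # otherwise Q holds its value


if __name__ == "__main__":
    ff = ClbFlipFlop(init=0)
    print(ff.update(True, d=1))           # enabled edge loads D
    print(ff.update(True, ec=0, d=0))     # EC Low: Q holds
    print(ff.update(False, sr=1))         # SR High: back to INIT value
```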
Global Set/Reset
A separate Global Set/Reset line (not shown in Figure 1) sets or
clears each register during power-up, reconfiguration, or when a
dedicated Reset net is driven active. This global net (GSR) does not
compete with other routing resources.

GSR can be driven from any package pin as a global reset input.
To use this global net, place an input pad and input buffer in the
schematic or VHDL code, driving the GSR pin of the STARTUP symbol. A
specific pin location can be assigned to this input, just as for any
other user-programmable pad. An inverter can optionally be
inserted after the input buffer to invert the sense of the Set/Reset
signal.

GSR can also be driven from any internal node.
Data Inputs and Outputs
The source of a flip-flop data input is programmable. It is
driven by any of the functions F’, G’, and H’, or by the Direct In
(DIN) block input. The flip-flops drive the XQ and YQ CLB outputs.

The XQ and YQ outputs are also used by the place and route
software to form a fast bypass through the XC4000E CLB. A two-to-one
multiplexer selects between a flip-flop output and either the DIN or
EC input. This bypass is used by the automated router to repower
internal signals.
Control Signals
Multiplexers in the CLB map the four control inputs (C1 - C4 in
Figure 1) into the four internal control signals (H1, DIN/H2,
SR/H0, and EC). Any of these inputs can drive any of the four
internal control signals.

When the memory function is disabled, the four inputs are:
• EC: Enable Clock
• SR/H0: Asynchronous Set/Reset or H function generator Input 0
• DIN/H2: Direct In or H function generator Input 2
• H1: H function generator Input 1

When the memory function is enabled, the four inputs are:
• EC: Enable Clock
• WE: Write Enable
• D0: Data Input to F and/or G function generator
• D1: Data Input to G function generator (16x1 and 16x2 modes) or
  5th Address bit (32x1 mode)
Using FPGA Flip-Flops
When a function generator drives a flip-flop in a CLB, the
combinatorial propagation delay overlaps completely with the
setup time of the flip-flop. The setup time is specified between
the function generator inputs and the clock input. This represents a
performance advantage over competing technologies, where
combinatorial delays must be added to the flip-flop setup time.

The abundance of flip-flops in the XC4000E-family devices invites
pipelined designs. This is a powerful way of increasing
performance by breaking the function into smaller subfunctions and
executing them in parallel, passing on the results through pipeline
flip-flops. This method should be seriously considered wherever
total performance is more important than simple through-delay.

In the XC4000E family, the flip-flops can be used as registers
or shift registers without blocking the function generators from
performing a different, perhaps unrelated task. This ability
increases the functional density of the devices.
Using Function Generators as RAM
The XC4000E family devices are the first programmable logic
devices with edge-triggered (synchronous) and dual-port RAM
accessible to the user. Edge-triggered RAM simplifies system timing.
Dual-port RAM doubles the effective throughput of FIFO
applications. These features can be individually programmed in any
XC4000E CLB.

Optional modes for each CLB make the memory look-up tables in the
F’ and G’ function generators usable as an array of Read/Write
memory cells. Available modes are level-sensitive (similar to the
XC4000/A/H families), edge-triggered, and dual-port edge-triggered.
Depending on the selected mode, a single CLB can be configured as
either a 16x2, 32x1, or 16x1 bit array.
RAM Configuration Options
The function generators in any CLB can be configured as RAM arrays
in the following sizes:
• Two 16x1 RAMs: two data inputs and two data outputs with
  identical or, if preferred, different addressing for each RAM
• One 32x1 RAM: one data input and one data output

One F or G function generator can be configured as a 16x1 RAM
while the other function generators are used to implement any
function of up to 5 inputs.

Additionally, the XC4000E RAM may have either of two timing modes:
• Edge-Triggered (Synchronous): data written by the designated
  edge of the CLB clock. WE acts as a true clock enable.
• Level-Sensitive: an external WE signal must be supplied
  asynchronously.

The selected timing mode applies to both function generators
within a CLB when both are configured as RAM.

The number of read ports is also programmable:
• Single Port: each function generator has a read port and a
  write port
• Dual Port: both function generators are configured as a single
  16x1 dual-port RAM with one write port and two read ports.
  Simultaneous read and write operations to the same or different
  addresses are supported.

Supported CLB memory configurations and timing modes for single-
and dual-port modes are shown in Table 5.

RAM configuration options are selected by placing the appropriate
library symbol.
Table 5: Supported RAM Modes

               16x1   16x2   32x1   Edge-Triggered   Level-Sensitive
                                    Timing           Timing
Single-Port     X      X      X          X                X
Dual-Port       X                        X

RAM Inputs and Outputs
The F1-F4 and G1-G4 inputs to the function generators act as
address lines, selecting a particular memory cell in each look-up
table.

The functionality of the CLB control signals changes when the
function generators are configured as RAM. The DIN/H2, H1, and
SR/H0 lines become the two data inputs (D0, D1) and the Write Enable
(WE) input for the 16x2 memory. When the 32x1 configuration is
selected, D1 acts as the fifth address bit and D0 is the data
input.
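
In the 32x1 configuration, the two 16-bit look-up tables act as one memory, with D1 as the fifth address bit. The Python sketch below models that addressing. Which half D1 = 1 selects (G here) is an assumption made for illustration, as are the class and method names.

```python
# Sketch: two 16x1 arrays combined into a 32x1 RAM, D1 as fifth address bit.

class Ram32x1:
    def __init__(self):
        self.f_ram = [0] * 16
        self.g_ram = [0] * 16

    def _half(self, d1):
        # Assumption: D1 = 0 selects the F array, D1 = 1 selects the G array.
        return self.g_ram if d1 else self.f_ram

    def write(self, addr4, d1, d0):
        """Store data bit D0 at the cell chosen by the 4-bit address and D1."""
        self._half(d1)[addr4 & 0xF] = d0 & 1

    def read(self, addr4, d1):
        """Read the cell chosen by the 4-bit address and D1."""
        return self._half(d1)[addr4 & 0xF]


if __name__ == "__main__":
    ram = Ram32x1()
    ram.write(addr4=3, d1=1, d0=1)
    print(ram.read(3, 1), ram.read(3, 0))
```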
The contents of the memory cell(s) being addressed are available
at the F’ and G’ function-generator outputs. They can exit the CLB
through its X and Y outputs, or can be captured in the CLB
flip-flop(s).

Configuring the CLB function generators as Read/Write memory does
not affect the functionality of the other portions of the CLB,
with the exception of the redefinition of the control signals. The
H’ function generator can be used to implement Boolean functions of
F’, G’, and D1, and the D flip-flops can latch the F’, G’, H’, or D0
signals.
Single-Port Edge-Triggered Mode
Edge-triggered RAM simplifies timing requirements. The XC4000E
edge-triggered RAM timing operates like writing to a data register.
Data and address are presented. The register is enabled for writing
by a logic High on the write enable input, WE. Then a rising or
falling clock edge loads the data into the register, as shown in
Figure 2.

Careful timing relationships between address, data, and write
enable signals are not required, and the external write enable pulse
becomes a simple clock enable. The rising edge of WCLK latches the
address, input data, and WE signals. An internal write pulse is
generated that performs the write. See Figure 3 and Figure 4 for
block diagrams of a CLB configured as 16x2 and 32x1 edge-triggered,
single-port RAM.
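
The register-like write behavior described above can be sketched as a small behavioral model: the address, data, and WE levels present at the active clock edge are the only ones that matter. In the Python sketch below, the class name and the step-based interface are assumptions, and only the default rising-edge polarity is modeled.

```python
# Behavioral sketch of a 16x1 edge-triggered single-port RAM.

class EdgeTriggeredRam16x1:
    def __init__(self):
        self.cells = [0] * 16
        self._prev_wclk = 0

    def step(self, wclk, we, addr, data):
        """Apply one set of input levels; write only on a rising WCLK edge."""
        rising_edge = wclk == 1 and self._prev_wclk == 0
        self._prev_wclk = wclk
        if rising_edge and we:
            # Address, data and WE are sampled at the edge itself.
            self.cells[addr & 0xF] = data & 1

    def read(self, addr):
        """Reads are not clocked; the addressed cell is always visible."""
        return self.cells[addr & 0xF]


if __name__ == "__main__":
    ram = EdgeTriggeredRam16x1()
    ram.step(wclk=0, we=1, addr=3, data=1)  # no edge yet: nothing written
    ram.step(wclk=1, we=1, addr=3, data=1)  # rising edge: cell 3 written
    ram.step(wclk=1, we=1, addr=5, data=1)  # clock still High: no write
    print(ram.read(3), ram.read(5))
```

Changing the address or data while the clock stays High has no effect in this model, which is the property that removes the careful WE timing needed in level-sensitive mode.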
Figure 2: Edge-Triggered RAM Write Timing
[Timing diagram: waveforms for WCLK (K), WE, ADDRESS, DATA IN and
DATA OUT (OLD to NEW); labeled parameters TDSS, TDHS, TASS, TAHS,
TWSS, TWPS, TWHS, TWOS and TILO]
Figure 3: 16x2 (or 16x1) Edge-Triggered Single-Port RAM
[Block diagram: for each of the F and G function generators, a
1-of-16 write decoder and a 16-latch array with DIN, read address,
latch enable and write pulse; control inputs C1-C4 mapped to WE, D1,
D0 and EC; outputs F’ and G’]

Figure 4: 32x1 Edge-Triggered Single-Port RAM (F and G addresses
are identical)
[Block diagram: same structure as Figure 3, with the additional
output H’]
Read data is not clocked. It appears asynchronously at the
function generator output a certain time after the address inputs
have settled (TILO for 16x2 and 16x1, TIHO for 32x1).

The relationships between CLB pins and RAM inputs and outputs for
single-port, edge-triggered mode are shown in Table 6.

The Write Clock input (WCLK) can be configured as active on
either the rising edge (default) or the falling edge. It uses the
same CLB pin (K) used to clock the CLB flip-flops, but it can be
independently inverted. Consequently, the RAM output can optionally
be registered within the same CLB either by the same clock edge as
the RAM, or by the opposite edge of this clock. The sense of WCLK
applies to both function generators in the CLB when both are
configured as RAM.

The WE pin is active-High and is not invertible within the CLB.

The pulse following the active edge of WCLK has a maximum
pulsewidth requirement, due to the construction of the RAM. The
specification is on the order of milliseconds and should not be a
serious restriction; however, it should not be forgotten.
Table 6: Single-Port Edge-Triggered RAM Signals
Dual-Port Edge-Triggered ModeIn dual-port mode, both the F and G
function generatorsare used to create a single 16x1 RAM array with
one writeport and two read ports. The resulting RAM array can
beread and written simultaneously at two independentaddresses.
Simultaneous read and write operations at thesame address are also
supported.
Dual-port mode always has edge-triggered write timing, asshown
in Figure 2 on page 8.
Figure 5 shows a simple model of an XC4000E CLB config-ured as
dual-port RAM. One address port, labeled A[3:0],supplies both the
read and write address for the F functiongenerator. This function
generator behaves the same as a16x1 single-port edge-triggered RAM
array. The RAM out-put, Single Port Out (SPO), appears at the F
function gen-erator output.
RAM Signal CLB Pin Function
D D0 or D1 Data In
A[3:0] F1-F4 orG1-G4
Address
WE WE Write Enable
WCLK K Clock
SPO(Data Out)
F’ or G’ Single Port Out(Data Out)
The other address port, labeled DPRA[3:0] for Dual PortRead
Address, supplies the read address for the G functiongenerator. The
write address for the G function generator,however, comes from the
address A[3:0]. The output fromthis 16x1 RAM array, Dual Port Out
(DPO), appears at theG function generator output.
Therefore, by using A[3:0] for the write address andDPRA[3:0]
for the read address, and reading only the DPOoutput, a FIFO that
can read and write simultaneously iseasily generated. Simultaneous
access doubles the effec-tive throughput of the FIFO.
The relationships between CLB pins and RAM inputs andoutputs for
dual-port, edge-triggered mode are shown inTable 7. See Figure 6
for a block diagram of a CLB config-ured in this mode.
The pulse following the active edge of WCLK has a maxi-mum
pulsewidth requirement, due to the construction of theRAM. The
specification is on the order of milliseconds andshould not be a
serious restriction; however, it should notbe forgotten.
Table 7: Dual-Port Edge-Triggered RAM Signals
Single-Port Level-Sensitive Timing ModeNote: Edge-triggered mode
is recommended for all newdesigns. Level-sensitive mode is still
supported forXC4000E backward-compatibility with the XC4000
family.
Level-sensitive RAM timing is simple in concept but can be
complicated in execution. Data and address signals are presented,
then a positive pulse on the write enable (WE) performs a write into
the RAM at the designated address. As indicated by the
“level-sensitive” label, this RAM acts like a latch. During the WE
High pulse, changing the data lines results in new data written to
the old address. Changing the address lines while WE is High
results in spurious data written to the new address, and possibly at
other addresses as well, as the address lines inevitably do not
all change simultaneously.
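The address hazard described above can be illustrated with a short sketch. It assumes, purely for illustration, that the address bits switch one at a time in a given order while WE stays High and that the latch array is written at every address it sees; the function names are hypothetical.

```python
def skewed_transition(old_addr, new_addr, bit_order):
    """List the addresses seen when the bits change one at a time in bit_order."""
    addr = old_addr
    visited = [addr]
    for bit in bit_order:
        mask = 1 << bit
        if (addr ^ new_addr) & mask:
            addr ^= mask          # this address line has now switched
            visited.append(addr)
    return visited


def write_while_we_high(cells, addresses, data):
    """Level-sensitive write: every address seen while WE is High gets the data."""
    for addr in addresses:
        cells[addr] = data
    return cells


# Changing the address from 0111 to 1000 with WE still High.
visited = skewed_transition(0b0111, 0b1000, bit_order=[0, 1, 2, 3])
cells = write_while_we_high([0] * 16, visited, data=1)
```

Under these assumptions a single intended write touches five locations (7, 6, 4, 0 and 8), which is why WE must be inactive whenever the address lines change.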
RAM Signal   CLB Pin   Function
D            D0        Data In
A[3:0]       F1-F4     Read Address for F, Write Address for F and G
DPRA[3:0]    G1-G4     Read Address for G
WE           WE        Write Enable
WCLK         K         Clock
SPO          F’        Single Port Out
DPO          G’        Dual Port Out
Figure 5: XC4000E Dual-Port RAM, Simple Model
Figure 6: 16x1 Edge-Triggered Dual-Port RAM
The user must generate a carefully timed WE signal. The delay on
the WE signal and the address lines must be carefully verified to
ensure that WE does not become active until after the address lines
have settled, and that WE goes inactive before the address lines
change again. The data must be stable before and after the falling
edge of WE.
In practical terms, WE is usually generated by a 2X clock. If a
2X clock is not available, the falling edge of the system clock can
be used. However, there are inherent risks in this approach, since
the WE pulse must be guaranteed inactive before the next rising edge
of the system clock. Several application notes are available from
Xilinx that discuss the design of level-sensitive RAMs. These
application notes include XAPP031, “Using the XC4000 RAM
Capability,” and XAPP042, “High-Speed RAM Design in XC4000.” However,
the edge-triggered RAM available within the XC4000E is superior to
level-sensitive RAM for nearly every application.
Figure 7 shows the write timing for level-sensitive, single-port
RAM.
Figure 8 and Figure 9 show block diagrams of a CLB configured
as 16x2 and 32x1 level-sensitive, single-port RAM.
The relationships between CLB pins and RAM inputs and outputs for
single-port level-sensitive mode are shown in Table 8.
Table 8: Single-Port Level-Sensitive RAM Signals
RAM Signal   CLB Pin           Function
D            D0 or D1          Data In
A[3:0]       F1-F4 or G1-G4    Address
WE           WE                Write Enable
O            F’ or G’          Data Out
Figure 7: Level-Sensitive RAM Write Timing
(Waveforms: ADDRESS, WRITE ENABLE, DATA IN. Timing parameters:
T_WC, T_AS, T_WP, T_DS, T_DH, T_AH.)
Figure 8: 16x2 (or 16x1) Level-Sensitive Single-Port RAM
Figure 9: 32x1 Level-Sensitive Single-Port RAM (F and G
addresses are identical)
Initializing RAM at Configuration
Both RAM and ROM implementations of the XC4000E devices are
initialized at power-up. The initial contents are defined via an
INIT attribute or property attached to the RAM or ROM symbol, as
described in the schematic library guide.
If not defined, all RAM contents are initialized to zeros by
default.
RAM initialization occurs only during configuration. The RAM
content is not affected by Global Set/Reset.
Advantages of On-Chip and Edge-Triggered RAM
The on-chip RAM is extremely fast. The read access time is the
same as the logic delay. The write access time is slightly slower.
Both access times are much faster than any off-chip solution.
Edge-triggered RAM, also called synchronous RAM, is a feature
never before available in a Field Programmable Gate Array. The
simplicity of designing with edge-triggered RAM, combined with
greatly improved system speeds from the elimination of the 2X clock,
adds up to a significant improvement over existing devices with
on-chip RAM.
Two application notes are available from Xilinx that discuss RAM
in the XC4000E: “XC4000E Edge-Triggered and Dual-Port RAM
Capability,” and “Implementing FIFOs in XC4000E RAM.”
Fast Carry Logic
Each CLB F and G function generator contains dedicated arithmetic
logic for the fast generation of carry and borrow signals. This
extra output is passed on to the next CLB function generator above
or below. The carry chain is independent of normal routing
resources.
Dedicated fast carry logic greatly increases the efficiency and
performance of adders, subtracters, accumulators, comparators and
counters.
The two 4-input function generators can be configured as a 2-bit
adder with built-in hidden carry that can be expanded to any length.
This dedicated carry circuitry is so fast and efficient that
conventional speed-up methods like carry generate/propagate are
meaningless even at the 16-bit level, and of marginal benefit at the
32-bit level.
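As a rough functional illustration of expanding the 2-bit adder to any length, the sketch below chains 2-bit slices through a carry signal. It models arithmetic behavior only, not the CLB circuitry or its timing, and the function names are hypothetical.

```python
def clb_add2(a, b, cin):
    """One CLB modeled as a 2-bit adder: returns (2-bit sum, carry out)."""
    total = (a & 0b11) + (b & 0b11) + (cin & 1)
    return total & 0b11, total >> 2


def chained_add(a, b, width):
    """Add two width-bit numbers by chaining 2-bit slices through the carry."""
    if width % 2 != 0:
        raise ValueError("width must be a multiple of 2 (two bits per CLB)")
    result, carry = 0, 0
    for i in range(0, width, 2):
        # The carry out of one slice feeds the carry in of the next slice.
        s, carry = clb_add2(a >> i, b >> i, carry)
        result |= s << i
    return result, carry
```

A 16-bit adder built this way uses eight slices, one per CLB in this model, with the carry passing from each slice to the next.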
The fast-carry logic opens the door to many new applications
involving arithmetic operation, where the previous generations of
FPGAs were not fast enough or were too inefficient. High-speed
address offset calculations in microprocessor or graphics systems,
and high-speed addition in digital signal processing are two typical
applications.
This fast carry logic is one of the more significant features of
the XC4000E family, speeding up arithmetic and counting into the
70 MHz range.
The fast carry logic can be accessed by placing special library
symbols, or by using Xilinx Relationally Placed Macros (RPMs) that
already include these symbols.
Figure 10 shows the fast carry logic in one XC4000E CLB.
Figure 10: Fast Carry Logic in XC4000E CLB
Figure 11: Simplified Block Diagram of XC4000E IOB
Input/Output Blocks (IOBs)
User-configurable input/output blocks (IOBs) provide the
interface between external package pins and the internal logic.
Each IOB controls one package pin and can be defined for input,
output, or bidirectional signals.
Figure 11 shows a simplified block diagram of the XC4000E IOB. A
more complete diagram can be found in Figure 20 on page 24, in the
Boundary Scan section.
Input Signals
Two paths, labeled I1 and I2 in Figure 11, bring input signals
into the array. Inputs also connect to an input register that can be
programmed as either an edge-triggered flip-flop or a
level-sensitive transparent-Low latch. The choice is made by placing
the appropriate primitive from the symbol library.
The inputs can be globally configured for either TTL (1.2 V) or
CMOS (2.5 V) thresholds. The two global adjustments of input
threshold and output level are independent of each other. There is a
slight hysteresis of about 300 mV.
Registered Inputs
The I1 and I2 signals that exit the block can each carry either
the direct or registered input signal.
The input and output storage elements in each IOB have a common
clock enable input, which through configuration can be activated
individually for the input or output flip-flop or both. This clock
enable operates exactly like the EC pin on the XC4000E CLB. It
cannot be inverted within the IOB.
The storage element behavior is shown in Table 9.
Table 9: Input Register Functionality
Optional Delay Guarantees Zero Hold Time
The data input to the register can optionally be delayed by
several nanoseconds. With the delay enabled, the setup time of the
input flip-flop is increased so that normal clock routing does not
result in a positive hold-time requirement. A positive hold time can
lead to unreliable, temperature- or processing-dependent operation.
The input flip-flop setup time is defined between the data
measured at the device I/O pin and the clock input at the IOB.
Any routing delay from the clock pad to the clock pin of the IOB
must, therefore, be subtracted from this setup time to arrive at the
real setup time requirement relative to the device pins. A short
specified setup time might, therefore, result in a negative setup
time at the device pins, i.e., a positive hold-time requirement.
When the delay is inserted on the data line, more clock delay can
be tolerated without causing a positive hold-time requirement. This
delay eliminates the possibility of a data hold-time requirement at
the external pin. The delay is therefore inserted as the default.
For faster input register setup time, with non-zero hold, attach a
NODELAY attribute or property to the flip-flop.
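The setup-time arithmetic in this section can be worked through with a small sketch. The numbers in the usage lines are made up for illustration and are not data sheet values; the function names are hypothetical, and the model follows the simplification used in the text (a negative setup time at the pins means a positive hold-time requirement).

```python
def pin_setup_time(iob_setup_ns, clock_route_delay_ns):
    """Setup time relative to the device pins.

    The specified setup time is measured against the clock at the IOB, so the
    routing delay from the clock pad to the IOB clock pin is subtracted.
    """
    return iob_setup_ns - clock_route_delay_ns


def needs_positive_hold(iob_setup_ns, clock_route_delay_ns):
    """A negative pin-referenced setup time implies a positive hold requirement."""
    return pin_setup_time(iob_setup_ns, clock_route_delay_ns) < 0


# Illustrative values only (nanoseconds), not taken from the data sheet.
without_delay = pin_setup_time(iob_setup_ns=2.0, clock_route_delay_ns=3.5)
with_delay = pin_setup_time(iob_setup_ns=6.0, clock_route_delay_ns=3.5)
```

With these assumed numbers the short setup time yields -1.5 ns at the pins (a hold requirement), while the longer setup time that comes with the inserted delay yields 2.5 ns and no hold requirement.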
Mode              Clock   Clock Enable   D   Q
Power-Up or GSR   X       X              X   SR
Flip-Flop         __/     1*             D   D
Latch             1       1*             D   Q
                  0       1*             D   D
Both              X       0              X   Q

Legend:
X     Don’t care
__/   Rising edge
SR    Set or Reset value specified with INIT property. Reset is
      default.
0*    Input is Low or unconnected (default value)
1*    Input is High or unconnected (default value)
Output Signals
Output signals can be optionally inverted within the IOB, and can
pass directly to the pad or be stored in an edge-triggered
flip-flop. The functionality of this flip-flop is shown in
Table 10.
An output enable signal can be used to place the output buffer in
a high-impedance state, implementing 3-state outputs or
bidirectional I/O. Under configuration control, the output (OUT) and
output enable (OE) signals can be inverted. The polarity of these
signals is independently configured for each IOB.
The 4 mA maximum output current specification of many FPGAs often
forces the user to add external buffers, which are especially
cumbersome on bidirectional I/O lines. The XC4000E family solves
many of these problems by providing a guaranteed output sink
current of 12 mA. Two adjacent outputs can be interconnected
externally to sink up to 24 mA. The FPGA can thus drive short buses
on a printed circuit board.
By default, the output pull-up structure is configured as a
TTL-like totem-pole. This driver is an n-channel pull-up
transistor, pulling to a voltage one threshold below Vcc.
Alternatively, the output can be configured as a CMOS driver,
with a p-channel pull-up transistor pulling to Vcc. This option
applies to every output on the device.
An output can be configured as open-collector by placing an OBUFT
symbol in a schematic or VHDL code, then tying the 3-state pin (T)
to the input pin (I).
Table 10: Output Flip-Flop Functionality (no optional inversions
used)
Mode              Clock   Clock Enable   OE   D   Q
Power-Up or GSR   X       X              0*   X   SR
Flip-Flop         X       0              0*   X   Q
                  __/     1*             0*   D   D
                  X       X              1    X   Z

Legend:
X     Don’t care
__/   Rising edge
SR    Set or Reset value specified with INIT property. Reset is
      default.
0*    Input is Low or unconnected (default value)
1*    Input is High or unconnected (default value)
Z     3-State
Output Slew Rate
The slew rate of each output buffer is by default reduced, to
minimize power bus transients when switching non-critical
signals. For critical signals, attach a FAST attribute or
property to the output buffer or flip-flop.
For XC4000E devices, maximum total capacitive load for
simultaneous fast mode switching in the same direction is 200 pF
per Power/Ground pin pair. For slew-rate limited outputs this total
is two times larger. This maximum capacitive load should not be
exceeded, as it can result in ground bounce of greater than 1.5 V
amplitude and more than 5 ns duration. This level of ground bounce
may cause undesired transient behavior on an output, or in the
internal logic. This restriction is common to all high-speed digital
ICs, and is not particular to Xilinx or the XC4000E family.
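A quick budget check based on the limits above might look like the sketch below. It assumes, as a simplification not stated in the text, that the switching outputs are spread evenly over the available Power/Ground pin pairs; the function name and the example loads are hypothetical.

```python
FAST_LIMIT_PF = 200                          # per Power/Ground pin pair, fast mode
SLEW_LIMITED_LIMIT_PF = 2 * FAST_LIMIT_PF    # "two times larger" for slew-limited


def load_within_limit(load_pf_per_output, n_outputs, n_power_ground_pairs, fast):
    """Check the total simultaneously switching load against the stated limit.

    Assumes the outputs are distributed evenly across the pin pairs.
    """
    total_pf = load_pf_per_output * n_outputs
    per_pair_limit = FAST_LIMIT_PF if fast else SLEW_LIMITED_LIMIT_PF
    return total_pf <= per_pair_limit * n_power_ground_pairs
```

For example, twenty outputs of 50 pF each on four pin pairs total 1000 pF, which exceeds the 800 pF fast-mode budget but fits the 1600 pF slew-limited budget.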
The XC4000E family has a feature called “Soft Startup,” designed
to avoid potential ground bounce when all outputs are turned on
simultaneously at the end of configuration. When the configuration
process is finished and the device starts up in user mode, the first
activation of the outputs is automatically slew-rate limited.
Immediately following the first activation of the I/O, the slew
rate of the individual outputs is determined by the individual
configuration option for each IOB.
Global Three-State
A separate Global 3-State line (not shown in Figure 11) forces all
FPGA outputs to the high-impedance state, unless boundary scan is
enabled and is executing an EXTEST instruction. This global net
(GTS) does not compete with other routing resources.
GTS can be driven from any package pin as a global 3-state
input. To use this global net, place an input pad and input buffer
in the schematic or VHDL code, driving the GTS pin of the STARTUP
symbol. A specific pin location can be assigned to this input just
as for any other user-programmable pad. An inverter can optionally
be inserted after the input buffer to invert the sense of the Global
3-State signal.
GTS can also be driven from any internal node.
Other IOB Options
There are a number of other programmable options in the IOB.
Pull-up and Pull-down Resistors
Programmable pull-up and pull-down resistors are useful for tying
unused pins to Vcc or Ground to minimize power consumption. The
configurable pull-up resistor is a p-channel transistor that pulls
to Vcc. The configurable pull-down resistor is an n-channel
transistor that pulls to Ground.
The value of these resistors is 50 kΩ to 100 kΩ. This high value
makes them unsuitable as wired-AND pull-up resistors.
The pull-up resistors for most user-programmable IOBs are active
during the configuration process. See the “Pin Functions During
Configuration” table for a list of pins with pull-ups active before
and during configuration.
After configuration, voltage levels of unused pads, bonded or
unbonded, must be valid logic levels. Therefore, by default, unused
pads are configured with the internal pull-up resistor.
Alternatively, they can be individually configured with the
pull-down resistor, or as a driven output, or to be driven by an
external source.
Independent Clocks
Separate clock signals are provided for the input and output
registers. The clock can be independently inverted for each
flip-flop within the IOB, generating either falling-edge or
rising-edge triggered flip-flops. The clock inputs for each IOB are
independent.
Global Set/Reset
As with the CLB registers, the Global Set/Reset signal (GSR) can be
used to set or clear the input and output registers, depending on
the value of the INIT attribute or property. The two flip-flops can
be individually configured to set or clear on reset and after
configuration. Other than the global GSR net, no user-controlled
set/reset signal is available to the I/O flip-flops. The choice of
set or clear applies to both the initial state of the flip-flop and
the response to the Global Set/Reset pulse.
JTAG Support
Embedded logic attached to the IOBs contains test structures
compatible with IEEE Standard 1149.1 for boundary-scan testing,
permitting easy chip and board-level testing. More information is
provided in the Boundary Scan section later in this Data Sheet.
Programmable Interconnect
All internal connections are composed of metal segments with
programmable switching points and switching matrices to implement
the desired routing. A structured, hierarchical matrix of routing
resources is provided to achieve efficient automated routing.
Four Types of Interconnect
There are four main types of interconnect. Three are
distinguished by the relative length of their segments:
single-length lines, double-length lines, and Longlines. In
addition, eight global buffers drive fast, low-skew nets most often
used for clocks or global control signals.
Single-length lines and double-length lines are connected by way
of programmable switch matrices.
Programmable Switch Matrices
The single-length lines are a grid of horizontal and vertical
lines that intersect at a switch matrix between each block.
Figure 12 illustrates the single-length interconnect lines
surrounding one CLB in the array, and the switch matrices
connecting those lines. Each switch matrix consists of
programmable n-channel pass transistors used to establish
connections between the single-length lines (Figure 13).
For example, a signal entering on the right side of the switch
matrix can be routed to a single-length line on the top, left, or
bottom sides, or any combination thereof, if multiple branches are
required.
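Figure 13 is labeled as having six pass transistors per switch matrix interconnect point. One way to picture this, offered here as an assumption rather than a statement of the actual circuit, is one pass transistor for each pair of the four sides that meet at the point. The names in the sketch are hypothetical.

```python
from itertools import combinations

SIDES = ("top", "right", "bottom", "left")

# Assumed model: one pass transistor joins each unordered pair of sides.
PASS_TRANSISTORS = [frozenset(pair) for pair in combinations(SIDES, 2)]


def transistors_for(entry_side, exit_sides):
    """Pass transistors to turn on so a signal entering on one side
    reaches each requested exit side."""
    return {frozenset((entry_side, exit_side)) for exit_side in exit_sides}


# The example from the text: enter on the right, branch to top, left and bottom.
branches = transistors_for("right", ["top", "left", "bottom"])
```

Four sides give six pairs, which matches the figure label; the three-way branch in the example uses three of the six.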
Figure 12: Typical CLB Connections to Adjacent Single-Length
Lines
Figure 13: Programmable Switch Matrix
Single-Length Lines
Single-length lines provide the greatest interconnect flexibility
and offer fast routing between adjacent blocks. These lines connect
the switching matrices that are located at every intersection of a
row and a column of CLBs. However, they incur a delay whenever they
go through a switching matrix.
(Figure 13 label: six pass transistors per switch matrix
interconnect point.)
Single-length lines are normally used to conduct signals within a
localized area and to provide the branching for nets with fanout
greater than one.
Double-Length Lines
The double-length lines consist of a grid of metal segments twice
as long as the single-length lines: they run past two CLBs before
entering a switch matrix. Double-length lines are grouped in pairs
with the switch matrices staggered, so that each line goes through a
switch matrix at every other CLB location in that row or column (see
Figure 14).
They provide faster signal routing over intermediate distances,
while retaining routing flexibility.
Figure 14: Double-Length Lines
Longlines
Longlines form a grid of metal interconnect segments that run the
entire length or width of the array. A Longline net has negligible
delay variations. Longlines are intended for high fan-out,
time-critical signal nets, or nets that are distributed over long
distances.
Two horizontal Longlines per CLB can be driven by 3-state or
open-drain drivers. They can therefore implement unidirectional or
bidirectional buses, wide multiplexers, or wired-AND functions.
(See the Three-State Buffers section for more details.)
Each Longline has a programmable splitter switch at its center
that can separate the line into two independent routing channels,
each running half the width or height of the array.
Each horizontal Longline driven by TBUFs has a pull-up resistor
at each end. To activate these resistors, place a PULLUP symbol and
attach it to the Longline net. The software automatically
activates one or both pull-ups as appropriate.
There is also a weak keeper at each end of each horizontal
Longline. This circuit prevents undefined floating levels.
Global Nets and Buffers
Additional vertical Longlines are driven by special global
buffers, designed to distribute clocks and other high fanout
control signals throughout the array with minimal skew. More
dedicated global resources are provided than in most available
programmable logic devices. Four primary global nets offer the
shortest delay and negligible skew. Four secondary global nets
have slightly longer delay and slightly more skew due to heavier
loading, but offer greater flexibility when used to drive
non-clock CLB inputs.
The primary global buffers must be driven by the dedicated pads.
The secondary global buffers may be sourced by either dedicated pads
or internal nets.
Each CLB column has four dedicated vertical Longlines. Each of
these lines has access to a particular primary global net, or to
any of the secondary global nets, as shown in Figure 15. Each corner
of the device has one primary input and one secondary input.
The user must specify these global nets for all timing-sensitive
global signal distribution. To use a global net, place a
BUFGP (primary buffer), BUFGS (secondary buffer), or BUFG
(either primary or secondary buffer) element in a schematic or VHDL
code.
Figure 15: XC4000E Global Net Distribution
Connections Between Lines
Communication between Longlines and single-length lines is
controlled by programmable interconnect points at the line
intersections. Double-length lines do not connect to other
lines.
CLB Routing Connections
CLB inputs and outputs are distributed on all four sides of the
block, providing maximum routing flexibility. In general, the entire
architecture is very symmetrical and regular. It is well suited to
established placement and routing algorithms developed for
conventional mask-programmed gate-array design. Inputs, outputs, and
function generators can freely swap positions within a CLB to avoid
routing congestion during the placement and routing operation.
Connecting to Single-Length Lines
The function generator and control inputs to the CLB (F1-F4,
G1-G4, and C1-C4) can be driven from any adjacent single-length
line segment. Figure 12 shows typical CLB connections to the
adjacent single-length lines. (Note: The number of routing channels
shown in Figure 12 is for illustrative purposes only.) The CLB clock
input (K) can be driven from half of the adjacent single-length
lines. Each CLB output can drive several of the single-length lines,
with connections to both the horizontal and vertical Longlines.
Connecting to Double-Length Lines
As with single-length lines, all the CLB inputs except K can be
driven from any adjacent double-length line. Each CLB output can
drive nearby double-length lines in both the vertical and
horizontal planes.
Connecting to Longlines
CLB inputs can be driven from a subset of the adjacent Longlines
(see Figure 16). CLB outputs are routed to the Longlines via
3-state buffers or the single-length interconnect lines.
Three-State Buffers
A pair of 3-state buffers is associated with each CLB in the
array. These 3-state buffers can be used to drive signals onto
the nearest horizontal Longlines above and below the block. They can
therefore be used to implement multiplexed or bidirectional buses
on the horizontal Longlines, saving logic resources. Programmable
pull-up resistors attached to both ends of these Longlines help to
implement a wide wired-AND function.
The 3-state buffer input can be driven from any X, Y, XQ, or YQ
output of the neighboring CLB, or from nearby single-length lines.
The buffer enable can come from nearby vertical single-length or
Longlines. The enable is an active-High 3-state, or an active-Low
enable, as shown in Table 11.
Another 3-state buffer with similar access is located near each
I/O block along the right and left edges of the array.
Figure 16: Longline Routing Resources with Typical CLB
Connections
Special Longlines running along the perimeter of the array can be
used to wire-AND signals coming from nearby IOBs or from internal
Longlines. These Longlines form the wide edge decoders discussed in
the next section, Wide Edge Decoders.
Table 11: Three-State Buffer Functionality
Three-State Buffer Modes
There are three modes in which the 3-state buffers can be
configured:
• Standard 3-state buffer
• Wired-AND with input on the I pin
• Wired OR-AND
Standard 3-State Buffer
All three pins are used. Place the library element BUFT.
Tie the input to the I pin and the output to the O pin. The T
pin is an active-High 3-state or an active-Low enable.
Wired-AND with Input on the I Pin
The buffer can be used as a Wired-AND. Use the WAND1 library
symbol, which is essentially an open-drain buffer.
(WAND4, WAND8, and WAND16 are also available. See the XACT
Libraries Guide for further information.)
IN    T    OUT
X     1    Z
IN    0    IN
The T pin is internally tied to the I pin. Tie the input to the I
pin and the output to the O pin. Tie the outputs of all the WAND1s
together and attach a PULLUP symbol.
Wired OR-AND
The buffer can be configured as a Wired OR-AND. A High level on
either input turns off the output. Use the WOR2AND library symbol,
which is essentially an open-drain 2-input OR gate.
The two input pins are functionally equivalent. Attach the two
inputs to the I0 and I1 pins and tie the output to the O pin. Tie
the outputs of all the WOR2ANDs together and attach a PULLUP
symbol.
Three-State Buffer Examples
An example showing how to use the 3-state buffers to implement a
wired-AND function is shown in Figure 17. When all the buffer inputs
are High, the pull-up resistor(s) provide the High output.
An example showing how to use the 3-state buffers to implement a
multiplexer is shown in Figure 18. The selection is accomplished
by the buffer 3-state signal.
Pay particular attention to the polarity of the T pin when using
these buffers in a design. Active-High T is identical to an
active-Low output enable.
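The buffer modes and the two examples can be modeled in a few lines. This is a logic-level sketch only: the function names are hypothetical, high impedance is represented by `None`, and an undriven line is assumed to read High because of the pull-up.

```python
def tbuf(i, t):
    """Standard 3-state buffer: T High gives high impedance (None), T Low passes I."""
    return None if t else i


def wand1(i):
    """Open-drain buffer: T is tied to I, so only a Low input drives the line."""
    return tbuf(i, t=i)


def wor2and(i0, i1):
    """Open-drain 2-input OR: a High on either input turns the output off."""
    return None if (i0 or i1) else 0


def resolve(drivers):
    """Level on a Longline with a pull-up: High unless some buffer drives it."""
    driven = {value for value in drivers if value is not None}
    if len(driven) > 1:
        raise ValueError("bus contention: buffers drive opposite levels")
    return driven.pop() if driven else 1


def wired_and(da, db, dc, dd, de, df):
    """Wired-AND arrangement: Z = DA * DB * (DC + DD) * (DE + DF)."""
    return resolve([wand1(da), wand1(db), wor2and(dc, dd), wor2and(de, df)])


def tbuf_mux(data, selects):
    """Multiplexer: the selected buffer has its T pin Low (active-Low enable)."""
    return resolve([tbuf(d, t=0 if s else 1) for d, s in zip(data, selects)])
```

Note how the select signals are inverted before they reach T in `tbuf_mux`; this is the polarity point made in the preceding paragraph.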
Wide Edge Decoders
Dedicated circuitry boosts the performance of wide decoding
functions. When the address or data field is wider than the function
generator inputs, FPGAs need multi-level decoding and are thus
slower than PALs. XC4000E-family CLBs have nine inputs. Any decoder
of up to nine inputs is, therefore, compact and fast. However, there
is also a need for much wider decoders, especially for address
decoding in large microprocessor systems.
An XC4000E FPGA has four programmable decoders located on each
edge of the device. The inputs to each decoder are any of the I1
signals on that edge plus one local interconnect per CLB row or
column. Each decoder generates a High output (resistor pull-up) when
the AND condition of the selected inputs, or their complements, is
true. This is analogous to the AND term in typical PAL
devices.
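The decoder function described above amounts to a single AND term over selected inputs or their complements. The sketch below models that at the logic level; the signal names (`a0` to `a15`) and helper functions are hypothetical and chosen to resemble an address-decoding use.

```python
def edge_decoder(signals, term):
    """Return 1 when every selected input has its required level.

    signals: dict mapping a signal name to its current level (0 or 1)
    term:    dict mapping each selected signal to the level it must have;
             requiring 0 stands for using the complement of that input
    """
    return int(all(signals[name] == level for name, level in term.items()))


def address_levels(value, width):
    """Levels of address lines a0..a(width-1) for a given address value."""
    return {f"a{bit}": (value >> bit) & 1 for bit in range(width)}


# Decode one specific 16-bit address (all 16 lines selected).
full_term = address_levels(0xA5C3, 16)
# Decode a region using only the two top address lines: a15 AND NOT a14.
region_term = {"a15": 1, "a14": 0}
```

Inputs that are not listed in the term are simply not connected to the decoder, which is how a narrow term selects a whole address region.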
Figure 17: Open-Drain Buffers Implement a Wired-AND Function
    Z = DA • DB • (DC + DD) • (DE + DF)
Figure 18: 3-State Buffers Implement a Multiplexer
    Z = DA • A + DB • B + DC • C + DN • N
Each of these wired-AND gates is capable of accepting up to 42
inputs on the XC4005E and 72 on the XC4013E. These decoders may also
be split in two when a large number of narrower decoders are
required, for a maximum of 32 decoders per device (four decoders on
each of the four edges, each split in two).
The decoder outputs can drive CLB inputs, so they can be combined
with other logic to form a PAL-like AND/OR structure. The decoder
outputs can also be routed directly to the chip outputs. For fastest
speed, the output should be on the same chip edge as the decoder.
Very large PALs can be emulated by ORing the decoder outputs in a
CLB. This decoding feature covers what has long been considered a
weakness of older FPGAs. Users often resorted to external PALs for
simple but fast decoding functions. Now, the dedicated decoders in
the XC4000E can implement these functions fast and efficiently.
Figure 19 shows an example of edge decoding. Each row or column
of CLBs provides up to three variables or their complements.
To use the wide edge decoders, place one or more of the WAND
library symbols (WAND1, WAND4, WAND8, WAND16). Attach a DECODE
attribute or property to each WAND symbol. Tie the outputs together
and attach a PULLUP symbol.
Figure 19: Edge Decoding Example
Oscillator
The XC4000E devices include an internal oscillator. This
oscillator is used to clock the power-on time-out, for
configuration memory clearing, and as the source of CCLK in Master
configuration modes. The oscillator runs at a nominal 8 MHz and
varies with process, Vcc, and temperature. The output frequency
falls between 4 and 10 MHz.
The oscillator output is optionally available after
configuration. Any two of four resynchronized taps of a built-in
ripple divider are also available. These taps are at the fourth,
ninth, fourteenth and nineteenth bits of the divider. Therefore,
if the primary oscillator output is running at the nominal
8 MHz, the user has access to an 8 MHz clock, plus any two
of 500 kHz, 16 kHz, 490 Hz and 15 Hz.
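The tap frequencies follow from dividing the oscillator frequency by two for each divider bit. The short calculation below reproduces the rounded figures in the text; the function name is hypothetical, and the 4 MHz and 10 MHz cases simply apply the same division to the stated frequency range.

```python
TAP_BITS = (4, 9, 14, 19)   # fourth, ninth, fourteenth and nineteenth divider bits


def divider_tap_hz(osc_hz, bit):
    """Frequency at a ripple-divider tap: the oscillator divided by 2**bit."""
    return osc_hz / (1 << bit)


nominal = {bit: divider_tap_hz(8_000_000, bit) for bit in TAP_BITS}
slowest = {bit: divider_tap_hz(4_000_000, bit) for bit in TAP_BITS}
fastest = {bit: divider_tap_hz(10_000_000, bit) for bit in TAP_BITS}
```

At the nominal 8 MHz the taps come out at 500 kHz, 15.625 kHz, about 488 Hz and about 15.3 Hz. Because the oscillator itself can vary between 4 and 10 MHz, each tap varies over the same ratio, which is why these signals suit only approximate timing.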
If only an approximate clock frequency is desired, these signals
can be accessed by placing the OSC4 library element in a schematic
or in VHDL code. If the OSC4 symbol is not placed, the oscillator is
automatically disabled after configuration.
Development System
The powerful features of the XC4000E device families require an
equally powerful, yet easy-to-use set of development tools. Xilinx
provides an enhanced version of the Xilinx Automatic CAE Tools
(XACTstep) optimized for the XC4000E families.
As with other logic technologies, the basic methodology for
XC4000E FPGA design consists of three interrelated steps: design
entry, implementation, and verification. Popular generic tools
such as VIEWlogic Systems’ PROSeries are used for entry and
simulation, but architecture-specific tools are needed for
implementation.
All XC4000E development system software is integrated under the
Xilinx Design Manager (XDM), providing designers with a common
user interface regardless of their choice of entry and verification
tools. XDM simplifies the selection of command-line options with
pull-down menus and on-line help text. Application programs ranging
from schematic capture to Partitioning, Placement, and Routing (PPR)
can be accessed from XDM, while the program-command sequence is
generated and stored for documentation prior to execution. The XMake
command, a design compilation utility, automates the entire
implementation process, automatically retrieving the design’s
input files and performing all the steps needed to create
configuration and report files.
Several advanced features of the XACTstep system facilitate
XC4000E FPGA design. The MemGen utility, a memory compiler,
implements on-chip RAM within an XC4000E FPGA. Relationally Placed
Macros (RPMs), schematic-based macros with relative location
constraints to guide their placement within the FPGA, help ensure an
optimized implementation for common logic functions.
XACT-Performance, a feature of the Partition, Place, and Route (PPR)
implementation program, allows designers to enter their exact
performance requirements during design entry, at the schematic
level.
Design Entry
Designs can be entered graphically, using schematic-capture
software, or in any of several text-based formats. For example,
Boolean equations, state-machine descriptions, and high-level design
languages are supported.
Xilinx and third-party CAE vendors have developed library and
interface products compatible with a wide variety of design-entry
and simulation environments. A standard interface-file
specification, XNF (Xilinx Netlist File), is provided to simplify
file transfers into and out of the XACTstep development system.
Xilinx offers XACTstep development system interfaces to the
following design environments:
• VIEWlogic Systems (VIEWDraw, VIEWSim, VIEWSynthesis, PROSeries)
• Mentor Graphics V7 and V8 (NETED, Quicksim, Design Architect,
  Quicksim II, Exemplar)
• Cadence (Composer, Concept, Verilog)
• OrCAD (SDT, VST)
• Synopsys (Design Compiler, FPGA Compiler)
• Xilinx-ABEL
• X-BLOX
Many other environments are supported by third-party vendors.
Currently, more than 100 packages are supported.
The schematic library for the XC4000E FPGA reflects the wide
variety of logic functions that can be implemented in these
versatile devices. The library contains over 400 primitives and
macros, ranging from 2-input AND gates to 16-bit accumulators, and
including arithmetic functions, comparators, counters, data
registers, decoders, encoders, I/O functions, latches, Boolean
functions, RAM and ROM memory blocks, multiplexers, shift registers,
and barrel shifters.
Designing with macros is as easy as designing with standard
SSI/MSI functions. So-called “soft macros” contain detailed
descriptions of common logic functions, but do not contain any
partitioning or routing information. The performance of these
macros depends, therefore, on how the PPR software processes the
design. Relationally Placed Macros (RPMs), on the other hand, do
contain pre-determined partitioning and relative placement
information, resulting in an optimized implementation for these
functions. Users can create their own library elements (either soft
macros or RPMs) based on the macros and primitives of the standard
library.
X-BLOX is a graphics-based high-level description language
(HDL) that allows designers to use a schematic editor to enter
designs as a set of generic modules. The X-BLOX compiler optimizes
the modules for the target device architecture, automatically
choosing the appropriate architectural resources for each
function.
The XACTstep design environment supports hierarchical design
entry, with top-level drawings defining the major functional blocks,
and lower-level descriptions defining the logic in each block. The
implementation tools automatically combine the hierarchical elements
of a design. Different hierarchical elements can be specified with
different design entry tools, allowing the use of the most
convenient entry method for each portion of the design.
Design Implementation
The design implementation tools satisfy the requirement for an
automated design process.
Logic partitioning, block placement and signal routing,
encompassing the design implementation process, are performed by
the Partition, Place, and Route program (PPR). The partitioner
takes the logic from the entered design and maps the logic into the
architectural resources of the FPGA (such as the logic blocks, I/O
blocks, 3-state buffers, and edge decoders). The placer then
determines the best locations for the blocks, depending on their
connectivity and the required performance. The router finally
connects the placed blocks together.
The PPR program includes XACT-Performance, a feature that allows
designers to specify the timing requirements along entire paths
during design entry. Timing path analysis routines in PPR then
recognize and accommodate the user-specified requirements. Timing
requirements can be entered on the schematic in a form directly
relating to the system requirements. For example, the targeted
minimum clock frequency or the maximum allowable delay on the data
path between two registers can be specified. So, while the timing of
each individual net is not predictable, it does not need to be. The
overall performance of the system along entire signal paths is
automatically tailored to match user-generated specifications.
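The two ways of stating a requirement mentioned above (a minimum
clock frequency, or a maximum register-to-register delay) are related
by simple arithmetic. The following Python sketch is only an
illustration of that relationship; the function names are
hypothetical and are not part of XACT-Performance or PPR.

```python
def max_path_delay_ns(min_clock_mhz: float) -> float:
    """Clock period in ns for a target minimum clock frequency in MHz.

    Assumption: the whole period is available as the register-to-register
    budget, so the budget must also cover clock-to-output and setup time.
    """
    if min_clock_mhz <= 0:
        raise ValueError("clock frequency must be positive")
    return 1000.0 / min_clock_mhz


def meets_requirement(path_delay_ns: float, min_clock_mhz: float) -> bool:
    """True if a path delay fits inside the period implied by the target."""
    return path_delay_ns <= max_path_delay_ns(min_clock_mhz)


if __name__ == "__main__":
    # Hypothetical example: a 50 MHz target leaves 20 ns per path.
    print(max_path_delay_ns(50.0))
    print(meets_requirement(18.5, 50.0))
```

Either form carries the same information, which is why a requirement
can be entered in whichever form matches the system specification.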
The PPR algorithms result in the fully automatic implementation
of most designs. However, for demanding applications, the user may
exercise various degrees of control over the automated
implementation process. The implementation of highly-structured
designs can greatly benefit from the basic floorplanning techniques
familiar to designers of large gate arrays. User-designated
partitioning, placement, and routing information can be specified as
part of the design entry process. Alternatively, the
XACT-Floorplanner is proving to be an excellent tool for achieving
maximum density and performance for difficult designs.
The automated implementation tools are complemented by the XACT
Design Editor (XDE), an interactive graphics-based editor that
displays a model of the actual logic and routing resources of the
FPGA. XDE can be used to directly view the results achieved by the
automated tools. Modifications can be made using XDE. XDE also
performs checks for logic connectivity and possible design-rule
violations.
Design Verification
The high development cost associated with common mask-programmed
gate arrays necessitates extensive simulation to verify a design.
Due to the custom nature of masked gate arrays, mistakes or
last-minute design changes cannot be tolerated. A gate-array
designer must simulate and test all logic and timing using
simulation software. Simulation describes what happens in a system
under worst-case situations. However, simulation is tedious and
slow, and simulation vectors must be generated. A few seconds of
system time can take weeks to simulate.
Programmable gate array users, however, can use in-circuit
debugging techniques in addition to simulation. Because Xilinx
devices are reprogrammable, designs can be verified in the system in
real time without the need for extensive simulation vectors.
The XACTstep development system supports both simulation and
in-circuit debugging techniques. For simulation, the system extracts
the post-layout timing information from the design database. This
data can then be sent to the simulator to verify timing-critical
portions of the design. Back-annotation, the process of mapping the
timing information back into the signal names and symbols of the
schematic, eases the debugging effort.
For in-circuit debugging, XACTstep includes a serial download
and readback cable called XChecker. XChecker connects the device
in the system to the PC or workstation through an RS232 serial port.
The engineer can download a design or a design revision into the
system for testing. The designer can also single-step the logic,
read the contents of the numerous flip-flops on the device and
observe internal logic levels. Simple modifications can be
downloaded into the system in a matter of minutes.
The XACTstep system also includes XDelay, a static timing
analyzer. XDelay examines a design’s logic and timing to
calculate the performance along signal paths, identify possible
race conditions, and detect setup and hold-time violations. Timing
analyzers do not require that the user generate input stimulus
patterns or test vectors.
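To illustrate why a static timing analyzer needs no stimulus
patterns, the sketch below finds the slowest path between two
registers purely from a table of delays, then checks it against a
clock period and setup time. It is a generic illustration in Python,
not a description of how XDelay works internally; the graph format,
node names and delay values are all hypothetical.

```python
from functools import lru_cache


def longest_path_ns(graph, source, sink):
    """Worst-case delay from source to sink in an acyclic delay graph.

    graph maps a node name to a list of (successor, delay_ns) pairs.
    Returns float("-inf") if no path exists.
    """
    @lru_cache(maxsize=None)
    def worst(node):
        if node == sink:
            return 0.0
        best = float("-inf")
        for successor, delay_ns in graph.get(node, []):
            best = max(best, delay_ns + worst(successor))
        return best

    return worst(source)


def setup_slack_ns(graph, source, sink, period_ns, setup_ns):
    """Positive slack means the setup requirement is met."""
    return period_ns - longest_path_ns(graph, source, sink) - setup_ns


if __name__ == "__main__":
    # Hypothetical netlist fragment: two registers and two logic blocks.
    delays = {
        "ff1": [("lut_a", 3.0), ("lut_b", 2.0)],
        "lut_a": [("ff2", 4.0)],
        "lut_b": [("lut_a", 1.5), ("ff2", 7.0)],
    }
    print(longest_path_ns(delays, "ff1", "ff2"))
    print(setup_slack_ns(delays, "ff1", "ff2", period_ns=20.0, setup_ns=2.0))
```

Every path is covered by construction, whereas simulation only
exercises the paths that the supplied vectors happen to activate.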
Boundary Scan
The ‘bed of nails’ has been the traditional method of testing
electronic assemblies. This approach has become less appropriate,
due to closer pin spacing and more sophisticated assembly methods
like surface-mount technology and multi-layer boards. The IEEE
Boundary Scan standard 1149.1 was developed to facilitate
board-level testing of electronic assemblies. Design and test
engineers can embed a standard test logic structure in their device
to achieve high fault coverage for I/O and internal logic. This
structure is easily implemented with a four-pin interface on any
Boundary Scan-compatible IC. IEEE 1149.1-compatible devices may be
serial daisy-chained together, connected in parallel, or a
combination of the two.
The XC4000E family implements IEEE 1149.1-compatible BYPASS,
PRELOAD/SAMPLE and EXTEST Boundary-Scan instructions. When the
Boundary-Scan configuration option is selected, three normal user
I/O pins become dedicated inputs for these functions. Another user
output pin becomes the dedicated boundary scan output. The details
of how to enable this circuitry are covered later in this
section.
By exercising these input signals, the user can serially load
commands and data into these devices to control the driving of
their outputs and to examine their inputs. This method is an
improvement over bed-of-nails testing. It avoids the need to
over-drive device outputs, and it reduces the user interface to four
pins. An optional fifth pin, a reset for the control logic, is
described in the standard but is not implemented in the Xilinx
part.
The dedicated on-chip logic implementing the IEEE 1149.1
functions includes a 16-state state machine, an instruction
register and a number of data registers. The functional details can
be found in the IEEE 1149.1 specification and are also discussed in
Xilinx document XAPP 017: "Boundary Scan in XC4000 Devices."
Figure 20 shows a simplified block diagram of the XC4000E
Input/Output Block with boundary scan implemented.
Figure 21 is a diagram of the XC4000E boundary scan logic. It
includes three bits of Data Register per IOB, the IEEE 1149.1 Test
Access Port controller, and the Instruction Register with
decodes.
It is also possible to configure the XC4000E through the boundary
scan logic. See Configuration Through the Boundary Scan Pins on page
35.
Data Registers
The primary data register is the boundary-scan register. For each
IOB pin in the FPGA, it includes three bits for In, Out and 3-State
Control. Non-IOB pins have appropriate partial bit population for In
or Out only. PROGRAM, CCLK and DONE are not included in the boundary
scan register. Each EXTEST CAPTURE-DR state captures all In, Out,
and 3-state pins.
The data register also includes the following non-pin bits:
TDO.T, and TDO.I, which are always bits 0 and 1 of the data
register, respectively, and BSCANT.UPD, which is always the last bit
of the data register. These three boundary scan bits are
special-purpose Xilinx test signals.
The other standard data register is the single flip-flop BYPASS
register. It synchronizes data being passed through the FPGA to the
next downstream boundary-scan device.
The FPGA provides two additional data registers that can be
specified using the BSCAN macro. The FPGA provides two user pins
(BSCAN.SEL1 and BSCAN.SEL2) which are the decodes of two user
instructions. For these instructions, two corresponding pins
(BSCAN.TDO1 and BSCAN.TDO2) allow user scan data to be shifted out
on TDO. The data register clock (BSCAN.DRCK) is available for
control of test logic which the user may wish to implement with
CLBs. The NAND of TCK and RUN-TEST-IDLE is also provided
(BSCAN.IDLE).
Figure 20: Block Diagram of XC4000E IOB with Boundary Scan (some
details not shown)
[Figure X5792 not reproduced. Labeled signals: Input Data 1 (I1),
Input Data 2 (I2), Input Clock (IK), Output Data (O), Output Clock
(OK), Clock Enable, 3-State (TS), PAD, with boundary scan capture and
update points on I, O, Q and TS.]
Figure 21: XC4000E Boundary Scan Logic
[Figure X1523 not reproduced. Labeled elements: TDI, TDO, the chain
of IOBs, BYPASS REGISTER, INSTRUCTION REGISTER, and per-IOB detail
showing DATA IN, DATA OUT, IOB.I, IOB.O, IOB.Q, IOB.T,
SHIFT/CAPTURE, CLOCK DATA REGISTER, UPDATE and EXTEST.]
Instruction Set
The XC4000E boundary scan instruction set also includes
instructions to configure the device and read back the
configuration data. The instruction set is coded as shown in Table
12.
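The 3-bit coding in Table 12 can also be written as a lookup table,
which is convenient in test software. The Python sketch below simply
transcribes the table; the names BSCAN_INSTRUCTIONS and
decode_instruction are hypothetical and do not come from any Xilinx
tool.

```python
# Instruction register codes (I2 I1 I0) transcribed from Table 12.
# The dictionary and function names are illustrative only.
BSCAN_INSTRUCTIONS = {
    0b000: "EXTEST",
    0b001: "SAMPLE/PRELOAD",
    0b010: "USER 1",
    0b011: "USER 2",
    0b100: "READBACK",
    0b101: "CONFIGURE",
    0b110: "Reserved",
    0b111: "BYPASS",
}


def decode_instruction(i2: int, i1: int, i0: int) -> str:
    """Return the instruction name for the three instruction bits."""
    for bit in (i2, i1, i0):
        if bit not in (0, 1):
            raise ValueError("instruction bits must be 0 or 1")
    return BSCAN_INSTRUCTIONS[(i2 << 2) | (i1 << 1) | i0]


if __name__ == "__main__":
    print(decode_instruction(1, 1, 1))
    print(decode_instruction(0, 0, 1))
```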
Table 12: Boundary Scan Instructions
Instruction
I2 I1 I0   Test Selected    TDO Source       I/O Data Source
0  0  0    EXTEST           DR               DR
0  0  1    SAMPLE/PRELOAD   DR               Pin/Logic
0  1  0    USER 1           BSCAN.TDO1       User Logic
0  1  1    USER 2           BSCAN.TDO2       User Logic
1  0  0    READBACK         Readback Data    Pin/Logic
1  0  1    CONFIGURE        DOUT             Disabled
1  1  0    Reserved         -                -
1  1  1    BYPASS           Bypass Register  -
Bit Sequence
The bit sequence within each IOB is: In, Out, 3-State. From a
cavity-up view of the chip (as shown in XDE), starting in the upper
right chip corner, the Boundary-Scan data-register bits are
ordered as shown in Figure 22.
BSDL (Boundary Scan Description Language) files for the XC4000E
devices are available on the Xilinx BBS.
Including Boundary Scan in a Schematic
If boundary scan is only to be used during configuration, no
special schematic elements need be included in the schematic or
VHDL code. In this case, the special boundary scan pins TDI, TMS,
TCK and TDO can be used for user functions after configuration.
To indicate that boundary scan remains enabled after
configuration, place the BSCAN library symbol and connect the TDI,
TMS, TCK and TDO pad symbols to the appropriate pins.
Even if the boundary scan symbol is used in a schematic, the
input pins TMS, TCK, and TDI can still be used as inputs to be
routed to internal logic. Care must be taken not to force the chip
into an undesired boundary scan state by inadvertently applying
boundary scan input patterns to these pins. The simplest way to
prevent this is to keep TMS High, and then apply whatever signal is
desired to TDI and TCK.
Avoiding Inadvertent Boundary Scan Activation
If TMS or TCK is used as user I/O, care must be taken to ensure
that at least one of these pins is held constant during
configuration. In some applications, a situation may occur where TMS
or TCK is driven during configuration. This may cause the device to
go into boundary scan mode and disrupt the configuration
process.
To prevent activation of boundary scan during configuration,
you can do either of the following:
• TMS: Tie it High to put the device in a benign RESET state.
• TCK: Tie it High or Low; do not toggle this clock input.
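Holding TMS High is benign because of how the IEEE 1149.1 TAP
controller is defined: from any of its 16 states, five TCK rising
edges with TMS High lead to Test-Logic-Reset, and the controller
stays there while TMS remains High. The Python sketch below models
the state diagram from the standard to show this. It is a model of
the standard, not of the Xilinx silicon, and the identifiers are
illustrative.

```python
# IEEE 1149.1 TAP controller: state -> (next if TMS=0, next if TMS=1).
TAP_TRANSITIONS = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan": ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR": ("Shift-DR", "Exit1-DR"),
    "Shift-DR": ("Shift-DR", "Exit1-DR"),
    "Exit1-DR": ("Pause-DR", "Update-DR"),
    "Pause-DR": ("Pause-DR", "Exit2-DR"),
    "Exit2-DR": ("Shift-DR", "Update-DR"),
    "Update-DR": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan": ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR": ("Shift-IR", "Exit1-IR"),
    "Shift-IR": ("Shift-IR", "Exit1-IR"),
    "Exit1-IR": ("Pause-IR", "Update-IR"),
    "Pause-IR": ("Pause-IR", "Exit2-IR"),
    "Exit2-IR": ("Shift-IR", "Update-IR"),
    "Update-IR": ("Run-Test/Idle", "Select-DR-Scan"),
}


def step(state: str, tms: int) -> str:
    """Advance the TAP controller by one TCK rising edge."""
    return TAP_TRANSITIONS[state][1 if tms else 0]


def run(state: str, tms_sequence) -> str:
    """Apply a sequence of TMS values, one per TCK rising edge."""
    for tms in tms_sequence:
        state = step(state, tms)
    return state


if __name__ == "__main__":
    # With TMS tied High, every starting state ends in Test-Logic-Reset.
    for start in TAP_TRANSITIONS:
        assert run(start, [1] * 5) == "Test-Logic-Reset"
    print("TMS High for 5 clocks resets the TAP from all 16 states")
```

Holding TCK constant has the same protective effect for a different
reason: the controller only changes state on TCK edges.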
For more information regarding Boundary Scan, refer to XAPP
017.001, “Boundary Scan in XC4000E Devices.”
Figure 22: Boundary Scan Bit Sequence
[Figure X6075, listed in register order]
Bit 0 (TDO end)   TDO.T
Bit 1             TDO.O
Bit 2             Top-edge IOBs (Right to Left)
                  Left-edge IOBs (Top to Bottom)
                  MD1.T
                  MD1.O
                  MD1.I
                  MD0.I
                  MD2.I
                  Bottom-edge IOBs (Left to Right)
                  Right-edge IOBs (Bottom to Top)
(TDI end)         BSCANT.UPD
Configuration
Configuration is the process of loading design-specific
programming data into one or more FPGAs to define the functional
operation of the internal blocks and their interconnections. This is
somewhat like loading the command registers of a programmable
peripheral chip. The XC4000E family uses about 350 bits of
configuration data per CLB and its associated interconnects. Each
configuration bit defines the state of a static memory cell that
controls either a function look-up table bit, a multiplexer input,
or an interconnect pass transistor. The XACTstep development
system translates the design into a netlist file. It automatically
partitions, places and routes the logic and generates the
configuration data in PROM format.
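The figure of about 350 bits per CLB gives a quick order-of-magnitude
estimate of bitstream size and serial load time. The Python sketch
below is a rough estimate only: it ignores any header, framing or
I/O-related bits, the CLB count and clock rate in the example are
hypothetical rather than taken from a specific device, and the
function names are illustrative.

```python
BITS_PER_CLB = 350  # approximate figure quoted in the text


def estimate_config_bits(num_clbs: int) -> int:
    """Approximate configuration bits for a device with num_clbs CLBs."""
    if num_clbs < 0:
        raise ValueError("CLB count cannot be negative")
    return num_clbs * BITS_PER_CLB


def estimate_serial_load_ms(num_clbs: int, cclk_hz: float) -> float:
    """Approximate load time in ms, assuming one bit per CCLK cycle."""
    if cclk_hz <= 0:
        raise ValueError("CCLK frequency must be positive")
    return estimate_config_bits(num_clbs) * 1000.0 / cclk_hz


if __name__ == "__main__":
    # Hypothetical device with 100 CLBs, loaded serially at 1 MHz.
    print(estimate_config_bits(100))
    print(estimate_serial_load_ms(100, 1_000_000))
```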
Special Purpose Pins
Three configuration mode pins (M2, M1, M0) are sampled prior to
configuration to determine the configuration mode. After
configuration, these pins can be used as auxiliary connections. M2
and M0 can be used as inputs, and M1 can be used as an output. The
XACTstep development system does not use these resources unless they
are explicitly specified in the design entry. This is done by
placing a special pad symbol called MD2, MD1, or MD0 instead of the
input or output pad symbol.
In the XC4000E, the mode pins have weak pull-up resistors during
configuration. With all three mode pins High, Slave Serial mode is
selected, which is the most popular configuration mode. Therefore,
for the most common configuration mode, the mode pins can be left
unconnected. (Note, however, that the internal pull-up resistor
value can be as high as 100 kΩ.) After configuration, these pins can
individually have weak pull-up or pull-down resistors, as specified
in the design entry.
These dedicated nets are located in the lower left chip corner
and are near the readback nets. This location allows convenient
routing if compatibility with the XC2000 and XC3000 family
conventions of M0/RT, M1/RD is desired.
Configuration Modes
The XC4000E family has six configuration modes, selected by a
3-bit input code applied to the M2, M1, and M0 inputs. There are
three self-loading Master modes, two Peripheral modes, and the
Slave Serial mode, which is used primarily for daisy-chained
devices. The coding for mode selection is shown in Table 13.
A detailed description of each configuration mode is included
later in this data sheet. During configuration, some of the I/O pins
are used temporarily for the configuration process. All pins used
during configuration are shown in the “Pin Functions During
Configuration” table later in this data sheet.
Table 13: Configuration Modes
Mode                   M2 M1 M0  CCLK    Data
Master Serial          0  0  0   output  Bit-Serial
Slave Serial           1  1  1   input   Bit-Serial
Master Parallel Up     1  0  0   output  Byte-Wide, increment from 00000
Master Parallel Down   1  1  0   output  Byte-Wide, decrement from 3FFFF
Peripheral Synch.*     0  1  1   input   Byte-Wide
Peripheral Asynch.     1  0  1   output  Byte-Wide
Reserved               0  1  0   -       -
Reserved               0  0  1   -       -
*Peripheral Synchronous can be considered Slave Parallel
Master Modes
The three Master modes use an internal oscillator to generate a
Configuration Clock (CCLK) for driving potential slave devices. They
also generate address and timing for external PROM(s) containing
the configuration data.
Master Parallel (Up or Down) modes generate the CCLK signal and
PROM addresses and receive byte parallel data. The data is
internally serialized into the FPGA data-frame format. The up and
down selection generates starting addresses at either zero or
3FFFF, for compatibility with different microprocessor addressing
conventions. The Master Serial mode generates CCLK and receives the
configuration data in serial form from a Xilinx
serial-configuration PROM.
Peripheral Modes
The two Peripheral modes accept byte-wide data from a bus. A
READY/BUSY status is available as a handshake signal. In the
asynchronous mode, the internal oscillator generates a CCLK burst
signal that serializes the byte-wide data. In the synchronous mode,
an externally supplied clock input to CCLK serializes the data.
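The PROM address sequences of the Master Parallel Up and Down modes
(incrementing from 00000 or decrementing from 3FFFF) can be sketched
as a simple generator. This Python example only illustrates the
counting direction. Stopping at the end of the 18-bit range instead
of wrapping is an assumption of the sketch, and the function name is
hypothetical.

```python
from itertools import islice

ADDRESS_MAX = 0x3FFFF  # highest starting address named in the text


def master_parallel_addresses(direction: str):
    """Yield PROM byte addresses for 'up' or 'down' Master Parallel mode."""
    if direction == "up":
        address, step = 0x00000, 1
    elif direction == "down":
        address, step = ADDRESS_MAX, -1
    else:
        raise ValueError("direction must be 'up' or 'down'")
    # Assumption: the sequence ends at the range limit; no wraparound.
    while 0 <= address <= ADDRESS_MAX:
        yield address
        address += step


if __name__ == "__main__":
    for mode in ("up", "down"):
        first = islice(master_parallel_addresses(mode), 3)
        print(mode, [format(a, "05X") for a in first])
```

Counting down from the top lets the configuration data share a memory
with a processor that boots from low addresses, and counting up suits
the opposite convention.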
Slave Serial Mode
In Slave Serial mode, the FPGA receives serial configuration
data on the rising edge of CCLK and, after loading its
configuration, passes additional data out, resynchronized on the
next falling edge of CCLK.
Multiple slave devices with identical configurations can be wired
with parallel DIN inputs. In this way, multiple devices can be
configured simultaneously.
Multiple devices with different configurations can be connected
together in a “daisy chain,” DOUT to DIN, and a single combined
bitstream used to configure the chain of slave devices. See the
Daisy Chained Devices section for further information on this
configuration option.
Setting CCLK Frequency
CCLK can be generated in either of two frequencies. In the
default slow mode, the frequency ranges from 0.5 MHz to 1.25 MHz.
In fast CCLK mode, the frequency ranges from 4 MHz to 10 MHz. The
frequency is sel