Xilinx Confidential – Internal V6 GTX Gu Yongguo
Mar 26, 2015
Xilinx Confidential – Internal • Unpublished Work © Copyright 2009 XilinxPage 2
Agenda
Transceiver Overview
– Transceiver Roadmap
– Virtex-6 GTX Table
Virtex-6 GTX Overview
– Die Allocation
– PLL
– Clock resources
Virtex-6 GTX Architecture
– Transmitter
– Receiver
– DRP
Virtex-6 GTX PCB
Transceiver Overview: Next-Generation Serial Connectivity Roadmap
[Figure: transceiver families plotted against I/O speed. Speed markers in the figure: 150 Mbps, 614 Mbps, 2.488 Gbps, 3.125 Gbps, 5 Gbps, 6.5 Gbps, 9.95 Gbps, 10 Gbps, 11.2 Gbps.]
GTP: Lowest Cost, Low Power, PCI-Express PHY, PCI Express IP, Easy to Use
GTX: Advanced Rx EQ, Low Latency, Low Power, PCI-Express PHY, PCI Express IP, Easy to Use
GTH: Highest Serial BW, Advanced RX EQ
Virtex-6 LXT & SXT
16 of 18 device-package combinations have transceivers
Agenda
Transceiver Overview
– Transceiver Roadmap
– Virtex-6 GTX Table
Virtex-6 GTX Overview
– Die Allocation
– Reference Clock
– PLL
Virtex-6 GTX Architecture
– Transmitter
– Receiver
– DRP
Virtex-6 GTX PCB
GTX Allocation
Devices: LX130T, LX195T, LX240T, LX365T, SX315T, SX475T
[Figure: die floorplan. Labels recovered from the figure: GTXE column with Quads MGT_Q112 to MGT_Q116 and GTX sites X0Y0 to X0Y19; region labels X0Y0 to X0Y4 and X1Y0 to X1Y4; I/O banks 12 to 16, 22 to 26 and 32 to 36; I/O column labels IOCL, IOCR and IOOL; MMCM0 to MMCM9; two groups of 16 BUFGs; 8 BUFR / 4 BUFIO repeated per row.]
Package pins shown in the figure (one line per group of four transceivers and their two reference clock pairs):
RX B5/B6, TX A3/A4; RX D5/D6, TX B1/B2; RX E3/E4, TX C3/C4; RX G3/G4, TX D1/D2; REF1 F6/F5, REF0 H6/H5
RX J3/J4, TX F1/F2; RX K6/K5, TX H1/H2; RX L3/L4, TX K1/K2; RX N3/N4, TX M1/M2; REF1 M6/M5, REF0 P6/P5
RX R3/R4, TX P1/P2; RX U3/U4, TX T1/T2; RX W3/W4, TX V1/V2; RX AA3/AA4, TX Y1/Y2; REF1 T6/T5, REF0 V6/V5
RX AC3/AC4, TX AB1/AB2; RX AE3/AE4, TX AD1/AD2; RX AF5/AF6, TX AF1/AF2; RX AG3/AG4, TX AH1/AH2; REF1 AB6/AB5, REF0 AD6/AD5
RX AJ3/AJ4, TX AK1/AK2; RX AL3/AL4, TX AM1/AM2; RX AM5/AM6, TX AN3/AN4; RX AP5/AP6, TX AP1/AP2; REF1 AH6/AH5, REF0 AK6/AK5
Transceiver Overview: Virtex-6 Transceivers - GTX
Available in Virtex-6 LXT, SXT and HXT
Range: 480 Mbps – 6.5 Gbps
Compliant to major protocol standards
– Gigabit Ethernet, PCI Express Gen1 & Gen2, OC-48, XAUI, HD-SDI, OBSAI, CPRI, SRIO, FC-1/2/4, Interlaken, CEI-6
OOB signaling for PCI Express
Built-in Linear EQ, DFE and Tx Pre-emphasis
Highly flexible clocking
– Independent PLLs for TX and RX
Power dissipation: < 150 mW typ
Reference Clock
Easier than it looks
– Intelligent Pin Selection
– Connect IBUFDS_E1 to MGTREFCLKTX/RX[0]
Wizard will sort this out for you!
– The wizard selections will make the correct connections
– Includes north and south bound routes
Advanced Users:
– Can use MUX connections to specify specific clock routes
– Complex view available to assist in Clock Switching applications
Reference clock sources:
– 2 Refclks (RefClk0 and RefClk1) from pins (like Virtex-5)
– 2 Refclks cascaded from the North Quad
– 2 Refclks cascaded from the South Quad
– PERFCLK and GREFCLK from fabric
GTX Reference Clock Conceptual View

GTX Transceiver Detailed Diagram

Clock Generation Comparison:
Virtex 5 GTP/GTX Clocking:
Virtex 6 Clocking:

Reference clock connection: Single GTX w/ Single Refclk

Reference clock connection: Multiple GTXs w/ Multiple Refclks

Reference clock connection: Single Clock Sharing
MGTRefClk comes from local pins
– MGTRefclk directly from an external pin via IBUFDS
Note
– Each external RefClk can feed up to 3 Quads (12 transceivers): Quad (n+1), Quad (n) and Quad (n-1)
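The sharing rule on this slide (one external RefClk reaches its own Quad plus the Quads directly north and south) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes that neighbouring Quads carry consecutive numbers (for example 113, 114, 115) and that each Quad holds four transceivers.

```python
def quads_reachable(source_quad, available_quads):
    """Quads that a RefClk entering at source_quad can feed.

    Assumption: adjacent Quads have consecutive numbers, so the
    candidates are Quad (n-1), Quad (n) and Quad (n+1).
    """
    candidates = (source_quad - 1, source_quad, source_quad + 1)
    return [q for q in candidates if q in available_quads]


def transceivers_reachable(source_quad, available_quads, gtx_per_quad=4):
    """Number of GTX transceivers the RefClk can feed (4 per Quad assumed)."""
    return gtx_per_quad * len(quads_reachable(source_quad, available_quads))


if __name__ == "__main__":
    quads = range(112, 117)  # MGT_Q112 .. MGT_Q116
    print(quads_reachable(114, quads))         # a Quad in the middle of the column
    print(transceivers_reachable(114, quads))
    print(quads_reachable(112, quads))         # bottom Quad has no southern neighbour
```

A Quad at the end of the column has only one neighbour, so a clock entering there reaches fewer than 12 transceivers.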
PLL Architecture

PLL Selection: Typical Case
Upstream and downstream are same rate
– XAUI
– PCIe
– Aurora
– Most other protocols…
Power down TX PLL = Power Savings
PLL Selection: Fancy Case
Upstream and downstream are different rates!
– HD-SDI
– Transponder w/ FEC and w/o FEC rates
– Additional Flexibility
GTX
MGTRefClk from local pins
This output used by both TX and RX
Remember…
Recall
– MGTRefClk is local, so we select MGTREFCLKRX[0]
– For Aurora, we use the same RefClk for TX and RX directions
• TX PLL is powered down to save power
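The typical and fancy PLL cases reduce to one decision: if TX and RX run at the same rate from the same RefClk, the RX PLL output serves both directions and the TX PLL can be powered down. A minimal Python sketch of that decision follows; the function name and the dictionary keys are hypothetical, and the rates in the example are illustrative only.

```python
def select_pll_plan(tx_rate_gbps, rx_rate_gbps, same_refclk=True):
    """Decide whether the TX PLL can be powered down.

    Typical case: same rate and same RefClk in both directions, so the
    RX PLL output is used by TX and RX and the TX PLL is powered down.
    Fancy case: different rates, so both PLLs stay on.
    """
    if same_refclk and tx_rate_gbps == rx_rate_gbps:
        return {"tx_clock_source": "RX PLL", "tx_pll_powered_down": True}
    return {"tx_clock_source": "TX PLL", "tx_pll_powered_down": False}


if __name__ == "__main__":
    print(select_pll_plan(3.125, 3.125))  # symmetric link, e.g. one XAUI lane
    print(select_pll_plan(1.485, 2.97))   # asymmetric rates, both PLLs needed
```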
Overview
Transmitter Diagram
Data Width

TXUSRCLK/TXUSRCLK2

TXUSRCLK INTERNAL GENERATION
TXUSRCLK tied to GND
TXUSRCLK is derived from TXUSRCLK2
TXUSRCLK is not faster than TXUSRCLK2
– Internal divider only
Clock scheme example: 2-byte interface
TXUSRCLK is generated internally
– 1 BUFG is saved
Internal and external data widths are equal

Clock scheme example: 4-byte interface
WINT = WEXT
FTXUSRCLK2 = FTXUSRCLK / 2
TXUSRCLK is generated externally by MMCM
Clock scheme example: 1-byte interface
FTXUSRCLK2 = FTXUSRCLK * 2
TXUSRCLK is generated internally
– 1 BUFG is saved

Clock scheme example: multiple lanes with 2-byte interface
Clocking is the same as in the single-lane 2-byte application
– But the clocks are shared among the lanes
The other interface widths are handled similarly
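The three clock scheme examples can be summarised numerically. The Python sketch below is an illustration under stated assumptions: TXUSRCLK is taken as the line rate divided by the internal datapath width, and that width is assumed to be 2 bytes (20 bits on the wire with 8B/10B, 16 bits otherwise). The TXUSRCLK2 ratios for the 1-, 2- and 4-byte interfaces come from the slides; the function name is hypothetical.

```python
def tx_user_clocks(line_rate_mbps, interface_bytes, use_8b10b=True):
    """Return (F_TXUSRCLK, F_TXUSRCLK2, internally_generated) in MHz.

    Assumption: internal datapath is 2 bytes wide, i.e. 20 bits per
    TXUSRCLK cycle with 8B/10B and 16 bits without.
    """
    internal_bits = 20 if use_8b10b else 16
    f_txusrclk = line_rate_mbps / internal_bits

    # Ratios from the slides: 1 byte -> x2, 2 bytes -> x1, 4 bytes -> /2.
    ratio = {1: 2.0, 2: 1.0, 4: 0.5}[interface_bytes]
    f_txusrclk2 = f_txusrclk * ratio

    # The internal divider can only produce a TXUSRCLK that is not
    # faster than TXUSRCLK2; otherwise an external source (MMCM) is needed.
    internally_generated = f_txusrclk <= f_txusrclk2
    return f_txusrclk, f_txusrclk2, internally_generated


if __name__ == "__main__":
    for width in (1, 2, 4):
        print(width, tx_user_clocks(3125, width))
```

For the 4-byte interface the result flags that TXUSRCLK cannot come from the internal divider, which matches the slide's use of an MMCM.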
Transmitter Resets

Transmitter Reset Coverage

Reset Recommendation

TXBUFFER
Removes the phase difference between TXUSRCLK and XCLK
Can be bypassed for low-latency applications
– Advanced and somewhat complex
Buffer Bypass Steps
Set the following attributes:
– Set TXOUTCLK_CTRL to use either TXPLLREFCLK_DIV2 or TXPLLREFCLK_DIV1
– Set TX_XCLK_SEL to TXUSR
– Set TX_BUFFER_USE to FALSE
– Set TX_PMADATA_OPT to TRUE
After power-on, make sure TXPMASETPHASE and TXENPMAPHASEALIGN are driven Low.
Make sure that the input port TXDLYALIGNDISABLE is driven High.
Apply GTXTXRESET and wait for TXRESETDONE to go High.
Wait for all clocks to stabilize, then assert TXDLYALIGNRESET for at least 16 TXUSRCLK2 clock cycles.
Drive TXENPMAPHASEALIGN High.
– Keep TXENPMAPHASEALIGN High unless the phase-alignment procedure must be repeated.
– Driving TXENPMAPHASEALIGN Low causes phase alignment to be lost.
Wait 32 TXUSRCLK2 clock cycles and then drive TXPMASETPHASE High.
Wait the number of required TXUSRCLK2 clock cycles as specified in Table 3-20, and then drive TXPMASETPHASE Low. The phase of the PMACLK is now aligned with TXUSRCLK.
Drive TXDLYALIGNDISABLE Low.
– Optional: Keep TXDLYALIGNDISABLE High to disable the TX delay aligner.
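The port-level part of the procedure above can be written down as an ordered step list, which is handy as a checklist for a testbench or a control state machine. The Python sketch below is only a model: the tuple format, the action names and the CLOCKS_STABLE pseudo-signal are hypothetical, and the TXPMASETPHASE hold time is left as a parameter because the values of Table 3-20 are not reproduced on the slide.

```python
def tx_buffer_bypass_sequence(set_phase_cycles):
    """Return the TX phase-alignment steps as (action, target, value) tuples.

    set_phase_cycles: TXUSRCLK2 cycles to hold TXPMASETPHASE High,
    to be taken from Table 3-20 (not filled in here).
    """
    return [
        ("drive", "TXPMASETPHASE", 0),
        ("drive", "TXENPMAPHASEALIGN", 0),
        ("drive", "TXDLYALIGNDISABLE", 1),
        ("pulse", "GTXTXRESET", 1),
        ("wait_for", "TXRESETDONE", 1),
        ("wait_for", "CLOCKS_STABLE", 1),      # hypothetical status flag
        ("drive", "TXDLYALIGNRESET", 1),
        ("wait_cycles", "TXUSRCLK2", 16),      # at least 16 cycles
        ("drive", "TXDLYALIGNRESET", 0),
        ("drive", "TXENPMAPHASEALIGN", 1),     # must stay High afterwards
        ("wait_cycles", "TXUSRCLK2", 32),
        ("drive", "TXPMASETPHASE", 1),
        ("wait_cycles", "TXUSRCLK2", set_phase_cycles),
        ("drive", "TXPMASETPHASE", 0),
        ("drive", "TXDLYALIGNDISABLE", 0),     # optional: keep High instead
    ]


if __name__ == "__main__":
    # 100 is a placeholder, not a value from Table 3-20.
    for step in tx_buffer_bypass_sequence(set_phase_cycles=100):
        print(step)
```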
Phase Alignment after GTXTXRESET

Phase Re-alignment conditions
GTXTXRESET is asserted
TXPLLPOWERDOWN is deasserted
The clocking source changed
The line rate of the GTX TX transceiver changed
Phase Alignment after Line Rate Change

TX Parallel Clock Divider

TX Driver

TXDIFFCTRL

TXPOSTEMPHASIS
Controls the post-cursor pre-emphasis

TXPREEMPHASIS
Controls the pre-cursor pre-emphasis

Receiver Diagram

RX AFE

RX Linear Equalization

RXEQMIX Setting
Determine the operating data rate.
Determine the channel loss (board) in dB at data rate/2.
– This is the differential insertion loss from measured or extracted S-parameter data, commonly referred to as Sdd21.
Pick the appropriate RXEQMIX setting from the relative gain plot.
– Always make sure that the transmit amplitude is sufficient when picking modes with a higher gain because there is DC attenuation of the signal. Reference the absolute gain plot.
Based on these results, the appropriate setting of RXEQMIX can be picked.
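As a rough illustration of the selection steps, the Python sketch below computes the frequency at which the channel loss is read (data rate/2) and then picks the setting whose relative gain is closest to that loss. The closest-match rule, the function names and the setting labels are assumptions made for this example; the gain numbers are placeholders and must be replaced by values read from the relative gain plot.

```python
def loss_frequency_ghz(data_rate_gbps):
    """Frequency at which the channel loss (Sdd21) is evaluated: data rate / 2."""
    return data_rate_gbps / 2.0


def pick_rxeqmix(channel_loss_db, relative_gain_db):
    """Pick the setting whose relative gain best offsets the channel loss.

    relative_gain_db maps a setting label to its relative gain in dB at
    the loss frequency, as read from the relative gain plot.
    """
    return min(relative_gain_db,
               key=lambda setting: abs(relative_gain_db[setting] - channel_loss_db))


if __name__ == "__main__":
    # Placeholder labels and gains, NOT taken from any data sheet.
    example_gains = {"mode_a": 2.0, "mode_b": 5.5, "mode_c": 9.0}
    print(loss_frequency_ghz(6.5))
    print(pick_rxeqmix(6.0, example_gains))
```

The transmit amplitude check against the absolute gain plot is not modelled here and still has to be done by hand.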
DFE

RX CDR
Edge Sampler
– Detects the eye edge
Data Sampler
– Real data sampling point
Scan Sampler
– For margin
CDR Lock Detection
Finding known data in the incoming data stream (for example, commas or A1/A2 framing characters).
– Several consecutive known data patterns must be received without error to indicate a CDR lock.
Using the LOS state machine
– Incoming data is 8B/10B encoded
– If CDR is locked, the LOS state machine moves to the SYNC_ACQUIRED state and stays there.
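The first detection method (several consecutive known patterns received without error) amounts to a run-length check. A minimal Python sketch is shown below, assuming the fabric logic already produces one pass/fail flag per expected pattern; the function name and the threshold of three are illustrative choices, since the slide does not state how many patterns are required.

```python
def cdr_lock_indicated(pattern_checks, required_consecutive):
    """True if the most recent checks form an error-free run of the required length.

    pattern_checks: iterable of booleans, oldest first; True means a
    known pattern (comma, A1/A2, ...) was received without error.
    """
    if required_consecutive < 1:
        raise ValueError("required_consecutive must be at least 1")
    recent = list(pattern_checks)[-required_consecutive:]
    return len(recent) == required_consecutive and all(recent)


if __name__ == "__main__":
    print(cdr_lock_indicated([False, True, True, True], 3))
    print(cdr_lock_indicated([True, True, False, True], 3))
```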
RX Parallel Clock Divider

RX Margin Scan: Attributes

Eye Margin Related to Bit Error

Eye Margin: Operating Steps
Set a proper RXEQMIX
– An improper RXEQMIX can lead to incorrect DFE operation
Run DFE with auto-calibration
– To get the maximum eye height
– With NO bit errors
Run manual DFE with the proper DFE setting
– Copy TAP monitors to TAP set ports
– Assert DFETAPOVRD
Set RX_EYE_SCANMODE to 2'b01
– Via DRP
Modify RX_EYE_OFFSET to control the scan sampling point
– Via DRP
– Judge by DFEEYEDACMONITOR[4:0]
• 5'b11111 for 200 mV
• Minimum input is 120 mV (about 5'b10011)
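The two DFEEYEDACMONITOR data points on the slide are consistent with a linear scale: if 5'b11111 (31) is 200 mV, then 5'b10011 (19) is about 123 mV, just above the 120 mV minimum. The Python sketch below converts a monitor code to millivolts under that linearity assumption, which is inferred from these two points rather than stated on the slide; the function names are hypothetical.

```python
FULL_SCALE_CODE = 0b11111   # 5'b11111
FULL_SCALE_MV = 200.0       # eye height at full scale, from the slide


def eye_dac_to_mv(code):
    """Convert a DFEEYEDACMONITOR[4:0] code to mV, assuming a linear scale."""
    if not 0 <= code <= FULL_SCALE_CODE:
        raise ValueError("DFEEYEDACMONITOR is a 5-bit value")
    return code * FULL_SCALE_MV / FULL_SCALE_CODE


def meets_minimum(code, minimum_mv=120.0):
    """True if the measured eye height reaches the minimum input level."""
    return eye_dac_to_mv(code) >= minimum_mv


if __name__ == "__main__":
    print(round(eye_dac_to_mv(0b10011), 1))
    print(meets_minimum(0b10011), meets_minimum(0b10010))
```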
RX Buffer Bypass

RX Phase Alignment Steps
Set the following attributes:
– Set RXRECCLK_CTRL:
• 2 byte or 4 byte – use RXRECCLKPMA_DIV2
• 1 byte – use RXRECCLKPMA_DIV1
– Set RX_BUFFER_USE to FALSE to bypass the RX elastic buffer.
– Set RX_XCLK_SEL to RXUSR.
Make sure the input ports RXENPMAPHASEALIGN and RXPMASETPHASE are driven Low.
Make sure that the input port RXDLYALIGNDISABLE is driven High.
Reset the RX datapath using GTXRXRESET or RXCDRRESET.
If an MMCM is used to generate the RXUSRCLK/RXUSRCLK2 clocks, wait for the MMCM to lock.
Wait for the CDR to lock and provide a stable RXRECCLK.
Assert RXDLYALIGNRESET for 20 RXUSRCLK2 clock cycles.
Drive RXENPMAPHASEALIGN High.
– Keep RXENPMAPHASEALIGN High unless the phase-alignment procedure must be repeated. Driving RXENPMAPHASEALIGN Low causes phase alignment to be lost.
Wait 32 RXUSRCLK2 clock cycles, then drive RXPMASETPHASE High for 32 RXUSRCLK2 cycles, and then deassert it.
Drive RXDLYALIGNDISABLE Low.
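The attribute part of the RX procedure depends only on the interface width, so it can be captured in a small lookup. The Python sketch below returns the three attribute settings named on this slide for a given width; the function name and the dictionary representation are hypothetical conveniences, not a tool interface.

```python
def rx_buffer_bypass_attributes(interface_bytes):
    """Attribute settings for RX buffer bypass, keyed by attribute name."""
    if interface_bytes == 1:
        recclk = "RXRECCLKPMA_DIV1"
    elif interface_bytes in (2, 4):
        recclk = "RXRECCLKPMA_DIV2"
    else:
        raise ValueError("interface width must be 1, 2 or 4 bytes")
    return {
        "RXRECCLK_CTRL": recclk,
        "RX_BUFFER_USE": "FALSE",   # bypass the RX elastic buffer
        "RX_XCLK_SEL": "RXUSR",
    }


if __name__ == "__main__":
    for width in (1, 2, 4):
        print(width, rx_buffer_bypass_attributes(width))
```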
Timing waveform
Power Supply
Power can be shared between Quads

RCAL Resistor PCB Layout

Power Supply: Entire Column in Use

Power Supply: MGTAVCC Plane

Power Supply: No MGT Used in the Column

Power Supply: Partially Used Column (Whole Quad Unused)

Power Supply: Partial Quad Unused

Quad Usage Priority: FF484/FF784
Priority 1: MGT115
– This Quad should be used if any of the GTX transceivers in the device are used in the application. It contains the RCAL circuit that is required for the RX and TX internal termination resistors.
Priority 2: MGT114/116
– Depending on availability in the package, these Quads have equal priority.
Quad Usage Priority: FF1156/FF1759
Priority 1: MGT115
– This Quad should be used if any of the GTX transceivers in the device are used in the application. It contains the RCAL circuit that is required for the RX and TX internal termination resistors.
Priority 2: MGT116/117/118
– If present in the Virtex-6 device, these Quads are connected in the package to the same power planes as MGT115, the north power plane group. Therefore they have equal priority. Because the north power planes need to be powered for MGT115, these Quads are also powered; therefore they can be used without additional power supply connections.
Priority 3: MGT110/111/112/113/114
– These transceivers are connected to the south power planes. They should be used if all Quads on the north power planes have already been utilized. If any of these Quads are used, then all MGTAVCC_S and MGTAVTT_S pins need to be connected to the appropriate power supply voltage.
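The FF1156/FF1759 priority rules can be turned into a simple ordering helper for planning which Quads to populate first. The Python sketch below encodes the three priority levels from this slide; the function names are hypothetical, and ordering Quads of equal priority by number is an arbitrary tie-break chosen for the example.

```python
NORTH_QUADS = (116, 117, 118)            # share the north power planes with MGT115
SOUTH_QUADS = (110, 111, 112, 113, 114)  # on the south power planes


def quad_priority(quad):
    """Priority level (1 = use first) for FF1156/FF1759 packages."""
    if quad == 115:
        return 1   # contains the RCAL circuit
    if quad in NORTH_QUADS:
        return 2
    if quad in SOUTH_QUADS:
        return 3
    raise ValueError("unknown Quad: MGT%d" % quad)


def allocation_order(present_quads):
    """Sort the Quads present in a device by priority, then by number."""
    return sorted(present_quads, key=lambda q: (quad_priority(q), q))


if __name__ == "__main__":
    print(allocation_order([112, 113, 114, 115, 116]))
```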
Reference Clock Interface: LVDS Clock

Reference Clock: LVPECL Clock