AR# 66341

UltraScale GTY Transceiver: TX and RX Latency Values

Description

This answer record provides the TX and RX latency values for the Virtex UltraScale FPGA GTY Transceiver.

Solution

TX:

Latency values are in UI, listed by internal data width. A single entry in a row applies to both the Min and Max columns of that width. A "-" marks an empty cell.

Internal Data Width                       16         20         32         40         64         80
TX Fabric Interface
  Fabric width = N (Min)                  16*        20*        32* (33*)  40*        64* (66*)  80*
  Fabric width = 2N (Max)                 48         60         96 (99)    120        192 (198)  240
PCIe 128B/130B Encoder
  Min                                     -          -          96         -          -          -
  Max                                     -          -          126        -          -          -
8B/10B Encoder                            -          20         -          40         -          -
Synchronous Gearbox (Legacy Gearbox)
  64B66B                                  32-64      -          64-128     -          128-254    -
  64B66B, CAUI mode                       -          -          (64-128)   -          (128-256)  -
  64B67B                                  32-66      -          64-130     -          128-257    -
  64B67B, CAUI mode                       -          -          (64-132)   -          (128-260)  -
Asynchronous Gearbox (Gearbox FIFO)
  Min                                     -          -          309        -          353        -
  Max                                     -          -          340        -          353        -
TX Phase FIFO
  Using TX FIFO                           40-56      50-70      80-112     100-140    162-226    200-280
  Using TX FIFO, TXFIFO_ADDR_CFG = HIGH   (56-72)    (70-90)    (112-144)  (140-180)  (226-290)  (280-360)
  Bypassing TX FIFO                       16         20         32 (0)     40         64 (0)     80
To TX PCS/PMA boundary                    16         20         32         40         64         80
To Serializer
  Using TX FIFO or Gearbox FIFO           32         40         64         80         128        160
  Bypassing TX FIFO and Gearbox FIFO      16         20         32         40         64         80
PMA (Serializer)                          19         19         29         29         29         29
Total, absolute minimum                   83         99         157        189        285        349
Total, XAUI (8B/10B mode) with TX FIFO
  Min                                     -          169        -          329        -          -
  Max                                     -          229        -          449        -          -

Comments:

TX Fabric Interface: Double the * values if TX_FABINT_USRCLK_INT = 1'b1 (default 0). Parenthesized numbers apply when the Gearbox FIFO is used.
PCIe 128B/130B Encoder: 0 if bypassed.
8B/10B Encoder: 0 if bypassed.
Synchronous Gearbox (Legacy Gearbox): 0 if bypassed, for both 64B66B and 64B67B. The parenthesized range is for CAUI mode.
Asynchronous Gearbox (Gearbox FIFO): 64B66B only. 0 latency when unused. When used, the TX Phase FIFO is bypassed with 0 latency. If a non-default TXGBOX_FIFO_INIT_RD_ADDR (IRA) is used, add (4 - IRA)*66 UI to the latency. CAUI numbers are expected to be in the same range.
TX Phase FIFO: The parenthesized range applies if TXFIFO_ADDR_CFG = HIGH (default LOW). In the bypass row, the parenthesized 0 applies when the Gearbox FIFO is used.
To Serializer: 2 TX XCLK cycles when using the TX FIFO or Gearbox FIFO; 1 cycle into the Serializer when bypassing both.
Total, absolute minimum: Fabric Interface (NxN) + TX FIFO bypass + To TX PCS/PMA boundary + To Serializer + PMA.
Total, XAUI (8B/10B mode) with TX FIFO: Fabric Interface (min NxN, max 2NxN) + 8B/10B + TX FIFO (latency variation after reset) + To TX PCS/PMA boundary + To Serializer + PMA.
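
The two Total rows of the TX table can be reproduced by summing the per-block entries named in their comments. The Python sketch below is illustrative only: the function and dictionary names are hypothetical, and the numbers are copied from the TX table (all in UI).

```python
# TX latency contributions in UI, copied from the TX table.
# All names are illustrative; none come from a Xilinx tool or API.
WIDTHS = (16, 20, 32, 40, 64, 80)
PMA = {16: 19, 20: 19, 32: 29, 40: 29, 64: 29, 80: 29}

# Entries used by the XAUI row (internal widths 20 and 40 only).
FABRIC_MIN_MAX = {20: (20, 60), 40: (40, 120)}      # NxN, 2NxN
PHASE_FIFO_MIN_MAX = {20: (50, 70), 40: (100, 140)}  # TXFIFO_ADDR_CFG = LOW


def tx_absolute_minimum(n):
    """Fabric Interface (NxN) + TX FIFO bypass + To TX PCS/PMA boundary
    + To Serializer (1 cycle, FIFO bypassed) + PMA."""
    return n + n + n + n + PMA[n]


def tx_xaui_with_fifo(n):
    """Return (min, max) for 8B/10B mode with the TX FIFO in use."""
    encoder = n          # 8B/10B Encoder: 20 or 40
    boundary = n         # To TX PCS/PMA boundary
    serializer = 2 * n   # 2 TX XCLK cycles when the TX FIFO is used
    fixed = encoder + boundary + serializer + PMA[n]
    fabric = FABRIC_MIN_MAX[n]
    fifo = PHASE_FIFO_MIN_MAX[n]
    return (fabric[0] + fifo[0] + fixed, fabric[1] + fifo[1] + fixed)


for n in WIDTHS:
    print(n, tx_absolute_minimum(n))
print(tx_xaui_with_fifo(20), tx_xaui_with_fifo(40))
```

The sums agree with the table: 83, 99, 157, 189, 285, and 349 UI for the absolute minimum, and 169/229 UI and 329/449 UI for the XAUI case.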
 

Note:

1) Using the TXGBOX_FIFO_LATENCY DRP register:

The actual latency through the TX Asynchronous Gearbox exceeds the latency reported by the TXGBOX_FIFO_LATENCY DRP attribute by 65 UI (for 4-byte usage) or 131 UI (for 8-byte usage).

The attribute reports latency in units of 1/8 UI, so divide the value read from the attribute by 8 before adding the offset.
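
As a minimal sketch of that conversion, assuming the raw attribute value has already been read over the DRP (the function name and the sample reading are made up for illustration):

```python
# Offsets in UI by which the real TX Asynchronous Gearbox latency exceeds
# the TXGBOX_FIFO_LATENCY reading, keyed by usage in bytes (from the note).
TX_GBOX_OFFSET_UI = {4: 65, 8: 131}


def tx_gearbox_latency_ui(drp_value, usage_bytes):
    """Convert a raw TXGBOX_FIFO_LATENCY reading (1/8 UI units) to UI."""
    return drp_value / 8 + TX_GBOX_OFFSET_UI[usage_bytes]


# 2000 is a hypothetical reading, not a value from the answer record.
print(tx_gearbox_latency_ui(2000, 4))  # 2000 / 8 + 65 = 315.0
```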

 

2) Using the Latency table for high line rates

In addition to the latencies in the TX and RX tables, add 0.5 UI for each 1 Gbps of line rate above 4 Gbps to the total roundtrip (TX path + RX path) latency.

 

For example:

If the tables give a total latency (TX data path + RX data path) of 500 UI for a particular use case and the intended line rate is 28 Gbps, add 12 UI (0.5 UI * 24).

The final latency is then 500 UI + 12 UI = 512 UI.
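
The same adjustment, written as a small Python sketch. The function names are hypothetical, and treating fractional Gbps linearly is an assumption; the note only describes whole 1 Gbps increments.

```python
def line_rate_adjustment_ui(line_rate_gbps):
    """Extra roundtrip latency in UI: 0.5 UI per Gbps above 4 Gbps.

    Assumption: fractional Gbps are scaled linearly.
    """
    return max(0.0, line_rate_gbps - 4.0) * 0.5


def adjusted_roundtrip_ui(table_total_ui, line_rate_gbps):
    """Table total (TX path + RX path) plus the line-rate adjustment."""
    return table_total_ui + line_rate_adjustment_ui(line_rate_gbps)


print(adjusted_roundtrip_ui(500, 28))  # 500 + 0.5 * 24 = 512.0
```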

 

RX:

 
Latency values are in UI, listed by internal data width. A single entry in a row applies to both the Min and Max columns of that width. A "-" marks an empty cell.

Internal Data Width                           16           20           32           40           64           80
PMA (Deserializer)                            40.5         44.5         64.5         76.5         96.5         116.5
PMA to PCS
  RX FIFO used                                0            0            0            0            0            0
  RX FIFO bypass                              8            10           16           20           32           40
Internal Parallel Loopback: PCS TX to RX      16           20           32           40           64           80
Comma Alignment
  Min                                         32           40 (60)      64           80 (120)     128          160
  Max                                         55 [33]      69 [41]      103 [65]     129 [81]     [131]        [163]
  No Comma Alignment                          16           20           32           40           64           80
8B/10B Decoder                                -            20           -            40           -            -
PCIe Decoder and Block Alignment (128B/130B)
  Min                                         -            -            97           -            -            -
  Max                                         -            -            127          -            -            -
PCIe RX Elastic Buffer                        -            -            320-416      -            -            -
PCIe Decode/Align + Elastic Buffer combined   -            -            421-513      -            -            -
Elastic buffer
  Min                                         24 + 8xML    30 + 10xML   48 + 8xML    60 + 10xML   96 + 8xML    120 + 10xML
  Max                                         40 + 8xML    50 + 10xML   80 + 8xML    100 + 10xML  160 + 8xML   200 + 10xML
Asynchronous Gearbox (Gearbox FIFO)
  Min                                         -            -            252          -            372          -
  Max                                         -            -            348          -            500          -
Synchronous Gearbox (Legacy Gearbox)
  64B66B                                      16-49        -            32-97        -            66-193       -
  64B66B, CAUI mode                           -            -            (32-98)      -            (64-194)     -
  64B67B                                      16-50        -            32-98        -            67-196       -
  64B67B, CAUI mode                           -            -            (32-100)     -            (64-196)     -
RX Fabric Interface
  Fabric width = N (Min)                      16*          20*          32* (33*)    40*          64* (66*)    80*
  Fabric width = 2N (Max)                     48           60           96 (99)      120          192 (198)    240
  Extra latency when using gearboxes          0, 16        -            0, 32-33     -            0            -
Total, absolute minimum                       81           95           145          177          257          317
Total, XAUI (highest latency mode)
  Min                                         -            235          -            457          -            -
  Max                                         -            304          -            586          -            -

Comments:

PMA to PCS: RX FIFO bypass adds 1/2 cycle of latency.
Internal Parallel Loopback: PCS TX to RX: For internal parallel loopback only. Latency is measured from the "To TX PCS/PMA boundary" row of the TX table.
Comma Alignment: Variability covers multiple modes. The parenthesized min is for XAUI. The bracketed max is for RXSLIDE PMA mode with PCS shifter.
8B/10B Decoder: 0 if bypassed.
PCIe Decoder and Block Alignment (128B/130B): The decoder is synchronous, but its latency varies continuously within this range during normal operation.
PCIe RX Elastic Buffer: The range is subject to +/- variation due to nonzero PPM. The FIFO commonly alternates between two latencies 32 UI apart in normal operation. The remaining 64 UI of variation depends on startup conditions when the FIFO starts receiving valid data.
PCIe Decode/Align + Elastic Buffer combined: The range is subject to +/- variation due to nonzero PPM. Because the latency variations of the Decode/Aligner and the Elastic Buffer are correlated, the total variation for the combination is smaller than the sum of the variations for each one alone.
Elastic buffer: 0 if bypassed. "ML" = CLK_COR_MIN_LAT. The CLK_COR_MIN_LAT ranges below are simplified guidelines, used only to produce the sample latency ranges in the last two rows of the table. Use the CLK_COR_MIN_LAT value from the wizard as ML for actual latency calculations.
  For 2 byte: 4 <= ML <= 6 (phase only), 11 <= ML <= 13 (clock correction). Use ML = 6 for calculation.
  For 4 byte: 8 <= ML <= 12 (phase only), 23 <= ML <= 27 (clock correction). Use ML = 12 for calculation.
  For 8 byte: 16 <= ML <= 24 (phase only). Use ML = 24 for calculation.
Asynchronous Gearbox (Gearbox FIFO): 64B66B only. 0 latency when unused. If a non-default RXGBOX_FIFO_INIT_RD_ADDR (IRA) is used, add (default - IRA)*66 UI to the latency.
Synchronous Gearbox (Legacy Gearbox): 0 if bypassed, for both 64B66B and 64B67B. The parenthesized range is for CAUI mode.
RX Fabric Interface: Double the * values if RX_FABINT_USRCLK_INT = 1'b1. Parenthesized numbers apply when the Gearbox FIFO is used. The extra latency when using gearboxes (0 or 1 RXUSRCLK cycle) is due to the possible need for extra pipelining to align the frame within the fabric word.
Total, absolute minimum: PMA (round up) + PMA to PCS (FIFO bypass) + Comma Alignment bypass + Fabric Interface (NxN).
Total, XAUI (highest latency mode): PMA (round up) + PMA to PCS (zero-cycle setup) + Comma Alignment (XAUI mode) + 8B/10B + Elastic Buffer (latency variation after reset) + Fabric Interface (min NxN, max 2NxN).
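
The Elastic buffer entries and the two Total rows of the RX table can be checked with a short Python sketch. The names below are hypothetical; the numbers are copied from the RX table, and ML uses the calculation values the table suggests (6 for 2 byte, 12 for 4 byte).

```python
import math

# RX table entries in UI. All names are illustrative only.
WIDTHS = (16, 20, 32, 40, 64, 80)
PMA = {16: 40.5, 20: 44.5, 32: 64.5, 40: 76.5, 64: 96.5, 80: 116.5}
# Elastic buffer: (min base, max base, multiplier applied to ML).
ELASTIC_BUFFER = {
    16: (24, 40, 8), 20: (30, 50, 10),
    32: (48, 80, 8), 40: (60, 100, 10),
    64: (96, 160, 8), 80: (120, 200, 10),
}
# Entries used by the XAUI row (internal widths 20 and 40 only).
COMMA_XAUI = {20: (60, 69), 40: (120, 129)}    # XAUI min, max
FABRIC = {20: (20, 60), 40: (40, 120)}         # NxN, 2NxN


def elastic_buffer_ui(n, ml):
    """Return (min, max) elastic buffer latency; ml is CLK_COR_MIN_LAT."""
    lo, hi, step = ELASTIC_BUFFER[n]
    return lo + step * ml, hi + step * ml


def rx_absolute_minimum(n):
    """PMA (rounded up) + PMA to PCS (FIFO bypass, 1/2 cycle)
    + no comma alignment + Fabric Interface (NxN)."""
    return math.ceil(PMA[n]) + n // 2 + n + n


def rx_xaui(n, ml):
    """Return (min, max) for the XAUI total row."""
    fixed = math.ceil(PMA[n]) + 0 + n   # PMA + PMA to PCS (0) + 8B/10B Decoder
    eb = elastic_buffer_ui(n, ml)
    return tuple(fixed + COMMA_XAUI[n][i] + eb[i] + FABRIC[n][i]
                 for i in range(2))


print([rx_absolute_minimum(n) for n in WIDTHS])
print(rx_xaui(20, 6), rx_xaui(40, 12))
```

With those ML values the sketch reproduces the table totals: 235/304 UI for internal width 20 and 457/586 UI for internal width 40.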

 

 

Note:

1) Using the RXGBOX_FIFO_LATENCY DRP register:

The actual latency through the RX Asynchronous Gearbox exceeds the latency reported by the RXGBOX_FIFO_LATENCY DRP attribute by 32 UI (for 4-byte usage) or 63 UI (for 8-byte usage).

The attribute reports latency in units of 1/8 UI, so divide the value read from the attribute by 8 before adding the offset.
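
A minimal sketch of the RX conversion, assuming the raw attribute value has already been read over the DRP (the function name and the sample reading are hypothetical):

```python
# Offsets in UI by which the real RX Asynchronous Gearbox latency exceeds
# the RXGBOX_FIFO_LATENCY reading, keyed by usage in bytes (from the note).
RX_GBOX_OFFSET_UI = {4: 32, 8: 63}


def rx_gearbox_latency_ui(drp_value, usage_bytes):
    """Convert a raw RXGBOX_FIFO_LATENCY reading (1/8 UI units) to UI."""
    return drp_value / 8 + RX_GBOX_OFFSET_UI[usage_bytes]


# 2400 is a made-up reading used only to show the arithmetic.
print(rx_gearbox_latency_ui(2400, 8))  # 2400 / 8 + 63 = 363.0
```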

 

2) Using the COMMA_ALIGN_LATENCY DRP register:

To determine the actual latency from the COMMA_ALIGN_LATENCY register, use the following formula:

Latency = 2 * Internal Data Width + DRP value read from the COMMA_ALIGN_LATENCY register
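
Written out as a Python sketch (the function name and the sample reading are hypothetical; the answer record does not state the units of the register value, so the sketch assumes it is already in the same units as the table):

```python
def comma_align_latency(internal_data_width, drp_value):
    """Latency = 2 * Internal Data Width + COMMA_ALIGN_LATENCY reading."""
    return 2 * internal_data_width + drp_value


# Internal data width 20 with a made-up register reading of 9.
print(comma_align_latency(20, 9))  # 2 * 20 + 9 = 49
```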

3) Using the Latency table for high line rates:

In addition to the TX and RX latencies in the tables above, users will have to add an additional 0.5 UI latency for each 1 Gbps increment in line rates beyond 4 Gbps to the total roundtrip (TX path + RX path) latency.

For Example:

From the table, for a particular use case, if the total latency (Tx data path + Rx data path) results in 500 UI and if the intended line rate of operation is 28 Gbps, the user will need to add 12 UI (0.5 UI * 24).

So, the final latency is 500 UI + 12 UI = 512 UI.

AR# 66341
Date Created 01/05/2016
Last Updated 11/07/2016
Status Active
Type General Article
Devices
  • Virtex UltraScale