I have a project on the ZYNQ platform that uses the AXI PCIe IP and connects to the DDR in the Processing System. One BAR in the PCIe core is set to 512 MB so that the host can read/write the DDR memory space. We have a Linux kernel driver that writes a constant 8-byte value in a for loop, up to 256 MB.

I tested boot images for different PCIe configurations, and the throughput always comes out around 55-66 MB/s regardless of the lane count or link speed. With more than one lane and speeds above 2.5 GT/s we see spikes up to 120 MB/s, but they are not stable.

I used your method to calculate the bandwidth for one lane at 2.5 GT/s (raw bandwidth × 8/30, i.e. 250 MB/s after 8b/10b encoding, with each 8-byte write carried in a TLP of roughly 30 bytes on the wire), and it comes out to about 66 MB/s, so that matches the ideal for x1 at 2.5 GT/s. However, the measured bandwidth does not scale with the lane count or link speed. I have read all of your blogs/forum posts but could not understand this behavior. I see the suggestion that using DMA will improve the bandwidth, but shouldn't the write performance still scale with the link configuration?
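To make the access pattern concrete, here is a minimal sketch of the kind of loop I mean (the names, constants, and the BAR mapping are illustrative only, not our exact driver code):

    #include <linux/io.h>
    #include <linux/ktime.h>
    #include <linux/printk.h>
    #include <linux/io-64-nonatomic-lo-hi.h>  /* writeq fallback on 32-bit hosts */

    #define TEST_LEN     (256UL * 1024 * 1024)   /* 256 MB of writes */
    #define TEST_PATTERN 0x1122334455667788ULL   /* constant 8-byte value */

    /* 'bar' is the BAR mapping obtained elsewhere via pci_iomap()/ioremap() */
    static void bar_write_test(void __iomem *bar)
    {
        ktime_t start;
        size_t off;

        start = ktime_get();
        /* One 8-byte MMIO store per iteration, i.e. one small posted-write
         * TLP at a time -- no DMA and no write combining assumed. */
        for (off = 0; off < TEST_LEN; off += sizeof(u64))
            writeq(TEST_PATTERN, bar + off);

        pr_info("PIO write test: %lu MB in %lld us\n",
                TEST_LEN >> 20, ktime_us_delta(ktime_get(), start));
    }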
Other questions –
1) Does a single TLP use a single lane or multiple lanes if the PCIe is configured for multiple lanes?
2) How did you create the FPGA sniffer, and what data exactly are you putting into it?
Please suggest what might be going on. I would really appreciate your help.