Adapting design to Xillybus IP

Questions and discussions about the Xillybus IP core and drivers

Post by Guest » Sun Feb 18, 2018 10:10 pm

Hello,

I have implemented a Verilog module as a two-state FSM, with a number of these modules coupled in a pipeline to perform a computation.
The first stage pre-loads 2-bit data values into each of the modules; the second stage streams another batch of 2-bit values across the initialized, coupled modules.
I am wondering: what is the right way to use the Xillybus IP core and FIFOs to achieve this?

I had a look at the manuals as well as the top-level xillydemo.v module, but I am getting confused by all the information presented.
All I need is to pre-load data into my modules (host to FPGA), stream some data (host to FPGA) and stream the results back as they arrive (FPGA to host).
I would appreciate any pointers on where to start. I know how FIFOs and shift registers operate, but I can't seem to put the whole picture together in my head.
Please indicate if I should elaborate more.

Thank you.
Guest
 

Re: Adapting design to Xillybus IP

Post by support » Mon Feb 19, 2018 9:09 am

Hello,

The general idea is to connect one end of each FIFO to the Xillybus IP core and the other end to your application logic.

For example, in the host-to-FPGA direction, anything you write into the respective device file on the host appears in the related FIFO on the FPGA. The same goes for the other direction: anything you write into the FIFO on the FPGA is read from the respective device file on the host.

What is left for your application is to fetch data from one FIFO (host-to-FPGA), process it, and push the results into the other FIFO (FPGA-to-host).
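
On the host side, this amounts to plain file I/O. Below is a minimal sketch, assuming POSIX read() and write() on device files that are already open. The helper names write_all and read_all are made up for illustration and are not part of any Xillybus API; they only deal with the fact that read() and write() may transfer fewer bytes than requested.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes to fd, retrying on
   partial writes and EINTR. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, try again */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: read until len bytes have arrived or EOF is hit.
   Returns the number of bytes read, or -1 on error. */
ssize_t read_all(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t got = 0;

    assert(buf != NULL || len == 0);
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;              /* EOF */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

The descriptors would come from open() on the device files of your IP core. In the demo bundle these are named along the lines of /dev/xillybus_write_32 and /dev/xillybus_read_32, but the actual names depend on how the core was configured.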

Regards,
Eli
support
 

Re: Adapting design to Xillybus IP

Post by Guest » Tue Feb 27, 2018 8:53 pm

Thank you.

Another concern that I just came across is that an array of, let's say, 500 32-bit values is being calculated at *every clock cycle*.
How should I approach this? It seems that there would be an issue trying to pass the whole array to the host at every clock cycle.

My initial idea was to store the intermediate values and send them to the host 32 bits at a time, but this would require on-board storage, which is what I'm trying to minimize by streaming the output results to the host as they arrive instead of keeping them.

Does Xillybus provide a way of streaming such large amounts of data to the host without buffering? The idea is to have, let's say, a C program with two threads: one sending the data to the FPGA, the other receiving the big arrays of output results.
Guest
 

Re: Adapting design to Xillybus IP

Post by support » Wed Feb 28, 2018 7:04 am

Hello,

I suggest taking a look at section 6.6 of the Programming Guide for Linux, which discusses hardware acceleration:

http://xillybus.com/downloads/doc/xilly ... _linux.pdf

(there's a similar section in the guide for Windows as well, depending on what you're working with)

As for how to organize the data transport, that's really the art of FPGA design. Xillybus provides a stream transport. Storing the data in an array until calculation might be a good idea. A 512 x 32 RAM is peanuts for common FPGAs.
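
On the host, the two-thread arrangement mentioned in the question could look roughly like the sketch below. It is not taken from the Programming Guide: run_duplex, struct xfer and writer_thread are hypothetical names, and the two file descriptors are assumed to be already open on a host-to-FPGA and an FPGA-to-host device file. One thread keeps feeding data while the calling thread collects results, so neither direction stalls the other.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical bookkeeping for the sending thread. */
struct xfer {
    int fd;
    const unsigned char *buf;
    size_t len;
    int ok;                     /* set to 1 once everything was written */
};

/* Thread body: push the whole buffer into the host-to-FPGA stream. */
static void *writer_thread(void *arg)
{
    struct xfer *x = arg;
    size_t done = 0;

    while (done < x->len) {
        ssize_t n = write(x->fd, x->buf + done, x->len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return NULL;        /* x->ok stays 0 */
        }
        done += (size_t)n;
    }
    x->ok = 1;
    return NULL;
}

/* Send tx_len bytes on wr_fd from a separate thread while the calling
   thread reads rx_len bytes from rd_fd. Returns 0 on success, -1 on
   failure. Sketch only: if reading fails early, the writer may remain
   blocked and the join below will wait for it. */
int run_duplex(int wr_fd, const void *tx, size_t tx_len,
               int rd_fd, void *rx, size_t rx_len)
{
    struct xfer w = { wr_fd, tx, tx_len, 0 };
    pthread_t wt;
    size_t got = 0;
    int rd_ok = 1;

    assert(tx != NULL && rx != NULL);
    if (pthread_create(&wt, NULL, writer_thread, &w) != 0)
        return -1;

    while (got < rx_len) {
        ssize_t n = read(rd_fd, (unsigned char *)rx + got, rx_len - got);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0) {           /* error or unexpected EOF */
            rd_ok = 0;
            break;
        }
        got += (size_t)n;
    }

    pthread_join(wt, NULL);
    return (w.ok && rd_ok) ? 0 : -1;
}
```

Without hardware, the function can be tried with a pipe as a loopback: pass the pipe's write end as wr_fd and its read end as rd_fd.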

The thing I didn't understand is how you're going to calculate 500 x 32 bit on each clock cycle. Even at a lazy 50 MHz, that's 100 GB/s of data processed. So surely you can't transport each such array back and forth at anything near that rate.
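
The arithmetic behind that figure can be written out as a small helper. This is only an illustration of the calculation above, and the function name required_bytes_per_sec is made up for the purpose:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per second produced when `words` values of `bits_per_word` bits
   each are generated on every cycle of a clock running at `clock_hz`. */
uint64_t required_bytes_per_sec(uint64_t words, uint64_t bits_per_word,
                                uint64_t clock_hz)
{
    assert(bits_per_word > 0);
    return words * bits_per_word * clock_hz / 8;
}
```

For 500 words of 32 bits at 50 MHz, this gives 500 x 32 x 50,000,000 / 8 = 100,000,000,000 bytes per second, i.e. 100 GB/s.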

Regards,
Eli
support
 

