I am getting too good results!

Questions and discussions about the Xillybus IP core and drivers


Post by Guest

Hello,

I am using Xillybus on my platform, which has a PCIe 1x link, and I am doing a simple loopback in my user application. The application, written in C, writes a large array, reads it back, and measures the round-trip delay. When I do this once, I get around 80 MB/s, which is plausible. However, when I do this over multiple runs and take the average (which should be a more robust measurement), I get a result about 10x better, around 800 MB/s. This is simply impossible on a 1x link. Even in the ideal case I should not get more than roughly 500 MB/s, so I'm confused.
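For reference, the rough 500 MB/s ceiling can be checked with a little arithmetic. The sketch below assumes 8b/10b line coding (as used by PCIe Gen1 and Gen2) and ignores packet overhead, so real throughput is lower. The thread does not state which PCIe generation the platform uses, and the function name `pcie_lane_mb_per_s` is made up for illustration.

```c
/* Raw data rate of a single PCIe lane in MB/s, per direction.
 * gt_per_s: line rate in gigatransfers per second
 *           (2.5 for Gen1, 5.0 for Gen2).
 * Assumes 8b/10b coding: 10 bits on the wire carry 8 bits of data.
 * Packet (TLP/DLLP) overhead is ignored, so this is an upper bound. */
double pcie_lane_mb_per_s(double gt_per_s)
{
    double line_bits_per_s = gt_per_s * 1e9;
    double data_bits_per_s = line_bits_per_s * 8.0 / 10.0;

    return data_bits_per_s / 8.0 / 1e6;   /* bits -> bytes -> MB */
}
```

Under these assumptions a Gen1 lane tops out at 250 MB/s and a Gen2 lane at 500 MB/s, so a measured 800 MB/s on a single lane points at the measurement rather than the link.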

Notes:
1. I re-initialize the whole array in every iteration, to make sure that caching (or, for example, user-space to kernel-space overhead) is not skewing my numbers.
2. I check integrity, i.e. I am indeed getting the right values back from the device!

What am I missing? This is so confusing. Any ideas?

Thanks
Guest
 

Re: I am getting too good results!

Post by support

Hi,

The question is which time period you actually measure: when the stopwatch is started and when it is stopped.

For example, write() may return almost immediately if there is enough space in the DMA buffers. The data transfer then occurs "in the background". By the same token, if the device file is open for read, data will start arriving in the DMA buffers before read() is called, so when read() is eventually called, it may return pretty much right away if there's already enough data in the buffers.

This is true only for asynchronous streams, but odds are this is what you have there.
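The same buffering effect can be seen with an ordinary pipe, which is used here only as an analogy for the DMA buffers and is not a Xillybus device. In this minimal sketch, write() succeeds although nothing is reading yet, and the later read() finds the data already waiting. The function name `buffered_io_demo` is hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Writes len bytes into a fresh pipe with no reader active, then reads
 * them back. Stores how many bytes write() accepted and how many bytes
 * read() found already waiting. Returns 0 on success, -1 on setup error. */
int buffered_io_demo(size_t len, ssize_t *accepted, ssize_t *waiting)
{
    int fds[2];
    char out[4096], in[4096];

    if (len > sizeof(out) || pipe(fds) < 0)
        return -1;

    /* Non-blocking mode, so neither call can stall the demo. */
    fcntl(fds[0], F_SETFL, O_NONBLOCK);
    fcntl(fds[1], F_SETFL, O_NONBLOCK);

    memset(out, 'x', len);

    /* Returns as soon as the kernel buffer has taken the data. */
    *accepted = write(fds[1], out, len);

    /* The data is already buffered, so this returns right away too. */
    *waiting = read(fds[0], in, len);

    close(fds[0]);
    close(fds[1]);
    return 0;
}
```

Timing only the write() call, or only the read() call, would therefore measure a copy into or out of a buffer rather than the transfer itself.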

I hope this gives you an idea.

Regards,
Eli
support
 

I am getting too good results!

Post by Guest

Hello,

I am working with Xillybus on a platform with a PCIe 1x link, and my measured performance is beyond the theoretical peak!

Here is the situation:
I have a loopback application in C that writes a large array to the device and reads it back (I make sure I read all of the elements back completely and correctly). When I do this once, the measurement gives me 80 MB/s (which is plausible), but when I do this in a loop and calculate the average time, it gives me around 800 MB/s, which is simply impossible on a PCIe 1x link. I also re-initialize the array (change all the values) as a precaution against caching effects. I know I'm missing something, but I can't see what! Does anyone have any idea what I might be missing?

A simple version of my code:
for (i=0; i<EVAL_RUNS; i++) {
  // Initialize buffer
  int j;
  for (j=0; j<BUFF_SIZE; j++) buf[j] = 'a'+i+j%70;

  gettimeofday(&start_w, NULL);
  // Write to device
  allwrite(fdw, buf, BUFF_SIZE);         // WRITE ALL
  // Read from device
  rc = allread(fdr, buf2, BUFF_SIZE);    // READ ALL
  gettimeofday(&end_r, NULL);

  // Read anything that remains, just in case
  sleep(1); // just in case!
  allread(fdr, buf2+rc, BUFF_SIZE-rc);

  // Check integrity
  for (i=0; i<rc; i++)
    if (buf[i] != buf2[i])
      printf("Mismatch: %c <-> %c\n", buf[i], buf2[i]);

  // ERROR CHECKING
  if ((rc < 0) && (errno == EINTR))
    exit(0);

  if (rc < 0) {
    perror("allread() failed to read");
    exit(1);
  }

  if (rc == 0) {
    fprintf(stderr, "Reached read EOF.\n");
    exit(0);
  }

  sum_wr += ((end_r.tv_sec * 1000000 + end_r.tv_usec)
             - (start_w.tv_sec * 1000000 + start_w.tv_usec));
}

long long cpu_time_wr = sum_wr/EVAL_RUNS;
Guest
 

Re: I am getting too good results!

Post by support

Your use of allread() seems to imply that it doesn't necessarily read all BUFF_SIZE bytes from the buffer. If this is indeed the case, no wonder you get weird figures, as you measure a partial data transmission.
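One more thing worth checking in the posted snippet: the integrity check reuses `i`, which is also the outer loop counter. After the first pass `i` equals `rc`, so the outer loop may well exit after a single run while the sum is still divided by `EVAL_RUNS`, which could inflate the figure by about that factor. The sketch below shows one way the measurement loop could be arranged so that each run covers a complete round trip and the counters stay separate. It is only an illustration: `read_fully`, `write_fully` and `average_roundtrip_us` are hypothetical names, not the poster's `allread()`/`allwrite()` and not part of Xillybus.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: call read() until len bytes arrived, EOF or error. */
static ssize_t read_fully(int fd, char *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t rc = read(fd, buf + got, len - got);
        if (rc < 0 && errno == EINTR) continue;   /* interrupted, retry */
        if (rc < 0) return -1;
        if (rc == 0) break;                       /* EOF */
        got += (size_t)rc;
    }
    return (ssize_t)got;
}

/* Hypothetical helper: call write() until all len bytes were accepted. */
static ssize_t write_fully(int fd, const char *buf, size_t len)
{
    size_t sent = 0;
    while (sent < len) {
        ssize_t rc = write(fd, buf + sent, len - sent);
        if (rc < 0 && errno == EINTR) continue;
        if (rc < 0) return -1;
        sent += (size_t)rc;
    }
    return (ssize_t)sent;
}

/* Average round-trip time in microseconds over `runs` complete runs,
 * or -1 if any run failed, came back short, or returned wrong data. */
long long average_roundtrip_us(int fdw, int fdr, size_t size, int runs)
{
    char *out = malloc(size), *in = malloc(size);
    long long sum_us = 0;
    int done = 0;

    if (runs <= 0 || !out || !in) { free(out); free(in); return -1; }

    for (int run = 0; run < runs; run++) {
        struct timespec t0, t1;

        for (size_t j = 0; j < size; j++)         /* fresh data per run */
            out[j] = (char)('a' + (run + j) % 26);

        clock_gettime(CLOCK_MONOTONIC, &t0);
        if (write_fully(fdw, out, size) != (ssize_t)size) break;
        if (read_fully(fdr, in, size) != (ssize_t)size) break;
        clock_gettime(CLOCK_MONOTONIC, &t1);      /* all bytes are back */

        if (memcmp(out, in, size) != 0) break;    /* integrity check */

        sum_us += ((t1.tv_sec - t0.tv_sec) * 1000000000LL
                   + (t1.tv_nsec - t0.tv_nsec)) / 1000;
        done++;
    }

    free(out);
    free(in);
    return done == runs ? sum_us / runs : -1;
}
```

Note that a single thread writing the whole array before reading anything only works if the buffering along the path can hold the array; for larger transfers the writing and reading would have to run concurrently, for example in two threads.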

Regards,
Eli
support
 

