Can a CPU freeze due to congestion or unresponsive device?


Post by Guest

Hello Xillybus,

In the case of an MMIO read:
The CPU simply executes an instruction that loads a register from a memory location, which is indistinguishable from a load from host DRAM. Is the CPU allowed to take as many cycles as it needs for "mov rax, [rbx]", where rbx points to an MMIO address? If so, what happens if the PCIe device takes a long time, or forever, to send a completion TLP?

Similarly, in the case of an MMIO write:
If a CPU writes to a device constantly, can the PCIe fabric or the device become so saturated that writes are simply dropped, as in an IP network? And how does the CPU learn anything about saturation of the fabric or the device, given that writes are posted?

Re: Can a CPU freeze due to congestion or unresponsive devic

Post by support

Hello,

Let's take them one by one, because these are different scenarios.

According to section 2.8 of the PCIe spec, if a PCIe device doesn't answer a memory read request, a completion timeout mechanism should kick in. The length of this timeout is application-specific, and should be between 50 µs and 50 ms (but it's recommended to make this timeout no less than 10 ms).
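To put those numbers in concrete form, here is a minimal sketch in C that checks a candidate timeout against the range quoted above. The names (classify_cpl_timeout and the CPL_TIMEOUT_* constants) are made up for this illustration and don't come from the spec or from any driver.

```c
#include <stdint.h>

/* Bounds as quoted in the post, expressed in microseconds. */
#define CPL_TIMEOUT_MIN_US          50u     /* 50 us: must not expire sooner */
#define CPL_TIMEOUT_MAX_US          50000u  /* 50 ms: must have expired by then */
#define CPL_TIMEOUT_RECOMMENDED_US  10000u  /* 10 ms: recommended lower bound */

/* Hypothetical classification, for illustration only. */
enum cpl_timeout_class {
    CPL_TIMEOUT_OUT_OF_RANGE,      /* outside 50 us .. 50 ms */
    CPL_TIMEOUT_ALLOWED_BUT_SHORT, /* in range, but below the 10 ms recommendation */
    CPL_TIMEOUT_OK                 /* in range and at least 10 ms */
};

static enum cpl_timeout_class classify_cpl_timeout(uint32_t timeout_us)
{
    if (timeout_us < CPL_TIMEOUT_MIN_US || timeout_us > CPL_TIMEOUT_MAX_US)
        return CPL_TIMEOUT_OUT_OF_RANGE;
    if (timeout_us < CPL_TIMEOUT_RECOMMENDED_US)
        return CPL_TIMEOUT_ALLOWED_BUT_SHORT;
    return CPL_TIMEOUT_OK;
}
```

For example, a 1 ms timeout (1000 µs) falls inside the allowed range but below the recommended 10 ms.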

So in theory, a CPU should recover gracefully from such a situation. In practice, I've seen a CPU get stuck in exactly this scenario. This is why Xillybus' IP core never makes any memory reads from the device.

As for memory writes: If the device can't handle write requests, their flow will be stalled at some point, thanks to PCIe's flow control. So writes are held back rather than dropped. I'm not sure if there's a timeout mechanism that prevents the flow from being held up indefinitely.
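The difference between stalling and dropping can be shown with a toy model of credit-based flow control in C. This is only a sketch under simplifying assumptions: real PCIe tracks separate header and data credits per request type in hardware, and the struct and function names here are invented for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one direction of a link. Not how real hardware is organized. */
struct toy_link {
    uint32_t credits;    /* credits advertised by the receiver, not yet used */
    uint32_t delivered;  /* writes that actually went out */
    uint32_t stalled;    /* send attempts that were held back */
};

/* Try to send one posted write. Without a credit, the write is held back
 * at the sender (counted as a stall); it is never silently discarded. */
static bool toy_try_send(struct toy_link *link)
{
    if (link->credits == 0) {
        link->stalled++;
        return false;
    }
    link->credits--;
    link->delivered++;
    return true;
}

/* The receiver frees buffer space and hands n credits back to the sender. */
static void toy_release_credits(struct toy_link *link, uint32_t n)
{
    link->credits += n;
}
```

In this model, a receiver that never calls toy_release_credits() leaves the sender stalled forever, which is the situation the open question about a timeout refers to.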

What I can say from a practical point of view is that I've never seen or heard of any computer getting stuck because someone reconfigured the FPGA while Xillybus' driver attempted to write to the FPGA's PCIe interface. So sending packets into the void is harmless. What happens if you deliberately prevent write requests from being delivered, I don't know. It's a rather bizarre scenario, though.

Regards,
Eli

Re: Can a CPU freeze due to congestion or unresponsive devic

Post by Guest

Thanks for the response, Eli.

In the case of a read timeout, what value does the CPU get? And how does the CPU know whether this is an intended value or an error code?

Re: Can a CPU freeze due to congestion or unresponsive devic

Post by support

Hello,

Truth be told, I don't know the answer to either of those questions, because I abandoned memory reads very early in the development process, exactly because of these CPU freezes.

But I've seen situations where PCIe devices failed to respond to lspci's configuration queries, and I think the response was all 1's. I'm not sure if I remember correctly or if this matters at all.
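If a failed read does come back as all 1's, software could test for it along the lines of the sketch below. This assumes the all-1's behavior, which I'm not certain about for memory reads, and the function names are made up. Note the ambiguity for ordinary registers: a register that legitimately holds 0xFFFFFFFF looks the same as a failed read. The Vendor ID in configuration space is a safer probe, because 0xFFFF is not a valid Vendor ID.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a 32-bit read returned all 1's. This only says the read MAY have
 * failed: a register that really contains 0xFFFFFFFF gives the same result. */
static bool read_may_have_failed(uint32_t value)
{
    return value == UINT32_MAX;
}

/* True if a Vendor ID read from configuration space is all 1's.
 * 0xFFFF is not a valid Vendor ID, so this is an unambiguous sign that
 * no device answered the configuration read. */
static bool vendor_id_says_no_device(uint16_t vendor_id)
{
    return vendor_id == 0xFFFFu;
}
```

A driver that suspects a failed read of an ordinary register could follow up with a read of something that can never be all 1's, such as the Vendor ID, to tell the two cases apart.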

Regards,
Eli