by support »
Hello,
It's a bit pointless to argue about things that are already written in such a well-established specification. But I can come up with a few situations where this restriction may simplify the hardware implementation.
For example, suppose DDR memory is involved in a DMA read request to the CPU. Reading from DDR memory involves requesting a DDR row, which typically contains 1024 memory elements. On a 32-bit bus (4 bytes per element), that's exactly 4 kB per row. So if the read request is guaranteed not to cross a 4 kB boundary, it is also guaranteed that the request can be completed with a single row operation.
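To make the arithmetic concrete, here is a minimal sketch of the row-size calculation and the boundary check. The names (ROW_ELEMENTS, crosses_boundary, and so on) are made up for illustration, and the figures are just the typical ones from the example above, not something taken from a particular DDR part or from the specification.

```python
ROW_ELEMENTS = 1024                    # memory elements per DDR row (typical figure from the example)
BUS_BYTES = 32 // 8                    # a 32-bit bus carries 4 bytes per element
ROW_BYTES = ROW_ELEMENTS * BUS_BYTES   # 1024 * 4 = 4096 bytes, i.e. 4 kB per row


def crosses_boundary(addr, length, boundary=ROW_BYTES):
    """Return True if the byte range [addr, addr + length) spans more than
    one boundary-aligned block (hypothetical helper, for illustration only)."""
    if length <= 0:
        raise ValueError("length must be positive")
    first_block = addr // boundary
    last_block = (addr + length - 1) // boundary
    return first_block != last_block
```

With these numbers, an 8-byte request starting at 0x1FFC crosses a 4 kB boundary (and would touch two rows), while a 4-byte request at the same address does not.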
Another example I can think of is when there is an IOMMU. Each DMA request to the CPU must be checked against the IOMMU's address translation map. This map is arranged in 4 kB pages. Since a request never crosses a page boundary, the IOMMU only needs to check one entry.
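The same idea as a rough sketch for the IOMMU case. This is only a toy model: the page table is a plain dictionary from page number to physical base address, and pages_touched and translate are hypothetical names, not how any real IOMMU is programmed.

```python
PAGE_SIZE = 4096  # 4 kB pages, as in the example above


def pages_touched(addr, length, page_size=PAGE_SIZE):
    """Return the list of page numbers covered by [addr, addr + length)."""
    if length <= 0:
        raise ValueError("length must be positive")
    first = addr // page_size
    last = (addr + length - 1) // page_size
    return list(range(first, last + 1))


def translate(addr, length, page_table):
    """Translate a request with a single table lookup (toy model).

    page_table maps a page number to the physical base address of that page.
    The single lookup is only valid because the request stays in one page.
    """
    pages = pages_touched(addr, length)
    if len(pages) != 1:
        raise ValueError("request crosses a page boundary")
    return page_table[pages[0]] + addr % PAGE_SIZE
```

A request that stays inside one page needs exactly one entry. Without the restriction, the hardware would have to handle the general case, where pages_touched returns several entries that may map to unrelated physical addresses.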
Knowing that only one operation is needed often simplifies the hardware implementation considerably. But why they decided on this restriction many years ago is a different question.
Regards,
Eli