There are two types of latency to consider when looking at the cycle-by-cycle transmission of data on the DDR4 data bus. The first is the time between a Read or Write command and the data associated with that command, referred to as CAS latency and CAS Write latency respectively. The second is the time between successive Read/Write commands. The first exists to give the DRAM time to find the data, in the case of a Read, or to accept it, in the case of a Write. The second exists because the DRAM needs to recover from the previous operation before it can accept a new one. Latency is a key factor in determining performance.
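Because these latencies are counted in clock cycles, it can help to convert them to time. The following is a minimal Python sketch of that arithmetic, not taken from any tool. The function name `cycles_to_ns` and the example numbers (a 16-cycle latency at 2400 MT/s) are illustrative assumptions, not values read from the JEDEC specification.

```python
def cycles_to_ns(cycles: int, data_rate_mtps: float) -> float:
    """Convert a latency in clock cycles to nanoseconds.

    DDR transfers data on both clock edges, so the clock frequency
    in MHz is half the data rate in MT/s.
    """
    clock_mhz = data_rate_mtps / 2.0
    t_ck_ns = 1000.0 / clock_mhz  # one clock period in nanoseconds
    return cycles * t_ck_ns


# Illustrative values only (assumed, not from a datasheet):
# a 16-cycle read latency on a 2400 MT/s bus.
read_latency_ns = cycles_to_ns(16, 2400)
print(f"{read_latency_ns:.2f} ns")
```

The same conversion applies to the spacing between successive commands, which makes it easy to compare a cycle count seen on the bus with a timing parameter given in nanoseconds.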
In the next diagram you will see where the important JEDEC separation parameters, or protocol checks, are measured on a DDR4 bus. Note the orange boxes. They mark places where the system is operating under the specification, in other words waiting longer than the specification requires and leaving some performance on the table. For latency you want to exhibit what we call the ‘Goldilocks’ principle: not too long, not too short, but just right, meaning the memory controller is issuing transactions right at the specification. This ensures that the bus is not leaving any performance on the table and critical applications can take advantage of the higher DDR4 speeds.
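The ‘Goldilocks’ check can be sketched in a few lines of Python. This is only an illustration of the idea, assuming you already have measured command-to-command gaps in clock cycles and a specification minimum for the parameter in question. The names `classify_gap` and `wasted_cycles` and the sample numbers are hypothetical.

```python
def classify_gap(measured_cycles: int, spec_min_cycles: int) -> str:
    """Compare one measured separation against the specification minimum."""
    if measured_cycles < spec_min_cycles:
        return "too short"   # protocol violation
    if measured_cycles > spec_min_cycles:
        return "too long"    # legal, but performance is left on the table
    return "just right"      # issued right at the specification


def wasted_cycles(measured_gaps, spec_min_cycles: int) -> int:
    """Total cycles spent waiting beyond the specification minimum."""
    return sum(max(0, gap - spec_min_cycles) for gap in measured_gaps)


# Hypothetical measurements for one separation parameter with an
# assumed minimum of 4 clock cycles.
gaps = [4, 5, 6, 4]
for gap in gaps:
    print(gap, classify_gap(gap, 4))
print("wasted cycles:", wasted_cycles(gaps, 4))
```

Gaps classified as "too long" correspond to the orange boxes: the bus is legal but slower than it needs to be, and the wasted-cycle total gives a rough idea of how much there is to recover.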
I once had a very experienced engineer from a major server vendor say to me, “Barb, if you can save us 1 clock tick out of a hundred, that’s a 1% performance improvement, and for servers that is a big deal.” By identifying which parameters needed to be tweaked in the memory controller, I was happy to say, 'We did that!'