Memory Latency
What is it, and what causes it?
Memory latency comprises the following factors:
- Address decoding delay
- Word line delay
- Bit line sensing delay
- Output driving delay
- Wire RC Delay (dominant)
When the effects of process improvements are factored out, DRAM latency has been found to scale roughly with chip size. Many circuit, architecture, and layout techniques can be used to reduce latency, but higher-speed circuits take up more area, and in general no dominant circuit topology has emerged as the preferred design.
Contributors to Latency
The worst-case latency in DRAM is specified by the row access time (tRAC). This is the time from when the RAS pin is pulled low to when valid data appears on the output. It comprises four main components:
- Address decoding latency – The time required to latch the address and decide which word line needs to be activated. This delay is caused by the multiplexing of the row and column addresses and the propagation delay through the decoding logic.
- Word line activation latency – The time required to raise the word line to a high voltage. This is fundamentally an RC delay.
- Bit line sensing latency – The time required for the cell contents to be detected by the sense amplifiers. This is affected by bit line architecture, the RC of the sense amp drive line, cell-to-bit line capacitance ratio, and sense amp topology.
- Output driving latency – Time needed for the data to be propagated from the sense amps to the output pads. This too is an RC delay.
DRAM designers seeking to reduce tRAC strive to shrink these four components through various circuit and layout techniques. To give an idea of the relative importance of the above delays, a “conventional” 64-Mb DRAM fabricated in a 0.4 µm CMOS process takes 7 ns (13%) to decode the address, 23 ns (44%) to raise the word line, 16 ns (30%) to sense the data, and 7 ns (13%) to drive the data to the output, for a total tRAC of 53 ns.
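As a quick sanity check on these figures, here is a minimal Python sketch that sums the component delays reported above and recomputes the percentage breakdown (the small disagreement with the quoted 44% is rounding):

```python
# Reported tRAC components for the "conventional" 64-Mb, 0.4 um DRAM above.
components_ns = {
    "address decoding": 7,
    "word line activation": 23,
    "bit line sensing": 16,
    "output driving": 7,
}

t_rac = sum(components_ns.values())
print(f"tRAC = {t_rac} ns")  # -> tRAC = 53 ns

for name, delay_ns in components_ns.items():
    # Percentages match the text to within rounding (23/53 prints as 43%).
    print(f"{name}: {delay_ns} ns ({delay_ns / t_rac:.0%})")
```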
How can we reduce latency in a DRAM?
Reductions in latency can be achieved by developing and applying advanced circuit and layout techniques; however, these improvements usually come at the expense of increased power, increased die area, and therefore increased cost. Given current market trends, low cost is one of the key design criteria in DRAMs, as is speed. DRAM manufacturers clearly have the ability to make extremely fast DRAM, but the cost of doing so has kept such parts out of the mainstream. Below is a brief summary of each latency component, followed by the techniques that have been developed thus far to reduce its impact.
Address decoding latency
Much of this delay comes from the timing signals needed to drive RAS and CAS. Any slack in the signal spacing (the time between signals) adds directly to the delay, so it is critical to time these inputs precisely to obtain maximum performance. Some designs do not multiplex the addresses at all, eliminating this constraint at the cost of extra package pins and hence increased package area.
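To make the pin-count trade-off concrete, here is a small illustrative calculation. It assumes a 64-Mb part organized as 64M x 1 (26 address bits); the organization is an assumption for illustration, not a figure from the text:

```python
import math

# Assumed organization: 64M x 1, so log2(64M) = 26 address bits in total.
total_bits = 64 * 2**20
addr_bits = int(math.log2(total_bits))   # 26

muxed_pins = math.ceil(addr_bits / 2)    # row and column halves share 13 pins,
                                         # sequenced by the RAS/CAS strobes
non_muxed_pins = addr_bits               # 26 dedicated address pins

print(f"multiplexed: {muxed_pins} pins (RAS/CAS spacing on the critical path)")
print(f"non-multiplexed: {non_muxed_pins} pins (no mux delay, bigger package)")
```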
Word line activation latency
This delay is caused by the large number of gates that are connected to the word lines. The resulting RC easily accounts for the bulk of the total access latency. The most straightforward approach to this problem is to divide the word line into smaller sections and add buffers. All modern DRAM word lines are subdivided to some extent, but these additional drivers and buffers add area. Many optimizations can be made if ultimate performance is the goal, but at the cost of bit density.
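The payoff from subdividing the word line falls directly out of the RC math. Below is a minimal sketch using an Elmore-style estimate (a distributed RC line settles in roughly 0.5·R·C); the line resistance, capacitance, and buffer delay are illustrative assumptions, not measured values:

```python
def wordline_delay_ns(r_total_ohm, c_total_pf, segments, t_buf_ns):
    """Elmore-style estimate: a distributed RC line settles in ~0.5*R*C.

    Splitting the line into `segments` pieces divides each piece's R and C
    by the segment count, so per-piece wire delay falls quadratically;
    each inserted buffer adds a fixed t_buf_ns.
    """
    r_seg = r_total_ohm / segments
    c_seg = c_total_pf / segments
    wire_ns = segments * 0.5 * r_seg * c_seg * 1e-3  # ohm * pF = ps -> ns
    return wire_ns + (segments - 1) * t_buf_ns

# Illustrative line: 10 kohm, 4 pF silicide word line, 0.5 ns per buffer.
for k in (1, 2, 4, 8):
    print(f"{k} segment(s): {wordline_delay_ns(10_000, 4, k, 0.5):.1f} ns")
```

The output (20.0, 10.5, 6.5, 6.0 ns) shows why all modern word lines are subdivided, and also the diminishing returns: each extra segment adds driver area while buying less and less delay.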
Another solution is to route the word lines in metal rather than polysilicon or silicide. With aluminum word lines, researchers have reported a decrease in delay from 23 ns in the conventional (silicide word line) DRAM to only 8 ns. While few DRAMs employ full metal word lines, many designs shunt the word lines periodically with metal to reduce the RC delay. BiCMOS line drivers have also been presented as alternatives, but similar performance has been achieved using standard CMOS.
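The improvement from metal is consistent with simple resistance scaling: the line's capacitance is roughly unchanged, so the wire RC falls in proportion to sheet resistance. A sketch with assumed, representative sheet resistances (a few ohms/square for silicide, tens of milliohms/square for aluminum; all values illustrative, not from the text):

```python
def wire_rc_ns(sheet_res_ohm_sq, squares, c_line_pf):
    """Lumped RC estimate: line resistance = sheet resistance * squares."""
    return sheet_res_ohm_sq * squares * c_line_pf * 1e-3  # ohm * pF = ps -> ns

SQUARES = 1250     # assumed geometry of a long, narrow word line
C_LINE_PF = 4      # assumed total word line capacitance

for material, rs_ohm_sq in [("silicide", 4.0), ("aluminum", 0.05)]:
    print(f"{material}: ~{wire_rc_ns(rs_ohm_sq, SQUARES, C_LINE_PF):.1f} ns wire RC")
```

With metal the wire RC essentially vanishes (~0.2 ns vs. ~20 ns here), which suggests the remaining 8 ns in the reported design is dominated by the driver and the gate capacitance of the access transistors.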
Bit line sensing latency
This delay can be reduced by using advanced high-speed circuit topologies instead of ones designed for minimum area, and by providing better power supply lines to the sense amps. One design technique that looks quite promising is direct sensing. With this method, separate read and write amps are connected to the bit lines in addition to the conventional rewrite amplifier. This allows the data to be detected by the read amp and passed on to the global I/O before the bit lines are refreshed. The technique does incur an area penalty, however, which appears to be about 14%.
The bit line sensing speed also depends on the drive voltage of the sense amps, and the resistance of the sense amp drive lines has a significant impact on the sensing delay. Many of the techniques used to reduce RC delay in the word line may be applicable here; some designs boost the sense amp drive to obtain better performance, which appears to accelerate sensing by about 8 ns. The cell-to-bit-line capacitance ratio also directly impacts the sensing delay: to reduce the delay, this ratio should be as high as possible. In practice, this means increasing cell capacitance or putting fewer cells on a bit line, both of which carry their own area penalties.
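The effect of the cell-to-bit-line ratio follows from charge sharing: a cell storing V_cell and a bit line precharged to V_pre settle to a difference of (V_cell - V_pre) · C_cell / (C_cell + C_bl), and a larger initial signal lets the sense amp resolve sooner. A sketch with assumed capacitances and a half-VDD precharge on a 3.3 V part (illustrative values only):

```python
def bitline_signal_mv(v_cell, v_precharge, c_cell_ff, c_bitline_ff):
    """Charge-sharing signal presented to the sense amp, in millivolts."""
    transfer_ratio = c_cell_ff / (c_cell_ff + c_bitline_ff)
    return (v_cell - v_precharge) * transfer_ratio * 1000

# Assumed: 30 fF cell storing 3.3 V, bit line precharged to VDD/2 = 1.65 V.
for c_bl_ff in (150, 300, 600):  # fewer cells per bit line -> lower C_bl
    dv = bitline_signal_mv(3.3, 1.65, 30, c_bl_ff)
    print(f"C_bl = {c_bl_ff} fF: initial signal = {dv:.0f} mV")
```

Halving the bit line capacitance roughly doubles the initial signal (79 mV -> 150 mV -> 275 mV here), which is exactly why shortening bit lines or enlarging the cell capacitor speeds up sensing at an area cost.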
Output driving latency
Like the word line activation latency, this delay is basically caused by the RC of driving a long wire across the die. Again, circuit techniques may be used to reduce delay, as well as careful layout to reduce the length of the wire. Also, new packaging methods such as lead-on-chip can be used to reduce wire length.