B-Fetch: Improving Future Computer System Energy Efficiency and Performance through Efficient and Accurate Memory System Speculation

As in many fields, energy efficiency has become a first-order
constraint in modern computer system design. A recent report by
J. Koomey of the Lawrence Berkeley National Laboratory estimated that
datacenter computing is now consuming ~2% of global power generation,
and growing at a 16% annual rate. To deal with this problem, many
computer manufacturers have gone to chip-multiprocessor designs, or
CPU chip designs with many processor cores, as a way to lower power
consumption. This lower power consumption, however, has come at
some cost to the performance of individual threads.

Programs written in high-level languages, such as C++ and Java, are
compiled into a machine-specific assembly language. In assembly, data
movement between the main memory and the processor is explicit and
often forms a bottleneck for performance. Chip-multiprocessors place
an even greater demand on the memory system than previous-generation
uni-core processors, while at the same time avoiding many of the
power-hungry speculative techniques that older uni-processors used to
alleviate memory latency. Data prefetching, which speculatively requests
data before it is needed so that it is already available when the
program asks for it, is a well-known technique for reducing the
effect of memory system latency on performance. Many data-prefetching
schemes, however, either require a high overhead in power and area or
do not perform particularly well across all applications.
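
To make the general idea of prefetching concrete, here is a minimal sketch of a toy cache with a simple "next-line" prefetcher. This is not the B-Fetch mechanism; it is only an illustration of how requesting data ahead of use can turn misses into hits. The cache model, the 64-byte line size, and all names (TinyCache, count_demand_misses) are assumptions made for this example. The model also ignores timing and energy, so a prefetched line is assumed to arrive instantly.

```python
from collections import OrderedDict

LINE_SIZE = 64  # bytes per cache line (assumed for this toy model)


class TinyCache:
    """Toy fully associative cache with LRU replacement (hypothetical)."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()  # line number -> True, oldest first

    def contains(self, line):
        return line in self.lines

    def touch(self, line):
        # Insert the line, or mark it most recently used if present.
        if line in self.lines:
            self.lines.move_to_end(line)
        else:
            if len(self.lines) >= self.num_lines:
                self.lines.popitem(last=False)  # evict least recently used
            self.lines[line] = True


def count_demand_misses(addresses, num_lines=8, prefetch_next_line=False):
    """Count misses seen by the program's own (demand) accesses."""
    cache = TinyCache(num_lines)
    misses = 0
    for addr in addresses:
        line = addr // LINE_SIZE
        if not cache.contains(line):
            misses += 1
        cache.touch(line)
        if prefetch_next_line:
            # Speculatively bring in the following line before it is used.
            cache.touch(line + 1)
    return misses


# Sequential walk over 512 eight-byte elements (64 cache lines in total).
stream = [i * 8 for i in range(512)]
baseline = count_demand_misses(stream)
with_prefetch = count_demand_misses(stream, prefetch_next_line=True)
print(baseline, with_prefetch)  # 64 1
```

On this perfectly sequential access pattern the next-line prefetcher removes all but the first miss. Real programs are far less regular, which is why simple schemes like this one do not perform well across all applications.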

A novel data-prefetching scheme was recently proposed by Reena Panda,
a graduate student at Texas A&M, her advisor Dr. Paul V. Gratz and
their collaborator Dr. Daniel Jiménez at UTSA. This scheme, known as
B-Fetch, is a “prefetching mechanism for light weight, in-order
processors that leverages a combination of control path speculation
and effective address value speculation.” This unique approach yields
better performance at approximately one-third the power and area of prior
best-of-class data-prefetching schemes.


Panda, Gratz and Jiménez detail their innovative method in an article
entitled “B-Fetch: Branch Prediction Directed Prefetching for In-Order
Processors,” published in the journal IEEE Computer Architecture
Letters last year. After review by a selection committee,
Panda, Gratz and Jiménez’s article was one of four chosen to receive
the award of “Best Papers from IEEE Computer Architecture Letters”
from 2011, and it will be featured in a special session at
the 18th IEEE International Symposium on High Performance Computer

After the completion of her M.S., Panda accepted a job with Oracle,
where she is currently working as a processor design engineer.