PROJECT TITLE:
Adaptive Cache and Concurrency Allocation on GPGPUs
Memory bandwidth is critical to GPGPU performance. Exploiting locality in caches makes better use of that bandwidth. However, memory requests issued by an excessive number of threads cause cache thrashing and saturate memory bandwidth, degrading performance. In this paper, we propose adaptive cache and concurrency allocation (CCA) to prevent cache thrashing and improve the utilization of bandwidth and computational resources, hence improving performance. According to the locality and reuse distance of the access patterns in a GPGPU program, the warps on a streaming multiprocessor are dynamically divided into three groups: cached, bypassed, and waiting. The data cache accommodates the footprint of the cached warps. Bypassed warps cannot allocate cache lines in the data cache, which prevents cache thrashing, but they can still take advantage of available memory bandwidth and computational resources. Waiting warps are de-scheduled. Experimental results show that adaptive CCA can significantly improve benchmark performance, with an 80 percent harmonic mean IPC improvement over the baseline.
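The three-way division of warps can be illustrated with a small sketch. This is not the paper's allocation algorithm, which the abstract does not spell out; it is a simple greedy policy written only to make the cached / bypassed / waiting distinction concrete. The names `Warp`, `Group`, `allocate_warps`, `footprint_lines`, and `bandwidth_demand`, along with the idea of a fixed bandwidth budget for bypassed warps, are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Group(Enum):
    CACHED = "cached"      # may allocate lines in the data cache
    BYPASSED = "bypassed"  # runs, but its requests skip cache allocation
    WAITING = "waiting"    # de-scheduled


@dataclass
class Warp:
    warp_id: int
    footprint_lines: int     # hypothetical: cache lines the warp reuses
    bandwidth_demand: float  # hypothetical: share of memory bandwidth if bypassed


def allocate_warps(warps, cache_lines, bandwidth_budget):
    """Greedy sketch (assumed policy): fill the cache first, then the
    bandwidth budget, and de-schedule whatever is left over."""
    groups = {}
    free_lines = cache_lines
    free_bandwidth = bandwidth_budget
    for warp in warps:
        if warp.footprint_lines <= free_lines:
            # The footprint still fits, so caching this warp cannot thrash.
            groups[warp.warp_id] = Group.CACHED
            free_lines -= warp.footprint_lines
        elif warp.bandwidth_demand <= free_bandwidth:
            # The cache is full, but spare bandwidth lets the warp run uncached.
            groups[warp.warp_id] = Group.BYPASSED
            free_bandwidth -= warp.bandwidth_demand
        else:
            # Neither resource is available: keep the warp off the scheduler.
            groups[warp.warp_id] = Group.WAITING
    return groups
```

In this sketch the cache capacity bounds the cached group and the bandwidth budget bounds the bypassed group, so adding more warps beyond both limits only grows the waiting group. An adaptive scheme would re-run such a division as the measured locality and reuse distance change during execution.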