Accelerating Matrix Operations with Improved Deeply Pipelined Vector Reduction

Department

Computer Science

Document Type

Article

Publication Date

2-1-2012

Abstract

Many scientific and engineering applications involve matrix operations, in which reduction of vectors is a common operation. If the core operator of the reduction is deeply pipelined, which is usually the case, dependencies between the input data elements cause data hazards. To tackle this problem, we propose a new reduction method with low latency and high pipeline utilization. The performance of the proposed design is evaluated for both single-data-set and multiple-data-set scenarios. Further, QR decomposition is used as an example to demonstrate how the proposed method can accelerate the execution of matrix operations. We implement the design on an FPGA and compare its results with those of other methods.
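
The data hazard mentioned in the abstract can be illustrated with a small software model. The sketch below is not the paper's reduction method, which the abstract does not detail. It shows the textbook workaround of interleaving independent partial sums, under the assumption of an adder that accepts one operation per cycle and returns its result `latency` cycles later. All function names and the cycle model are hypothetical and for illustration only.

```python
def serial_cycles(n, latency):
    """Cycles to sum n values when every add waits for the previous sum.

    Each of the n - 1 additions depends on the result of the one before it,
    so the pipeline holds only one useful operation at a time.
    """
    return max(n - 1, 0) * latency


def interleaved_sum(values, latency):
    """Sum `values` using `latency` independent partial sums (assumed model).

    Input i is added to partial sum i % latency, so an accumulator is reused
    only after `latency` cycles, when its previous add has completed.
    """
    partial = [0.0] * latency
    for i, v in enumerate(values):
        partial[i % latency] += v
    # Fold the partial sums pairwise until a single value remains.
    while len(partial) > 1:
        folded = [partial[k] + partial[k + 1]
                  for k in range(0, len(partial) - 1, 2)]
        if len(partial) % 2:
            folded.append(partial[-1])  # odd element carried to the next level
        partial = folded
    return partial[0]


def interleaved_cycles(n, latency):
    """Rough cycle estimate for interleaved_sum under the assumed model.

    Streaming phase: one add issued per cycle, the last result is ready
    `latency` cycles after the last issue. Fold phase: ceil(log2(latency))
    dependent levels, each charged one full adder latency (issue staggering
    within a level is ignored).
    """
    fold_levels = (latency - 1).bit_length()  # equals ceil(log2(latency))
    return (n - 1 + latency) + fold_levels * latency


if __name__ == "__main__":
    data = list(range(1, 101))
    print(interleaved_sum(data, latency=8))
    print(serial_cycles(len(data), latency=8),
          interleaved_cycles(len(data), latency=8))
```

In this model the interleaved scheme keeps the adder busy while inputs stream in, but it still pays a fold phase at the end and needs one accumulator per pipeline stage, which is the kind of latency and resource overhead that dedicated reduction designs aim to lower.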

Journal Title

IEEE Transactions on Parallel and Distributed Systems

Journal ISSN

1045-9219

Volume

23

Issue

2

First Page

202

Last Page

210

Digital Object Identifier (DOI)

10.1109/TPDS.2011.141
