Scalable Matrix Decompositions with Multiple Cores on FPGAs

Department

Computer Science

Document Type

Article

Publication Date

11-1-2013

Abstract

Hardware accelerators are becoming increasingly important in heterogeneous systems for many applications, including those that employ matrix decompositions. In recent years, a class of tiled matrix decomposition algorithms has been proposed for out-of-memory computations and multi-core architectures, including GPU-based heterogeneous systems. However, such scalable solutions for large matrices are rarely found on FPGAs. In this paper, we use the latest tiled decomposition algorithms from high-performance linear algebra for off-chip memory access, and loop mapping onto multiple processing cores for on-chip computation, to perform scalable, high-performance QR and LU matrix decompositions on FPGAs.

Journal Title

Microprocessors and Microsystems: Embedded Hardware Design

Journal ISSN

0141-9331

Volume

37

Issue

8

First Page

887

Last Page

898

Digital Object Identifier (DOI)

10.1016/j.micpro.2012.06.008
