Residential College | false |
Status | Published |
An FPGA-Based Transformer Accelerator With Parallel Unstructured Sparsity Handling for Question-Answering Applications | |
Cao, Rujian (1); Zhao, Zhongyu (1); Un, Ka Fai (1); Yu, Wei Han (1); Martins, Rui P. (1,2); Mak, Pui In (1) | |
2024-11 | |
Source Publication | IEEE Transactions on Circuits and Systems II-Express Briefs |
ISSN | 1549-7747 |
Volume | 71, Issue: 11, Pages: 4688-4692 |
Abstract | Dataflow management provides limited performance improvement for the transformer model because it offers less weight reuse than the convolutional neural network. The cosFormer reduces computational complexity while achieving performance comparable to the vanilla transformer on natural language processing tasks. However, the unstructured sparsity in the cosFormer makes it challenging to implement efficiently. This brief proposes a parallel unstructured sparsity handling (PUSH) scheme to compute sparse-dense matrix multiplication (SDMM) efficiently. It transforms unstructured sparsity into structured sparsity and reduces the total memory access by balancing the memory accesses of the sparse and dense matrices in the SDMM. We also employ unstructured weight pruning in cooperation with PUSH to further increase the structured sparsity of the model. Verified on an FPGA platform, the proposed accelerator achieves a throughput of 2.82 TOPS and an energy efficiency of 144.8 GOPS/W on the HotpotQA dataset with long sequences. |
Keyword | Sparse Matrices; Computational Modeling; Transformers; Hardware; Energy Efficiency; Circuits; Throughput; Dataflow; Digital Accelerator; Energy-efficient; Field-programmable Gate Array (FPGA); Sparsity; Transformer |
DOI | 10.1109/TCSII.2024.3462560 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Engineering |
WOS Subject | Engineering, Electrical & Electronic |
WOS ID | WOS:001348293900026 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141 |
Scopus ID | 2-s2.0-85204436132 |
Document Type | Journal article |
Collection | Faculty of Science and Technology; The State Key Laboratory of Analog and Mixed-Signal VLSI (University of Macau); Institute of Microelectronics; Department of Electrical and Computer Engineering |
Corresponding Author | Un, Ka Fai |
Affiliation | 1. University of Macau, State-Key Laboratory of Analog and Mixed-Signal VLSI, Institute of Microelectronics, Faculty of Science and Technology - ECE, Macao; 2. Universidade de Lisboa, Instituto Superior Técnico, Portugal |
First Author Affiliation | Faculty of Science and Technology |
Corresponding Author Affiliation | Faculty of Science and Technology |
Recommended Citation GB/T 7714 | Cao, Rujian, Zhao, Zhongyu, Un, Ka Fai, et al. An FPGA-Based Transformer Accelerator With Parallel Unstructured Sparsity Handling for Question-Answering Applications[J]. IEEE Transactions on Circuits and Systems II-Express Briefs, 2024, 71(11), 4688-4692. |
APA | Cao, Rujian, Zhao, Zhongyu, Un, Ka Fai, Yu, Wei Han, Martins, Rui P., & Mak, Pui In (2024). An FPGA-Based Transformer Accelerator With Parallel Unstructured Sparsity Handling for Question-Answering Applications. IEEE Transactions on Circuits and Systems II-Express Briefs, 71(11), 4688-4692. |
MLA | Cao, Rujian, et al. "An FPGA-Based Transformer Accelerator With Parallel Unstructured Sparsity Handling for Question-Answering Applications." IEEE Transactions on Circuits and Systems II-Express Briefs 71.11 (2024): 4688-4692. |