Sparse matrix-matrix multiplication (SpMM) is a crucial kernel in various applications, including sparse deep neural networks [1]–[6], graph analytics [7], triangle counting [8], and linear algebra ...
MIT researchers have designed silicon structures that can perform calculations in an electronic device using excess heat instead of electricity. These tiny structures could someday enable more ...
Abstract: In modern machine learning models like Transformers, matrix multiplication dominates most computation. Specific hardware often uses large-scale PE arrays, such as systolic arrays, to ...
This project is intended for research purposes only. Use it at your own risk and discretion. Triton is a language and compiler for writing highly efficient ML primitives, one of the most common ...
This repository contains the artifact for the SC '25 paper submission "KAMI: Communication-Avoiding General Matrix Multiplication within a Single GPU." The NVIDIA GH200 is installed with Ubuntu 22.04 ...