Public
dense
9 items
Name                        Size      Modified      Type    Last modified by
collective                  -         Jan 23, 2026  FOLDER  WebDev
common                      -         Jan 23, 2026  FOLDER  WebDev
device                      -         Jan 23, 2026  FOLDER  WebDev
kernel                      -         Jan 23, 2026  FOLDER  WebDev
fmha_cutlass_bwd_sm100.cu   3.74 KB   Jan 23, 2026  CU      WebDev
fmha_cutlass_bwd_sm100.cuh  8.91 KB   Jan 23, 2026  CUH     WebDev
fmha_cutlass_fwd_sm100.cu   3.61 KB   Jan 23, 2026  CU      WebDev
fmha_cutlass_fwd_sm100.cuh  13.37 KB  Jan 23, 2026  CUH     WebDev
interface.h                 876 B     Jan 23, 2026  H       WebDev
About
FlashMLA is a collection of highly optimized attention kernels (core code modules) developed by DeepSeek-AI. It is not a user-facing application but a foundational library that powers DeepSeek-AI's large language models, such as DeepSeek-V3 and DeepSeek-V3.2-Exp.
130 files
53 folders
1.13 MB total size
Updated Jan 23, 2026
Languages
C++: 60.1%
C: 20.3%
Python: 19.5%
LICENSE: 0.2%