Public
tests
8 items
Name                               Size      Modified      Type    Last modified by
kernelkit                          -         Jan 23, 2026  FOLDER  WebDev
lib.py                             16.29 KB  Jan 23, 2026  PY      WebDev
quant.py                           7.97 KB   Jan 23, 2026  PY      WebDev
ref.py                             4.45 KB   Jan 23, 2026  PY      WebDev
test_flash_mla_dense_decoding.py   8.84 KB   Jan 23, 2026  PY      WebDev
test_flash_mla_sparse_decoding.py  13.98 KB  Jan 23, 2026  PY      WebDev
test_flash_mla_sparse_prefill.py   5.55 KB   Jan 23, 2026  PY      WebDev
test_fmha_sm100.py                 7.38 KB   Jan 23, 2026  PY      WebDev
About
FlashMLA is a collection of highly optimized attention kernels (core code modules) developed by DeepSeek-AI. It is not a user-facing application but a foundational library that powers DeepSeek's large language models, such as DeepSeek-V3 and DeepSeek-V3.2-Exp.
130 files
53 folders
1.13 MB total size
0 open issues
0 open pull requests
0 watchers
0 forks
0 stars
118 views
Updated Jan 23, 2026
Languages
C++ 60.1%
C 20.3%
Python 19.5%
LICENSE 0.2%