flashmla - Code/tests - AppsGM

Public

tests 8 items

lib.py (16.29 KB, last modified by WebDev)
quant.py (7.97 KB, last modified by WebDev)
ref.py (4.45 KB, last modified by WebDev)
test_flash_mla_dense_decoding.py (8.84 KB, last modified by WebDev)
test_flash_mla_sparse_decoding.py (13.98 KB, last modified by WebDev)
test_flash_mla_sparse_prefill.py (5.55 KB, last modified by WebDev)
test_fmha_sm100.py (7.38 KB, last modified by WebDev)

About

FlashMLA is a collection of highly optimized attention kernels (core code modules) developed by DeepSeek-AI. It is not a user-facing application but a foundational library that powers DeepSeek-AI's large language models, such as DeepSeek-V3 and DeepSeek-V3.2-Exp.

130 files

53 folders

1.13 MB total size

0 open issues

0 open pull requests

0 watchers

0 forks

0 stars

1279 views

Updated Jan 23, 2026

Recent Commits

Initial commit - Upload project 'flashmla'

WebDev committed Jan 23, 2026

Languages

C++ 60.1%

C 20.3%

Python 19.5%

LICENSE 0.2%