⚠️ DEPRECATED — This repo has been merged into openclaw-local-asr-skill. Please use the unified repo for all future updates. This repo is kept for reference only.
flash-attention-with-sink implements an attention variant used in GPT-OSS 20B that integrates a "sink" step into FlashAttention. This repo focuses on the forward path and provides an experimental ...
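As a rough illustration of the "sink" idea, here is a minimal, naive (non-flash) reference sketch: a learned per-head sink logit joins the softmax normalization but contributes no value, so some attention mass can be absorbed and discarded. The function name, the scalar `sink_logit` argument, and the shapes are assumptions for illustration, not this repo's actual kernel interface.

```python
import numpy as np

def attention_with_sink(q, k, v, sink_logit):
    """Naive reference for sink attention (hypothetical API, not the repo's).

    The sink logit is appended as an extra softmax column; its probability
    mass is dropped, so output rows can sum to less than 1.
    Shapes: q (Tq, d), k and v (Tk, d); sink_logit is a scalar.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                        # (Tq, Tk)
    # Append the sink logit as one extra column before the softmax.
    sink_col = np.full((scores.shape[0], 1), float(sink_logit))
    s = np.concatenate([scores, sink_col], axis=-1)
    s = s - s.max(axis=-1, keepdims=True)                # numerical stability
    p = np.exp(s)
    p = p / p.sum(axis=-1, keepdims=True)
    # Drop the sink column: its mass is absorbed, not attended to any value.
    return p[:, :-1] @ v
```

With `sink_logit` driven to negative infinity this reduces to ordinary softmax attention; a large positive sink shrinks every output row toward zero, which is the mechanism's point.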