Individual wheel

Flash Attention
for Google Colab

The most painful wheel to compile, now prebuilt and ready to install. Get both flash-attn 2.7.3 and 2.8.3, guaranteed to work out of the box on A100 and L4 with Colab's current stack (Python 3.12, CUDA 12.8, PyTorch 2.10).

One-time purchase
Both versions included

Buy Flash Attention — $5 or get the full pack — $17

✓ v2.7.3 + v2.8.3 ✓ A100 · L4 ✓ Instant token ✓ No compiling

🖼️ Try it with Z-Image Turbo notebook →

Flash Attention — Prebuilt for Colab A100 and L4

Two versions included

Pick the version you need.

Both flash-attn 2.7.3 and 2.8.3 are included with your purchase. Install whichever your project requires — or both.

v2.8.3

flash-attn 2.8.3

Latest stable release. Full Flash Attention 2 with improved kernel dispatch, better memory efficiency, and support for GQA and MQA patterns. Ideal for newer models like Z-Image, SDXL, and Flux.

A100 (SM80) · L4 (SM89)
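
To illustrate the GQA and MQA support mentioned above: in grouped-query attention several query heads share one key/value head, so the query head count must be a multiple of the key/value head count (MQA is the case with a single key/value head). A rough sketch using `flash_attn_func` from the flash-attn package; the tensor sizes are arbitrary illustration values and `gqa_group_size` is a hypothetical helper, not part of flash-attn. The GPU part only runs when flash-attn and a CUDA device are present.

```python
def gqa_group_size(n_q_heads, n_kv_heads):
    """Number of query heads that share each key/value head."""
    if n_q_heads <= 0 or n_kv_heads <= 0 or n_q_heads % n_kv_heads != 0:
        raise ValueError("query heads must be a positive multiple of key/value heads")
    return n_q_heads // n_kv_heads

print(gqa_group_size(32, 8))  # GQA: groups of query heads share a KV head
print(gqa_group_size(32, 1))  # MQA: every query head shares the single KV head

try:
    import torch
    from flash_attn import flash_attn_func
    have_gpu = torch.cuda.is_available()
except ImportError:
    # flash-attn or torch is not installed in this environment.
    have_gpu = False

if have_gpu:
    batch, seqlen, head_dim = 2, 1024, 64  # arbitrary illustration sizes
    n_q_heads, n_kv_heads = 32, 8
    # flash-attn expects (batch, seqlen, heads, head_dim) tensors in fp16 or bf16 on the GPU.
    q = torch.randn(batch, seqlen, n_q_heads, head_dim, device="cuda", dtype=torch.float16)
    k = torch.randn(batch, seqlen, n_kv_heads, head_dim, device="cuda", dtype=torch.float16)
    v = torch.randn_like(k)
    out = flash_attn_func(q, k, v, causal=True)
    print(out.shape)  # same shape as q
```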

v2.7.3

flash-attn 2.7.3

Widely used stable release. Compatible with the broadest range of models and diffusers versions. Perfect for TRELLIS.2, Stable Diffusion, and workflows that pin to 2.7.x.

A100 (SM80) · L4 (SM89)

Installation

One line. Sixty seconds.

After purchase, you'll receive a personal token. Use it to install directly in your Colab notebook.

Install v2.8.3 (latest)

# Set your token
import os
os.environ['MISSING_LINK_TOKEN'] = "ml_YOUR_TOKEN"
TOKEN = os.environ['MISSING_LINK_TOKEN']

# Install flash-attn 2.8.3
!pip install --no-deps "https://{TOKEN}@missinglink.build/wheel/flash_attn-2.8.3-cp312-cp312-linux_x86_64.whl"

Install v2.7.3

# Set your token
import os
os.environ['MISSING_LINK_TOKEN'] = "ml_YOUR_TOKEN"
TOKEN = os.environ['MISSING_LINK_TOKEN']

# Install flash-attn 2.7.3
!pip install --no-deps "https://{TOKEN}@missinglink.build/wheel/flash_attn-2.7.3-cp312-cp312-linux_x86_64.whl"
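
The wheel filenames in the commands above follow the standard wheel naming scheme, so their tags can be compared against the running interpreter before installing. A minimal sketch using only the standard library; `parse_wheel_filename` and `matches_interpreter` are hypothetical helper names, not part of pip or of the MissingLink tooling.

```python
import sys

def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[:-len(".whl")].split("-")
    # Wheels are named {name}-{version}[-{build}]-{python}-{abi}-{platform}.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }

def matches_interpreter(tags, version_info=None):
    """Return True if a CPython tag such as 'cp312' matches the interpreter version."""
    major, minor = (version_info or sys.version_info)[:2]
    return tags["python"] == f"cp{major}{minor}"

tags = parse_wheel_filename("flash_attn-2.8.3-cp312-cp312-linux_x86_64.whl")
print(tags)
print("matches this interpreter:", matches_interpreter(tags))
```

If the last line prints `False`, the Colab runtime is not on Python 3.12 and pip would refuse the wheel anyway; checking first just gives a clearer message.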

Spec	Guaranteed
GPU	A100, L4
Platform	Google Colab (linux x86_64)
Python	3.12
CUDA	12.8
PyTorch	2.10
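
A quick preflight check can confirm that a Colab session matches the spec table before installing. This is a sketch under stated assumptions: `check_stack` and `EXPECTED` are hypothetical names, the expected values are copied from the table above, and the comparison assumes `torch.__version__` has the usual `major.minor.patch+cuXXX` shape.

```python
import sys

# Values copied from the spec table above.
EXPECTED = {
    "python": (3, 12),
    "cuda": "12.8",
    "torch": "2.10",
    # Compute capability (major, minor): SM80 is the A100, SM89 is the L4.
    "capabilities": {(8, 0): "A100", (8, 9): "L4"},
}

def check_stack(python_version, cuda_version, torch_version, capability):
    """Return a list of human-readable mismatches (an empty list means all good)."""
    problems = []
    if tuple(python_version[:2]) != EXPECTED["python"]:
        problems.append(f"Python {python_version[0]}.{python_version[1]}, expected 3.12")
    if cuda_version != EXPECTED["cuda"]:
        problems.append(f"CUDA {cuda_version}, expected {EXPECTED['cuda']}")
    # Drop any local suffix such as '+cu128' and compare major.minor only.
    torch_mm = ".".join(torch_version.split("+")[0].split(".")[:2])
    if torch_mm != EXPECTED["torch"]:
        problems.append(f"PyTorch {torch_version}, expected {EXPECTED['torch']}.x")
    if tuple(capability) not in EXPECTED["capabilities"]:
        problems.append(f"GPU compute capability {tuple(capability)}, expected SM80 or SM89")
    return problems

try:
    import torch
except ImportError:
    torch = None

if torch is not None and torch.cuda.is_available():
    problems = check_stack(
        sys.version_info,
        torch.version.cuda,
        torch.__version__,
        torch.cuda.get_device_capability(),
    )
    print("\n".join(problems) or "Runtime matches the spec table.")
else:
    print("No CUDA-enabled PyTorch found; select an A100 or L4 runtime first.")
```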

🖼️ Featured notebook

Try it with Z-Image Turbo

State-of-the-art text-to-image generation. Flash Attention 2.8.3 powers the efficient inference — generate 1024×1024 images in seconds on an L4.

🖼️ Text → Image

⚡ ~5s on L4

🔬 Z-Image

Open Z-Image Turbo in Colab · Buy Flash Attention ($5)

The problem

Why not just compile it yourself?

You can try. But flash-attn is notoriously one of the hardest CUDA packages to build from source on Colab.

✕ Compiling from source

30–90 minutes of GPU time wasted per build. Requires an exactly matching CUDA toolkit, torch version, and gcc. Frequently fails with cryptic errors. Each new Colab session starts over.

✓ MissingLink wheel

Installs in under 60 seconds. Prebuilt against Colab's exact CUDA 12.8 + PyTorch 2.10 stack. Works every time. One pip install, zero config.

Skip the build.
Ship faster.

Flash Attention 2.7.3 and 2.8.3, prebuilt for Colab's A100 and L4 GPUs. Five dollars. Zero compiling.

Buy Flash Attention ($5) · Get the full Survival Pack ($17)

One-time payment via Stripe · No account required · Instant token delivery
