Wafer

Wafer makes GPU kernel work feel like a normal dev loop. Stay inside VS Code / Cursor, profile with Nsight Compute, inspect PTX/SASS, jump to the right docs, and iterate with an AI assistant that explains what to try next.

This is built for engineers who can write CUDA but don’t yet have the “profiler + assembly intuition”, and for teams who want faster iteration without burning GPU hours on work that only needs a CPU.

Why Wafer

GPU performance workflows are still fragmented:

- You profile in one tool and read counters you’re not sure how to prioritize
- You inspect PTX/SASS somewhere else, with little context on what matters
- You bounce between docs, blog posts, and guesses
- If you’re developing remotely, you waste time (and money) keeping a GPU attached while you’re just editing code

Wafer pulls the loop into your editor and makes it repeatable.

What you get

1) Nsight Compute report analysis (NCU)

Open .ncu-rep reports directly in VS Code and get a structured view of what matters:

- Kernel duration, compute/memory throughput, occupancy, register pressure signals
- A clean “what to look at next” summary (instead of a wall of counters)
- Exportable text reports so you can paste results into issues, PRs, or other tools
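
For comparison, a similar text dump can be produced outside the editor with the Nsight Compute CLI. The sketch below only builds the command line and does not run it. It assumes `ncu` is on your PATH; the helper name `ncu_export_command` and the file name `profile.ncu-rep` are hypothetical, and this is not a description of how Wafer produces its own exports.

```python
import shlex
from pathlib import Path


def ncu_export_command(report, page="details", csv=False):
    """Build an `ncu` command that prints a saved report as text.

    `report` is a path to an existing .ncu-rep file (hypothetical here).
    `page` selects the report page: "details" for per-section metrics,
    "raw" for every collected counter.
    """
    if page not in {"details", "raw"}:
        raise ValueError(f"unsupported page: {page!r}")
    cmd = ["ncu", "--import", str(Path(report)), "--page", page]
    if csv:
        # CSV output is easier to paste into spreadsheets or diff in a PR.
        cmd.append("--csv")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so this works without a GPU.
    print(shlex.join(ncu_export_command("profile.ncu-rep", page="raw", csv=True)))
```

Passing the resulting list to `subprocess.run` would execute it on a machine where Nsight Compute is installed.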

2) PTX / SASS viewer

See what your kernel compiled into without leaving your editor:

- Jump from kernel code to generated PTX/SASS
- Spot common issues (memory access patterns, control flow, register pressure hints, instruction mix)
- Keep low-level output tied to the source that produced it
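
If you want to see what the viewer replaces, the manual route goes through `nvcc` and `cuobjdump`. The sketch below builds those command lines without running them. The file name `saxpy.cu`, the default architecture `sm_80`, and the helper names are assumptions for illustration; Wafer may invoke the toolchain differently.

```python
from pathlib import Path


def ptx_command(source, arch="sm_80"):
    """Command that compiles a .cu file to PTX.

    -lineinfo embeds source line information so the PTX can be
    mapped back to the kernel code that produced it.
    """
    out = Path(source).with_suffix(".ptx")
    return ["nvcc", "-ptx", "-lineinfo", f"-arch={arch}", str(source), "-o", str(out)]


def sass_commands(source, arch="sm_80"):
    """Two commands: compile to a cubin, then disassemble it to SASS."""
    cubin = Path(source).with_suffix(".cubin")
    compile_cmd = ["nvcc", "-cubin", "-lineinfo", f"-arch={arch}", str(source), "-o", str(cubin)]
    dump_cmd = ["cuobjdump", "-sass", str(cubin)]
    return [compile_cmd, dump_cmd]


if __name__ == "__main__":
    print(" ".join(ptx_command("saxpy.cu")))
    for cmd in sass_commands("saxpy.cu"):
        print(" ".join(cmd))
```

SASS is specific to the target architecture, so the `arch` value should match the GPU you actually profile on.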

3) GPU docs agent

A docs assistant for when you’re stuck or unsure what a metric or instruction implies:

- CUTLASS and CuTe DSL concepts, layouts, and tensor core paths
- PTX ISA navigation (including modern MMA paths)
- Multi-turn Q&A with citations so you can verify claims

Installation

Option 1: Marketplace (recommended)

- Open VS Code or Cursor
- Go to Extensions (Cmd+Shift+X on macOS, Ctrl+Shift+X on Windows/Linux)
- Search for "Wafer"
- Click Install

Option 2: Install a VSIX

- Download the .vsix from GitHub Releases
- Open VS Code or Cursor
- Open the command palette (Cmd+Shift+P on macOS, Ctrl+Shift+P on Windows/Linux)
- Run: Extensions: Install from VSIX
- Select the downloaded .vsix

Getting started

- Click the Wafer icon in the VS Code activity bar
- Sign in with GitHub
- Pick a tool from the Wafer sidebar dropdown
- Start with either:
  - an existing .ncu-rep report, or
  - a kernel you want to optimize

Requirements

- NCU Analysis: Nsight Compute installed, with the ncu CLI available on your PATH
- GPU Docs Agent: requires the local docs backend (see below)
- IntelliSense Headers: about 100 MB of disk space once extracted (CUDA 13.0.2 + CUTLASS 4.3.2)
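
A quick way to confirm the PATH requirement before opening a report is to look the tools up the same way a shell would. This is a minimal sketch using only the Python standard library; the helper name `missing_tools` and the decision to also check `nvcc` and `cuobjdump` are assumptions, not something Wafer requires you to run.

```python
import shutil


def missing_tools(names):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # ncu is needed for NCU Analysis; nvcc and cuobjdump are checked
    # here only as a convenience for PTX/SASS work (an assumption).
    missing = missing_tools(["ncu", "nvcc", "cuobjdump"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found.")
```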
