Lexicon

Your code should speak your domain. Does it?
Open any codebase and read the type names, the method names, the variables. If
what you see is OrderManager, PaymentHandler, DataProcessor, UserUtils —
a wall of Manager, Handler, Service, Data — then your code isn't speaking
your domain. It's speaking plumbing. The ubiquitous language your team argues
about in the standup never made it into the source.
Lexicon makes that visible.
What it does
Lexicon walks your workspace, extracts every identifier, splits it on
camelCase / PascalCase / snake_case boundaries, strips the language noise
(err, ctx, string, return, …) and renders what's left as a word cloud.
The biggest words are the words your code actually says most. That's your real,
de-facto ubiquitous language — for better or worse.
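The pipeline above (extract identifiers, split them, drop noise, count) can be sketched in a few lines of Python. This is an illustrative approximation, not Lexicon's implementation: the regexes, the `NOISE` set, and the function names are assumptions made for the example, and a real tool would need per-language rules to skip comments and string literals.

```python
import re
from collections import Counter

# Hypothetical stand-in for the language noise Lexicon strips by default.
NOISE = {"err", "ctx", "string", "return", "func", "class", "static"}

# Anything that looks like an identifier in raw source text (assumption:
# no language-aware parsing, so comments and strings are scanned too).
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Words inside one identifier: an acronym run (HTTP in HTTPServer) or a
# capitalised/lowercase word, each keeping any trailing digits (base64).
WORD = re.compile(r"[A-Z]+(?![a-z])[0-9]*|[A-Z]?[a-z]+[0-9]*")


def split_identifier(identifier):
    """Split on camelCase / PascalCase / snake_case boundaries, lowercased."""
    return [word.lower() for word in WORD.findall(identifier)]


def count_words(source):
    """Count every split word in the source text, skipping NOISE."""
    counts = Counter()
    for identifier in IDENTIFIER.findall(source):
        for word in split_identifier(identifier):
            if word not in NOISE:
                counts[word] += 1
    return counts


sample = "func (m *OrderManager) handlePayment(ctx, err) string { return order_total }"
counts = count_words(sample)
print(counts.most_common(1))  # [('order', 2)]
```

In a word cloud, each word's font size would then scale with its count, which is why `order` would render larger than `manager` or `payment` in this sample.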
Running it
- Open the codebase you want to analyse as a workspace folder.
- Command Palette (Cmd/Ctrl+Shift+P) → Lexicon: Analyse Codebase.
- Read the cloud.
It runs entirely locally. No server, no subprocess, no network calls, no
telemetry. Nothing leaves your machine.
Reading the cloud
Every word is the same colour. The signal is size — the bigger the word,
the more your code says it. There's no built-in verdict on which words are
"good" or "bad", because that's domain-specific: data is plumbing in a payments
service and the whole point in an analytics product; model is a smell in one
codebase and the core noun in an ML one. Lexicon won't pretend to know which is
which — it shows you what your code actually says and trusts you to read it.
So read it. If the loudest words are Manager, Handler, Service, Data,
http, util — mechanism, not meaning — that's your sign the ubiquitous
language never made it into the code. If the loudest words are the nouns and
verbs your domain experts would recognise (invoice, playlist, settlement,
roster), the language is there. The cloud doesn't grade you; it just makes the
state impossible to miss.
When mechanism dominates:
- Rename the worst offenders toward the language your domain experts actually use.
- Ask whether a *Manager or *Service is hiding a real concept that deserves
its own name.
- Treat it as a rough health check, not a score. The goal isn't a pure cloud —
it's that the loudest words are words your business would recognise.
Tuning what's counted — .lexiconignore
Lexicon strips only language grammar by default — keywords, primitive types,
and literals (return, func, class, string, static). It does not
pre-decide that http, fmt, println, slog or generic verbs are noise.
Whether those words are plumbing or signal is a call only your codebase can make,
so the tool surfaces them and lets you decide.
To suppress terms for a specific project, drop a .lexiconignore file in the
workspace root — one term per line, # for comments:
# .lexiconignore — terms this codebase treats as plumbing
http
fmt
errorf
slog
println
base64
# generic CRUD verbs, if they're drowning the domain words
get
set
fetch
Terms are matched against the lowercased, split words, so an http entry also
removes the http part of httpClient. The subtitle reports how many terms were
ignored, so a tidy cloud is never silently curated.
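As a rough sketch of how such a file could be read and applied, the Python below parses the one-term-per-line format and filters already-split words. The function names are hypothetical, and it assumes `#` comments occupy a whole line; whether Lexicon also accepts trailing comments is not stated here.

```python
def parse_ignore(text):
    """Parse .lexiconignore content: one term per line, '#' lines are comments."""
    terms = set()
    for line in text.splitlines():
        term = line.strip()
        if not term or term.startswith("#"):
            continue  # skip blank lines and whole-line comments (assumption)
        terms.add(term.lower())  # terms are compared against lowercased words
    return terms


def filter_words(words, ignored):
    """Drop every split word that appears in the ignore set."""
    return [word for word in words if word not in ignored]


ignore_file = """\
# terms this codebase treats as plumbing
http
fmt
"""
ignored = parse_ignore(ignore_file)

# httpClient has already been split into its lowercased words here.
kept = filter_words(["http", "client", "invoice"], ignored)
print(kept)          # ['client', 'invoice']
print(len(ignored))  # 2
```

Because the filter runs on split words rather than whole identifiers, one `http` line covers `httpClient`, `HTTPServer`, and `http_request` alike.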
A word of caution: a clean-looking cloud you achieved by ignoring everything
isn't a clean codebase. The point is to see the state as it is — reach for
.lexiconignore to remove genuine background hum, not to hide the finding.
Supported languages
Go, TypeScript/JavaScript (incl. JSX/TSX), Python, Java, C#, Ruby, Rust, Kotlin,
Swift, and C/C++.
Built by uRadical.