Doc Cleanup
Clean up noisy document content using configurable regex replace rules. Useful for normalizing timestamps, IDs, or other dynamic values before comparing files.
Features
Clean Documents
Run Doc Cleanup: Clean Active Document from the Command Palette. If you have multiple rule groups, you'll be prompted to pick which group to apply (or apply all at once).
Rule Groups
Organize rules into named groups (e.g. "JSON Files", "Logs", "Markdown") and apply them independently. Each group contains its own set of regex replace rules.
Visual Rule Configuration
Run Doc Cleanup: Configure Rules to open a tabbed UI for managing rule groups:
- Create, rename, and delete groups
- Add, edit, and delete rules per group
Import/Export
- Import: Load rule groups from a JSON file
- Export: Save current rule groups to a JSON file for backup or sharing
Commands
| Command | Description |
| --------- | ------------------------- |
| Doc Cleanup: Clean Active Document | Apply rules from a selected group (or all groups) to the current editor |
| Doc Cleanup: Configure Rules | Open the rule groups configuration UI |
Settings
| Setting | Type | Default | Description |
| --------- | --------- | --------- | ------------------------- |
| docCleanup.ruleGroups | array | [] | Groups of regex replace rules. Each group can be applied independently. |
Rule Group Structure
Each group in docCleanup.ruleGroups has:
- name (required): Name of the group
- rules (required): Array of rules, each with:
  - pattern (required): Regex pattern string (without leading/trailing slashes)
  - replacement (required): Replacement string
  - description (optional): Short note about what the rule does
  - flags (optional): Regex flags (g is added automatically)
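The structure above could be modeled in TypeScript roughly as follows. This is a sketch only: the names CleanupRule, RuleGroup, and isRuleGroup are hypothetical and not taken from the extension's source; only the field names come from the list above.

```typescript
// Hypothetical type names; the field names mirror the documented structure.
interface CleanupRule {
  pattern: string;      // regex source, without leading/trailing slashes
  replacement: string;  // replacement string (may contain $1, $&, ...)
  description?: string; // optional note about what the rule does
  flags?: string;       // optional regex flags; g is added automatically
}

interface RuleGroup {
  name: string;
  rules: CleanupRule[];
}

// Type guard that checks only the required fields of a group and its rules.
function isRuleGroup(value: unknown): value is RuleGroup {
  if (typeof value !== "object" || value === null) return false;
  const group = value as { name?: unknown; rules?: unknown };
  if (typeof group.name !== "string" || !Array.isArray(group.rules)) return false;
  return group.rules.every((rule: unknown) => {
    if (typeof rule !== "object" || rule === null) return false;
    const r = rule as Record<string, unknown>;
    return typeof r.pattern === "string" && typeof r.replacement === "string";
  });
}
```

A guard like this can be handy for checking a hand-edited settings file before relying on it.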
Replacement Tokens
Replacement strings support standard regex substitution tokens:
| Token | Description |
| --------- | ------------------------- |
| $1–$9 | Captured group references |
| $& | The entire matched text |
| $` | Text before the match |
| $' | Text after the match |
| $$ | Literal $ character |
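These tokens match the ones understood by JavaScript's String.prototype.replace. Assuming the extension hands replacement strings to that method (not confirmed here), each token behaves as in this small sketch; the sample inputs are made up for illustration.

```typescript
// Sample input and pattern, chosen only for illustration.
const pair = "id=42";
const pairPattern = /(\w+)=(\d+)/g;

// $1 and $2 refer to the first and second capture groups.
const withGroups = pair.replace(pairPattern, "$1: $2"); // "id: 42"

// $& is the entire matched text.
const wrapped = pair.replace(pairPattern, "[$&]"); // "[id=42]"

// $$ produces a literal dollar sign.
const withDollar = pair.replace(pairPattern, "$1=$$$2"); // "id=$42"

// $` is the text before the match, $' is the text after it.
const dashed = "a-b-c";
const before = dashed.replace(/b/g, "($`)"); // "a-(a-)-c"
const after = dashed.replace(/b/g, "($')"); // "a-(-c)-c"
```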
Example Rules
Settings JSON
"docCleanup.ruleGroups": [
{
"name": "JSON Normalization",
"rules": [
{
"description": "Normalize timestamps",
"pattern": "\"timestamp\":\\s*\"[^\"]+\"",
"replacement": "\"timestamp\": \"<TIMESTAMP>\""
},
{
"description": "Normalize UUIDs",
"pattern": "\"id\":\\s*\"[0-9a-fA-F-]{36}\"",
"replacement": "\"id\": \"<UUID>\"",
"flags": "i"
}
]
},
{
"name": "Log Cleanup",
"rules": [
{
"description": "Remove line numbers",
"pattern": "^\\d+:\\s*",
"replacement": "",
"flags": "gm"
}
]
}
]
Using Capture Groups
Swap key-value pairs:
{
"description": "Swap key=value to value=key",
"pattern": "(\\w+)=(\\w+)",
"replacement": "$2=$1"
}
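To see what this rule does to a line of text, here is a sketch that builds the regex by hand with the g flag and applies it to a made-up input. It assumes JavaScript regex semantics; the variable names are illustrative.

```typescript
// The rule from above, written as a TypeScript object.
const swapRule = {
  description: "Swap key=value to value=key",
  pattern: "(\\w+)=(\\w+)",
  replacement: "$2=$1",
};

// The g flag is added here by hand so that every pair on the line is swapped.
const swapRegex = new RegExp(swapRule.pattern, "g");
const swappedLine = "host=local port=8080".replace(swapRegex, swapRule.replacement);
// swappedLine is "local=host 8080=port"
```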
Exported files use this format:
{
"ruleGroups": [
{
"name": "Group Name",
"rules": [
{
"description": "Rule 1",
"pattern": "...",
"replacement": "..."
}
]
}
]
}
Import also supports legacy formats: a flat array of rules, or an object with a rules property.
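One way to accept all three shapes is to normalize them into a list of groups before use. The sketch below is not the extension's actual import code: normalizeImport and the fallback group name "Imported Rules" are hypothetical, and the README does not say how legacy rules are named or grouped.

```typescript
interface CleanupRule {
  pattern: string;
  replacement: string;
  description?: string;
  flags?: string;
}

interface RuleGroup {
  name: string;
  rules: CleanupRule[];
}

// Hypothetical name for the group that wraps rules from a legacy file.
const LEGACY_GROUP_NAME = "Imported Rules";

function normalizeImport(data: unknown): RuleGroup[] {
  // Legacy format: a flat array of rules.
  if (Array.isArray(data)) {
    return [{ name: LEGACY_GROUP_NAME, rules: data as CleanupRule[] }];
  }
  if (typeof data === "object" && data !== null) {
    const obj = data as { ruleGroups?: unknown; rules?: unknown };
    // Current format: an object with a ruleGroups array.
    if (Array.isArray(obj.ruleGroups)) {
      return obj.ruleGroups as RuleGroup[];
    }
    // Legacy format: an object with a rules property.
    if (Array.isArray(obj.rules)) {
      return [{ name: LEGACY_GROUP_NAME, rules: obj.rules as CleanupRule[] }];
    }
  }
  throw new Error("Unrecognized import format");
}
```

Calling normalizeImport(JSON.parse(fileContents)) would then yield the same group list regardless of which format the file used.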
Notes
- Rules run in order; each rule applies to the output of the previous one
- Invalid rules are skipped and reported in the Doc Cleanup output channel
- The g (global) flag is added automatically to all patterns
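The behavior in these notes can be sketched as a small function. This is an illustration under assumptions, not the extension's implementation: applyRules and its report callback are hypothetical names, and the callback stands in for the Doc Cleanup output channel.

```typescript
interface CleanupRule {
  pattern: string;
  replacement: string;
  description?: string;
  flags?: string;
}

// Applies rules in order; each rule sees the output of the previous one.
function applyRules(
  text: string,
  rules: CleanupRule[],
  report: (message: string) => void = console.warn
): string {
  let result = text;
  for (const rule of rules) {
    let regex: RegExp;
    try {
      // Add the g flag unless the rule already specifies it.
      const flags = rule.flags ?? "";
      regex = new RegExp(rule.pattern, flags.indexOf("g") === -1 ? flags + "g" : flags);
    } catch (err) {
      // Invalid rules are skipped and reported instead of aborting the run.
      report(`Skipping invalid rule "${rule.description ?? rule.pattern}": ${String(err)}`);
      continue;
    }
    result = result.replace(regex, rule.replacement);
  }
  return result;
}
```

Because each rule works on the previous rule's output, ordering matters: a rule that rewrites a to b followed by one that rewrites b to c turns every a into c.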
Development
npm run compile # Build
npm test # Run tests
npm run test:watch # Run tests in watch mode
npm run package # Build and package .vsix