Doc Cleanup

Clean up noisy document content using configurable regex replace rules. Useful for normalizing timestamps, IDs, or other dynamic values before comparing files.

Features

Clean Documents

Run `Doc Cleanup: Clean Active Document` from the Command Palette. If you have multiple rule groups, you'll be prompted to pick which group to apply (or to apply all of them at once).

Rule Groups

Organize rules into named groups (e.g. "JSON Files", "Logs", "Markdown") and apply them independently. Each group contains its own set of regex replace rules.

Visual Rule Configuration

Run `Doc Cleanup: Configure Rules` to open a tabbed UI for managing rule groups:

- Create, rename, and delete groups
- Add, edit, and delete rules per group

Import/Export

- Import: Load rule groups from a JSON file
- Export: Save current rule groups to a JSON file for backup or sharing

Commands

| Command | Description |
| --- | --- |
| `Doc Cleanup: Clean Active Document` | Apply rules from a selected group (or all groups) to the current editor |
| `Doc Cleanup: Configure Rules` | Open the rule groups configuration UI |

Settings

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| `docCleanup.ruleGroups` | array | `[]` | Groups of regex replace rules. Each group can be applied independently. |

Rule Group Structure

Each group in `docCleanup.ruleGroups` has:

- `name` (required): Name of the group
- `rules` (required): Array of rules, each with:
  - `pattern` (required): Regex pattern string (without leading/trailing slashes)
  - `replacement` (required): Replacement string
  - `description` (optional): Short note about what the rule does
  - `flags` (optional): Regex flags (`g` is added automatically)
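The structure above can be written down as TypeScript types. This is a minimal sketch: the field names come from the settings schema, but the type names `CleanupRule` and `RuleGroup` and the helper `toRegExp` are hypothetical, and the extension's real implementation may differ. The helper shows one way the automatic `g` flag could be handled without duplicating it.

```typescript
// Hypothetical type names; only the field names come from the settings schema.
interface CleanupRule {
  pattern: string;      // regex source, without surrounding slashes
  replacement: string;  // may contain tokens such as $1 or $&
  description?: string; // free-form note
  flags?: string;       // e.g. "i" or "m"; "g" is implied
}

interface RuleGroup {
  name: string;
  rules: CleanupRule[];
}

// Build a RegExp from a rule, appending "g" only when it is missing.
// (Passing a duplicated flag such as "gg" makes the RegExp constructor throw.)
function toRegExp(rule: CleanupRule): RegExp {
  const flags = rule.flags ?? "";
  return new RegExp(rule.pattern, flags.indexOf("g") >= 0 ? flags : flags + "g");
}

const logGroup: RuleGroup = {
  name: "Log Cleanup",
  rules: [{ pattern: "^\\d+:\\s*", replacement: "", flags: "gm" }],
};

const lineNumberRe = toRegExp(logGroup.rules[0]);
console.log(lineNumberRe.global, lineNumberRe.multiline); // true true
```

Under this sketch, `"flags": "m"` and `"flags": "gm"` produce the same expression, so the explicit `g` in a rule is harmless but unnecessary.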

Replacement Tokens

Replacement strings support standard regex substitution tokens:

| Token | Description |
| --------- | ------------------------- |
| `$1`–`$9` | Captured group references |
| `$&` | The entire matched text |
| `` $` `` | Text before the match |
| `$'` | Text after the match |
| `$$` | Literal `$` character |
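These tokens match the ones defined for JavaScript's `String.prototype.replace`. The sketch below assumes the extension applies rules through that method (the README does not say so explicitly) and uses made-up sample strings to show what each token expands to.

```typescript
// Sample input and pattern are illustrative only.
const sample = "id=42";
const pair = /(\w+)=(\d+)/g;

const swapped = sample.replace(pair, "$2:$1");        // numbered groups
const wrapped = sample.replace(pair, "[$&]");         // whole match
const dollar = sample.replace(pair, "$1 costs $$$2"); // "$$" emits a single "$"
const around = "a-b-c".replace(/b/g, "[$`|$']");      // text before | text after the match

console.log(swapped, wrapped, dollar, around);
// 42:id [id=42] id costs $42 a-[a-|-c]-c
```

A replacement that needs a literal dollar sign in front of a digit, such as a price, should use `$$` so it is not read as a group reference.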

Example Rules

Settings JSON

"docCleanup.ruleGroups": [
  {
    "name": "JSON Normalization",
    "rules": [
      {
        "description": "Normalize timestamps",
        "pattern": "\"timestamp\":\\s*\"[^\"]+\"",
        "replacement": "\"timestamp\": \"<TIMESTAMP>\""
      },
      {
        "description": "Normalize UUIDs",
        "pattern": "\"id\":\\s*\"[0-9a-fA-F-]{36}\"",
        "replacement": "\"id\": \"<UUID>\"",
        "flags": "i"
      }
    ]
  },
  {
    "name": "Log Cleanup",
    "rules": [
      {
        "description": "Remove line numbers",
        "pattern": "^\\d+:\\s*",
        "replacement": "",
        "flags": "gm"
      }
    ]
  }
]

Using Capture Groups

Swap key-value pairs:

{
  "description": "Swap key=value to value=key",
  "pattern": "(\\w+)=(\\w+)",
  "replacement": "$2=$1"
}
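To see what this rule does to real text, the sketch below applies it with plain JavaScript regex calls, assuming that is how the extension evaluates rules. The sample inputs are invented. Note that the doubled backslashes in the JSON (`\\w`) decode to a single backslash, so the regex engine sees `(\w+)=(\w+)`.

```typescript
// Same escaping as in settings JSON: "\\w" in the literal is "\w" in the actual string.
const swapRule = { pattern: "(\\w+)=(\\w+)", replacement: "$2=$1" };
const swapRe = new RegExp(swapRule.pattern, "g");

console.log(swapRule.pattern.length); // 11, i.e. (\w+)=(\w+)

const simple = "host=local port=8080".replace(swapRe, swapRule.replacement);
console.log(simple); // local=host 8080=port

// \w does not match "-", so only the part adjacent to "=" is captured.
const partial = "retry-count=3".replace(swapRe, swapRule.replacement);
console.log(partial); // retry-3=count
```

If keys or values can contain hyphens or dots, a wider character class such as `[\\w.-]+` is one option for the capture groups.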

Import/Export Format

Exported files use this format:

{
  "ruleGroups": [
    {
      "name": "Group Name",
      "rules": [
        {
          "description": "Rule 1",
          "pattern": "...",
          "replacement": "..."
        }
      ]
    }
  ]
}

Import also supports legacy formats: a flat array of rules, or an object with a `rules` property.
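One way to picture the three accepted shapes is a small normalizer that always returns an array of groups. This is only a sketch: the function `normalizeImport`, the type names, and the placeholder group name `"Imported"` are assumptions made for illustration, and the README does not say which group name the extension assigns to legacy rules.

```typescript
interface ImportedRule {
  pattern: string;
  replacement: string;
  description?: string;
  flags?: string;
}

interface ImportedGroup {
  name: string;
  rules: ImportedRule[];
}

// Hypothetical helper: maps current and legacy import shapes to a list of groups.
function normalizeImport(data: unknown, fallbackName: string = "Imported"): ImportedGroup[] {
  if (Array.isArray(data)) {
    // Legacy: flat array of rules
    return [{ name: fallbackName, rules: data as ImportedRule[] }];
  }
  if (typeof data === "object" && data !== null) {
    const obj = data as { ruleGroups?: ImportedGroup[]; rules?: ImportedRule[] };
    if (Array.isArray(obj.ruleGroups)) {
      return obj.ruleGroups; // Current export format
    }
    if (Array.isArray(obj.rules)) {
      // Legacy: object with a "rules" property
      return [{ name: fallbackName, rules: obj.rules }];
    }
  }
  throw new Error("Unrecognized import format");
}

const fromFlat = normalizeImport(JSON.parse('[{"pattern":"a","replacement":"b"}]'));
console.log(fromFlat[0].name, fromFlat[0].rules.length); // Imported 1
```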

Notes

- Rules run in order; each rule applies to the output of the previous one.
- Invalid rules are skipped and reported in the Doc Cleanup output channel.
- The `g` (global) flag is added automatically to all patterns, so listing it in `flags` is optional.
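The behavior described in these notes can be sketched as a simple loop. The function `applyRules` and its `report` callback are hypothetical stand-ins (the callback plays the role of the output channel); this is not the extension's actual code.

```typescript
interface Rule {
  pattern: string;
  replacement: string;
  flags?: string;
}

// Apply rules sequentially; a rule whose pattern or flags fail to compile is
// reported and skipped, and the remaining rules still run.
function applyRules(text: string, rules: Rule[], report: (msg: string) => void): string {
  let result = text;
  for (const rule of rules) {
    try {
      const flags = rule.flags ?? "";
      const re = new RegExp(rule.pattern, flags.indexOf("g") >= 0 ? flags : flags + "g");
      result = result.replace(re, rule.replacement);
    } catch (err) {
      report("Skipping invalid rule /" + rule.pattern + "/: " + String(err));
    }
  }
  return result;
}

const skipped: string[] = [];
const output = applyRules(
  "12: id=abc\n13: id=def",
  [
    { pattern: "^\\d+:\\s*", replacement: "", flags: "m" },        // strip line numbers
    { pattern: "(", replacement: "" },                             // invalid: unterminated group
    { pattern: "^id=\\w+$", replacement: "id=<ID>", flags: "m" },  // relies on the first rule's output
  ],
  (msg) => { skipped.push(msg); }
);

console.log(JSON.stringify(output)); // "id=<ID>\nid=<ID>"
console.log(skipped.length);         // 1
```

The third rule only matches because the first one has already removed the line-number prefixes, which is why rule order matters within a group.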

Development

npm run compile    # Build
npm test           # Run tests
npm run test:watch # Run tests in watch mode
npm run package    # Build and package .vsix

sakinis