Sentinel AI Safety - VS Code Extension

AI safety guardrails for LLM prompts using the THSP protocol (Truth, Harm, Scope, Purpose).

Sentinel VS Code Marketplace

Features

Two Analysis Modes

| Mode | Method | Accuracy | Requires |
|---|---|---|---|
| Semantic (recommended) | LLM-based analysis | High (~90%) | API key (OpenAI or Anthropic) |
| Heuristic (fallback) | Pattern matching | Limited (~50%) | None |

For accurate results, configure an LLM API key. Heuristic mode relies on pattern matching, which produces significant false positives and false negatives.

Real-time Safety Linting

The extension automatically detects potentially unsafe patterns in your prompts:

  • Jailbreak attempts: "ignore previous instructions", persona switches
  • Harmful content: weapons, hacking, malware references
  • Deception patterns: fake documents, impersonation
  • Purposeless actions: requests lacking legitimate benefit

Commands

| Command | Description |
|---|---|
| Sentinel: Analyze Selection for Safety | Analyze the selected text using the THSP protocol |
| Sentinel: Analyze File for Safety | Analyze the entire file |
| Sentinel: Insert Alignment Seed | Insert the standard seed (~1,400 tokens) |
| Sentinel: Insert Minimal Alignment Seed | Insert the minimal seed (~450 tokens) |
| Sentinel: Set OpenAI API Key (Secure) | Store the OpenAI API key securely |
| Sentinel: Set Anthropic API Key (Secure) | Store the Anthropic API key securely |
| Sentinel: Show Status | Show the current analysis mode and provider |

The THSP Protocol

Every request is evaluated through four gates:

| Gate | Question |
|---|---|
| Truth | Does this involve deception? |
| Harm | Could this cause harm? |
| Scope | Is this within boundaries? |
| Purpose | Does this serve legitimate benefit? |

All four gates must pass for content to be considered safe.

Configuration

Recommended: Enable Semantic Analysis

For accurate analysis, configure an LLM API key using the secure method:

  1. Open Command Palette (Ctrl+Shift+P or Cmd+Shift+P)
  2. Run Sentinel: Set OpenAI API Key (Secure) or Sentinel: Set Anthropic API Key (Secure)
  3. Enter your API key (stored encrypted in VS Code's SecretStorage)

Alternatively, you can set keys in VS Code Settings (less secure - stored in plaintext).

Supported Providers

Currently supported:

  • OpenAI (GPT-4o, GPT-4o-mini, etc.)
  • Anthropic (Claude 3 Haiku, Sonnet, Opus)

Planned for future versions:

  • Azure OpenAI (enterprise)
  • Ollama (local/free)
  • OpenAI-compatible endpoints (Groq, Together AI, etc.)

All Settings

| Setting | Default | Description |
|---|---|---|
| `sentinel.enableRealTimeLinting` | `true` | Enable real-time safety linting |
| `sentinel.seedVariant` | `standard` | Default seed variant (`minimal`/`standard`) |
| `sentinel.highlightUnsafePatterns` | `true` | Highlight unsafe patterns |
| `sentinel.llmProvider` | `openai` | LLM provider (`openai`/`anthropic`) |
| `sentinel.openaiApiKey` | `""` | OpenAI API key (use the secure command instead) |
| `sentinel.openaiModel` | `gpt-4o-mini` | OpenAI model to use |
| `sentinel.anthropicApiKey` | `""` | Anthropic API key (use the secure command instead) |
| `sentinel.anthropicModel` | `claude-3-haiku-20240307` | Anthropic model to use |
| `sentinel.useSentinelApi` | `false` | Use the Sentinel API for analysis |
| `sentinel.apiEndpoint` | `https://api.sentinelseed.dev/api/v1/guard` | Sentinel API endpoint |
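These settings can also be set directly in `settings.json`. For example, a configuration that keeps real-time linting on, switches the provider to Anthropic, and defaults to the minimal seed might look like this (the chosen values are illustrative):

```json
{
  "sentinel.enableRealTimeLinting": true,
  "sentinel.llmProvider": "anthropic",
  "sentinel.anthropicModel": "claude-3-haiku-20240307",
  "sentinel.seedVariant": "minimal"
}
```

Note that the API key itself is deliberately absent: use the secure command rather than the plaintext key settings.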

Usage Examples

Checking Prompts for Safety Issues

  1. Select the text you want to analyze
  2. Right-click and choose "Sentinel: Analyze Selection for Safety"
  3. View the THSP gate results with confidence level

Understanding Analysis Results

The extension shows:

  • Method: Semantic (LLM) or Heuristic (pattern matching)
  • Confidence: How reliable the analysis is
  • Gate results: Pass/fail for each THSP gate
  • Issues: Specific concerns detected
  • Reasoning: Explanation (semantic mode only)

Severity Levels

  • 🔴 Error: High-risk patterns (weapons, safety bypass)
  • 🟡 Warning: Potential issues (jailbreak attempts)
  • 🔵 Information: Consider reviewing
  • 💡 Hint: Suggestions (missing Sentinel seed)

Semantic vs Heuristic Analysis

Semantic Analysis (Recommended)

Uses an LLM to understand content contextually:

  • ✅ Understands context ("hack my productivity" vs malicious hacking)
  • ✅ Detects paraphrased harmful content
  • ✅ Provides reasoning for decisions
  • ✅ ~90% confidence

Heuristic Analysis (Fallback)

Uses pattern matching for basic detection:

  • ⚠️ May flag legitimate content (false positives)
  • ⚠️ May miss paraphrased threats (false negatives)
  • ⚠️ No contextual understanding
  • ⚠️ ~50% confidence

Supported Languages

  • Markdown
  • Plain text
  • Python
  • JavaScript/TypeScript
  • JSON
  • YAML

Links

  • Sentinel Website
  • Documentation
  • GitHub
  • PyPI Package
  • npm Package

License

MIT License - See LICENSE for details.


Made by Sentinel Team
