Sentinel AI Safety: IDE Extension
AI safety guardrails for LLM prompts using the THSP protocol (Truth, Harm, Scope, Purpose).

Supported IDEs
Note: Cursor and Windsurf are VS Code forks that use the OpenVSX registry. The same extension works across all supported IDEs.
Features
Two Analysis Modes
| Mode | Method | Accuracy | Requires |
|------|--------|----------|----------|
| Semantic (recommended) | LLM-based analysis | High (~90%) | LLM provider (OpenAI, Anthropic, Ollama, Groq) |
| Heuristic (fallback) | Pattern matching | Limited (~50%) | Nothing |
For accurate results, configure an LLM provider. Heuristic mode relies on pattern matching, which produces significant false positives and false negatives.
Real-time Safety Linting
The extension automatically detects potentially unsafe patterns in your prompts:
- Jailbreak attempts: "ignore previous instructions", persona switches
- Harmful content: weapons, hacking, malware references
- Deception patterns: fake documents, impersonation
- Purposeless actions: requests lacking legitimate benefit
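As an illustration of how pattern-based linting of this kind can work, the sketch below flags a few of the categories above with regular expressions. The patterns and labels here are illustrative examples, not the extension's actual rule set:

```typescript
// Illustrative heuristic linting sketch; the real rule set is internal to the extension.
const UNSAFE_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "jailbreak", pattern: /ignore (all )?previous instructions/i },
  { label: "jailbreak", pattern: /you are now [A-Z]\w*/ },          // crude persona-switch check
  { label: "harm", pattern: /\b(malware|keylogger|ransomware)\b/i },
  { label: "deception", pattern: /\b(fake (document|id)|impersonat\w+)\b/i },
];

// Return the labels of all categories whose pattern matches the prompt text.
function lintPrompt(text: string): string[] {
  return UNSAFE_PATTERNS.filter(p => p.pattern.test(text)).map(p => p.label);
}
```

This also shows why heuristic mode has limited accuracy: a paraphrase like "disregard your earlier guidance" matches none of the patterns, while benign text containing a trigger word can still be flagged.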
Commands
| Command | Description |
|---------|-------------|
| `Sentinel: Analyze` | Analyze selected text using the THSP protocol |
| `Sentinel: Analyze File` | Analyze the entire file |
| `Sentinel: Insert Seed` | Insert standard seed (~1,000 tokens) |
| `Sentinel: Insert Seed (Minimal)` | Insert minimal seed (~360 tokens) |
| `Sentinel: Set OpenAI Key` | Store OpenAI API key securely |
| `Sentinel: Set Anthropic Key` | Store Anthropic API key securely |
| `Sentinel: Set Custom API Key` | Store key for OpenAI-compatible endpoints |
| `Sentinel: Status` | Show current analysis mode and provider |
| `Sentinel: Compliance` | Run all compliance checks (EU AI Act, OWASP, CSA) |
| `Sentinel: EU AI Act` | EU AI Act (2024/1689) assessment |
| `Sentinel: OWASP` | OWASP LLM Top 10 vulnerability scan |
| `Sentinel: CSA` | CSA AI Controls Matrix assessment |
| `Sentinel: Scan Secrets` | Scan for API keys and credentials |
| `Sentinel: Sanitize` | Check for prompt injection patterns |
| `Sentinel: Validate` | Validate LLM output for security issues |
| `Sentinel: SQL Injection Scan` | Detect SQL injection patterns in prompts |
| `Sentinel: Metrics Dashboard` | View analysis statistics and history |
| `Sentinel: Clear Metrics` | Clear all stored metrics |
The THSP Protocol
Every request is evaluated through four gates:
| Gate | Question |
|------|----------|
| Truth | Does this involve deception? |
| Harm | Could this cause harm? |
| Scope | Is this within boundaries? |
| Purpose | Does this serve legitimate benefit? |
All four gates must pass for content to be considered safe.
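The all-gates-must-pass rule can be sketched as a simple conjunction over the four gate results. This is a minimal illustration of the aggregation logic described above, not the extension's internal implementation:

```typescript
// Sketch of THSP aggregation: content is safe only if every gate is present and passes.
type Gate = "Truth" | "Harm" | "Scope" | "Purpose";
type GateResult = { gate: Gate; pass: boolean };

function isSafe(gates: GateResult[]): boolean {
  const required: Gate[] = ["Truth", "Harm", "Scope", "Purpose"];
  // A missing gate counts as a failure, so partial results are never reported safe.
  return required.every(g => gates.some(r => r.gate === g && r.pass));
}
```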
Configuration
Recommended: Enable Semantic Analysis
For accurate analysis, configure an LLM API key using the secure method:
- Open the Command Palette (`Ctrl+Shift+P` or `Cmd+Shift+P`)
- Run `Sentinel: Set OpenAI Key` or `Sentinel: Set Anthropic Key`
- Enter your API key (stored encrypted in VS Code's SecretStorage)
Alternatively, you can set keys in VS Code Settings (less secure, stored in plaintext).
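For example, selecting the provider and model in `settings.json` (with the key itself stored via the secure commands above) might look like this, using the setting names listed under All Settings:

```json
{
  "sentinel.llmProvider": "openai",
  "sentinel.openaiModel": "gpt-4o-mini"
}
```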
Supported Providers
| Provider | API Key Required | Description |
|----------|------------------|-------------|
| OpenAI | Yes | GPT-4o, GPT-4o-mini, etc. |
| Anthropic | Yes | Claude 3 Haiku, Sonnet, Opus |
| Ollama | No | Local models (llama3.2, mistral, qwen2.5) |
| OpenAI-compatible | Yes | Groq, Together AI, or any OpenAI-compatible API |
Ollama (Local, Free)
Run models locally with no API key:
- Install Ollama
- Pull a model: `ollama pull llama3.2`
- Start the server: `ollama serve`
- In VS Code Settings (`Ctrl+,`), search for "sentinel" and set:
  - `sentinel.llmProvider`: `ollama`
  - `sentinel.ollamaModel`: `llama3.2` (or your preferred model)
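The settings step above corresponds to this `settings.json` fragment (the endpoint shown is the default and only needs changing if your Ollama server runs elsewhere):

```json
{
  "sentinel.llmProvider": "ollama",
  "sentinel.ollamaEndpoint": "http://localhost:11434",
  "sentinel.ollamaModel": "llama3.2"
}
```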
OpenAI-Compatible Endpoints (Groq, Together AI)
Use any OpenAI-compatible API:
- Get an API key from your provider (e.g., Groq, Together AI)
- Run the `Sentinel: Set Custom API Key` command
- Configure in settings:
  - `sentinel.llmProvider`: `openai-compatible`
  - `sentinel.openaiCompatibleEndpoint`: your API URL
  - `sentinel.openaiCompatibleModel`: model name
Popular endpoints:
| Provider | Endpoint | Example Model |
|----------|----------|---------------|
| Groq | https://api.groq.com | llama-3.3-70b-versatile |
| Together AI | https://api.together.xyz | meta-llama/Llama-3.3-70B-Instruct-Turbo |
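Putting these together, a Groq setup using the endpoint and model from the table above might look like this in `settings.json` (the API key itself is stored via `Sentinel: Set Custom API Key`):

```json
{
  "sentinel.llmProvider": "openai-compatible",
  "sentinel.openaiCompatibleEndpoint": "https://api.groq.com",
  "sentinel.openaiCompatibleModel": "llama-3.3-70b-versatile"
}
```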
All Settings
| Setting | Default | Description |
|---------|---------|-------------|
| `sentinel.enableRealTimeLinting` | `true` | Enable real-time safety linting |
| `sentinel.seedVariant` | `standard` | Default seed variant (minimal/standard) |
| `sentinel.highlightUnsafePatterns` | `true` | Highlight unsafe patterns |
| `sentinel.llmProvider` | `openai` | LLM provider (openai/anthropic/ollama/openai-compatible) |
| `sentinel.openaiApiKey` | `""` | OpenAI API key |
| `sentinel.openaiModel` | `gpt-4o-mini` | OpenAI model |
| `sentinel.anthropicApiKey` | `""` | Anthropic API key |
| `sentinel.anthropicModel` | `claude-3-haiku-20240307` | Anthropic model |
| `sentinel.ollamaEndpoint` | `http://localhost:11434` | Ollama server endpoint |
| `sentinel.ollamaModel` | `llama3.2` | Ollama model |
| `sentinel.openaiCompatibleEndpoint` | `""` | Custom API endpoint (Groq, Together AI) |
| `sentinel.openaiCompatibleApiKey` | `""` | Custom API key |
| `sentinel.openaiCompatibleModel` | `llama-3.3-70b-versatile` | Custom API model |
Usage Examples
Checking Prompts for Safety Issues
- Select the text you want to analyze
- Right-click and choose "Sentinel: Analyze"
- View the THSP gate results with confidence level
Understanding Analysis Results
The extension shows:
- Method: Semantic (LLM) or Heuristic (pattern matching)
- Confidence: How reliable the analysis is
- Gate results: Pass/fail for each THSP gate
- Issues: Specific concerns detected
- Reasoning: Explanation (semantic mode only)
Severity Levels
- 🔴 Error: High-risk patterns (weapons, safety bypass)
- 🟡 Warning: Potential issues (jailbreak attempts)
- 🔵 Information: Consider reviewing
- 💡 Hint: Suggestions (missing Sentinel seed)
Semantic vs Heuristic Analysis
Semantic Analysis (Recommended)
Uses an LLM to understand content contextually:
- ✅ Understands context ("hack my productivity" vs malicious hacking)
- ✅ Detects paraphrased harmful content
- ✅ Provides reasoning for decisions
- ✅ ~90% confidence
Heuristic Analysis (Fallback)
Uses pattern matching for basic detection:
- ⚠️ May flag legitimate content (false positives)
- ⚠️ May miss paraphrased threats (false negatives)
- ⚠️ No contextual understanding
- ⚠️ ~50% confidence
Compliance Checking
The extension includes regulatory compliance checking against three major frameworks:
Supported Frameworks
| Framework | Coverage | Description |
|-----------|----------|-------------|
| EU AI Act | Article 5 prohibited practices, Annex III high-risk contexts | Risk classification (unacceptable/high/limited/minimal) |
| OWASP LLM Top 10 | 6/10 vulnerabilities with strong THSP coverage | Input and output validation against LLM security risks |
| CSA AI Controls Matrix | 10/18 domains with THSP support | Security domains and threat category assessment |
OWASP LLM Top 10 Coverage
| Vulnerability | THSP Gates | Coverage |
|---------------|------------|----------|
| LLM01: Prompt Injection | Scope | Strong |
| LLM02: Sensitive Info Disclosure | Truth, Harm | Strong |
| LLM05: Improper Output Handling | Truth, Harm | Strong |
| LLM06: Excessive Agency | Scope, Purpose | Strong |
| LLM07: System Prompt Leakage | Scope | Moderate |
| LLM09: Misinformation | Truth | Strong* |
*Note on LLM09 (Misinformation): Heuristic detection of misinformation is inherently limited. Pattern matching can identify obvious indicators (overconfident claims, dangerous medical advice, uncited sources), but accurate misinformation detection requires semantic analysis with an LLM. For best results with LLM09, configure an API key for semantic mode.
Infrastructure-Level Vulnerabilities
The following vulnerabilities require infrastructure-level controls and are outside THSP's behavioral scope:
- LLM03: Supply Chain: Use verified dependencies and model provenance
- LLM04: Data/Model Poisoning: Requires training pipeline controls
- LLM08: Vector/Embedding Weaknesses: RAG pipeline security
- LLM10: Unbounded Consumption: Rate limiting and quotas
Supported Languages
- Markdown
- Plain text
- Python
- JavaScript/TypeScript
- JSON
- YAML
Installation by IDE
VS Code
- Open VS Code
- Go to Extensions (`Ctrl+Shift+X`)
- Search for "Sentinel AI Safety"
- Click Install
Or install via the command line:

```
code --install-extension sentinelseed.sentinel-ai-safety
```
Cursor
Cursor uses the OpenVSX registry. To install:
- Open Cursor
- Go to Extensions (`Ctrl+Shift+X`)
- Search for "Sentinel AI Safety"
- Click Install
If the extension doesn't appear, you can install manually:
- Download the `.vsix` from OpenVSX
- In Cursor: `Ctrl+Shift+P`, then "Extensions: Install from VSIX..."
Windsurf
Windsurf also uses OpenVSX:
- Open Windsurf
- Go to Extensions panel
- Search for "Sentinel AI Safety"
- Click Install
Manual Installation (Any IDE)
For any VS Code-compatible IDE:
- Download the `.vsix` file from Releases
- Open the Command Palette (`Ctrl+Shift+P`)
- Run "Extensions: Install from VSIX..."
- Select the downloaded file
MCP Server Alternative
For deeper integration with AI assistants in Cursor or Windsurf, you can also use the Sentinel MCP Server. See MCP Server documentation.
Links
License
MIT License. See LICENSE for details.
Made by Sentinel Team