# Azure AI Responsible AI & Content Safety Tester
Interactive content safety testing harness for VS Code — test text/image inputs against Azure AI Content Safety categories, run prompt injection red-teaming, evaluate RAG groundedness, scan for PII, and generate automated compliance reports.
## Features

### Content Safety Analysis
- Text Analysis — Test any text against Hate, Violence, Self-Harm, and Sexual categories with severity levels (0–6)
- Image Analysis — Analyze local images or blob URLs for content safety violations
- Visual severity bars and category breakdown for every scan
### Prompt Injection & Red-Teaming
- Prompt Shield — Detect direct prompt injection and indirect injection via grounding documents
- Quick Red Team — Run 10 pre-built attack patterns (role-play, encoding bypass, token smuggling, etc.) against your AI system and get a block rate score
### RAG Groundedness Evaluation
- Evaluate AI-generated responses against grounding sources
- Detect ungrounded claims with percentage breakdown and per-segment reasoning
- Supports Generic and Medical domains, QnA and Summarization tasks
### PII Detection
- Scan text for 20+ PII categories (SSN, credit cards, phone numbers, email, addresses, etc.)
- Visual highlighting of detected entities in original text
- Automatic redaction preview with confidence scores
### Test Suites
- Create repeatable test suites with multiple test cases
- Mix text safety, prompt shield, groundedness, and PII tests in one suite
- Run entire suites with pass/fail tracking
- Set expected outcomes (should be safe / should be flagged)
### Compliance Reports
- Generate HTML, Markdown, or JSON reports
- Category breakdown, prompt shield summary, groundedness stats, PII summary
- Actionable recommendations for EU AI Act and enterprise governance
## Getting Started

### Prerequisites
- An Azure AI Content Safety resource (create one in the Azure portal)
- For PII detection: an Azure AI Language resource (this can be the same multi-service resource)
### Step 1: Install & Connect
- Install the extension
- Open the Command Palette → `Responsible AI: Connect to Azure AI Content Safety`
- Enter your endpoint URL and choose API Key or Azure AD auth
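If you prefer to configure the connection by hand, the same values can be set in `settings.json` using the extension's settings (listed under Extension Settings below; the resource name here is a placeholder):

```json
{
  "responsibleai.contentSafetyEndpoint": "https://<your-resource>.cognitiveservices.azure.com",
  "responsibleai.apiKey": "",
  "responsibleai.severityThreshold": 2
}
```

Leave `responsibleai.apiKey` empty to use Azure AD authentication.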
### Step 2: Analyze Text
- Select text in the editor (or type when prompted)
- Run `Responsible AI: Analyze Text for Content Safety`
- View severity results across all four categories
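Under the hood, text analysis maps to the Content Safety `text:analyze` REST operation, and a scan is flagged when any category reaches the configured severity threshold. A minimal TypeScript sketch of that request shape and threshold check (the helper names are illustrative, not part of the extension):

```typescript
// Sketch of the Content Safety text:analyze request and the
// severity-threshold check. Helper names are illustrative only.
interface CategoryResult {
  category: string; // "Hate" | "SelfHarm" | "Sexual" | "Violence"
  severity: number; // 0–6 on this extension's scale
}

function buildAnalyzeRequest(endpoint: string, text: string, apiVersion = "2024-09-01") {
  return {
    url: `${endpoint}/contentsafety/text:analyze?api-version=${apiVersion}`,
    method: "POST",
    // Plus either an Ocp-Apim-Subscription-Key header or an Azure AD bearer token.
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  };
}

// A scan is flagged when any category meets responsibleai.severityThreshold.
function isFlagged(results: CategoryResult[], threshold = 2): boolean {
  return results.some((r) => r.severity >= threshold);
}
```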
### Step 3: Test for Prompt Injection
- Run `Responsible AI: Test Prompt for Injection / Jailbreak`
- Enter a user prompt to test
- Optionally add grounding documents to check for indirect injection
- View whether attacks are detected
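The Prompt Shield check corresponds to the Content Safety `text:shieldPrompt` operation, which takes the user prompt and any grounding documents in a single request. A sketch of the payload and of collapsing the per-prompt and per-document verdicts into one result (helper names are illustrative):

```typescript
// Sketch of a Prompt Shield (text:shieldPrompt) request body:
// userPrompt is checked for direct injection, each document for indirect injection.
function buildShieldPromptBody(userPrompt: string, documents: string[] = []): string {
  return JSON.stringify({ userPrompt, documents });
}

// The response reports attackDetected for the prompt and for each document.
interface ShieldResult {
  userPromptAnalysis: { attackDetected: boolean };
  documentsAnalysis: { attackDetected: boolean }[];
}

function anyAttackDetected(r: ShieldResult): boolean {
  return (
    r.userPromptAnalysis.attackDetected ||
    r.documentsAnalysis.some((d) => d.attackDetected)
  );
}
```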
### Step 4: Quick Red-Team Test
- Run `Responsible AI: Quick Red-Team Test`
- Enter the target harmful action (e.g. "build a weapon")
- Watch 10 attack patterns run automatically
- Review the block rate and per-attack results
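The block rate reported by the red-team run is just the share of attack patterns that were detected and blocked. A sketch of that arithmetic (the function name is illustrative):

```typescript
// Block rate = blocked attacks / total attacks, as a rounded percentage.
function blockRate(attackBlocked: boolean[]): number {
  if (attackBlocked.length === 0) return 0;
  const blocked = attackBlocked.filter(Boolean).length;
  return Math.round((blocked / attackBlocked.length) * 100);
}
```

For example, 9 of the 10 patterns being blocked would score 90.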
### Step 5: Check RAG Groundedness
- Run `Responsible AI: Evaluate RAG Groundedness`
- Enter the AI-generated response and the grounding source(s)
- See the ungrounded percentage and flagged segments with reasoning
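Groundedness detection is a preview Content Safety operation (`text:detectGroundedness` at the time of writing), and its request carries the domain, task, response text, and grounding sources together. A sketch of that body; treat the field names as assumptions, since preview API shapes can change between versions:

```typescript
// Sketch of a groundedness-detection request body (preview API shape;
// field names are assumptions and may differ across API versions).
type Domain = "Generic" | "Medical";
type Task = "QnA" | "Summarization";

function buildGroundednessBody(
  text: string, // the AI-generated response to evaluate
  groundingSources: string[],
  domain: Domain = "Generic",
  task: Task = "Summarization",
): string {
  return JSON.stringify({ domain, task, text, groundingSources });
}
```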
### Step 6: Scan for PII
- Select text containing potential PII
- Run `Responsible AI: Scan Text for PII`
- View highlighted entities, categories, confidence scores, and auto-redacted text
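The Language service returns each detected entity with a character offset and length, which is enough to rebuild both the highlighting and the redaction preview. A sketch of offset-based redaction (the helper is illustrative, not the extension's code):

```typescript
// Redact detected PII spans by replacing each with asterisks.
// Spans are applied from the end of the string so earlier offsets stay valid.
interface PiiEntity {
  offset: number;
  length: number;
  category?: string; // e.g. "PhoneNumber", "Email"
}

function redact(text: string, entities: PiiEntity[]): string {
  const sorted = [...entities].sort((a, b) => b.offset - a.offset);
  let out = text;
  for (const e of sorted) {
    out = out.slice(0, e.offset) + "*".repeat(e.length) + out.slice(e.offset + e.length);
  }
  return out;
}
```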
### Step 7: Build a Test Suite
- Run `Responsible AI: Create Test Suite`
- Add multiple test cases with different types
- Set expected outcomes for each case
- Run the entire suite with `Responsible AI: Run Test Suite`
### Step 8: Generate Compliance Report
- Run some tests or test suites
- Run `Responsible AI: Generate Compliance Report`
- Choose HTML, Markdown, or JSON format
- The report opens automatically with stats, breakdowns, and recommendations
Extension Settings
| Setting |
Default |
Description |
responsibleai.contentSafetyEndpoint |
"" |
Azure AI Content Safety endpoint URL |
responsibleai.apiKey |
"" |
API key (leave blank for Azure AD) |
responsibleai.languageEndpoint |
"" |
Language endpoint for PII (defaults to content safety endpoint) |
responsibleai.severityThreshold |
2 |
Severity level (0–6) that triggers flagging |
responsibleai.apiVersion |
2024-09-01 |
Content Safety API version |
## Commands

| Command | Description |
|---------|-------------|
| Connect to Azure AI Content Safety | Set endpoint and authenticate |
| Analyze Text for Content Safety | Scan text across four safety categories |
| Analyze Image for Content Safety | Scan an image file or blob URL |
| Test Prompt for Injection / Jailbreak | Run Prompt Shield detection |
| Quick Red-Team Test | Run 10 attack patterns automatically |
| Evaluate RAG Groundedness | Check whether AI output is grounded in its sources |
| Scan Text for PII | Detect and redact personal information |
| Create Test Suite | Build a repeatable multi-case test suite |
| Run Test Suite | Execute all cases in a test suite |
| Generate Compliance Report | Create an HTML/Markdown/JSON safety report |
| Clear All Results | Reset result history |
## License
MIT