AllTrue AI Security Scanner


AllTrue.ai

Comprehensive AI security testing with LLM pentesting, model scanning, and automated Azure Boards work item creation

AllTrue Security Testing for AI Systems (Azure DevOps)

Run automated security testing for LLM endpoints and AI models inside Azure Pipelines. The task integrates with the AllTrue platform to discover inventory, execute scans, and optionally create Azure Boards work items for findings.


📖 Table of Contents

  • What This Task Does
  • Installation
  • Quick Start
  • Configuration Reference
    • Required Inputs
    • Core Settings
      • Execution Toggles
      • Inventory Scope
    • LLM Pentest Configuration
      • Model Selection
      • Guardrails
      • System Prompts
      • Capture-Replay Datasets
      • System Descriptions
    • Model Scanning
    • HuggingFace Onboarding
    • Failure Thresholds
    • Azure DevOps Boards Integration
    • Performance & Polling
  • Outputs
  • Artifacts
  • Platform-Specific Considerations
  • Usage Examples
  • Security & Permissions
  • Best Practices
  • Troubleshooting
  • Support

What This Task Does

Core Capabilities

  • ✅ Automated Discovery: Enumerates LLM endpoints and AI models from your AllTrue inventory
  • ✅ LLM Endpoint Pentesting: Tests endpoints for prompt injection, data leakage, harmful content generation, and more
  • ✅ Model Scanning: Scans AI models for malicious code, security vulnerabilities, and policy violations
  • ✅ HuggingFace Integration: Automatically onboards and scans models from HuggingFace Hub
  • ✅ Flexible Scoping: Test at the organization, project, or individual resource level
  • ✅ Parallel Execution: Run multiple tests concurrently with intelligent retry logic
  • ✅ Outcome-Based Control: Configure pipeline behavior based on security outcomes

Advanced Features

  • 🔧 Model Selection: Map specific models to resource types for consistent testing
  • 🛡️ Guardrails Testing: Test with or without safety mechanisms
  • 📝 System Prompts: Configure and test custom system prompts
  • 📊 Capture-Replay: Test with real user interaction patterns
  • 🧾 Azure Boards Integration: Automatically create work items for threshold breaches, failures, and (optionally) per-policy/per-category findings
  • 📈 Comprehensive Reporting: CSV exports and JSON summaries

Execution Modes

The scanner supports two complementary testing approaches:

  1. LLM Endpoint Pentesting (enableLlmPentest): Tests your LLM endpoints for vulnerabilities like prompt injection, data leakage, harmful content generation, and more
  2. Model Scanning (enableModelScanning): Scans AI models and model assets for security issues, malicious code, and policy violations

You can enable either or both modes depending on your needs. This task is flexible, but certain inputs become required depending on which mode(s) you enable and how you scope inventory.


Installation

  1. Install from Marketplace: Click "Get it free" on this page and select your Azure DevOps organization

  2. Agent Notes:

    • This task runs in Azure Pipelines and requires an agent.
    • For Microsoft-hosted agents (ubuntu-latest / “Azure Pipelines” pool), the org must have hosted parallelism (free grant, paid, or otherwise).
    • Alternatively, use a self-hosted agent.
    • Azure DevOps chooses the shell based on the agent OS. Many of the script examples in this documentation use Bash syntax; you may need to adapt them to the shell your agent OS runs.
  3. Python Notes: Ensure your pipeline agent has Python available

    • Windows agents: use python
    • Linux/macOS agents: use python3
    • Use UsePythonVersion@0 to pin a specific version if needed
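A pinned Python version can be combined with the task's pythonPath input. A minimal sketch (the version spec is an example; use one available on your agent):

```yaml
steps:
# Pin a specific Python version on the agent
- task: UsePythonVersion@0
  inputs:
    versionSpec: '3.11'

- task: AllTrueScanner@1
  inputs:
    pythonPath: 'python3'   # use 'python' on Windows agents
    # ... remaining inputs ...
```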

Quick Start

Prerequisites

  1. AllTrue Account: Active account with API access

  2. Required Credentials: Obtain from your AllTrue Customer Success Engineer:

    • API Key (always required)
    • API URL (always required)
    • Customer ID (always required)
    • Organization ID/Name (for organization-scoped testing) or Project ID/Name (for project-scoped testing).

    NOTE: resource-scoped testing requires either an Organization or Project ID/Name for access-control purposes. For ease of use, we recommend setting these values as pipeline variables as noted below.

Basic Setup

Step 1: Configure Pipeline Variables

Navigate to Pipelines -> Edit -> Variables:

Secret Variables (click Keep this value secret):

ALLTRUE_API_KEY = <your-api-key>

Regular Variables:

ALLTRUE_API_URL = https://api.prod.alltrue-be.com
ALLTRUE_CUSTOMER_ID = <your-customer-uuid>
ALLTRUE_ORGANIZATION_NAME = ACME Corporation

Step 2: Add Task to Pipeline

steps:
- task: AllTrueScanner@1
  displayName: Run AllTrue AI Security Scanner
  inputs:
    pythonPath: "python3"
    alltrueApiKey: "$(ALLTRUE_API_KEY)"
    alltrueApiUrl: "$(ALLTRUE_API_URL)"
    alltrueCustomerId: "$(ALLTRUE_CUSTOMER_ID)"

    enableLlmPentest: true
    enableModelScanning: false

    inventoryScope: "organization"
    organizationName: "$(ALLTRUE_ORGANIZATION_NAME)"

    pentestTemplate: "Prompt Injection"
    pentestNumAttempts: "1"

See Usage Examples section for more complete pipeline examples.


Configuration Reference

Required Inputs

| Input | Description | Example |
| --- | --- | --- |
| alltrueApiKey | AllTrue API authentication key (store as a secret variable) | $(ALLTRUE_API_KEY) |
| alltrueApiUrl | AllTrue API base URL | https://api.prod.alltrue-be.com |
| alltrueCustomerId | AllTrue Customer UUID | <your-customer-uuid> |
| pythonPath | Python executable (python, python3, or full path). Use UsePythonVersion@0 to control version. | python3 |

Core Settings

Execution Toggles

| Input | Description | Default |
| --- | --- | --- |
| enableLlmPentest | Enable LLM endpoint pentesting | true |
| enableModelScanning | Enable model scanning | false |

Inventory Scope Configuration

Control what resources are tested:

| Input | Description | Default | Options |
| --- | --- | --- | --- |
| inventoryScope | Testing scope level | organization | organization, project, resource |
| organizationId | Organization UUID | '' | Optional (use name instead when possible) |
| organizationName | Organization name (resolves to UUID) | '' | For organization scope (preferred) |
| projectIds | Comma-separated project UUIDs | '' | For project scope |
| projectNames | Comma-separated project names (resolve to UUIDs) | '' | For project scope (preferred) |
| targetResourceIds | Comma-separated resource IDs | '' | For resource scope |
| targetResourceNames | Comma-separated resource patterns | '' | For resource scope (supports advanced matching) |

Important (resource scope): for access-control reasons, resource scope requires some context. Provide either:

  • organizationId / organizationName, or
  • projectIds / projectNames

Scope Examples:

# Test all resources in organization (using name - recommended!)
- task: AllTrueScanner@1
  inputs:
    inventoryScope: 'organization'
    organizationName: 'ACME Corporation'

# Test specific projects (using names - recommended!)
- task: AllTrueScanner@1
  inputs:
    inventoryScope: 'project'
    projectNames: 'Production,Staging,Development'

# OR using UUIDs
- task: AllTrueScanner@1
  inputs:
    inventoryScope: 'project'
    projectIds: 'proj-123,proj-456'

# Test specific resources (with org context using name)
- task: AllTrueScanner@1
  inputs:
    inventoryScope: 'resource'
    organizationName: 'ACME Corporation'
    targetResourceNames: 'production-chatbot,staging-api'

Enhanced Pattern Matching for Resources

When using inventoryScope: resource, you can specify targetResourceNames with powerful pattern matching:

| Pattern Type | Format | Description | Example |
| --- | --- | --- | --- |
| Substring | text | Matches any resource containing the text | OpenAI API Key |
| Repository | repo:org/name | Matches all files in a HuggingFace repository | repo:meta-llama/Llama-2-7b |
| File | file:name | Matches specific file names (any file-type resource) | file:exploit.py |
| Exact | =name | Matches only exact display_name | =MyExactModelName |
| Wildcard | *pattern* | Matches resources with pattern in name | *.gguf* |

Pattern Matching Examples:

# Match specific files across repositories
targetResourceNames: 'file:exploit.py,file:backdoor.onnx,file:model.safetensors'

# Match all files in specific repositories
targetResourceNames: 'repo:IHasFarms/MaliciousModel,repo:unsloth/Qwen'

# Mix patterns for comprehensive selection
targetResourceNames: '*.gguf*,file:config.json,OpenAI API Key (test-BOM)'

# Exact match to avoid over-selection
targetResourceNames: '=production-model-v2,=staging-endpoint'

Key Features:

  • File-level matching (file:) works with file-type resources:

    • ModelFile - Python scripts, configuration files
    • ModelArtifactFile - Model weights, GGUF files, etc.
  • Repository-level matching (repo:) selects entire HuggingFace repositories:

    • Matches ModelPackage resources only
    • Excludes individual files within the repository
  • Wildcards provide flexible pattern matching:

    • Use * for any characters
    • Example: *.safetensors matches all safetensors files
  • Multiple patterns can be combined (comma-separated):

    • Each pattern is evaluated independently
    • Resources matching ANY pattern are selected
    • Results are automatically deduplicated

Common Use Cases:

# Security testing: specific malicious files
- task: AllTrueScanner@1
  inputs:
    inventoryScope: 'resource'
    organizationName: 'ACME Corporation'
    projectNames: 'Security Testing'
    targetResourceNames: 'file:exploit.py,file:backdoor.onnx,file:deserialization.pkl'

# Model format testing: all GGUF files
- task: AllTrueScanner@1
  inputs:
    inventoryScope: 'resource'
    organizationName: 'ACME Corporation'
    projectNames: 'Model Repository'
    targetResourceNames: '*.gguf*'

# Repository validation: entire HuggingFace repos
- task: AllTrueScanner@1
  inputs:
    inventoryScope: 'resource'
    organizationName: 'ACME Corporation'
    projectNames: 'Production'
    targetResourceNames: 'repo:company/prod-model,repo:company/staging-model'

# Mixed approach: files + endpoints
- task: AllTrueScanner@1
  inputs:
    inventoryScope: 'resource'
    organizationName: 'ACME Corporation'
    projectNames: 'Production'
    targetResourceNames: 'file:model.safetensors,OpenAI API Key (prod),Anthropic API Key'

⚠️ Important: Wildcard Matching with Display Names

When using wildcards:

  • ✅ *.pkl* - Matches files with .pkl anywhere in name (recommended)
  • ✅ .pkl - Substring match (simpler, works for most cases)
  • ❌ *.pkl - Only matches if name ENDS with .pkl (rare)

Best Practice: Use *.pkl* (with trailing wildcard) or substring match .pkl for file extensions.


Organization & Project Configuration

You can specify organizations and projects using either UUIDs or names:

| Configuration | UUID Method | Name Method | Notes |
| --- | --- | --- | --- |
| Organization | organizationId: 'uuid' | organizationName: 'ACME' | Name takes precedence if both provided |
| Projects | projectIds: 'uuid1,uuid2' | projectNames: 'Prod,Stage' | Both are merged (can use together) |

Benefits of using names:

  • ✅ More readable and self-documenting
  • ✅ Easier to maintain and review
  • ✅ No need to look up UUIDs
  • ✅ Automatically resolved at runtime (cached for performance)

When to use UUIDs:

  • When you need guaranteed stability (names can change)
  • When you already have UUIDs in existing configurations

LLM Pentest Configuration

Basic Configuration

| Input | Description | Default |
| --- | --- | --- |
| pentestTemplate | Pentest template name (must match AllTrue template name) | Prompt Injection |
| pentestNumAttempts | Number of attempts per test case to account for LLM variability | 1 |

Template Management: The run looks up this template by name in AllTrue. Configure pentest templates in the AllTrue platform UI. The name must match exactly (case-sensitive).

Test Case Attempts: When set to a value greater than 1, each test case runs multiple times to account for non-deterministic LLM behavior. The system aggregates results across all runs - if any attempt returns a failure, the test case outcome is marked as failed. This provides more reliable security testing by catching intermittent vulnerabilities that might not appear in every run. Recommended range: 1-5 attempts (higher values increase testing time proportionally).

When to increase attempts:

  • ✅ Testing models with high response variability
  • ✅ Critical security categories where you need high confidence
  • ✅ Production endpoints where false negatives are costly
  • ✅ When you've observed inconsistent test results

⚠️ CI/CD Performance Impact: Each additional attempt multiplies total testing time. With pentestNumAttempts: 2, each test case runs twice, so overall pentest duration increases ~2x. CI/CD pipelines typically run slower than local environments, so it's important to increase timeout values proportionally when using multiple attempts.
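For example, doubling attempts while scaling the poll timeout to match (a sketch; pollTimeoutSecs is the per-resource wait described under Polling Configuration):

```yaml
- task: AllTrueScanner@1
  inputs:
    pentestNumAttempts: '2'    # each test case runs twice
    pollTimeoutSecs: '10800'   # ~2x the 5400s default to cover doubled runtime plus CI overhead
    # ... other inputs ...
```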


🔧 Advanced Pentest Controls

Model Selection by Resource Type

Control which model is used for pentesting each type of LLM endpoint:

| Input | Description | Default |
| --- | --- | --- |
| pentestModelMapping | Map resource types to models | '' |

Format: ResourceType1:model1,ResourceType2:model2

Supported Resource Types:

  • OpenAIEndpoint
  • AnthropicEndpoint
  • BedrockEndpoint
  • GoogleAIEndpoint
  • IBMWatsonxEndpoint

Note: Other resource types (e.g., AzureOpenAIEndpoint, CustomLlmEndpoint) will use their configured default model and ignore any mapping specified.

Example Configuration:

- task: AllTrueScanner@1
  inputs:
    pentestModelMapping: 'OpenAIEndpoint:gpt-4o,AnthropicEndpoint:claude-3-5-sonnet-latest,BedrockEndpoint:anthropic.claude-3-5-sonnet-20241022-v2:0'

How it works:

  1. The task checks if a mapped model is specified for the resource type
  2. It validates the model is available on that specific endpoint
  3. If available, uses the mapped model; otherwise falls back to the endpoint's default
  4. Logs clear messages about model selection for full transparency

When to use model mapping:

  • ✅ Consistency: Ensure the same model version is tested across all runs
  • ✅ Specific Testing: Target particular model capabilities or known vulnerabilities
  • ✅ Comparison: Compare security characteristics of different models
  • ✅ Production Alignment: Test the exact models used in production

Guardrails Configuration

Enable or disable safety guardrails during pentesting:

| Input | Description | Default |
| --- | --- | --- |
| pentestApplyGuardrails | Apply guardrails during execution | false |

What are guardrails?

  • Safety mechanisms configured on your LLM endpoints in AllTrue
  • Can include content filtering, PII redaction, harmful content blocking, etc.
  • Act as an additional security layer on top of the base model

When to enable guardrails (true):

  • ✅ Production Testing: Test endpoints with active safety measures as they appear in production
  • ✅ Guardrail Validation: Verify that your guardrails work as expected under attack
  • ✅ Compliance Testing: Ensure safety measures remain active during security assessments
  • ✅ Defense-in-Depth: Validate your complete security stack

When to disable guardrails (false - default):

  • ✅ Baseline Testing: Assess raw model behavior without safety layers
  • ✅ Vulnerability Discovery: Find issues that guardrails might mask
  • ✅ Root Cause Analysis: Understand underlying model weaknesses
  • ✅ Comparative Analysis: Compare protected vs. unprotected behavior

System Prompt Configuration

Configure custom system prompts before pentesting:

| Input | Description | Default |
| --- | --- | --- |
| pentestSystemPromptEnabled | Enable configuring a system prompt before scanning | false |
| pentestSystemPromptText | Custom system prompt text | '' |
| pentestCleanupSystemPrompt | Clean up (restore/clear) system prompt after scan | true |

How it works:

  1. Before pentesting: The task configures the system prompt on the LLM endpoint resource
  2. During pentesting: Tests run with system_prompt_enabled: true in the pentest payload
  3. After pentesting: System prompt is optionally cleared (if pentestCleanupSystemPrompt: true)

Use cases:

  • ✅ Production Testing: Test your actual production system prompt configuration
  • ✅ Effectiveness Validation: Verify that system prompts provide adequate protection
  • ✅ Comparative Testing: Compare security outcomes with different system prompts
  • ✅ Safety Research: Understand how different prompt strategies affect security
  • ✅ Compliance: Ensure system-level instructions meet security requirements

Example - Testing Production System Prompt:

- task: AllTrueScanner@1
  inputs:
    pentestSystemPromptEnabled: true
    pentestSystemPromptText: |
      You are a helpful, harmless, and honest AI assistant. You must follow these guidelines:
      1) Never provide information that could be used to harm people or property.
      2) Decline requests for illegal activities.
      3) Be respectful and avoid generating offensive content.
      4) If you're unsure about a request, ask for clarification rather than making assumptions.
      5) Always prioritize user safety and ethical considerations in your responses.
    pentestCleanupSystemPrompt: true

System Prompt Best Practices:

  • Keep prompts clear and specific
  • Include explicit safety rules and boundaries
  • Test both with and without system prompts to understand their impact
  • Use multi-line format with | for readability
  • Enable cleanup (true) to avoid affecting other tests

Cleanup Behavior:

  • true (default): Clears the system prompt after testing, restoring the original state
  • false: Leaves the configured system prompt on the resource (use if you want to persist the configuration)

Dataset Configuration (Capture-Replay)

Configure capture-replay datasets for realistic pentesting with real user interaction patterns:

| Input | Description | Default |
| --- | --- | --- |
| pentestDatasetEnabled | Enable dataset configuration | false |
| pentestDatasetId | Dataset UUID | '' |
| pentestDatasetName | Dataset name (resolved to UUID); project context required | '' |
| pentestCleanupDataset | Clean up dataset configuration after scan | true |

What are capture-replay datasets?

  • Collections of real user interactions captured from your production LLM endpoints
  • Enable testing with realistic attack patterns based on actual usage
  • Provide more representative security assessments than synthetic test cases

How it works:

  1. Before pentesting: Configures dataset on the LLM endpoint resource
  2. During pentesting: Tests incorporate patterns from the dataset
  3. After pentesting: Optionally clears dataset configuration

Dataset Resolution:

  • Use pentestDatasetId for direct UUID reference
  • Use pentestDatasetName for automatic name-to-UUID resolution
  • Name resolution requires project context (set projectNames or projectIds)

Example:

- task: AllTrueScanner@1
  inputs:
    pentestDatasetEnabled: true
    pentestDatasetName: 'Production User Patterns Q4'
    pentestCleanupDataset: true
  
    # Project context required for name resolution
    inventoryScope: 'project'
    projectNames: 'Production'

Use cases:

  • ✅ Realistic Testing: Test with actual user interaction patterns
  • ✅ Production Alignment: Security assessment based on real usage
  • ✅ Compliance: Demonstrate testing against production-like scenarios
  • ✅ Attack Pattern Discovery: Identify vulnerabilities in real user flows

Best practices:

  • Use production datasets for most accurate security assessment
  • Enable cleanup to avoid affecting other tests
  • Combine with system prompts and guardrails for comprehensive testing

System Description Configuration

Configure a resource-level system description on the LLM endpoint resource:

| Input | Description | Default |
| --- | --- | --- |
| pentestResourceSystemDescriptionEnabled | Enable setting a system description on the endpoint | false |
| pentestResourceSystemDescriptionText | System description text | '' |
| pentestCleanupResourceSystemDescription | Clean up system description after scan | false |

What is the system description?

  • This maps to llm_endpoint_resource_system_description on the LLM endpoint resource.
  • It is distinct from the system prompt:
    • System prompt: instruction/policy text that influences model behavior
    • System description: metadata/context describing the endpoint (used by some providers/tasks)

How it works:

  1. Before pentesting, the system description is configured on the LLM endpoint resource
  2. The pentest executes using the resource configuration in AllTrue
  3. After testing completes (success or failure), the system description is optionally cleared

Example:

- task: AllTrueScanner@1
  inputs:
    pentestResourceSystemDescriptionEnabled: true
    pentestResourceSystemDescriptionText: |
      Customer-facing support assistant for ACME. Handles account questions and order status.
      Do not include internal-only data in responses.
    pentestCleanupResourceSystemDescription: false

Model Scanning Configuration

| Input | Description | Default |
| --- | --- | --- |
| modelScanPolicies | Comma-separated policy names | 'model-scan-code-execution-prohibited,model-scan-input-output-operations-prohibited,model-scan-network-access-prohibited,model-scan-malware-signatures-prohibited,model-custom-layers-prohibited' (all policies applied by default; omit individual policies as desired) |
| modelScanDescription | Free-text description attached to the run | CI Model Scan |

Available Policies:

  • model-scan-code-execution-prohibited
  • model-scan-input-output-operations-prohibited
  • model-scan-network-access-prohibited
  • model-scan-malware-signatures-prohibited
  • model-custom-layers-prohibited
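A minimal model-scanning configuration might look like the following sketch; the policy list here omits the custom-layers policy as an example of dropping one from the defaults:

```yaml
- task: AllTrueScanner@1
  inputs:
    enableLlmPentest: false
    enableModelScanning: true
    inventoryScope: 'project'
    projectNames: 'Model Repository'
    # All default policies except model-custom-layers-prohibited
    modelScanPolicies: 'model-scan-code-execution-prohibited,model-scan-input-output-operations-prohibited,model-scan-network-access-prohibited,model-scan-malware-signatures-prohibited'
    modelScanDescription: 'Nightly CI model scan'
```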

HuggingFace Model Onboarding

Automatically onboard and scan models from HuggingFace Hub:

| Input | Description | Default |
| --- | --- | --- |
| huggingfaceOnboardingEnabled | Enable HF onboarding | false |
| huggingfaceModelsToOnboard | Models to onboard | '' |
| huggingfaceOnboardingProjectName | Project name to associate onboarded models with (preferred) | '' |
| huggingfaceOnboardingProjectId | Project UUID to associate onboarded models with | '' |
| huggingfaceOnboardingWaitSecs | Wait time after onboarding (indexing) | 10 |
| huggingfaceOnboardingOnly | If true, scan only onboarded HF models (skip normal inventory selection) | false |

Project Selection / Precedence (Onboarding)

When onboarding is enabled, the task chooses the onboarding project in this order:

  1. huggingfaceOnboardingProjectName (resolved to a UUID at runtime)
  2. huggingfaceOnboardingProjectId
  3. First project from projectIds / projectNames (after name -> ID resolution)

If you provide both a name and an ID, the name wins.

Examples:

# ✅ Preferred: use a project name (more readable)
- task: AllTrueScanner@1
  inputs:
    huggingfaceOnboardingEnabled: true
    huggingfaceModelsToOnboard: 'mistralai/Mistral-7B-Instruct-v0.2'
    huggingfaceOnboardingProjectName: 'ML Engineering'

# ✅ UUID also supported (fallback)
- task: AllTrueScanner@1
  inputs:
    huggingfaceOnboardingEnabled: true
    huggingfaceModelsToOnboard: 'mistralai/Mistral-7B-Instruct-v0.2'
    huggingfaceOnboardingProjectId: '270fca05-7b02-414e-8337-d50c0cc00507'

# ✅ Or rely on the first configured project
- task: AllTrueScanner@1
  inputs:
    projectNames: 'ML Engineering,Staging'
    huggingfaceOnboardingEnabled: true
    huggingfaceModelsToOnboard: 'mistralai/Mistral-7B-Instruct-v0.2'

Format: org1/repo1,org2/repo2@revision or JSON array
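Both forms can list the same repositories; the JSON-array quoting shown here is an assumption based on the format note above:

```yaml
# Comma-separated form (optionally pinning a revision with @)
huggingfaceModelsToOnboard: 'mistralai/Mistral-7B-Instruct-v0.2,unsloth/Qwen@main'

# JSON array form (same models)
huggingfaceModelsToOnboard: '["mistralai/Mistral-7B-Instruct-v0.2", "unsloth/Qwen@main"]'
```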

Usage Modes:

  1. Combined Mode (huggingfaceOnboardingOnly: false):

    • Scans both inventory models AND onboarded HuggingFace models
    • Perfect for comprehensive security testing
  2. HuggingFace-Only Mode (huggingfaceOnboardingOnly: true):

    • Skips inventory selection
    • Scans ONLY the onboarded HuggingFace models
    • Perfect for pre-production validation of specific models

Example:

- task: AllTrueScanner@1
  inputs:
    enableModelScanning: true
    inventoryScope: 'project'
    projectNames: 'Production'

    # Onboard and scan a new HuggingFace model
    huggingfaceOnboardingEnabled: true
    huggingfaceModelsToOnboard: 'mistralai/Mistral-7B-Instruct-v0.2'
    huggingfaceOnboardingProjectName: 'Production'

    # false = scan inventory + HF model (combined)
    # true = scan only HF model (skip inventory)
    huggingfaceOnboardingOnly: false

Notes:

  • Requires some project context (one of):
    • huggingfaceOnboardingProjectName, huggingfaceOnboardingProjectId
    • or projectNames / projectIds
  • ⚠️ Important: HuggingFace onboarding creates persistent inventory resources. Models remain in your AllTrue inventory after the scan completes and are not automatically deleted. This is intentional—allowing you to track and manage onboarded models over time.
  • The 10-second default wait allows backend indexing
  • LLM pentesting is unaffected by huggingfaceOnboardingOnly

Failure Thresholds (Pipeline Behavior)

These settings control pass/fail and work item creation:

| Input | Description | Default | Options |
| --- | --- | --- | --- |
| failOutcomeAtOrAbove | The job fails if the worst known outcome is at or above this level and onThresholdAction includes fail. Use '' to disable thresholding | moderate | critical, poor, moderate, good, '' (none) |
| onThresholdAction | Action when the threshold defined by failOutcomeAtOrAbove is breached | fail | fail, work_item, both, none |
| onHardFailuresAction | Controls behavior for start/polling/permission errors (not test outcomes) | ignore | fail, work_item, both, ignore |

Outcome Severity Levels (most to least severe):

  • Critical: Critical vulnerabilities requiring immediate action
  • Poor: Significant security concerns
  • Moderate: Issues requiring attention
  • Good: Minor issues, acceptable risk
  • Excellent: No security issues found

Action Types:

  • fail: Fail the pipeline
  • work_item: Create Azure DevOps work items
  • both: Fail pipeline AND create work items
  • none/ignore: No action (continue)

Notes

  • "Unknown" outcomes do not count towards failing the threshold.
  • When onThresholdAction includes work_item, the task will also create per-category (pentest) and per-policy (model scan) issues, filtered by categoryIssueMinSeverity (described below).
  • Setting categoryIssueMinSeverity: none disables per-category/per-policy work items, but threshold/hard-failure job-level work items can still be created when actions include work_item.
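Putting these together, a strict gate that both fails the build and files work items might look like this sketch:

```yaml
- task: AllTrueScanner@1
  inputs:
    # Fail the job and create work items if the worst outcome is poor or critical
    failOutcomeAtOrAbove: 'poor'
    onThresholdAction: 'both'
    # Surface start/poll/permission errors as work items without failing the build
    onHardFailuresAction: 'work_item'
    # ... other inputs ...
```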

Azure DevOps Boards Integration

When enabled, the scanner can create work items in Azure Boards for:

  • Threshold breaches (e.g. outcome at/above your configured threshold)
  • Hard failures (start/poll/permission errors)
  • Optional detailed findings
    • Per-category (LLM pentest)
    • Per-policy (model scan)

| Input | Description | Default |
| --- | --- | --- |
| categoryIssueMinSeverity | Minimum severity for per-category (pentest) and per-policy (model scan) issues | INFORMATIONAL |

Severity Levels: CRITICAL > HIGH > MEDIUM > LOW > INFORMATIONAL

Special Value: none

  • If categoryIssueMinSeverity: none, the action will not create per-category/per-policy issues.
  • Job-level issues (threshold breach / hard failures) may still be created when onThresholdAction or onHardFailuresAction includes work_item.
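For example, to create detailed work items only for HIGH and CRITICAL findings (a sketch):

```yaml
- task: AllTrueScanner@1
  inputs:
    onThresholdAction: 'work_item'
    # Per-category/per-policy items created at HIGH or CRITICAL severity only
    categoryIssueMinSeverity: 'HIGH'
    # ... other inputs ...
```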

Required Azure DevOps Settings

| Input | Description | Example | Default |
| --- | --- | --- | --- |
| adoOrgUrl | Organization URL. Usually leave blank unless overriding the Azure Pipelines default | https://dev.azure.com/myorg | System.CollectionUri |
| adoProject | Project name. Usually leave blank unless overriding the Azure Pipelines default | my-project | System.TeamProject |
| adoToken | Auth token (PAT or OAuth token in pipelines). Usually leave blank unless overriding the Azure Pipelines default | ... | System.AccessToken |

Enabling System Access Token for Azure Boards

Required for work item creation. If you see "Azure Boards work item creation skipped", follow these steps:

For YAML Pipelines:

  1. Organization Settings (one-time):

    • Navigate to: Organization Settings → Pipelines → Settings
    • Enable: "Limit job authorization scope to current project for non-release pipelines" (if disabled globally)
  2. Project Settings (one-time):

    • Navigate to: Project Settings → Pipelines → Settings
    • Enable: "Limit job authorization scope to current project for non-release pipelines"
  3. Pipeline YAML (per pipeline):

   # Add this validation step to your pipeline
   - script: |
       if [ -z "$(System.AccessToken)" ]; then
         echo "##vso[task.logissue type=error]System.AccessToken is empty!"
         echo "Enable 'Allow scripts to access OAuth token' in pipeline settings"
         exit 1
       fi
     displayName: "Validate OAuth token availability"

For Classic Pipelines:

  1. Edit pipeline → Options tab
  2. Check: "Allow scripts to access the OAuth token"
  3. Save

Alternative: Use a PAT

If OAuth token setup is problematic, use a Personal Access Token instead:

- task: AllTrueScanner@1
  inputs:
    adoToken: "$(ADO_PAT)"  # Store PAT as secret variable
    # ... other inputs

PAT Requirements:

  • Scope: Work Items (Read & write)
  • Organization: Same as your Azure DevOps organization

Using System.AccessToken reliably:

  • Boards requires System.AccessToken unless you provide adoToken
  • It will be empty unless Allow scripts to access OAuth token is enabled
  • Optional: use a PAT with Work Items (Read & write)

For PowerShell on Windows, explicitly map the token into env: and reference it as $env:SYSTEM_ACCESSTOKEN:

- task: PowerShell@2
  displayName: "Validate OAuth token availability"
  inputs:
    targetType: 'inline'
    script: |
      if ([string]::IsNullOrEmpty($env:SYSTEM_ACCESSTOKEN)) {
        Write-Host "##vso[task.logissue type=warning]SYSTEM_ACCESSTOKEN is empty. Enable 'Allow scripts to access the OAuth token'."
        exit 0
      }
      Write-Host "SYSTEM_ACCESSTOKEN is present."
  env:
    SYSTEM_ACCESSTOKEN: $(System.AccessToken)

For Bash on Linux/macOS, you can access it directly:

- bash: |
    if [ -z "$(System.AccessToken)" ]; then
      echo "##vso[task.logissue type=warning]System.AccessToken is empty. Enable 'Allow scripts to access the OAuth token'."
      exit 0
    fi
    echo "System.AccessToken is present."
  displayName: "Validate OAuth token availability"

Work Item Type

| Input | Description | Default |
| --- | --- | --- |
| adoWorkItemType | Preferred work item type name. Auto-fallback if missing | Issue |

Important behavior:

  • The scanner discovers available work item types for your project.
  • If your preferred type isn’t available (e.g., Bug is not present in the process), it automatically falls back.

Default fallback order:

  1. preferred type
  2. Issue
  3. Bug
  4. Task
  5. first available type

Optional Work Item Fields

| Input | Description |
| --- | --- |
| adoAssignedTo | Set System.AssignedTo (display name or email; must be valid in the org) |
| adoAreaPath | Set System.AreaPath (e.g. Project\Team) |
| adoIterationPath | Set System.IterationPath (e.g. Project\Sprint 1) |
| adoDefaultTags | Semicolon-separated tags to apply to every work item |

Note on Tags: Azure DevOps stores tags separated by semicolons internally. This input accepts comma-separated OR semicolon-separated values—both formats are automatically converted to the correct internal format.

Examples:

# ✅ Both formats work
adoDefaultTags: "security,automated,ci-cd"
adoDefaultTags: "security;automated;ci-cd"

Dedupe Behavior

| Input | Description | Default |
| --- | --- | --- |
| adoDedupeEnabled | Enable dedupe checks before creating work items | true |
| adoDedupeExcludeStates | Terminal states to exclude from dedupe checks | Closed |

How dedupe works: Dedupe is done primarily via tags (stable and searchable), with a fallback to an HTML marker in the description:

  • Before creating a new item, the task runs a WIQL query:

    • Tag-based dedupe checks System.Tags CONTAINS '<dedupe tag>'
    • Marker fallback checks System.Description CONTAINS '<marker>'
  • This prevents duplicate work items when the same finding repeats across runs (until the existing item reaches a terminal state such as "Closed").
  • Treat adoDedupeExcludeStates as process-specific (Agile/Scrum/CMMI/custom).
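A typical dedupe configuration is sketched below; the state names depend on your process template, and the comma-separated format for multiple states is an assumption:

```yaml
- task: AllTrueScanner@1
  inputs:
    adoDedupeEnabled: true
    # Agile process terminal states; adjust for Scrum/CMMI/custom processes
    adoDedupeExcludeStates: 'Closed,Removed'
    # ... other inputs ...
```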


Concurrency (Performance)

Input Description Default
maxConcurrentPentests Max concurrent tests 8
startStaggerSecs Delay between starting tests to avoid backend spikes 0
maxStartRetries Max retries for transient start errors (5xx/429, etc.); other errors are not retried 3
startRetryDelay Delay between start retries, in seconds 30
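The four inputs combined (defaults plus a small stagger; values illustrative):

```yaml
maxConcurrentPentests: "8"
startStaggerSecs: "5"    # space out start requests to avoid backend spikes
maxStartRetries: "3"     # applies only to transient start errors (5xx/429)
startRetryDelay: "30"
```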

Polling Configuration

Input Description Default
pollTimeoutSecs Max wait time per resource (in seconds) 5400 (1.5 hours)
pollTimeoutAction Behavior on timeout fail
graphqlPollIntervalSecs Poll interval for execution completion checks 30 seconds

Timeout Actions:

  • fail: Mark as timeout failure
  • continue: Continue pipeline (test may still run server-side)
  • partial: Attempt to retrieve partial results via GraphQL before giving up
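For example, a polling configuration sized for pentestNumAttempts: "2", following the guidance in this section:

```yaml
pollTimeoutSecs: "10800"        # 3 hours: doubled baseline for two attempts plus CI/CD overhead
pollTimeoutAction: "partial"    # try to salvage partial results via GraphQL on timeout
graphqlPollIntervalSecs: "60"
```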

How polling works: The task uses pure GraphQL polling to monitor test execution. It polls the GraphQL endpoint at regular intervals (default 30 seconds) until tests complete or timeout is reached.

⚠️ Important: When using pentestNumAttempts > 1, increase pollTimeoutSecs proportionally. Example: with pentestNumAttempts: 2, set pollTimeoutSecs: 10800 (3 hours) to account for doubled execution time plus CI/CD overhead.


Outputs & Artifacts

Outputs

The task provides outputs in two formats for maximum flexibility:

1. Same-Job Variables (Immediate Access)

Available immediately in subsequent steps of the same job using $(VARIABLE) or $env:VARIABLE:

Variable Values
ALLTRUE_OVERALL_STATUS success, neutral, failure
ALLTRUE_LLM_PENTEST_STATUS success, neutral, failure
ALLTRUE_MODEL_SCAN_STATUS success, neutral, failure
ALLTRUE_WORST_OUTCOME Critical, Poor, Moderate, Good, Excellent, Unknown

Quick Syntax Reference

Shell Same-Job Access Notes
bash (Linux/Mac) $ALLTRUE_OVERALL_STATUS Default on Linux/Mac agents
PowerShell (Windows) $env:ALLTRUE_OVERALL_STATUS Recommended for Windows
cmd (Windows) %ALLTRUE_OVERALL_STATUS% Used by the generic script: step on Windows

Usage:

- task: AllTrueScanner@1
  inputs:
    # ... config ...

# ✅ Same-job access (PowerShell on Windows)
- task: PowerShell@2
  condition: always()
  inputs:
    targetType: 'inline'
    script: echo "Status: $env:ALLTRUE_OVERALL_STATUS"

# ✅ Same-job access (bash on Linux/Mac)
- bash: echo "Status: $ALLTRUE_OVERALL_STATUS"
  condition: always()

2. Cross-Job Outputs (Downstream Jobs)

Access from dependent jobs using dependencies.<job>.outputs['<taskName>.<output>']:

Requirements:

  1. Set a name: on the AllTrueScanner task
  2. Reference via dependencies in dependent jobs

Example:

- job: security_scan
  steps:
    - task: AllTrueScanner@1
      name: alltrueScan  # ← Required for cross-job access
      inputs:
        # ... config ...

- job: deploy_staging
  dependsOn: security_scan
  condition: ne(dependencies.security_scan.outputs['alltrueScan.worstOutcome'], 'Critical')
  steps:
    - script: echo "Deploying to staging..."

- job: deploy_production
  dependsOn: security_scan
  condition: eq(dependencies.security_scan.outputs['alltrueScan.overallStatus'], 'success')
  steps:
    - script: echo "Deploying to production!"

Output Variables Reference

Output names are stable and versioned; you can safely depend on them for gates and notifications.

Output Name Access in Same Job Access in Dependent Job
overallStatus $ALLTRUE_OVERALL_STATUS dependencies.<job>.outputs['<taskName>.overallStatus']
llmPentestStatus $ALLTRUE_LLM_PENTEST_STATUS dependencies.<job>.outputs['<taskName>.llmPentestStatus']
modelScanStatus $ALLTRUE_MODEL_SCAN_STATUS dependencies.<job>.outputs['<taskName>.modelScanStatus']
worstOutcome $ALLTRUE_WORST_OUTCOME dependencies.<job>.outputs['<taskName>.worstOutcome']
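When the consuming job lives in a different stage, the prefix changes from dependencies to stageDependencies; the task name and output names stay the same (see Example 3 under Usage Examples):

```yaml
# Stage-level gate: SecurityScan stage -> security_scan job -> alltrueScan task
condition: eq(stageDependencies.SecurityScan.security_scan.outputs['alltrueScan.overallStatus'], 'success')
```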

Artifacts

The task automatically uploads scan results as a timestamped artifact:

  • Artifact Name: alltrue-scan-results-YYYY-MM-DDTHH-MM-SS
  • Contents: All pentest CSVs, model scan CSVs, JSON summaries

Input Default Description
publishResultsArtifact true Upload the results directory as a pipeline artifact
resultsArtifactName alltrue-scan-results Artifact base name (a timestamp suffix is added)

Artifact Features:

  • The task writes results into:
    • $(Build.SourcesDirectory)/.alltrue-results
  • When artifact publishing is enabled, that folder is uploaded.
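To consume the artifact in a later job of the same run, a sketch using the standard download step; because the artifact name carries a timestamp suffix, glob for it rather than hard-coding the full name:

```yaml
- download: current       # fetch all artifacts published earlier in this run
- bash: |
    ls "$(Pipeline.Workspace)"/alltrue-scan-results-*/
  displayName: "List AllTrue scan results"
```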

Platform-Specific Considerations

Windows Agents

  • PowerShell is recommended for scripts that access environment variables
  • Use $env:VARIABLE_NAME syntax
  • Self-hosted Windows agents fully supported (tested on Windows Server)
  • The generic script: task uses cmd.exe on Windows (not bash)

Example:

pool:
  name: 'MyWindowsAgents'

steps:
  - task: AllTrueScanner@1
    name: alltrueScan
    inputs:
      pythonPath: "python"  # or "python3"
      # ... other inputs

  - task: PowerShell@2
    displayName: "Show scan results"
    condition: always()
    inputs:
      targetType: 'inline'
      script: |
        Write-Host "=== Security Scan Results ==="
        Write-Host "Overall Status: $env:ALLTRUE_OVERALL_STATUS"
        Write-Host "Worst Outcome: $env:ALLTRUE_WORST_OUTCOME"
        
        if ($env:ALLTRUE_OVERALL_STATUS -eq 'failure') {
          Write-Host "##vso[task.logissue type=error]Security scan failed!"
          exit 1
        }

Linux Agents (Microsoft-hosted or self-hosted)

  • bash or script tasks work as expected
  • Use $VARIABLE_NAME syntax
  • Most Microsoft-hosted images include Python 3.11+

Example:

pool:
  vmImage: 'ubuntu-latest'

steps:
  - task: AllTrueScanner@1
    name: alltrueScan
    inputs:
      pythonPath: "python3"
      # ... other inputs

  - bash: |
      echo "=== Security Scan Results ==="
      echo "Overall Status: $ALLTRUE_OVERALL_STATUS"
      echo "Worst Outcome: $ALLTRUE_WORST_OUTCOME"
      
      if [ "$ALLTRUE_OVERALL_STATUS" = "failure" ]; then
        echo "##vso[task.logissue type=error]Security scan failed!"
        exit 1
      fi
    displayName: "Show scan results"
    condition: always()

macOS Agents

  • Same as Linux (bash syntax)
  • Ensure Python 3.11+ is available via UsePythonVersion@0
  • Use $VARIABLE_NAME syntax

Example:

pool:
  vmImage: 'macOS-latest'

steps:
  - task: UsePythonVersion@0
    inputs:
      versionSpec: '3.11'

  - task: AllTrueScanner@1
    name: alltrueScan
    inputs:
      pythonPath: "python3"
      # ... other inputs

  - bash: echo "Status: $ALLTRUE_OVERALL_STATUS"
    condition: always()

Cross-Platform Pipelines

For pipelines that run on multiple platforms, use an explicit bash: step:

strategy:
  matrix:
    Linux:
      vmImage: 'ubuntu-latest'
    Windows:
      vmImage: 'windows-latest'
    macOS:
      vmImage: 'macOS-latest'

pool:
  vmImage: $(vmImage)

steps:
  - task: AllTrueScanner@1
    name: alltrueScan
    inputs:
      # ... config ...

  # ✅ Works on all platforms
  - bash: |
      echo "Overall Status: $ALLTRUE_OVERALL_STATUS"
      echo "Worst Outcome: $ALLTRUE_WORST_OUTCOME"
    displayName: "Show results (cross-platform)"
    condition: always()

Usage Examples

Example 1: Complete Configuration (All Options)

Comprehensive example showing every available configuration option:

trigger: none
pr: none

pool:
  vmImage: ubuntu-latest

stages:
- stage: SecurityScan
  displayName: AI System Security Testing
  jobs:
  - job: security_scan
    displayName: Run AI Security Scanner
    continueOnError: true
    timeoutInMinutes: 135
    steps:
      - checkout: self

      - script: |
          echo "Validating OAuth token availability..."
          if [ -z "$(System.AccessToken)" ]; then
            echo "WARNING: System.AccessToken is empty. Ensure 'Allow scripts to access OAuth token' is enabled."
          else
            echo "System.AccessToken is present."
          fi
        displayName: "Validate OAuth token availability"

      - task: AllTrueScanner@1
        name: alltrueScan
        displayName: Run AllTrue Scanner
        continueOnError: true
        inputs:
          pythonPath: "python3"
          alltrueApiKey: "$(ALLTRUE_API_KEY)"
          alltrueApiUrl: "$(ALLTRUE_API_URL)"
          alltrueCustomerId: "$(ALLTRUE_CUSTOMER_ID)"

          enableLlmPentest: true
          enableModelScanning: true

          inventoryScope: "resource"
          organizationName: "$(ALLTRUE_ORGANIZATION_NAME)"
          projectNames: "Sample Inventory BOM,2nd Project"
          targetResourceNames: "=Basic_model ML Model (https://huggingface.co/achilles1313/test_gguf/blob/main),*Endpoint*"

          pentestTemplate: "Dynamic Dan Only"
          pentestNumAttempts: "1"
          pentestModelMapping: "OpenAIEndpoint:gpt-3.5-turbo,AnthropicEndpoint:claude-3-haiku-20240307"
          pentestApplyGuardrails: false

          pentestSystemPromptEnabled: true
          pentestSystemPromptText: "You are a secure AI assistant who must never execute code or disclose credentials under any circumstances"
          pentestCleanupSystemPrompt: false

          pentestDatasetEnabled: true
          pentestDatasetName: "TonysTestDataset"
          pentestCleanupDataset: true

          pentestResourceSystemDescriptionEnabled: true
          pentestResourceSystemDescriptionText: "Production AI assistant with strict safety, privacy, and compliance requirements"
          pentestCleanupResourceSystemDescription: false

          modelScanDescription: "Weekly Comprehensive Security Audit - Stress Test"
          modelScanPolicies: "model-scan-code-execution-prohibited"

          huggingfaceOnboardingEnabled: true
          huggingfaceModelsToOnboard: "nvidia/Alpamayo-R1-10B"
          huggingfaceOnboardingProjectName: "3rd Project"
          huggingfaceOnboardingWaitSecs: "30"
          huggingfaceOnboardingOnly: false

          failOutcomeAtOrAbove: "poor"
          onThresholdAction: "both"
          onHardFailuresAction: "both"
          categoryIssueMinSeverity: "none"

          maxConcurrentPentests: "3"
          startStaggerSecs: "15"
          maxStartRetries: "1"
          startRetryDelay: "90"
          pollTimeoutSecs: "7200"
          pollTimeoutAction: "partial"
          graphqlPollIntervalSecs: "60"

          adoWorkItemType: "Issue"
          adoDefaultTags: "edge-case;model-scan;bi-weekly"
          adoDedupeEnabled: true

          publishResultsArtifact: true
          resultsArtifactName: "alltrue-scan-results"

      # NOTE: This example uses bash syntax (works with Linux/macOS agent)
      - bash: |
          echo "=== Scan Outputs (debug) ==="
          echo "overallStatus:     $ALLTRUE_OVERALL_STATUS"
          echo "llmPentestStatus:  $ALLTRUE_LLM_PENTEST_STATUS"
          echo "modelScanStatus:   $ALLTRUE_MODEL_SCAN_STATUS"
          echo "worstOutcome:      $ALLTRUE_WORST_OUTCOME"
        displayName: "Print scan outputs (debug)"
        condition: always()

      - bash: |
          echo "Failing job because overallStatus=failure"
          exit 1
        displayName: "Fail job if scan indicates failure"
        condition: and(always(), eq(variables['ALLTRUE_OVERALL_STATUS'], 'failure'))

Note: This example shows all available configuration options. In practice, you only need to specify options that differ from defaults or are required for your use case.

Example 2: Simple Organization Scan

trigger:
  branches:
    include:
      - main

pool:
  vmImage: 'ubuntu-latest'

jobs:
- job: security_scan
  displayName: AI Security Scanner
  steps:
    - task: AllTrueScanner@1
      inputs:
        pythonPath: "python3"
        alltrueApiKey: "$(ALLTRUE_API_KEY)"
        alltrueApiUrl: "$(ALLTRUE_API_URL)"
        alltrueCustomerId: "$(ALLTRUE_CUSTOMER_ID)"
        organizationName: "$(ALLTRUE_ORGANIZATION_NAME)"
        
        enableLlmPentest: true
        pentestTemplate: "Prompt Injection"
        
        failOutcomeAtOrAbove: "moderate"
        onThresholdAction: "both"

Example 3: Multi-Stage with Gated Deployment

stages:
- stage: SecurityScan
  jobs:
  - job: security_scan
    continueOnError: true
    steps:
      - task: AllTrueScanner@1
        name: alltrueScan
        inputs:
          alltrueApiKey: "$(ALLTRUE_API_KEY)"
          alltrueApiUrl: "$(ALLTRUE_API_URL)"
          alltrueCustomerId: "$(ALLTRUE_CUSTOMER_ID)"
          organizationName: "$(ALLTRUE_ORGANIZATION_NAME)"
          enableLlmPentest: true
          enableModelScanning: true

- stage: DeployProduction
  dependsOn: SecurityScan
  condition: eq(stageDependencies.SecurityScan.security_scan.outputs['alltrueScan.overallStatus'], 'success')
  jobs:
  - job: deploy
    steps:
      - script: echo "Deploying to production!"

Example 4: With Azure Boards Integration

- task: AllTrueScanner@1
  inputs:
    alltrueApiKey: "$(ALLTRUE_API_KEY)"
    alltrueApiUrl: "$(ALLTRUE_API_URL)"
    alltrueCustomerId: "$(ALLTRUE_CUSTOMER_ID)"
    organizationName: "$(ALLTRUE_ORGANIZATION_NAME)"
    
    enableLlmPentest: true
    pentestTemplate: "Prompt Injection"
    
    onThresholdAction: "both"
    categoryIssueMinSeverity: "MEDIUM"
    adoWorkItemType: "Issue"
    adoDefaultTags: "security,ai-testing"
    adoAssignedTo: "security-team@company.com"

Example 5: Complete Cross-Platform Pipeline

Windows/Linux support, same-job + cross-job outputs, deployment gates

This example shows the recommended approach for consuming scanner results:

  • In the same job: read the environment variables set by the task (shell-specific syntax differs)
  • Across jobs: use dependencies.<job>.outputs['<taskName>.<output>'] (same syntax on every OS)

trigger: none
pr: none

stages:
- stage: SecurityScan
  displayName: AI System Security Testing
  jobs:
  # ------------------------------------------------------------
  # 1) Run scan (produces outputs)
  # ------------------------------------------------------------
  - job: security_scan
    displayName: Run AllTrue Scanner
    timeoutInMinutes: 135
    continueOnError: true
    pool:
      name: Default   # Works for self-hosted Windows or Linux pools

    steps:
      - checkout: self
        persistCredentials: true

      # (Optional) Windows-friendly OAuth token validation for Boards
      # NOTE: Requires Pipeline setting "Allow scripts to access the OAuth token"
      - task: PowerShell@2
        displayName: "Validate OAuth token availability (Windows/PowerShell)"
        condition: and(succeededOrFailed(), eq(variables['Agent.OS'], 'Windows_NT'))
        inputs:
          targetType: 'inline'
          script: |
            if ([string]::IsNullOrEmpty($env:SYSTEM_ACCESSTOKEN)) {
              Write-Host "##vso[task.logissue type=warning]SYSTEM_ACCESSTOKEN is empty. Enable: 'Allow scripts to access the OAuth token'."
              exit 0
            }
            Write-Host "SYSTEM_ACCESSTOKEN is present."
        env:
          SYSTEM_ACCESSTOKEN: $(System.AccessToken)

      - task: AllTrueScanner@1
        name: alltrueScan   # IMPORTANT: required for cross-job outputs
        displayName: Run AllTrue AI Security Scanner
        continueOnError: true
        inputs:
          pythonPath: "python"
          alltrueApiKey: "$(ALLTRUE_API_KEY)"
          alltrueApiUrl: "$(ALLTRUE_API_URL)"
          alltrueCustomerId: "$(ALLTRUE_CUSTOMER_ID)"

          enableLlmPentest: true
          enableModelScanning: true

          inventoryScope: "organization"
          organizationName: "$(ALLTRUE_ORGANIZATION_NAME)"

          # Example gate config
          failOutcomeAtOrAbove: "poor"
          onThresholdAction: "both"
          onHardFailuresAction: "both"
          categoryIssueMinSeverity: "none"

          publishResultsArtifact: true
          resultsArtifactName: "alltrue-scan-results"

      # --- Same-job outputs (cross-platform print) ---
      # Linux/macOS agent => Bash
      - bash: |
          echo "=== AllTrue outputs (bash) ==="
          echo "overallStatus:     $ALLTRUE_OVERALL_STATUS"
          echo "llmPentestStatus:  $ALLTRUE_LLM_PENTEST_STATUS"
          echo "modelScanStatus:   $ALLTRUE_MODEL_SCAN_STATUS"
          echo "worstOutcome:      $ALLTRUE_WORST_OUTCOME"
        displayName: "Print outputs (bash)"
        condition: and(always(), ne(variables['Agent.OS'], 'Windows_NT'))

      # Windows agent => PowerShell
      - task: PowerShell@2
        displayName: "Print outputs (PowerShell)"
        condition: and(always(), eq(variables['Agent.OS'], 'Windows_NT'))
        inputs:
          targetType: 'inline'
          script: |
            Write-Host "=== AllTrue outputs (PowerShell) ==="
            Write-Host "overallStatus:     $env:ALLTRUE_OVERALL_STATUS"
            Write-Host "llmPentestStatus:  $env:ALLTRUE_LLM_PENTEST_STATUS"
            Write-Host "modelScanStatus:   $env:ALLTRUE_MODEL_SCAN_STATUS"
            Write-Host "worstOutcome:      $env:ALLTRUE_WORST_OUTCOME"

      # Windows agent => cmd (optional)
      - script: |
          echo === AllTrue outputs (cmd) ===
          echo overallStatus:     %ALLTRUE_OVERALL_STATUS%
          echo llmPentestStatus:  %ALLTRUE_LLM_PENTEST_STATUS%
          echo modelScanStatus:   %ALLTRUE_MODEL_SCAN_STATUS%
          echo worstOutcome:      %ALLTRUE_WORST_OUTCOME%
        displayName: "Print outputs (cmd)"
        condition: and(always(), eq(variables['Agent.OS'], 'Windows_NT'))

  # ------------------------------------------------------------
  # 2) Notify team based on cross-job outputs (OS-agnostic)
  # ------------------------------------------------------------
  - job: notify_security
    displayName: Notify Security Team (gated)
    dependsOn: security_scan
    condition: and(always(), eq(dependencies.security_scan.outputs['alltrueScan.overallStatus'], 'failure'))
    variables:
      overallStatus:   $[ dependencies.security_scan.outputs['alltrueScan.overallStatus'] ]
      llmPentestStatus: $[ dependencies.security_scan.outputs['alltrueScan.llmPentestStatus'] ]
      modelScanStatus: $[ dependencies.security_scan.outputs['alltrueScan.modelScanStatus'] ]
      worstOutcome:    $[ dependencies.security_scan.outputs['alltrueScan.worstOutcome'] ]
    steps:
      - script: |
          echo "ALERT: AllTrue scan failed"
          echo "overallStatus:     $(overallStatus)"
          echo "llmPentestStatus:  $(llmPentestStatus)"
          echo "modelScanStatus:   $(modelScanStatus)"
          echo "worstOutcome:      $(worstOutcome)"
        displayName: "Emit alert"

  # ------------------------------------------------------------
  # 3) Gate deployment based on cross-job outputs (OS-agnostic)
  # ------------------------------------------------------------
  - job: deploy_production
    displayName: Deploy to Production (gated)
    dependsOn: security_scan
    condition: eq(dependencies.security_scan.outputs['alltrueScan.overallStatus'], 'success')
    steps:
      - script: |
          echo "OK: AllTrue checks passed. Deploying to production..."
        displayName: "Deploy"

Cheat sheet:

  • Same job:

    • bash: $ALLTRUE_WORST_OUTCOME
    • PowerShell: $env:ALLTRUE_WORST_OUTCOME
    • cmd: %ALLTRUE_WORST_OUTCOME%
  • Different job/stage:

    • dependencies.security_scan.outputs['alltrueScan.worstOutcome']

Important:

  • script: steps are not portable across operating systems:
    • Linux/macOS → bash
    • Windows → cmd.exe
  • For portable pipelines, prefer:
    • bash: for Linux/macOS only
    • PowerShell@2 for Windows or cross-platform use
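One way to keep a single result-printing step for every OS (a sketch, not required by the task) is PowerShell@2 with pwsh: true, which runs PowerShell Core on Linux, macOS, and Windows agents:

```yaml
- task: PowerShell@2
  displayName: "Print outputs (all platforms)"
  condition: always()
  inputs:
    targetType: 'inline'
    pwsh: true   # PowerShell Core, available on all Microsoft-hosted images
    script: Write-Host "Worst Outcome: $env:ALLTRUE_WORST_OUTCOME"
```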

Security & Permissions

Required API Permissions

Your AllTrue API key must have access to:

  • get /v2/ai-validation/importable-datasets
  • get /v2/llm-pentest/customer/{customer_id}/llm-pentest-models/{resource_instance_id}
  • get /v1/inventory/customer/resource/{resource_instance_id}/llm-endpoint-resource-additional-config
  • patch /v1/inventory/customer/resource/{resource_instance_id}/llm-endpoint-resource-additional-config
  • get /v2/graphql
  • post /v2/graphql
  • get /v1/graphql
  • post /v1/graphql
  • query v2.llmPentestScanExecution
  • get /v1/inventory/customer/{customer_id}/resources
  • get /v2/llm-pentest/customer/{customer_id}/templates
  • post /v2/llm-pentest/customer/{customer_id}/start-pentest
  • post /v2/llm-pentest/customer/{customer_id}/executions/{llm_pentest_scan_execution_id}/download-csv
  • query v2.resourceInstanceForLlmPentestScanExecution
  • query v2.failedCategoriesResultsPerCategory
  • query aiSpmGetPentestIssues
  • query v1.aiSpmGetPentestIssues
  • post /v1/posture-management/customers/{customer_id}/model-scanning/check-policies
  • query v2.modelScanExecution
  • query v2.resourceInstanceForModelScanExecution
  • query v2.modelScanResultsPerPolicy
  • query modelScanDetails
  • query v1.modelScanDetails
  • get /v1/admin/customers/{customer_id}/organizations/projects
  • query modelScanSummaries
  • query v1.modelScanSummaries
  • query v2.modelScanSummaries
  • post /v1/inventory/resources

Azure DevOps Permissions (Boards/Work Items)

For Azure Boards work item creation:

  • Pipelines OAuth token: System.AccessToken must be enabled and authorized to create work items in the project
  • Alternative: Use a PAT with Work Items (Read & write) permissions
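If you go the PAT route, a sketch of passing it to the task; this assumes the task reads the ADO_TOKEN environment variable (as listed in the Boards troubleshooting entry), and BOARDS_PAT is a hypothetical secret variable name:

```yaml
- task: AllTrueScanner@1
  inputs:
    # ... config ...
  env:
    ADO_TOKEN: $(BOARDS_PAT)   # hypothetical secret holding a Work Items (Read & write) PAT
```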

Best Practices

1. Use Names for Readability

# ✅ Recommended
organizationName: 'ACME Corporation'
projectNames: 'Production,Staging'

# ❌ Less readable
organizationId: '364fe49b-6ea1-4a53-83db-f8311a9c8412'
projectIds: '5c221ef3-86a5-49e0-bce9-df09b9a1d51a'

2. Store Credentials as Pipeline Variables

Secret Variables: ALLTRUE_API_KEY
Regular Variables: ALLTRUE_API_URL, ALLTRUE_CUSTOMER_ID, ALLTRUE_ORGANIZATION_NAME

3. Adjust Timeouts for Multiple Attempts

inputs:
  pentestNumAttempts: "2"
  pollTimeoutSecs: "10800"  # 3 hours for 2x attempts

4. Resource Management

Balance performance, API backend spikes, and completion time:

- task: AllTrueScanner@1
  inputs:
    # High-throughput configuration for large inventories
    maxConcurrentPentests: 24
    startStaggerSecs: 3  # Prevent API backend spike
  
    # Adjust timeouts based on num_attempts_on_testcase
    pentestNumAttempts: 2
    pollTimeoutSecs: 10800  # 3 hours (2x baseline for 2 attempts)
    pollTimeoutAction: 'partial'  # Retrieve partial results on timeout

Performance Tip: Start with 8-10 concurrent tests and increase gradually while monitoring for pentest/scan start errors. Use startStaggerSecs to space out requests.

5. Test HuggingFace Models Before Production

# Pre-production validation workflow
huggingfaceOnboardingEnabled: true
huggingfaceModelsToOnboard: 'your-org/new-model'
huggingfaceOnboardingProjectName: 'ML Engineering'
huggingfaceOnboardingOnly: true  # Test only this model
failOutcomeAtOrAbove: 'moderate'

6. Model Selection Strategy

Use model mapping when:

  • ✅ You need consistent model versions across tests
  • ✅ Testing specific model capabilities or vulnerabilities
  • ✅ Comparing different models' security characteristics
  • ✅ Production uses specific model versions

Example pattern for multi-provider environments:

pentestModelMapping: |
  OpenAIEndpoint:gpt-4-turbo-preview,
  AnthropicEndpoint:claude-3-5-sonnet-latest,
  BedrockEndpoint:anthropic.claude-v2,
  GoogleAIEndpoint:gemini-1.5-pro,
  IBMWatsonxEndpoint:ibm/granite-13b-chat-v2

7. System Prompt Best Practices

When to configure system prompts:

  • ✅ Testing production configurations
  • ✅ Validating system prompt effectiveness
  • ✅ Comparing different safety approaches
  • ✅ Compliance testing with specific instructions

System prompt guidelines:

  • Keep prompts clear and specific
  • Include explicit safety rules
  • Test both with and without system prompts to understand their impact
  • Enable cleanup to avoid affecting other tests if your system prompt is not intended to be permanent
  • Use multi-line format for readability

Example structure:

pentestSystemPromptText: |
  You are a [role]. You must:
  1) [Primary safety rule]
  2) [Secondary safety rule]
  3) [Behavior guideline]
  4) [Escalation/refusal instruction]

8. Guardrails Configuration

Enable guardrails when:

  • ✅ Testing production endpoints with active safety measures
  • ✅ Validating that guardrails work as expected

Disable guardrails when:

  • ✅ Performing baseline security testing
  • ✅ Finding underlying model vulnerabilities
  • ✅ Comparing raw model behavior vs. protected behavior

Pattern for comparative testing:

# Job 1: Test without guardrails (baseline)
pentestApplyGuardrails: false

# Job 2: Test with guardrails (production config)
pentestApplyGuardrails: true

9. Disable detailed issues when you only want a threshold "gate"

inputs:
  onThresholdAction: "both"
  categoryIssueMinSeverity: "none"

Troubleshooting

Common Issues

🧭 Quick "Am I configured right?" flow

  1. Core credentials set? → alltrueApiKey, alltrueApiUrl, alltrueCustomerId

  2. Scope makes sense?

    • organization → set organizationId or organizationName.
    • project → set projectIds or projectNames.
    • resource → set targetResourceIds or targetResourceNames and one of organizationId/name or projectIds/names.
  3. Pentest enabled? → Set pentestTemplate

  4. Model scan enabled? → Set modelScanPolicies

  5. Work items enabled? → Set onThresholdAction: work_item or both AND enable OAuth token
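A minimal configuration that passes steps 1-3 of this flow (organization scope, pentest only), mirroring Example 2:

```yaml
- task: AllTrueScanner@1
  inputs:
    alltrueApiKey: "$(ALLTRUE_API_KEY)"
    alltrueApiUrl: "$(ALLTRUE_API_URL)"
    alltrueCustomerId: "$(ALLTRUE_CUSTOMER_ID)"
    inventoryScope: "organization"
    organizationName: "$(ALLTRUE_ORGANIZATION_NAME)"
    enableLlmPentest: true
    pentestTemplate: "Prompt Injection"
```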

Issue: "No resources selected"

  • Check your inventoryScope configuration
  • Verify Organization Name/ID, Project Names/IDs are set correctly
  • Ensure resources exist in AllTrue inventory

Issue: "Could not resolve organization name"

  • Verify the name matches exactly in AllTrue UI (matching ignores letter case but must otherwise be exact)
  • Try using organizationId as a fallback

Issue: "Could not resolve project name"

  • Verify the name matches exactly in AllTrue UI
  • Ensure project exists and is active
  • Check if the project is in the expected organization
  • Try using projectIds as a fallback

Issue: "Could not resolve HuggingFace onboarding project name"

  • Verify the name matches exactly in AllTrue UI
  • Ensure the project exists and is active

Issue: "Permission denied accessing organization lookup endpoint"

  • Your API key may not have access to /v1/admin/customers/{customer_id}/organizations/projects
  • Contact your AllTrue Customer Success Engineer to grant appropriate permissions
  • As a workaround, use UUIDs (organizationId, projectIds) instead of names

Issue: "Missing Organization Identifier" or "Missing Project Identifier"

  • You're using resource scope without proper access control context
  • Set either organizationName/id OR projectNames/ids
  • This is a security requirement to prevent unintended customer-wide scanning

Issue: "Pentest template not found"

  • Verify template name matches exactly (case-sensitive)
  • Check available templates in AllTrue UI
  • Template Management: Configure pentest templates in the AllTrue platform UI

Issue: "Start failures - permission denied"

  • Verify API key permissions (see Security & Permissions)
  • Check AllTrue license status

Issue: "Mapped model not available on endpoint"

  • The model specified in pentestModelMapping isn't available for that specific resource
  • Check the logs for available models
  • Verify the model name matches exactly (case-sensitive)
  • The system will fall back to the endpoint's default model

Issue: "Failed to configure system prompt"

  • Verify API key has PATCH access to /v1/inventory/customer/resource/{resource_instance_id}/llm-endpoint-resource-additional-config
  • Check that resource_instance_id is valid
  • System will continue with existing configuration if PATCH fails (non-blocking)

Issue: "Timeout during polling"

  • Increase pollTimeoutSecs
  • Consider reducing maxConcurrentPentests to avoid backend spikes

"Configured work item type 'Bug' not available... Falling back to 'Issue'"

  • Expected when the project process doesn’t include Bug.
  • Set adoWorkItemType: Issue to avoid the warning.

"Azure Boards work item creation skipped"

  • Boards creation requires all three:
    • ADO_ORG_URL (or System.CollectionUri)
    • ADO_PROJECT (or System.TeamProject)
    • ADO_TOKEN (or System.AccessToken)
  • If you expect System.AccessToken to work:
    • Ensure Allow scripts to access OAuth token is enabled.
  • Check onThresholdAction includes work_item or both

Dedupe not preventing duplicates

  • Confirm the existing work items are not in terminal states listed in adoDedupeExcludeStates (default: Closed)
  • Confirm adoDedupeEnabled: true

"System.AccessToken is empty"

  • Enable "Allow scripts to access OAuth token" in pipeline settings
  • See Enabling System.AccessToken

"Free memory is lower than 5%" warnings

This warning appears when the agent is under memory pressure. Common causes:

  1. Too many concurrent scans: Reduce maxConcurrentPentests

    maxConcurrentPentests: "3"  # Instead of 8+
  2. Self-hosted agent undersized: Increase agent VM memory

    • Recommended: 8GB+ RAM for typical workloads
    • Large inventories (100+ resources): 16GB+ RAM
  3. Other jobs running: Ensure agent has dedicated capacity

Impact: Usually none - scans complete successfully despite the warning. Monitor for actual failures or timeouts.

Issue: "No hosted parallelism has been purchased or granted"

  • Your organization does not have Microsoft-hosted parallelism enabled
  • Request the free grant: https://aka.ms/azpipelines-parallelism-request
  • Or use a self-hosted agent (no parallelism limits)

HuggingFace onboarding failed

  • Verify project context is provided (huggingfaceOnboardingProjectName, huggingfaceOnboardingProjectId, projectNames, or projectIds)
  • Check model exists on HuggingFace Hub
  • Verify format: org/repo or org/repo@revision
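Both accepted specifier shapes side by side; the repository names here are illustrative:

```yaml
huggingfaceModelsToOnboard: "your-org/model-a,your-org/model-b@main"  # org/repo and org/repo@revision
huggingfaceOnboardingProjectName: "ML Engineering"                    # project context is required
```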

Name Resolution Debugging

If names aren't resolving:

  1. Check console output for resolution messages:

    [org-resolve] Resolved organization name 'ACME' -> 364fe49b-...
    [proj-resolve] Resolved project name 'Production' -> 5c221ef3-...
    
  2. Verify names in AllTrue UI:

    • Log into AllTrue platform
    • Copy the exact organization name
    • Navigate to Projects and copy exact project names
  3. Check for typos (matching ignores letter case but must otherwise be exact)

  4. Use fallback UUIDs temporarily:

    # If name resolution fails, use UUID as fallback
    organizationId: '364fe49b-6ea1-4a53-83db-f8311a9c8412'
    
  5. Verify API permissions for /v1/admin/customers/{customer_id}/organizations/projects

Model Mapping Debugging

If model mapping isn't working as expected:

  1. Check console output for model selection messages:

    [i] Model mapping found for OpenAIEndpoint: gpt-4
    [OK] Using mapped model: gpt-4
    

    OR

    [!] Mapped model 'gpt-4' not available
    [i] Available: gpt-3.5-turbo, gpt-4-turbo-preview, ...
    [i] Using endpoint default
    
  2. Verify model names are exact matches (case-sensitive)

  3. Check available models in AllTrue UI for each resource

  4. Test without mapping first to see default behavior

System Prompt Debugging

If system prompt configuration isn't working:

  1. Check console output for configuration messages:

    [i] Configuring system prompt on resource...
    [OK] System prompt configured successfully
    

    OR

    [!] Warning: Failed to configure system prompt: <error>
    
  2. Verify the resource type supports system prompts (LLM endpoints only)

  3. Check API permissions for GET and PATCH on the additional-config endpoint

  4. Test with simple system prompt first before using complex multi-line prompts

  5. Verify cleanup is working:

    [i] System prompt cleaned up from resource
    

Debug Checklist

  1. ✅ Check pipeline logs for detailed error messages
  2. ✅ Verify all required secrets/variables are set
  3. ✅ Confirm API key has necessary permissions
  4. ✅ Test with simpler configuration first
  5. ✅ Review AllTrue UI for resource visibility
  6. ✅ Check for typos in names (matching ignores case but must otherwise be exact)
  7. ✅ Use correct syntax for your platform (PowerShell vs bash)

Support

For assistance with:

  • Configuration: Refer to examples and troubleshooting guide above
  • API Access: Contact your AllTrue Customer Success Engineer
  • Technical Issues: Use the Q&A section on this marketplace page
  • Feature Requests: Submit through the Q&A section
  • AllTrue Platform

📝 License

Copyright © 2025 AllTrue.ai Canada Inc. All rights reserved.
