Azure RAG Pipeline Builder

Shas Vaddi | 1 install | Free
Scaffold end-to-end Retrieval-Augmented Generation apps: select a data source, auto-generate AI Search indexes, indexers, skillsets with Azure OpenAI embeddings, and a working Python or C# chat app.
Installation

Launch VS Code Quick Open (Ctrl+P), paste the extension's install command, and press Enter.

Azure RAG Pipeline Builder

Scaffold production-ready Retrieval-Augmented Generation (RAG) pipelines from inside VS Code. Walk through an interactive wizard, pick your data source, embedding model, and language — and get a complete, deployable project in seconds.

Zero boilerplate. From empty folder to working RAG chat app + Azure AI Search pipeline + Bicep infrastructure.


Features

1. Interactive 8-Step Wizard

Run RAG Builder: New RAG Pipeline from the Command Palette and the extension walks you through every decision:

| Step | What You Choose | Options |
|---|---|---|
| 1 | Project name | Free text (e.g. contoso-kb) |
| 2 | Output folder | Browse to any local directory |
| 3 | Data source | Azure Blob Storage, Azure SQL Database, Azure Cosmos DB |
| 4 | Azure OpenAI endpoint | Your https://<name>.openai.azure.com/ URL |
| 5 | Embedding model | text-embedding-3-large, text-embedding-3-small, text-embedding-ada-002 |
| 6 | Chunking strategy | Semantic (OpenAI token-aware), Fixed-size, Page-based |
| 7 | App language | Python (FastAPI) or C# (ASP.NET Minimal API) |
| 8 | Feature flags | Vector search, Semantic ranker, Streaming chat, Bicep templates, Dockerfile |

Every step has sensible defaults — press Enter to accept them and generate a pipeline in under 30 seconds.


2. Azure AI Search Index Generator

Produces a fully-configured search index JSON with:

  • HNSW vector search — algorithm configuration with m=4, efConstruction=400, plus a vector profile tied to an Azure OpenAI vectorizer
  • Semantic ranker — title, chunk, and category fields configured as semantic fields
  • Data-source-specific fields — Blob metadata paths, SQL row IDs, or Cosmos DB _ts/partition keys
  • Standard fields — id, chunk, title, source, category, chunk_id, parent_id, and the chunk_vector field sized to your chosen embedding dimensions

Example output (search/index.json):

{
  "name": "contoso-kb-index",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true, "filterable": true },
    { "name": "chunk", "type": "Edm.String", "searchable": true },
    { "name": "title", "type": "Edm.String", "searchable": true, "filterable": true },
    { "name": "source", "type": "Edm.String", "filterable": true },
    { "name": "chunk_vector", "type": "Collection(Edm.Single)",
      "searchable": true, "dimensions": 1536,
      "vectorSearchProfile": "vector-profile" }
  ],
  "vectorSearch": {
    "algorithms": [{ "name": "hnsw-algorithm", "kind": "hnsw",
      "hnswParameters": { "m": 4, "efConstruction": 400, "metric": "cosine" } }],
    "profiles": [{ "name": "vector-profile", "algorithm": "hnsw-algorithm",
      "vectorizer": "openai-vectorizer" }],
    "vectorizers": [{ "name": "openai-vectorizer", "kind": "azureOpenAI",
      "azureOpenAIParameters": { "deploymentId": "text-embedding-3-large",
        "modelName": "text-embedding-3-large" } }]
  },
  "semantic": {
    "configurations": [{
      "name": "semantic-config",
      "prioritizedFields": {
        "titleField": { "fieldName": "title" },
        "contentFields": [{ "fieldName": "chunk" }],
        "keywordsFields": [{ "fieldName": "category" }]
      }
    }]
  }
}
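
At query time, the vector profile lets the Search service embed the query text for you. A hypothetical search request against this index (POSTed to `https://<service>.search.windows.net/indexes/contoso-kb-index/docs/search?api-version=2024-07-01` — the query text and `k` are illustrative) could look like:

```json
{
  "search": "What is the return policy?",
  "vectorQueries": [
    { "kind": "text", "text": "What is the return policy?",
      "fields": "chunk_vector", "k": 5 }
  ],
  "queryType": "semantic",
  "semanticConfiguration": "semantic-config",
  "top": 5
}
```

Because the query uses `"kind": "text"`, the service calls the `openai-vectorizer` defined in the index rather than requiring you to embed the query client-side.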

3. Data Source & Indexer Generator

Generates a matched pair of data source + indexer JSON files tailored to your chosen data source:

| Data Source | Connection Type | Change Detection | Deletion Detection |
|---|---|---|---|
| Azure Blob Storage | azureblob container connection | Built-in blob change tracking | Soft-delete policy |
| Azure SQL Database | azuresql with table/view | High-watermark (last_updated column) | Soft-delete column |
| Azure Cosmos DB | cosmosdb NoSQL API | _ts timestamp | Soft-delete column |

The indexer includes field mappings, output field mappings for skillset-produced chunks and vectors, and an hourly schedule.
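
A sketch of the generated search/indexer.json (names and mappings are illustrative and depend on your wizard choices; shown here for the Blob case):

```json
{
  "name": "contoso-kb-indexer",
  "dataSourceName": "contoso-kb-datasource",
  "targetIndexName": "contoso-kb-index",
  "skillsetName": "contoso-kb-skillset",
  "schedule": { "interval": "PT1H" },
  "fieldMappings": [
    { "sourceFieldName": "metadata_storage_path", "targetFieldName": "source" }
  ],
  "outputFieldMappings": [
    { "sourceFieldName": "/document/chunks/*/chunk_vector",
      "targetFieldName": "chunk_vector" }
  ]
}
```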

Example — Blob data source in search/datasource.json:

{
  "name": "contoso-kb-datasource",
  "type": "azureblob",
  "credentials": {
    "connectionString": "<your-storage-connection-string>"
  },
  "container": {
    "name": "documents"
  },
  "dataDeletionDetectionPolicy": {
    "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
    "softDeleteColumnName": "IsDeleted",
    "softDeleteMarkerValue": "true"
  }
}

4. Skillset Generator

Creates a multi-skill cognitive skillset with:

| # | Skill | Purpose |
|---|---|---|
| 1 | Text Split | Chunks documents using your chosen strategy (semantic/fixed/page) and size |
| 2 | Chunk Shaper | Generates deterministic chunk IDs from document key + chunk index |
| 3 | Azure OpenAI Embedding | Calls your embedding deployment to vectorize each chunk (conditional on vector search) |
| 4 | Entity Recognition V3 | Extracts organizations, products, events, and locations |
| 5 | Key Phrase Extraction | Pulls out key phrases for metadata enrichment |

Includes a Knowledge Store with table projections for analytics.
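
The "deterministic chunk ID" idea from the Chunk Shaper skill can be sketched in plain Python — the exact scheme the generated skillset uses may differ; this just shows why key + index yields stable IDs across re-indexing runs:

```python
import hashlib

def make_chunk_id(parent_key: str, chunk_index: int) -> str:
    """Derive a stable, URL-safe chunk ID from the parent document key
    plus the chunk's position, so re-running the indexer over the same
    document always produces the same IDs (illustrative scheme)."""
    digest = hashlib.sha256(f"{parent_key}:{chunk_index}".encode()).hexdigest()[:12]
    return f"{parent_key}_{chunk_index}_{digest}"
```

Stable IDs matter because the indexer upserts by key: a changed document overwrites its old chunks instead of accumulating duplicates.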

Example — Embedding skill excerpt from search/skillset.json:

{
  "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
  "name": "embedding-skill",
  "description": "Generate vector embeddings via Azure OpenAI",
  "context": "/document/chunks/*",
  "deploymentId": "text-embedding-3-large",
  "modelName": "text-embedding-3-large",
  "inputs": [{ "name": "text", "source": "/document/chunks/*/chunk" }],
  "outputs": [{ "name": "embedding", "targetName": "chunk_vector" }]
}

5. Python App Scaffolder (FastAPI)

Generates 7 production-ready files:

app/
├── requirements.txt        # All dependencies pinned
├── .env.sample             # Connection strings template
├── app/
│   ├── main.py             # FastAPI app with /health, /chat, /chat/stream
│   ├── config.py           # Pydantic Settings from environment variables
│   ├── search_client.py    # Azure AI Search retrieval (vector + semantic)
│   └── chat.py             # RAG orchestration with Azure OpenAI
└── Dockerfile              # python:3.12-slim container

Key capabilities:

  • Vector search — uses VectorizableTextQuery so the Search service calls your vectorizer automatically
  • Semantic ranker — configures query_type="semantic" with your semantic config
  • Streaming — Server-Sent Events (SSE) endpoint at /chat/stream for token-by-token responses
  • Managed identity — toggle USE_MANAGED_IDENTITY=true to use DefaultAzureCredential instead of API keys
  • Conversation history — retains the last 6 turns for multi-turn conversations
  • Citation instructions — system prompt instructs the model to cite sources by document title
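
The "last 6 turns" history retention above amounts to a simple window over the message list — a minimal sketch, assuming one turn is a user message plus the assistant reply (the generated chat.py may implement this differently):

```python
def trim_history(messages: list[dict], max_turns: int = 6) -> list[dict]:
    """Keep only the most recent `max_turns` exchanges, where one turn
    is a user message plus the assistant reply, so at most
    2 * max_turns messages are sent back to the model."""
    return messages[-2 * max_turns:]
```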

Example — calling the chat endpoint:

# Non-streaming
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the return policy for electronics?",
    "history": [
      {"role": "user", "content": "Tell me about warranties"},
      {"role": "assistant", "content": "Our warranty covers..."}
    ]
  }'

# Streaming (SSE)
curl -N -X POST http://localhost:8000/chat/stream \
  -H "Content-Type: application/json" \
  -d '{"query": "Summarize the HR benefits package"}'

Example response:

{
  "answer": "According to the employee handbook [1], the benefits package includes...",
  "sources": [
    { "title": "Employee Handbook 2025", "source": "https://storage.blob.core.windows.net/docs/handbook.pdf" },
    { "title": "Benefits FAQ", "source": "https://storage.blob.core.windows.net/docs/faq.md" }
  ]
}
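
A minimal Python consumer for the SSE stream above might parse `data:` lines like this — the line format and `[DONE]` sentinel are assumptions, so check the generated /chat/stream implementation:

```python
def iter_sse_data(lines):
    """Yield the payload of each `data:` line from an SSE stream,
    stopping at an assumed `[DONE]` sentinel."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments, event-name lines, and keep-alive blanks
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield payload
```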

6. C# App Scaffolder (ASP.NET Minimal API)

Generates 8 production-ready files:

app/
├── contoso-kb.csproj       # .NET 8 project with Azure SDK packages
├── appsettings.json        # Configuration for Search + OpenAI + RAG settings
├── Program.cs              # Minimal API with /health, /chat, /chat/stream
├── Models/
│   ├── ChatRequest.cs      # Request record (Query, History)
│   └── ChatResponse.cs     # Response record (Answer, Sources, Usage)
├── Services/
│   ├── SearchService.cs    # Azure AI Search with VectorizableTextQuery
│   └── ChatService.cs      # RAG orchestration with AzureOpenAIClient
└── Dockerfile              # Multi-stage .NET 8 build

NuGet packages included:

  • Azure.Search.Documents 11.6.0
  • Azure.AI.OpenAI 2.0.0
  • Azure.Identity 1.12.0

Example — calling the C# chat endpoint:

curl -X POST http://localhost:5000/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "How do I configure VPN access?"}'

Example response:

{
  "answer": "To configure VPN access, follow these steps from the IT Security Guide [1]...",
  "sources": [
    { "title": "IT Security Guide", "source": "blob://docs/security-guide.pdf" }
  ],
  "usage": { "promptTokens": 1240, "completionTokens": 387 }
}

7. Bicep Infrastructure Templates

One command deploys your entire Azure backend:

infra/
├── main.bicep                  # Orchestrator (parameters + module references)
├── main.parameters.json        # Parameter values file
└── modules/
    ├── search.bicep             # Azure AI Search (Basic SKU, semantic enabled)
    ├── openai.bicep             # Azure OpenAI + embedding & chat deployments
    ├── storage.bicep            # Blob Storage account + container  (Blob data source)
    ├── sql.bicep                # Azure SQL Server + database       (SQL data source)
    └── cosmos.bicep             # Cosmos DB account + SQL database  (Cosmos data source)
deploy.ps1                      # One-click PowerShell deployment script

Only the data-source module matching your selection is included. Resources are named with a configurable prefix.

Example deployment:

# One-click deploy
./deploy.ps1

# Or manually:
az group create --name contoso-kb-rg --location eastus2
az deployment group create \
    --resource-group contoso-kb-rg \
    --template-file infra/main.bicep \
    --parameters infra/main.parameters.json

Outputs the Search endpoint, OpenAI endpoint, and data source connection info.


Commands

Open the Command Palette (Ctrl+Shift+P) and type RAG Builder:

| Command | Description |
|---|---|
| RAG Builder: New RAG Pipeline | Full wizard → generates search JSON + app code + Bicep infrastructure + README |
| RAG Builder: Generate AI Search Index Only | Wizard → opens the index JSON in the editor |
| RAG Builder: Generate Skillset + Indexer Only | Wizard → opens the skillset JSON in the editor |
| RAG Builder: Generate Chat App Only | Wizard → writes Python or C# app files to your output folder |

Extension Settings

Configure defaults via Settings → Extensions → Azure RAG Pipeline Builder:

| Setting | Default | Description |
|---|---|---|
| ragBuilder.defaultLanguage | python | Default app language (python or csharp) |
| ragBuilder.defaultDataSource | blob | Default data source (blob, sql, cosmosdb) |
| ragBuilder.embeddingModel | text-embedding-3-large | Azure OpenAI embedding deployment name |
| ragBuilder.embeddingDimensions | 1536 | Vector dimensions for the embedding model |
| ragBuilder.chatModel | gpt-4o | Azure OpenAI chat model deployment name |
| ragBuilder.chunkSize | 1024 | Target chunk size in tokens |
| ragBuilder.chunkOverlap | 128 | Overlap between consecutive chunks in tokens |
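
Note that ragBuilder.embeddingDimensions must agree with ragBuilder.embeddingModel, since the index's chunk_vector field is sized from it. A hypothetical pre-flight check, using the model/dimension pairs from the Embedding Models table in this README (whether the extension itself validates this is not stated):

```python
# Allowed vector dimensions per embedding model, taken from the
# "Embedding Models" table in this README.
ALLOWED_DIMENSIONS = {
    "text-embedding-3-large": {1536, 3072},
    "text-embedding-3-small": {1536},
    "text-embedding-ada-002": {1536},
}

def dimensions_valid(model: str, dimensions: int) -> bool:
    """Check an embeddingModel/embeddingDimensions pair before
    generating the index JSON."""
    return dimensions in ALLOWED_DIMENSIONS.get(model, set())
```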

End-to-End Walkthrough

Scenario: Build a knowledge-base chatbot over PDF documents in Blob Storage

Step 1 — Generate the pipeline

  1. Press Ctrl+Shift+P → RAG Builder: New RAG Pipeline
  2. Enter project name: contoso-kb
  3. Pick output folder: C:\Projects\contoso-kb
  4. Data source: Azure Blob Storage → container: documents
  5. OpenAI endpoint: https://myopenai.openai.azure.com/
  6. Embedding model: text-embedding-3-large
  7. Chunking: Semantic (1024 tokens, 128 overlap)
  8. Language: Python
  9. Features: ✅ Vector Search, ✅ Semantic Ranker, ✅ Streaming, ✅ Bicep, ✅ Dockerfile

Step 2 — Deploy infrastructure

cd C:\Projects\contoso-kb
./deploy.ps1
# Creates: contoso-kb-rg with AI Search + OpenAI + Storage Account

Step 3 — Upload documents

az storage blob upload-batch \
    --account-name contosokbstor \
    --destination documents \
    --source ./my-pdfs/

Step 4 — Create the search pipeline

Upload the generated JSON to your Search service using the REST API or Azure Portal:

# Create the index
curl -X POST "https://contosokbsearch.search.windows.net/indexes?api-version=2024-07-01" \
  -H "api-key: <key>" -H "Content-Type: application/json" \
  -d @search/index.json

# Create the data source
curl -X POST "https://contosokbsearch.search.windows.net/datasources?api-version=2024-07-01" \
  -H "api-key: <key>" -H "Content-Type: application/json" \
  -d @search/datasource.json

# Create the skillset
curl -X POST "https://contosokbsearch.search.windows.net/skillsets?api-version=2024-07-01" \
  -H "api-key: <key>" -H "Content-Type: application/json" \
  -d @search/skillset.json

# Create the indexer (starts crawling immediately)
curl -X POST "https://contosokbsearch.search.windows.net/indexers?api-version=2024-07-01" \
  -H "api-key: <key>" -H "Content-Type: application/json" \
  -d @search/indexer.json
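
The four curl calls above can also be scripted. Order matters: the indexer must be created last because it references the index, data source, and skillset by name. A stdlib-only sketch (endpoint and key are placeholders):

```python
import urllib.request

API_VERSION = "2024-07-01"

def search_api_url(endpoint: str, collection: str) -> str:
    """Build the same REST URL the curl commands above use."""
    return f"{endpoint.rstrip('/')}/{collection}?api-version={API_VERSION}"

def create_pipeline(endpoint: str, api_key: str) -> None:
    """POST the four generated JSON files in dependency order."""
    for collection, path in [
        ("indexes", "search/index.json"),
        ("datasources", "search/datasource.json"),
        ("skillsets", "search/skillset.json"),
        ("indexers", "search/indexer.json"),  # last: references the others
    ]:
        with open(path, "rb") as f:
            req = urllib.request.Request(
                search_api_url(endpoint, collection),
                data=f.read(),
                headers={"api-key": api_key, "Content-Type": "application/json"},
                method="POST",
            )
        urllib.request.urlopen(req)

# create_pipeline("https://contosokbsearch.search.windows.net", "<key>")
```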

Step 5 — Run the app

cd app
cp .env.sample .env
# Fill in AZURE_SEARCH_ENDPOINT, AZURE_OPENAI_ENDPOINT, API keys

pip install -r requirements.txt
uvicorn app.main:app --reload

Step 6 — Chat!

curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the refund policy for enterprise customers?"}'

Generated Project Structure

contoso-kb/
├── README.md                    # Auto-generated project docs
├── deploy.ps1                   # Infrastructure deployment script
│
├── search/                      # Azure AI Search artifacts
│   ├── index.json               # Index with vector + semantic config
│   ├── datasource.json          # Data source connection
│   ├── indexer.json             # Indexer with field mappings
│   └── skillset.json            # Cognitive skillset (chunking + embedding + enrichment)
│
├── app/                         # Chat application
│   ├── requirements.txt / .csproj
│   ├── .env.sample / appsettings.json
│   ├── app/main.py / Program.cs
│   ├── app/config.py / Models/
│   ├── app/search_client.py / Services/SearchService.cs
│   ├── app/chat.py / Services/ChatService.cs
│   └── Dockerfile
│
└── infra/                       # Bicep IaC templates
    ├── main.bicep
    ├── main.parameters.json
    └── modules/
        ├── search.bicep
        ├── openai.bicep
        └── storage.bicep / sql.bicep / cosmos.bicep

Supported Configurations

Data Sources

| Source | Container Config | Change Detection |
|---|---|---|
| Azure Blob Storage | Container name | Native blob tracking |
| Azure SQL Database | Table/view name | High-watermark column |
| Azure Cosmos DB (NoSQL) | Database + container | _ts timestamp |

Embedding Models

| Model | Dimensions | Notes |
|---|---|---|
| text-embedding-3-large | 1536 (default) or 3072 | Best quality, higher cost |
| text-embedding-3-small | 1536 | Good balance of quality and cost |
| text-embedding-ada-002 | 1536 | Legacy model, widely deployed |

Chunking Strategies

| Strategy | Best For | Description |
|---|---|---|
| Semantic | General documents | Token-aware splitting that respects sentence boundaries |
| Fixed-size | Uniform content | Equal-sized chunks with configurable overlap |
| Page-based | PDFs, presentations | Preserves page boundaries from the source document |
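
The Fixed-size strategy, combined with the ragBuilder.chunkSize and ragBuilder.chunkOverlap defaults, reduces to a sliding window over the token sequence — a minimal sketch, assuming tokenization happens upstream:

```python
def fixed_size_chunks(tokens: list, chunk_size: int = 1024, overlap: int = 128) -> list:
    """Split a token sequence into chunks of `chunk_size` tokens where
    consecutive chunks share `overlap` tokens (the documented
    chunkSize/chunkOverlap defaults)."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break
    return chunks
```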

Requirements

  • VS Code 1.85.0 or later
  • Azure subscription with:
    • Azure AI Search service (Basic tier or above for semantic ranker)
    • Azure OpenAI resource with embedding + chat model deployments
    • Data source (Blob Storage, Azure SQL, or Cosmos DB)
  • Azure CLI (for Bicep deployment)
  • Python 3.10+ or .NET 8 (for running the generated app)

License

MIT
