# Azure RAG Pipeline Builder

Scaffold production-ready Retrieval-Augmented Generation (RAG) pipelines from inside VS Code. Walk through an interactive wizard, pick your data source, embedding model, and language — and get a complete, deployable project in seconds.

Zero boilerplate. From empty folder to working RAG chat app + Azure AI Search pipeline + Bicep infrastructure.
## Features

### 1. Interactive 8-Step Wizard

Run `RAG Builder: New RAG Pipeline` from the Command Palette and the extension walks you through every decision:
| Step | What You Choose | Options |
|------|-----------------|---------|
| 1 | Project name | Free text (e.g. `contoso-kb`) |
| 2 | Output folder | Browse to any local directory |
| 3 | Data source | Azure Blob Storage, Azure SQL Database, Azure Cosmos DB |
| 4 | Azure OpenAI endpoint | Your `https://<name>.openai.azure.com/` URL |
| 5 | Embedding model | `text-embedding-3-large`, `text-embedding-3-small`, `text-embedding-ada-002` |
| 6 | Chunking strategy | Semantic (OpenAI token-aware), Fixed-size, Page-based |
| 7 | App language | Python (FastAPI) or C# (ASP.NET Minimal API) |
| 8 | Feature flags | Vector search, Semantic ranker, Streaming chat, Bicep templates, Dockerfile |
Every step has sensible defaults — press Enter to accept them and generate a pipeline in under 30 seconds.
### 2. Azure AI Search Index Generator

Produces a fully configured search index JSON with:
- HNSW vector search — algorithm configuration with `m=4`, `efConstruction=400`, plus a vector profile tied to an Azure OpenAI vectorizer
- Semantic ranker — title, chunk, and category fields configured as semantic fields
- Data-source-specific fields — Blob metadata paths, SQL row IDs, or Cosmos DB `_ts`/partition keys
- Standard fields — `id`, `chunk`, `title`, `source`, `category`, `chunk_id`, `parent_id`, and the `chunk_vector` field sized to your chosen embedding dimensions
Example output (`search/index.json`):

```json
{
  "name": "contoso-kb-index",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true, "filterable": true },
    { "name": "chunk", "type": "Edm.String", "searchable": true },
    { "name": "title", "type": "Edm.String", "searchable": true, "filterable": true },
    { "name": "source", "type": "Edm.String", "filterable": true },
    { "name": "chunk_vector", "type": "Collection(Edm.Single)",
      "searchable": true, "dimensions": 1536,
      "vectorSearchProfile": "vector-profile" }
  ],
  "vectorSearch": {
    "algorithms": [{ "name": "hnsw-algorithm", "kind": "hnsw",
      "hnswParameters": { "m": 4, "efConstruction": 400, "metric": "cosine" } }],
    "profiles": [{ "name": "vector-profile", "algorithm": "hnsw-algorithm",
      "vectorizer": "openai-vectorizer" }],
    "vectorizers": [{ "name": "openai-vectorizer", "kind": "azureOpenAI",
      "azureOpenAIParameters": { "deploymentId": "text-embedding-3-large",
        "modelName": "text-embedding-3-large" } }]
  },
  "semantic": {
    "configurations": [{
      "name": "semantic-config",
      "prioritizedFields": {
        "titleField": { "fieldName": "title" },
        "prioritizedContentFields": [{ "fieldName": "chunk" }],
        "prioritizedKeywordsFields": [{ "fieldName": "category" }]
      }
    }]
  }
}
```
### 3. Data Source & Indexer Generator

Generates a matched pair of data source + indexer JSON files tailored to your chosen data source:
| Data Source | Connection Type | Change Detection | Deletion Detection |
|-------------|-----------------|------------------|--------------------|
| Azure Blob Storage | `azureblob` container connection | Built-in blob change tracking | Soft-delete policy |
| Azure SQL Database | `azuresql` with table/view | High-watermark (`last_updated` column) | Soft-delete column |
| Azure Cosmos DB | `cosmosdb` NoSQL API | `_ts` timestamp | Soft-delete column |
The indexer includes field mappings, output field mappings for skillset-produced chunks and vectors, and an hourly schedule.
Example — Blob data source in `search/datasource.json`:

```json
{
  "name": "contoso-kb-datasource",
  "type": "azureblob",
  "credentials": {
    "connectionString": "<your-storage-connection-string>"
  },
  "container": {
    "name": "documents"
  },
  "dataDeletionDetectionPolicy": {
    "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
    "softDeleteColumnName": "IsDeleted",
    "softDeleteMarkerValue": "true"
  }
}
```
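The indexer definition itself is not shown above. As a rough illustration of how a generator might assemble one, here is a minimal sketch in Python; the names (`contoso-kb-indexer`, the `metadata_storage_path` mapping) and the exact set of properties are assumptions, and the real file depends on your Search API version and data source:

```python
import json

def build_indexer(project: str, schedule_interval: str = "PT1H") -> dict:
    """Assemble a minimal Azure AI Search indexer definition (illustrative sketch)."""
    return {
        "name": f"{project}-indexer",
        "dataSourceName": f"{project}-datasource",
        "targetIndexName": f"{project}-index",
        "skillsetName": f"{project}-skillset",
        # Map source metadata onto index fields.
        "fieldMappings": [
            {"sourceFieldName": "metadata_storage_path", "targetFieldName": "source"}
        ],
        # Map skillset-produced vectors back into the index.
        "outputFieldMappings": [
            {"sourceFieldName": "/document/chunks/*/chunk_vector",
             "targetFieldName": "chunk_vector"}
        ],
        # ISO 8601 duration: PT1H = run hourly.
        "schedule": {"interval": schedule_interval},
    }

print(json.dumps(build_indexer("contoso-kb"), indent=2))
```

The `PT1H` interval matches the hourly schedule described above; everything else is one plausible shape, not the extension's exact output.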
### 4. Skillset Generator
Creates a multi-skill cognitive skillset with:
| # | Skill | Purpose |
|---|-------|---------|
| 1 | Text Split | Chunks documents using your chosen strategy (semantic/fixed/page) and size |
| 2 | Chunk Shaper | Generates deterministic chunk IDs from document key + chunk index |
| 3 | Azure OpenAI Embedding | Calls your embedding deployment to vectorize each chunk (conditional on vector search) |
| 4 | Entity Recognition V3 | Extracts organizations, products, events, and locations |
| 5 | Key Phrase Extraction | Pulls out key phrases for metadata enrichment |
Includes a Knowledge Store with table projections for analytics.
Example — Embedding skill excerpt from `search/skillset.json`:

```json
{
  "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
  "name": "embedding-skill",
  "description": "Generate vector embeddings via Azure OpenAI",
  "context": "/document/chunks/*",
  "deploymentId": "text-embedding-3-large",
  "modelName": "text-embedding-3-large",
  "inputs": [{ "name": "text", "source": "/document/chunks/*/chunk" }],
  "outputs": [{ "name": "embedding", "targetName": "chunk_vector" }]
}
```
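The "deterministic chunk IDs from document key + chunk index" idea behind the Chunk Shaper skill can be sketched as a pure function. This is an illustrative scheme, not the generator's actual one; hashing the combined key keeps IDs valid regardless of what characters appear in the source document key:

```python
import hashlib

def chunk_id(document_key: str, chunk_index: int) -> str:
    """Derive a stable chunk ID from document key + chunk index."""
    raw = f"{document_key}-{chunk_index}"
    # Same inputs always produce the same ID, so re-running the
    # indexer updates existing chunks instead of duplicating them.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:32]

assert chunk_id("docs/handbook.pdf", 0) == chunk_id("docs/handbook.pdf", 0)
assert chunk_id("docs/handbook.pdf", 0) != chunk_id("docs/handbook.pdf", 1)
```

Determinism is what makes incremental indexing safe: unchanged documents map to the same chunk keys on every run.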
### 5. Python App Scaffolder (FastAPI)
Generates 7 production-ready files:
```text
app/
├── requirements.txt       # All dependencies pinned
├── .env.sample            # Connection strings template
├── app/
│   ├── main.py            # FastAPI app with /health, /chat, /chat/stream
│   ├── config.py          # Pydantic Settings from environment variables
│   ├── search_client.py   # Azure AI Search retrieval (vector + semantic)
│   └── chat.py            # RAG orchestration with Azure OpenAI
└── Dockerfile             # python:3.12-slim container
```
Key capabilities:
- Vector search — uses `VectorizableTextQuery` so the Search service calls your vectorizer automatically
- Semantic ranker — configures `query_type="semantic"` with your semantic config
- Streaming — Server-Sent Events (SSE) endpoint at `/chat/stream` for token-by-token responses
- Managed identity — toggle `USE_MANAGED_IDENTITY=true` to use `DefaultAzureCredential` instead of API keys
- Conversation history — retains the last 6 turns for multi-turn conversations
- Citation instructions — system prompt instructs the model to cite sources by document title
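The six-turn history window amounts to trimming the message list before each request. A minimal sketch, assuming one turn means one user message plus one assistant message (the generated `chat.py` may implement this differently):

```python
def trim_history(history: list[dict], max_turns: int = 6) -> list[dict]:
    """Keep only the most recent turns (1 turn = user + assistant message)."""
    return history[-(max_turns * 2):]

# 10 full turns of alternating user/assistant messages.
history = [
    {"role": "user", "content": f"q{i}"} if i % 2 == 0
    else {"role": "assistant", "content": f"a{i}"}
    for i in range(20)
]
trimmed = trim_history(history)
assert len(trimmed) == 12               # 6 turns = 12 messages
assert trimmed[-1]["content"] == "a19"  # newest messages are kept
```

Capping history bounds prompt size, which keeps token usage predictable as a conversation grows.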
Example — calling the chat endpoint:

```shell
# Non-streaming
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the return policy for electronics?",
    "history": [
      {"role": "user", "content": "Tell me about warranties"},
      {"role": "assistant", "content": "Our warranty covers..."}
    ]
  }'

# Streaming (SSE)
curl -N -X POST http://localhost:8000/chat/stream \
  -H "Content-Type: application/json" \
  -d '{"query": "Summarize the HR benefits package"}'
```
Example response:

```json
{
  "answer": "According to the employee handbook [1], the benefits package includes...",
  "sources": [
    { "title": "Employee Handbook 2025", "source": "https://storage.blob.core.windows.net/docs/handbook.pdf" },
    { "title": "Benefits FAQ", "source": "https://storage.blob.core.windows.net/docs/faq.md" }
  ]
}
```
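On the client side, consuming `/chat/stream` mostly means picking the `data:` payloads out of the SSE body. A minimal parser sketch; the exact event format the generated app emits is an assumption here:

```python
def parse_sse(lines):
    """Yield the payload of each `data:` line in a Server-Sent Events stream."""
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            yield line[len("data:"):].strip()

# Simulated SSE body: blank lines separate events.
raw = ["data: According", "data: to the", "", "data: handbook [1]"]
tokens = list(parse_sse(raw))
assert tokens == ["According", "to the", "handbook [1]"]
```

With `requests`, the same generator could wrap `response.iter_lines()`; with `curl -N` you see the raw `data:` lines directly.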
### 6. C# App Scaffolder (ASP.NET Minimal API)
Generates 8 production-ready files:
```text
app/
├── contoso-kb.csproj        # .NET 8 project with Azure SDK packages
├── appsettings.json         # Configuration for Search + OpenAI + RAG settings
├── Program.cs               # Minimal API with /health, /chat, /chat/stream
├── Models/
│   ├── ChatRequest.cs       # Request record (Query, History)
│   └── ChatResponse.cs      # Response record (Answer, Sources, Usage)
├── Services/
│   ├── SearchService.cs     # Azure AI Search with VectorizableTextQuery
│   └── ChatService.cs       # RAG orchestration with AzureOpenAIClient
└── Dockerfile               # Multi-stage .NET 8 build
```
NuGet packages included:

- `Azure.Search.Documents` 11.6.0
- `Azure.AI.OpenAI` 2.0.0
- `Azure.Identity` 1.12.0
Example — calling the C# chat endpoint:

```shell
curl -X POST http://localhost:5000/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "How do I configure VPN access?"}'
```
Example response:

```json
{
  "answer": "To configure VPN access, follow these steps from the IT Security Guide [1]...",
  "sources": [
    { "title": "IT Security Guide", "source": "blob://docs/security-guide.pdf" }
  ],
  "usage": { "promptTokens": 1240, "completionTokens": 387 }
}
```
### 7. Bicep Infrastructure Templates
One command deploys your entire Azure backend:
```text
infra/
├── main.bicep               # Orchestrator (parameters + module references)
├── main.parameters.json     # Parameter values file
└── modules/
    ├── search.bicep         # Azure AI Search (Basic SKU, semantic enabled)
    ├── openai.bicep         # Azure OpenAI + embedding & chat deployments
    ├── storage.bicep        # Blob Storage account + container (Blob data source)
    ├── sql.bicep            # Azure SQL Server + database (SQL data source)
    └── cosmos.bicep         # Cosmos DB account + SQL database (Cosmos data source)
deploy.ps1                   # One-click PowerShell deployment script
```
Only the data-source module matching your selection is included. Resources are named with a configurable prefix.
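Prefix-based naming can be sketched as below. This is illustrative only (the real naming expressions live in `main.bicep`); the storage-account branch exists because Azure storage account names allow only lowercase letters and digits, up to 24 characters:

```python
def resource_name(prefix: str, suffix: str, storage: bool = False) -> str:
    """Build an Azure resource name from a configurable prefix."""
    name = f"{prefix}-{suffix}"
    if storage:
        # Storage accounts: lowercase alphanumerics only, max 24 chars.
        name = name.replace("-", "").lower()[:24]
    return name

assert resource_name("contoso-kb", "search") == "contoso-kb-search"
assert resource_name("contoso-kb", "stor", storage=True) == "contosokbstor"
```

This matches the `contosokbstor` account name that appears in the walkthrough below, where hyphens from the `contoso-kb` prefix have been stripped.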
Example deployment:

```shell
# One-click deploy
./deploy.ps1

# Or manually:
az group create --name contoso-kb-rg --location eastus2
az deployment group create \
  --resource-group contoso-kb-rg \
  --template-file infra/main.bicep \
  --parameters infra/main.parameters.json
```
Outputs the Search endpoint, OpenAI endpoint, and data source connection info.
## Commands

Open the Command Palette (`Ctrl+Shift+P`) and type `RAG Builder`:
| Command | Description |
|---------|-------------|
| `RAG Builder: New RAG Pipeline` | Full wizard → generates search JSON + app code + Bicep infrastructure + README |
| `RAG Builder: Generate AI Search Index Only` | Wizard → opens the index JSON in the editor |
| `RAG Builder: Generate Skillset + Indexer Only` | Wizard → opens the skillset JSON in the editor |
| `RAG Builder: Generate Chat App Only` | Wizard → writes Python or C# app files to your output folder |
## Extension Settings

Configure defaults via Settings → Extensions → Azure RAG Pipeline Builder:
| Setting | Default | Description |
|---------|---------|-------------|
| `ragBuilder.defaultLanguage` | `python` | Default app language (`python` or `csharp`) |
| `ragBuilder.defaultDataSource` | `blob` | Default data source (`blob`, `sql`, `cosmosdb`) |
| `ragBuilder.embeddingModel` | `text-embedding-3-large` | Azure OpenAI embedding deployment name |
| `ragBuilder.embeddingDimensions` | `1536` | Vector dimensions for the embedding model |
| `ragBuilder.chatModel` | `gpt-4o` | Azure OpenAI chat model deployment name |
| `ragBuilder.chunkSize` | `1024` | Target chunk size in tokens |
| `ragBuilder.chunkOverlap` | `128` | Overlap between consecutive chunks in tokens |
## End-to-End Walkthrough

Scenario: Build a knowledge-base chatbot over PDF documents in Blob Storage.

### Step 1 — Generate the pipeline
- Press `Ctrl+Shift+P` → `RAG Builder: New RAG Pipeline`
- Enter project name: `contoso-kb`
- Pick output folder: `C:\Projects\contoso-kb`
- Data source: Azure Blob Storage → container: `documents`
- OpenAI endpoint: `https://myopenai.openai.azure.com/`
- Embedding model: `text-embedding-3-large`
- Chunking: Semantic (1024 tokens, 128 overlap)
- Language: Python
- Features: ✅ Vector Search, ✅ Semantic Ranker, ✅ Streaming, ✅ Bicep, ✅ Dockerfile
### Step 2 — Deploy infrastructure

```shell
cd C:\Projects\contoso-kb
./deploy.ps1
# Creates: contoso-kb-rg with AI Search + OpenAI + Storage Account
```
### Step 3 — Upload documents

```shell
az storage blob upload-batch \
  --account-name contosokbstor \
  --destination documents \
  --source ./my-pdfs/
```
### Step 4 — Create the search pipeline
Upload the generated JSON to your Search service using the REST API or Azure Portal:
```shell
# Create the index
curl -X POST "https://contosokbsearch.search.windows.net/indexes?api-version=2024-07-01" \
  -H "api-key: <key>" -H "Content-Type: application/json" \
  -d @search/index.json

# Create the data source
curl -X POST "https://contosokbsearch.search.windows.net/datasources?api-version=2024-07-01" \
  -H "api-key: <key>" -H "Content-Type: application/json" \
  -d @search/datasource.json

# Create the skillset
curl -X POST "https://contosokbsearch.search.windows.net/skillsets?api-version=2024-07-01" \
  -H "api-key: <key>" -H "Content-Type: application/json" \
  -d @search/skillset.json

# Create the indexer (starts crawling immediately)
curl -X POST "https://contosokbsearch.search.windows.net/indexers?api-version=2024-07-01" \
  -H "api-key: <key>" -H "Content-Type: application/json" \
  -d @search/indexer.json
```
### Step 5 — Run the app

```shell
cd app
cp .env.sample .env
# Fill in AZURE_SEARCH_ENDPOINT, AZURE_OPENAI_ENDPOINT, API keys
pip install -r requirements.txt
uvicorn app.main:app --reload
```
### Step 6 — Chat!

```shell
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the refund policy for enterprise customers?"}'
```
## Generated Project Structure

```text
contoso-kb/
├── README.md                # Auto-generated project docs
├── deploy.ps1               # Infrastructure deployment script
│
├── search/                  # Azure AI Search artifacts
│   ├── index.json           # Index with vector + semantic config
│   ├── datasource.json      # Data source connection
│   ├── indexer.json         # Indexer with field mappings
│   └── skillset.json        # Cognitive skillset (chunking + embedding + enrichment)
│
├── app/                     # Chat application
│   ├── requirements.txt / .csproj
│   ├── .env.sample / appsettings.json
│   ├── app/main.py / Program.cs
│   ├── app/config.py / Models/
│   ├── app/search_client.py / Services/SearchService.cs
│   ├── app/chat.py / Services/ChatService.cs
│   └── Dockerfile
│
└── infra/                   # Bicep IaC templates
    ├── main.bicep
    ├── main.parameters.json
    └── modules/
        ├── search.bicep
        ├── openai.bicep
        └── storage.bicep / sql.bicep / cosmos.bicep
```
## Supported Configurations

### Data Sources
| Source | Container Config | Change Detection |
|--------|------------------|------------------|
| Azure Blob Storage | Container name | Native blob tracking |
| Azure SQL Database | Table/view name | High-watermark column |
| Azure Cosmos DB (NoSQL) | Database + container | `_ts` timestamp |
### Embedding Models
| Model | Dimensions | Notes |
|-------|------------|-------|
| `text-embedding-3-large` | 1536 (default) or 3072 | Best quality, higher cost |
| `text-embedding-3-small` | 1536 | Good balance of quality and cost |
| `text-embedding-ada-002` | 1536 | Legacy model, widely deployed |
### Chunking Strategies
| Strategy | Best For | Description |
|----------|----------|-------------|
| Semantic | General documents | Token-aware splitting that respects sentence boundaries |
| Fixed-size | Uniform content | Equal-sized chunks with configurable overlap |
| Page-based | PDFs, presentations | Preserves page boundaries from the source document |
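The fixed-size strategy with overlap reduces to a sliding window over tokens. A sketch using the default 1024-token size and 128-token overlap; it operates on a pre-tokenized list for illustration, whereas the real splitter is the token-aware skill described above:

```python
def fixed_size_chunks(tokens: list[str], size: int = 1024, overlap: int = 128):
    """Split tokens into fixed-size chunks; consecutive chunks share `overlap` tokens."""
    step = size - overlap  # advance by size minus overlap each time
    return [tokens[i:i + size]
            for i in range(0, max(len(tokens) - overlap, 1), step)]

tokens = [f"t{i}" for i in range(2500)]
chunks = fixed_size_chunks(tokens)
assert len(chunks[0]) == 1024
assert chunks[1][:128] == chunks[0][-128:]  # 128-token overlap preserved
```

The overlap means a sentence falling on a chunk boundary still appears whole in at least one chunk, which is why retrieval quality usually improves with a modest overlap.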
## Requirements
- VS Code 1.85.0 or later
- Azure subscription with:
- Azure AI Search service (Basic tier or above for semantic ranker)
- Azure OpenAI resource with embedding + chat model deployments
- Data source (Blob Storage, Azure SQL, or Cosmos DB)
- Azure CLI (for Bicep deployment)
- Python 3.10+ or .NET 8 (for running the generated app)
## License
MIT