# Azure RAG Pipeline Builder

Scaffold production-ready Retrieval-Augmented Generation (RAG) pipelines from inside VS Code. Walk through an interactive wizard, pick your data source, embedding model, and language — and get a complete, deployable project in seconds.

Zero boilerplate. From empty folder to working RAG chat app + Azure AI Search pipeline + Bicep infrastructure.
## Features

### 1. Interactive 8-Step Wizard

Run `RAG Builder: New RAG Pipeline` from the Command Palette and the extension walks you through every decision:
| Step | What You Choose | Options |
|------|-----------------|---------|
| 1 | Project name | Free text (e.g. `contoso-kb`) |
| 2 | Output folder | Browse to any local directory |
| 3 | Data source | Azure Blob Storage, Azure SQL Database, Azure Cosmos DB |
| 4 | Azure OpenAI endpoint | Your `https://<name>.openai.azure.com/` URL |
| 5 | Embedding model | `text-embedding-3-large`, `text-embedding-3-small`, `text-embedding-ada-002` |
| 6 | Chunking strategy | Semantic (OpenAI token-aware), Fixed-size, Page-based |
| 7 | App language | Python (FastAPI) or C# (ASP.NET Minimal API) |
| 8 | Feature flags | Vector search, Semantic ranker, Streaming chat, Bicep templates, Dockerfile |
Every step has sensible defaults — press Enter to accept them and generate a pipeline in under 30 seconds.
### 2. Azure AI Search Index Generator

Produces a fully configured search index JSON with:
- HNSW vector search — algorithm configuration with `m=4`, `efConstruction=400`, plus a vector profile tied to an Azure OpenAI vectorizer
- Semantic ranker — title, chunk, and category fields configured as semantic fields
- Data-source-specific fields — Blob metadata paths, SQL row IDs, or Cosmos DB `_ts`/partition keys
- Standard fields — `id`, `chunk`, `title`, `source`, `category`, `chunk_id`, `parent_id`, and the `chunk_vector` field sized to your chosen embedding dimensions
Example output (`search/index.json`):

```json
{
  "name": "contoso-kb-index",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true, "filterable": true },
    { "name": "chunk", "type": "Edm.String", "searchable": true },
    { "name": "title", "type": "Edm.String", "searchable": true, "filterable": true },
    { "name": "source", "type": "Edm.String", "filterable": true },
    { "name": "chunk_vector", "type": "Collection(Edm.Single)",
      "searchable": true, "dimensions": 1536,
      "vectorSearchProfile": "vector-profile" }
  ],
  "vectorSearch": {
    "algorithms": [{ "name": "hnsw-algorithm", "kind": "hnsw",
      "hnswParameters": { "m": 4, "efConstruction": 400, "metric": "cosine" } }],
    "profiles": [{ "name": "vector-profile", "algorithm": "hnsw-algorithm",
      "vectorizer": "openai-vectorizer" }],
    "vectorizers": [{ "name": "openai-vectorizer", "kind": "azureOpenAI",
      "azureOpenAIParameters": { "deploymentId": "text-embedding-3-large",
        "modelName": "text-embedding-3-large" } }]
  },
  "semantic": {
    "configurations": [{
      "name": "semantic-config",
      "prioritizedFields": {
        "titleField": { "fieldName": "title" },
        "prioritizedContentFields": [{ "fieldName": "chunk" }],
        "prioritizedKeywordsFields": [{ "fieldName": "category" }]
      }
    }]
  }
}
```
### 3. Data Source & Indexer Generator

Generates a matched pair of data source + indexer JSON files tailored to your chosen data source:
| Data Source | Connection Type | Change Detection | Deletion Detection |
|-------------|-----------------|------------------|--------------------|
| Azure Blob Storage | `azureblob` container connection | Built-in blob change tracking | Soft-delete policy |
| Azure SQL Database | `azuresql` with table/view | High-watermark (`last_updated` column) | Soft-delete column |
| Azure Cosmos DB | `cosmosdb` NoSQL API | `_ts` timestamp | Soft-delete column |
The indexer includes field mappings, output field mappings for skillset-produced chunks and vectors, and an hourly schedule.
Example — Blob data source in `search/datasource.json`:

```json
{
  "name": "contoso-kb-datasource",
  "type": "azureblob",
  "credentials": {
    "connectionString": "<your-storage-connection-string>"
  },
  "container": {
    "name": "documents"
  },
  "dataDeletionDetectionPolicy": {
    "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
    "softDeleteColumnName": "IsDeleted",
    "softDeleteMarkerValue": "true"
  }
}
```
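The indexer definition itself is not shown above. As a rough illustration of how a generator might assemble one, here is a minimal sketch in Python; the names (`contoso-kb-indexer`, the `metadata_storage_path` mapping) and the exact set of properties are assumptions, and the real file depends on your Search API version and data source:

```python
import json

def build_indexer(project: str, schedule_interval: str = "PT1H") -> dict:
    """Assemble a minimal Azure AI Search indexer definition (illustrative sketch)."""
    return {
        "name": f"{project}-indexer",
        "dataSourceName": f"{project}-datasource",
        "targetIndexName": f"{project}-index",
        "skillsetName": f"{project}-skillset",
        # Map source metadata onto index fields.
        "fieldMappings": [
            {"sourceFieldName": "metadata_storage_path", "targetFieldName": "source"}
        ],
        # Map skillset-produced vectors back into the index.
        "outputFieldMappings": [
            {"sourceFieldName": "/document/chunks/*/chunk_vector",
             "targetFieldName": "chunk_vector"}
        ],
        # ISO 8601 duration: PT1H = run hourly.
        "schedule": {"interval": schedule_interval},
    }

print(json.dumps(build_indexer("contoso-kb"), indent=2))
```

The `PT1H` interval matches the hourly schedule described above; everything else is one plausible shape, not the extension's exact output.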
### 4. Skillset Generator
Creates a multi-skill cognitive skillset with:
| # | Skill | Purpose |
|---|-------|---------|
| 1 | Text Split | Chunks documents using your chosen strategy (semantic/fixed/page) and size |
| 2 | Chunk Shaper | Generates deterministic chunk IDs from document key + chunk index |
| 3 | Azure OpenAI Embedding | Calls your embedding deployment to vectorize each chunk (conditional on vector search) |
| 4 | Entity Recognition V3 | Extracts organizations, products, events, and locations |
| 5 | Key Phrase Extraction | Pulls out key phrases for metadata enrichment |
Includes a Knowledge Store with table projections for analytics.
Example — Embedding skill excerpt from `search/skillset.json`:

```json
{
  "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
  "name": "embedding-skill",
  "description": "Generate vector embeddings via Azure OpenAI",
  "context": "/document/chunks/*",
  "deploymentId": "text-embedding-3-large",
  "modelName": "text-embedding-3-large",
  "inputs": [{ "name": "text", "source": "/document/chunks/*/chunk" }],
  "outputs": [{ "name": "embedding", "targetName": "chunk_vector" }]
}
```
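The "deterministic chunk IDs from document key + chunk index" idea behind the Chunk Shaper skill can be sketched as a pure function. This is an illustrative scheme, not the generator's actual one; hashing the combined key keeps IDs valid regardless of what characters appear in the source document key:

```python
import hashlib

def chunk_id(document_key: str, chunk_index: int) -> str:
    """Derive a stable chunk ID from document key + chunk index."""
    raw = f"{document_key}-{chunk_index}"
    # Same inputs always produce the same ID, so re-running the
    # indexer updates existing chunks instead of duplicating them.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:32]

assert chunk_id("docs/handbook.pdf", 0) == chunk_id("docs/handbook.pdf", 0)
assert chunk_id("docs/handbook.pdf", 0) != chunk_id("docs/handbook.pdf", 1)
```

Determinism is what makes incremental indexing safe: unchanged documents map to the same chunk keys on every run.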
### 5. Python App Scaffolder (FastAPI)
Generates 7 production-ready files:
```text
app/
├── requirements.txt       # All dependencies pinned
├── .env.sample            # Connection strings template
├── app/
│   ├── main.py            # FastAPI app with /health, /chat, /chat/stream
│   ├── config.py          # Pydantic Settings from environment variables
│   ├── search_client.py   # Azure AI Search retrieval (vector + semantic)
│   └── chat.py            # RAG orchestration with Azure OpenAI
└── Dockerfile             # python:3.12-slim container
```
Key capabilities:
- Vector search — uses `VectorizableTextQuery` so the Search service calls your vectorizer automatically
- Semantic ranker — configures `query_type="semantic"` with your semantic config
- Streaming — Server-Sent Events (SSE) endpoint at `/chat/stream` for token-by-token responses
- Managed identity — toggle `USE_MANAGED_IDENTITY=true` to use `DefaultAzureCredential` instead of API keys
- Conversation history — retains the last 6 turns for multi-turn conversations
- Citation instructions — system prompt instructs the model to cite sources by document title
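The six-turn history window amounts to trimming the message list before each request. A minimal sketch, assuming one turn means one user message plus one assistant message (the generated `chat.py` may implement this differently):

```python
def trim_history(history: list[dict], max_turns: int = 6) -> list[dict]:
    """Keep only the most recent turns (1 turn = user + assistant message)."""
    return history[-(max_turns * 2):]

# 10 full turns of alternating user/assistant messages.
history = [
    {"role": "user", "content": f"q{i}"} if i % 2 == 0
    else {"role": "assistant", "content": f"a{i}"}
    for i in range(20)
]
trimmed = trim_history(history)
assert len(trimmed) == 12               # 6 turns = 12 messages
assert trimmed[-1]["content"] == "a19"  # newest messages are kept
```

Capping history bounds prompt size, which keeps token usage predictable as a conversation grows.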
Example — calling the chat endpoint:

```shell
# Non-streaming
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the return policy for electronics?",
    "history": [
      {"role": "user", "content": "Tell me about warranties"},
      {"role": "assistant", "content": "Our warranty covers..."}
    ]
  }'

# Streaming (SSE)
curl -N -X POST http://localhost:8000/chat/stream \
  -H "Content-Type: application/json" \
  -d '{"query": "Summarize the HR benefits package"}'
```
Example response:

```json
{
  "answer": "According to the employee handbook [1], the benefits package includes...",
  "sources": [
    { "title": "Employee Handbook 2025", "source": "https://storage.blob.core.windows.net/docs/handbook.pdf" },
    { "title": "Benefits FAQ", "source": "https://storage.blob.core.windows.net/docs/faq.md" }
  ]
}
```
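On the client side, consuming `/chat/stream` mostly means picking the `data:` payloads out of the SSE body. A minimal parser sketch; the exact event format the generated app emits is an assumption here:

```python
def parse_sse(lines):
    """Yield the payload of each `data:` line in a Server-Sent Events stream."""
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            yield line[len("data:"):].strip()

# Simulated SSE body: blank lines separate events.
raw = ["data: According", "data: to the", "", "data: handbook [1]"]
tokens = list(parse_sse(raw))
assert tokens == ["According", "to the", "handbook [1]"]
```

With `requests`, the same generator could wrap `response.iter_lines()`; with `curl -N` you see the raw `data:` lines directly.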
### 6. C# App Scaffolder (ASP.NET Minimal API)
Generates 8 production-ready files:
```text
app/
├── contoso-kb.csproj        # .NET 8 project with Azure SDK packages
├── appsettings.json         # Configuration for Search + OpenAI + RAG settings
├── Program.cs               # Minimal API with /health, /chat, /chat/stream
├── Models/
│   ├── ChatRequest.cs       # Request record (Query, History)
│   └── ChatResponse.cs      # Response record (Answer, Sources, Usage)
├── Services/
│   ├── SearchService.cs     # Azure AI Search with VectorizableTextQuery
│   └── ChatService.cs       # RAG orchestration with AzureOpenAIClient
└── Dockerfile               # Multi-stage .NET 8 build
```
NuGet packages included:

- `Azure.Search.Documents` 11.6.0
- `Azure.AI.OpenAI` 2.0.0
- `Azure.Identity` 1.12.0
Example — calling the C# chat endpoint:

```shell
curl -X POST http://localhost:5000/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "How do I configure VPN access?"}'
```
Example response:

```json
{
  "answer": "To configure VPN access, follow these steps from the IT Security Guide [1]...",
  "sources": [
    { "title": "IT Security Guide", "source": "blob://docs/security-guide.pdf" }
  ],
  "usage": { "promptTokens": 1240, "completionTokens": 387 }
}
```
### 7. Bicep Infrastructure Templates
One command deploys your entire Azure backend:
```text
infra/
├── main.bicep               # Orchestrator (parameters + module references)
├── main.parameters.json     # Parameter values file
└── modules/
    ├── search.bicep         # Azure AI Search (Basic SKU, semantic enabled)
    ├── openai.bicep         # Azure OpenAI + embedding & chat deployments
    ├── storage.bicep        # Blob Storage account + container (Blob data source)
    ├── sql.bicep            # Azure SQL Server + database (SQL data source)
    └── cosmos.bicep         # Cosmos DB account + SQL database (Cosmos data source)
deploy.ps1                   # One-click PowerShell deployment script
```
Only the data-source module matching your selection is included. Resources are named with a configurable prefix.
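Prefix-based naming can be sketched as below. This is illustrative only (the real naming expressions live in `main.bicep`); the storage-account branch exists because Azure storage account names allow only lowercase letters and digits, up to 24 characters:

```python
def resource_name(prefix: str, suffix: str, storage: bool = False) -> str:
    """Build an Azure resource name from a configurable prefix."""
    name = f"{prefix}-{suffix}"
    if storage:
        # Storage accounts: lowercase alphanumerics only, max 24 chars.
        name = name.replace("-", "").lower()[:24]
    return name

assert resource_name("contoso-kb", "search") == "contoso-kb-search"
assert resource_name("contoso-kb", "stor", storage=True) == "contosokbstor"
```

This matches the `contosokbstor` account name that appears in the walkthrough below, where hyphens from the `contoso-kb` prefix have been stripped.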
Example deployment:

```shell
# One-click deploy
./deploy.ps1

# Or manually:
az group create --name contoso-kb-rg --location eastus2
az deployment group create \
  --resource-group contoso-kb-rg \
  --template-file infra/main.bicep \
  --parameters infra/main.parameters.json
```
Outputs the Search endpoint, OpenAI endpoint, and data source connection info.
## Commands

Open the Command Palette (`Ctrl+Shift+P`) and type `RAG Builder`:
| Command | Description |
|---------|-------------|
| `RAG Builder: New RAG Pipeline` | Full wizard → generates search JSON + app code + Bicep infrastructure + README |
| `RAG Builder: Generate AI Search Index Only` | Wizard → opens the index JSON in the editor |
| `RAG Builder: Generate Skillset + Indexer Only` | Wizard → opens the skillset JSON in the editor |
| `RAG Builder: Generate Chat App Only` | Wizard → writes Python or C# app files to your output folder |
## Extension Settings

Configure defaults via Settings → Extensions → Azure RAG Pipeline Builder:
| Setting | Default | Description |
|---------|---------|-------------|
| `ragBuilder.defaultLanguage` | `python` | Default app language (`python` or `csharp`) |
| `ragBuilder.defaultDataSource` | `blob` | Default data source (`blob`, `sql`, `cosmosdb`) |
| `ragBuilder.embeddingModel` | `text-embedding-3-large` | Azure OpenAI embedding deployment name |
| `ragBuilder.embeddingDimensions` | `1536` | Vector dimensions for the embedding model |
| `ragBuilder.chatModel` | `gpt-4o` | Azure OpenAI chat model deployment name |
| `ragBuilder.chunkSize` | `1024` | Target chunk size in tokens |
| `ragBuilder.chunkOverlap` | `128` | Overlap between consecutive chunks in tokens |
## End-to-End Walkthrough

Scenario: Build a knowledge-base chatbot over PDF documents in Blob Storage.

### Step 1 — Generate the pipeline
- Press `Ctrl+Shift+P` → `RAG Builder: New RAG Pipeline`
- Enter project name: `contoso-kb`
- Pick output folder: `C:\Projects\contoso-kb`
- Data source: Azure Blob Storage → container: `documents`
- OpenAI endpoint: `https://myopenai.openai.azure.com/`
- Embedding model: `text-embedding-3-large`
- Chunking: Semantic (1024 tokens, 128 overlap)
- Language: Python
- Features: ✅ Vector Search, ✅ Semantic Ranker, ✅ Streaming, ✅ Bicep, ✅ Dockerfile
### Step 2 — Deploy infrastructure

```shell
cd C:\Projects\contoso-kb
./deploy.ps1
# Creates: contoso-kb-rg with AI Search + OpenAI + Storage Account
```
### Step 3 — Upload documents

```shell
az storage blob upload-batch \
  --account-name contosokbstor \
  --destination documents \
  --source ./my-pdfs/
```
### Step 4 — Create the search pipeline
Upload the generated JSON to your Search service using the REST API or Azure Portal:
```shell
# Create the index
curl -X POST "https://contosokbsearch.search.windows.net/indexes?api-version=2024-07-01" \
  -H "api-key: <key>" -H "Content-Type: application/json" \
  -d @search/index.json

# Create the data source
curl -X POST "https://contosokbsearch.search.windows.net/datasources?api-version=2024-07-01" \
  -H "api-key: <key>" -H "Content-Type: application/json" \
  -d @search/datasource.json

# Create the skillset
curl -X POST "https://contosokbsearch.search.windows.net/skillsets?api-version=2024-07-01" \
  -H "api-key: <key>" -H "Content-Type: application/json" \
  -d @search/skillset.json

# Create the indexer (starts crawling immediately)
curl -X POST "https://contosokbsearch.search.windows.net/indexers?api-version=2024-07-01" \
  -H "api-key: <key>" -H "Content-Type: application/json" \
  -d @search/indexer.json
```
### Step 5 — Run the app

```shell
cd app
cp .env.sample .env
# Fill in AZURE_SEARCH_ENDPOINT, AZURE_OPENAI_ENDPOINT, API keys
pip install -r requirements.txt
uvicorn app.main:app --reload
```
### Step 6 — Chat!

```shell
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the refund policy for enterprise customers?"}'
```
## Generated Project Structure

```text
contoso-kb/
├── README.md                # Auto-generated project docs
├── deploy.ps1               # Infrastructure deployment script
│
├── search/                  # Azure AI Search artifacts
│   ├── index.json           # Index with vector + semantic config
│   ├── datasource.json      # Data source connection
│   ├── indexer.json         # Indexer with field mappings
│   └── skillset.json        # Cognitive skillset (chunking + embedding + enrichment)
│
├── app/                     # Chat application
│   ├── requirements.txt / .csproj
│   ├── .env.sample / appsettings.json
│   ├── app/main.py / Program.cs
│   ├── app/config.py / Models/
│   ├── app/search_client.py / Services/SearchService.cs
│   ├── app/chat.py / Services/ChatService.cs
│   └── Dockerfile
│
└── infra/                   # Bicep IaC templates
    ├── main.bicep
    ├── main.parameters.json
    └── modules/
        ├── search.bicep
        ├── openai.bicep
        └── storage.bicep / sql.bicep / cosmos.bicep
```
## Supported Configurations

### Data Sources
| Source | Container Config | Change Detection |
|--------|------------------|------------------|
| Azure Blob Storage | Container name | Native blob tracking |
| Azure SQL Database | Table/view name | High-watermark column |
| Azure Cosmos DB (NoSQL) | Database + container | `_ts` timestamp |
### Embedding Models
| Model | Dimensions | Notes |
|-------|------------|-------|
| `text-embedding-3-large` | 1536 (default) or 3072 | Best quality, higher cost |
| `text-embedding-3-small` | 1536 | Good balance of quality and cost |
| `text-embedding-ada-002` | 1536 | Legacy model, widely deployed |
### Chunking Strategies
| Strategy | Best For | Description |
|----------|----------|-------------|
| Semantic | General documents | Token-aware splitting that respects sentence boundaries |
| Fixed-size | Uniform content | Equal-sized chunks with configurable overlap |
| Page-based | PDFs, presentations | Preserves page boundaries from the source document |
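The fixed-size strategy with overlap reduces to a sliding window over tokens. A sketch using the default 1024-token size and 128-token overlap; it operates on a pre-tokenized list for illustration, whereas the real splitter is the token-aware skill described above:

```python
def fixed_size_chunks(tokens: list[str], size: int = 1024, overlap: int = 128):
    """Split tokens into fixed-size chunks; consecutive chunks share `overlap` tokens."""
    step = size - overlap  # advance by size minus overlap each time
    return [tokens[i:i + size]
            for i in range(0, max(len(tokens) - overlap, 1), step)]

tokens = [f"t{i}" for i in range(2500)]
chunks = fixed_size_chunks(tokens)
assert len(chunks[0]) == 1024
assert chunks[1][:128] == chunks[0][-128:]  # 128-token overlap preserved
```

The overlap means a sentence falling on a chunk boundary still appears whole in at least one chunk, which is why retrieval quality usually improves with a modest overlap.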
## Requirements
- VS Code 1.85.0 or later
- Azure subscription with:
- Azure AI Search service (Basic tier or above for semantic ranker)
- Azure OpenAI resource with embedding + chat model deployments
- Data source (Blob Storage, Azure SQL, or Cosmos DB)
- Azure CLI (for Bicep deployment)
- Python 3.10+ or .NET 8 (for running the generated app)
## License
MIT