Query Cost Estimator
A VS Code extension that predicts the execution cost of SQL queries before you run them. Catch expensive queries at write time with static analysis.
Features
- CodeLens Integration: See estimated cost above each SQL query
- Real-time Diagnostics: Get warnings for expensive operations
- Detailed Hover Info: View cost breakdown by hovering over queries
- Static Analysis: No database connection required
- Custom Cost Model: Uses the Adaptive Query Cost Model (AQCM) algorithm
What It Analyzes
- Table scans and index usage
- JOIN complexity and cardinality
- Subquery and CTE costs
- Window functions
- Aggregations and sorting
- Memory usage estimates
- Execution time predictions
Installation
- Install the extension from the VS Code Marketplace
- Open any .sql file
- (Optional) Configure the schema path in settings for more accurate estimates
Configuration
{
"queryCostEstimator.enginePath": "", // Auto-detected by default
"queryCostEstimator.schemaPath": "/path/to/schema.json" // Optional
}
Create a schema.json file with your database statistics:
{
"tables": [
{
"name": "users",
"row_count": 100000,
"avg_row_size_bytes": 256,
"columns": {
"id": { "name": "id", "cardinality": 100000, "avg_width": 8 }
},
"indexes": [
{ "name": "users_pkey", "columns": ["id"], "primary": true, "unique": true }
]
}
]
}
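Before pointing the extension at a statistics file, the structure above can be sanity-checked with a short script. This is an illustrative sketch, not part of the extension; the `validate_schema` helper and its specific checks are assumptions about what a well-formed file looks like:

```python
import json

def validate_schema(schema: dict) -> list[str]:
    """Return a list of problems found in a schema.json-style dict."""
    problems = []
    for table in schema.get("tables", []):
        name = table.get("name", "<unnamed>")
        if table.get("row_count", 0) <= 0:
            problems.append(f"{name}: row_count should be a positive integer")
        # A column's distinct-value count cannot exceed the table's row count.
        for col, stats in table.get("columns", {}).items():
            if stats.get("cardinality", 0) > table.get("row_count", 0):
                problems.append(f"{name}.{col}: cardinality exceeds row_count")
        # Every indexed column should have column statistics defined.
        for index in table.get("indexes", []):
            missing = [c for c in index.get("columns", [])
                       if c not in table.get("columns", {})]
            if missing:
                problems.append(f"{name}.{index.get('name')}: unknown columns {missing}")
    return problems

schema = json.loads("""
{ "tables": [ { "name": "users", "row_count": 100000,
    "avg_row_size_bytes": 256,
    "columns": { "id": { "name": "id", "cardinality": 100000, "avg_width": 8 } },
    "indexes": [ { "name": "users_pkey", "columns": ["id"],
                   "primary": true, "unique": true } ] } ] }
""")
print(validate_schema(schema))  # → []
```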
Usage
Basic Usage
Open any .sql file in VS Code
CodeLens appears automatically above each query showing:
- Total estimated cost
- Estimated execution time
- Complexity level (low/medium/high/critical)
Hover over a query to see detailed breakdown:
- Scan cost
- Join cost
- Sort cost
- Filter cost
- Aggregate cost
- Memory usage
- Warnings and suggestions
Advanced Usage
Custom Schema Configuration
For more accurate estimates, provide a schema file:
- Create a schema.json file with your database statistics (see the schema format under Configuration above)
- Open VS Code Settings (Cmd/Ctrl + ,)
- Search for "Query Cost Estimator"
- Set "Schema Path" to your schema.json file
- Reload the window, or run "Query Cost: Refresh Estimates" from the Command Palette
Working with Complex Queries
The extension analyzes:
- Nested subqueries: Each subquery is analyzed independently
- CTEs (Common Table Expressions): Materialization costs are estimated
- Window functions: Partition and ordering costs included
- Multiple JOINs: Cost compounds based on join order and type
- UNION/INTERSECT/EXCEPT: Set operation costs calculated
Interpreting Results
The cost estimator provides several metrics:
- Total Cost: Overall query complexity score (0-1,000,000+)
- Execution Time: Predicted runtime based on cost model
- Rows Estimated: Expected result set size
- Memory Usage: Approximate memory footprint
Warning Indicators:
- Yellow warning: Medium-high cost query
- Red error: Critical cost query that may cause performance issues
Cost Levels
- Low (< 1,000): Simple queries with index usage
- Medium (1,000-10,000): Moderate complexity, small joins
- High (10,000-100,000): Large scans or complex joins
- Critical (> 100,000): Extremely expensive queries (e.g., cross joins)
Examples
Example 1: Simple Index Lookup
SELECT * FROM users WHERE id = 1;
CodeLens: 💰 Cost: 322 | ⏱️ ~32ms | Complexity: low
Hover Details:
Total Cost: 322.07
Execution Time: ~32ms
Rows Estimated: 10
Memory: 1KB
Breakdown:
- Scan Cost: 122.07
- Filter Cost: 100
- Join Cost: 100
Example 2: Complex JOIN Query
SELECT u.name, o.total, p.product_name
FROM users u
JOIN orders o ON u.id = o.user_id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
WHERE u.created_at > '2024-01-01'
AND o.status = 'completed';
CodeLens: 💰 Cost: 45,823 | ⏱️ ~4.5s | Complexity: high
Diagnostics: ⚠️ Warning: High cost query detected - Consider adding indexes on join columns
Example 3: Expensive Aggregation
SELECT
category,
COUNT(*) as total,
AVG(price) as avg_price,
PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY price) as p95_price
FROM products
GROUP BY category
HAVING COUNT(*) > 100;
CodeLens: 💰 Cost: 12,450 | ⏱️ ~1.2s | Complexity: medium
Hover Details:
Total Cost: 12,450
Execution Time: ~1.2s
Rows Estimated: 50
Memory: 256KB
Breakdown:
- Scan Cost: 8,200
- Filter Cost: 1,250
- Aggregate Cost: 2,000
- Sort Cost: 1,000
Warnings:
- Large aggregation without index
- PERCENTILE function is expensive on large datasets
Example 4: Dangerous Cross Join
SELECT * FROM users, orders WHERE users.status = 'active';
CodeLens: 💰 Cost: 1,250,000 | ⏱️ ~125s | Complexity: CRITICAL
Diagnostics: ❌ Error: Cartesian product detected - Missing JOIN condition
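The Cartesian-product diagnostic above is the kind of structural check static analysis can perform. The sketch below is a deliberately naive, regex-based illustration (a real implementation would use a SQL parser); `naive_cartesian_check` is a hypothetical helper, not the extension's actual code:

```python
import re

def naive_cartesian_check(sql: str) -> bool:
    """Flag comma-style joins whose WHERE clause never compares columns
    of the two tables. Naive by design: only handles two tables and
    simple `a.col = b.col` predicates."""
    m = re.search(r"FROM\s+(\w+)\s*,\s*(\w+)", sql, re.IGNORECASE)
    if not m:
        return False
    a, b = m.groups()
    # Look for any predicate linking the two tables, e.g. a.x = b.y.
    linked = re.search(rf"{a}\.\w+\s*=\s*{b}\.\w+|{b}\.\w+\s*=\s*{a}\.\w+", sql)
    return linked is None

print(naive_cartesian_check(
    "SELECT * FROM users, orders WHERE users.status = 'active'"))   # True (flagged)
print(naive_cartesian_check(
    "SELECT * FROM users, orders WHERE users.id = orders.user_id")) # False
```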
Demo in Action
When you open a SQL file, you'll see:
- CodeLens above each query - Shows cost at a glance
- Inline diagnostics - Yellow/red squiggles for expensive queries
- Rich hover tooltips - Detailed cost breakdown and suggestions
- Problems panel - All warnings/errors listed for quick navigation
Commands
- Query Cost: Refresh Estimates - Manually refresh all cost estimates
Requirements
- VS Code 1.85.0 or higher
- Works offline - no database connection needed
Known Limitations
Estimation Accuracy
- Static analysis only: Estimates are based on query structure, not actual database execution plans
- No runtime stats: Cannot account for database cache state, concurrent queries, or I/O performance
- Cardinality estimation: Without actual table statistics, row count predictions may be inaccurate
- Hardware variability: Actual execution time depends on CPU, memory, disk speed, and database configuration
SQL Feature Support
Current limitations on SQL features:
- Stored procedures: Not analyzed (only raw SQL queries)
- Triggers: Side effects not included in cost
- Database-specific syntax: Optimized for standard SQL, may not recognize vendor extensions
- Dynamic SQL: Cannot analyze queries built at runtime
- Views: Treated as simple selects, materialization cost may be underestimated
Cost Model Assumptions
- Index selectivity: Assumes standard B-tree index structures
- Join algorithms: Uses nested loop cost model (actual DB may use hash/merge join)
- Parallelization: Does not account for parallel query execution
- Buffer pool: Assumes cold start (no warm cache benefit)
Known Issues
- Very large queries (>10KB) may have slower analysis
- Complex recursive CTEs may have estimation errors
- Cross-database queries not supported
- Temporary tables require schema definition for accurate estimates
Best Practices
For most accurate results:
- Provide a schema.json file with actual table row counts
- Include index definitions in schema
- Update schema periodically as data grows
- Use estimates as relative comparisons, not absolute predictions
- Validate critical queries with EXPLAIN ANALYZE on actual database
What It Is Not
- Not a query optimizer: Does not rewrite or suggest alternative queries
- Not a profiler: Does not measure actual execution performance
- Not database-connected: Does not validate query correctness or schema existence
- Not a replacement for testing: Always benchmark critical queries on production-like data
License
MIT
Issues & Feedback
Report issues at: https://github.com/Prakhar998/query-cost/issues