Query Cost Estimator
A VS Code extension that predicts the execution cost of SQL queries before you run them. Catch expensive queries at write time with static analysis.
Features
- CodeLens Integration: See estimated cost above each SQL query
- Real-time Diagnostics: Get warnings for expensive operations
- Detailed Hover Info: View cost breakdown by hovering over queries
- Static Analysis: No database connection required
- Custom Cost Model: Uses the Adaptive Query Cost Model (AQCM) algorithm
What It Analyzes
- Table scans and index usage
- JOIN complexity and cardinality
- Subquery and CTE costs
- Window functions
- Aggregations and sorting
- Memory usage estimates
- Execution time predictions
Installation
- Install the extension from the VS Code Marketplace
- Open any .sql file
- (Optional) Configure the schema path in settings for more accurate estimates
Configuration
{
"queryCostEstimator.enginePath": "", // Auto-detected by default
"queryCostEstimator.schemaPath": "/path/to/schema.json" // Optional
}
Create a schema.json file with your database statistics:
{
"tables": [
{
"name": "users",
"row_count": 100000,
"avg_row_size_bytes": 256,
"columns": {
"id": { "name": "id", "cardinality": 100000, "avg_width": 8 }
},
"indexes": [
{ "name": "users_pkey", "columns": ["id"], "primary": true, "unique": true }
]
}
]
}
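Before pointing the extension at a statistics file, the structure above can be sanity-checked with a short script. This is an illustrative sketch, not part of the extension; the `validate_schema` helper and its specific checks are assumptions about what a well-formed file looks like:

```python
import json

def validate_schema(schema: dict) -> list[str]:
    """Return a list of problems found in a schema.json-style dict."""
    problems = []
    for table in schema.get("tables", []):
        name = table.get("name", "<unnamed>")
        if table.get("row_count", 0) <= 0:
            problems.append(f"{name}: row_count should be a positive integer")
        # A column's distinct-value count cannot exceed the table's row count.
        for col, stats in table.get("columns", {}).items():
            if stats.get("cardinality", 0) > table.get("row_count", 0):
                problems.append(f"{name}.{col}: cardinality exceeds row_count")
        # Every indexed column should have column statistics defined.
        for index in table.get("indexes", []):
            missing = [c for c in index.get("columns", [])
                       if c not in table.get("columns", {})]
            if missing:
                problems.append(f"{name}.{index.get('name')}: unknown columns {missing}")
    return problems

schema = json.loads("""
{ "tables": [ { "name": "users", "row_count": 100000,
    "avg_row_size_bytes": 256,
    "columns": { "id": { "name": "id", "cardinality": 100000, "avg_width": 8 } },
    "indexes": [ { "name": "users_pkey", "columns": ["id"],
                   "primary": true, "unique": true } ] } ] }
""")
print(validate_schema(schema))  # → []
```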
Usage
Basic Usage
Open any .sql file in VS Code
CodeLens appears automatically above each query showing:
- Total estimated cost
- Estimated execution time
- Complexity level (low/medium/high/critical)
Hover over a query to see detailed breakdown:
- Scan cost
- Join cost
- Sort cost
- Filter cost
- Aggregate cost
- Memory usage
- Warnings and suggestions
Advanced Usage
Custom Schema Configuration
For more accurate estimates, provide a schema file:
- Create a schema.json file with your database statistics (see the schema format under Configuration above)
- Open VS Code Settings (Cmd/Ctrl + ,)
- Search for "Query Cost Estimator"
- Set "Schema Path" to your schema.json file
- Reload the window, or run "Query Cost: Refresh Estimates" from the Command Palette
Working with Complex Queries
The extension analyzes:
- Nested subqueries: Each subquery is analyzed independently
- CTEs (Common Table Expressions): Materialization costs are estimated
- Window functions: Partition and ordering costs included
- Multiple JOINs: Cost compounds based on join order and type
- UNION/INTERSECT/EXCEPT: Set operation costs calculated
Interpreting Results
The cost estimator provides several metrics:
- Total Cost: Overall query complexity score (0-1,000,000+)
- Execution Time: Predicted runtime based on cost model
- Rows Estimated: Expected result set size
- Memory Usage: Approximate memory footprint
Warning Indicators:
- Yellow warning: Medium-high cost query
- Red error: Critical cost query that may cause performance issues
Cost Levels
- Low (< 1,000): Simple queries with index usage
- Medium (1,000-10,000): Moderate complexity, small joins
- High (10,000-100,000): Large scans or complex joins
- Critical (> 100,000): Extremely expensive queries (e.g., cross joins)
Examples
Example 1: Simple Index Lookup
SELECT * FROM users WHERE id = 1;
CodeLens: 💰 Cost: 322 | ⏱️ ~32ms | Complexity: low
Hover Details:
Total Cost: 322.07
Execution Time: ~32ms
Rows Estimated: 10
Memory: 1KB
Breakdown:
- Scan Cost: 122.07
- Filter Cost: 100
- Join Cost: 100
Example 2: Complex JOIN Query
SELECT u.name, o.total, p.product_name
FROM users u
JOIN orders o ON u.id = o.user_id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
WHERE u.created_at > '2024-01-01'
AND o.status = 'completed';
CodeLens: 💰 Cost: 45,823 | ⏱️ ~4.5s | Complexity: high
Diagnostics: ⚠️ Warning: High cost query detected - Consider adding indexes on join columns
Example 3: Expensive Aggregation
SELECT
category,
COUNT(*) as total,
AVG(price) as avg_price,
PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY price) as p95_price
FROM products
GROUP BY category
HAVING COUNT(*) > 100;
CodeLens: 💰 Cost: 12,450 | ⏱️ ~1.2s | Complexity: medium
Hover Details:
Total Cost: 12,450
Execution Time: ~1.2s
Rows Estimated: 50
Memory: 256KB
Breakdown:
- Scan Cost: 8,200
- Filter Cost: 1,250
- Aggregate Cost: 2,000
- Sort Cost: 1,000
Warnings:
- Large aggregation without index
- PERCENTILE function is expensive on large datasets
Example 4: Dangerous Cross Join
SELECT * FROM users, orders WHERE users.status = 'active';
CodeLens: 💰 Cost: 1,250,000 | ⏱️ ~125s | Complexity: CRITICAL
Diagnostics: ❌ Error: Cartesian product detected - Missing JOIN condition
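The Cartesian-product diagnostic above is the kind of structural check static analysis can perform. The sketch below is a deliberately naive, regex-based illustration (a real implementation would use a SQL parser); `naive_cartesian_check` is a hypothetical helper, not the extension's actual code:

```python
import re

def naive_cartesian_check(sql: str) -> bool:
    """Flag comma-style joins whose WHERE clause never compares columns
    of the two tables. Naive by design: only handles two tables and
    simple `a.col = b.col` predicates."""
    m = re.search(r"FROM\s+(\w+)\s*,\s*(\w+)", sql, re.IGNORECASE)
    if not m:
        return False
    a, b = m.groups()
    # Look for any predicate linking the two tables, e.g. a.x = b.y.
    linked = re.search(rf"{a}\.\w+\s*=\s*{b}\.\w+|{b}\.\w+\s*=\s*{a}\.\w+", sql)
    return linked is None

print(naive_cartesian_check(
    "SELECT * FROM users, orders WHERE users.status = 'active'"))   # True (flagged)
print(naive_cartesian_check(
    "SELECT * FROM users, orders WHERE users.id = orders.user_id")) # False
```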
Demo in Action
When you open a SQL file, you'll see:
- CodeLens above each query - Shows cost at a glance
- Inline diagnostics - Yellow/red squiggles for expensive queries
- Rich hover tooltips - Detailed cost breakdown and suggestions
- Problems panel - All warnings/errors listed for quick navigation
Commands
- Query Cost: Refresh Estimates - Manually refresh all cost estimates
Requirements
- VS Code 1.85.0 or higher
- Works offline - no database connection needed
Known Limitations
Estimation Accuracy
- Static analysis only: Estimates are based on query structure, not actual database execution plans
- No runtime stats: Cannot account for database cache state, concurrent queries, or I/O performance
- Cardinality estimation: Without actual table statistics, row count predictions may be inaccurate
- Hardware variability: Actual execution time depends on CPU, memory, disk speed, and database configuration
SQL Feature Support
Current limitations on SQL features:
- Stored procedures: Not analyzed (only raw SQL queries)
- Triggers: Side effects not included in cost
- Database-specific syntax: Optimized for standard SQL, may not recognize vendor extensions
- Dynamic SQL: Cannot analyze queries built at runtime
- Views: Treated as simple selects, materialization cost may be underestimated
Cost Model Assumptions
- Index selectivity: Assumes standard B-tree index structures
- Join algorithms: Uses nested loop cost model (actual DB may use hash/merge join)
- Parallelization: Does not account for parallel query execution
- Buffer pool: Assumes cold start (no warm cache benefit)
Known Issues
- Very large queries (>10KB) may have slower analysis
- Complex recursive CTEs may have estimation errors
- Cross-database queries not supported
- Temporary tables require schema definition for accurate estimates
Best Practices
For most accurate results:
- Provide a schema.json file with actual table row counts
- Include index definitions in schema
- Update schema periodically as data grows
- Use estimates as relative comparisons, not absolute predictions
- Validate critical queries with EXPLAIN ANALYZE on actual database
What It Is Not
- Not a query optimizer: Does not rewrite or suggest alternative queries
- Not a profiler: Does not measure actual execution performance
- Not database-connected: Does not validate query correctness or schema existence
- Not a replacement for testing: Always benchmark critical queries on production-like data
License
MIT
Issues & Feedback
Report issues at: https://github.com/Prakhar998/query-cost/issues