Query optimization is one of the most consequential subsystems in a relational database engine.
A declarative SQL statement admits many physically distinct execution plans, and the performance difference between the best and worst plan can span several orders of magnitude.
Cost-based query optimization (CBO) addresses this by modeling the resource consumption of candidate plans and selecting the one with the lowest estimated cost.
This article covers the core machinery: plan enumeration, cost modeling, cardinality estimation, and the search strategies that tie them together.
The Optimization Problem
A SQL query specifies what data to retrieve, not how to retrieve it.
The optimizer's job is to choose the "how." For a query joining N tables, the number of distinct left-deep join orderings is N! (N!/2 if the commutativity of the bottommost join is factored out).
Bushy trees admit even more shapes: the total number of distinct binary tree topologies with N labeled leaves (counting both shape and ordering) grows super-exponentially, reaching on the order of tens of billions for a 10-table join.
Each join can use different physical operators (nested loop, hash join, sort-merge join), each table can be accessed via different paths (sequential scan, index scan, index-only scan), and predicates can be pushed down or deferred.
The combinatorial explosion is real and motivates the use of systematic search algorithms.
A cost-based optimizer navigates this space by assigning a scalar cost estimate to each candidate plan and searching for the plan with the minimum cost.
The cost estimate is a function of estimated cardinalities, available access paths, and a model of the system's I/O and CPU characteristics.
Plan Representation
Optimizers reason over trees of relational algebra operators.
A plan tree's leaves are base table access operations, and internal nodes are joins, aggregations, sorts, and other operators.
Two important structural distinctions:
- Left-deep trees: every right child of a join is a base table. These are pipeline-friendly and the traditional focus of System R-style optimizers.
- Bushy trees: joins can appear on both sides of a join node. These expose more parallelism and can be necessary for optimal plans, but they vastly expand the search space.
Each logical operator (e.g., "join R and S on R.a = S.b") maps to one or more physical operators (hash join, merge join, nested loop join), each with different cost characteristics.
The optimizer must choose both the logical structure (join order, predicate placement) and the physical implementation.
Cardinality Estimation
Cardinality estimation is the foundation on which cost models rest.
An error in a node's cardinality estimate propagates and compounds through every operator above it in the plan tree.
The optimizer needs to estimate the number of rows produced by each operator node.
Selectivity of Predicates
For a filter predicate col = value on a table with N rows, the selectivity under a uniformity assumption is 1 / NDV(col), where NDV is the number of distinct values.
The estimated output cardinality is N / NDV(col).
For range predicates like col > value, the optimizer typically assumes uniform distribution within the column's min/max range and computes the fraction of the range that satisfies the predicate.
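These two textbook estimates can be sketched in a few lines of Python. The `ColumnStats` container and function names are illustrative, not drawn from any real optimizer; the code simply applies the uniformity assumptions described above.

```python
from dataclasses import dataclass

@dataclass
class ColumnStats:
    n_rows: int     # table cardinality
    ndv: int        # number of distinct values in the column
    min_val: float  # column minimum
    max_val: float  # column maximum

def eq_selectivity(stats: ColumnStats) -> float:
    # col = value: assume each distinct value is equally frequent
    return 1.0 / stats.ndv

def range_selectivity(stats: ColumnStats, value: float) -> float:
    # col > value: fraction of the [min, max] range above the constant,
    # assuming values are uniformly spread over that range
    if value <= stats.min_val:
        return 1.0
    if value >= stats.max_val:
        return 0.0
    return (stats.max_val - value) / (stats.max_val - stats.min_val)

stats = ColumnStats(n_rows=1_000_000, ndv=50_000, min_val=0.0, max_val=100.0)
print(round(stats.n_rows * eq_selectivity(stats)))         # 20 rows for col = v
print(round(stats.n_rows * range_selectivity(stats, 90)))  # 100000 rows for col > 90
```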
Histograms
Real data is rarely uniform.
Most production systems maintain histograms (equi-width or equi-depth) on column value distributions.
Equi-depth (equi-height) histograms partition the value domain into buckets containing roughly equal numbers of rows, giving better resolution in dense regions of the distribution.
PostgreSQL, for example, builds most-common-values lists alongside equi-depth histograms.
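A minimal equi-depth histogram can be sketched as follows. This is an illustrative toy, not PostgreSQL's implementation: boundaries are taken directly from the sorted data, and the estimator assumes uniformity only within a bucket. On skewed data it beats a plain min/max uniform estimate by a wide margin.

```python
import bisect

def build_equi_depth(values, n_buckets):
    """Return bucket boundaries [b0, ..., bn] so each bucket holds ~equal rows."""
    s = sorted(values)
    bounds = [s[min(i * len(s) // n_buckets, len(s) - 1)] for i in range(n_buckets)]
    bounds.append(s[-1])
    return bounds

def estimate_gt(bounds, n_rows, value):
    """Estimate rows with col > value, assuming uniformity inside a bucket."""
    n_buckets = len(bounds) - 1
    rows_per_bucket = n_rows / n_buckets
    i = bisect.bisect_right(bounds, value) - 1     # bucket containing value
    i = max(0, min(i, n_buckets - 1))
    lo, hi = bounds[i], bounds[i + 1]
    frac_in_bucket = 0.0 if hi == lo else (hi - value) / (hi - lo)
    full_buckets_above = n_buckets - i - 1
    return rows_per_bucket * (frac_in_bucket + full_buckets_above)

# Heavily skewed column: 90% of rows share one value
values = [1] * 900 + list(range(2, 102))
bounds = build_equi_depth(values, 10)
print(round(estimate_gt(bounds, len(values), 50)))
# ~52 rows; the true count is 51, while a plain min/max uniform
# estimate would say roughly 510
```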
Join Cardinality
For an equi-join R.a = S.b, a standard estimate (assuming value containment and uniformity) is:
|R ⋈ S| = |R| * |S| / max(NDV(R.a), NDV(S.b))
This formula assumes that the distinct values of the column with the smaller NDV are fully contained in the domain of the column with the larger NDV — that is, every key in the smaller domain finds a match in the larger domain.
Violations of this assumption (e.g., many non-matching keys, or neither column's value set being a subset of the other's) are a major source of estimation error in practice.
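The formula is a one-liner; a foreign-key join illustrates why it is intuitive under containment. The statistics below are invented for the example.

```python
def join_cardinality(n_r, n_s, ndv_a, ndv_b):
    # |R join S| on R.a = S.b under containment + uniformity:
    # each of the max(ndv_a, ndv_b) distinct values matches
    # n_r/ndv_a rows of R and n_s/ndv_b rows of S.
    return n_r * n_s / max(ndv_a, ndv_b)

# FK join: orders (1M rows, 100K distinct customer ids) to customers (100K rows).
# Every order matches exactly one customer, so the estimate is 1M rows.
print(join_cardinality(1_000_000, 100_000, 100_000, 100_000))  # 1000000.0
```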
The Independence Assumption
When multiple predicates apply, most optimizers assume statistical independence and multiply their selectivities.
If predicate P1 has selectivity 0.1 and P2 has selectivity 0.05, the combined selectivity is estimated as 0.005.
Correlated columns (e.g., city and zip code) violate this assumption badly, sometimes by orders of magnitude.
Multi-column statistics and more recently learned cardinality estimators attempt to address this, but the independence assumption remains the default in most systems.
Cost Model
The cost model translates cardinality estimates and operator choices into a scalar cost value.
A typical model accounts for:
- I/O cost: pages read from disk (sequential vs. random). Sequential I/O is cheaper per page than random I/O, typically by a factor of 4x or more in cost model parameters.
- CPU cost: per-tuple processing cost for evaluating predicates, hashing, and comparing keys.
- Memory cost: whether an operator (e.g., hash join build side) fits in the allocated memory budget, or must spill to disk.
For example, a simplified cost for a sequential scan of a table with P pages and T tuples might be:
cost_seq_scan = P * seq_page_cost + T * cpu_tuple_cost
An index scan on a selective predicate might estimate:
cost_index_scan = T_qual * random_page_cost + T_qual * cpu_tuple_cost + index_traversal_cost
where T_qual = selectivity * T is the estimated number of qualifying tuples.
Each qualifying tuple may require a separate random page fetch in the worst case (no clustering), so cost grows with the number of qualifying tuples rather than as a simple fraction of total pages.
In practice, PostgreSQL's formula also accounts for index correlation (physical ordering of tuples relative to the index order), which can reduce random I/O when the table is well-clustered on the indexed column.
The crossover point where an index scan becomes cheaper than a sequential scan depends on selectivity.
For predicates that qualify a large fraction of the table, sequential scan wins because per-tuple random I/O is expensive.
PostgreSQL exposes these cost model parameters (seq_page_cost, random_page_cost, cpu_tuple_cost, etc.) as tunable configuration values, which allows DBAs to calibrate the model to their hardware.
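The crossover behavior can be sketched with the two formulas above and PostgreSQL-style default parameters (seq_page_cost=1.0, random_page_cost=4.0, cpu_tuple_cost=0.01). The `index_traversal_cost` constant and table sizes are made up for illustration; real models, including PostgreSQL's, are considerably more detailed.

```python
SEQ_PAGE_COST, RANDOM_PAGE_COST, CPU_TUPLE_COST = 1.0, 4.0, 0.01

def cost_seq_scan(pages, tuples):
    return pages * SEQ_PAGE_COST + tuples * CPU_TUPLE_COST

def cost_index_scan(tuples, selectivity, index_traversal_cost=10.0):
    t_qual = selectivity * tuples
    # worst case: one random page fetch per qualifying tuple (no clustering)
    return t_qual * RANDOM_PAGE_COST + t_qual * CPU_TUPLE_COST + index_traversal_cost

pages, tuples = 10_000, 1_000_000
for sel in (0.0001, 0.001, 0.01, 0.1):
    seq, idx = cost_seq_scan(pages, tuples), cost_index_scan(tuples, sel)
    print(f"sel={sel}: seq={seq:.0f} idx={idx:.0f} -> {'index' if idx < seq else 'seq'}")
# With these toy numbers, the index wins below roughly 0.5% selectivity
```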
Walkthrough
The following walkthrough illustrates the System R dynamic programming approach for join order enumeration, which remains the backbone of most commercial optimizers.
System R Dynamic Programming Algorithm
Input: A set of N base relations {R1, R2, ..., RN} and a set of join predicates.
Output: The optimal join order and operator choices for the query.
// Phase 1: Single-relation plans
For each relation Ri:
    Enumerate all access paths (seq scan, each available index scan)
    Retain the cheapest plan for each "interesting order"
    (An interesting order is a sort order useful for later ORDER BY,
     GROUP BY, or merge join operators)

// Phase 2: Build optimal plans bottom-up by subset size
For k = 2 to N:
    For each subset S of size k:
        For each way to split S into (S1, S2) where:
            - S1 and S2 are non-empty and disjoint
            - S1 ∪ S2 = S
            - There exists a join predicate connecting S1 and S2
            - (System R restricts S2 to a single relation, yielding left-deep trees)
            For each physical join operator (hash, merge, nested loop):
                estimated_cost = cost(best_plan(S1)) + cost(best_plan(S2))
                                 + cost(joining S1 and S2)
                If estimated_cost < best_plan(S).cost:
                    best_plan(S) = this plan

// Phase 3: Select final plan
Return best_plan({R1, R2, ..., RN})
Interesting orders are a key insight from the original System R paper.
A plan that produces output sorted on a join column might be slightly more expensive than the cheapest plan for a subexpression, but it avoids a sort later.
The algorithm retains the cheapest plan per interesting order for each subset, not just the single cheapest plan.
This is what makes the algorithm correct despite pruning.
The time complexity is O(3^N) for subset enumeration.
This bound arises from counting, over all subsets S of the N relations, the number of ways to partition S into two disjoint non-empty parts (S1, S2).
Equivalently, for each pair (S1, S2), each relation independently falls into one of three categories: it belongs to S1, it belongs to S2, or it belongs to neither — giving at most 3^N combinations in total.
This is tractable for queries up to roughly 15–20 tables.
Beyond that, heuristic or randomized approaches become necessary.
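The dynamic program above can be condensed into a runnable sketch. This toy version is left-deep only, ignores interesting orders and physical operator choice, and uses the sum of intermediate result sizes as its cost function; the table cardinalities and join selectivities are invented for the example.

```python
from itertools import combinations

def optimize(card, sel):
    """card: {name: rows}; sel: {(a, b): selectivity} for joinable pairs (sorted keys)."""
    rels = sorted(card)
    # best[subset] = (cost, estimated rows, plan string)
    best = {frozenset([r]): (0.0, float(card[r]), r) for r in rels}
    for k in range(2, len(rels) + 1):
        for s in map(frozenset, combinations(rels, k)):
            for r in s:                       # right side: a single relation
                s1 = s - {r}
                if s1 not in best:
                    continue
                c1, rows1, plan1 = best[s1]
                f, joinable = 1.0, False
                for a in s1:                  # apply every predicate linking s1 and r
                    key = tuple(sorted((a, r)))
                    if key in sel:
                        f *= sel[key]
                        joinable = True
                if not joinable:
                    continue                  # skip cartesian products
                rows = rows1 * card[r] * f
                cost = c1 + rows              # cost = sum of intermediate result sizes
                if s not in best or cost < best[s][0]:
                    best[s] = (cost, rows, f"({plan1} JOIN {r})")
    return best[frozenset(rels)]

card = {"R": 1000, "S": 100, "T": 10}
sel = {("R", "S"): 0.01, ("S", "T"): 0.1}
print(optimize(card, sel))   # joins S and T first, then R
```

Because the best size-(k-1) plans are memoized, each subset is costed once, giving the O(3^N) subset/split enumeration rather than N! full orderings.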
Handling Large Search Spaces
For queries exceeding the practical limits of dynamic programming, optimizers employ alternative strategies:
- Greedy heuristics: build the join order incrementally, always adding the next table that produces the cheapest two-way join. Fast (O(N^2) per step) but can miss globally optimal plans.
- Randomized search: algorithms like iterative improvement or simulated annealing explore the plan space by making random perturbations (swapping join order, changing operator) and accepting improvements. These work well for very large queries (50+ tables) common in decision support workloads.
- The Cascades framework: used by SQL Server and other systems, Cascades uses a top-down, memoization-based approach with transformation rules. It explores the search space lazily, applying pruning via branch-and-bound. This is more flexible than System R-style bottom-up enumeration and handles non-join operators (aggregation pushdown, subquery decorrelation) more naturally.
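The greedy heuristic from the list above can be sketched as follows: repeatedly merge the pair of plan fragments whose join yields the smallest intermediate result. The statistics format is a toy invented for this example, and the join graph is assumed to be connected.

```python
def greedy_order(card, sel):
    """card: {name: rows}; sel: {(a, b): selectivity} for joinable pairs (sorted keys)."""
    frags = {frozenset([r]): (float(card[r]), r) for r in card}  # (rows, plan)
    while len(frags) > 1:
        best = None
        for s1 in frags:
            for s2 in frags:
                if s1 == s2:
                    continue
                f, joinable = 1.0, False
                for a in s1:
                    for b in s2:
                        key = tuple(sorted((a, b)))
                        if key in sel:
                            f *= sel[key]
                            joinable = True
                if not joinable:
                    continue
                rows = frags[s1][0] * frags[s2][0] * f
                if best is None or rows < best[0]:
                    best = (rows, s1, s2)
        rows, s1, s2 = best                   # greedily commit to the cheapest pair
        frags[s1 | s2] = (rows, f"({frags[s1][1]} JOIN {frags[s2][1]})")
        del frags[s1], frags[s2]
    return next(iter(frags.values()))

card = {"R": 1000, "S": 100, "T": 10}
sel = {("R", "S"): 0.01, ("S", "T"): 0.1}
print(greedy_order(card, sel))
```

Each step commits irrevocably to one pair, which is what makes the heuristic fast but able to miss globally optimal plans.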
Limitations and Practical Considerations
Cost-based optimization is only as good as its estimates.
In practice, cardinality estimation errors are the dominant source of plan quality problems.
Errors compound multiplicatively through a plan tree: if each join's cardinality is off by 3x, a 5-join query can see a compounded error of 3^5 = 243x at the top of the tree.
Adaptive query execution addresses this by monitoring actual cardinalities at runtime and re-optimizing or adjusting plans when estimates are found to be significantly wrong.
Oracle's adaptive plans and Apache Spark's adaptive query execution are mature examples of this approach.
PostgreSQL has made incremental progress toward runtime feedback (e.g., parallel query adjustments and partition pruning improvements), but full mid-execution re-optimization comparable to Oracle's implementation is not a standard PostgreSQL feature as of recent releases.
Another practical consideration is plan stability.
Small changes in statistics can cause the optimizer to choose a radically different plan, leading to performance regressions.
Some systems offer plan baselines or hints to mitigate this.
Key Points
- Cost-based optimization selects execution plans by estimating and minimizing a scalar cost function over the space of valid plans.
- Cardinality estimation, built on column statistics and histograms, is the single most critical input to the cost model, and the primary source of plan quality failures.
- The System R dynamic programming algorithm enumerates join orders in O(3^N) time using the principle of optimality, retaining the best plan per interesting order for each subset of relations.
- Cost models combine I/O estimates (distinguishing sequential from random access), CPU costs, and memory constraints into a single comparable metric.
- The independence assumption for multi-predicate selectivity is a known and persistent source of large estimation errors, especially on correlated columns.
- For queries joining many tables, randomized search or top-down frameworks like Cascades replace exhaustive enumeration.
- Adaptive query execution partially compensates for estimation errors by adjusting or re-optimizing plans based on observed runtime cardinalities.
References
P. Selinger, M. Astrahan, D. Chamberlin, R. Lorie, T. Price. Access Path Selection in a Relational Database Management System. Proceedings of ACM SIGMOD, 1979.
G. Graefe. The Cascades Framework for Query Optimization. IEEE Data Engineering Bulletin, 18(3), 1995.
V. Leis, A. Gubichev, A. Mirchev, P. Boncz, A. Kemper, T. Neumann. How Good Are Query Optimizers, Really? Proceedings of the VLDB Endowment, 9(3), 2015.
S. Chaudhuri. An Overview of Query Optimization in Relational Systems. Proceedings of the ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), 1998.
Y. Ioannidis. The History of Histograms (abridged). Proceedings of the 29th International Conference on Very Large Data Bases (VLDB), 2003.