Statement processing #
- Parsing
- Parse, check syntax error
- Optimization
- Get statistics of tables and columns
- type check
- select access path => Access Specification Language
- Code Generation
- Execution
Research Storage System #
Costs of single relation access paths #
COST = PAGE FETCHES + W * (RSI CALLS)
- PAGE FETCHES: IO
- RSI CALL: Predicted number of tuples
- F * relation cardinalities
- F: Selectivity factor of the Boolean factors
- e.g.
WHERE colulmn = value
, F = 1 / unique key count- (Assume there is an index on column, assume even distribution)
- Calculate cost for interesting order (some order as
GROUP BY
ORDER BY
) and unordered
Join #
- List all join order permutations
- 剪枝: cartesian products should perform late (Join relations with join predicates between first)
- Calculate cost (BFS), decide algorithm (nested loop vs sort-merge) and find cheapest

Cost algorithm #
C-nested-loop-join(pathl,path2) =
C-outer(path1) + N * C-inner(path2)
C-merge(pathl,path2) =
C-outer(path1) + N * C-inner(path2)
- C-outer/inner: cost of scanning
- C-inner will be much less for merge-sort join
- N be the cardinality of the outer relation tuples which satisfy the applicable predicates