![]() |
Special January 2017 - Exploring the join execution plan |
---|
Category : User |
Level : Advanced |
Back to Overview |
To achieve an appropriate performance for a user application, database queries should be executed in a efficient way. This requires the use of index objects wherever possible. The join optimizer of cego is rule based, which makes it very easy to analyse the execution plan of any database select. Today we would like to take a closer look at how queries can be analyzed.
First, we create a table t1 and fill it with a few records
CGCLT > create table t1 ( a int, b string(30)); Table t1 created ok ( 0.031 s ) CGCLT > insert into t1 values ( 1, 'alpha'); 1 tuples inserted ok ( 0.000 s ) CGCLT > insert into t1 values ( 2, 'beta'); 1 tuples inserted ok ( 0.000 s ) CGCLT > insert into t1 values ( 3, 'gamma'); 1 tuples inserted ok ( 0.000 s ) |
Now, we can formulate a simple query on this table
CGCLT > select a from t1 where a = 2; +-----------+ | t1 | | a | +-----------+ | 2 | +-----------+ 1 tuples ok ( 0.062 s ) |
To check out the execution plan for this query, just add the plan keyword at the beginning
CGCLT > plan select a from t1 where a = 2; +-----------------------------------------------------------------------+ | PLAN | | DESCRIPTION | +-----------------------------------------------------------------------+ | Execution plan | | --------------- | | Joining table t1 (t1) with full table scan using condition t1.a = 2 | +-----------------------------------------------------------------------+ 1 tuples ok ( 0.062 s ) |
Since there still does not exist any index object for this table, a full table scan is performed for this query. Now we create an index object for table t1 on attribute a
CGCLT > create btree b1 on t1(a); Btree b1 created ok ( 0.000 s ) |
When we look at the execution plan, we realize that the index is used
CGCLT > plan select a from t1 where a = 2; +-----------------------------------------------------------+ | PLAN | | DESCRIPTION | +-----------------------------------------------------------+ | Execution plan | | --------------- | | Joining table t1 (t1) with full index trace on t1.a = 2 | +-----------------------------------------------------------+ 1 tuples ok ( 0.000 s ) |
If we expand the condition of our simple query, this also has an effect to the execution plan.
CGCLT > plan select a from t1 where a = 2 and b = 'beta'; +-----------------------------------------------------------------------------------------+ | PLAN | | DESCRIPTION | +-----------------------------------------------------------------------------------------+ | Execution plan | | --------------- | | Joining table t1 (t1) with index support on t1.b = 'beta' and t1.a = 2 using index b1 | +-----------------------------------------------------------------------------------------+ 1 tuples ok ( 0.015 s ) |
Now the index b1 is just used in a supported way, this means, that the index is used as a primary filter, but the condition must be evaluated additionally to get the final result list. This kind of execution has some performance impact, and in our example it can be avoided very easliy. We just have to provide an appropriate index object with a key for both attributes.
CGCLT > drop btree b1; Btree b1 dropped ok ( 0.000 s ) CGCLT > create btree b1 on t1(a,b); Btree b1 created ok ( 0.000 s ) |
If we check the execution plan, we get a full index trace again.
CGCLT > plan select a from t1 where a = 2 and b = 'beta'; +-----------------------------------------------------------------------------+ | PLAN | | DESCRIPTION | +-----------------------------------------------------------------------------+ | Execution plan | | --------------- | | Joining table t1 (t1) with full index trace on t1.b = 'beta' and t1.a = 2 | +-----------------------------------------------------------------------------+ 1 tuples ok ( 0.000 s ) |
To analyse more advanced queries, we create a second table t2 and insert some records.
CGCLT > create table t2 ( a int, b string(30)) Table t2 created ok ( 0.000 s ) CGCLT > insert into t2 values ( 2, 'beta'); 1 tuples inserted ok ( 0.000 s ) CGCLT > insert into t2 values ( 3, 'xxx'); 1 tuples inserted ok ( 0.000 s ) |
Now we write a simple join on both tables
CGCLT > select ta.a from t1 ta, t2 tb where ta.a = 2 and ta.a = tb.a; +-----------+ | ta | | a | +-----------+ | 2 | +-----------+ 1 tuples ok ( 0.000 s ) |
The join can be analyzed be printing the execution for the query. For each joined table it is indicated, how the join is performed
CGCLT > plan select ta.a from t1 ta, t2 tb where ta.a = 2 and ta.a = tb.a; +--------------------------------------------------------------------------------+ | PLAN | | DESCRIPTION | +--------------------------------------------------------------------------------+ | Execution plan | | --------------- | | Joining table t1 (ta) with full index trace on ta.a = 2 | | Joining table t2 (tb) with full table scan using condition tb.a = ta.a(null) | +--------------------------------------------------------------------------------+ 1 tuples ok ( 0.000 s ) |
Since on table t2 still a full table scan is performed, we provide another index for t2(a)
CGCLT > create btree b2 on t2(a); create btree b2 on t2(a); Btree b2 created ok ( 0.000 s ) |
Now the query can be performed with full index usage. We check another time the execution for this.
CGCLT > plan select ta.a from t1 ta, t2 tb where ta.a = 2 and ta.a = tb.a; +--------------------------------------------------------------------+ | PLAN | | DESCRIPTION | +--------------------------------------------------------------------+ | Execution plan | | --------------- | | Joining table t1 (ta) with full index trace on ta.a = 2 | | Joining table t2 (tb) with full index trace on tb.a = ta.a(null) | +--------------------------------------------------------------------+ 1 tuples ok ( 0.000 s ) |
As a preparation for the next query, we create a third btree object on t2(b)
CGCLT > create btree b3 on t2(b); create btree b3 on t2(b); Btree b3 created ok ( 0.000 s ) |
Now we check out the plan for a query with a nested subselect. The execution also indicates, how the subselect is executed.
CGCLT > plan select a from t1 where a in ( select a from t2 where b = 'xxx'); +-----------------------------------------------------------------+ | PLAN | | DESCRIPTION | +-----------------------------------------------------------------+ | Execution plan | | --------------- | | Joining table t1 (t1) with full table scan with no condition | | Execution plan for subquery | | ---------------------------- | | Joining table t2 (t2) with full index trace on t2.b = 'xxx' | +-----------------------------------------------------------------+ 1 tuples ok ( 0.000 s ) |
As you can see, table t1 is traced with a full table scan, since there is no appropriate constraint to use the available index object b1. If we add another predicate to the parent select, the table is traced via index
CGCLT > plan select a from t1 where a = 1 and a in ( select a from t2 where b = 'xxx'); +-----------------------------------------------------------------+ | PLAN | | DESCRIPTION | +-----------------------------------------------------------------+ | Execution plan | | --------------- | | Joining table t1 (t1) with full index trace on t1.a = 1 | | Execution plan for subquery | | ---------------------------- | | Joining table t2 (t2) with full index trace on t2.b = 'xxx' | +-----------------------------------------------------------------+ 1 tuples ok ( 0.000 s ) |
The above simple examples illustrate how queries can be examined in general. This can help to achieve an appropriate database performance also for more sophisted and complex queries.