Lemke IT Homepage

Special January 2017 - Exploring the join execution plan
Category : User
Level : Advanced

Back to Overview

To achieve an appropriate performance for a user application, database queries should be executed in a efficient way. This requires the use of index objects wherever possible. The join optimizer of cego is rule based, which makes it very easy to analyse the execution plan of any database select. Today we would like to take a closer look at how queries can be analyzed.

First, we create a table t1 and fill it with a few records

CGCLT > create table t1 ( a int, b string(30));
Table t1 created
ok ( 0.031 s )
CGCLT > insert into t1 values ( 1, 'alpha');
1 tuples inserted
ok ( 0.000 s )
CGCLT > insert into t1 values ( 2, 'beta');
1 tuples inserted
ok ( 0.000 s )
CGCLT > insert into t1 values ( 3, 'gamma');
1 tuples inserted
ok ( 0.000 s )

Now, we can formulate a simple query on this table

CGCLT > select a from t1 where a = 2;
+-----------+
|        t1 |
|         a |
+-----------+
|         2 |
+-----------+
1 tuples
ok ( 0.062 s )

To check out the execution plan for this query, just add the plan keyword at the beginning

CGCLT > plan select a from t1 where a = 2;
+-----------------------------------------------------------------------+
| PLAN                                                                  |
| DESCRIPTION                                                           |
+-----------------------------------------------------------------------+
| Execution plan                                                        |
| ---------------                                                       |
|   Joining table t1 (t1) with full table scan using condition t1.a = 2 |
+-----------------------------------------------------------------------+
1 tuples
ok ( 0.062 s )

Since there still does not exist any index object for this table, a full table scan is performed for this query. Now we create an index object for table t1 on attribute a

CGCLT > create btree b1 on t1(a);
Btree b1 created
ok ( 0.000 s )

When we look at the execution plan, we realize that the index is used

CGCLT > plan select a from t1 where a = 2;
+-----------------------------------------------------------+
| PLAN                                                      |
| DESCRIPTION                                               |
+-----------------------------------------------------------+
| Execution plan                                            |
| ---------------                                           |
|   Joining table t1 (t1) with full index trace on t1.a = 2 |
+-----------------------------------------------------------+
1 tuples
ok ( 0.000 s )

If we expand the condition of our simple query, this also has an effect to the execution plan.

CGCLT > plan select a from t1 where a = 2 and b = 'beta';
+-----------------------------------------------------------------------------------------+
| PLAN                                                                                    |
| DESCRIPTION                                                                             |
+-----------------------------------------------------------------------------------------+
| Execution plan                                                                          |
| ---------------                                                                         |
|   Joining table t1 (t1) with index support on t1.b = 'beta' and t1.a = 2 using index b1 |
+-----------------------------------------------------------------------------------------+
1 tuples
ok ( 0.015 s )

Now the index b1 is just used in a supported way, this means, that the index is used as a primary filter, but the condition must be evaluated additionally to get the final result list. This kind of execution has some performance impact, and in our example it can be avoided very easliy. We just have to provide an appropriate index object with a key for both attributes.

CGCLT > drop btree b1;
Btree b1 dropped
ok ( 0.000 s )
CGCLT > create btree b1 on t1(a,b);
Btree b1 created
ok ( 0.000 s )

If we check the execution plan, we get a full index trace again.

CGCLT > plan select a from t1 where a = 2 and b = 'beta';
+-----------------------------------------------------------------------------+
| PLAN                                                                        |
| DESCRIPTION                                                                 |
+-----------------------------------------------------------------------------+
| Execution plan                                                              |
| ---------------                                                             |
|   Joining table t1 (t1) with full index trace on t1.b = 'beta' and t1.a = 2 |
+-----------------------------------------------------------------------------+
1 tuples
ok ( 0.000 s )

To analyse more advanced queries, we create a second table t2 and insert some records.

CGCLT > create table t2 ( a int, b string(30))
Table t2 created
ok ( 0.000 s )
CGCLT > insert into t2 values ( 2, 'beta');
1 tuples inserted
ok ( 0.000 s )
CGCLT > insert into t2 values ( 3, 'xxx');
1 tuples inserted
ok ( 0.000 s )

Now we write a simple join on both tables

CGCLT > select ta.a from t1 ta, t2 tb where ta.a = 2 and ta.a = tb.a;
+-----------+
|        ta |
|         a |
+-----------+
|         2 |
+-----------+
1 tuples
ok ( 0.000 s )

The join can be analyzed be printing the execution for the query. For each joined table it is indicated, how the join is performed

CGCLT > plan select ta.a from t1 ta, t2 tb where ta.a = 2 and ta.a = tb.a;
+--------------------------------------------------------------------------------+
| PLAN                                                                           |
| DESCRIPTION                                                                    |
+--------------------------------------------------------------------------------+
| Execution plan                                                                 |
| ---------------                                                                |
|   Joining table t1 (ta) with full index trace on ta.a = 2                      |
|   Joining table t2 (tb) with full table scan using condition tb.a = ta.a(null) |
+--------------------------------------------------------------------------------+
1 tuples
ok ( 0.000 s )

Since on table t2 still a full table scan is performed, we provide another index for t2(a)

CGCLT > create btree b2 on t2(a);
create btree b2 on t2(a);
Btree b2 created
ok ( 0.000 s )

Now the query can be performed with full index usage. We check another time the execution for this.

CGCLT > plan select ta.a from t1 ta, t2 tb where ta.a = 2 and ta.a = tb.a;
+--------------------------------------------------------------------+
| PLAN                                                               |
| DESCRIPTION                                                        |
+--------------------------------------------------------------------+
| Execution plan                                                     |
| ---------------                                                    |
|   Joining table t1 (ta) with full index trace on ta.a = 2          |
|   Joining table t2 (tb) with full index trace on tb.a = ta.a(null) |
+--------------------------------------------------------------------+
1 tuples
ok ( 0.000 s )

As a preparation for the next query, we create a third btree object on t2(b)

CGCLT > create btree b3 on t2(b);
create btree b3 on t2(b);
Btree b3 created
ok ( 0.000 s )

Now we check out the plan for a query with a nested subselect. The execution also indicates, how the subselect is executed.

CGCLT > plan select a from t1 where a in ( select a from t2 where b = 'xxx');
+-----------------------------------------------------------------+
| PLAN                                                            |
| DESCRIPTION                                                     |
+-----------------------------------------------------------------+
| Execution plan                                                  |
| ---------------                                                 |
|   Joining table t1 (t1) with full table scan with no condition  |
|   Execution plan for subquery                                   |
|   ----------------------------                                  |
|     Joining table t2 (t2) with full index trace on t2.b = 'xxx' |
+-----------------------------------------------------------------+
1 tuples
ok ( 0.000 s )

As you can see, table t1 is traced with a full table scan, since there is no appropriate constraint to use the available index object b1. If we add another predicate to the parent select, the table is traced via index

CGCLT > plan select a from t1 where a = 1 and a in ( select a from t2 where b = 'xxx');
+-----------------------------------------------------------------+
| PLAN                                                            |
| DESCRIPTION                                                     |
+-----------------------------------------------------------------+
| Execution plan                                                  |
| ---------------                                                 |
|   Joining table t1 (t1) with full index trace on t1.a = 1       |
|   Execution plan for subquery                                   |
|   ----------------------------                                  |
|     Joining table t2 (t2) with full index trace on t2.b = 'xxx' |
+-----------------------------------------------------------------+
1 tuples
ok ( 0.000 s )

The above simple examples illustrate how queries can be examined in general. This can help to achieve an appropriate database performance also for more sophisted and complex queries.