Query Optimization Hands-On

Back to Sam's Club

We're going to continue to use the Sam's Club Database. Log back into the Sam's Club SQL Assistant.

http://enterprise.waltoncollege.uark.edu/sqlassistantweb12.0/

Click on "Walton College Teradata"
Enter the Username and Password from the spreadsheet
For "default database" use "UA_SAMSCLUB"
Click "ok"

Last week we used the "execute" button to run queries. This week we'll use the button next to "execute". This button says "explain" when you mouse over it. To "explain" a query means to provide detailed information about the steps that are taken to execute a query. (Note: the query doesn't actually execute, the system just tells you how it would execute the query.)

We'll start with a basic example.

select * from store_visits;

Copy this SELECT into the query box and click "explain". Look at the results. You'll see that steps 1, 2, and 4 are basically "housekeeping". The first two steps have to do with locking which outside of the scope of this course. The last one has to do with providing the result of the query back to the person running the query. The only real "action" takes place in step 3.

Open a Word document and copy the SQL statement into it. Now copy the text for step 3 into the file. Finally, in English, describe what you believe is happening in this step. We'll discuss the answer as a class when everyone is ready.

We'll build on this by adding a simple WHERE condition:

select * from store_visits where visit_nbr = 100;

Notice that the primary key index produces a very quick result. (If you haven't already, familiarize yourself with the indices on the tables. Click on UA_SAMSCLUB and then "tables". For every table, you can click on the table name and see the "columns" that make up the table and the "indices". Be sure to explore these details before continuing.

Let's modify the query slightly:

select * from store_visits where visit_nbr < 500;

What changed? Was the index used or not? Copy this query into your document and describe in English the steps that are being taken to execute this query.

Let's make add a sorting condition to the end of the query:

select * from store_visits where visit_nbr < 500 order by store_nbr;

Once again, step 3 is the "meat" of this query plan. Cut/paste this query and the "meat" of the query into your document. Describe in English the steps that are being taken to execute this query and to sort the results.

We'll build on the work we did last week. During the lab, we wrote:

select count(*) from store_visits;

Copy this SELECT into the query box and click "explain". Look at the results. You'll see that steps 1, 2, 4, and 5 are basically "housekeeping". The first two steps have to do with locking which outside of the scope of this course. The last two have to do with providing the result of the query back to the person running the query. The only real "action" takes place in step 3. Open a Word document and copy the text for step 3 into the file. In English, describe what you believe is happening in this step. We'll discuss the answer as a class when everyone is ready.

Next let's EXPLAIN this query:

select distinct store_nbr from store_visits;

We are asking for the unique values for store_nbr. From our work last week, we know there are over 1 million rows in the Store_Visits table. We also know that there are only 18 unique store numbers. Before you look at the EXPLAIN results, think about this query. How do you think the database will determine the answer to the query? Now look at the results. Once again, step 3 is the only "non housekeeping" part of the execution plan. Cut/paste it into your document and write out (in English) what you believe is happening.

Next up:

select distinct store_nbr from store_visits where store_nbr between 1 and 20;

This is our first WHERE clause. The database system now has to deal with criteria (in this case, a store number between 1 and 20). We know how the previous query was resolved. How do you think this query will be resolved? EXPLAIN the query and cut/paste it into your document and write out (in English) what you believe is happening.

Let's look at part of our query from last week:

select count(*)
from member_index, store_visits
where member_type = 'V' and
store_visits.membership_nbr = member_index.membership_nbr;

EXPLAIN the query and cut/paste it into your document and write out (in English) what you believe is happening.

The database is performing a sequence of steps based on knowledge of the data in the database. (For example, what is the distribution of "V" members? Is that a small percentage of all members or a large percentage? What is the size of the Store_Visits table as compared to the Member_Index table?)

As you look at the distribution of data below, you can see that member type "I" is very selective. (In fact, only one member has this member type!)

MEMBER_TYPE Count(MEMBER_TYPE)
E 2232
W 1416167
A 93089
1 301
I 1
H 78
Z 1139
Y 5035
D 2994
3 155
G 21260
V 3888773
F 16
X 237135

Change the query to the following and "EXPLAIN" it.

select count(*)
from member_index, store_visits
where member_type = 'I' and
store_visits.membership_nbr = member_index.membership_nbr;

Did the execution plan change? If you were "smarter than a database management system", what steps would you take to ensure this query runs as fast as possible — given that you know that only 1 member has "I" for member_type? In your document, write out the steps and your rationale for the execution plan.

We'll keep working with our query from last week. Here's the next version of the query:

select sum(total_visit_amt)
from member_index, store_visits
where store_visits.membership_nbr = member_index.membership_nbr and
member_type = 'V' and
store_nbr = 20 and
transaction_date BETWEEN '20000101' and '20000110';

Once again, EXPLAIN the SQL, put the non-housekeeping portion of the results in your document, and write out the execution steps in English.

Here is our complete query from last week:

select sum(total_visit_amt)
from member_index, store_visits, item_desc, item_scan
where
store_visits.membership_nbr = member_index.membership_nbr and
store_visits.visit_nbr = item_scan.visit_nbr and
item_scan.item_nbr = item_desc.item_nbr
and
member_type = 'V' and
store_visits.store_nbr = 20 and
store_visits.transaction_date BETWEEN '20000101' and '20000110' and
item_desc.category_nbr = 5;

EXPLAIN and then try to explain it.

To complete this lab, submit your document with SQL statements, query plans, and explanations.

Complete this lab by submitting your execution plan explanations for each of the queries discussed.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License