Data Storage

DBA for a Day

The only way to really understand the "insides" of a database is to act as the database administrator (DBA). For this lab, we're going to use an open source database called MySQL. To get started, download the MySQL software:

http://dev.mysql.com/get/Downloads/MySQL-Cluster-7.0/mysql-cluster-gpl-7.0.7-win32.msi/from/http://mirror.services.wisc.edu/mysql/

If that link doesn't work, choose one of the "mirrors" available here:

http://dev.mysql.com/downloads/mirror.php?id=371914#mirrors

Once downloaded, click on the downloaded file and click through to accept all the defaults.

When you get to the "Wizard Complete" page, be sure to select "Configure the MySQL Server Now" and then click "Finish"

Instance Configuration Wizard

Select "Detailed Configuration"

You will be presented with three choices: "developer machine", "server machine", and "dedicated MySQL server machine". Choose "developer machine". Be sure to read the descriptions carefully. Be prepared to discussed the main difference between the three during the lab.

Next comes three choices for "database usage": "multifunctional", "transactional only", "non-transactional only". Choose "multifunctional" Again, be sure to read the descriptions carefully. Be prepared to discussed the main difference between the three during the lab.

Next comes "tablespace settings": Use \MySQL Datafiles\

"Server settings" offers three choices: "DSS/OLAP", "OLTP", "Manual". Select "Manual" with 5 concurrent connections Do you remember what these acronyms mean? Why would they matter as you set up the database? Be prepared to discuss during the lab.

Do not enable TCP/IP and leave "Strict Mode" set (this has to do with SQL standards)

Choose the character Set (UTF8 and MultiLingualism)

Leave the box checked to start MySQL as a windows service

Choose a new root password (do you know what "root" is?) The current root password is not set (blank, no value).

Click "Execute"

Congratulations, you've installed a database management system!

Command Line Database Administration

Click on the Windows "Start" bar and find your newly installed MySQL database. Run the "MySQL Command Line Client". You'll be asked for a password. Enter the root password you established during the configuration process.

MySQL has a nice interface available for DBAs, but we're going to be "hard core" and stick with the command line interface. This way you'll feel like a real DBA. Start by exploring the interface. Type "help" and hit enter. You'll see a list of commands. Type "help contents" and you'll see a list of categories of commands (e.g. grouping of commands). Since we are being DBAs, let's see what they can do by typing "help administration". You'll see a long list of commands available to DBAs.

To complete this lab, you'll be turning a text file that is created based on everything you do in this lab. To create the text file, we use one of the commands in the "help" list.

tee c:\yourname

This will create a file in the main directory for the C drive with the name you provide. Alternatively, you can create the file on the G drive or on a USB drive. Once you've issued the "tee" command, type in "help" again. This will "spool" the results of the "help" command (exactly what you see on your screen) to the newly created file. To make sure the "tee" is working properly, go find the file on the C drive (or wherever you put it) and open it with Notepad. Be sure to close Notepad before continuing to the next part of the lab.

Back to the lab! We've installed the Database Management System (DBMS), but we haven't actually created a database. Let's first see what databases are provided with the MySQL installation. To do this, type:

show databases;

Note: the ; is required. It tells MySQL you are done entering your command.

You'll see that you have 3 databases in the MySQL DBMS. The most important of these is called "information_schema". This is a special database that provides information about the whole DBMS. It's called a "data dictionary", or "data about data", or "metadata". Let's connect to the data dictionary for our MySQL DBMS:

connect information_schema;

And let's see what tables are in this database:

show tables;

All 28 of these tables are "tables about tables". For example, the table called "COLUMNS" has information about all of the columns in all of the tables in the database. If we wanted to learn about these columns, we issue SQL. Try this as an example:

select table_name, column_name from columns;

Note: you can type "describe name-of-table" to determine the columns in any table.

Now we are going to create our own database and connect to it:

create database world;
connect world;

Then we'll create 3 tables using a file with the necessary SQL to do so. The file is on here. Download it to your C drive so you can issue this command:

source c:\world.sql

Hint: you can use this same technique for other parts of this lab. You can open 'notepad' and copy/paste the commands from the wiki into your file. Save the file as SomeName.sql on the C drive. Then you can "source" the sql file to run it. This makes things much easier when you are trying to make changes to SQL commands. You don't have to retype the whole thing if you use this method.

Let's see what we have in this new database:

show tables; for a list of table names

desc city; to see the columns in the city table

desc country; to see the columns in the country table

desc countrylanguage; to see the columns in the countrylanguage table

Now write your own SQL to count the number of rows in city, country, and countrylanguage.

Let's make this more interesting. Say we want to partition our data. We can determine where the partitions physically reside to optimize the performance of our database. MySQL clusters will do some of this for us but we have to make some decisions up front. How many partitions should we use? What data ranges should go in each partition?

MySQL requires that partitions be based on Primary Key fields. For city, this is the field called "id". It's a number that increments with each new city added. We'll first determine the minimum and maximum id currently in the city table:

select min(id) from city;

select max(id) from city;

How do you want to partition the table? You want to anticipate growth but you don't want a lot of wasted/empty partitions. You also want the partitions to be fairly even in size. In this case, the data won't change much but in other cases you could have a table with lots of deletions and additions of data. Go ahead and make up your own partitioning scheme and use the following to change the city table to a partitioned table:

ALTER TABLE city
PARTITION BY RANGE (id)
(
PARTITION p1 VALUES LESS THAN (you-put-a-number-here),
PARTITION xx VALUES LESS THAN (put-another-number-here),
make as many partitions as you want
…..
PARTITION xx VALUES LESS THAN maxvalue this should be your last partition
);

Now let's look at what we did. Connect to the information_schema to explore the newly created partitions:

connect information_schema;

desc partitions;

select table_name, partition_name, table_rows from partitions where table_name = 'city';

Completing the Lab

Exit out of MySQL by typing "exit" at the command line. Submit your file from the MySQL session (created by "tee" earlier in the lab) via Blackboard.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License