Getting under the hood with DB2 for TSM Admins

Posted by on Dec 21, 2011 in Databases, DB2, Infrastructure, Social, Storage, TSM | No Comments

Over at the TSMADmin blog, Chad Small recently posted “DB2 Doesn’t Make a Difference.”  Chad was concerned that he was not seeing a performance boost in his database queries after upgrading from TSM 5.5 to 6.x.

As you know, one of the biggest changes in TSM 6.x is that it is now backed by DB2.  Previous releases of TSM used a proprietary database that worked well, but had certain limitations, especially with scaling.

In any case, Chad was using the following query to find old Oracle database backups that had not been properly cleaned:

select \
 cast(sum(b.file_size/1073741824) as decimal(18,2)) AS GB_SIZE \
from \
 backups a, \
 contents b \
where \
 a.node_name in ('DEV01_ORA','DEV02_ORA','DEV03_ORA','PRD01_ORA','PROD02_ORA') and \
 a.backup_date < '2011-11-01 00:00:00' and \
 a.object_id=b.object_id

Chad’s problem was that the query was slow.  Really, really slow.  It sounded like an interesting problem (I know, I’m weird), and since one of my many hats is doing Oracle support, I have a seen a fair amount of poorly-performing database queries.  And usually, the culprit is the query itself, and not the database.  I’m not anyone’s idea of a SQL tuning expert, but it seemed like a good opportunity to learn.

I confirmed what Chad was seeing by issuing a select count(*) against backups and contents, and the contents query in particular took very long to execute.  Additionally, it returned tens of millions of rows, so there was a lot of data to process.

Since the CLI for DB2, db2,  is included with TSM, I used it to query the database directly.  Once connected, the first order of business was to find out what the query was hitting: what are backups and contents?

db2 => set schema tsmdb1
DB20000I  The SQL command completed successfully.
db2 => list tables for schema tsmdb1

Table/View                      Schema          Type  Creation time
------------------------------- --------------- ----- --------------------------
.
.
.
BACKUPS                         TSMDB1          V     2011-09-14-19.05.58.424225
.
.
.
CONTENTS                        TSMDB1          V     2011-09-14-19.06.04.636090
.
.
.

Well.  We can see from the ‘V’ in the Type column that the two objects are views.  A view is essentially a SQL code block that you can query and update just like you would a table.  This is great for abstracting a complicated database, but whenever you execute a statement against it, it gets executed itself.  So there may be a whole lot more processing involved than you expect or need.

In order to examine the SQL code for the views, DB2 provides the db2look command-line utility.  Here are the commands to dump the code for the two views.  Note that the output is being placed into a text file.

db2look -d TSMDB1 -e -v backups -o backups.sql

 and

db2look -d TSMDB1 -e -v contents -o contents.sql

After looking at the source code, I found that most of what Chad was looking for seemed to be in the backup_objects table.  Here’s my rewrite of Chad’s query:

select \
  sum(cast(bk.bfsize/1024/1024 as decimal(14,4))) AS MB \
from \
  backup_objects bk \
where \
  bk.nodeid in (select nodeid from nodesview where node_name in ('DEV01_ORA','DEV02_ORA','DEV03_ORA','PRD01_ORA','PROD02_ORA')) and \
  bk.insdate<'2011-11-01 00:00:00'

Chad pointed out that you can’t query the nodes table from the dsmadmc client (why??), however you are able to query against the nodesview view.  You’ll need to change the column from nodename to node_name when you do.

With the new query, Chad is able to get results within 10 minutes.  Still not blazing fast, but it sounds like the query is usable now.

References:

DB2 Doesn’t Make A Difference

db2look – DB2 statistics and DDL extraction tool command