db2top Agent Screen (Part 2)

Agent running in a set of gears

In part 2 of this series on the db2top Agent screen, you will see just how much more detailed the Agent screen becomes on DB2 systems that have the Data Partitioning Feature (DPF). Not only can you see the individual subsections of an executing statement, but also how they are processed at each partition. You also see the information rolled up to the subsection level, the statement level, and all the way up to the application itself. The subsection level information helps you to diagnose problems of skew between partitions, while the higher level aggregates help you identify problems of low sort heap allocation at the application level or heavy sort heap usage at the individual statement level.

When using the Data Partitioning Feature, the Agent screen has a different appearance than its non-DPF form. By default, for screens between 80 and 140 columns wide, the Agent screen looks like the following:

db2top Agent screen in DPF when 80 columns wide
db2top Agent screen in DPF when 80 columns wide

At the top left of the screen below the header we see the application ID followed by the state of the application. Beneath that is a table of subsection information. If we hold the Shift key and press the ‘X’ key to enter extended mode, a new column named “Node Nbr” is added after “Sub Sec”, the columns “Cpu Skew” and “Row Skew” are removed, and each subsection’s data is split across the nodes it runs on instead of being aggregated as in the non-extended mode. So, in a three partition system, we see three times as many rows in the table in extended mode for subsections numbered above zero, as shown below:

db2top Agent screen in DPF (extended mode)
db2top Agent screen in DPF (extended mode)

The table consists of the following columns:

Column Name Definition
Sub Sec A number identifying the subsection (ss_number).
Node Nbr1 Node where the subsection was executed (ss_node_number).
Cpu (Sys+Usr)
Non-Extended mode:
The total CPU time in seconds used by the currently executing statement subsection summed across all partitions.
sum(ss_sys_cpu_time + ss_usr_cpu_time) where sum() is the sum across all partitions.
Extended mode:
The total CPU time in seconds used by the currently executing statement subsection on the partition (ss_sys_cpu_time + ss_usr_cpu_time).
Cpu Skew2 A measure of asymmetry in the distribution of CPU time among partitions when executing the subsection.

1 – ((sum(ss_sys_cpu_time + ss_usr_cpu_time)/node_count) / max(ss_sys_cpu_time + ss_usr_cpu_time) where sum() is the sum across all partitions, max() is the maximum across all partitions, and node_count is the number of nodes executing subsections.

Row Skew2 A measure of asymmetry in the distribution of the reading and writing of data among partitions when executing the subsection.

1 – ((sum(rows_read + rows_written)/node_count) / max(rows_read + rows_written) where sum() is the sum across all partitions, max() is the maximum across all partitions, and node_count is the number of partitions executing subsections.

Rows Read
Non-Extended mode:
The number of rows read for the subsection across all partitions (rows_read).
Extended mode:
The number of rows read for the subsection on the partition (rows_read).
Rows Written
Non-Extended mode:
The number of rows changed (inserted, deleted or updated) for the subsection across all partitions (rows_written).
Extended mode:
The number of rows changed (inserted, deleted or updated) for the subsection on the partition (rows_written).
TqRows Read
Non-Extended mode:
Total number of rows read from table queues across all partitions (tq_rows_read).
Extended mode:
Total number of rows read from table queues on the partition (tq_rows_read).
TqRows Written
Non-Extended mode:
Total number of rows written to table queues across all partitions (tq_rows_written).
Extended mode:
Total number of rows written to table queues on the partition (tq_rows_written).
Tq Spills
Non-Extended mode:
Total number of table queue buffers overflowed to a temporary table across all partitions (tq_tot_send_spills).
Extended mode:
Total number of table queue buffers overflowed to a temporary table on the partition (tq_tot_send_spills).
Memory
Non-Extended mode:
The sum of the current sizes of the memory pools for the subsection across all partitions.
sum(pool_cur_size) where sum() means the sum across all partitions.
Extended mode:
The current size of the memory pool for the subsection on the partition (pool_cur_size).
Exec Time
Non-Extended mode:
The longest amount of time in seconds that it took to execute the subsection out of all the partitions where the subsection was executed.
max(ss_exec_time) where max() chooses the highest number out of all the partitions
Extended mode:
The time in seconds that it took the subsection to execute on the partition (ss_exec_time).
# of DBP Number of database partitions executing the subsection.
Ag.
Non-Extended mode:
Number of agents working on the subsection across all partitions.
Extended mode:
Number of agents working on the subsection on the partition.
SubSection Status
Non-Extended mode:
A set of status codes and, adjacent to each, the number of subsections across all partitions that have that status (ss_status). If there are no subsections with a particular status, that status value is not shown.
Extended mode:
A set of status codes and, adjacent to each, the number of subsections at the partition that have that status (ss_status). If there are no subsections with a particular status, that status value is not shown.

The status codes are:

  • E – executing (SQLM_SSEXEC)
  • C – execution is completed (SQLM_SSCOMPLETED)
  • S – waiting to receive data on a table queue (SQLM_SSTQ_WAIT_TO_RCV)
  • R – waiting to send data on a table queue (SQLM_SSTQ_WAIT_TO_SEND)
Waiting TQueue(s)
Non-Extended mode:
Prints the IDs of any table queues on any partition that are waiting to send or receive (tq_id_waiting_on).
Extended mode:
Prints the IDs of any table queues on the partition that are waiting to send or receive (tq_id_waiting_on).

1 Shown in extended mode only

2 Shown in regular mode only

The table does not scroll, so the 80 column wide screen hides several columns. When the width of the screen exceeds 140 columns, the screen changes dramatically:

db2top Agent screen in DPF when 141 columns wide
db2top Agent screen in DPF when 141 columns wide

However, the screen width needs to be increased to at least 143 columns to include the final table column “Waiting TQueue(s)”. The aggregates shown in the top half of the screen are split into columns based on how wide the screen is. If you hold the Shift key and press the ‘M’ key, the aggregates are hidden and only the subsection table is shown. In the final post in this series on the Agent screen, we will examine each of these aggregates.