Fotia Ltd

Fotia Ltd

Thursday, 29 June 2006

Dynamic Management Views Part II

posted by Stefan Delmarco

In this instalment we'll be looking at a single DMV. This DMV provides such a wealth of information that it will take quite a few passes to cover it in depth. If anything, this DMV will make detecting and troubleshooting poorly performing queries something that can be easily done rather than requiring specialised, and potentially performance damaging tools like Profiler. The problem with Profiler is that it is great for debugging and dissecting SQL Server requests in a controlled environment (such as your development or test environment). It isn't good at running perpetually in a production environment, logging every single query that is executed, with its query plan and runtime statistics. Firstly, the amount of data it would generate for even a modest enterprise-level application would be unmanageable. Secondly, depending on what events are being captured and their frequency, the performance of SQL Server could be adversely affected. I'm not saying that SQL Profiler shouldn't be used in a production environment to assist with investigating and diagnosing a specific issue. Rather, it can't be used by the DBA's to monitor the overall performance of SQL Server, so that they can look for potentially troublesome queries, and be proactive in the detection of performance issues. This task normally falls on the middle-tier / application code to log execution times or update custom performance counters. Even then, it is normally support engineers that look at these logs rather than the DBAs.

In SQL Server 2005 we now have the sys.dm_exec_query_stats DMV. This view contains a row for each query plan in SQL Server's query plan cache. SQL Server is built on the principle that you prepare a query plan once and execute it many times. Caching of the query plan is central to the architecture of SQL Server. Exposing this core part of the internal architecture really means that there are no more secrets!

So, what can we see in this DMV? Firstly, we can obtain a handle to the query plan's TSQL text. In SQL Server 2000 we had to use DBCC INPUTBUFFER or the ::fn_get_sql function (which has now been marked as to-be-deprecated) to be able to see what TSQL a session was executing.

Even then DBCC INPUTBUFFER just showed the last statement executed and ::fn_get_sql only returned the TSQL as it was executing. Neither provided any kind of insight into the actual content of the query plan cache. The only way I knew of was to take a crash dump of the SQL Server process and dissect the dump using Microsoft PSS tools that understood SQL Server's memory layout.

To have a look at every TSQL statement in the query plan cache, execute the following query over the sys.dm_exec_query_stats DMV:

select db_name(sql_text.dbid) as [Database],
    case db_id()
        when sql_text.dbid then object_name(sql_text.objectid)
        else cast(sql_text.objectid as sysname) end as [Object],
    sql_text.text as [TSQL]
from sys.dm_exec_query_stats
    cross apply sys.dm_exec_sql_text(sql_handle) as sql_text;

If you don't see any rows returned, execute this query twice and you'll see the query above appear in the cache! If you see too many rows use DBCC FREEPROCCACHE to empty out the query plan cache.

There are a couple of things in this query you might not have seen before. For example, the APPLY operator is new. We'll look at it in detail in another article but for now just think of it as an inner join from a query to a table-valued function. The table-valued function will be invoked for each row in the query. If the table valued function returns multiple rows a query row, the query row will be returned multiple times (as you'd expect from an inner join).

The CROSS APPLY is taking each sql_handle in the sys.dm_exec_query_stats DMV and uses the sys.dm_exec_sql_text function to return the database, object id and text of the query plan. The case logic is there just to turn the objectid into its object name if the current database is the same as object's database.

OK, so this is all very interesting, but how does this help me see what the performance of these queries is like? Well, for each query plan, SQL Server will maintain an aggregate of:

  • The number of times query plan has been recompiled (find all your SPs containing temp tables here!)
  • A handle to the query plan (we'll look at this in depth later).
  • The last time the query was executed.
  • The number of times it has been executed (how easy is it to spot the most used queries now?!).
  • The total amount of the following resources consumed by all invocations of the query plan as well as the least and greatest amount of CPU consumed by a single invocation of the query plan:
    • CPU
    • Physical Reads
    • Logical Writes
    • Logical Reads
    • CLR Time (if you use CLR in SQL you'd be able to easily spot a performance problem with it!)
    • Elapsed Time

Wow! Christmas definitely came early this year! This mind boggles at the variety of queries that could be constructed to slice and dice this aggregated data. When digging into the TSQL note that the query plan may only be for a fragment of the TSQL text. You need to use the statement_start_offset and statement_end_offset values to find the chunk. Use this helper function to extract the TSQL fragment:

create function dbo.GetTSQLFragment (
    @tsql nvarchar(max),
    @start_offset int,
    @end_offset int) returns nvarchar(max)
with execute as caller
as begin
    declare @len int;
    set @len = (case
        when @end_offset = -1 then len(@tsql) * 2
        else @end_offset end - @start_offset) / 2;

    declare @TSQLFragment nvarchar(max);
    set @TSQLFragment = substring(@tsql, @start_offset / 2 + 1, @len);
    return @TSQLFragment;
end;

The query now becomes:

select db_name(sql_text.dbid) as [Database],
    case db_id()
        when sql_text.dbid then object_name(sql_text.objectid)
        else cast(sql_text.objectid as sysname) end as [Object],
    sql_text.text as [AllTSQL],
    dbo.GetTSQLFragment(
        sql_text.text,
        statement_start_offset,
        statement_end_offset) as [FragmentTSQL]
from sys.dm_exec_query_stats stats
    cross apply sys.dm_exec_sql_text(sql_handle) as sql_text;

To see the fragments in action, run the following query and then have another look at the query plan cache using the above query:

select * from Production.Culture;
select * from Production.Product;
select * from Production.Location;

You'll see the above query repeated three times with each select statement in its own TSQL fragment and will therefore have its own query plan. However, note the presence of the plan_handle column. This can be used to group query plans together as they were submitted as part of the same batch. Include plan_handle in the SELECT from sys.dm_exec_query_stats, execute variations of the triple-select batch statement above (just reordering is sufficient) and observe how the query plans for the same SELECT statements multiply. For example, run the following batches:

select * from Production.Location;
select * from Production.Culture;
select * from Production.Product;
go

select * from Production.Culture;
select * from Production.Location;
select * from Production.Product;
go

select * from Production.Location;
select * from Production.Product;
select * from Production.Culture;
go

It's not hard too see why ad-hoc TSQL is bad when you see what it does to your query plan cache. For stored procedures, no matter how large or complex the stored procedure is, there will only ever be one query plan for it. However, unfortunately there is some fine print. Query plans that use different SET (environment) options cannot be shared. More on this later.

Execute the following batch and have a look at the query plan cache. There is only one query plan for the stored procedure, irrespective of the ordering and parameters:

exec dbo.uspGetBillOfMaterials 800, '2003/01/01';
exec dbo.uspGetBillOfMaterials 801, '2005/01/01';
exec dbo.uspGetBillOfMaterials 800, '2004/02/08';
exec dbo.uspGetBillOfMaterials 800, '2001/02/08';
go

exec dbo.uspGetBillOfMaterials 799, '2004/02/08';
exec dbo.uspGetBillOfMaterials 800, '2003/01/01';
go

exec dbo.uspGetBillOfMaterials 780, '2004/02/08';
exec dbo.uspGetBillOfMaterials 721, '2001/02/08';
go

Now for the fine print. There are a number of factors that affect the optimiser's ability to re-use query plans. It is important to be aware of the factors as query plan re-use == performance (compiling query plans can be expensive). One of the most prominent factors is the SET (environment) options of the connection executing the query. For a query plan to be re-used, the environment options of the cached query plan needs to be the same as the query being executed. Query optimisation and query plan re-use is an enormous topic that we'll cover over the course of the 'Discover SQL Server 2005' articles. However, for now, note that executing the following query results in two different query plans being cached for the same stored procedure. Execute the query then have a look in the cache using the above DMV query:

set ansi_warnings on
exec
dbo.uspGetBillOfMaterials 721, '2001/02/08';
go

set ansi_warnings off
exec
dbo.uspGetBillOfMaterials 721, '2001/02/08';
go

More often than not environment options are a non-issue as all connections to an application's database will be instantiated by some common data access layer code, so all application connections inherently use the same environment options.

Lastly, to illustrate the power of this DMV, look at this. Want to know which are the 50 most CPU intensive queries in your database? This is all that you need:

select top 50 db_name(dbid) as [Database],
    case db_id()
        when sql_text.dbid then object_name(sql_text.objectid)
        else cast(sql_text.objectid as sysname) end as [Object],
    sql_text.text as [AllTSQL],
    dbo.GetTSQLFragment(
        sql_text.text,
        statement_start_offset,
        statement_end_offset) as [FragmentTSQL],
    plan_handle,
    total_worker_time
from sys.dm_exec_query_stats
    cross apply sys.dm_exec_sql_text(sql_handle) as sql_text
order by total_worker_time desc;

We'll be visiting many more DMVs and sample queries off the DMVs in future articles.

Labels: ,

Tuesday, 13 June 2006

Dynamic Management Views (DMVs) Part I

posted by Stefan Delmarco

In SQL Server 2000 we had to dig into the system tables to take a peek under the covers. Often many of these tables were undocumented and we had to rely on reverse engineering, sp_helptext, and the debugger to figure out what was going on. In addition, Microsoft was at pains to tell us that these tables are undocumented and as such they should be treated as 'internal-only'. In SQL Server 2005, Microsoft has really listened to their customers. We now have a set of virtual tables known as Dynamic Management Views (DMVs).

A lot needs to be read into the name. Firstly, they've called them dynamic not only because they change dynamically with the internal state of SQL Server, but also because Microsoft reserves the right to change the DMV's schema in future versions (leopards and spots comes to mind...). Management because they're used to inspect the state of SQL Server and diagnose issues. Views because they're not actual tables (actually, some of them are even functions). They're just a read-only table-like representation of SQL Server in-memory structures (a bit like sysprocesses in SQL Server 2000). The difference between the old system tables and DMVs is that DMVs are documented and fully supported by Microsoft. They reside in the sys schema (so all the DMVs must be prefixed with sys. - more on schemas vs. owner later) and have a dm_ prefix.

Just listing the DMVs with a one line'r would make this a very boring article. I'd rather show you which ones I think are cool and will be useful to use. However, there are so many that it will take some time for all the buried treasure to be found. Broadly speaking the DMVs are split along functional lines. There are DMVs for CLR-integration, I/O, replication, full-text, and many others. Each of these categories has its own prefix. For example, CLR related DMVs all begin with sys.dm_clr_*. To be able to view DMVs does not require sysadmin privileges. The new VIEW SERVER STATE permission is all the is required for most DMVs (i.e. can look but cannot touch). I know some people that will be very happy to hear this!

Whenever I first get my hands on a SQL Server, the first think I like to do is have a look at what processes are running on the SQL Server. In SQL Server 2000 this would be sysprocesses (albeit through sp_who/2). This view still exists in SQL Server 2005 for backwards compatibility.

There isn't just a single replacement DMV for sysprocesses. There are at least 3! sysprocesses was always a bit confused in that it mixed connection, session and request (what was being executed) into a single table. SQL Server 2005 has split this information up into 3 separate DMVs. This DMVs belong in the "Execution Related Dynamic Views" category and carry the sys.dm_exec_* prefix:

  • sys.dm_exec_sessions
  • sys.dm_exec_connections
  • sys.dm_exec_request

To call these DMVs a replacement for sysprocesses is doing a disservice to the DMVs. It would be like saying that the Ford GT90 is a replacement for the Focus! It is so much more, in a different league!

First, let's have a look at sys.dm_exec_sessions as this has the closest resemblance to sysprocesses. One row exists in this view for every connection / session to SQL Server. Run the following query to see what gets returned:

select *
from sys.dm_exec_sessions;

At first glance this looks pretty much the same as sysprocesses, but keep scrolling to the right. You'll find that each session's user / environment options are clearly visible across all connections!

ANSI_NULLS, DATEFORMAT, etc. are all there! In one foul swoop you can now check that all the sessions to the database are using the correct connection properties. Previously this information was only available by using the undocumented DBCC PSS command. Even then, no aggregated view like this DMV was available. Even better, the transaction isolation level is now visible as well. I don't know how many times I've had to convince a developer that their transaction actually gets upgraded to SERIALIZABLE when started under COM+! With this column, it will be the easiest thing in the world to check! Also observe the row count column (how many rows were returned in the last result set returned), previous error number and last request start / end times - all priceless when diagnosing issues.

Depending on how familiar you were with sysprocesses you'll have noticed sys.dm_exec_sessions did not contain all of the columns from sysprocesses, particularly the connection oriented columns. The DMV containing connection specific columns is sys.dm_exec_connections. Run the following query to get a flavour of what the table contains:

select *
from sys.dm_exec_connections;

You'll notice that this table contains a smaller number of rows than sys.dm_exec_sessions. This is because sys.dm_exec_sessions includes system sessions as well (such as the lazy writer, service broker, etc.).

The sys.dm_exec_connections DMV only contains local / remote connections to the SQL Server. This does not include any internal system connections (which aren't real connection anyway). Again, this DMV contains far more detail than sysprocesses. Amongst the new columns we can now see the packet size each connection is using, number of bytes that have been sent / received for each connection (see how the number of ways to spot the resource hogs has increased with DMVs?) and which authentication scheme the connection is using. Anyone that tried to figure whether a connection was using NTLM or Kerberos in SQL Server 2000 will love that!

The last DMV for this instalment is sys.dm_exec_requests. Again, ping the DMV to see what comes back:

select *
from sys.dm_exec_requests;

We can use this DMV to get the overall picture of what requests are currently executing, get handle to the TSQL and plan they are executing, see if the request is blocked / blocking (same as SQL Server 2000). Note that this DMV duplicates some of the session information from sys.dm_exec_sessions so you don't have to join the DMVs. Typically you'd want to see session- and request-oriented data together.

So, you might be wondering, given the amount of detail available in DMV's is there a way to be able to dig into them easily without having to come up with sp_who3...? Luckily Microsoft has integrated portions of Reporting Services into the Query Analyser replacement: SQL Server Management Studio (SSMS). Bring up the Reports window and select one of the many available. Context of the report is driven by the current selection in Object Explorer. For those that don't have ready access to SSMS have a look at the attached screen shot and be wow'd!

Now that we've had a bit of an introduction to DMVs as a transition from sysprocesses, next we'll have a look at the power user's DMVs!

You can download the SQL Server 2005 system view map (which includes DMVs) from the Microsoft Download Center.

Labels: ,


Previous Posts