Wednesday, July 10, 2013

TSQL to find Top Queries by Avg CPU, IO, Time

One of my favourite performance tools has certainly become the RML reports, as you can quickly identify queries with common hot spots across performance metrics. However, this requires a bit of work: you first have to configure your performance trace (usually with Diag Manager), capture the SQL Profiler trace, and then analyse it with RML/SQLNexus. Of course, the UI for XEvents makes this a bit easier if you use that for your capture. But what if you want to look at the general overall performance of a server, or at a longer duration?
We know that SQL Server is always capturing data for us in the background and exposes it through DMVs and DMFs. So I came up with the following query, which uses the DMVs/DMFs to report the top queries.
You can change the sort order at the end of the query. However, as I have included the rank per performance metric (CPU, IO, Time), you can quickly see which queries are near the top of each group without needing to re-sort the results or run multiple queries.
If you do want to run the query multiple times with different sort columns, the Query Number should remain the same, as it is based on CPU. I included it to allow you to easily compare queries across multiple executions (depending on the time between executions).
;WITH CTE([Query Num], [Total Executions], [Total CPU], [Avg CPU], [Avg CPU Rank], [Total IO], [Avg IO], [Avg IO Rank], [Total Elapsed Time], [Avg Elapsed Time], [Avg Elapsed Time Rank], [Sample Statement Text], [Query Plan Handle], [Statement Hash], [Query Plan Hash])
AS
(
    SELECT TOP 50
        ROW_NUMBER() OVER (ORDER BY (SUM(total_worker_time) / SUM(execution_count)) DESC) AS [Query Num]
        , SUM(execution_count) AS [Total Executions]
        , SUM(total_worker_time) AS [Total CPU]
        , SUM(total_worker_time) / SUM(execution_count) AS [Avg CPU]
        , RANK() OVER (ORDER BY (SUM(total_worker_time) / SUM(execution_count)) DESC) AS [Avg CPU Rank]
        , SUM(total_physical_reads + total_logical_reads + total_logical_writes) AS [Total IO]
        , SUM(total_physical_reads + total_logical_reads + total_logical_writes) / SUM(execution_count) AS [Avg IO]
        , RANK() OVER (ORDER BY (SUM(total_physical_reads + total_logical_reads + total_logical_writes) / SUM(execution_count)) DESC) AS [Avg IO Rank]
        , SUM(total_elapsed_time) AS [Total Elapsed Time]
        , SUM(total_elapsed_time) / SUM(execution_count) AS [Avg Elapsed Time]
        , RANK() OVER (ORDER BY (SUM(total_elapsed_time) / SUM(execution_count)) DESC) AS [Avg Elapsed Time Rank]
        , MIN(query_text) AS [Sample Statement Text]
        , MIN(plan_handle) AS [Query Plan Handle]
        , query_hash AS [Statement Hash]
        , query_plan_hash AS [Query Plan Hash]
    FROM 
    (
        SELECT 
            qs.*,
            -- Offsets are zero-based byte positions in Unicode text, so halve them and add 1 for SUBSTRING
            SUBSTRING(st.[text], (qs.statement_start_offset / 2) + 1, (
                CASE 
                    WHEN qs.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX), st.[text])) * 2 
                    ELSE qs.statement_end_offset
                END - qs.statement_start_offset) / 2 + 1
            ) AS query_text
        FROM sys.dm_exec_query_stats AS qs
        CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS st
        WHERE st.[text] NOT LIKE '%sys.dm_%'
        --AND DateDiff(hour, last_execution_time, getdate()) < 1 --change hour time frame
    ) AS query_stats
    GROUP BY query_hash, query_plan_hash
)
SELECT 
    [Query Num]
   , [Total Executions]
   , [Total CPU]
   , [Avg CPU]
   , [Avg CPU Rank]
   , [Total IO]
   , [Avg IO]
   , [Avg IO Rank]
   , [Total Elapsed Time]
   , [Avg Elapsed Time]
   , [Avg Elapsed Time Rank]
   , DB_NAME(qp.dbid) AS [DB Name]
   , [Sample Statement Text]
   , qp.query_plan AS [Estimated Query Plan]
FROM CTE
OUTER APPLY sys.dm_exec_query_plan([Query Plan Handle]) AS qp
--ORDER BY [Avg CPU] DESC
ORDER BY [Avg IO] DESC
--ORDER BY [Avg Elapsed Time] DESC
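
If you want to follow one particular query between runs of the report, one option is to look it up directly by its Statement Hash. The following is only a minimal sketch: the variable name @StatementHash and its all-zero value are placeholders of mine, so replace the value with a hash copied from your own output.

```sql
-- Hypothetical placeholder: paste a [Statement Hash] value from the report output here.
DECLARE @StatementHash BINARY(8) = 0x0000000000000000;

-- One row per cached plan/statement that shares this query hash.
SELECT
      qs.query_hash
    , qs.query_plan_hash
    , qs.execution_count
    , qs.total_worker_time / qs.execution_count AS [Avg CPU]
    , (qs.total_physical_reads + qs.total_logical_reads + qs.total_logical_writes)
          / qs.execution_count AS [Avg IO]
    , qs.total_elapsed_time / qs.execution_count AS [Avg Elapsed Time]
    , qs.last_execution_time
FROM sys.dm_exec_query_stats AS qs
WHERE qs.query_hash = @StatementHash
ORDER BY qs.last_execution_time DESC;
```

More than one row for the same hash usually means the statement has been compiled into several plans, which is worth knowing before you compare averages.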


NOTE: This query includes a TOP 50, which improves performance but means the report only covers 50 query groups. I would recommend using the commented-out predicate on last_execution_time to restrict the data to a relevant time frame. The totals would then still include the full history for those requests, but you can be sure that you are only looking at active queries.
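
As a rough illustration of that time-frame restriction, here is a cut-down sketch against the same DMV. The variable @WindowHours is my own naming and the one-hour value is just an assumption. I have written the predicate with DATEADD rather than DATEDIFF because DATEDIFF(hour, ...) counts hour boundaries crossed rather than elapsed time, so the two forms can differ slightly.

```sql
-- Assumption: a one-hour window; change @WindowHours to suit your workload.
DECLARE @WindowHours INT = 1;

SELECT
      qs.query_hash
    , qs.query_plan_hash
    , qs.execution_count          -- still cumulative since the plan was cached
    , qs.total_worker_time        -- still cumulative since the plan was cached
    , qs.last_execution_time
FROM sys.dm_exec_query_stats AS qs
WHERE qs.last_execution_time >= DATEADD(HOUR, -@WindowHours, GETDATE())
ORDER BY qs.last_execution_time DESC;
```

The filter only decides which queries are included; it does not reset the counters, so the totals and averages still reflect every execution since the plan entered the cache.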


Here is an example output. In this example I had cleared the cache and had only a small amount of test data, so the Avg IO rank is the same for each query, but this gives you an idea.
[Image: TopNQueryiesbyHashandRank]
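
The identical ranks in that example come from how RANK() treats ties, which is also why the report carries a separate ROW_NUMBER()-based Query Num. The sketch below uses made-up values (the names Query A, B and C and their numbers are purely illustrative) to show the difference.

```sql
-- Made-up sample data: three queries, two of which tie on average IO.
SELECT
      q.QueryName
    , q.AvgIO
    , ROW_NUMBER() OVER (ORDER BY q.AvgIO DESC) AS [Row Number]  -- always unique
    , RANK()       OVER (ORDER BY q.AvgIO DESC) AS [Rank]        -- ties share a rank
FROM (VALUES ('Query A', 10), ('Query B', 10), ('Query C', 4)) AS q (QueryName, AvgIO)
ORDER BY [Rank], q.QueryName;
```

Query A and Query B both receive rank 1 and Query C receives rank 3, while the row numbers stay unique (which of the two tied rows gets 1 or 2 is not guaranteed).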

So far I’ve tested this on SQL 2008 R2 and above.
I’m also closing in on finalising the BETA release of my DBA Admin and Performance SSMS reports which will include this query and take this to a whole new level of visualisation.

Legal Stuff: The contents of this blog are provided “as-is”. The information, opinions and views expressed are those of the author and do not necessarily state or reflect those of any other company with affiliation to the products discussed. This includes any URLs or Tools. The author does not accept any responsibility for the use of the information or tools mentioned within this blog, and recommends adequate evaluation against your own requirements to measure suitability.
