
Tuesday, November 11, 2014

Troubleshooting blocking locks in SQL Azure

I was faced with an interesting situation yesterday where a customer had created a blocking chain on their SQL Azure database. The scenario they described was:

A query took out and held a lock on an object, but the client connection closed and orphaned the session on the SQL Azure database. As a result, other queries trying to access the object were blocked and timed out (even a SELECT *).

Given we do not have any graphical tools in SSMS to see current activity when you connect to a SQL Azure database, this poses a challenge for those who are used to going to Activity Monitor or the standard reports to view this sort of information. Thankfully, most of the DMVs and DMFs are supported on SQL Azure.

To troubleshoot the above scenario we then used the following queries to identify the culprit and eventually terminate it with the only option available.... KILL [spid]. To demonstrate this I recreated the scenario on my Azure database through the use of PowerShell (but I am not sharing that script, as orphaning sessions is bad practice). The good news is that these queries will also work in your on-premise environments.

TSQL to identify the Blockers and Victims

WITH Blockers AS
(
    SELECT DISTINCT blocking_session_id AS session_id
    FROM sys.dm_exec_requests
    WHERE blocking_session_id > 0
)
SELECT 'Blocker' AS type_desc
    , s.session_id
    , r.start_time
    , r.status
    , r.command
    , r.wait_type
    , r.wait_time
    , r.blocking_session_id
    , '' AS stmt_text
FROM sys.dm_exec_sessions AS s
-- LEFT JOIN: a blocker may be idle with no active request (e.g. an orphaned session)
LEFT JOIN sys.dm_exec_requests AS r ON r.session_id = s.session_id
INNER JOIN Blockers ON Blockers.session_id = s.session_id
UNION
SELECT 'Victim' AS type_desc
    , s.session_id
    , r.start_time
    , r.status
    , r.command
    , r.wait_type
    , r.wait_time
    , r.blocking_session_id
    , st.text AS stmt_text
FROM sys.dm_exec_sessions AS s
INNER JOIN sys.dm_exec_requests AS r ON r.session_id = s.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
WHERE r.blocking_session_id > 0;

The output of this query then looks like the following, where you can clearly see the blocker and its victims (in fact this example has a second layer of blocking).
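Once the blocking session has been identified, you can terminate it. The session id below is just an example from my test run; substitute the blocker's session_id from the query output:

```sql
-- Terminate the orphaned blocking session (example session id 52)
KILL 52;

-- If the rollback is taking a while, you can check its progress:
KILL 52 WITH STATUSONLY;
```

KILL rolls back any open transaction held by the session, which is what releases its locks; for a long-running transaction the rollback itself can take some time, which is where WITH STATUSONLY helps.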





TSQL to view the established locks within the current database

SELECT
    (CASE tl.request_session_id
        WHEN -2 THEN 'ORPHANED DISTRIBUTED TRANSACTION'
        WHEN -3 THEN 'DEFERRED RECOVERY TRANSACTION'
        -- cast so every CASE branch returns the same (character) type
        ELSE CAST(tl.request_session_id AS varchar(30))
    END) AS session_id
    , DB_NAME(tl.resource_database_id) AS database_name
    , o.name AS locked_obj_name
    , tl.resource_type AS locked_resource
    , tl.request_mode AS lock_type
    , st.text AS stmt_text
    , s.login_name AS login_name
    , s.host_name AS host_name
    , tl.request_status AS request_status
FROM sys.dm_tran_locks AS tl
-- this join resolves names for OBJECT-level locks; for PAGE/KEY locks
-- resource_associated_entity_id holds a partition id instead
JOIN sys.objects AS o ON o.object_id = tl.resource_associated_entity_id
JOIN sys.dm_exec_sessions AS s ON s.session_id = tl.request_session_id
JOIN sys.dm_exec_connections AS c ON c.session_id = s.session_id
CROSS APPLY sys.dm_exec_sql_text(c.most_recent_sql_handle) AS st
WHERE tl.resource_database_id = DB_ID()
ORDER BY tl.request_session_id;

The output of this query shows the various locks which have been established by each session (one-to-many in sessions-to-locks). From here we can see which locks have taken priority and which may block more queries in the future.
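If you want to reproduce a blocking chain like this yourself, it only takes two query windows. The table name here is a placeholder for any test table of your own:

```sql
-- Session 1: take and hold an exclusive lock by leaving the transaction open
BEGIN TRANSACTION;
UPDATE dbo.MyTestTable SET SomeCol = SomeCol WHERE Id = 1;
-- do NOT commit; the X lock is held until COMMIT or ROLLBACK

-- Session 2 (separate window): this SELECT will block behind session 1
SELECT * FROM dbo.MyTestTable WHERE Id = 1;
```

While session 2 is waiting, run the blocker/victim query from a third window to see the chain, then COMMIT (or KILL) session 1 to release it.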



So while you may not have a graphical view of this information, we can definitely use our trusted DMVs and DMFs to gain access to the relevant data. With the current rate of change, it can surely only be a matter of time until Activity Monitor and the standard reports are available to us in SSMS for SQL Azure.

Legal Stuff: As always the contents of this blog is provided “as-is”. The information, opinions and views expressed are those of the author and do not necessarily state or reflect those of any other company with affiliation to the products discussed. This includes any URLs or Tools. The author does not accept any responsibility from the use of the information or tools mentioned within this blog, and recommends adequate evaluation against your own requirements to measure suitability.

Wednesday, July 10, 2013

TSQL to find Top Queries by Avg CPU, IO, Time

One of my favourite performance tools has certainly become the RML reports, as you can quickly identify queries with common hot spots across performance metrics. However this requires a bit of work: you first have to configure your performance trace (usually with Diag Manager), capture the SQL Profiler trace, and then analyse it with RML/SQLNexus. Of course the UI for XEvents makes this a bit easier if you use that as your capture. But what if you want to look at the general overall performance of a server, or over a longer duration?
We know that SQL Server is always capturing data for us in the background and exposes it to us through DMVs and DMFs. So I came up with the following query, which uses the DMVs/DMFs to report the top queries.
You can change the sort order at the end of the query; however, as I have included the rank number per performance metric (CPU, IO, Time), you can quickly see queries which are at the top of each group without needing to re-sort the results or run multiple queries.
If you do want to run multiple queries with different sort columns, the Query Number should remain the same as it is based on CPU. The reason I included this is to allow you to easily compare the queries across multiple executions (depending on the time between executions).
;WITH CTE([Query Num], [Total Executions], [Total CPU], [Avg CPU], [Avg CPU Rank], [Total IO], [Avg IO], [Avg IO Rank], [Total Elapsed Time], [Avg Elapsed Time], [Avg Elapsed Time Rank], [Sample Statement Text], [Query Plan Handle], [Statement Hash], [Query Plan Hash])
AS
(
    SELECT TOP 50
        ROW_NUMBER() OVER (ORDER BY (SUM(total_worker_time) / SUM(execution_count)) DESC) AS [Query Num]
        , SUM(execution_count) AS [Total Executions]
        , SUM(total_worker_time) AS [Total CPU]
        , SUM(total_worker_time) / SUM(execution_count) AS [Avg CPU]
        , RANK() OVER (ORDER BY (SUM(total_worker_time) / SUM(execution_count)) DESC) AS [Avg CPU Rank]
        , SUM(total_physical_reads + total_logical_reads + total_logical_writes) AS [Total IO]
        , SUM(total_physical_reads + total_logical_reads + total_logical_writes) / SUM(execution_count) AS [Avg IO]
        , RANK() OVER (ORDER BY (SUM(total_physical_reads + total_logical_reads + total_logical_writes) / SUM(execution_count)) DESC) AS [Avg IO Rank]
        , SUM(total_elapsed_time) AS [Total Elapsed Time]
        , SUM(total_elapsed_time) / SUM(execution_count) AS [Avg Elapsed Time]
        , RANK() OVER (ORDER BY (SUM(total_elapsed_time) / SUM(execution_count)) DESC) AS [Avg Elapsed Time Rank]
        , MIN(query_text) AS [Sample Statement Text]
        , MIN(plan_handle) AS [Query Plan Handle]
        , query_hash AS [Statement Hash]
        , query_plan_hash AS [Query Plan Hash]
    FROM
    (
        SELECT
            qs.*
            , SUBSTRING(st.[text], qs.statement_start_offset/2, (
                CASE
                    WHEN qs.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX), st.[text])) * 2
                    ELSE qs.statement_end_offset
                END - qs.statement_start_offset)/2
            ) AS query_text
        FROM sys.dm_exec_query_stats AS qs
        CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS st
        WHERE st.[text] NOT LIKE '%sys.dm_%'
        --AND DATEDIFF(hour, last_execution_time, GETDATE()) < 1 --change hour time frame
    ) AS query_stats
    GROUP BY query_hash, query_plan_hash
)
SELECT
    [Query Num]
    , [Total Executions]
    , [Total CPU]
    , [Avg CPU]
    , [Avg CPU Rank]
    , [Total IO]
    , [Avg IO]
    , [Avg IO Rank]
    , [Total Elapsed Time]
    , [Avg Elapsed Time]
    , [Avg Elapsed Time Rank]
    , DB_NAME(qp.dbid) AS [DB Name]
    , [Sample Statement Text]
    , qp.query_plan AS [Estimated Query Plan]
FROM CTE
OUTER APPLY sys.dm_exec_query_plan([Query Plan Handle]) AS qp
--ORDER BY [Avg CPU] DESC
ORDER BY [Avg IO] DESC
--ORDER BY [Avg Elapsed Time] DESC


NOTE: This query includes a TOP 50, which improves performance but limits the results to the highest-ranked queries. I would recommend enabling the commented predicate on last_execution_time to restrict the data to a valid time frame. This still includes the total history for those requests, but ensures that you are only looking at recently active queries.


Here is an example output. In this example I had cleared the cache and only had a small set of test data, so the Avg IO Rank is the same for each query, but it gives you the idea.
(Screenshot: Top N queries by hash and rank)

So far I’ve tested this on SQL 2008 R2 and above.
I’m also closing in on finalising the BETA release of my DBA Admin and Performance SSMS reports which will include this query and take this to a whole new level of visualisation.


Saturday, April 13, 2013

A look into SQL Server's Plan Caching when object names are not fully qualified

The other day, while demonstrating the circumstances which control when SQL Server's Optimiser will create a new query plan or reuse a plan from cache, I was asked an interesting question. While I hadn't come across the specific scenario before, I gave my answer based on my knowledge of the optimiser's behaviour, and it made for an interesting investigation. This blog covers that investigation, and I am pleased to say my initial answer (based on logical thought) was correct.

Scenario
We have two tables in a database with the same name but owned by separate schemas. If a user executes a query against one of those tables but does not fully qualify the table name with the owning schema (i.e. relies on their default schema to resolve the name), and another user, whose default schema resolves the name to the other table, runs an identical query, will the Optimiser use the same plan for both?

My Initial Thoughts
While not having investigated this specific behaviour before, my knowledge of the Optimiser's behaviour is that it would fully qualify the name while generating the query plan, and therefore create separate plans regardless of whether the object name is fully qualified in the statement text.

The Investigation
To reproduce this behaviour I have created a test database [CacheTest] and, within it, two schemas, TestSchema1 and TestSchema2, along with two logins, TestUsr1 and TestUsr2, which are mapped to the database with those schemas as their respective defaults. Also within the database are two tables, one for each schema, [TestSchema1].[ProcCacheTest] and [TestSchema2].[ProcCacheTest], and each table has been populated with some test data.
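Here is a minimal sketch of that setup. The passwords, table columns, and test rows are my placeholders (the article only names the database, schemas, users, and tables); the SomeDate column matches the queries used below:

```sql
CREATE DATABASE CacheTest;
GO
USE CacheTest;
GO
CREATE SCHEMA TestSchema1;
GO
CREATE SCHEMA TestSchema2;
GO
CREATE LOGIN TestUsr1 WITH PASSWORD = 'StrongP@ssw0rd1';
CREATE LOGIN TestUsr2 WITH PASSWORD = 'StrongP@ssw0rd2';
-- each user's default schema resolves unqualified names to "their" table
CREATE USER TestUsr1 FOR LOGIN TestUsr1 WITH DEFAULT_SCHEMA = TestSchema1;
CREATE USER TestUsr2 FOR LOGIN TestUsr2 WITH DEFAULT_SCHEMA = TestSchema2;
GO
CREATE TABLE TestSchema1.ProcCacheTest (Id int IDENTITY PRIMARY KEY, SomeDate datetime);
CREATE TABLE TestSchema2.ProcCacheTest (Id int IDENTITY PRIMARY KEY, SomeDate datetime);
GRANT SELECT ON SCHEMA::TestSchema1 TO TestUsr1;
GRANT SELECT ON SCHEMA::TestSchema2 TO TestUsr2;
GO
INSERT INTO TestSchema1.ProcCacheTest (SomeDate) VALUES ('19000107');
INSERT INTO TestSchema2.ProcCacheTest (SomeDate) VALUES ('19000107'), ('22000107');
GO
```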

Let's start by ensuring we have nothing in our plan cache so it is easier to witness the behaviour. Obviously you don't want to do this on a production system.

DBCC FREEPROCCACHE

Now we need to run a query as each user individually.

Open a New Query Window and change the connection to authenticate as [TestUsr1] and then run:

SELECT * FROM [TestSchema1].[ProcCacheTest]
WHERE SomeDate = '19000107'

Then open a New Query Window, change the connection to authenticate as [TestUsr2], and run:

SELECT * FROM [TestSchema2].[ProcCacheTest]
WHERE SomeDate = '19000107'

Now check the cache; for that I am going to use the script that I wrote in my blog here. Nothing special in these results: as we have specified the schema name, we get two individual query plans cached.



So let's find out what happens when we don't specify a schema name (i.e. we don't fully qualify the object name). Run the following from the connection context of [TestUsr1]:

SELECT * FROM [ProcCacheTest]
WHERE SomeDate = '19000107'

So what does the cache tell us?


As you can see, a new query plan is cached. Again nothing unexpected here: we changed the text being executed (removed the schema name), so the query text no longer matches, and we know that even an extra white space, a line break, or a change in character case would cause this behaviour.

How about if we execute this exact same query under the context of [TestUsr2], making sure that it is the exact same query with no extra white space or variance in the text?

SELECT * FROM [ProcCacheTest]
WHERE SomeDate = '19000107'

The results of this query should return data from the [TestSchema2].[ProcCacheTest] table: even though we don't specify the schema, the user's default schema will resolve the name to that table.

After executing the query let's check the cache again.


Whoa, we now have a 4th plan in the cache. But wait: the query text looks identical, yet the optimiser has generated a new plan. How can we be sure we are looking at two completely different statements? The sys.dm_exec_query_stats DMV can help us there, as it includes the query_hash column; this is the binary hash value calculated on the query and used by the Optimiser to identify queries with similar logic.

So let's add that into the result set.



As you can see, the Query Hash values for the two queries are different. But notice that the Query Hash for the queries in rows 1 and 4 are identical, and so is the hash for queries 2 and 3, which match the pairs of statements we executed under each user context. So this verifies that the Optimiser is creating the query plan based on the fully qualified object path, not just the text we enter (in fact you can verify this even further by looking at the query_plan_hash, as it shows the same paired values).

We can also use the query_plan output of the sys.dm_exec_query_plan DMV to view the xml or graphical plan, and from within the plan we can see that the Predicate has the fully qualified object name.



Wow, this is great, but let's just be sure, because we know some connection providers will use the Prepare/Execute method, or maybe we are using the Forced Parameterisation option. So let's test with sp_executesql to ensure parameterisation of the SQL text.

I executed the following in the context of [TestUsr1]

exec sp_executesql @stmt = N'SELECT * FROM [ProcCacheTest] WHERE SomeDate = @1'
, @params = N'@1 varchar(8)'
, @1 = '19000107';

And then the following in the context of [TestUsr2]

exec sp_executesql @stmt = N'SELECT * FROM [ProcCacheTest] WHERE SomeDate = @1'
, @params = N'@1 varchar(8)'
, @1 = '22000107';

Note that it was my intention to use a different date here, both because of the data that I have in my [TestSchema2].[ProcCacheTest] table and because we have already proved that the cache will not perform a direct match on the literal text.

So we check the cache again.


Again we see the same behaviour, the Optimiser creates a new Query Plan in the cache, but it recognises that the statements relate to each other based on the Fully Qualified Object path (see the Query Hash or Query Plan Hash columns).

So after all this we can see that the Optimiser is working as we expect, mapping the query plans to the individual objects based on qualified object paths regardless of whether the user fully qualifies the object names or not. Of course, the recommendation should always still be to qualify the name of the object with the schema, to reduce this additional work the optimiser is required to do, as well as to avoid some of the other issues it can cause.

Thanks for reading, and props to the student in the recent workshop who piqued my curiosity on this subject :)


Friday, April 12, 2013

TSQL to find Cached Plans within SQL Server

The following SQL statement is based on a number of DMVs and DMFs and allows us to look at the query statistics and cached query plans. This becomes useful when trying to determine how SQL Server is caching the execution plans for our statements, either Adhoc or Proc, but more on that below.

SELECT qs.last_execution_time
, qs.execution_count
, qs.plan_generation_num
--, qs.plan_handle
--, qs.query_plan_hash
--, qs.query_hash
, cp.refcounts as [QueryPlan_RefCounts]
, cp.usecounts as [QueryPlan_UseCounts]
, cp.size_in_bytes as [QueryPlan_SizeBytes]
, cp.cacheobjtype as [CacheObjType]
, cp.objtype as [ObjType]
--, qs.statement_start_offset
--, qs.statement_end_offset
, st.text as [QueryText]
, SUBSTRING(st.text, qs.statement_start_offset/2,
(CASE WHEN qs.statement_end_offset = -1
THEN LEN (CONVERT (nvarchar(max), st.text)) * 2
ELSE qs.statement_end_offset END -
qs.statement_start_offset)/2) as [StmtText]
, qp.query_plan as [QueryPlan]
FROM sys.dm_exec_query_stats qs
INNER JOIN sys.dm_exec_cached_plans cp ON cp.plan_handle = qs.plan_handle
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) as qp
WHERE st.text NOT LIKE N'%sys.dm_%'
AND st.text LIKE N'%'
ORDER BY qs.last_execution_time DESC;

To look for queries related to a specific table or object, change the WHERE clause as follows:
AND st.text LIKE N'%MyTableName%'


So why would we want to care about the plan cache?

In an environment with lots of adhoc statements, a problem can certainly arise where SQL Server does not reuse existing cached plans efficiently. This is because, for adhoc statements, the predicate values are recorded along with the plans within the cache. Say for example you have a query such as:

SELECT *
FROM dbo.ProcCacheTest
WHERE someDate = '19000605'

This would generate a plan within the cache. But if the application/user reissues the same statement using a different value, such as:

SELECT *
FROM dbo.ProcCacheTest
WHERE someDate = '19000608'

This will result in an entirely new plan being generated and cached for the statement. Of course, if the statement is reissued with the exact same predicate values then it will reuse the existing plan from the cache. However, if you have an environment with lots of this type of activity and a large number of different predicate values in use, these statements can consume your plan cache and prevent other statements from having their plans cached correctly. In other words, it causes memory pressure within the plan cache, resulting overall in a degradation of performance.
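You can measure how much of your cache this single-use activity is consuming with a quick aggregate over sys.dm_exec_cached_plans (nothing here beyond the DMV columns already shown in the main query):

```sql
-- How much cache is held by plans that were used exactly once?
SELECT objtype
    , COUNT(*) AS plan_count
    , SUM(CAST(size_in_bytes AS bigint)) / 1024 / 1024 AS size_mb
FROM sys.dm_exec_cached_plans
WHERE usecounts = 1
GROUP BY objtype
ORDER BY size_mb DESC;
```

A large size_mb against the Adhoc objtype is the classic sign of the bloat described above.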

So how can we avoid this? Well, we can wrap the adhoc query up in a stored procedure such as:

CREATE PROCEDURE dbo.myProcCacheTest (
@pDate varchar(15)
) 
as
SELECT *
FROM dbo.ProcCacheTest
WHERE someDate = @pDate
GO

Now when the stored procedure is executed, the SQL Optimiser will create a plan in the cache for the statement, but with parameterisation (i.e. without tracking the actual values of the search predicate).
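To see the reuse in action, execute the procedure a couple of times with different values and then check the cache. You should see a single Proc entry whose usecounts increments, rather than one Adhoc plan per value:

```sql
EXEC dbo.myProcCacheTest @pDate = '19000605';
EXEC dbo.myProcCacheTest @pDate = '19000608';

-- Expect one cached Proc plan with usecounts incremented
SELECT cp.usecounts, cp.objtype, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE '%myProcCacheTest%'
AND cp.objtype = 'Proc';
```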

We can also use the Optimize for Ad Hoc Workloads advanced server setting to ensure that SQL Server doesn't cache full plans for adhoc statements unless they are executed more than once. With this enabled, the Optimizer only creates a small query plan stub in the cache on the first execution, and compiles a full query plan into the cache once a query's execution count is greater than 1. Note that it may not fix your problem; it may simply free up space in the plan cache that was being consumed by these single-execution queries.
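Enabling it is a server-level change via sp_configure (it sits behind 'show advanced options'); as with any server-wide setting, test it before applying in production:

```sql
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;
```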

There is also another option, which is to use the FORCED PARAMETERIZATION setting on the database. This forces the Optimizer to convert the literal values within adhoc queries to parameters before generating the plan, thus allowing the plan to be reused. This setting doesn't support all constructs and data types, so you may still find some adhoc queries being cached per value. In addition, be extremely careful about implementing this type of setting in a production environment: while it can provide a vast improvement, in some cases it can have a less desired effect; it all depends on the way the exact queries are being processed.
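Setting it is a single database-level statement ([YourDatabase] is a placeholder here); SIMPLE is the default you would revert to:

```sql
ALTER DATABASE [YourDatabase] SET PARAMETERIZATION FORCED;

-- revert to the default behaviour:
ALTER DATABASE [YourDatabase] SET PARAMETERIZATION SIMPLE;
```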

Still, the greatest improvement can be gained from removing the adhoc queries through the use of stored procedures. Another advantage a stored procedure offers us is that the individual statements within the procedure are cached with individual plans, and we can see those individual statements/plans using the same statement above. I have commented out the query_hash and query_plan_hash columns, but these are how SQL Server tracks the unique statements and plans within its cache.

So if we had a stored procedure like:

CREATE PROCEDURE dbo.myProcCacheTest (
@pDate varchar(15)
)
AS
SELECT *
FROM dbo.ProcCacheTest
WHERE someDate = @pDate;

DECLARE @pDate2 DateTime = DateAdd(Day, 7, @pDate);
SELECT *
FROM dbo.ProcCacheTest
WHERE someDate >= @pDate
AND someDate <= @pDate2;
GO

If we were to review the cache for this stored procedure, we would find a plan in the cache for the first SELECT query and another for the second. Each will have the same plan_handle, but individual query_plan_hash and query_hash values.
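A quick way to see those per-statement entries is to filter sys.dm_exec_query_stats to the procedure's object id (sys.dm_exec_sql_text exposes an objectid column for statements that belong to an object):

```sql
-- Expect one row per statement in the procedure, sharing a plan_handle
SELECT qs.plan_handle, qs.query_hash, qs.query_plan_hash
    , qs.statement_start_offset, qs.statement_end_offset
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.objectid = OBJECT_ID('dbo.myProcCacheTest');
```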

Other useful things to look for in the cache are plans regenerated through recompile, whether due to data modification, SET options, or schema changes on referenced objects. Remember though that when you change the schema of a procedure, its associated cached plans are dropped and removed.
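You can also trigger this yourself: sp_recompile marks a procedure (or, when given a table, every procedure and trigger referencing it) so its plans are dropped and recompiled on next execution:

```sql
-- drop the cached plans for one procedure
EXEC sp_recompile N'dbo.myProcCacheTest';

-- or flag every procedure/trigger that references a table
EXEC sp_recompile N'dbo.ProcCacheTest';
```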

Enjoy digging deeper into the plan cache.
