Showing posts with label tsql. Show all posts

Tuesday, March 28, 2017

Why VSCode has replaced Management Studio as my default SQL Database and Query editor

Firstly, let me start by stating that when I originally set out on an IT career I was heading down a developer path, and I certainly held a number of developer-type roles over the years, or found ways of continuing development projects while working in infrastructure roles..... probably why I have an interest in DevOps. Taking that into account, it's no surprise that for my entire career I have always been comfortable working in code rather than relying on GUIs. Even through all my years as a SQL DBA armed with SQL Server Management Studio (SSMS), I was always most comfortable working in TSQL rather than the wizards. That probably comes from the days of Enterprise Manager and Query Analyser (ahhhhh nostalgia). Now, the MS Product Team has done a great job of improving the wizards in SSMS and making tasks as easy as they can be in the tools. I will also state that this post is by no means saying SSMS is dead, because there are just some things it is better positioned for.

What I will cover in this blog post is why my go-to TSQL editor and tool for general database work is now VSCode with the mssql extension.

Please don't take this as a statement that I have now uninstalled SSMS or Visual Studio with SQL Server Data Tools (SSDT) from my laptop (I wish), but I have always found those tools a bit bloated with memory consumption when all I want to do is connect to a database, run some queries, or make some basic changes. What I will show is why and how I now perform those tasks with VS Code; for anything more in depth, like designing SSIS packages or performance troubleshooting, I still rely on the existing tools (for now).

Another factor driving this adoption of a text-based editor is that a large amount of my work is now with Azure and other cloud solutions, and the majority of that work is largely console or script based.

Now that you know why I have arrived at this place, let's get into how I set up and use VS Code for this purpose. I look forward to healthy discussions with people around this, because I am not a believer in the "one size fits all" approach to a tool set either, so it is always great to hear what others use.


Setup and configure your environment
Here are the steps I use to setup my VSCode environment:

  1. Download and install VSCode https://code.visualstudio.com/download
  2. Open VS Code
  3. Press Ctrl+Shift+X (on Windows)
    1. Alternatively use the View > Extensions menu item
  4. Locate and install the following extensions
    1. vscode-icons
    2. mssql
    3. powershell
    4. c#
  5. Configure the extensions
    1. From File > Preferences > File Icon Theme select "VSCode Icons"
      This will ensure that any files you open and access have nicely displayed icons to make your experience easier.
  6. Configure the environment settings
    1. From File > Preferences > Settings
      1. VS Code works in two setting modes, User and Workspace. User should be personal preferences and Workspace should be used for project specific settings that will ship with the repo.
        User settings are stored in the file C:\Users\\AppData\Roaming\Code\User\settings.json but you shouldn't have to edit that manually as the VS Code window provides the best method for working with these files.
    2. I don't change too many settings at this time from the default, but some to consider depending on your needs are:
      mssql.splitPaneSelection = "current|next|end"

      IntelliSense will help you complete the values if you need to see what is available.
  7. Now you should be ready to start working inside VS Code. However, I recommend reading the release notes when new updates are made, as the developer community is extremely active in improving VS Code and there are always new and useful features being added.
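For illustration, the user-level tweaks from step 6 end up in settings.json. A minimal sketch (the values shown are examples, not recommendations; VS Code accepts comments in this file):

```json
// settings.json (User scope)
{
    "workbench.iconTheme": "vscode-icons",
    "mssql.splitPaneSelection": "next"
}
```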

While VSCode has a built-in integrated terminal, I like the cmder tool for my terminal use. If you aren't familiar with cmder, check it out: it is very versatile, can run multiple terminals and languages, and best of all has a Quake mode.


Connecting to a database and executing SQL queries
There are many tricks and ways to work within VS Code, but here is a simple walk-through of the basics to get you started.
  1. Open VS Code if you haven't already
  2. You do not need to open a folder or save files just to run queries, but it could be beneficial. Think of a folder like a Project/Solution, but in a simpler (and faster) format. This works great with Git and cross-platform collaboration.
    For the case of this walk through just create a new file (click New File on the welcome page)
  3. Without saving the file, let's make sure we are in the right language mode.

    Click the current language in the tray menu (e.g. Plain Text)



    This will open the command palette with prompts to select the language. Either browse or type to find your language and select it.



    Now the correct SQL language is shown in the tray menu

     

    Now the color coding and formatting, along with IntelliSense, will be suitable for SQL Server development.

    TIP: When you save a file, the language mode is automatically detected based on the file extension.
  4. Press Ctrl+Shift+P to open the Command Palette
  5. Type "mssql" and select the mssql: Connect option or press Ctrl+Shift+C



    TIP: Make sure your focus is in a file with the SQL language set, and not in any other area of VSCode, when you press Ctrl+Shift+C, as otherwise it will open a terminal, which is that shortcut's default binding.
  6. Select an existing connection profile, or select Create Connection Profile to create a new one. So let's create one.
  7. Follow the wizard filling out your server/instance, database (optional), authentication etc.







    Once you start to connect the status is shown in the tray menu



    Any errors connecting will be shown with an overlay



    Once connected, VS Code will update the IntelliSense dictionary and perform other operations set by the extension.
  8. Now write your query in the file
  9. When ready, you can execute the query in one of a few ways

    Use the Command Palette and the MSSQL: Execute Query command.



    Right click in the editor and select Execute Query



    or, my favourite, just simply press Ctrl+Shift+E
  10. The query results tab will open. By default this opens in a new split window column, or the next one if you have multiples. The idea here is so you can see the query and result all in one window.



    You can put the query results at the bottom of the screen, which might be a more familiar view to those used to SSMS. To do this, select Toggle Editor Group Layout from the View menu, or press Alt+Shift+1.



    Now the results are below the query you executed.



    Alternatively you can also set the query results to display in the current split window column (e.g. new tab)



    So as you can see you can customise where the results are displayed just like in SSMS.

    Something to keep in mind is that a new result tab will open for every file you execute a query from, but if you re-run a query or a new query from the same file then it will use the existing results tab for that file.
  11. Now, just like the query editor in SSMS, it will either execute the entire file contents or what you have selected. So in this example it will just execute the selected query and not the entire file contents.



    This is why I like the keyboard shortcut Ctrl+Shift+E to execute queries because it becomes really quick to work from a file and execute different selected queries as desired.
Obviously some people will really miss Object Explorer for understanding the schemas of databases they are not familiar with. Keep in mind, though, that VSCode is designed for developers, so typically you would either have a folder containing all the scripts that create the database (and therefore your schema to refer to) or you would already be familiar with the schema. And as we all know, there are plenty of system views you can easily query to get that data (because, after all, that is all Object Explorer does).
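As a sketch of that idea, here is the sort of catalog-view query I keep in a scratch file as a rough Object Explorer substitute (standard system views, nothing specific to VS Code):

```sql
-- List tables and columns in the current database
SELECT s.name  AS [schema]
    , t.name  AS [table]
    , c.name  AS [column]
    , ty.name AS [type]
    , c.max_length
    , c.is_nullable
FROM sys.tables AS t
INNER JOIN sys.schemas AS s ON s.schema_id = t.schema_id
INNER JOIN sys.columns AS c ON c.object_id = t.object_id
INNER JOIN sys.types AS ty ON ty.user_type_id = c.user_type_id
ORDER BY s.name, t.name, c.column_id;
```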


Happy SQL Scripting.


Registering your SQL Server connections
VS Code has a JSON-based configuration system. SQL connections can be saved in the User Settings file; think along the lines of "Registered Servers" in SSMS. I have already briefly touched on how to create a new profile when you connect. However, here is how to register them ahead of time and manage existing connection profiles.

Keep in mind, though, that these connections are not unique to a project/solution/folder; they are unique to your user settings. So make sure you give them meaningful names to easily identify which databases/projects they belong to.
  1. Press Ctrl+Shift+P to open the Command Palette
  2. Type "mssql" and select the MSSQL: Manage Connection Profiles option

  3. The Command Palette will then prompt you with some more options.



    Create: This will walk you through creating a new profile via the Command Palette prompts
    Edit: This will open the User Settings JSON file and allow you to manually edit the connection profiles. NOTE: Passwords can be saved in an encrypted form but are not stored in this file for security.
    Remove: This will walk you through removing an existing profile via the Command Palette prompts

    This is an example of the JSON configuration provided with the Edit option.
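Since the screenshot doesn't copy/paste well, here is a hand-written sketch of what the mssql.connections section can look like (server, database and profile names are placeholders; check the extension's documentation for the full property list for your version):

```json
"mssql.connections": [
    {
        "profileName": "AdventureWorks-Dev",
        "server": "localhost",
        "database": "AdventureWorks",
        "authenticationType": "SqlLogin",
        "user": "sa",
        "password": "",
        "savePassword": true
    }
]
```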

  4. Once you have configured the profile you can then simply select it from the list provided under the MSSQL: Connect command.

References
VS Code official site https://code.visualstudio.com/
VS Code open-source repo https://github.com/Microsoft/vscode

Legal Stuff: The contents of this blog is provided “as-is”. The information, opinions and views expressed are those of the author and do not necessarily state or reflect those of any other company with affiliation to the products discussed. This includes any URLs or Tools. The author does not accept any responsibility from the use of the information or tools mentioned within this blog, and recommends adequate evaluation against your own requirements to measure suitability.


Friday, April 15, 2016

TSQL To identify queries which are candidates for Parameterization

While discussing the concepts of 'optimize for ad hoc workloads' and 'Forced Parameterization' in SQL Server I decided to modify an existing query I wrote to analyse the query performance from the cache (more on it here) to now analyse the adhoc queries for candidates which could benefit from parameterization.

;WITH CTE (QueryCount, StatementTextForExample, plan_handle, QueryHash, QueryPlanHash, CacheObjType, ObjType)
AS
(
    SELECT
        COUNT(query_stats.query_hash) AS QueryCount
        , MIN(query_stats.query_text) AS StatementTextForExample
        , MIN(query_stats.plan_handle) AS plan_handle
        , query_stats.query_hash AS QueryHash
        , query_stats.query_plan_hash AS QueryPlanHash
        , query_stats.CacheObjType
        , query_stats.ObjType
    FROM
    (
        SELECT
            qs.query_hash
            , qs.query_plan_hash
            , qs.plan_handle
            , cp.cacheobjtype AS [CacheObjType]
            , cp.objtype AS [ObjType]
            , SUBSTRING(qt.[text], qs.statement_start_offset/2, (
                  CASE
                      WHEN qs.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX), qt.[text])) * 2
                      ELSE qs.statement_end_offset
                  END - qs.statement_start_offset)/2
              ) AS query_text
        FROM
            sys.dm_exec_query_stats AS qs
            INNER JOIN sys.dm_exec_cached_plans AS cp ON cp.plan_handle = qs.plan_handle
            CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS qt
        WHERE qt.[text] NOT LIKE '%sys.dm_exec_query_stats%'
            AND cp.objtype = 'AdHoc'
            --AND qs.last_execution_time BETWEEN DATEADD(hour, -1, GETDATE()) AND GETDATE() --change hour time frame
    ) AS query_stats
    GROUP BY query_stats.query_hash
        , query_stats.query_plan_hash
        , query_stats.CacheObjType
        , query_stats.ObjType
    HAVING COUNT(query_stats.query_hash) > 1
)
SELECT
    CTE.QueryCount
    , CTE.CacheObjType
    , CTE.ObjType
    , CTE.StatementTextForExample
    , tp.query_plan AS StatementPlan
    , CTE.QueryHash
    , CTE.QueryPlanHash
FROM
    CTE
    OUTER APPLY sys.dm_exec_query_plan(CTE.plan_handle) AS tp
ORDER BY CTE.QueryCount DESC;

When you identify these candidates you then need to look at the most appropriate resolution, such as:

  1. Rewrite the query at the application layer to ensure it is called with parameterization from the data provider
  2. Rewrite the query as a stored procedure
  3. Enable 'optimize for ad hoc workloads' on your SQL instance
    EXEC sys.sp_configure N'optimize for ad hoc workloads', N'1'
    GO
    RECONFIGURE WITH OVERRIDE
    GO
    
  4. Enable 'Forced Parameterization' on the affected database
    USE [master]
    GO
    ALTER DATABASE [AdventureWorksPTO] SET PARAMETERIZATION FORCED WITH NO_WAIT
    GO
    
Of course, the appropriate option really depends on a larger view of your environment and applications that only you can determine. If you do have any thoughts on this, please feel free to add them to the comments below to help others.
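To illustrate option 1, the same statement issued through sys.sp_executesql with an explicit parameter produces a single reusable plan regardless of the value supplied (the table and parameter names here are purely illustrative):

```sql
EXEC sys.sp_executesql
    N'SELECT * FROM dbo.SomeTable WHERE someDate = @pDate'
    , N'@pDate datetime'
    , @pDate = '20170328';
```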

A word of caution too: before adjusting any sp_configure settings with RECONFIGURE, make sure you run the following to check for any settings which are not yet active. It is expected to see 'min server memory (MB)' in the results if you leave that setting at its default of 0, as SQL must reserve the minimum memory possible, which is 16 MB.
SELECT * FROM sys.configurations
WHERE Value <> value_in_use







 

Thursday, January 22, 2015

SCOM: Managing Data Aggregation in the OpsMgrDW database


One of the common discussions I have when working with a customer on their Operations Manager (SCOM) environment is how the data aggregation works and how that impacts the database size. There are plenty of blogs covering how to review the data retention periods; my favourite is:

http://blogs.technet.com/b/kevinholman/archive/2010/01/05/understanding-and-modifying-data-warehouse-retention-and-grooming.aspx

So now that you have read that blog and know that the defaults are a bad idea, how do we actually determine the data size that each aggregate dataset takes up?

The typical answer is to use the dwdatarp.exe tool (available from http://blogs.technet.com/b/momteam/archive/2008/05/14/data-warehouse-data-retention-policy-dwdatarp-exe.aspx); however, some organizations may have change-control and security restrictions which prevent the use of such tools. Alternatively, people may just be more comfortable seeing exactly what data is being accessed..... That is the scenario I was faced with this week, so I have come up with the below SQL query which outputs each aggregate dataset, its total size, and its percentage of the total database size.

--Which DataSet is using the most database space (alternative to using dwdatarp.exe)
USE OperationsManagerDW;
WITH TableSizes AS
(
 SELECT so.name,
  Coalesce(8 * Sum(CASE WHEN si.indid IN (0, 1) THEN si.reserved END),0) AS data_kb,
  Coalesce(8 * Sum(CASE WHEN si.indid NOT IN (0, 1, 255) THEN si.reserved END), 0) AS index_kb,
  Coalesce(8 * Sum(CASE WHEN si.indid IN (255) THEN si.reserved END), 0) AS blob_kb
 FROM dbo.sysobjects AS so JOIN dbo.sysindexes AS si ON (si.id = so.id)
 WHERE 'U' = so.type 
 GROUP BY so.name 
)
SELECT (CASE WHEN CHARINDEX('_',so.name) > 0 THEN LEFT(so.name,CHARINDEX('_',so.name)-1)
 ELSE so.name
 END) as dataset,
 Coalesce(8 * Sum(CASE WHEN si.indid IN (0, 1) THEN si.reserved END),0) +
 Coalesce(8 * Sum(CASE WHEN si.indid NOT IN (0, 1, 255) THEN si.reserved END), 0) +
 Coalesce(8 * Sum(CASE WHEN si.indid IN (255) THEN si.reserved END), 0) AS totalsize_kb,
 CAST
   (1. * (Coalesce(8 * Sum(CASE WHEN si.indid IN (0, 1) THEN si.reserved END),0) +
   Coalesce(8 * Sum(CASE WHEN si.indid NOT IN (0, 1, 255) THEN si.reserved END), 0) +
   Coalesce(8 * Sum(CASE WHEN si.indid IN (255) THEN si.reserved END), 0))
   / (SELECT SUM(data_kb + index_kb + blob_kb) FROM TableSizes)
   * 100
    AS DECIMAL(5,2) ) AS PercentOfTotal
FROM dbo.sysobjects AS so JOIN dbo.sysindexes AS si ON (si.id = so.id)
WHERE 'U' = so.type 
AND (so.name like '%daily%'
OR so.name like '%hourly%'
OR so.name like '%raw%')
GROUP BY (CASE WHEN CHARINDEX('_',so.name) > 0 THEN LEFT(so.name,CHARINDEX('_',so.name)-1)
 ELSE so.name
 END)
ORDER BY totalsize_kb DESC

In my lab environment, which is not doing anything, I get the following output. Typically though you should see PerfHourly as the highest user when the defaults are in effect.



With this information it becomes much easier to qualify the value in adjusting the aggregate table grooming setting in the blog above.
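If you also want to see the current retention settings alongside those sizes without leaving your query window, a query along these lines works (table and column names as per Kevin Holman's post above; verify them against your OperationsManagerDW version):

```sql
USE OperationsManagerDW;
SELECT ds.DatasetDefaultName
    , sda.AggregationTypeId -- commonly 0 = raw, 20 = hourly, 30 = daily
    , sda.MaxDataAgeDays    -- retention in days
FROM dbo.Dataset AS ds
INNER JOIN dbo.StandardDatasetAggregation AS sda
    ON sda.DatasetId = ds.DatasetId
ORDER BY ds.DatasetDefaultName;
```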




Monday, July 21, 2014

TSQL to suggest optimal MaxServerMemory value

A while ago a colleague of mine wrote a TSQL script to help with suggesting a MaxServerMemory value, and more recently we worked together to rewrite that script for SQL 2012. While I haven't verified it against SQL 2014, the principles are the same. The reason I am blogging it today is that I am always looking for the script or URL to direct people to, so this post makes that easier :)

Of course, there is no black and white recommendation for MaxServerMemory as it is workload dependent. However, this script can assist with understanding what that workload requirement is within a SQL Server instance. Please ensure you carefully consider all other memory needs of the server, such as additional SQL instances and services like AV, backups, monitoring, etc.


SET NOCOUNT ON;

DECLARE 
--@pg_size INT, 
@Instancename varchar(50),
--@RecMem int,
@MaxMem int,
@MaxRamServer int,
@sql varchar(max),
@SQLVersion tinyint


SELECT @SQLVersion = @@MicrosoftVersion / 0x01000000  -- Get major version

-- SELECT physical_memory_kb as ServerRAM_KB from sys.dm_os_sys_info
-- SELECT @pg_size = low from master..spt_values where number = 1 and type = 'E'
-- SELECT @Instancename = LEFT([object_name], (CHARINDEX(':',[object_name]))) FROM sys.dm_os_performance_counters WHERE counter_name = 'Buffer cache hit ratio'
PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'
PRINT 'Optimal MaxServerMemory Setting for SQL Server instance ' + @@SERVERNAME  + ' (' + CAST(SERVERPROPERTY('productversion') AS VARCHAR) + ' - ' +  SUBSTRING(@@VERSION, CHARINDEX('X',@@VERSION),4)  + ' - ' + CAST(SERVERPROPERTY('edition') AS VARCHAR) + ')'
PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'

IF @SQLVersion = 11
BEGIN
 PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'
 PRINT 'Total Memory on the Server (MB)' 
 EXEC sp_executesql N'set @_MaxRamServer= (select physical_memory_kb/1024 from sys.dm_os_sys_info)', N'@_MaxRamServer int OUTPUT', @_MaxRamServer = @MaxRamServer OUTPUT
 Print @MaxRamServer
 PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'
END
ELSE
IF @SQLVersion in (10,9)
BEGIN
 PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'
 PRINT 'Total Memory on the Server (MB)' 
 EXEC sp_executesql N'set @_MaxRamServer= (select physical_memory_in_bytes/1024/1024 from sys.dm_os_sys_info)', N'@_MaxRamServer int OUTPUT', @_MaxRamServer = @MaxRamServer OUTPUT
 Print @MaxRamServer
 PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'
END
ELSE 
BEGIN
 PRINT 'Script only supports SQL Server 2005 through 2012.'
 RETURN
END

--SELECT @RecMem=physical_memory_kb/1024 from sys.dm_os_sys_info

SET @MaxMem = CASE
    WHEN @MaxRamServer <= 1024*2 THEN @MaxRamServer - 512  /*When the RAM is less than or equal to 2GB*/
    WHEN @MaxRamServer <= 1024*4 THEN @MaxRamServer - 1024 /*When the RAM is less than or equal to 4GB*/
    WHEN @MaxRamServer <= 1024*16 THEN @MaxRamServer - 1024 - CEILING((@MaxRamServer-4096) / (4.0*1024))*1024 /*When the RAM is less than or equal to 16GB*/

 -- Worked example for a 16GB machine (case 3 above):
 -- MaxMem = 16384 - 1024 - CEILING((16384-4096)/4096)*1024 = 12288

    WHEN @MaxRamServer > 1024*16 THEN @MaxRamServer - 4096 - CEILING((@MaxRamServer-1024*16) / (8.0*1024))*1024 /*When the RAM is greater than 16GB*/
     END
 SET @sql='
EXEC sp_configure ''Show Advanced Options'',1;
RECONFIGURE WITH OVERRIDE;
EXEC sp_configure ''max server memory'','+CONVERT(VARCHAR(6), @maxMem)+';
RECONFIGURE WITH OVERRIDE;'

PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'
PRINT 'Optimal MaxServerMemory Setting for this instance of SQL' 
Print (@sql) 
PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'






Ping back to Tara S. Jana's site http://sqlmasters.jimdo.com/scripts-tsql



Wednesday, July 10, 2013

TSQL to find Top Queries by Avg CPU, IO, Time

One of my favourite performance tools has certainly become the RML reports, as you can quickly identify queries with common hot spots across performance metrics. However, this requires a bit of work: you first have to configure your performance trace (usually with Diag Manager), capture the SQL Profiler trace, and then analyse it with RML/SQLNexus. Of course, the UI for XEvents makes this a bit easier if you use that as your capture. But what if you want to look at the general overall performance of a server, or over a longer duration?
We know that SQL Server is always capturing data for us in the background and exposes it through DMVs and DMFs. So I came up with the following query, which uses the DMVs/DMFs to report the top queries.
You can change the sort order at the end of the query; however, as I have included the rank number per performance metric (CPU, IO, Time), you can quickly see queries which are in the top of each group without needing to re-sort the results or run multiple queries.
If you do want to run multiple queries with different sort columns, the Query Number will remain the same as it is based on CPU. I included this to allow you to easily compare the queries across multiple executions (depending on the time between executions).
;WITH CTE ([Query Num], [Total Executions], [Total CPU], [Avg CPU], [Avg CPU Rank], [Total IO], [Avg IO], [Avg IO Rank], [Total Elapsed Time], [Avg Elapsed Time], [Avg Elapsed Time Rank], [Sample Statement Text], [Query Plan Handle], [Statement Hash], [Query Plan Hash])
AS
(
    SELECT TOP 50
        ROW_NUMBER() OVER (ORDER BY (SUM(total_worker_time) / SUM(execution_count)) DESC) AS [Query Num]
        , SUM(execution_count) AS [Total Executions]
        , SUM(total_worker_time) AS [Total CPU]
        , SUM(total_worker_time) / SUM(execution_count) AS [Avg CPU]
        , RANK() OVER (ORDER BY (SUM(total_worker_time) / SUM(execution_count)) DESC) AS [Avg CPU Rank]
        , SUM(total_physical_reads + total_logical_reads + total_logical_writes) AS [Total IO]
        , SUM(total_physical_reads + total_logical_reads + total_logical_writes) / SUM(execution_count) AS [Avg IO]
        , RANK() OVER (ORDER BY (SUM(total_physical_reads + total_logical_reads + total_logical_writes) / SUM(execution_count)) DESC) AS [Avg IO Rank]
        , SUM(total_elapsed_time) AS [Total Elapsed Time]
        , SUM(total_elapsed_time) / SUM(execution_count) AS [Avg Elapsed Time]
        , RANK() OVER (ORDER BY (SUM(total_elapsed_time) / SUM(execution_count)) DESC) AS [Avg Elapsed Time Rank]
        , MIN(query_text) AS [Sample Statement Text]
        , MIN(plan_handle) AS [Query Plan Handle]
        , query_hash AS [Statement Hash]
        , query_plan_hash AS [Query Plan Hash]
    FROM
    (
        SELECT
            qs.*
            , SUBSTRING(st.[text], qs.statement_start_offset/2, (
                  CASE
                      WHEN qs.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX), st.[text])) * 2
                      ELSE qs.statement_end_offset
                  END - qs.statement_start_offset)/2
              ) AS query_text
        FROM sys.dm_exec_query_stats AS qs
        CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS st
        WHERE st.[text] NOT LIKE '%sys.dm_%'
        --AND DATEDIFF(hour, last_execution_time, GETDATE()) < 1 --change hour time frame
    ) AS query_stats
    GROUP BY query_hash, query_plan_hash
)
SELECT
    [Query Num]
    , [Total Executions]
    , [Total CPU]
    , [Avg CPU]
    , [Avg CPU Rank]
    , [Total IO]
    , [Avg IO]
    , [Avg IO Rank]
    , [Total Elapsed Time]
    , [Avg Elapsed Time]
    , [Avg Elapsed Time Rank]
    , DB_NAME(qp.dbid) AS [DB Name]
    , [Sample Statement Text]
    , qp.query_plan AS [Estimated Query Plan]
FROM CTE
OUTER APPLY sys.dm_exec_query_plan([Query Plan Handle]) AS qp
--ORDER BY [Avg CPU] DESC
ORDER BY [Avg IO] DESC
--ORDER BY [Avg Elapsed Time] DESC


NOTE: This query includes a TOP 50, which improves performance but only looks at a subset of the cached data. I would recommend using the predicate on last_execution_time to restrict the data to a valid time frame. This still includes the total history for those requests, but ensures you are only looking at recently active queries.


Here is an example output. In this example I had cleared the cache and only had a small test data set, so the Avg IO rank is the same for each query, but this gives you an idea.

So far I’ve tested this on SQL 2008 R2 and above.
I’m also closing in on finalising the BETA release of my DBA Admin and Performance SSMS reports, which will include this query and take it to a whole new level of visualisation.


Friday, April 12, 2013

TSQL to find Cached Plans within SQL Server

The following SQL statement is based on a number of DMVs and DMFs and allows us to look at the query statistics and cached query plans. This becomes useful when trying to determine how SQL Server is caching the execution plans for our statements, either Adhoc or Proc, but more on that below.

SELECT qs.last_execution_time
, qs.execution_count
, qs.plan_generation_num
--, qs.plan_handle
--, qs.query_plan_hash
--, qs.query_hash
, cp.refcounts as [QueryPlan_RefCounts]
, cp.usecounts as [QueryPlan_UseCounts]
, cp.size_in_bytes as [QueryPlan_SizeBytes]
, cp.cacheobjtype as [CacheObjType]
, cp.objtype as [ObjType]
--, qs.statement_start_offset
--, qs.statement_end_offset
, st.text as [QueryText]
, SUBSTRING(st.text, qs.statement_start_offset/2,
(CASE WHEN qs.statement_end_offset = -1
THEN LEN (CONVERT (nvarchar(max), st.text)) * 2
ELSE qs.statement_end_offset END -
qs.statement_start_offset)/2) as [StmtText]
, qp.query_plan as [QueryPlan]
FROM sys.dm_exec_query_stats qs
INNER JOIN sys.dm_exec_cached_plans cp ON cp.plan_handle = qs.plan_handle
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) as qp
WHERE st.text NOT LIKE N'%sys.dm_%'
AND st.text LIKE N'%'
ORDER BY qs.last_execution_time DESC;

To look for queries related to a specific table or object, change the WHERE clause as follows:
AND st.text LIKE N'%MyTableName%'


So why would we want to care about the plan cache?

In an environment with lots of adhoc statements, a problem can certainly arise where SQL Server does not reuse existing cached plans efficiently. This is because the predicate values are recorded along with the plans for adhoc statements within the cache. Say, for example, you have a query such as:

SELECT *
FROM dbo.ProcCacheTest
WHERE someDate = '19000605'

Well, this would generate a plan within the cache. But then the application/user reissues the same statement using a different value, such as:

SELECT *
FROM dbo.ProcCacheTest
WHERE someDate = '19000608'

This will result in an entirely new plan being generated and cached for the statement. Of course, if the statement is reissued with the exact same predicate values then it will reuse the existing plan from the cache. However, if you have an environment with lots of this type of activity and a large number of different predicate values in use, this could cause your plan cache to be consumed by these statements and not allow other statements to have their plans cached correctly. In other words, it causes memory pressure within the plan cache, resulting in an overall degradation of performance.

So how can we avoid this? Well, if we wrap the adhoc query up in a stored procedure such as:

CREATE PROCEDURE dbo.myProcCacheTest (
@pDate varchar(15)
) 
as
SELECT *
FROM dbo.ProcCacheTest
WHERE someDate = @pDate
GO

Now when the stored procedure is executed, the SQL Optimiser will create a plan in the cache for the statement but with parameterization (i.e. without tracking the actual values of the search predicate).

We can also use the 'optimize for ad hoc workloads' advanced instance setting to ensure that SQL Server doesn't cache full plans for adhoc statements unless they are executed more than once. However, because the Optimizer will then only create query plan stubs in the cache for single executions, and full query plans only after a query's execution count is greater than 1, it may not fix your problem: it may simply free up more space within the plan cache to store these single-execution queries.
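For reference, enabling that setting looks like this (it is instance-wide, so evaluate it in a test environment first):

```sql
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure N'optimize for ad hoc workloads', 1;
RECONFIGURE;
```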

There is also another option, which is to use the FORCED PARAMETERIZATION option on the database. This will force the Optimizer to convert the search predicates within adhoc queries to parameters before generating the plan, thus allowing the plan to be reused. This setting doesn't support all data types, so you may still find some adhoc queries being cached. In addition, be extremely careful about implementing these types of settings in a production environment; while they can provide a vast improvement, in some cases they can have a less desired effect. It all depends on how the exact queries are being processed.

Still, the greatest improvement can be gained by removing the adhoc queries through the use of stored procedures. Another advantage a stored procedure offers is that the individual statements within the procedure are cached with individual plans, and we can see those individual statements/plans using the same statement above. I have commented out the query_hash and query_plan_hash columns, but these are how SQL Server tracks the unique statements and plans within its cache.

So if we had a stored procedure like:

CREATE PROCEDURE dbo.myProcCacheTest (
@pDate varchar(15)
)
AS
SELECT *
FROM dbo.ProcCacheTest
WHERE someDate = @pDate;

DECLARE @pDate2 DateTime = DateAdd(Day, 7, @pDate);
SELECT *
FROM dbo.ProcCacheTest
WHERE someDate >= @pDate
AND someDate <= @pDate2;
GO

If we were to review the cache for this stored procedure we would find a plan in the cache for the first select query and then another for the second select query. Each will have the same plan_handle, but will have individual query_plan_hash and query_hash values.
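A quick way to pull just that procedure's statements out of the cache is to filter on the objectid column returned by sys.dm_exec_sql_text:

```sql
SELECT OBJECT_NAME(st.objectid, st.dbid) AS [ProcName]
    , qs.query_hash
    , qs.query_plan_hash
    , qs.plan_handle
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.objectid = OBJECT_ID(N'dbo.myProcCacheTest');
```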

Other aspects which are useful to learn from looking at the cache are when plans are regenerated through recompiles, whether due to data modification, SET options, or referenced-object schema changes. Remember, though, that when you change the schema of a procedure its associated cached plans are dropped and removed.

Enjoy digging deeper into the plan cache.
