Showing posts with label performance. Show all posts
Showing posts with label performance. Show all posts

Friday, April 15, 2016

TSQL To identify queries which are candidates for Parameterization

While discussing the concepts of 'optimize for ad hoc workloads' and 'Forced Parameterization' in SQL Server I decided to modify an existing query I wrote to analyse the query performance from the cache (more on it here) to now analyse the adhoc queries for candidates which could benefit from parameterization.

;WITH CTE(QueryCount, StatementTextForExample, plan_handle, QueyHash, QueryPlanHash, CacheObjType, 
    ObjType) 
AS 
( 
   SELECT
    COUNT(query_stats.query_hash) AS QueryCount
    , MIN(query_stats.query_text) AS StatementTextForExample
    , MIN(query_stats.plan_handle) AS plan_handle
    , query_stats.query_hash AS QueryHash
    , query_stats.query_plan_hash AS QueryPlanHash
    , query_stats.CacheObjType
    , query_stats.ObjType
   FROM  
       ( 
           SELECT  
      qs.query_hash
      , qs.query_plan_hash 
      , qs.plan_handle
      , cp.cacheobjtype as [CacheObjType]
      , cp.objtype as [ObjType]
      , SUBSTRING(qt.[text], qs.statement_start_offset/2, ( 
                   CASE  
                       WHEN qs.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX), qt.[text])) * 2  
                       ELSE qs.statement_end_offset  
                   END - qs.statement_start_offset)/2  
               ) AS query_text 
           FROM  
               sys.dm_exec_query_stats AS qs 
      INNER JOIN sys.dm_exec_cached_plans cp ON cp.plan_handle = qs.plan_handle
               CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS qt 
           WHERE qt.[text] NOT LIKE '%sys.dm_exec_query_stats%' 
     AND cp.objtype = 'AdHoc'
     --AND qs.last_execution_time BETWEEN DATEADD(hour,-1,GETDATE()) AND GETDATE() --change hour time frame
       ) AS query_stats 
 GROUP BY query_stats.query_hash
  , query_stats.query_plan_hash
  , query_stats.CacheObjType
  , query_stats.ObjType
 HAVING COUNT(query_stats.query_hash) > 1
) 
SELECT  
   CTE.QueryCount
   , CTE.CacheObjType
   , CTE.ObjType
   , CTE.StatementTextForExample
   , tp.query_plan AS StatementPlan
   , CTE.QueyHash
   , CTE.QueryPlanHash 
FROM 
   CTE 
   OUTER APPLY sys.dm_exec_query_plan(CTE.plan_handle) AS tp 
ORDER BY CTE.QueryCount DESC; 
--ORDER BY [Total IO] DESC; 

When you identify these candidates you then need to look at what is the most appropriate resolution such as.

  1. Rewrite the query at the application layer to ensure it is called with parameterization from the data provider
  2. Rewrite the query as a stored procedure
  3. Enable 'optimize for ad hoc workloads' on your SQL instance
    EXEC sys.sp_configure N'optimize for ad hoc workloads', N'0'
    GO
    RECONFIGURE WITH OVERRIDE
    GO
    
  4. Enable 'Forced Parameterization' on the affected database
    USE [master]
    GO
    ALTER DATABASE [AdventureWorksPTO] SET PARAMETERIZATION SIMPLE WITH NO_WAIT
    GO
    
Of cause the appropriate option really depends on a larger view of your environment and applications that only you can determine. If you do have any thoughts on this please feel free to add them to the comments below to help others.

Word of caution too. Before adjusting any sp_configure settings with RECONFIGURE make sure you run the following to check for any settings which are not yet active. It is expected to see 'min server memory (MB)' in the results of this though if you leave the setting default at 0 as SQL must reserve the minimum memory possible which is 16mb.
SELECT * FROM sys.configurations
WHERE Value <> value_in_use


Legal Stuff: The contents of this blog is provided “as-is”. The information, opinions and views expressed are those of the author and do not necessarily state or reflect those of any other company with affiliation to the products discussed. This includes any URLs or Tools. The author does not accept any responsibility from the use of the information or tools mentioned within this blog, and recommends adequate evaluation against your own requirements to measure suitability.





 

Monday, July 21, 2014

TSQL to suggest optimal MaxServerMemory value

A while ago a colleague of mine wrote a TSQL script to help with suggesting a MaxServerMemory value and a little less time ago we worked together to re-write that script for SQL 2012. While I haven't "checked" it against SQL 2014 the principals are the same, and the reason I am blogging it today is because I am always looking for the script or URL to direct people to so this post makes that easier :)

Of cause there is no black and white recommendation for MaxServerMemory as it is workload dependent. However this script can assist with understanding what that workload requirement is within a SQL Server Instance, please ensure you carefully consider all other memory needs of the server such as additional SQL Instances, and services such as AV, Backups, Monitoring, etc.


SET NOCOUNT ON;

DECLARE 
--@pg_size INT, 
@Instancename varchar(50),
--@RecMem int,
@MaxMem int,
@MaxRamServer int,
@sql varchar(max),
@SQLVersion tinyint


SELECT @SQLVersion = @@MicrosoftVersion / 0x01000000  -- Get major version

-- SELECT physical_memory_kb as ServerRAM_KB from sys.dm_os_sys_info
-- SELECT @pg_size = low from master..spt_values where number = 1 and type = 'E'
-- SELECT @Instancename = LEFT([object_name], (CHARINDEX(':',[object_name]))) FROM sys.dm_os_performance_counters WHERE counter_name = 'Buffer cache hit ratio'
PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'
PRINT 'Optimal MaxServermemory Setting for SQL Server instance ' + @@SERVERNAME  + ' (' + CAST(SERVERPROPERTY('productversion') AS VARCHAR) + ' - ' +  SUBSTRING(@@VERSION, CHARINDEX('X',@@VERSION),4)  + ' - ' + CAST(SERVERPROPERTY('edition') AS VARCHAR) + ')'
PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'

IF @SQLVersion = 11
BEGIN
 PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'
 PRINT 'Total Memory on the Server (MB)' 
 EXEC sp_executesql N'set @_MaxRamServer= (select physical_memory_kb/1024 from sys.dm_os_sys_info)', N'@_MaxRamServer int OUTPUT', @_MaxRamServer = @MaxRamServer OUTPUT
 Print @MaxRamServer
 PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'
END
ELSE
IF @SQLVersion in (10,9)
BEGIN
 PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'
 PRINT 'Total Memory on the Server (MB)' 
 EXEC sp_executesql N'set @_MaxRamServer= (select physical_memory_in_bytes/1024/1024 from sys.dm_os_sys_info)', N'@_MaxRamServer int OUTPUT', @_MaxRamServer = @MaxRamServer OUTPUT
 Print @MaxRamServer
 PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'
END
ELSE 
BEGIN
 PRINT 'Script only supports SQL Server 2005 or later.'
 RETURN
END

--SELECT @RecMem=physical_memory_kb/1024 from sys.dm_os_sys_info

SET @MaxMem = CASE 
    WHEN @MaxRamServer < = 1024*2 THEN @MaxRamServer - 512  /*When the RAM is Less than or equal to 2GB*/
    WHEN @MaxRamServer < = 1024*4 THEN @MaxRamServer - 1024 /*When the RAM is Less than or equal to 4GB*/
    WHEN @MaxRamServer < = 1024*16 THEN @MaxRamServer - 1024 - Ceiling((@MaxRamServer-4096) / (4.0*1024))*1024 /*When the RAM is Less than or equal to 16GB*/

 -- My machines memory calculation
 -- RAM= 16GB
 -- Case 3 as above:- 16384 RAM-> MaxMem= 16384-1024-[(16384-4096)/4096] *1024
 -- MaxMem= 12106

    WHEN @MaxRamServer > 1024*16 THEN @MaxRamServer - 4096 - Ceiling((@MaxRamServer-1024*16) / (8.0*1024))*1024 /*When the RAM is Greater than or equal to 16GB*/
     END
 SET @sql='
EXEC sp_configure ''Show Advanced Options'',1;
RECONFIGURE WITH OVERRIDE;
EXEC sp_configure ''max server memory'','+CONVERT(VARCHAR(6), @maxMem)+';
RECONFIGURE WITH OVERRIDE;'

PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'
PRINT 'Optimal MaxServerMemory Setting for this instance of SQL' 
Print (@sql) 
PRINT '------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'






Ping back to Tara S. Jana's site http://sqlmasters.jimdo.com/scripts-tsql


Legal Stuff: As always the contents of this blog is provided “as-is”. The information, opinions and views expressed are those of the author and do not necessarily state or reflect those of any other company with affiliation to the products discussed. This includes any URLs or Tools. The author does not accept any responsibility from the use of the information or tools mentioned within this blog, and recommends adequate evaluation against your own requirements to measure suitability.

Friday, July 4, 2014

Format Operator vs Format Parameter for Date Time

So most of my PowerShell scripts include logging or export functionality which requires the current date/time to be in a particular format. So this got me thinking, just which method actually performs better.

Get-Date Format Parameter

Command:
Get-Date -Format {yyyyMMddhhmmss}


Individual Times (ms):  0.9507, 0.3538, 0.3797, 0.2996, 0.2996, 0.3575, 0.1991, 0.2011, 0.1297, 0.1625
Overall Avg (ms): 0.33333

Format Operator

Command:
"{0:yyyy}{0:MM}{0:dd}{0:hh}{0:mm}{0:ss}" -f (Get-Date)

Individual Times (ms):  1.6881, 0.4051, 1.2866, 0.7377, 0.1761, 0.4794, 0.9889, 0.7307, 0.5377, 0.8777
Overall Avg (ms): 0.7908


And the winner is....... Get-Date with the Format parameter, most likely because the formatting is conducted early in the stack, (i.e. closer to the .Net framework layer). However if you look at some of the individual results there isn't much in it at times.


The full script used to perform this test is as follows:

[array]$Results = @();

for ($i=0; $i -lt 10; $i++)
{
   $Results += Measure-Command {
      Get-Date -Format {yyyyMMddhhmmss}
   } | select TotalMilliseconds;
   #helps give real results by letting the system pause for a moment
   Start-Sleep -Seconds 1;
}
Write-Host "Get-Date format paramter results"
$Results

[array]$Results = @();
for ($i=0; $i -lt 10; $i++)
{
   $Results += Measure-Command {
      "{0:yyyy}{0:MM}{0:dd}{0:hh}{0:mm}{0:ss}" -f (Get-Date)
   } | select TotalMilliseconds;
   #helps give real results by letting the system pause for a moment
   Start-Sleep -Seconds 1;
}
Write-Host "Format operator results"
$Results

Wednesday, July 10, 2013

TSQL to find Top Queries by Avg CPU, IO, Time

One of my favourite performance tools has certainly become the RML reports as you can quickly identify queries with common hot spots across performance metrics. However this requires a bit of work, you first have to configure your performance trace (usually with Diag Manager), capture the SQL Profiler Trace, and then analyse it with RML/SQLNexus. Of cause the UI with XEvents makes this a bit easier if you use that as your capture. However what if you want to look at a general overall performance of a server, or for a longer duration.
We know that SQL Server is always capturing data for us in the background and exposes that to us with DMVs and DMFs. So I came up with the following query which uses the DMVs/DMFs to report the top queries.
You can change the sort order at the end of the query however as I have included the Rank number per performance metric (CPU, IO, Time) you can quickly see queries which are in the top of each group without needing to resort the results or run multiple queries.
If you did want to run multiple queries with different sort columns, then the Query Number should remain the same as it is based on CPU and the reason i included this is to allow you to easily compare the queries across multiple executions (depending on the time between executions).
;WITH CTE([Query Num], [Total Executions], [Total CPU], [Avg CPU], [Avg CPU Rank], [Total IO], [Avg IO], [Avg IO Rank], [Total Elapsed Time], [Avg Elapsed Time], [Avg Elapsed Time Rank], [Sample Statement Text], [Query Plan Handle], [Statement Hash], [Query Plan Hash])
AS
(
    SELECT TOP 50
        ROW_NUMBER() OVER (ORDER BY (SUM(total_worker_time) / SUM(execution_count)) DESC) AS [Query Num]        , SUM(execution_count) AS [Total Executions]        , SUM(total_worker_time) AS [Total CPU]        , SUM(total_worker_time) / SUM(execution_count) AS [Avg CPU]        , RANK() OVER (ORDER BY (SUM(total_worker_time) / SUM(execution_count)) DESC) AS [Avg CPU Rank]        , SUM(total_physical_reads + total_logical_reads + total_logical_writes) AS [Total IO]        , SUM(total_physical_reads + total_logical_reads + total_logical_writes) / SUM(execution_count) AS [Avg IO]        , RANK() OVER (ORDER BY (SUM(total_physical_reads + total_logical_reads + total_logical_writes) / SUM(execution_count)) DESC) AS [Avg IO Rank]        , SUM(total_elapsed_time) AS [Total Elapsed Time]
        , SUM(total_elapsed_time) / SUM(execution_count) AS [Avg Elapsed Time]        , RANK() OVER (ORDER BY (SUM(total_elapsed_time) / SUM(execution_count)) DESC) AS [Avg Elapsed Time Rank]
        , MIN(query_text) AS [Sample Statement Text]        , MIN(plan_handle) AS [Query Plan Handle]        , query_hash AS [Statement Hash]        , query_plan_hash AS [Query Plan Hash]
    FROM 
    (
        SELECT 
            qs.*
            SUBSTRING(st.[text], qs.statement_start_offset/2, (
                CASE 
                    WHEN qs.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX), st.[text])) * 2 
                    ELSE qs.statement_end_offset                  END - qs.statement_start_offset)/2 
            ) AS query_text
        FROM sys.dm_exec_query_stats AS qs
        CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS st
        WHERE st.[text] NOT LIKE '%sys.dm_%'
        --AND DateDiff(hour, last_execution_time, getdate()) < 1 --change hour time frame
    ) AS query_stats     GROUP BY query_hash, query_plan_hash )
SELECT 
    [Query Num]
   , [Total Executions]
   , [Total CPU]
   , [Avg CPU]
   , [Avg CPU Rank]
   , [Total IO]
   , [Avg IO]
   , [Avg IO Rank]
   , [Total Elapsed Time]
   , [Avg Elapsed Time]
   , [Avg Elapsed Time Rank]
  , DB_Name(qp.dbid) AS [DB Name]   , [Sample Statement Text]
   , qp.query_plan AS [Estimated Query Plan]FROM CTE
OUTER APPLY sys.dm_exec_query_plan([Query Plan Handle]) AS qp --ORDER BY [Avg CPU] DESC
ORDER BY [Avg IO] DESC --ORDER BY [Avg Elapsed Time] DESC


NOTE: This query includes a TOP 50 which improves performance but only looks at very recent data. I would recommend using the predicate on last_execution_time to restrict the data back to a valid time frame. This would still then include the total history for those requests but ensure that you are only looking at active queries.


Here is an example output. In this example I had cleared the cache and only a small test data so the Avg IO rank is the same for each query but this gives you an idea.
TopNQueryiesbyHashandRank

So far I’ve tested this on SQL 2008 R2 and above.
I’m also closing in on finalising the BETA release of my DBA Admin and Performance SSMS reports which will include this query and take this to a whole new level of visualisation.

Legal Stuff: The contents of this blog is provided “as-is”. The information, opinions and views expressed are those of the author and do not necessarily state or reflect those of any other company with affiliation to the products discussed. This includes any URLs or Tools. The author does not accept any responsibility from the use of the information or tools mentioned within this blog, and recommends adequate evaluation against your own requirements to measure suitability.

Tuesday, April 23, 2013

TSQL to identify databases with high number of VLFs

If you aren't aware there can be significant impact to performance when a database has a high number of VLFs within it's Transaction Log. SQL Server divides each physical transaction log file internally into a number of virtual log files (VLFs) and it is these that contain the log records.  There is no fixed size for a VLF, and there is no fixed number which would be created within a physical log file. These values are all determined dynamically by SQL Server when creating or extending physical log files.  While SQL Server tries to maintain a small number of VLFs for the best performance, file growth can result in a database having a large number of VLFs because the following equation will be used to determine the number of files to create within the new space.

- Less than 64MB = 4 VLFs
- Greater than 64MB and less than 1GB = 8 VLFs
- Greater than or equal to 1GB = 16 VLFs

So for example if I initially create a 1GB transaction log file it will be divided into 16 64MB VLFs, then if it grew a few times by 10 Mb before being switched to 10% the following trend would be experienced and result in a quickly increasing number of VLFs when then compared to manually growing the file:

ActionSpace AddedVLFs CreatedTotal SizeTotal VLFs
Initial Size1024 Mb161024 Mb16
Auto Grow10 Mb41034 Mb20
Auto Grow10 Mb41044 Mb24 
Auto Grow10 Mb41054 Mb28
Auto Grow10 Mb41064 Mb32
Auto Grow106 Mb81170 Mb40
Auto Grow117 Mb81287 Mb48
 Auto Grow128 Mb81415 Mb56
 Auto Grow141 Mb81556 Mb64
 Auto Grow155 Mb81711 Mb72
 Auto Grow171 Mb81882 Mb80
 Auto Grow188 Mb82070 Mb88

The only way to reduce the number of VLFs is to then Shrink the file and manually grow the file by a set size such as:

ActionSpace AddedVLFs CreatedTotal SizeTotal VLFs
Initial Size1024 Mb161024 Mb16
Manual Grow1024 Mb162048 Mb32
Auto Grow204 Mb82252 Mb 40

Ideally you should keep the number of VLFs to as small as possible. In SQL 2012 there is now a warning raised when a database has >10,000 VLFs, although there is no warning available in earlier versions of SQL Server. You can use the following TSQL though to report on the number of VLFs per database within your SQL Instance. Then if you compare this with your auto-grow settings for the database you can determine the reason for why the count is the way it is.

SET NOCOUNT ON;
/* declare variables required */DECLARE @DatabaseId INT;
DECLARE @TSQL varchar(MAX);
DECLARE cur_DBs CURSOR FOR
SELECT database_id FROM sys.databases;
OPEN cur_DBs;
FETCH NEXT FROM cur_DBs INTO @DatabaseId

--These table variables will be used to store the data
DECLARE @tblAllDBs Table (DBName sysname
   , FileId INT
   , FileSize BIGINT
   , StartOffset BIGINT
   , FSeqNo INT
   , Status TinyInt
   , Parity INT
   , CreateLSN NUMERIC(25,0)
)IF '11' = substring(convert(char(12),serverproperty('productversion')), 1, 2)
BEGIN   DECLARE @tblVLFs2012 Table (RecoveryUnitId BIGINT
      , FileId INT
      , FileSize BIGINT
      , StartOffset BIGINT
      , FSeqNo INT
      , Status TinyInt
      , Parity INT
      , CreateLSN NUMERIC(25,0)
   );
END
ELSE
BEGIN
   DECLARE @tblVLFs Table (
      FileId INT
      , FileSize BIGINT
      , StartOffset BIGINT
      , FSeqNo INT
      , Status TinyInt
      , Parity INT
      , CreateLSN NUMERIC(25,0)
   );
END
--loop through each database and get the info
WHILE @@FETCH_STATUS = 0
BEGIN   PRINT 'DB: ' + CONVERT(varchar(200), DB_NAME(@DatabaseId));
   SET @TSQL = 'dbcc loginfo('+CONVERT(varchar(12), @DatabaseId)+');';
   IF '11' = substring(convert(char(12),serverproperty('productversion')), 1, 2)
   BEGIN
      DELETE FROM @tblVLFs2012;
      INSERT INTO @tblVLFs2012
      EXEC(@TSQL);
      INSERT INTO @tblAllDBs
      SELECT DB_NAME(@DatabaseId)
         , FileId
         , FileSize
         , StartOffset
         , FSeqNo
         , Status
         , Parity
         , CreateLSN
      FROM @tblVLFs2012;
   END
   ELSE
   BEGIN
      DELETE FROM @tblVLFs;
      INSERT INTO @tblVLFs
      EXEC(@TSQL);
      INSERT INTO @tblAllDBs
      SELECT DB_NAME(@DatabaseId)
         , FileId
         , FileSize
         , StartOffset
         , FSeqNo
         , Status
         , Parity
         , CreateLSN
      FROM @tblVLFs;
   END
   FETCH NEXT FROM cur_DBs INTO @DatabaseId
ENDCLOSE cur_DBs;
DEALLOCATE cur_DBs;

--just for formating if output to Text
PRINT '';
PRINT '';
PRINT '';

--Return the data based on what we have found
SELECT a.DBName
    , COUNT(a.FileId) AS [TotalVLFs]
    , MAX(b.[ActiveVLFs]) AS [ActiveVLFs]
    , (SUM(a.FileSize) / COUNT(a.FileId) / 1024) AS [AvgFileSizeKb]
FROM @tblAllDBs a
INNER JOIN (
    SELECT DBName
        , COUNT(FileId) [ActiveVLFs]
    FROM @tblAllDBs
    WHERE Status = 2
    GROUP BY DBName
) b
ON b.DBName = a.DBName
GROUP BY a.DBName
ORDER BY TotalVLFs DESC;

SET NOCOUNT OFF;


Legal Stuff: The contents of this blog is provided “as-is”. The information, opinions and views expressed are those of the author and do not necessarily state or reflect those of any other company with affiliation to the products discussed. This includes any URLs or Tools. The author does not accept any responsibility from the use of the information or tools mentioned within this blog, and recommends adequate evaluation against your own requirements to measure suitability.