MatticusAU Blog: SQL 2016

Showing posts with label SQL 2016. Show all posts

Tuesday, December 20, 2016

Why is the with_copy option critical when restoring a database configured with StretchDb

StretchDB is a new capability in SQL Server 2016 which provides the ability to effectively partition your data between warm and cold data, and more importantly allow SQL Server to automatically host that data in an Azure SQL Database.... sounds great in concept and in cases where you have a large amount of cold data which just needs to be queried and not updated (e.g. archived data for reporting).

Now, there is one critically thing you really need to be aware of and that is when it comes time to Restore or more specifically Migrate the on-premises database that is stretched. Let's take this scenario for example.

You are planning to upgrade the production SQL Server to SQL Server 2016 SP1. The business needs do not allow for an in-place upgrade, so you have built a new server and are planning to migrate production database to the new server.

It could be a DR scenario but more likely the issues I am talking about will be during migrations.

The issue you need to be aware of is that after restoring the on-premises database you then need to reauthorize the connection to the Azure SQL Database (official details at https://msdn.microsoft.com/en-us/library/mt733205.aspx). When you perform this step, you have to specify the parameter WITH_COPY which can either point the database to the existing remote table in the Azure SQL Database, or create a new copy of the remote table to use. The recommended approach is to use a new copy of the remote table (with_copy = 1). I would also recommend this approach for database migrations, and then manually clean up the old table. The reason I recommend this is as I will show in this post, if your old/current database is still in use and you point a new database to the same remote table, you can experience conflicts between the data reconciliations and this will result in cold data loss.

So let's explore the issue.

Firstly here is the environment I am working in:

On-premises SQL Server 2016 SP1 (13.0.4001.0). I have one database StretchDBDemo which contains the following tables:

dbo.Department

DepartmentID, DepartmentName, OfficeLoc

dbo.People

PersonID, FullName, IsEmployee, PhoneNumber, EmailAddress

The full schema is available in the script at the end of this post.

We are going to use the dbo.People table to demonstrate this issue. So lets start with some data.

INSERT INTO [dbo].[People] ([PersonID], [FullName], [IsEmployee], [PhoneNumber], [EmailAddress]) VALUES (1, 'Matt', 1, '0','yourname@email.com');
INSERT INTO [dbo].[People] ([PersonID], [FullName], [IsEmployee], [PhoneNumber], [EmailAddress]) VALUES (2, 'Steve', 0, '0','yourname@email.com');

Next step is to setup StretchDB feature. This has to be done through the GUI and is pretty straight forward but here are the steps I have used:

Right click on the Database StretchDBDemo in Object Explorer
Select Tasks > Stretch > Enable

In the GUI select Next
Place a tick in the box next to the dbo.People table. Notice the warning icon. Important to note as it does have some relevance to the behaviour we will explore.
Click the "Entire Table" link under the Migrate column to launch the filter wizard

Setup a filter called IsEmployee False, with the predicate "IsEmployee = 0". Click the Check button and then Done.

NOTE: If you don't have any rows in the dbo.People table that match this predicate you won't be able to proceed past this point.

Click Next and then authenticate to Azure.
Select either an existing Azure SQL Database Server or create a new StretchDB server.
Follow the wizard to completion. I recommend taking a screen snippet of the summary page for reference.

You can then check that the StretchDb feature is working by using the Monitor from the Stretch menu on the Database object.

You could also query the following tables to check this data manually
```
SELECT * FROM sys.dm_db_rda_migration_status
SELECT * FROM sys.dm_db_rda_schema_update_status
```

You will need to check the configuration of the StretchDb, this is critical as we need the Azure SQL Server address.

-- the db config

SELECT * FROM sys.remote_data_archive_databases
-- the tables config
SELECT * FROM sys.remote_data_archive_tables
-- the data source
SELECT * FROM sys.external_data_sources

Now that you have the remote server's address, in SSMS Object Explorer connect to the Azure SQL Database server

Expand the Databases branch in Object Explorer, expand your stretched remote database (the name is in the tables queried above). Then expand Tables. Note how no tables are displayed here. Any stretched tables are deliberately hidden and you should not query those tables in general practice. However we are troubleshooting/demonstrating an issue so we will query the tables directly.
Query sys.Tables to find the name of the stretched table
Now query the stretched Table in the Azure SQL Database. You should have 1 record for 'Steve'.

Lets add a new record to our on-premises database that will be stretched to the remote server.

INSERT INTO [dbo].[People] ([PersonID], [FullName], [IsEmployee], [PhoneNumber], [EmailAddress]) VALUES (3, 'Chris', 0, '0','yourname@email.com');

Using the Monitor or the TSQL queries from earlier check on the status of the StretchDb feature. After some time query the stretched table in the Azure SQL Database again to make sure the new record is stretched there.

Now the fun part starts. Lets simulate a database migration. In my lab I am just using the same server, but the experience is the same.

Take a backup of the on-premises StretchDbDemo database. You could elect to disable StretchDb first and bring all the data back on-premises, but we trust the Azure backups to keep our data in the cloud safe.
Next restore the backup to the new database we are migrating the application to. I have used the database name StretchDbDemoNew.
Once the database is restored it isn't immediately Stretching database as it needs to be re-authorized to use the remote server. This is outlined in https://msdn.microsoft.com/en-us/library/mt733205.aspx.

First get the name of the credential that exists. If you were migrating this to another server you will need to recreate the credential on that server. The name of the credential needs to match the remote server address.
```
SELECT * FROM sys.database_scoped_credentials
```

Now use the following statement to

-- reauthorize using the existing credential information
USE [StretchDBDemoNew];
GO
EXEC sp_rda_reauthorize_db 
  @credential = N'',
  @with_copy = 0;
GO

Here is where the problem begins.

When you execute sp_rda_reauthorize_db and specify with_copy = 0, it uses the existing Azure SQL Database as the endpoint. So now we actually have two on-premises databases StretchDBDemo and StretchDBDemoNew, both pointing to the same Azure SQL Database table for stretched data. If you specify with_copy = 1, it actually creates a copy of the data in a new table and therefore the old and new on-premises databases have different stretch data end-points.

So why does this matter. Well lets add some data to the new on-premises database because well we want to make sure it's all work as you would in any migration.

INSERT INTO [StretchDBDemoNew].[dbo].[People] ([PersonID], [FullName], [IsEmployee], [PhoneNumber], [EmailAddress]) VALUES (10, 'Bill', 0, '0','yourname@email.com');

HINT: Any rows I insert into the new database I will use a PersonID >= 10 as this helps with the demonstration.

Now this should get stretched to the Azure SQL Database. So switch to that query window and check it's there.

NOTE: it might take a few minutes for the reconciliation to occur.

Here is where things get interesting. After a few minutes try querying the data again and see if there is any change. While writing this post the remote record disappeared, however when writing the scripts it didn't. So what is happening..... to find out lets compare local and remote data more.

With the new record still in Azure, query the old on-premises database. If your lucky you will find the record inserted into the new database which has been stretched to Azure also returned when you query the original database. Your experience here may differ as it's all a matter of timing.

Just to add fuel to the fire I inserted another record into my new database that would be stretched to Azure.

INSERT INTO [StretchDBDemoNew].[dbo].[People] ([PersonID], [FullName], [IsEmployee], [PhoneNumber], [EmailAddress]) VALUES (11, 'Bart', 0, '0','yourname@email.com');

But what about our existing on-premises database. Remember in my scenario we are only testing the migration so you would assume there are still active connections querying that database and potentially data being updated which would then stretch into Azure. So lets insert two records into that database which will be stretch.

INSERT INTO [StretchDBDemo].[dbo].[People] ([PersonID], [FullName], [IsEmployee], [PhoneNumber], [EmailAddress]) VALUES (4, 'Chad', 0, '0','yourname@email.com');
INSERT INTO [StretchDBDemo].[dbo].[People] ([PersonID], [FullName], [IsEmployee], [PhoneNumber], [EmailAddress]) VALUES (5, 'Dan', 0, '0','yourname@email.com');

So in our existing database we now have this data being returned

What about in our new database.

Wait a minute, those records 'Chad' and 'Dan' were just inserted into our existing database, and where did our records for 'Bill' and 'Bart' go.

Here is the issue, with both on-premises databases pointing to the same remote database through the with_copy = 0 option, it means that the reconciliation processes in the databases will conflict. One of the databases will effectively become the master and overwrite what records the other sends. As I mentioned both times I ran through the scripts I got different experiences where one time the new database was the master, and the other time (writing this blog) the original database was the master and overwrote the records.

So, the good news is that if you use with_copy = 1 option then the databases use separate remote tables and therefore do not conflict. So my recommendation is ALWAYS USE "WITH_COPY = 1" when reauthorizing a database to the remote server.

In saying that a few things to keep in mind, over time if you move the database a lot you could end up with lots of tables in the remote database so best to monitor that if you want to keep your costs down.

Also if at any stage you are querying the stretched tables in the on-premises databases and you have this conflict situation you could experience a number of cases of this error.

I hope this blog post helps when it comes time to planning your migrations. Any questions or feedback please leave them in the comments section.

Get the complete demo scripts here.

Legal Stuff: The contents of this blog is provided “as-is”. The information, opinions and views expressed are those of the author and do not necessarily state or reflect those of any other company with affiliation to the products discussed. This includes any URLs or Tools. The author does not accept any responsibility from the use of the information or tools mentioned within this blog, and recommends adequate evaluation against your own requirements to measure suitability.

Friday, April 15, 2016

TSQL To identify queries which are candidates for Parameterization

While discussing the concepts of 'optimize for ad hoc workloads' and 'Forced Parameterization' in SQL Server I decided to modify an existing query I wrote to analyse the query performance from the cache (more on it here) to now analyse the adhoc queries for candidates which could benefit from parameterization.

;WITH CTE(QueryCount, StatementTextForExample, plan_handle, QueyHash, QueryPlanHash, CacheObjType, 
    ObjType) 
AS 
( 
   SELECT
    COUNT(query_stats.query_hash) AS QueryCount
    , MIN(query_stats.query_text) AS StatementTextForExample
    , MIN(query_stats.plan_handle) AS plan_handle
    , query_stats.query_hash AS QueryHash
    , query_stats.query_plan_hash AS QueryPlanHash
    , query_stats.CacheObjType
    , query_stats.ObjType
   FROM  
       ( 
           SELECT  
      qs.query_hash
      , qs.query_plan_hash 
      , qs.plan_handle
      , cp.cacheobjtype as [CacheObjType]
      , cp.objtype as [ObjType]
      , SUBSTRING(qt.[text], qs.statement_start_offset/2, ( 
                   CASE  
                       WHEN qs.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX), qt.[text])) * 2  
                       ELSE qs.statement_end_offset  
                   END - qs.statement_start_offset)/2  
               ) AS query_text 
           FROM  
               sys.dm_exec_query_stats AS qs 
      INNER JOIN sys.dm_exec_cached_plans cp ON cp.plan_handle = qs.plan_handle
               CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS qt 
           WHERE qt.[text] NOT LIKE '%sys.dm_exec_query_stats%' 
     AND cp.objtype = 'AdHoc'
     --AND qs.last_execution_time BETWEEN DATEADD(hour,-1,GETDATE()) AND GETDATE() --change hour time frame
       ) AS query_stats 
 GROUP BY query_stats.query_hash
  , query_stats.query_plan_hash
  , query_stats.CacheObjType
  , query_stats.ObjType
 HAVING COUNT(query_stats.query_hash) > 1
) 
SELECT  
   CTE.QueryCount
   , CTE.CacheObjType
   , CTE.ObjType
   , CTE.StatementTextForExample
   , tp.query_plan AS StatementPlan
   , CTE.QueyHash
   , CTE.QueryPlanHash 
FROM 
   CTE 
   OUTER APPLY sys.dm_exec_query_plan(CTE.plan_handle) AS tp 
ORDER BY CTE.QueryCount DESC; 
--ORDER BY [Total IO] DESC;

When you identify these candidates you then need to look at what is the most appropriate resolution such as.

Rewrite the query at the application layer to ensure it is called with parameterization from the data provider
Rewrite the query as a stored procedure

Enable 'optimize for ad hoc workloads' on your SQL instance

EXEC sys.sp_configure N'optimize for ad hoc workloads', N'0'
GO
RECONFIGURE WITH OVERRIDE
GO

Enable 'Forced Parameterization' on the affected database

USE [master]
GO
ALTER DATABASE [AdventureWorksPTO] SET PARAMETERIZATION SIMPLE WITH NO_WAIT
GO

Of cause the appropriate option really depends on a larger view of your environment and applications that only you can determine. If you do have any thoughts on this please feel free to add them to the comments below to help others.

Word of caution too. Before adjusting any sp_configure settings with RECONFIGURE make sure you run the following to check for any settings which are not yet active. It is expected to see 'min server memory (MB)' in the results of this though if you leave the setting default at 0 as SQL must reserve the minimum memory possible which is 16mb.

SELECT * FROM sys.configurations
WHERE Value <> value_in_use

Thursday, October 29, 2015

Latest SSMS stream lines the process for adding Azure Firewall Rules at connection

If you haven't heard, Microsoft have broken the release of SQL Management Studio away from the main product. While at this stage it is continuing to be released at the same cadence as the SQL 2016 CTP 2 cycle (e.g. Monthly) the updates are targeting features for existing releases, Azure SQL, and of cause SQL 2016. You can get the latest release @ http://aka.ms/ssms.

I have been working on a project of late with an Azure SQL DB back-end and one thing that has always frustrated me given my role keeps me out in the field is that I am constantly having to log into the portal and update the firewall rules....... Well the latest release of SSMS makes this process much easier.

Now when you try and connect to an Azure SQL Server and a firewall rule on the server blocks your connection, you are no longer prompted with the message informing you that the client is not allowed to connect. Now you are prompted with a dialog asking you to log into Azure and create a firewall rule. It even populates the client IP Address for you and suggests an IP range.

One you authenticate, select the appropriate option for the Firewall Rule (e.g. static or IP range). Click OK and SSMS will place a call to the Azure web services to add the firewall rule.

Once the rule is added, the authentication process continues which is a nice touch because I was half expecting to be prompted to log in again.

Obviously for this to work the user then needs the required permissions on the subscription

This is a great example of how the team is actively working on improving the toolsets and responding to feedback so make sure you keep the connect items rolling in.

In other news, I am presenting with a colleague at MS Ignite Australia next month on some of the new toolset features in SQL 2016. If your attending make sure you attend and come say hi as I will be around the Data Den and Exhibition hall throughout the event.

https://msftignite.com.au/sessions/session-details/1524

Legal Stuff: As always the contents of this blog is provided “as-is”. The information, opinions and views expressed are those of the author and do not necessarily state or reflect those of any other company with affiliation to the products discussed. This includes any URLs or Tools. The author does not accept any responsibility from the use of the information or tools mentioned within this blog, and recommends adequate evaluation against your own requirements to measure suitability.

Friday, May 29, 2015

SQL Server 2016: Query Store first looks

In case you missed the announcement SQL Server 2016 CTP2 preview was publicly announced today. You can read about it in the following post:

http://blogs.technet.com/b/dataplatforminsider/archive/2015/05/27/sql-server-2016-first-public-preview-now-available.aspx

There are already some training materials up on https://technet.microsoft.com/en-us/virtuallabs

The first new feature I noticed was a new type of actual query plan to include, Live Query Statistics.

When enabled the actual query plan is displayed as the query is executed, however it now also displays the real time progress as each stage of the query is executed. You can see this in the following screen shot where the Sort is at 65% (and subsequently all tasks after it are also at 65%). I wasn't fast enough to capture it before it reached the sort but it really does visually show you real time query execution. I'm going to have to find a really large workload to try it out on ;)

One of the new feature Query Store is one I have been "geeking out" on since earlier in the year and I am so glad I can now talk about it publicly and here is my first look at the feature first hand.

First off when you view the properties of a database there is a new configuration section of "Query Store" as seen in the following screen shot.

Before you can access this feature you need to enable it on each database. Like all data collection there is a small resource overhead but it's minimal due to the layer of integration so just enable it.... and given the product is in preview status should there be performance impact file a Connect Item and let the product team know.

So now you have enabled it you will have the ability to customise the data frequency, buffer hardening (flush), data retention/grooming, etc. Here are the defaults.

In my lab I have adjusted these values to be more granular, for the purpose of this blog, but you will need the find the sweet spot for your environment ;)

In the earlier screen shot of the database properties you can also see that there are graphical representations of the Query Store storage usage and a button to flush the storage if you need to.

Now if you refresh the database in Object Explorer you will see a new Query Store branch which contains some useful views.

These views offer some interesting data that previously we would have had to use extra tools to obtain.

The following shows the Overall Resource Consumption view.

Each of the views has the ability to be configured to adjust the metrics, date range, etc

The most interesting of the views in my opinion is the Top Resource Consuming Queries as this shows the top X queries on the top left box, and the query plans which have been used to return the results in the right top box.

When you select a "plan id" it displays the cached plan for that id in the lower pane.

Why is the so interesting, well like you can see in this screen shot I have had two plans for the same query. One with Parallelism and one without..... and from the bubble chart I can see which one has executed at times with the highest resource consumption and which has executed with the least resource consumption (based on my charts configuration). So from this data you can then make a choice if you should "force" a plan, or if you have forced the plan previously then you can 'unforce' it.

This makes all that effort of trying to figure out if a query plan has changed and is causing a degradation in performance, and then having to extract that plan and create the guide for it. Or many other changes in patterns of performance behavior.

Behind the scenes there is a great deal of data being captured about query execution to which we can access from a number of system tables.

For example on the table sys.query_store_query you can see the data around the internals of query times such as compilation time, parse time, etc.

I am definitely looking forward to exploring this feature further and all the time I am going to get back when looking into performance issue.

Obviously this is CTP2 and features/functionality will change before RTM.

Legal Stuff: As always the contents of this blog is provided “as-is”. The information, opinions and views expressed are those of the author and do not necessarily state or reflect those of any other company with affiliation to the products discussed. This includes any URLs or Tools. The author does not accept any responsibility from the use of the information or tools mentioned within this blog, and recommends adequate evaluation against your own requirements to measure suitability.

Thursday, May 7, 2015

SQL Server 2016 annoucement

SQL Server 2016 has been publically announced with the public Release Candidate coming in the American Summer. I will be using this post to collate blog posts and announcements regarding SQL 2016, and in particular the features which I am looking forward to.

SQL Server 2016 public preview coming this summer
http://blogs.technet.com/b/dataplatforminsider/archive/2015/05/04/sql-server-2016-public-preview-coming-this-summer.aspx

Query Store
http://slavasql.blogspot.com.au/2014/11/newest-sql-server-feature-query-store.html

Legal Stuff: As always the contents of this blog is provided “as-is”. The information, opinions and views expressed are those of the author and do not necessarily state or reflect those of any other company with affiliation to the products discussed. This includes any URLs or Tools. The author does not accept any responsibility from the use of the information or tools mentioned within this blog, and recommends adequate evaluation against your own requirements to measure suitability.