Saturday, April 13, 2013

A look into SQL Server's Plan Caching when object names are not fully qualified

The other day while demonstrating the circumstances which control when SQL Server's Optimiser will create a new query plan or reuse a plan from cache I was asked an interesting question. While I hadn't come across the specific scenario before, I presented my logic based on my knowledge of the optimiser's behaviour and that it would be an interesting investigation. In this blog is my investigation and I am pleased to say my initial answer (based on logical thought) was correct.

Scenario
We have 2 Tables in a database with the same name but owned by separate schema's, if a user executes a query against one of those tables but does not fully qualify the table name using the owning schema (i.e. relies on their default schema to qualify the name), and another user who's default schema would qualify to the other table within the database runs an identical query, would the Optimiser use the same plan for both.

My Initial Thoughts
While not having investigated this specific behaviour before, my knowledge of the Optimisers behaviour is that it would try and fully qualify the name while generating the Query Plan and therefore create separate plans regardless of if the object name is fully qualified or not.

The Investigation
To reproduce this behaviour I have created a test database [CacheTest] and within that database 2 schemas, TestSchema1 and TestSchema2, along with this 2 logins, TestUsr1 and TestUsr2 which are mapped to the database and schema's respectively. Also within the database are 2 tables, one for each schema [TestSchema1].[ProcCacheTest] and [TestSchema2].[ProcCacheTest], and each table has been populated with some test data.

Lets start by ensuring we have a nothing in our plan cache so it is easier to witness the behaviour, obviously you don't want to do this on a production system.

DBCC FREEPROCCACHE

Now we need to run a query as each user individually.

Open a New Query Window and change the connection to authenticate as [TestUsr1] and then run:

SELECT * FROM [TestSchema1].[ProcCacheTest]
WHERE SomeDate = '19000107'

Then open a New Query Window and change the connection to authenticate as [TestUsr1] and then run.

SELECT * FROM [TestSchema2].[ProcCacheTest]
WHERE SomeDate = '19000107'

Now check the cache, for that I am going to use the script that I wrote in my blog here. Nothing special in these results as we have specified the Schema name so we get 2 individual Query Plans cached.



So lets find out what happens when we don't specify a Schema name (i.e. we don't Fully Qualify the object name). Run the following from the connection context of [TestUsr1]:

SELECT * FROM [ProcCacheTest]
WHERE SomeDate = '19000107'

So what does the cache tell us.


As you can see a new Query Plan is cached. Again nothing unexpected here as we changed the text being executed (removed the schema name) so the query text no longer matches and we know that even the inclusion of an extra White Space, Line Break or even a change in Character Case would cause this behaviour.

How about if we execute this exact same query under the context of the [TestUsr2], and be sure that it is the exact same query with no extra White Space or variance in text.

SELECT * FROM [ProcCacheTest]
WHERE SomeDate = '19000107'

The results of this query should return data from the [TestSchema2].[ProcCacheTest] table, even though we don't specify the schema because the users default schema will map back to the table.

After executing the query lets check the cache again.


Whoa, we now have a 4th Plan in the cache, but wait the Query Text looks identical yet the optimiser has generated a new plan. But how can we be sure we are looking at two completely different statements? The sys.dm_exec_query_stats DMV can help us there as it includes the Query_Hash column, and this is the binary hash value calculated on the query and used by the Optimiser to identify queries with similar logic.

So lets add that into the result set.



As you can see the Query Hash for the 2 queries are different, but wait the Query Hash for the queries in rows in 1 and 4 are identical, and so is the Hash for queries 2 and 3 which match the pairs of statements we executed under each user context. So this verifies to us that the Optimiser is creating the Query Plan based on the Fully Qualified Object path, not just the text we enter (in fact you can verify this even further by looking at the Query_Plan_Hash as it shows the same paired values).

We can also use the query_plan output of the sys.dm_exec_query_plan DMV to view the xml or graphical plan, and from within the plan we can see that the Predicate has the fully qualified object name.



Wow this is great, but lets just be sure because we know some connection providers will use the Prepare / Execute method or maybe we are using the Forced Parameterisation option. So lets test with the sp_executesql to ensure parameterisation of the SQL text.

I executed the following in the context of [TestUsr1]

exec sp_executesql @stmt = N'SELECT * FROM [ProcCacheTest] WHERE SomeDate = @1'
, @params = N'@1 varchar(8)'
, @1 = '19000107';

And then the following in the context of [TestUsr2]

exec sp_executesql @stmt = N'SELECT * FROM [ProcCacheTest] WHERE SomeDate = @1'
, @params = N'@1 varchar(8)'
, @1 = '22000107';

Note that it was my intention to use a different date, due to the data that I have in my [TestSchema2].[ProcCacheTest] table and we have already proved that it will not perform a direct match.

So we check the cache again.


Again we see the same behaviour, the Optimiser creates a new Query Plan in the cache, but it recognises that the statements relate to each other based on the Fully Qualified Object path (see the Query Hash or Query Plan Hash columns).

So after all this we can see that the Optimiser is working as we expect, and mapping the Query Plans to the individual objects based on qualified object paths regardless of if the user names the objects correctly or not. Of cause the recommendation should always still be to qualify the name of the object with the Schema to reduce this additional work that the optimiser is required to do as well as some of the other issues that it can cause.

Thanks for reading and props to the student in the recent workshop that spiked my curiosity on this subject :)

Legal Stuff: The contents of this blog is provided “as-is”. The information, opinions and views expressed are those of the author and do not necessarily state or reflect those of any other company with affiliation to the products discussed. This includes any URLs or Tools. The author does not accept any responsibility from the use of the information or tools mentioned within this blog, and recommends adequate evaluation against your own requirements to measure suitability.

No comments:

Post a Comment