Friday, August 21, 2009

Windows 2008 and xp_logininfo

I've come across an issue which appears to be caused by changes in the security of Windows 2008.

Originally I believed this was related to SQL 2008 however I have replicated the same behavior on SQL 2008 and SQL 2005 when running on a Windows 2008 OS.

Scenario
In an environment where you have user accounts existing in a parent domain (e.g. DOMAIN01) and Computer and Service accounts existing in a Child Domain (e.g. DOMAIN02) you may find that SQL Server running on a computer in the Child Domain will report the following error when you perform specific operations from a SQL Server running either SQL Server 2005 or SQL Server 2008 on a Windows 2008 Operating System.

Msg 15404, Level 16, State 19, Procedure xp_logininfo, Line 62
Could not obtain information about Windows NT group/user 'DOMAIN01\matticus', error code 0x5.

Some operations which may cause this are:
  • Attempting to restore a database using a 3rd party application such as Red-Gate SQL Backup
  • Creating a schedule for the re-population of a Full Text Catalogue or creating an SQL agent job with the owner set to a domain account within the Parent Domain (e.g. DOMAIN01)
  • Using xp_logininfo from within an application / script to check the group membership of a domain user account (e.g. exec xp_logininfo 'DOMAIN01\matticus')

This error occurs due to changes in the security of Windows 2008 and specifically changes to the Microsoft Windows API calls used to retrieve information about the Windows domains and the Windows user accounts.

The issue will only occur when SQL Server is attempting to retrieve information regarding Windows Domains and User accounts from the Parent Domain (e.g. DOMAIN01). Queries to AD objects within the same domain (e.g. DOMAIN02) will be successful.


Steps to Reproduce the Issue
The following steps are provided here to reproduce the error within an environment to meets the configuration as stated in the scenario above.
  1. Connect to the SQL Server running in DOMAIN02 using SSMS and an account with SYSADMIN rights to the SQL Instance
  2. Open a New Query window
  3. Run the following TSQL to reproduce the error (replacing the account name with an account within the parent domain)

    exec xp_logininfo 'DOMAIN01\matticus'

  4. You should receive the following error:

    Msg 15404, Level 16, State 19, Procedure xp_logininfo, Line 62
    Could not obtain information about Windows NT group/user 'DOMAIN01\matticus', error code 0x5.

  5. Now run the following TSQL to demonstrate that querying within the same domain is not affected (replacing the account name with the service account within the child domain)

    exec xp_logininfo 'DOMAIN02\sqlserviceacc01'

  6. This should complete successfully and return a result set. The same would occur for any other accounts within the Child Domain.




Solution 1
At this time the only solution I have found to this problem is to add the service account which the SQL Server service is running on to the "BUILTIN\Windows Authorization Access Group" group within the parent domain. This membership could be inherited through another domain group being added to that BuiltIn group if you have a large number of servers to manage in your organisation.

NOTE: In the past I had found a similar case where membership of the "BUILTIN\Windows Authorization Access Group" group in the parent domain was required. However that was for an issue with the SCOM2007 Service Broker and also SSRS Email Delivery where I believe they had their own methods of querying AD group memberships or had started to implement the changes which MS were planning for Windows 2008 as all servers in that case were running on Windows 2003 OS. These were though due to the accounts for those services being different to the main service account for the related application. That has been the only other case where I have seen this behaviour. I found this article which clearly outlines the SSRS Email Delivery issue.


Reference Material

How to troubleshoot a SQL Server 8198 error
http://support.microsoft.com/kb/834124

xp_logininfo (Transact-SQL)
http://msdn.microsoft.com/en-us/library/ms190369.aspx

Some applications and APIs require access to authorization information on account objects
http://support.microsoft.com/kb/331951

xp_logininfo fails for agent account
http://www.sqlservercentral.com/Forums/Topic612567-149-1.aspx

Wednesday, August 19, 2009

Just a few SQL 2008 links

Just some links relating to SQL 2008 which I have found from fellow bloggers which have some really good information:

Management Studio Improvements in SQL Server 2008
http://www.simple-talk.com/sql/learn-sql-server/management-studio-improvements-in-sql-server-2008/


Activity Monitor – SSMS 2008
http://sql-articles.com/blogs/activity-monitor-ssms-2008/
This is a really good break down of the activity monitor including permissions and the different tabs available.

New Challenge from the guys at "TSQL Challenges"

The guys at "TSQL Challenges" have released Challenge #12
http://beyondrelational.com/blogs/tc/archive/2009/08/09/tsql-challenge-12-build-sequential-ranges-of-dates-with-propagation-to-missing-values.aspx

Wednesday, June 10, 2009

When is a Backup Considered Successful?

Yesterday while reading the many industry related newsletters I subscribe to I came across a really good article which is essentially a DBA's "Best Practise" guide for ensuring they have backups available when they need them and that they are successful. I was please to find that there wasn't anything written in this article which I was not already aware of and performing as part of my DBA role. Of cause performing test restores for the ~13000 databases my team manage is not very realist.
I also found that this was a good article for not just the DBA but also for the end user to read and therefore gain an understanding of exactly what is involved in a Recovery / Restore request and therefore assist in setting a realistic expectation.

When is a Backup Considered Successful?
By Steven Bouffard, 2009/06/08 @ SQLServerCentral.com
http://www.sqlservercentral.com/articles/Backup+and+Recovery/66703/

Sunday, April 26, 2009

Useful web site for Report Viewer Control

Tonight while investigating some issues a client was having with the SRS 2008 Report Viewer control I stumbled across this useful web site. http://www.gotreportviewer.com/

Thursday, April 16, 2009

Report Viewer returning "Unable to load client print control" when trying to print a report

This week I came across an issue with the SRS Report Viewer component which I believed we had fixed back in October 2008.... if you cast your minds back you may recall the issue with MS Update KB 956391 and that the update to Kill Bits stopped the Report Viewer's Print functionality. If you don't then read http://blogs.msdn.com/brianhartman/archive/2008/11/05/client-print-fails-to-load-after-microsoft-update-956391.aspx

During the investigation into the issue we found a number of forums also mentioning that the issue had re-occurred so hopefully this helps point people in the right direction and save them time in their investigation.

So back to the client's issue. As with when MS rolled out update KB 956391 if you were to try and print a report you would receive the "Unable to load client print control" error. After reapplying the hotfix KB 954606 and then also upgrading the SQL Server to SP3 without any impact on the issue, I started to look further into the clients configuration.

The clients environment consisted of a front-end Web server which was serving their web site (which includes a front-end custom built report viewer), and also a back-end SQL Server which also has an installation of SRS's Report Manager web interface (and therefore IIS installed) mostly for administration of the report environment. The problem I found was that we were focusing our attention on the SQL Server as an assumption was made that the Report Viewer control was being launched from there. Unfortunately this was not the case, as after checking the clients application I found that their web site (hosted on the front-end web server) actually contained a custom built report viewing tool built in .Net using the Report Viewer component.

Once I had identified that the issue was related to the front-end web site and not the SQL Server, I was then able to track down the following assemblies for the Report Viewer component being servered by their web application's /bin directory, and what I found that these were actually the incorrect versions and were in fact pre hotfix KB954606.

Name: Microsoft.ReportViewer.WebForms.dll
Version: 8.0.50727.42

Name: Microsoft.ReportViewer.Common.dll
Version: 8.0.50727.42

When I compared these to the version of the assemblies in the GAC on the SQL Server I found that these were quite out dated:

Microsoft.ReportViewer.WebForms.dll
8.0.50727.1843

Microsoft.ReportViewer.Common.dll
8.0.50727.1843

TIP: To check the versions of a DLL in the GAC correctly, right click the assembly and open it's properties. Then click the Version tab and the correct version will be reported. The version reported on the General tab and also in the GAC list is not reported correctly.

Once we replaced the out dated DLL's with the latest DLL's from the SQL Server the Print functionality started working again.

On a side note to this I actually found that the Developer had not patched his/her system with either hotfix KB954606 or SP3 for SQL Server and therefore when they published an update to the application they broke the Print functionality again. A lesson should be learnt here to make sure all of your patching is kept up to date, for not only your production systems but also development systems.

To give credit where credit is due this post definitely helped me on my way to identifying the cause.

Tuesday, April 14, 2009

Recently I came across the an issue with MOM 2005 Agents reporting communication issues when trying to contact the Management Server. While investigating the issue I found a number of posts online which indicated had the same symptoms.

Most of what was being commented about online related to MS KB 885416 which involves the following events being reported in the Agent's Application Event Log.

Event Type: Error
Event Source: Microsoft Operations Manager
Event Category: None
Event ID: 21293
User: NT AUTHORITY\SYSTEM
Description: The agent was unable to send data to the MOM Server at ManagementServerName. The error code is 10054. An existing connection was forcibly closed by the remote host.

Event Type: Information
Event Source: Microsoft Operations Manager
Event Category: None
Event ID: 26021
User: NT AUTHORITY\SYSTEM
Description: The agent has restored communication to ManagementServerName


Unfortunately like the MS KB 885416 everything pointed to needing to install a hotfix or Service Pack 1, however like other people posting about this issue I had already installed Service Pack1. In fact this issue is the first thing fixed by Service Pack 1 according to http://support.microsoft.com/kb/905420

After hitting a brick wall with my investigation I ended up contacting Microsoft support for assistance and after providing their technician with a very clear overview of everything I had already investigated he immediately found that there was another error also being logged. This error which he found was:

Event ID: 26009
Category: None
Source: Microsoft Operations Manager
Type: Error

Machine: ComputerName
Message:
The agent could not connect to the MOM Server FQDN.MOM.Server. The error reported is 'No connection could be made because the target machine actively refused it.'. Verify the management group name is correct, the MOM Server is running, that it is listening on port 1270, and that any firewalls between this agent and the MOM server are configured to pass TCP and UDP traffic on port 1270.

Where the MS tech found this error is still beyond me as I could never see it but it was either in the Event Log or the MOM log files from the Agent.

Once this error was detected Microsoft directed me to the MB KB 934441 which matched the problem we were having.

After downloading the Hotfix as described in MB KB 934441 and applying it to the affected Management Server the communication issue was immediately corrected and monitoring restored.

I am still not 100% sure of the cause of this as Microsoft never provided a definite cause for why the errors were being reported but the hotfix has definitely corrected almost every case of this issue we came across.

NOTE: After patching the affected Management Server I also then patched all of my other Management Servers (6 in total) and then proceeded to patch the Agents, which I am still working through.

Querying the pending actions for agents in MOM 2005

Recently I came across a situation where I had patched our MOM2005 Management Servers with a MS hotfix to correct a problem we were experiencing and needed to then rollout the patch to all of our agents. As our environment is a combination of agents with either Control Levels Full or None I needed to determine and keep track of which agents I had patched.

To do this I wrote the following TSQL statement to query the OnePoint database and allow me to export the results to Excell to make it easier to keep track of what was still to be patched.

SELECT C.[Name] AS Name,
C.[Domain],
CM.[Name] AS ConfigManagerName,
(CASE Convert(varchar(20), C.ManagedType)
WHEN '0' THEN 'Unmanaged'
WHEN '2' THEN 'Agent-Managed'
ELSE Convert(varchar(20), C.ManagedType)
END) as 'Management Mode',
(CASE IsNull(dbo.fn_GetControlLevel(C.idComputer),'FUll')
WHEN 'Full' THEN 'Full'
WHEN 'Group' THEN 'None'
ELSE 'UNK'
END) As ControlLevel,
(CASE Convert(varchar(20),C.PendingAction)
WHEN '0' THEN ''
WHEN '1' THEN 'Approve Manual Agent Install'
WHEN '2' THEN 'Uninstall Agent'
WHEN '4' THEN 'Install Agent'
WHEN '5' THEN 'Requires Patching'
ELSE Convert(varchar(20),C.PendingAction)
END) AS PendingAction
FROM
[dbo].[Computer] C
LEFT OUTER JOIN [dbo].[Computer] CM ON C.[idConfigManager] = CM.[idComputer]
WHERE C.IsAddedByServiceDiscovery = 0
ORDER BY C.Name