I Have an Enduring Love for Configuration Manager, but…

…sometimes it tests me…

We have to install SP2 for ConfigMgr before we can start our Windows 7 imaging. The customer we’re working with has SP1 deployed, and the site was originally upgraded from SMS 2003. Suffice to say that the SQL database is in a bit of a mess.

Problem 1.

The installation bombs out with an error in the C:\ConfigMgrSetup.log:

The login already has an account under a different user name.

This turns out to be an issue where we have a SQL account with the NetBIOS name of the server but a login id of the full domain machine account. E.g. The SQL User Name is “SERVER” but the login ID is DOMAIN\SERVER$.

ConfigMgr attempts to add its computer account to SQL (which is also DOMAIN\SERVER$) and gets the above error. Deleting the existing NetBIOS name account isn’t possible as it owns the “SMS Admins” SQL schema.

The solution is to grant another account (NETWORKSERVICE seems to work) ownership of the SMS Admins schema, remove ownership from the NetBIOS-named account, and then delete that account. Once the account is re-added as DOMAIN\SERVER$, you can transfer ownership of the SMS Admins schema back to it.

This was, however, just the start of the problems.

Problem 2.

Error in Setup log:

Cannot create a row of size 8065 which is greater than the allowable maximum of 8060.

I still haven’t got to the bottom of this one, except that compressing the database seems to make it go away… At the same time as we were having these problems, our DB was also growing by around 200MB per hour. There’s some discussion here that this may be related to a post-SP2 hotfix, but I have not been able to confirm this.

Problem 3.

Our database is corrupt. And unfortunately it’s in quite an important table…

Error in the setup log:

The Database ID 58, Page (1:140044), slot 20 for LOB data type node does not exist. This is usually caused by transactions that can read uncommitted data on a data page. Run DBCC CHECKTABLE.

Running DBCC CHECKTABLE tells us that the data is unrecoverable and that only running DBCC CHECKDB with REPAIR_ALLOW_DATA_LOSS is going to fix it. After scrabbling around for a while without joy we ran this and it scrapped around 100 rows from the database. Unfortunately the data is in the CI_ConfigurationItems table. There are a lot of other tables which depend on this, so referential integrity issues await!

Problem 4.

Having “repaired” the database we now essentially have a bunch of orphaned items scattered around the Configuration Manager SQL database. This is a real problem as the SP2 setup routine is going to recreate all of the SQL Foreign Key links and if there’s incompatible data in the linked tables the setup routine will bomb out:

The ALTER TABLE statement conflicted with the FOREIGN KEY constraint "Update_ComplianceStatus_CI_UpdateCIs_FK". The conflict occurred in database "SMS_XXX", table "dbo.CI_UpdateCIs", column ‘CI_ID’.

The table will vary, but essentially what this is telling us is that there are records in the Update_ComplianceStatus table that do not have a corresponding record in the CI_ConfigurationItems table, because they were corrupt and have now been removed.

To identify the offending records I executed the following SQL Query

Select distinct CI_ID
From Update_ComplianceStatus
Where CI_ID not in (select CI_ID from CI_ConfigurationItems)

This results in a few records being returned which represent tens of thousands of status messages, inventory records, whatever… They need to be removed:

Delete from Update_ComplianceStatus where CI_ID = '9420'
Delete from Update_ComplianceStatus where CI_ID = '9644'
Delete from Update_ComplianceStatus where CI_ID = '11666'
Delete from Update_ComplianceStatus where CI_ID = '25041'
Delete from Update_ComplianceStatus where CI_ID = '26261'
Delete from Update_ComplianceStatus where CI_ID = '29627'
Delete from Update_ComplianceStatus where CI_ID = '31677'
Delete from Update_ComplianceStatus where CI_ID = '33272'

I created the above statements in Excel, using Concatenate to add the text to the ID. I recognise that you can do the whole thing programmatically, but this way you can run them one at a time in the SQL interface:

image

I like this, as these kind of things make me a little nervous…
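For anyone without Excel to hand, the same list of statements could be built in PowerShell. This is only a minimal sketch: the IDs are the ones returned by the query above, and the variable names ($orphanedIds, $statements) are mine. It prints the statements rather than executing them, so each one can still be reviewed and run singly.

```powershell
# CI_IDs returned by the "select distinct CI_ID" query above.
$orphanedIds = 9420, 9644, 11666, 25041, 26261, 29627, 31677, 33272

# Build one DELETE statement per ID; nothing is sent to SQL here.
$statements = foreach ($id in $orphanedIds) {
    "Delete from Update_ComplianceStatus where CI_ID = '$id'"
}

# Print the statements so they can be pasted into the SQL interface.
$statements
```

Swapping the table name in the string covers whichever table the next foreign key failure complains about.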

Once these tasks are complete I rerun setup and wait for the next failure in the log, then repeat the above. A lot.

It took EIGHT HOURS!!!!! but:

image

Now I need a pint!

Meeting Jeff Wettlaufer

The crushing disappointment of not making it to MMS this year (due to plate tectonics) was soothed slightly last week by the Best of MMS UK event held at Microsoft’s London offices in Cardinal Place. As well as a chance to see some of the content delivered at MMS (including plenty of demos using Configuration Manager v.Next and Virtual Machine Manager v.Next) it was also a good opportunity to meet up with some former colleagues and some new associates and to swap stories over a beer or two.

It was also a chance to catch up with Jeff Wettlaufer, a Senior Technical Product Manager in the Microsoft System Center division. Jeff’s always interested in the view from the trenches and was kind enough to interview me and Carl from Orinoko about our recent experiences with Windows 7 and System Center. Whether the interview ever sees the light of day, who knows… we’re not exactly well rehearsed in these things so were mainly tongue-tied for the whole thing, but it really did happen:

Jeff, Carl and John, Cardinal Place, 18th May 2010

Adding Server Core Roles in the Task Sequence

We are deploying Hyper-V Server to Dell M610 blades using Configuration Manager OSD. Deploying servers, as opposed to workstations, has some different challenges, though much is the same. In particular it’s critical to manage the drivers as tightly as you would with a workstation deployment. We’ve suffered a few BSODs from having out of date drivers…

When deploying the OS we want to add some roles for clustering, DotNet, PowerShell, etc. It turns out that the MDT Add Roles task isn’t particularly aware of Server Core or Hyper-V Server, so the DotNet and PowerShell commands don’t work. This isn’t a big deal, as we can just use DISM commands. So, to install DotNet 2, we execute:

DISM /Online /Enable-Feature /FeatureName:NetFx2-ServerCore

NB: the feature name is case sensitive. It is also critical to disable 64-bit file system redirection. If you don’t, the 32-bit Configuration Manager agent will execute DISM from SysWOW64 and you will receive an error in the Task Sequence status messages: “The operating system reported error code 11: An attempt was made to load a program with an incorrect format”.

Using PowerShell to Manipulate Clients En-masse

Working with Configuration Manager, I am constantly reminding customers that Configuration Manager is a patient man’s tool. Oftentimes there’s no point in trying to speed the application along, there’s latency built into some of the processes occurring under the Configuration Manager hood and there’s generally not a great deal of point in interfering…

That said, I do like SCCM Client Center from Roger Zander. I personally preferred the look and feel of the old SMS Client Center over the new version, but nonetheless, you can get some great results with the new one.

Anyhow, I digress. As I’ve been mucking about fixing hundreds and hundreds of duplicate GUIDs, I’ve also been monkeying around with PowerShell manipulating clients. I am aware of a number of console extensions that enable you to do this kind of thing to whole collections, but I’m a command-line guy and I am generally working on customers’ infrastructures, so I don’t want to be cluttering up the console. Further to this, I now frequently use a sequenced version of the Configuration Manager console, which makes adding in extensions a little more complex.

As you’re probably aware, PowerShell makes WMI handling nice and easy. I’m a fan of WMIC (WMIC COMPUTERSYSTEM GET MODEL is my 3rd favourite command line trick) but it isn’t remotable, and won’t run through PSExec :-(    I specifically wanted a few hundred machines to give me a DDR this morning, I was in that kind of mood, so to PowerShell.

PowerShell WMI

First thing to do is to create yourself a function. Open the PS console:

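Function GenerateDDR($strComputer)
{
    # Bind to the SMS_Client WMI class on the target machine
    $SMSCli = [wmiclass]"\\$strComputer\root\ccm:SMS_Client"
    # Trigger the Discovery Data Record schedule
    $SMSCli.TriggerSchedule("{00000000-0000-0000-0000-000000000003}")
}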

You’ll need to press return a couple of times after the last line. Now we can execute:

GenerateDDR MYCOMPUTERNAME

Brilliant. There are a few different trigger actions:

Trigger Actions

Hardware Inventory {00000000-0000-0000-0000-000000000001}

Software Inventory {00000000-0000-0000-0000-000000000002}

Discovery Data Record {00000000-0000-0000-0000-000000000003}

Machine Policy Retrieval & Evaluation {00000000-0000-0000-0000-000000000021}

File Collection {00000000-0000-0000-0000-000000000010}

SW Metering Usage Report {00000000-0000-0000-0000-000000000022}

Windows Installer Source List {00000000-0000-0000-0000-000000000032}

Software Updates Scan {00000000-0000-0000-0000-000000000113}

Software Updates Store {00000000-0000-0000-0000-000000000114}

Software Updates Deployment {00000000-0000-0000-0000-000000000108}

There are also a bunch of other client methods:
EvaluateMachinePolicy, GetAssignedSite, PDPMaintenanceTask, RepairClient, RequestMachinePolicy, ResetGlobalLoggingConfiguration, SetAssignedSite, SetClientProvisioningMode and SetGlobalLoggingConfiguration.

So obviously the above function could be modified easily to a GenerateHW, GenerateSW etc. function just by modifying the last two or three digits of the Trigger type.
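Rather than keeping a separate function per action, one way to sketch this is a single function that looks the schedule ID up by name. The function name Invoke-ClientTrigger, the $TriggerIds table and its key names are hypothetical choices of mine; the GUIDs are copied from the list above, and only four of them are included to keep the example short.

```powershell
# Schedule IDs copied from the trigger action list above.
$TriggerIds = @{
    HardwareInventory = "{00000000-0000-0000-0000-000000000001}"
    SoftwareInventory = "{00000000-0000-0000-0000-000000000002}"
    DiscoveryData     = "{00000000-0000-0000-0000-000000000003}"
    MachinePolicy     = "{00000000-0000-0000-0000-000000000021}"
}

# Hypothetical helper: trigger a named client action on a remote machine.
Function Invoke-ClientTrigger($strComputer, $Action)
{
    if (-not $TriggerIds.ContainsKey($Action)) {
        throw "Unknown action '$Action'"
    }
    # Same WMI class and method as the GenerateDDR function
    $SMSCli = [wmiclass]"\\$strComputer\root\ccm:SMS_Client"
    $SMSCli.TriggerSchedule($TriggerIds[$Action])
}

# Example call (needs a reachable client and admin rights on it):
# Invoke-ClientTrigger MYCOMPUTERNAME HardwareInventory
```

Adding another action is then just one more line in the table.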

Now, using the fantastic file handling available in PS we can remotely run this against loads of machines (as long as they’re powered up and accessible…)

To do this we use the magic FOREACH command:

foreach ($StrComputer in Get-Content D:\ClientFix\test.csv) {GenerateDDR $StrComputer}

The output of this (assuming the machines are powered up) is:

image

For me, this is a great way of performing bulk live client manipulation. PowerShell is a superb utility, but unless I use it frequently I find I quickly forget the syntax of the commands and have to start from scratch every time. This year I’ve promised myself I’m going to migrate from black screen to blue, time to move on from CMD.EXE.

I had a good bit of help with this work from Greg Ramsey and Don Jones.

Repairing Duplicate GUID issues

On site with a customer who currently has SMS 2003. We’re replacing SMS with ConfigMgr of course, so we deploy the ConfigMgr client with SMS, feeding it a parameter of SMS_SLP=NEWSERVER, which forces the clients to switch to our new infrastructure. So far so good.

Around 50% of the clients show up pretty much immediately, but around 1500 are missing. It eventually turns out that this is because these 1500 machines all had the SMS client duplicated on them in a Ghost image, and now we’ve maintained that ID through the ConfigMgr upgrade.

A bit of a nightmare.

In the past under SMS we would have just deleted the SMSCFG.INI from the Windows folder and restarted the SMS Agent Host service, which would have regenerated the GUID. These days the ConfigMgr client certificates don’t much like this, so we have to be a little bit smarter about it. We have our own client installation wrapper, SMSSamurai, which we use for the client deployment in the first instance. In a scenario where we know there is going to be difficulty with duplicate GUIDs, we have this wrapper generate a new GUID at install time, but in this instance the duplication was not anticipated.

The main problem now is that the machines need to be powered on to be repaired, so it’s a long, slow process. I’ve messed around with trying to script this in PowerShell, but there’s no WMI provider for regenerating the GUID, so I’ve fallen back on trusty old Tranguid.exe from the SMS 2003 resource kit and PsExec from Sysinternals.

I’ve created a batch file:

REM Cleaning Up Duplicate GUIDS

REM Firstly we kill off the SMS Agent Host (CCMEXEC) service on the remote machine
TASKKILL /S %1 /IM CCMEXEC.EXE

REM Next we generate a new GUID for the machine
PSEXEC \\%1 -c Tranguid.exe /R

REM Now we start the SMS Agent Service back up
SC \\%1 START CCMEXEC

I save this as DeDupe.bat, then executing dedupe.bat brokenclientmachinename will fix the duplicate GUID.

You can check the fix on the client by checking the ClientIDManagerStartup.log:

image

The batch file above restarts the SMS Agent Host service once the GUID has been renewed. This will result, a couple of minutes later, in a new heartbeat discovery (DDR) being generated. If you are feeling particularly impatient you can kick off a DDR as soon as the batch file completes.

The final piece of the jigsaw is to run the batch file against all clients. To do this we create a “Non Clients” collection (where Client=NULL)

select SMS_R_SYSTEM.ResourceID,SMS_R_SYSTEM.ResourceType,SMS_R_SYSTEM.Name,SMS_R_SYSTEM.SMSUniqueIdentifier,
SMS_R_SYSTEM.ResourceDomainORWorkgroup,SMS_R_SYSTEM.Client from SMS_R_System where SMS_R_System.Client is null

Now use View > Export List to dump this collection to a NonClients.CSV file.

Finally, logged on as a user with admin rights on all PCs run the following:

for /f "skip=1 delims=," %L in (NonClients.csv) do Dedupe.bat %L

It is possible to wrap this command in a ping task to check that the machine is powered on before attempting the remote commands which will speed the process up considerably.

Onsite here I’m finding that this is repairing around 200 clients per hour. That’s a little slow, so adding a ping test can improve things a bit.

for /f "skip=1 delims=," %L in (NonClients.csv) do ping -n 1 %L && Dedupe.bat %L

The double ampersand (&&) in this command line tells CMD.exe to run Dedupe.bat only if the ping command returns an errorlevel of 0, which it does when the machine responds.
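The same ping-then-repair loop can be sketched in PowerShell, which also makes it easy to keep a list of the machines that didn’t answer. This assumes NonClients.csv is the console export with a header row and the machine name in the first column, and that Dedupe.bat sits in the current folder. StillOffline.txt is a file name I’ve made up for the retry list.

```powershell
# Assumption: header row first, machine name in the first column.
$names = Get-Content .\NonClients.csv |
    Select-Object -Skip 1 |
    ForEach-Object { ($_ -split ',')[0].Trim('"', ' ') } |
    Where-Object { $_ }

$offline = @()
foreach ($name in $names) {
    # One ping per machine; -Quiet returns True or False
    if (Test-Connection -ComputerName $name -Count 1 -Quiet) {
        & .\Dedupe.bat $name
    }
    else {
        $offline += $name
    }
}

# Hypothetical file name: machines to retry once they are powered on.
$offline | Set-Content .\StillOffline.txt
```

Feeding StillOffline.txt back in on the next pass saves pinging machines that have already been repaired.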

The WMI object contained an invalid value in property BIOSNumLock

Not an error I’ve come across before, but trying to start up a Configuration Manager VM I’ve not used in a while, Hyper-V Manager stated “The WMI object contained an invalid value in property BIOSNumLock” and that the machine was in “saved-state-critical”.

Opening up the properties of the machine and selecting then deselecting the Num Lock option in the BIOS configuration made this go away.

image

Very odd.

Orinoko

Orinoko Logo

Hello my blog-reading faithful, (apologies for the re-post if you saw this briefly a couple of weeks back, user error…). After a rewarding five years at BT Engage IT, (previously known as Lynx Technology) I have decided to move on from employment within a global mega-corporation to a smaller, dedicated System Center consultancy.

I will be working at Orinoko alongside some good friends with whom I’ve done great work over the past 15 years. Cutting the ties of permanent employment (I’m becoming a director and co-owner of Orinoko) has not been an easy decision to make, but I am excited at the potential we have in our new team, and am happy to be able to concentrate solely on the technologies which most excite me.

Orinoko work closely with Microsoft technologies and specifically with the System Center product suite. Our main offerings are aligned as shown here:

image

Using the tools available with Windows Server and System Center, coupled with the expertise of the Orinoko team, we provide industry-leading, high performance systems management solutions which enable our customers to achieve remarkable things.

We have a team with over 30 years’ collective experience of the products in the System Center suite and are actively marketing our services both to our partners and direct to market.

We have already begun our first projects and I hope to be able to keep this blog up to date with details on some of these implementations very soon.

Thanks for dropping by.

John

John.Quirk@orinoko.co.uk

ConfigMgr Disaster Recovery

Some friends of ours recently had some problems with their ConfigMgr infrastructure and ended up with a system which was booting, but the ConfigMgr console showed an unending list of errors… Reporting didn’t function, the distribution manager log was full of checksum errors, lots of WMI and DCOM errors, it wasn’t looking too promising.

After an hour or so of investigation it was obvious that fixing it was going to be more difficult than recovering from backup, so a quick server rebuild and we ran through the following process:

Reinstalling SQL

Step one is to re-add SQL. This was the source of the main gotcha, obviously, it’s pretty important to get the collation order right. If you don’t, what you’ll find immediately you’ve reinstalled and recovered ConfigMgr is that the colleval (and other) logs will fill up with collation match errors. This type of error is a source of unending frustration. I nearly always have the same kind of issue with Package Mapping as the Deployment database the MDT creates has a different collation to the ConfigMgr one.

So, to avoid having to detach databases and reinstall SQL, remember that the default SQL install doesn’t have the correct SQL collation. ConfigMgr requires Latin1_General_CI_AS, whereas the default Latin1_General collation would be Latin1_General_CI_AI. In case this is Klingon to you, the CI and AS bits relate to case and accent sensitivity: CI = Case Insensitive and CS = Case Sensitive; likewise AI = Accent Insensitive and AS = Accent Sensitive.

To install SQL with Latin1_General_CI_AS you have to tick the Accent Sensitive button and clear the Case Sensitive button in the SQL collation setup routine.
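Before reinstalling ConfigMgr on top, it may be worth confirming what the new SQL instance actually ended up with. The following is a rough sketch that uses the .NET SqlClient classes from PowerShell. MYSQLSERVER is a placeholder, the expected value is simply the collation named above, and it assumes you are logged on with Windows rights to the instance.

```powershell
# Placeholder server name; replace with the rebuilt SQL server.
$server   = "MYSQLSERVER"
# Collation named in this post as the one ConfigMgr needs.
$expected = "Latin1_General_CI_AS"

$conn = New-Object System.Data.SqlClient.SqlConnection("Server=$server;Integrated Security=SSPI")
$conn.Open()
try {
    # SERVERPROPERTY('Collation') returns the server-level default collation
    $cmd = $conn.CreateCommand()
    $cmd.CommandText = "SELECT CONVERT(nvarchar(128), SERVERPROPERTY('Collation'))"
    $actual = $cmd.ExecuteScalar()
}
finally {
    $conn.Close()
}

if ($actual -eq $expected) {
    "Collation OK: $actual"
} else {
    Write-Warning "Server collation is $actual, expected $expected"
}
```

Running this straight after SQL setup catches a mismatch before any databases have been restored.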

Reinstalling ConfigMgr

This is pretty straightforward, just reinstall it in the same folder as before.

Recovering ConfigMgr

Again, this is pretty straightforward. I have found that the ConfigMgr Site Repair Wizard (from the Configuration Manager Start Menu folder) can be a little unresponsive when you’re launching it, running it as an admin probably makes a difference, but once it’s launched you’re good to go.

All you now have to do is point the recovery wizard at your ConfigMgr backup (you do have a backup, right?) and it’ll pull the site back together for you.

A Couple of Minor Gotchas

The first problem we had following the restore was that you need to recreate all of the shares you had previously. Obvious, but an easy one to forget, and one which will break your OS builds.

Also, remember to re-distribute your boot media to the PXE service point. Chances are your other package content is still where you left it, so this will be ok, but you’ve reinstalled WDS, so will need to repopulate it.

Reinstall the MDT. If you’re using MDT, naturally it’ll need to be reinstalled. The console integration will be put back by the site recovery, but the wizards won’t work until you reinstall the app.

Create and delete some dummy collections, advertisements, packages. Any objects you created between backup and site loss will be lost now. To avoid any mix-up in the infrastructure it’s a good idea to create a few collections to take the COLLID autonumber beyond anything you might previously have created. The same goes for packages and adverts. This only takes a couple of minutes and can avoid some headscratching later when machines start installing things they weren’t supposed to.

Expect some inventory resyncs. Any machine which submitted its inventory data in the period between site backup and recovery will, in a week’s time (depending on inventory windows of course), send in updated inventory. ConfigMgr will not like this, as it will feel like it’s missed out on some inventory, so it will request a full resync from the client. These will show up as warnings in the Inventory Dataloader. Don’t worry about these, it’s perfectly normal.

A Failure Exit Code of 16389 was returned

 

image

I’ve been struggling with this a little, so thought it prudent to add to the glut of improbable remedies to 16389 which litter the internet.

Here I’m trying to build a Windows Server 2008 R2 server using ConfigMgr. The whole thing looks fine until the Windows Setup routine kicks off whereupon I get this error. Accessing a command prompt on the failed machine gives me access to the logs, interestingly there’s nothing on the C: drive other than the log folder:

image

Delving into the SMSTSLog folder there are the usual suspects, none of which give me anything unusual, but in the C:\SMSTSLog\WindowsSetupLogs folder there’s a SetupAct.log, this is setup actions (not accounts…) and describes in detail each action carried out during the unattended Windows Server installation. Reading backwards from the end I quickly happen across these lines:

-------
Fallback_Productkey_Validate_Unattend: An error occurred preventing setup from being able to validate the product key;

PublishMessage: Publishing message [The unattend answer file contains an invalid product key. Either remove the invalid key or provide a valid product key in the unattend answer file to proceed with Windows installation.]

This installation is blocked from completing due to compliance failures or invalid input; this is not an internal error.

-------

So if you were ever wondering what happens in ConfigMgr if you mix up your Windows Standard product key with your Datacenter edition one, now you know!

Windows 7 Reference Task Sequence Creation With ConfigMgr and MDT Integration

A customer earlier in the week had implemented ConfigMgr for their builds and was getting good results with it. They hadn’t implemented MDT as they couldn’t see the benefit, so with this series of posts I’m going to highlight why we mostly do it this way, and what benefits using MDT Task Sequences brings.

Now that SP2 for ConfigMgr is in Release Candidate (and due for RTM at the end of October according to Mr Niehaus) we can use this stuff for Windows 7 deployment.

First up, install SQL, ConfigMgr, its dependencies, and MDT 2010 RTM.

Now, integrate MDT with ConfigMgr by clicking

image

Now open the ConfigMgr console. Nothing much has changed, but you have a couple of new options when you right-click in the OS Deployment node. You can create MDT Boot Media by clicking in Boot Images and you can create an MDT Task Sequence by clicking in Task Sequences. Let’s do that now!

image

When we do this we are prompted to pick a template. So, here’s the first benefit with MDT. More pre-configured templates:

image

The standard ConfigMgr task sequence only gives three options:

image

Further to that, the ConfigMgr standard Task Sequence expects you to have set up all of the dependent packages, boot images, etc. yourself before running through the wizard. MDT will create them for you if required…

Ok, so we pick to deploy the MDT Client Task Sequence, I can already tell that this is going to be a rambling post, but the first thing of note is that you no longer have to provide a capture destination if you’re not going to be doing a capture. Hurray, a minor irritant squashed (it’s the little things…)!

image

That said, I need to capture this build, so I fill in the box.

With the standard ConfigMgr task sequence, I’d need to select one of the pre-built boot images, but I want the goodness of ADO and other brilliant things, so I get MDT to make a special one just for me:

image

The MDT asks if you want other languages and a custom wallpaper (I always make mine in PowerPoint, some of the templates make for pretty wallpapers; when there’s all this technology around there’s still no getting away from the fact that the customers like their logos on things, and why not). On the same screen as the wallpaper, language, ADO options etc. you can also provide an extra directory to add. I put my diag tools in here (Trace32.exe etc.); they make life easier if you have problems in PE.

image

I create a Deployment Toolkit File Package. This holds all the scripts and bits that the MDT task sequence needs. Those of you still with us may notice that I put everything in a sub-folder of a root folder called OSImaging. This keeps things nice and tidy as far as I’m concerned, and is something I recommend.

image

Now MDT wants to create our OS package for us. Again under the standard ConfigMgr task sequence you’d have to do this outside of the wizard.

It also creates the ConfigMgr client package for you. Again, it’s not hard to do yourself, but why bother when the wizard can sort you out…

image

and USMT package:

image

Last thing is the MDT Settings Package. This handles the unattend.xml and customsettings.ini files.

image

We don’t need Sysprep, so can skip the final screen and then we’re ready. The wizard goes off and creates all the objects listed above.

image

Once it’s finished we just need to add the packages created to distribution points (this includes the OS install, so it can take a little while). I’ve got a PXE Service Point, so I add my new boot image to that DP too.

Next we’ll deploy and capture this and then start to look at the clever stuff we can do with the MDT integration to streamline deployment and support advanced deployment scenarios.
