DFSR Failure After VM Restore (DFSR Error 2104)
I have an environment that heavily leverages DFS, and recently one of the replication member servers had to be restored from a Veeam backup. Veeam is typically great and doesn't cause any issues; in this case, though, DFS completely broke. I got a TON of SCOM alerts, and the event log was littered with errors as well.
Event 2212, DFSR
The DFS Replication service has detected an unexpected shutdown on volume D:. This can occur if the service terminated abnormally (due to a power loss, for example) or an error occurred on the volume. The service has automatically initiated a recovery process. The service will rebuild the database if it determines it cannot reliably recover. No user action is required.
Additional Information:
Volume: D:
GUID: xxxxxxxxxxxxxxx
Event 2104, DFSR
The DFS Replication service failed to recover from an internal database error on volume D:. Replication has been stopped for all replicated folders on this volume.
Additional Information:
Error: 9214 (Internal database error (-1605))
Volume: xxxxxxxxxxxxxxxxxxxxxxxx
Database: D:\System Volume Information\DFSR
The important error here is 2104, noting the database issue. There are multiple threads out there that talk about this, but they all end up linking back to the same Microsoft support article.
In the end, the database used by DFS Replication has become corrupted. It's a system-generated database, so all you need to do is stop the replication service, delete the database, and start the service back up. Easy? No. There are a myriad of issues with doing this, mostly because the database lives in "System Volume Information" on the volume that hosts the DFS root folder, or wherever you've placed the replication targets. Luckily for you, I hit my head against a wall for hours on end and figured out the solution.
Step 1: Stop the DFSR service (Stop-Service DFSR)
Step 2: Give yourself visibility into the "System Volume Information" folder. In Explorer's Folder Options, select "Show hidden files, folders, and drives" and uncheck "Hide protected operating system files".
Step 3: Grant yourself proper permissions to the "System Volume Information" folder. Go to the root of the volume that holds the replication targets, e.g. D:\. You will now see a grayed-out folder with a lock on it called "System Volume Information". Go through the normal rigmarole on the Security tab to grant "Administrators" full control over the folder. You should then be able to open it; before, it would have said "Access Denied".
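If you'd rather script the permission grant than click through the Security tab, icacls from an elevated PowerShell prompt does the same thing; a minimal sketch, assuming the replicated volume is D::

# Grant the local Administrators group full control of the folder.
# If the ACL change is refused, take ownership first: takeown /F "D:\System Volume Information"
icacls "D:\System Volume Information" /grant Administrators:F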
Step 4: Delete or rename the "DFSR" folder inside "System Volume Information". Unfortunately, that's not easy either. From what I saw, the file names in the database folder exceed Explorer's path length limitations ( https://hansencloud.com/2014/04/22/varying-file-name-too-long-issues ). The easiest tool here is the wonderful Robocopy /MIR. Create an empty folder in the root of the drive and mirror it onto the DFSR folder using robocopy's /MIR flag, which "mirrors" the source folder into the destination and so deletes everything the destination contains.
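Here's a minimal sketch of that mirror trick, assuming the replicated volume is D: and a hypothetical scratch folder named Empty:

# Mirror an empty folder over the database folder; /MIR makes the destination
# match the source exactly, which deletes everything inside DFSR.
New-Item -ItemType Directory -Path D:\Empty
robocopy D:\Empty 'D:\System Volume Information\DFSR' /MIR
Remove-Item D:\Empty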
Now the DFSR folder should be completely empty.
Step 5: Start the DFS Replication service (Start-Service DFSR)
Step 6: Check the event log for events validating the recovery, like the two below.
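You can also watch for them from PowerShell; a quick sketch:

# DFSR logs to the "DFS Replication" event log; 4102 = folder initialized and
# awaiting initial replication, 4412 = conflict resolved during the resync.
Get-WinEvent -FilterHashtable @{ LogName = 'DFS Replication'; Id = 4102, 4412 } -MaxEvents 20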
Event 4102, DFSR
The DFS Replication service initialized the replicated folder at local path D:\xxxxxx and is waiting to perform initial replication. The replicated folder will remain in this state until it has received replicated data, directly or indirectly, from the designated primary member.
Additional Information:
Replicated Folder Name: XXXXXXX
Replicated Folder ID: XXXXXXXXXXXXXXXXXXXX
Replication Group Name: XXXXX\XXXX
Replication Group ID: XXXXXXXXXX
Member ID: XXXXXXXXXXXXX
Event 4412, DFSR
The DFS Replication service detected that a file was changed on multiple servers. A conflict resolution algorithm was used to determine the winning file. The losing file was moved to the Conflict and Deleted folder.
Additional Information:
Original File Path: D:\XXXXXXX
New Name in Conflict Folder: XXXXXXXXXXX
Replicated Folder Root: D:\XXXXXXXX
File ID: XXXXXXXXXXXXXXXX
Replicated Folder Name: XXXXXXXXXXXX
Replicated Folder ID: XXXXXXXXXXXXXXX
Replication Group Name: XXXXXXXXXXXXXX
Replication Group ID: XXXXXXXXXXXXXXXXX
Member ID: XXXXXXXXXXXXXXXXXXXX
There you go! You’ve done it! Microsoft said you had to contact their support to fix it, but you crafty devil – you’ve gone and done it yourself.
I hope I’ve made your day at least a little bit easier.
Configure Server Core for IIS Remote Management
Everyone's familiar by now with the reasons to use Server Core for things like IIS, DNS, etc. In a recent project I hit an interesting scenario where my GUI management server couldn't connect remotely to an IIS instance running on Server 2016 Core. There are a few oddities, so I decided to blog about it. Let's get going.
TL;DR steps are as follows:
- Install IIS Web Role
- Install IIS Management Feature
- Change Registry Setting for Remote Management
- Set Management Service to start automatically
- Connect
- Work
- Get a promotion
- Get a raise
- Get a boat
Maybe not the boat, but that's the dream, right? Anyways, here's the nitty-gritty.
First, we need to see if IIS is installed. Presumably you've already done this, since you're trying to figure out how to connect to it, but it's good to check anyway, just to be sure. Note that Server Core drops you into a cmd shell at logon; it's 2017 and everything is done in PowerShell now, so go ahead and launch yourself into a PS shell. Then we'll check whether the feature is installed by running the following command:
Get-WindowsFeature | Where-Object {$_.DisplayName -eq "Web Server (IIS)"}
Here we can see that IIS is in fact not installed, so let's go ahead and fix that. While we install IIS, it's important to install the IIS remote management feature as well; otherwise, there will be no connecting remotely to the instance. I'm installing both on the same line, using the following command.
Install-WindowsFeature Web-Server, Web-Mgmt-Service
It shouldn’t take too long. When it’s done you’ll get your output showing it’s complete.
Now that everything is installed, there is actually a registry value that needs to be modified. RegEdit launches fine from the Server Core command line, and you'll need to set the following value to "1" rather than the default of "0".
HKLM\SOFTWARE\Microsoft\WebManagement\Server\EnableRemoteManagement
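If you'd rather stay in PowerShell than open RegEdit, a one-liner sketch against that same value:

# Flip EnableRemoteManagement from 0 to 1; WMSVC reads it at service start
Set-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\WebManagement\Server' -Name EnableRemoteManagement -Value 1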
Right, now we've got the settings in place. Unfortunately, things still don't work. That's because the IIS remote management service (WMSVC) is disabled by default. Let's go ahead and fix that by setting the service startup type to "Automatic", starting the service, and querying its state to confirm. We will do that using the following three commands.
Set-Service WMSVC -StartupType "Automatic"
Start-Service WMSVC
Get-Service WMSVC
The status is now running, so we should be good to go. Let’s give it a shot by going into the GUI management server, launching the IIS console, and connecting to the server core box.
It will prompt you for the server name and a user/password combo, after which everything should be all set!
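If the console can't connect, it's worth confirming that the management port (8172 by default) is reachable from the GUI box; a quick sketch with a hypothetical server name:

# Run from the management server; TcpTestSucceeded should come back True
Test-NetConnection -ComputerName core01.lab.local -Port 8172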
So there you have it, we’ve configured all the required settings to remotely manage IIS on server core!
I hope this makes your day at least a little bit easier.
Thanks,
Changing Azure Recovery Services Vault to LRS Storage
Back in the classic portal with Backup services this was an easy fix: simply change the storage replication type in the settings. I've recently started moving my workloads to Recovery Services vaults in ARM, and noticed something peculiar. By default, the storage replication type of the vault is GRS.
If your needs require geographically redundant storage, then that's perfectly fine. I, however, don't have such needs, and I trust Microsoft's ability to keep data generally available in an LRS replication topology. It should be an option just like it was in classic, right? Strangely, the option to change the replication type in the vault's storage configuration is grayed out.
Odd, right? I thought so, until I found this.
Okay, it's not optimal, but it looks like I just need to remove the backup data from the vault to change the storage replication type, right? Well, I gave that a shot and no go. Same issue; the option was still grayed out.
I ultimately had to delete the vault entirely and create a new Recovery Services vault. Once a vault is newly created, and before anything is registered to it, you can change the replication type.
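For the redo, the same flow also works from PowerShell with the current Az module; a sketch with placeholder names:

# Create a fresh vault and flip it to LRS before registering any backup items
$vault = New-AzRecoveryServicesVault -Name 'MyVault' -ResourceGroupName 'MyRG' -Location 'eastus'
Set-AzRecoveryServicesBackupProperty -Vault $vault -BackupStorageRedundancy LocallyRedundant
Get-AzRecoveryServicesBackupProperty -Vault $vault # confirm LocallyRedundant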
Ah, finally! Then register the VM(s), run some backup jobs and voila! Confirmation that the vault is using LRS storage.
I hope this makes your day at least a little bit easier.
Thanks,
Explanation: F5 LTM Full-Proxy Architecture && SSL Bridging
The concept of a full-proxy architecture, along with SSL bridging, seems to confuse a good majority of the people to whom I've attempted to explain it. In that light, here we go. I could write a long, drawn-out explanation of this process (and will, if requested), but most folks reading this want a quick answer. Let's proceed.
A few things to note:
- "Full-proxy architecture": clients and servers on either side of the F5 never talk to each other directly. The client thinks the F5's endpoint (iApp) is the server, and the server thinks the F5 is the client.
- "SSL bridging": Client -> F5 traffic is encrypted, decrypted on the F5 for processing, then re-encrypted, so F5 -> server traffic is encrypted too.
- "F5" is actually the company name; the product has many other names, such as F5 BIG-IP LTM ADC.
- It is a networking device, not a server; you can't RDP to it like some people have assumed (although you can SSH into the management interface and use TMSH, the Traffic Management Shell).
There is typically some confusion around which certs are on which box and whether or not they match. Behind the F5, the answer is: it doesn't matter. Clients ONLY need to care about, and trust, the cert applied by the SSL bridging profile attached to the iApp that serves as the endpoint for that app. In the example I've drawn below (thanks to a fancy bright-link board), the source client (which can be a server if you want), the F5, and the destination server all have different certs. Again, all that matters to anyone besides the F5 itself is the cert the F5 presents. Note that the steps are numbered in green.
I hope this makes your day at least a little bit easier.
Thanks,
WSUS App Pool Crashes with SCCM Synchronization
I've seen this a few times now, sometimes with standalone WSUS but mostly with SCCM running a software update point. Every time SCCM does an update synchronization, the app pool crashes. If it runs again it will typically complete, but it's still rather annoying, especially if you have the SCOM management pack for IIS and/or SCCM. You'll see things like the following.
Alert: ConfigMgr Server Component Issue
Source: ConfigMgr WSUS Synchronization Manager
Last modified by: System
Last modified time: 3/27/2017 4:14:02 AM
Alert description: Component ConfigMgr WSUS Synchronization Manager - SCCMServer.domain.local (SMS_WSUS_SYNC_MANAGER) on server SCCMServer.domain.local is not working properly.
Application Error:
Faulting application name: w3wp.exe, version: 7.5.7601.17514, time stamp: 0x4ce7afa2
Faulting module name: KERNELBASE.dll, version: 6.1.7601.17651, time stamp: 0x4e21213c
Exception code: 0xe0434352
Fault offset: 0x000000000000cacd
Faulting process id: 0x141c
Faulting application start time: 0x01cd64a70072cec1
Faulting application path: c:\windows\system32\inetsrv\w3wp.exe
Faulting module path: C:\Windows\system32\KERNELBASE.dll
Report Id: 3e5d5bdc-d09a-11e1-a2f5-00155d2c1824
Log Name: System
Source: Microsoft-Windows-WAS
Event ID: 5074
A worker process with process id of ‘%1’ serving application pool ‘%2’ has requested a recycle because the worker process reached its allowed processing time limit.
Log Name: Application
Source: Windows Server Update Services
Event ID: 12072
The WSUS content directory is not accessible.
System.Net.WebException: The remote server returned an error: (503) Server Unavailable.
at System.Net.HttpWebRequest.GetResponse()
at Microsoft.UpdateServices.Internal.HealthMonitoring.HmtWebServices.CheckContentDirWebAccess(EventLoggingType type, HealthEventLogger logger)
Log Name: Application
Source: SMS Server
Event ID: 7000
On 8/13/2015 3:22:40 AM, component SMS_WSUS_CONTROL_MANAGER on computer WSUS.fqdn reported: WSUS Control Manager failed to configure proxy settings on WSUS Server “WSUS.fqdn”.
Possible cause: WSUS Server version 3.0 SP2 or above is not installed or cannot be contacted.
Solution: Verify that the WSUS Server version 3.0 SP2 or greater is installed. Verify that the IIS ports configured in the site are same as those configured on the WSUS IIS website. You can receive failure because proxy is set but proxy name is not specified or proxy server port is invalid.
It turns out the WSUS app pool's default limits in IIS (its "rapid-fail" protection, queue length, and private memory cap) are being overrun by the overhead of the SCCM SUP sync, causing a pool recycle. Luckily, this is actually a pretty easy fix.
- Launch IIS Manager on the server that hosts WSUS
- Open Application Pools
- Right click “WSUSPool”, then “Advanced Settings”
- Change ‘Queue Length’ from the default 1,000 to 25,000. You will note this number is also the same as the maximum number of clients supported per SUP in an SCCM architecture.
- Locate "Private Memory Limit (KB)". The default is "1843200" (~1.8 GB); a good practice I've found is to set it to "7843200" (~7.8 GB). If for whatever reason you are still exceeding this limit, you can set it to "0", denoting unlimited.
- Restart the "WSUSPool" app pool.
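If you'd rather script those changes than click through IIS Manager, here's a sketch using the WebAdministration module (pool name as it appears on your install; "WsusPool" here):

Import-Module WebAdministration
Set-ItemProperty IIS:\AppPools\WsusPool -Name queueLength -Value 25000
# The private memory limit lives under the recycling settings and is in KB
Set-ItemProperty IIS:\AppPools\WsusPool -Name recycling.periodicRestart.privateMemory -Value 7843200
Restart-WebAppPool -Name WsusPool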
If you run a standalone WSUS instance, you can now go do a manual synchronization in the WSUS management console to test the change.
Or, if you have an SCCM instance leveraging WSUS, DO NOT do anything in the WSUS console (in case you didn't know, SCCM owns the synchronization). Go ahead and launch your SCCM console and run the sync from there, as sketched below.
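A sketch of kicking that sync off from the ConfigurationManager PowerShell module instead of the console (the site code "PS1" is a placeholder):

# Load the module that ships with the SCCM console, switch to the site drive, and sync
Import-Module "$env:SMS_ADMIN_UI_PATH\..\ConfigurationManager.psd1"
Set-Location PS1:
Sync-CMSoftwareUpdate -FullSync $true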
These changes should have fixed your problems, and all should be running well! If not, I recommend you contact Microsoft (especially if you have a very large infrastructure) since there are a few more tweaks you can make in IIS.
I hope this makes your day at least a little bit easier.
Thanks,
SCOM Alert Severity and Priority Values
I'm not sure why I have a hard time remembering which way the numeric representation goes, but here it is.
Severity:
- Informational = 0
- Warning = 1
- Critical = 2
Priority:
- Low = 0
- Medium = 1
- High = 2
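Where these numbers usually matter is in criteria strings; for example, pulling open critical/high-priority alerts from the Operations Manager shell (a sketch):

Import-Module OperationsManager
# Severity 2 = Critical, Priority 2 = High; ResolutionState 0 = New
Get-SCOMAlert -Criteria "Severity = 2 AND Priority = 2 AND ResolutionState = 0"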
Thanks!
PowerShell Script to Check Symantec Endpoint Protection Definition Updates
Symantec Endpoint Protection has quite a hold on the antivirus market share. Many of us work in environments where it's used but aren't the SEP administrators, or can't even view data from the Symantec Endpoint Protection Manager. In light of that, I've written a PowerShell script that checks the last update time of the SEP definitions and can either be run manually or set as a scheduled task. The logic is:
# Check if Symantec Endpoint Protection is installed. If not, exit.
# Check the last write date of the AV definitions and compare it to a variable set to now minus 7 days.
# Write to the event log whether definitions are current or not.
# Send an email if definitions are out of date.
*Things to Note*
- As it stands, each of the "if ($writetime" blocks contains a "Write-Host". If you plan on running this as a scheduled task, you'll want to remove or comment out those lines.
- I will also be writing this as a SCOM management pack, and an SCCM Compliance Item.
###################################################################
## Check Symantec Endpoint Protection Antivirus Definition Dates ##
## v1.1 ##
## Matt Hansen // 01-06-2017 ##
###################################################################
# Set variables
$hostname = hostname
$7daysago = (Get-Date).AddDays(-7)
$key = 'HKLM:\SOFTWARE\Wow6432Node\Symantec\Symantec Endpoint Protection\CurrentVersion\SharedDefs'
# Test for the registry key path and execute if necessary
if (test-path -path $key)
{
$path = (Get-ItemProperty -Path $key -Name DEFWATCH_10).DEFWATCH_10
$writetime = [datetime](Get-ItemProperty -Path $path -Name LastWriteTime).lastwritetime
#Write-Host A min ago was $7daysago. DEFs was last written at $writetime
if ($writetime -lt $7daysago)
{Write-Host "You have old defs"
Write-EventLog -LogName "Application" -Source "Symantec Antivirus" -EventId "7076" -EntryType "Warning" -Message "Symantec definitions are older than 7 days. Last update time was $writetime."
$notify = "yes"
}
if ($writetime -gt $7daysago)
{Write-Host "You have current defs"
Write-EventLog -LogName "Application" -Source "Symantec Antivirus" -EventId "7077" -EntryType "Information" -Message "Symantec definitions are current within 7 days. Last update time was $writetime."
$notify = "no"
}
#Email Notify
if ($notify -eq "yes")
{
$param = @{
SmtpServer = "smtpserver.company.local"
Port = 25
UseSsl = $false
#Credential = "you@gmail.com"
From = "SymantecDefChecks@company.local"
To = "administrator@company.local"
Subject = "Symantec Definitions Out-of-Date on $hostname"
Body = "Symantec definitions are older than 7 days. Last update time was $writetime on $hostname."
}
Send-MailMessage @param
#write-host "Email Sent"
}
}
else {Write-Host "Not installed"}
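To set it up as a scheduled task, something like the following works on 2012 and later (the script path and schedule are hypothetical):

# Register a daily run under SYSTEM; point -File at wherever you saved the script
$action = New-ScheduledTaskAction -Execute 'powershell.exe' -Argument '-NoProfile -ExecutionPolicy Bypass -File C:\Scripts\Check-SepDefs.ps1'
$trigger = New-ScheduledTaskTrigger -Daily -At 6am
Register-ScheduledTask -TaskName 'Check SEP Definitions' -Action $action -Trigger $trigger -User 'SYSTEM'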
I hope this makes your day at least a little bit easier.
Thanks,
How to move SCVMM VMs into a Cloud
If you've ever added hosts to an SCVMM instance, you'll know there's seemingly no easy way to move the newly imported VMs from those hosts into SCVMM clouds. I've found the best way to do this is the SCVMM command-line interface, which unfortunately has a few quirks.
Set-SCVirtualMachine is the command you'll need, with the "-Cloud" flag, as in the example below.
Set-SCVirtualMachine -VM "NewVM1" -Cloud "Cloud1"
Unfortunately, every time I've tried this I've gotten an error saying it can't convert the value type correctly.
For whatever reason, I've found the workaround is to fetch both the VM and the cloud into variables and run the command again.
$VM = Get-SCVirtualMachine "NewVM1"
$Cloud = Get-SCCloud "Cloud1"
Set-SCVirtualMachine -VM $VM -Cloud $Cloud
Then we have success!
I've yet to confirm why this is; my guess is that -VM and -Cloud bind SCVMM objects rather than strings, and the conversion from a bare name fails. At least it works.
I hope this makes your day at least a little bit easier.
Thanks,
SCVMM Error 2912 “The configuration registry database is corrupt (0x800703F1)”
I recently spun up a new SCVMM environment, created my first VM, and attempted to create a template, only to be faced with a job error.
Error (2912)
An internal error has occurred trying to contact the Host01 server: : .WinRM: URL: [http://Host01.lab.local:5985], Verb: [INVOKE], Method: [LoadSubkey], Resource: [http://schemas.microsoft.com/wbem/wsman/1/wmi/root/scvmm/P2VSourceFixup?RegFileName=C:\Users\SVC_VMM\AppData\Local\Temp\tmp6AB5.tmp]
The configuration registry database is corrupt (0x800703F1)
Recommended Action
Check that WS-Management service is installed and running on server host01.lab.local. For more information use the command “winrm helpmsg hresult”. If host01.lab.local is a host/library/update server or a PXE server role then ensure that VMM agent is installed and running. Refer to http://support.microsoft.com/kb/2742275 for more details.
I've seen this issue before, and typically it's because I go on autopilot and sysprep the VM by hand. That will cause this error; instead, start the VM and log in, then shut it down cleanly and let VMM do the sysprep.
Unfortunately, this time that wasn't the problem, though it was similar. When I shut the VM down I accidentally hit "Turn Off", which hard-powered the VM off. A simple boot, login, clean shutdown, and retry fixed the problem here.
I hope this makes your day at least a little bit easier.
Thanks,
SCCM 2012 R2 Reinstall Fails – Configuration Manager Requires a Dedicated SQL Server Instance
Recently I had to reinstall an SCCM 2012 R2 instance, and came across a strange error when I ran the Prerequisite Check.
Dedicated SQL Server instance: Configuration Manager requires a dedicated SQL Server instance to host the site database. You selected a SQL Server instance that hosts the Configuration Manager database for another site. Select a different SQL Server instance for this new site to use, or resolve the conflict by uninstalling the other site or moving its database to a different SQL Server instance.
After some research, it turns out this (in my case, anyway) is due to the SCCM uninstall process not completing properly. SCCM doesn't need its own SQL instance; it just requires that you have only one SCCM instance per SQL instance. To make sure that's the case, the prerequisite checker looks for a few registry keys on the SQL server the install is pointed at. To fix this error, delete the following keys from the SQL server.
[HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_SITE_SQL_BACKUP_<SITESERVERNAME>]
[HKLM\SOFTWARE\Microsoft\SMS\SMS_SITE_SQL_BACKUP_<SITESERVERNAME>]
[HKLM\SOFTWARE\Microsoft\SMS\Components\SMS_SITE_SQL_BACKUP_<SITESERVERNAME>]
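A quick way to do that from an elevated PowerShell prompt on the SQL server; $site is a placeholder for the old site server's name as it appears in the keys:

$site = 'SITESERVERNAME' # hypothetical; use the name from the actual keys
Remove-Item "HKLM:\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_SITE_SQL_BACKUP_$site" -Recurse
Remove-Item "HKLM:\SOFTWARE\Microsoft\SMS\SMS_SITE_SQL_BACKUP_$site" -Recurse
Remove-Item "HKLM:\SOFTWARE\Microsoft\SMS\Components\SMS_SITE_SQL_BACKUP_$site" -Recurse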
After they are deleted, run the prerequisite checker again and voila!
I hope this makes your day at least a little bit easier.
Thanks,