MSEndpointMgr

Proactive Hard Drive Replacement with Endpoint Analytics

We have all been there: you are working away fine and then you hear the unmistakable tick, tick, tick of a failing hard disk. Of course, SSDs are commonplace today, so a lot of you might be wondering what I am on about, but ask anyone who has been around IT long enough and they will tell you stories of that impending sound of doom when a hard disk starts to fail.

The trouble, of course, is how we monitor for signs of a disk failure and, more importantly, how we can be proactive about replacing hardware before it goes to the scrap heap.

Monitoring Hard Disk Health

There are several ways to monitor hard disk health on modern machines, from built-in SMART (Self-Monitoring, Analysis, and Reporting Technology) errors, to read error counts, through to wear values, with most values readily available to query via WMI. Let’s take a look at each of these in isolation first.

SMART

The Self-Monitoring, Analysis, and Reporting Technology feature built into most hard disks today is a self-monitoring system that allows the disk to report on predicted and detected anomalies during normal operation. When an anomaly is detected, a corresponding error code is set, which can then be read by software or other means to warn you that something isn’t quite right with your hard disk.

Below is a sample list of SMART attributes (source: S.M.A.R.T. – Wikipedia)

| ID | Attribute name | Ideal | Description |
|---|---|---|---|
| 01 (0x01) | Read Error Rate | Low | (Vendor specific raw value.) Stores data related to the rate of hardware read errors that occurred when reading data from a disk surface. The raw value has different structure for different vendors and is often not meaningful as a decimal number. For some drives, this number may increase during normal operation without necessarily signifying errors. |
| 02 (0x02) | Throughput Performance | High | Overall (general) throughput performance of a hard disk drive. If the value of this attribute is decreasing there is a high probability that there is a problem with the disk. |
| 03 (0x03) | Spin-Up Time | Low | Average time of spindle spin up (from zero RPM to fully operational [milliseconds]). |
| 04 (0x04) | Start/Stop Count | - | A tally of spindle start/stop cycles. The spindle turns on, and hence the count is increased, both when the hard disk is turned on after having before been turned entirely off (disconnected from power source) and when the hard disk returns from having previously been put to sleep mode. |
| 05 (0x05) | Reallocated Sectors Count | Low | Count of reallocated sectors. The raw value represents a count of the bad sectors that have been found and remapped. Thus, the higher the attribute value, the more sectors the drive has had to reallocate. This value is primarily used as a metric of the life expectancy of the drive; a drive which has had any reallocations at all is significantly more likely to fail in the immediate months. |
| 06 (0x06) | Read Channel Margin | - | Margin of a channel while reading data. The function of this attribute is not specified. |
| 07 (0x07) | Seek Error Rate | Varies | (Vendor specific raw value.) Rate of seek errors of the magnetic heads. If there is a partial failure in the mechanical positioning system, then seek errors will arise. Such a failure may be due to numerous factors, such as damage to a servo, or thermal widening of the hard disk. The raw value has different structure for different vendors and is often not meaningful as a decimal number. For some drives, this number may increase during normal operation without necessarily signifying errors. |
| 08 (0x08) | Seek Time Performance | High | Average performance of seek operations of the magnetic heads. If this attribute is decreasing, it is a sign of problems in the mechanical subsystem. |
| 09 (0x09) | Power On Hours | - | Count of hours in power-on state. The raw value of this attribute shows total count of hours (or minutes, or seconds, depending on manufacturer) in power-on state. “By default, the total expected lifetime of a hard disk in perfect condition is defined as 5 years (running every day and night on all days). This is equal to 1825 days in 24/7 mode or 43800 hours.” On some pre-2005 drives, this raw value may advance erratically and/or “wrap around” (reset to zero periodically). |

We can query the drive’s SMART failure prediction status using the MSStorageDriver_FailurePredictStatus class in the root\WMI namespace:

Get-WmiObject -Namespace root\wmi -Class MSStorageDriver_FailurePredictStatus
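As a rough sketch, the same class can also be read with Get-CimInstance and narrowed down to the drives that actually report a predicted failure. The helper name Select-PredictedFailure is hypothetical (it is not part of the scripts in this post), and the query generally needs an elevated session on a device whose storage driver exposes the class.

```powershell
# Hypothetical helper: keep only the entries where SMART predicts a failure.
function Select-PredictedFailure {
    param (
        [Parameter(ValueFromPipeline = $true)]
        [object]$Status
    )
    process {
        if ($Status.PredictFailure -eq $true) {
            # Emit a trimmed object with the fields of interest
            [pscustomobject]@{
                InstanceName   = $Status.InstanceName
                PredictFailure = $Status.PredictFailure
                Reason         = $Status.Reason
            }
        }
    }
}

# Usage on a Windows device (elevated), assuming the class is exposed:
# Get-CimInstance -Namespace root\wmi -ClassName MSStorageDriver_FailurePredictStatus -ErrorAction SilentlyContinue |
#     Select-PredictedFailure
```

An empty result simply means no drive is currently flagging a predicted failure, or that the class is not available on that hardware.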

Drive Health Status

Windows also maintains its own health monitoring state for storage, which can be queried using the Get-PhysicalDisk cmdlet in PowerShell;

Get-PhysicalDisk

Using this we have a quick and easy means of obtaining a health value for the disk.
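To illustrate, here is a minimal sketch that reduces the Get-PhysicalDisk output to only those disks Windows does not consider healthy. The function name Get-UnhealthyDisk is an assumption made for this example; HealthStatus, FriendlyName, MediaType and DeviceId are properties returned by Get-PhysicalDisk.

```powershell
# Hypothetical helper: return a short summary of any disk not reported as Healthy.
function Get-UnhealthyDisk {
    param (
        [object[]]$Disk
    )
    foreach ($Item in $Disk) {
        if ($Item.HealthStatus -ne 'Healthy') {
            [pscustomobject]@{
                DeviceId     = $Item.DeviceId
                FriendlyName = $Item.FriendlyName
                MediaType    = $Item.MediaType
                HealthStatus = $Item.HealthStatus
            }
        }
    }
}

# Usage on a Windows device:
# Get-UnhealthyDisk -Disk (Get-PhysicalDisk)
```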

Read / Write Error Count

Read and write error counts often signify poor cells or sectors on the disk, and some OEMs use them as grounds to replace a hard disk. In this instance, the values are captured and flagged once they go beyond a predefined maximum contained within the script.

Temperature

As with all electronics, temperature plays a vital part in the longevity of the component. This is why server rooms, for example, are maintained at low temperatures, ensuring that the hardware lasts as long as possible before failure. The same is true of your laptop or desktop drive, and again we can monitor this value where the drive reports it.

Get-PhysicalDisk | Get-StorageReliabilityCounter | Select-Object -Property DeviceID, Wear, ReadErrorsTotal, ReadErrorsCorrected, WriteErrorsTotal, WriteErrorsUncorrected, Temperature, TemperatureMax | Format-Table

In researching this I found that not all disks report their manufacturer max temperature states, so determining if the drive is out of spec can’t always be relied upon, although you could of course determine an average across your devices and set that as your lowest common warning state.
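The averaging idea could be sketched along these lines. This is only an illustration: the function name Get-TemperatureWarningThreshold is made up for the example, and it assumes you have already collected TemperatureMax readings from your devices by some means of your own, with drives that do not report a maximum returning 0.

```powershell
# Hypothetical helper: derive a fleet-wide warning threshold from the
# TemperatureMax values that drives actually report (zero means "not reported").
function Get-TemperatureWarningThreshold {
    param (
        [int[]]$TemperatureMax
    )
    # Ignore drives that do not report a maximum temperature
    $Reported = @($TemperatureMax | Where-Object { $_ -gt 0 })
    if ($Reported.Count -eq 0) {
        # Nothing usable was reported, so no threshold can be derived
        return $null
    }
    # Average of the reported maximums, rounded to a whole number
    [math]::Round(($Reported | Measure-Object -Average).Average)
}

# Usage with values gathered from several devices:
# Get-TemperatureWarningThreshold -TemperatureMax 70, 0, 80
```

Whether an average is conservative enough for your hardware mix is a judgment call; taking the lowest reported maximum would be a stricter alternative.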

Read / Write Errors

Using the query above in PowerShell, we were able to obtain information including read and write errors. Looking at this on a machine with issues, we can see the counters being incremented.

Wear Value

There is another interesting value that we can obtain: the “wear” value. This value ranges from 0 to 100, where 100 indicates that the drive has reached the end of its usable life. The Microsoft definition of this value is stated below:

Wear

Data type: UInt8

Access type: Read-only

The storage device wear indicator, in percentage. At 100 percent, the estimated wear limit will have been reached.

Source – MSFT_StorageReliabilityCounter class | Microsoft Docs

Where SSDs are concerned, I am assuming (although I have not confirmed this) that the wear value reflects the proportion of spare cells in use, as SSDs have built-in redundancy that maintains the stated capacity for as long as possible by bringing spare cells into service as normal cells reach the end of their life.
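As a small illustration of how the wear percentage might be interpreted, the sketch below compares it against a threshold and reports the estimated remaining headroom. The function name Test-DiskWear is hypothetical, and the default of 90 is simply an example threshold rather than a vendor recommendation.

```powershell
# Hypothetical helper: interpret the Wear value from Get-StorageReliabilityCounter.
function Test-DiskWear {
    param (
        [int]$Wear,
        [int]$MaxWearValue = 90
    )
    [pscustomobject]@{
        Wear               = $Wear
        # 100 is the estimated wear limit, so the remainder is the headroom left
        EstimatedRemaining = [math]::Max(0, 100 - $Wear)
        ThresholdExceeded  = ($Wear -ge $MaxWearValue)
    }
}

# Usage on a Windows device (first disk only, for brevity):
# $Counter = Get-PhysicalDisk | Select-Object -First 1 | Get-StorageReliabilityCounter
# Test-DiskWear -Wear ([int]$Counter.Wear)
```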

Endpoint Analytics / Proactive Remediations FTW (Again)

Given that we have all of these values, I thought it would be useful to monitor them, or indeed notify the end user of the impending doom that faces them. Let’s face it, no one likes a machine failing when they are trying to work, and with remote working being commonplace now, obtaining a machine or disk replacement quickly could pose a challenge.

So first of all, let me state that the monitoring solution is provided as a “predictive” or “proactive” solution, and after speaking with the main OEMs on this, drives will typically ONLY be replaced when they suffer a hard failure. With that said, if your machines are out of warranty, then this issue really doesn’t apply to you, as a replacement is going to cost money and it’s up to you to weigh up the cost of potential productivity loss versus the cost of a replacement drive.

With that in mind then, we can put this all together in PowerShell, and start monitoring!

Detection Script

The first thing we have to do is build our detection script, which should take into account all of the above, using the drive health and SMART values first as the primary failure detection methods, and thereafter using error and wear states. Of course, the idea here is to do an “exit 1”, triggering the remediation script, should anything be abnormal when it comes to the disk health states.

<#	
	.NOTES
	===========================================================================
	 Created on:   	21/04/2021 11:00 AM
	 Created by:   	Maurice Daly
	 Organization: 	CloudWay
	 Filename:     	Invoke-DiskHealthCheck.ps1
	===========================================================================
	.DESCRIPTION
		Monitors WMI values for hard disk health, helping you predict or detect
		anomalies and be proactive about hard disk replacement.
#>

#region ScriptVariables

# Define variables
$Organisation = "YOUR ORG DETAILS"
$MaxWearValue = 90
$MaxRWErrors = 100
$RegistryBase = "HKLM:\SOFTWARE\$Organisation\Monitoring\Disk Health"

# Obtain physical disk details
$Disks = Get-PhysicalDisk | Where-Object { $_.BusType -match "NVMe|SATA|SAS|ATAPI|RAID" }

#endregion ScriptVariables

#region ScriptFunctions

function Write-RegistryEntries {
	
	# Set disk registry path
	$RegistryPath = Join-Path -Path $RegistryBase -ChildPath "Disk $($Disk.DeviceID)"
	
	# Create disk registry key if not present
	if (-not (Test-Path -Path $RegistryPath)) {
		New-Item -Path $RegistryPath -Force | Out-Null
	}
	
	# Set registry values and warning message
	New-ItemProperty -Path $RegistryPath -Name "Friendly Name" -Value $($Disk.FriendlyName) -PropertyType "String" -Force | Out-Null
	New-ItemProperty -Path $RegistryPath -Name "Health Status" -Value $DriveHealthState -PropertyType "String" -Force | Out-Null
	New-ItemProperty -Path $RegistryPath -Name "Media Type" -Value $DriveMediaType -PropertyType "String" -Force | Out-Null
	New-ItemProperty -Path $RegistryPath -Name "Wear" -Value $([int]($DiskHealth.Wear)) -PropertyType "String" -Force | Out-Null
	New-ItemProperty -Path $RegistryPath -Name "Read Errors" -Value $([int]($DiskHealth.ReadErrorsTotal)) -PropertyType "String" -Force | Out-Null
	New-ItemProperty -Path $RegistryPath -Name "Temperature Delta" -Value $DiskTempDelta -PropertyType "String" -Force | Out-Null
	New-ItemProperty -Path $RegistryPath -Name "Read Errors Uncorrected" -Value $([int]($DiskHealth.ReadErrorsUncorrected)) -PropertyType "String" -Force | Out-Null
	New-ItemProperty -Path $RegistryPath -Name "Read Errors Total" -Value $([int]($DiskHealth.ReadErrorsTotal)) -PropertyType "String" -Force | Out-Null
	New-ItemProperty -Path $RegistryPath -Name "Write Errors Uncorrected" -Value $([int]($DiskHealth.WriteErrorsUncorrected)) -PropertyType "String" -Force | Out-Null
	New-ItemProperty -Path $RegistryPath -Name "Write Errors Total" -Value $([int]($DiskHealth.WriteErrorsTotal)) -PropertyType "String" -Force | Out-Null
	New-ItemProperty -Path $RegistryPath -Name "Output" -Value $OutputMsg -PropertyType "String" -Force | Out-Null
	
}

#endregion ScriptFunctions

#region ScriptRunningCode

# Create root registry key if not present
if (-not (Test-Path -Path $RegistryBase)) {
	New-Item -Path $RegistryBase -Force | Out-Null
}

# Loop through each disk
foreach ($Disk in ($Disks | Sort-Object DeviceID)) {
	# Set initial output variable state
	$OutputMsg = $null
	
	# Obtain disk health information from current disk
	$DiskHealth = $Disk | Get-StorageReliabilityCounter | Select-Object -Property Wear, ReadErrorsTotal, ReadErrorsUncorrected, WriteErrorsTotal, WriteErrorsUncorrected, Temperature, TemperatureMax
	
	# Obtain media type and health status (using the current disk object avoids clashes between disks sharing a friendly name)
	$DriveDetails = $Disk | Select-Object MediaType, HealthStatus
	$DriveMediaType = $DriveDetails.MediaType
	$DriveHealthState = $DriveDetails.HealthStatus
	$DiskTempDelta = [int]$($DiskHealth.Temperature) - [int]$($DiskHealth.TemperatureMax)
	
	# Obtain SMART failure information
	$DriveSMARTStatus = (Get-WmiObject -namespace root\wmi -class MSStorageDriver_FailurePredictStatus -ErrorAction SilentlyContinue | Select-Object InstanceName, PredictFailure, Reason) | Where-Object { $_.PredictFailure -eq $true }
	
	# Create custom PSObject
	$DiskHealthState = new-object -TypeName PSObject
	
	# Create disk entry
	$DiskHealthState | Add-Member -MemberType NoteProperty -Name "Disk Number" -Value $Disk.DeviceID
	$DiskHealthState | Add-Member -MemberType NoteProperty -Name "FriendlyName" -Value $($Disk.FriendlyName)
	$DiskHealthState | Add-Member -MemberType NoteProperty -Name "HealthStatus" -Value $DriveHealthState
	$DiskHealthState | Add-Member -MemberType NoteProperty -Name "MediaType" -Value $DriveMediaType
	$DiskHealthState | Add-Member -MemberType NoteProperty -Name "Disk Wear" -Value $([int]($DiskHealth.Wear))
	$DiskHealthState | Add-Member -MemberType NoteProperty -Name "Disk $($Disk.DeviceID) Read Errors" -Value $([int]($DiskHealth.ReadErrorsTotal))
	$DiskHealthState | Add-Member -MemberType NoteProperty -Name "Disk $($Disk.DeviceID) Temperature Delta" -Value $DiskTempDelta
	$DiskHealthState | Add-Member -MemberType NoteProperty -Name "Disk $($Disk.DeviceID) ReadErrorsUncorrected" -Value $([int]($DiskHealth.ReadErrorsUncorrected))
	$DiskHealthState | Add-Member -MemberType NoteProperty -Name "Disk $($Disk.DeviceID) ReadErrorsTotal" -Value $([int]($DiskHealth.ReadErrorsTotal))
	$DiskHealthState | Add-Member -MemberType NoteProperty -Name "Disk $($Disk.DeviceID) WriteErrorsUncorrected" -Value $([int]($DiskHealth.WriteErrorsUncorrected))
	$DiskHealthState | Add-Member -MemberType NoteProperty -Name "Disk $($Disk.DeviceID) WriteErrorsTotal" -Value $([int]($DiskHealth.WriteErrorsTotal))
	
	# Check for health, read failures, or temperature issues
	If ($DriveHealthState -ne "Healthy") {
		$OutputMsg = "Disk $($Disk.DeviceID) / $($Disk.FriendlyName) is in a $(([string]$DriveHealthState).ToLower()) state"
	} elseif ($null -ne $DriveSMARTStatus) {
		$OutputMsg = "SMART predicted failure detected with reason code $($DriveSMARTStatus.Reason)"
	} elseif ([int]($DiskHealth.Wear) -ge $MaxWearValue) {
		$OutputMsg = "Disk failure likely on disk $($Disk.DeviceID) with media type $DriveMediaType. Current wear value is reading as $([int]($DiskHealth.Wear)), at or above the set threshold of $($MaxWearValue)%."
	} elseif ([int]($DiskHealth.ReadErrorsTotal) -ge $MaxRWErrors) {
		$OutputMsg = "A high number of disk read errors $([int]($DiskHealth.ReadErrorsTotal)) on disk $($Disk.DeviceID) with media type $DriveMediaType"
	} elseif ([int]($DiskHealth.WriteErrorsTotal) -ge $MaxRWErrors) {
		$OutputMsg = "A high number of disk write errors $([int]($DiskHealth.WriteErrorsTotal)) on disk $($Disk.DeviceID) with media type $DriveMediaType"
	} elseif ($([int]($DiskHealth.Temperature)) -gt $([int]($DiskHealth.TemperatureMax)) -and ([int]($DiskHealth.TemperatureMax)) -gt 0) {
		$OutputMsg = "Disk $($Disk.DeviceID) is currently running $DiskTempDelta above the maximum temperature rating $($DiskHealth.TemperatureMax) for the drive."
	} else {
		$OutputMsg = "Disk $($Disk.DeviceID) is in a healthy state. No action required."
	}
	
	# Write entries to Registry
	Write-RegistryEntries
}

# Set remediation value based on disk issues
$DriveHealthIssue = [boolean](Get-ChildItem -Path $RegistryBase -Recurse | Get-ItemProperty | Where-Object { $_.Output -notmatch "No action required" })
if ($DriveHealthIssue -eq $true) {
	# Flag error value / mark for remediation
	Write-Output "$((Get-ChildItem -Path $RegistryBase -Recurse | Get-ItemProperty | Where-Object { $_.Output -notmatch "No action required" }).Output)";  exit 1
} else {
	# No issues found
	Write-Output "Disks are in a healthy state. No action required"; exit 0
}

#endregion ScriptRunningCode

Running this in our environment, we can view the result for each device by adding the pre-remediation detection output column to the device status view.

Remediation Script

In the event that a disk does report back a “suspect” state, the next thing we can do (but you don’t have to, this is purely optional), is notify the end user. The idea of course is to be proactive about getting an issue resolved, and hence if the user is also informed, for example, to contact the IT Service Desk, then it should help drive compliance for hard disk swap outs.

Here I am using elements from Ben Whitmore’s notification script, updated so the images are downloaded from URLs provided within the script, and with some of the hardcoding and external dependencies removed. The script also caters for the fact that you will be running this in system context, displaying the prompt to the signed-in user through a scheduled task:

<#
	.NOTES
	===========================================================================
	 Created on:   	21/04/2021 11:00 AM
	 Created by:   	Maurice Daly / Ben Whitmore
	 Organization: 	CloudWay
	 Filename:     	Invoke-DiskHealthNotification.ps1
	===========================================================================
	.DESCRIPTION
		Notifies the signed-in user with a toast notification when the disk
		health detection script has flagged an issue in the registry.
#>

Param
(
    [Parameter(Mandatory = $False)]
    [String]$ToastGUID
)

#region ToastCustomisation

#Create Toast Variables
$ToastTitle = "Please note that your hard drive is currently operating outside of healthy parameters. Please contact the IT service desk to arrange a replacement."
$Signature = "Monitored by Proactive Remediations"
$ButtonTitle = "IT Service Desk"
$ButtonAction = "YOUR IT HELPDESK URL"
$SnoozeTitle = "Snooze"

#ToastDuration: Short = 7s, Long = 25s
$ToastDuration = "long"

#Images
$BadgeImageUri = "https://YOUR STORAGE URL/Notifications/badgeimage.jpg"
$HeroImageUri = "https://YOUR STORAGE URL/Notifications/harddisk.jpg"
$BadgeImage = Join-Path $ENV:Windir -ChildPath "Temp\badgeimage.jpg"
$HeroImage = Join-Path $ENV:Windir -ChildPath "Temp\harddisk.jpg"

#endregion ToastCustomisation

#region ToastRunningValues

#Set Unique GUID for the Toast
If (!($ToastGUID))
{
	$ToastGUID = ([guid]::NewGuid()).ToString().ToUpper()
}

#Current Directory
$ScriptPath = $MyInvocation.MyCommand.Path
$CurrentDir = Split-Path $ScriptPath

#Set Toast Path to the Windows Temp Directory
$ToastPath = (Join-Path $ENV:Windir "Temp\$($ToastGuid)")

$ToastPSFile = $MyInvocation.MyCommand.Name

#endregion ToastRunningValues

#region ScriptFunctions
# Toast function
function Display-ToastNotification
{
	
	#Fetching images from URI
	$WebClient = New-Object System.Net.WebClient
	$WebClient.DownloadFile("$BadgeImageUri", "$BadgeImage")
	$WebClient.DownloadFile("$HeroImageUri", "$HeroImage")
	
	#Set COM App ID > To bring a URL on button press to focus use a browser for the appid e.g. MSEdge
	#$LauncherID = "Microsoft.SoftwareCenter.DesktopToasts"
	$LauncherID = "{1AC14E77-02E7-4E5D-B744-2EB1AE5198B7}\WindowsPowerShell\v1.0\powershell.exe"
	#$Launcherid = "MSEdge"
	
	#Dont Create a Scheduled Task if the script is running in the context of the logged on user, only if SYSTEM fired the script i.e. Deployment from Intune/ConfigMgr
	If (([System.Security.Principal.WindowsIdentity]::GetCurrent()).Name -eq "NT AUTHORITY\SYSTEM")
	{
		
		#Prepare to stage Toast Notification Content in %TEMP% Folder
		Try
		{
			
			#Create TEMP folder to stage Toast Notification Content in %TEMP% Folder
			New-Item $ToastPath -ItemType Directory -Force -ErrorAction Continue | Out-Null
			$ToastFiles = Get-ChildItem $CurrentDir -Recurse
			
			#Copy Toast Files to Toast TEMP folder
			ForEach ($ToastFile in $ToastFiles)
			{
				Copy-Item (Join-Path $CurrentDir $ToastFile) -Destination $ToastPath -ErrorAction Continue
			}
		}
		Catch
		{
			Write-Warning $_.Exception.Message
		}
		
		#Set new Toast script to run from TEMP path
		$New_ToastPath = Join-Path -Path $ToastPath -ChildPath $ToastPSFile
		
		#Created Scheduled Task to run as Logged on User
		$Task_TimeToRun = (Get-Date).AddSeconds(30).ToString('s')
		$Task_Expiry = (Get-Date).AddSeconds(120).ToString('s')
		$Task_Action = New-ScheduledTaskAction -Execute "C:\WINDOWS\system32\WindowsPowerShell\v1.0\PowerShell.exe" -Argument "-NoProfile -WindowStyle Hidden -File ""$New_ToastPath"" -ToastGUID ""$ToastGUID"""
		$Task_Trigger = New-ScheduledTaskTrigger -Once -At $Task_TimeToRun
		$Task_Trigger.EndBoundary = $Task_Expiry
		$Task_Principal = New-ScheduledTaskPrincipal -GroupId "S-1-5-32-545" -RunLevel Limited
		$Task_Settings = New-ScheduledTaskSettingsSet -Compatibility V1 -DeleteExpiredTaskAfter (New-TimeSpan -Seconds 600) -AllowStartIfOnBatteries
		$New_Task = New-ScheduledTask -Description "Toast_Notification_$($ToastGuid) Task for user notification. Source Path: $($ToastPath) " -Action $Task_Action -Principal $Task_Principal -Trigger $Task_Trigger -Settings $Task_Settings
		Register-ScheduledTask -TaskName "Toast_Notification_$($ToastGuid)" -InputObject $New_Task
	}
	
	#Run the toast if the script is running in the context of the Logged On User
	If (!(([System.Security.Principal.WindowsIdentity]::GetCurrent()).Name -eq "NT AUTHORITY\SYSTEM"))
	{
		
		$Log = (Join-Path $ENV:Windir "Temp\$($ToastGuid).log")
		Start-Transcript $Log
		
		#Get logged on user DisplayName
		#Try to get the DisplayName for Domain User
		$ErrorActionPreference = "Continue"
		
		Try
		{
			Write-Output "Trying Identity LogonUI Registry Key for Domain User info..."
			Get-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Authentication\LogonUI" -Name "LastLoggedOnDisplayName" -ErrorAction Stop | Out-Null
			$User = Get-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Authentication\LogonUI" -Name "LastLoggedOnDisplayName" -ErrorAction Stop | Select-Object -ExpandProperty LastLoggedOnDisplayName
			
			If ($Null -eq $User)
			{
				$Firstname = $Null
			}
			else
			{
				$DisplayName = $User.Split(" ")
				$Firstname = $DisplayName[0]
			}
		}
		Catch [System.Management.Automation.PSArgumentException] {
			Write-Output "Registry Key Property missing"
			Write-Warning "Registry value for LastLoggedOnDisplayName could not be found."
			$Firstname = $Null
		}
		Catch [System.Management.Automation.ItemNotFoundException] {
			Write-Output "Registry Key itself is missing"
			Write-Warning "Registry key for LastLoggedOnDisplayName could not be found."
			$Firstname = $Null
		}
		
		#Try to get the DisplayName for Azure AD User
		If ($Null -eq $Firstname)
		{
			Write-Output "Trying Identity Store Cache for Azure AD User info..."
			Try
			{
				$UserSID = (whoami /user /fo csv | ConvertFrom-Csv).Sid
				$LogonCacheSID = (Get-ChildItem HKLM:\SOFTWARE\Microsoft\IdentityStore\LogonCache -Recurse -Depth 2 | Where-Object { $_.Name -match $UserSID }).Name
				If ($LogonCacheSID)
				{
					$LogonCacheSID = $LogonCacheSID.Replace("HKEY_LOCAL_MACHINE", "HKLM:")
					$User = Get-ItemProperty -Path $LogonCacheSID | Select-Object -ExpandProperty DisplayName -ErrorAction Stop
					$DisplayName = $User.Split(" ")
					$Firstname = $DisplayName[0]
				}
				else
				{
					Write-Warning "Could not get DisplayName property from Identity Store Cache for Azure AD User"
					$Firstname = $Null
				}
			}
			Catch [System.Management.Automation.PSArgumentException] {
				Write-Warning "Could not get DisplayName property from Identity Store Cache for Azure AD User"
				Write-Output "Resorting to whoami info for Toast DisplayName..."
				$Firstname = $Null
			}
			Catch [System.Management.Automation.ItemNotFoundException] {
				Write-Warning "Could not get SID from Identity Store Cache for Azure AD User"
				Write-Output "Resorting to whoami info for Toast DisplayName..."
				$Firstname = $Null
			}
			Catch
			{
				Write-Warning "Could not get SID from Identity Store Cache for Azure AD User"
				Write-Output "Resorting to whoami info for Toast DisplayName..."
				$Firstname = $Null
			}
		}
		
		#Try to get the DisplayName from whoami
		If ($Null -eq $Firstname)
		{
			Try
			{
				Write-Output "Trying Identity whoami.exe for DisplayName info..."
				$User = whoami.exe
				$Firstname = (Get-Culture).textinfo.totitlecase($User.Split("\")[1])
				Write-Output "DisplayName retrieved from whoami.exe"
			}
			Catch
			{
				Write-Warning "Could not get DisplayName from whoami.exe"
			}
		}
		
		#If DisplayName could not be obtained, leave it blank
		If ($Null -eq $Firstname)
		{
			Write-Output "DisplayName could not be obtained, it will be blank in the Toast"
		}
		
		$CustomHello = "Disk Health Issue Detected"
		
		#Load Assemblies
		[Windows.UI.Notifications.ToastNotificationManager, Windows.UI.Notifications, ContentType = WindowsRuntime] | Out-Null
		[Windows.Data.Xml.Dom.XmlDocument, Windows.Data.Xml.Dom.XmlDocument, ContentType = WindowsRuntime] | Out-Null
		
		#Build XML ToastTemplate 
		[xml]$ToastTemplate = @"
<toast duration="$ToastDuration" scenario="reminder">
    <visual>
        <binding template="ToastGeneric">
            <text>$CustomHello</text>
            <text>$ToastTitle</text>
            <text placement="attribution">$Signature</text>
            <image placement="hero" src="$HeroImage"/>
        </binding>
    </visual>
    <audio src="ms-winsoundevent:notification.default"/>
</toast>
"@
		
		#Build XML ActionTemplate 
		[xml]$ActionTemplate = @"
<toast>
    <actions>
        <action arguments="$ButtonAction" content="$ButtonTitle" activationType="protocol" />
        <action arguments="dismiss" content="Dismiss" activationType="system"/>
    </actions>
</toast>
"@
		
		#Define default actions to be added $ToastTemplate
		$Action_Node = $ActionTemplate.toast.actions
		
		#Append actions to $ToastTemplate
		[void]$ToastTemplate.toast.AppendChild($ToastTemplate.ImportNode($Action_Node, $true))
		
		#Prepare XML
		$ToastXml = [Windows.Data.Xml.Dom.XmlDocument]::New()
		$ToastXml.LoadXml($ToastTemplate.OuterXml)
		
		#Prepare and Create Toast
		$ToastMessage = [Windows.UI.Notifications.ToastNotification]::New($ToastXML)
		[Windows.UI.Notifications.ToastNotificationManager]::CreateToastNotifier($LauncherID).Show($ToastMessage)
		
		Stop-Transcript
	}
}
#endregion ScriptFunctions

#region ScriptRunningCode

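# Registry location written by the detection script (the organisation value must match the one used there)
$Organisation = "YOUR ORG DETAILS"
$RegistryBase = "HKLM:\SOFTWARE\$Organisation\Monitoring\Disk Health"

# Display notification for drive failure if present in the registry
$DriveHealthIssue = [boolean](Get-ChildItem -Path $RegistryBase -Recurse | Get-ItemProperty | Where-Object { $_.Output -notmatch "No action required" })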
if ($DriveHealthIssue -eq $true)
{
	Display-ToastNotification
}

#endregion ScriptRunningCode

Reading the registry values written by the detection script, the remediation script will display a notice to the end user should the output state that there is an issue.

The IT Service Desk button is also optional, but it can be configured to open your ticketing system, for example.

Monitoring

Going back to Endpoint Analytics, we can see the output for failures by adding the Pre Remediation Output column and then examining those machines where the detection status is “failed”.

Configuration Manager

Not forgetting about those running Configuration Manager clients in standalone environments, the same scripts will of course also function well within a one time script, or within a configuration baseline.
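For a configuration baseline, one possible approach is a discovery script that turns the registry output from the detection script into a single compliance value. The sketch below is an assumption about how you might wire this up: the function name Get-DiskComplianceState and the "Compliant" / "Non-Compliant" strings are hypothetical, and the compliance rule in your configuration item would need to be set to match whatever value you choose to return.

```powershell
# Hypothetical helper: reduce the per-disk Output messages to one compliance value.
function Get-DiskComplianceState {
    param (
        [string[]]$Output
    )
    # Any message that is not the healthy message counts as an issue
    $Issues = @($Output | Where-Object { $_ -and $_ -notmatch 'No action required' })
    if ($Issues.Count -gt 0) {
        'Non-Compliant'
    } else {
        'Compliant'
    }
}

# Usage in a discovery script, assuming the registry path used by the detection script:
# $RegistryBase = "HKLM:\SOFTWARE\YOUR ORG DETAILS\Monitoring\Disk Health"
# $Output = (Get-ChildItem -Path $RegistryBase | Get-ItemProperty).Output
# Get-DiskComplianceState -Output $Output
```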


Conclusion

Although it might not be strictly possible to get your OEM to issue a replacement, through Proactive Remediations / Endpoint Analytics, we can at least try to predict hard disk failures before they occur.

Thanks for reading

Maurice Daly

Maurice has been working in the IT industry for the past 20 years and is currently working as a Senior Cloud Architect with CloudWay. With a focus on OS deployment through SCCM/MDT, Group Policy, Active Directory, virtualisation and Office 365, Maurice has been a Windows Server MCSE since 2008 and was awarded Enterprise Mobility MVP in March 2017. Most recently his focus has been on automation of deployment tasks, creating and sharing PowerShell scripts and other content to help others streamline their deployment processes.
