The Journey to Hard Disk Mastery!

This is my write-up of how I now use Home Assistant to manage my external hard disks: monitoring their temperatures, limiting their uptime and, overall, extending their lifespans with some workarounds and cobbled-together ideas. It might inspire you to do the same.

Firstly, my setup. I run Home Assistant in a VM (VirtualBox) on a Windows host. Because of this, most of my challenges have been around how to get data from the host into HA.

Chapter 1
Hard disk SMART info.

I started with hard disk temperatures. On the host I run CrystalDiskInfo, which showed that some of my drives were getting warm. I had an old USB fan kicking around, so I plugged it in and the temperatures came right down. So my first thought was how to get those temperatures into HA and switch the fan on and off via a smart plug.

This is the route I took…

I downloaded GSmartControl, which is a Windows GUI for the excellent Linux command-line program smartctl, which can output the SMART info for any hard disk that supports it.

Using the Windows port of the command-line tool that comes with the GUI, I narrowed down the hard disk I wanted to monitor (you could do many if you wish).

From an elevated Windows command prompt, I was able to scan all the drives I had (smartctl --scan lists them) and create the following batch file:

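@echo off
REM Switch to the smartctl folder (/d also changes drive if the task starts elsewhere).
cd /d C:\utils\gsmartcontrol-1.1.4-win64
REM Write the SMART attribute table for /dev/sde to a file the web server can serve.
smartctl -A /dev/sde > C:\utils\RebexTinyWebServer\wwwroot\smarthddinfo.txt
REM Wait 10 seconds to make AlwaysUp stop complaining that the script has finished too quickly.
timeout /t 10
exit /b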

I keep this batch file in the same folder as the smartctl executable, but you could put it wherever you like.

The script outputs the SMART attributes of a specific drive (sde) and redirects that output into a text file in another folder. This is an example of the output:

smartctl 7.2 2020-12-30 r5155 [x86_64-w64-mingw32-w10-b22621] (sf-7.2-1)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0004   130   130   054    Old_age   Offline      -       108
  3 Spin_Up_Time            0x0007   213   213   024    Pre-fail  Always       -       293 (Average 324)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       13
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000a   100   100   067    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0004   128   128   020    Old_age   Offline      -       18
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       314
 10 Spin_Retry_Count        0x0012   100   100   060    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       8
 22 Helium_Level            0x0023   100   100   025    Pre-fail  Always       -       100
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       25
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       25
194 Temperature_Celsius     0x0002   147   147   000    Old_age   Always       -       44 (Min/Max 18/52)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

Next, I had to make sure that data was kept up to date, so I created a Windows scheduled task to execute the batch file every 3 minutes. That did the trick. You’ll need to run the task with elevated permissions for smartctl to work properly.

Now that I had the data, I needed to get it into HA.
I first looked at using a file sensor, or a command line sensor, and copying the file into an SMB share inside the VM that HA could get at, but both had limitations.

What I ended up using was a very lightweight web server called Rebex, which you can download here:

You’ll notice that the batch file above puts its output text file into a folder called wwwroot in the Rebex folder. This means I can then grab that file via an HTTP call in HA. So I set up a rest sensor like so:

- platform: rest
  name: main_hdd_temperature_restful
  # Poll every 180 seconds, matching the 3-minute scheduled task on the host.
  scan_interval: 180
  resource: http://192.168.0.200:1180/smarthddinfo.txt
  value_template: >
    {% if "Temperature_Celsius" in value %}
    {{ value.split("Temperature_Celsius")[1].split("(")[0].split("-")[1] }}
    {% else %}
    99
    {% endif %}

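This takes the text after Temperature_Celsius, drops everything from the "(" onwards and keeps what follows the "-" in the WHEN_FAILED column, which in the example above is the value 44. If the attribute is missing from the file, the sensor reports 99 instead. It does this once every 3 minutes.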

I now have the temperature updating regularly in HA!

Next, I set up a pretty simple automation to control the fan pointed at the hard disk:

alias: Home Server - HDD Fan Control
description: ""
trigger:
  - platform: state
    entity_id:
      - sensor.main_hdd_temperature_restful
condition: []
action:
  - if:
      - condition: numeric_state
        entity_id: sensor.main_hdd_temperature_restful
        above: 45
    then:
      - type: turn_on
        device_id: 956725bc947f0414fa6e41ca785ee017
        entity_id: switch.tradfri_plug_hdd_fan_switch
        domain: switch
    else: []
  - if:
      - condition: numeric_state
        entity_id: sensor.main_hdd_temperature_restful
        below: 43
    then:
      - type: turn_off
        device_id: 956725bc947f0414fa6e41ca785ee017
        entity_id: switch.tradfri_plug_hdd_fan_switch
        domain: switch
    else: []
mode: single
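
As a sketch of an alternative (not what I'm actually running), the same on-above-45, off-below-43 logic could be written with numeric_state triggers and plain switch services, which avoids needing the device_id. The alias and the trigger ids are made up for illustration. One thing to be aware of: numeric_state triggers only fire when the value crosses the threshold, so if HA restarts while the disk is already hot, the fan won't switch on until the temperature drops and climbs past 45 again.

```yaml
alias: Home Server - HDD Fan Control (entity-based sketch)
description: ""
trigger:
  # Fires once when the temperature rises above 45.
  - platform: numeric_state
    entity_id: sensor.main_hdd_temperature_restful
    above: 45
    id: too_warm
  # Fires once when the temperature falls below 43.
  - platform: numeric_state
    entity_id: sensor.main_hdd_temperature_restful
    below: 43
    id: cool_enough
condition: []
action:
  - choose:
      - conditions:
          - condition: trigger
            id: too_warm
        sequence:
          - service: switch.turn_on
            target:
              entity_id: switch.tradfri_plug_hdd_fan_switch
      - conditions:
          - condition: trigger
            id: cool_enough
        sequence:
          - service: switch.turn_off
            target:
              entity_id: switch.tradfri_plug_hdd_fan_switch
mode: single
```

The 2 degree gap between the two thresholds is kept from the automation above, so the fan doesn't flap on and off around a single value.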

You could use this method to grab pretty much any data and get it into HA.
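
As a rough illustration, here is a sketch that pulls two more attributes out of the same file with a single download, using the rest integration rather than one rest sensor per value. The sensor names are hypothetical and I'm assuming HA's regex_findall template filter is available. It uses a regex instead of the split chain because splitting on "-" only works for Old_age attributes: on a Pre-fail line such as Reallocated_Sector_Ct, the first "-" is the one inside "Pre-fail". The pattern skips the seven columns between the attribute name and RAW_VALUE and captures the number that follows.

```yaml
rest:
  - resource: http://192.168.0.200:1180/smarthddinfo.txt
    scan_interval: 180
    sensor:
      # Hypothetical sensor names; both are filled from the one HTTP request.
      - name: main_hdd_power_on_hours
        value_template: >
          {# Skip FLAG, VALUE, WORST, THRESH, TYPE, UPDATED and WHEN_FAILED, then capture the raw value. #}
          {% set m = value | regex_findall('Power_On_Hours(?: +[^ ]+){7} +([0-9]+)') %}
          {{ m[0] if m else 'unknown' }}
      - name: main_hdd_reallocated_sectors
        value_template: >
          {% set m = value | regex_findall('Reallocated_Sector_Ct(?: +[^ ]+){7} +([0-9]+)') %}
          {{ m[0] if m else 'unknown' }}
```

With the example output above, these would read 314 and 0. I've left units and device classes off on purpose, since a text fallback like unknown doesn't sit well on a sensor that HA expects to be numeric.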

Next Chapter…

Managing the power on the hard disks and spinning these up and down as required.

Chapter 2
Reducing power consumption and extending hard disk life.

The setup.
I have 2 x 10 TB external USB hard disks. One acts as the primary and the second as a backup disk.
In the past, both disks were powered on 24/7, even though the backup job that synchronises the data across both disks only runs once per day, in the early hours of the morning.

The disks themselves don’t hold any data relevant to HA and are used for other purposes (music, photos etc.).

So I started looking at how to get HA to safely power down the backup disk and only bring it online when the backup job runs to copy the data across.

The simple answer was to put the backup disk on a smart plug and just cut the power, only powering the disk when needed. However, arbitrarily cutting the power to a disk is a really bad idea and can lead to data loss or corruption, so I had to find a way for HA to tell the Windows host to safely eject/disconnect the disk before cutting the power.

The answer came from:

This Windows client is installed on the host and can receive MQTT messages from HA.

In this case, I set up a satellite command (satellite commands can be executed regardless of whether a Windows session is logged in or not).

The satellite command simply runs the following batch file to unmount the target drives and logs its actions to a file:

@echo off
cd /d C:\utils\HASS
echo Runtime is: %date% - %time% >> log.txt

REM /p removes the drive letter, dismounts the volume and takes it offline.
echo Unmounting drive F: >> log.txt
mountvol F: /p >> log.txt

echo Unmounting completed. >> log.txt
exit

There is also a similar batch file, set up as another satellite command, to remount the drives:

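@echo off
cd /d C:\utils\HASS

echo Runtime is: %date% - %time% >> log.txt

REM Reassign the drive letter to the volume. Running mountvol with no arguments lists the volume names.
echo Remounting drive F: >> log.txt
mountvol F: \\?\Volume{235345erg43646}\ >> log.txt
echo Remounting completed. >> log.txt
exit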

Once the commands are set up in HASS Agent, they appear as entities in HA.

Now that I can unmount and mount the drives via HA, it's a pretty simple job to run an automation that takes advantage of this and safely powers the drive up and down:

alias: Home Server - Remount/Unmount BackUp HDD
description: ""
trigger:
  - platform: time
    id: remount
    at: "03:55:00"
  - platform: time
    id: unmount
    at: "05:00:00"
condition: []
action:
  - if:
      - condition: trigger
        id: remount
    then:
      - type: turn_on
        device_id: removed
        entity_id: switch.tz3000_1h2x4akh_ts011f_switch
        domain: switch
      - delay:
          hours: 0
          minutes: 0
          seconds: 30
          milliseconds: 0
      - service: button.press
        data: {}
        target:
          entity_id: button.lan_pc_satellite_custom_hdd_remount
    else:
      - service: button.press
        data: {}
        target:
          entity_id: button.lan_pc_satellite_custom_hdd_unmount
      - delay:
          hours: 0
          minutes: 0
          seconds: 10
          milliseconds: 0
      - type: turn_off
        device_id: removed
        entity_id: switch.tz3000_1h2x4akh_ts011f_switch
        domain: switch
mode: single

And the outcome?

A single external USB disk draws around 7 watts at idle (according to my power-monitoring smart plug).
So only having the backup disk powered up for 1 hour in every 24 saves roughly 59 kWh a year (7 W x 23 h x 365 days), which is around £12-15 a year on my electricity bills. It should also extend the life of that drive considerably, since it now spins for about an hour a day instead of 24.

I hope this has inspired others to come up with their own energy saving ideas!

I like your thread! Have you tried reading the disk status and sending it to HA via a webhook? I find that a more secure way if you are running HA in a VM and reading data from the host…

I would like to try the webhook approach, but I don't have the knowledge :(

I don't think there is much benefit in using a webhook for this project because of the way I’m grabbing the data, and it would add a significant amount of complexity to send the webhook messages to HA.
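
For anyone who wants to try it anyway, the HA side of a webhook approach could look roughly like the sketch below: a trigger-based template sensor that updates whenever something posts to the webhook. The webhook_id, the sensor name and the "temperature" field are all made up for illustration, and this is untested against my setup.

```yaml
template:
  - trigger:
      - platform: webhook
        # Hypothetical ID; pick your own hard-to-guess value.
        webhook_id: hdd-temperature-example
        allowed_methods:
          - POST
        # Only accept calls from the local network.
        local_only: true
    sensor:
      # Hypothetical sensor name.
      - name: main_hdd_temperature_webhook
        unit_of_measurement: "°C"
        device_class: temperature
        # Expects a JSON body such as {"temperature": 44}.
        state: "{{ trigger.json.temperature }}"
```

The host would then need a script that parses the smartctl output and POSTs that JSON to /api/webhook/hdd-temperature-example on the HA instance, which is the extra complexity I mentioned.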

Thank you! Just what I was looking for. This post is underrated!

A lot has happened since this post.
The HASS Windows agent is now maintained in a different repository and can be found here:

Also, despite using ‘smartctl -n standby’ I’ve noticed my disks still spin up when the command is issued. Not sure why.

An update on my solution, if anyone is interested. The goal was to read the status of the disks (when they are idle or asleep), and I am sure you could do more by extending the script. I run the script periodically via cron, and it sends the data to my HA instance as a sensor. Sorry, I can't copy the script here as I am on a tablet right now. Let me know if you are interested.