Monitor your HDD SMART STATUS

Hi,

Thought I’d share my yaml template to monitor HDD SMART status.

Requirements

  • smartmontools v7.0 or above (JSON output is only available from v7.0 onwards).
  • Check your smartctl version by typing smartctl --version in a terminal.
  • You need to set up a cron job that runs smartctl to poll the HDDs.
  • Use the smartctl option --nocheck standby so it doesn’t wake up any HDDs that are sleeping.
  • The yaml code reads a .json file generated by smartctl (smartmontools) and turns it into a sensor.
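
If you want to script the version check rather than eyeball it, here is a minimal Python sketch. It assumes the first line of the smartctl --version output looks like "smartctl 7.2 2020-12-30 ..."; the function name and the sample banner are made up for illustration.

```python
import re


def smartctl_supports_json(version_output: str) -> bool:
    """Return True if the version banner reports smartmontools 7.0 or newer."""
    # Assumed banner shape: "smartctl <major>.<minor> <date> ..."
    match = re.search(r"smartctl (\d+)\.(\d+)", version_output)
    if match is None:
        # Unknown banner format: be conservative and report no JSON support.
        return False
    major = int(match.group(1))
    return major >= 7


# Sample banner text (illustrative, not captured from a real machine).
sample_banner = "smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0] (local build)"
print(smartctl_supports_json(sample_banner))
```

On a real system you would pass in the text printed by smartctl --version instead of the sample string.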

This assumes you use packages. You also have to whitelist the directory where you will save the json files.

# configuration.yaml
homeassistant:

  packages: !include_dir_named packages

  whitelist_external_dirs:
    - /config/smartctl

The yaml expects the json files to be located in /config/smartctl, so create this subdirectory inside the Home Assistant config directory.

/config/smartctl

Create a script that runs smartctl to collect the SMART data, and use root’s crontab (sudo crontab -e) to execute the script regularly (see below).
Change sd{a..m} to match the drives you have, e.g. sd{a..c} will scan /dev/sda, /dev/sdb and /dev/sdc.
Change the target directory so the json files are saved inside the Home Assistant config directory.
I use docker, so for me it’s mapped to /home/mike/.docker/config/homeassistant/smartctl/
The sensor in Home Assistant looks for the files at /config/smartctl

#!/bin/bash
for arg in sd{a..m}; do /usr/sbin/smartctl --info --all --json --nocheck standby /dev/$arg > /home/mike/.docker/config/homeassistant/smartctl/$arg.json; done
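
For anyone who prefers Python over bash for the cron job, here is a rough equivalent of the loop above. It is only a sketch: the helper names (device_names, smartctl_command, poll) are made up for illustration, and the smartctl path and options are copied from the bash script.

```python
import string
import subprocess
from pathlib import Path
from typing import List

SMARTCTL = "/usr/sbin/smartctl"  # same path as in the bash script


def device_names(first: str, last: str) -> List[str]:
    """Expand a letter range like the bash brace expansion sd{a..c}."""
    letters = string.ascii_lowercase
    start, end = letters.index(first), letters.index(last)
    return ["sd" + letter for letter in letters[start:end + 1]]


def smartctl_command(name: str) -> List[str]:
    """Build the smartctl call used in the bash script for one device."""
    return [SMARTCTL, "--info", "--all", "--json",
            "--nocheck", "standby", "/dev/" + name]


def poll(names: List[str], out_dir: Path) -> None:
    """Write one <device>.json file per drive into out_dir."""
    for name in names:
        # smartctl reports conditions through a non-zero exit status,
        # so the return code is deliberately not treated as an error here.
        result = subprocess.run(smartctl_command(name),
                                capture_output=True, text=True)
        (out_dir / (name + ".json")).write_text(result.stdout)
```

Calling poll(device_names("a", "m"), Path("/home/mike/.docker/config/homeassistant/smartctl")) from a script run by root’s crontab would mirror the bash one-liner.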

Duplicate the below yaml for each HDD you want to monitor and use a find + replace to replace “sda” with “sdb” or whatever your HDD is.

homeassistant:      
  customize:
    sensor.hdd_sda:
      friendly_name: PARITY-10TB1 # <-- GIVE YOUR HDD A FRIENDLY NAME IF YOU WANT
      icon: mdi:harddisk

#################################################
#################################################
#################################################

sensor:

  - platform: mqtt
    name: hdd_sda
    state_topic: 'smartctl/sda/state'
    json_attributes_topic: 'smartctl/sda/attributes'

#################################################################################################################################
#                                                                                                                               #
# #!/bin/bash                                                                                                                   #
# /usr/sbin/smartctl --info --all --json --nocheck standby /dev/sda > /home/mike/.docker/config/homeassistant/smartctl/sda.json #
#                                                                                                                               #
#################################################################################################################################

  - platform: command_line
    name: smartctl_sda_json
    command: "/bin/cat /config/smartctl/sda.json" # <--- THIS READS THE .JSON TXT FILE
    value_template: "{{ value_json.smartctl.exit_status }}"
    json_attributes:
      - smartctl
      - device
      - model_name
      - user_capacity
      - smart_status
      - ata_smart_attributes
      - temperature
      - ata_smart_self_test_log
    scan_interval: 30
          
automation:

  - alias: "smartctl_sda"
    trigger:
      - platform: state
        entity_id: sensor.smartctl_sda_json
      - platform: homeassistant
        event: start
    action:
      service_template: >-
        {% if is_state('sensor.smartctl_sda_json','0') %}
          script.smartctl_sda_awake
        {% else %}
          script.smartctl_sda_sleep
        {% endif %}

script:

  smartctl_sda_awake:
    sequence:
      - service: mqtt.publish
        data:
          topic: "smartctl/sda/state"
          payload: "Awake"
          retain: true
      - service: mqtt.publish
        data_template:
          topic: "smartctl/sda/attributes"
          # IF YOU HAVE PROBLEMS WITH THE SENSOR YOU CAN COPY+PASTE THE PAYLOAD INTO HOME ASSISTANT TEMPLATE EDITOR
          payload: >-
            {
              "last updated": "{{ states('sensor.date_time') }}",
              "model name": "{{ state_attr('sensor.smartctl_sda_json','model_name') | string }}",
              "device": "{{ state_attr('sensor.smartctl_sda_json','device').name | string }}",
              "size": "{{ (state_attr('sensor.smartctl_sda_json','user_capacity').bytes / 1000000000000) | round(2)}} TB",
              "temperature": "{{ state_attr('sensor.smartctl_sda_json','temperature').current }}",
              "smart status": "{% if states.sensor.smartctl_sda_json.attributes.smart_status.passed  %} Healthy {% else %} Failed {% endif %}",
              "power on time (hrs)": {% set ns = namespace(found=false) %}{% for i in state_attr('sensor.smartctl_sda_json','ata_smart_attributes').table %}{%- if i.id == 9 %}{% set ns.found = true %}"{{ i.raw.value }}"{% else %}{% endif -%}{% endfor %}{% if not ns.found %}"not available"{% endif %},
              "power cycle count": {% set ns = namespace(found=false) %}{% for i in state_attr('sensor.smartctl_sda_json','ata_smart_attributes').table %}{%- if i.id == 12 %}{% set ns.found = true %}"{{ i.raw.value }}"{% else %}{% endif -%}{% endfor %}{% if not ns.found %}"not available"{% endif %},
              "start stop count":{% set ns = namespace(found=false) %}{% for i in state_attr('sensor.smartctl_sda_json','ata_smart_attributes').table %}{%- if i.id == 4 %}{% set ns.found = true %}"{{ i.raw.value }}"{% else %}{% endif -%}{% endfor %}{% if not ns.found %}"not available"{% endif %},
              "SMART5": {% set ns = namespace(found=false) %}{% for i in state_attr('sensor.smartctl_sda_json','ata_smart_attributes').table %}{%- if i.id == 5 %}{% set ns.found = true %}"{{ i.raw.value }}"{% else %}{% endif -%}{% endfor %}{% if not ns.found %}"not available"{% endif %},
              "SMART187": {% set ns = namespace(found=false) %}{% for i in state_attr('sensor.smartctl_sda_json','ata_smart_attributes').table %}{%- if i.id == 187 %}{% set ns.found = true %}"{{ i.raw.value }}"{% else %}{% endif -%}{% endfor %}{% if not ns.found %}"not available"{% endif %},
              "SMART188": {% set ns = namespace(found=false) %}{% for i in state_attr('sensor.smartctl_sda_json','ata_smart_attributes').table %}{%- if i.id == 188 %}{% set ns.found = true %}"{{ i.raw.value }}"{% else %}{% endif -%}{% endfor %}{% if not ns.found %}"not available"{% endif %},
              "SMART197": {% set ns = namespace(found=false) %}{% for i in state_attr('sensor.smartctl_sda_json','ata_smart_attributes').table %}{%- if i.id == 197 %}{% set ns.found = true %}"{{ i.raw.value }}"{% else %}{% endif -%}{% endfor %}{% if not ns.found %}"not available"{% endif %},
              "SMART198": {% set ns = namespace(found=false) %}{% for i in state_attr('sensor.smartctl_sda_json','ata_smart_attributes').table %}{%- if i.id == 198 %}{% set ns.found = true %}"{{ i.raw.value }}"{% else %}{% endif -%}{% endfor %}{% if not ns.found %}"not available"{% endif %},
              {%- for i in state_attr('sensor.smartctl_sda_json','ata_smart_self_test_log') -%}
                {%- if i == "standard" -%}
                  {%- for x in state_attr('sensor.smartctl_sda_json','ata_smart_self_test_log').standard.table %}
              "TEST {{ loop.index }}": "{{ x.type.string }}, {{ x.status.string }} @ {{x.lifetime_hours }} hrs",
                  {%- endfor -%}
                {%- endif -%}
                {%- if i == "extended" -%}
                  {%- for x in state_attr('sensor.smartctl_sda_json','ata_smart_self_test_log').extended.table %}
                  "TEST {{ loop.index }}": "{{ x.type.string }}, {{ x.status.string }} @ {{x.lifetime_hours }} hrs",
                  {%- endfor -%}
                {%- endif -%}
              {%- endfor %}
              "SMART Key": "5=Reallocated_Sector_Ct 187=Reported_Uncorrect 188=Command_Timeout 197=Current_Pending_Sector 198=Offline_Uncorrectable",
              "SMART Ref": "https://www.backblaze.com/blog/what-smart-stats-indicate-hard-drive-failures/"
            }          
          retain: true
          
  smartctl_sda_sleep:
    sequence:
      - service: mqtt.publish
        data:
          topic: "smartctl/sda/state"
          payload: "Sleep"
          retain: true
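
If the long Jinja lines in the payload are hard to follow, the same lookup can be sketched in Python. It assumes the JSON layout that the template itself relies on (ata_smart_attributes.table entries with id and raw.value, plus user_capacity.bytes); the function names and the sample values are invented for illustration.

```python
import json


def raw_value(report, attribute_id, default="not available"):
    """Return raw.value of the SMART attribute with the given id."""
    table = report.get("ata_smart_attributes", {}).get("table", [])
    for row in table:
        if row.get("id") == attribute_id:
            return row["raw"]["value"]
    # Mirrors the template's "not available" fallback.
    return default


def size_tb(report):
    """Drive size in TB, rounded to 2 decimals like the template does."""
    return round(report["user_capacity"]["bytes"] / 1_000_000_000_000, 2)


# Made-up sample in the shape the template expects.
sample = json.loads("""
{
  "user_capacity": {"bytes": 10000831348736},
  "ata_smart_attributes": {
    "table": [
      {"id": 5, "name": "Reallocated_Sector_Ct", "raw": {"value": 0}},
      {"id": 9, "name": "Power_On_Hours", "raw": {"value": 12345}}
    ]
  }
}
""")

print(raw_value(sample, 9))
print(raw_value(sample, 187))
print(size_tb(sample))
```

Testing logic like this against a saved sda.json outside Home Assistant can be quicker than pasting the payload into the template editor.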

Very nice, and thanks for sharing, but it only works with local drives on the machine where you are running HA, right? I have not seen any way to use it remotely as a monitoring tool, for example for some servers.

Are you using snapraid by any chance ?

Yes, I’m set up for snapraid, but I don’t have it enabled as I’ve been changing some things.

Added some extra code to show the recent smart self test data as well.

Yes, you could easily do that. There are scripts on the internet that will do this and then email you the results. You could probably modify one of them, then use an HTTP sensor and a curl command to get the data into Home Assistant.
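
As a rough idea of what the remote side could look like, here is a Python sketch that builds a request for Home Assistant’s REST API (POST to /api/states/<entity_id> with a long-lived access token). The base URL, token and entity name are placeholders, and this is not something from my own setup.

```python
import json
import urllib.request


def build_state_request(base_url, token, entity_id, state, attributes):
    """Build (but do not send) a request that sets an entity's state."""
    body = json.dumps({"state": state, "attributes": attributes}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/api/states/" + entity_id,
        data=body,
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values: replace with your own URL, token and entity name.
request = build_state_request(
    "http://homeassistant.local:8123",
    "YOUR_LONG_LIVED_TOKEN",
    "sensor.remote_hdd_sda",
    "Healthy",
    {"temperature": 34},
)
# urllib.request.urlopen(request) would actually send it.
```

The remote server would run smartctl itself, pick out the fields it cares about and send them this way, which is the same thing the curl approach does.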

Is the exit status the correct way to test if a drive is awake or asleep? I have a drive with exit status 128 and one with 32 which are both awake.

I think the automation should be like this:

automation:

  - alias: "smartctl_sda"
    trigger:
      - platform: state
        entity_id: sensor.smartctl_sda_json
      - platform: homeassistant
        event: start
    action:
      service_template: >-
        {% if states('sensor.smartctl_sda_json')|int|bitwise_and(2)>0 %}
          script.smartctl_sda_sleep
        {% else %}
          script.smartctl_sda_awake
        {% endif %}

According to the exit codes from here: https://linux.die.net/man/8/smartctl

Edit: Better to use a bitmask.
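
To make the bitmask idea concrete, here is a small Python sketch that decodes the exit status using the bit meanings from the smartctl man page linked above. The function names are made up. Note that bit 1 (value 2) also covers "device open failed", so it is not strictly a sleep-only flag.

```python
# Bit meanings paraphrased from the smartctl man page.
EXIT_STATUS_BITS = {
    0: "command line did not parse",
    1: "device open failed or device is in a low-power mode",
    2: "a SMART or other ATA command failed, or checksum error",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes at or below threshold",
    5: "attributes were at or below threshold at some time in the past",
    6: "device error log contains records of errors",
    7: "device self-test log contains records of errors",
}


def decode_exit_status(status):
    """Return the meanings of all bits set in the smartctl exit status."""
    return [meaning for bit, meaning in sorted(EXIT_STATUS_BITS.items())
            if status & (1 << bit)]


def low_power_or_unavailable(status):
    """True if bit 1 is set, the same test as bitwise_and(2) > 0."""
    return bool(status & 2)


for status in (0, 2, 32, 128):
    print(status, low_power_or_unavailable(status), decode_exit_status(status))
```

With this, 128 and 32 decode to self-test log errors and past threshold crossings, which is why those drives were awake even though the exit status was not 0.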

ahh you are right

Finally got around to taking a look at this for some data gathering I wanted.

You may have included it already but there is an interesting trick I’ve used to avoid duplicating the automation across all drives. It also lets you set the sensor based on drive and get the right data irrespective of what the underlying Linux OS does with the drive mount point - handy if a disk goes missing to stop the rest of the sensors getting the wrong data.

In short, have all smartctl data sent to the same MQTT topic, and then the automation republishes it to a target sensor chosen by the disk serial number it received rather than by any assumptions about device names.

For the example below you could swap the disk/motherboard connections around in the machine and the reporting would still be right without any changes.

sensor smartctlJson:
  name: "smartctl_json"
  platform: mqtt
  state_topic: "/home/sheridan/disks/smartctl_json"
  json_attributes_topic: "/home/sheridan/disks/smartctl_json"
  json_attributes_template: "{{ value_json | tojson }}"

sensor disk-OS-1:
  name: "Disk_OS_1"
  platform: mqtt
  state_topic: "/home/sheridan/disks/S4CJNJ0N303324F"
  json_attributes_topic: "/home/sheridan/disks/S4CJNJ0N303324F/attributes"

sensor disk-OS-2:
  name: "Disk_OS_2"
  platform: mqtt
  state_topic: "/home/sheridan/disks/S4CJNJ0N303308P"
  json_attributes_topic: "/home/sheridan/disks/S4CJNJ0N303308P/attributes"

automation:
  - alias: 'Publishing smartctl json'
    description: Take all smartctl json from the same topic and publish to the correct disk based on content
    trigger:
    - platform: mqtt
      topic: /home/sheridan/disks/smartctl_json
    condition: []
    action:
    - service: mqtt.publish
      data:
        topic: "/home/sheridan/disks/{{ state_attr('sensor.smartctl_json', 'serial_number') }}"
        payload: "{% if states.sensor.smartctl_json.attributes.smart_status.passed  %} Healthy {% else %} Failed {% endif %}"
        retain: true
    - service: mqtt.publish
      data:
        topic: /home/sheridan/disks/{{ state_attr('sensor.smartctl_json', 'serial_number') }}/attributes
        payload: >-
          {
            "last updated": "{{ states('sensor.date_time') }}",
            "device name": "{{ state_attr('sensor.smartctl_json','device').name | string }}",
            "model": "{{ state_attr('sensor.smartctl_json','model_name') | string }}",
            "smart status": "{% if states.sensor.smartctl_json.attributes.smart_status.passed  %} Healthy {% else %} Failed {% endif %}",
            "temperature": "{{ state_attr('sensor.smartctl_json','temperature').current }}"
          }
        retain: true
    mode: single
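
The routing step can be sketched outside Home Assistant as well. This Python version assumes the smartctl JSON has a top-level serial_number field (as the automation above does) and reuses my topic prefix; the function names and the sample report are illustrative only.

```python
def topics_for(report, base="/home/sheridan/disks"):
    """Return (state_topic, attributes_topic) for a smartctl report."""
    serial = report.get("serial_number")
    if not serial:
        # No serial number means there is nowhere safe to route the data.
        return None
    state_topic = base + "/" + serial
    return state_topic, state_topic + "/attributes"


def summary(report):
    """Build the state and attributes the automation publishes."""
    passed = report.get("smart_status", {}).get("passed", False)
    return {
        "state": "Healthy" if passed else "Failed",
        "attributes": {
            "device name": report.get("device", {}).get("name"),
            "model": report.get("model_name"),
            "temperature": report.get("temperature", {}).get("current"),
        },
    }


# Made-up report in the shape the automation reads.
sample_report = {
    "serial_number": "S4CJNJ0N303324F",
    "model_name": "Example SSD",
    "device": {"name": "/dev/sda"},
    "smart_status": {"passed": True},
    "temperature": {"current": 31},
}
print(topics_for(sample_report))
print(summary(sample_report)["state"])
```

Because the topic is derived from the serial number, the result stays the same even if the kernel hands the disk a different /dev/sdX name after a reboot.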

Hello!
Nice work here.
How about creating an add-on/integration?

Hello!
Me again.
How often should crontab poll the HDD? Polling it every minute prevents it from spinning down.
Spin down is set to 30 minutes of inactivity.

That polling command should still allow the HDD to sleep and will not wake an HDD.

That would be good, but I’m not a programmer and I don’t know any programming language.

The polling command does not wake the HDD if it is asleep, but it prevents the HDD from going to sleep.
I ran a few tests and concluded that if the polling interval is longer than the spindown time, everything works as intended.
For some tests I used 5 min for spindown and 10 min for polling. That worked well.
But with 10 min for spindown and 5 min for polling, the HDD never spins down.

Any thoughts?

How about using 30-60 min for spindown? Would the polling interval then have to be longer than the spindown time? That would not be very accurate…

Ahhh, I see. Yes, best to poll less often than the sleep interval then.

I made a modification in the for loop in the payload to avoid warnings when no smart tests have been performed yet:

            {%- for i in state_attr('sensor.smartctl_sda_json','ata_smart_self_test_log') -%}
                {%- if i == "standard" -%}
                  {%- if 'table' in state_attr('sensor.smartctl_sda_json','ata_smart_self_test_log').standard.keys() -%}                  
                    {%- for x in state_attr('sensor.smartctl_sda_json','ata_smart_self_test_log').standard.table %}
                "TEST {{ loop.index }}": "{{ x.type.string }}, {{ x.status.string }} @ {{x.lifetime_hours }} hrs",
                    {%- endfor -%}
                  {%- endif -%}
                {%- endif -%}
                {%- if i == "extended" -%}
                  {%- if 'table' in state_attr('sensor.smartctl_sda_json','ata_smart_self_test_log').extended.keys() -%}                  
                    {%- for x in state_attr('sensor.smartctl_sda_json','ata_smart_self_test_log').extended.table %}
                    "TEST {{ loop.index }}": "{{ x.type.string }}, {{ x.status.string }} @ {{x.lifetime_hours }} hrs",
                    {%- endfor -%}
                  {%- endif -%}  
                {%- endif -%}
              {%- endfor %}
              "SMART Key": "5=Reallocated_Sector_Ct 187=Reported_Uncorrect 188=Command_Timeout 197=Current_Pending_Sector 198=Offline_Uncorrectable",
              "SMART Ref": "https://www.backblaze.com/blog/what-smart-stats-indicate-hard-drive-failures/"

Otherwise home-assistant will throw warnings in the log because the table key does not exist.
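
The same guard, sketched in Python for anyone testing the logic against a saved json file. The function name and sample data are made up. Unlike the template, this version numbers the tests continuously across both logs, so the keys cannot collide.

```python
def self_test_entries(report):
    """Collect self-test log rows, tolerating a missing 'table' key."""
    log = report.get("ata_smart_self_test_log", {})
    entries = {}
    index = 0
    for kind in ("standard", "extended"):
        # .get(..., []) is the equivalent of the "'table' in ...keys()" check.
        for row in log.get(kind, {}).get("table", []):
            index += 1
            entries["TEST %d" % index] = "%s, %s @ %s hrs" % (
                row["type"]["string"],
                row["status"]["string"],
                row["lifetime_hours"],
            )
    return entries


# A drive that has never run a self-test: the 'table' key is absent.
no_tests = {"ata_smart_self_test_log": {"standard": {"revision": 1, "count": 0}}}

# A drive with one completed short test (made-up values).
one_test = {"ata_smart_self_test_log": {"standard": {"table": [
    {"type": {"string": "Short offline"},
     "status": {"string": "Completed without error"},
     "lifetime_hours": 100},
]}}}

print(self_test_entries(no_tests))
print(self_test_entries(one_test))
```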

Working great, no more useless logs.
Thank you!

I don’t know how to do this, but if smartmontools (http://www.smartmontools.org) was running on other computers on the same network, could this HASS add-on collect SMART data from them and provide that through sensors?