The code appears to implement the correct fail-safe mode: when it’s unable to restore the automation’s previous state, it sets it to true. The automation is turned on. That’s definitely a better fail-safe mode than turning them all off.
HOWEVER, all automations turned off is the behavior people are reporting. The implication is that either the restoration feature has a faulty fail-safe mechanism or (long shot) the aborted startup process doesn’t even get to execute the restoration step.
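To make the distinction concrete, here is a minimal sketch of the decision logic as I understand it from this discussion. It is not Home Assistant's actual code: the function name resolve_enabled and its parameters are hypothetical, and the ordering (explicit initial state first, then the saved state, then the fail-safe) is my assumption based on the behavior described here.

```python
from typing import Optional


def resolve_enabled(initial_state: Optional[bool], restored_state: Optional[str]) -> bool:
    """Decide whether an automation starts enabled (hypothetical helper)."""
    # An explicit initial_state in the configuration takes priority.
    if initial_state is not None:
        return initial_state
    # Otherwise fall back to the state saved before the restart, if any.
    if restored_state is not None:
        return restored_state == "on"
    # Fail-safe: nothing could be restored, so default to enabled.
    return True


print(resolve_enabled(None, None))   # nothing saved: fail-safe applies
print(resolve_enabled(None, "off"))  # a saved "off" is restored as-is
print(resolve_enabled(True, "off"))  # explicit initial_state overrides
```

Note that in this sketch the fail-safe only applies when there is nothing to restore. If an aborted startup had already saved the automation as off, the restore step would faithfully bring it back as off and the fail-safe would never run. Whether that is what actually happens is exactly what needs investigating.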
What I find interesting is that before anyone pointed out the relevant fail-safe code, and how it defaults to on, it was generally accepted that a startup failure would result in automations set to off, not on.
In fact, that behavior was documented. The following excerpt comes from an older version of the documentation:
If you don’t set this then the previous state prior to restart is restored. However, if you shut down Home Assistant again before it finishes starting, any automation that doesn’t have the initial state set to true will be stored as being off, and those automations will be disabled at the next startup.
From an even older version:
If you don’t set this the previous state is restored. If you shut Home Assistant down before it finishes starting, the automation will be stored as being off, and your automations will be disabled at the next startup.
It was this observed behavior that led many users to wallpaper all their automations with initial_state: true to ensure they wouldn’t be disabled after a failed startup.
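For anyone unfamiliar with that workaround, it looks roughly like this in an automation's configuration. The alias, trigger, and entity names below are made up for illustration; the only line that matters here is initial_state: true.

```yaml
automation:
  - alias: "Example automation"        # hypothetical name
    initial_state: true                # force the automation on at startup
    trigger:
      - platform: state
        entity_id: binary_sensor.example_motion   # hypothetical entity
        to: "on"
    action:
      - service: light.turn_on
        entity_id: light.example_lamp             # hypothetical entity
```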
The current documentation, with its false and misleading statements, only makes a bad situation worse. Yet it’s just the tip of the iceberg. There’s no ‘automation’ in ‘home automation’ if a restart glitch disables all automations.
I applaud you, and all others involved, for starting an investigation into this issue. If you need our help to collect more data on this undesirable behavior, I’m ready and willing to assist.
EDIT
Here’s a data point: I have never experienced all automations disabled after a restart. My test system is restarted several times daily (I use it to experiment with new automations in the course of assisting community members).
However, its startup process takes less than 10 seconds and doesn’t rely on any external devices or services (Z-Wave, Zigbee, Google, Amazon, Nest, etc.) other than MQTT, which runs on a separate machine. In other words, there’s very little that can go wrong during my test system’s startup other than me making configuration errors. So my ‘surface area’ for startup failure is probably much smaller than for other people.