Automations and Traces

This is an odd problem I’d like to understand more about.

I use a Home Assistant Dashboard as my bedside clock radio alarm. An Automation is triggered at the time set in an input_datetime helper and which plays a (Music Assistant) media_player and sets a Timer that stops the playing after a specified time.

This has been working for a few weeks although I’m still ironing out some wrinkles in the whole process. However, one thing that has always been 100% accurate and reliable is the triggering of the Automation.

Not so much this morning however as the alarm did not trigger. There was a power cut yesterday and I find that can ‘upset’ HA, but this was a specific failure and I checked a few things.

I can see in the Traces that Yesterday morning the media_player started playing and the Timer was started, all at the correct alarm time and triggered by the Automation - but that Automation has no record of that. Its Traces show the alarm of the day before and also this morning (more on that below), but nothing for yesterday morning, when it obviously did actually trigger and run.

Then this morning, the alarm did not trigger at the specified time. No Automation trigger, no Timer started and the media_player did not play. As if the Automation was disabled (which it was not). Looking at all the Traces etc. I can see no reason why it simply did not trigger when it should.

That last point being pertinent as 1 hour, 6 minutes and a few seconds after it should, the Automation DID trigger (at the set alarm time according to the Trace, but clearly that was incorrect), the player started playing and the Timer was started.

I do find that a power cut gives HA indigestion, but why does that actually occur (time is time, independent of whether HA lost power in the meantime) and how could it cause an Automation to lose its (then most recent) Trace and subsequently trigger at an odd time just over an hour after it should?

Just want to try and understand more about this, so hoping for some insight into HA’s inner workings. :confused:

Are you running HA on a VM? That’s often the underlying cause for time triggers not behaving.

Please post the properly-formatted YAML for your automation.

No, I’m running Supervisor on a dedicated Orange Pi.

I can post anything of relevance to help track this down and here’s the Automation:-

alias: CRA - MasterBedroom - Weekday
description: Activate {{ WHERE }} alarm on {{ WHEN }}
triggers:
  - trigger: time
    at: input_datetime.radio_alarm_masterbedroom_weekday
    weekday:
      - mon
      - tue
      - wed
      - thu
      - fri
conditions: []
actions:
  - action: script.cralarms
    data:
      WHERE: "{{ trigger.entity_id.split('_')[-2] }}"
      WHEN: "{{ trigger.entity_id.split('_')[-1] }}"
mode: single

Anything else be of help?

BTW, the input_datetime is simply set at 07:25 and has not changed.
The template in the description is just a note to me and not intended to be evaluated at all.
The reason for the templating in the action: is that I have several of these alarm Automations (different rooms, different days) and being able to simply copy the YAML between them makes configuring them much simpler.

In any case, that seems to have nothing to do with the triggering and this morning it simply did not trigger when it should, but then did just over an hour later and this makes no sense to me.

I do think it’s power-cut related, but how can that affect a time trigger?

Just had a thought.

I’m calling the script directly which means if there’s a problem in that called script, this one will simply wait. Which would prevent it triggering subsequently.

Maybe this had occurred and after it failed this morning and I was ferreting around in the all the config, I did something that freed up the ‘blockage’ and it immediately kicked off, even though an hour late.

Perhaps I should change to script_turn_on: to prevent a reoccurrence.

I would then need to figure what the problem was with script.cralarms.

Ok, another oddity. I’ve just noticed another Automation doing the same thing.

This one has multiple triggers, some of which seem to be triggering when they should, but just seen the Traces of a couple of triggers that occurred late - bizarrely the exact same 1 hour, 6 minutes and 42 seconds late.

I wondered if maybe the recent change from BST back to GMT (UTC) might be involved, but that’s an exact 1 hour and only some triggers have been affected.

What’s so special about 66 minutes?

Here’s a screenshot of the trace showing when it was triggered and in the background is the window showing the actual value of the input_datetime it is using for the trigger.

It is simply not being triggered when it should be and no script calling in this Automation so that’s not the issue.

alias: CRA - set volumes by time
description: Set lower volume for evening/nightime listening
triggers:
  - trigger: time
    at: input_datetime.radio_volume_masterbedroom_day
  - trigger: time
    at: input_datetime.radio_volume_masterbedroom_night
  - trigger: time
    at: input_datetime.radio_volume_masterbathroom_day
    enabled: false
  - trigger: time
    at: input_datetime.radio_volume_masterbathroom_night
    enabled: false
  - trigger: time
    at: input_datetime.radio_volume_mainbathroom_day
    enabled: false
  - trigger: time
    at: input_datetime.radio_volume_mainbathroom_night
    enabled: false
  - trigger: time
    at: input_datetime.radio_volume_frontbedroom_day
    enabled: true
  - trigger: time
    at: input_datetime.radio_volume_frontbedroom_night
    enabled: true
  - trigger: time
    at: input_datetime.radio_volume_kitchen_day
  - trigger: time
    at: input_datetime.radio_volume_kitchen_night
conditions: []
actions:
  - if:
      - condition: template
        value_template: "{{ states(MPENTITY) not in ['playing', 'buffering'] }}"
    then:
      - action: media_player.volume_set
        data:
          volume_level: "{{ VOL2USE }}"
        target:
          entity_id: "{{ MPENTITY }}"
variables:
  WHERE: "{{ trigger.entity_id.split('_')[-2] }}"
  MPENTITY: media_player.airplayer_{{ WHERE }}
  VOL2USE: "{{ states(trigger.entity_id | replace('datetime', 'number')) | int /100 }}"
mode: parallel
max: 10

These are the last few traces of this Automation:-


The latest one is the correct time, but all the earlier ones are 66 mins and 42 secs late. That’s 4,002 seconds.

WTF?

How long did the power outage last ? 1h 6m 42s?

Simplest solution might be a UPS for the Pi.

Funny you should say that but my wife just suggested the same thing and it is very possible that the power was indeed out for precisely that length of time.

However, I am still puzzled as to why it then caused that delay to some Automation triggers. The internal clock was not incorrect so it’s not that. It’s almost as if Home Assistant queues up several future triggers and so they got put back by the power out time. Still makes no sense though as I would expect HA to simply be watching the clock NOW and the fact power had been out should be irrelevant.

It would however possibly explain why I do seem to get erratic behaviour from HA after there has been a power cut (annoyingly not that uncommon here). Not everything, but some events are ‘off’’ and it takes a day or so to settle down.

Indeed, this morning the alarm triggered at precisely the correct time. :grinning:

So the same Automation and same trigger, like clockwork for weeks before the power cut, then delayed by that 66 mins. and 42 secs. (very possibly the length of time the power was out) for 2 days and now correct again. All without any changes to any code.

Anyone knowing enough about the internal workings of HA able to explain this?

A UPS would of course avoid this problem, but we’re planning on installing a solar panels and battery storage system with instant switchover gateway that will solve the problem for the entire house. So I’m trying to avoid wasting time and money on short term solutions in the meantime.

I can’t really shed light on the reason although I thought Pis didn’t have an RTC. They get time from the cloudflare ntp server.
Adding a UPS or restarting HA should fix it.

  • Does your model of Orange Pi have an RTC?

  • Is the the RTC’s battery installed?

  • Is the battery functional?

Try this experiment:

  1. Login to the Orange Pi and check its system time.
  2. Disconnect the Orange Pi from the Internet (assuming it uses an Internet-based NTP server as a fallback for time reference and not a local one, like maybe on your router).
  3. Reboot the Orange Pi (shutdown, wait a few seconds, restart).
  4. Login again (ensure it’s still disconnected from the Internet) and check the system time.

If there’s a significant discrepancy from the previous time then it means the RTC is not functional. Your Orange Pi relies on an Internet-based NTP server (unavailable when the Internet is down).

1 Like

I’ll look into that further, but it still makes little sense to me.

The HA dashboard in question shows a big digital clock, as well as the media_player that is set to play when the alarm is triggered.

No matter the power cut or anything else, all HA dashboard clocks are always correct and I assumed Automations trigger from that same clock. Yet when that (HA’s) clock showed 07:25 (the specified trigger time), the Automation did not trigger. When the clock showed 08:31 (and 42 secs. later but secs not displayed) the 07:25 alarm did trigger.

So the clock/time being used to trigger the alarm Automation is NOT the same as it was actually displaying on screen.

Does HA use 2 independent clocks for different purposes?

In fact, is the displayed clock in the dashboard actually provided purely by the card, which might get its own time from the Internet, while HA is using a different internal clock for its triggering and that halts when power is cut off (obviously), but due to the lack of proper shutdown procedures, just carries on from where it stopped when power is restored, rather than immediately getting a fresh and correct time.

So Automations will be triggered late, by the length of time the power was out - until HA updates its clock. Hence there can be several late triggers until it sorts itself out by correcting its internal clock. I experienced 2 mornings with late triggers, then on the third day, it had corrected itself. I did not restart HA in the meantime so I could hopefully get some idea what was occurring.

I don’t believe the Orange Pi has a battery, so no permanent RTC. However in that case, don’t such devices simply grab time from the Internet ASAP and then run from that. So my understanding is that any internal clock should have been correct from restart.

Actually, in the early days, I seem to recall that with no Internet, a batteryless device required its internal clock to be be manually set. Nowadays, software can do that as soon as the Internet connection is established.

The remaining question is what clock does HA use to trigger Automations.

None of this is crucially important of course and future installation of home battery storage etc will eliminate the problem in any case. However, I’m just interested in trying to figure out what might cause it.

The vast majority of time-related issues reported on this forum have been caused by the system clock (or misconfiguration of the location/timezone).

The odds of the cause being some unique bug in Home Assistant are very low.

A system clock problem.

Yes, but it’s successful only if it’s reachable at the moment it attempts to access it.

If you have a means of logging in to your Orange Pi while it’s disconnected from your local network:

  • Disconnect it from your local network
  • Shut it down for an hour
  • Turn it back on, log in and check its system clock time (it has no access to your LAN, the Internet, or an RTC).

The observed system time will be used by Home Assistant (for all time-based operations, notably automations). It will not be adjusted until your system can establish a connection to an NTP server (time reference).

I checked and the Orange Pi has an RTC, but no battery for backup.

It runs chronyd which acts as an NTP client (and server, but that’s not relevant) and is configured to sync the RTC with the system clock every 11 minutes. They certainly are currently in sync and it knows the timezone and current daylight savings.

I do know that this time, when power was restored the O-Pi started before my local DNS so wouldn’t have been able to access external resources and so clocks could have been out of sync and just plain wrong.

However, that could only have lasted for 11 minutes, then it should have grabbed accurate time from the configured NTP server and sync’d the clocks. So still doesn’t explain why it’s out of wack for more than a day.

Actually that’s not completely true. it syncs the 2 clocks every 11 mins, but I don’t know how often it uses NTP to correct the local clocks. I seem to recall that is based on how accurate is the local timekeeping? So maybe it wouldn’t hit a configured NTP server for over a day and during that time its clocks would be wrong. But why would they be slow by the time the power was out. When starting up and no access to the 'Net, the clocks would be unset. It has no way of knowing what the time was when power shut off.

Also, you’re saying that HA uses system time for e.g. automation triggers which makes sense. But if that was out by that 66 mins, how could the dashboards show the correct time - unless clock type cards get their own time from the 'Net independently and don’t use system time, that HA uses. I don’t know enough about HA dashboard’s internal workings to determine that either way.

Like I said, this is no big problem, just an interesting intellectual exercise to understand what happens and I guess I will have to wait until it happens again and then immediately check the O-Pi’s system and RTC clocks to see what they show.

… from the client displaying it.