Home Assistant down

I am currently away from home, and Home Assistant has been unreachable for a few hours.
No, I did not tinker with any settings, but yes, automatic updates for add-ons are turned on.

HA is running on an RPi 4 with HA OS, and it has been running reliably for quite some time now. It is installed on an SSD; no SD card is part of the system.

I have access to my local network via VPN and I can still ping the RPi. I guess Nginx/some router is still running since the Android app shows a 502 error.

Was any erroneous update released in the last few days?

What are my options for accessing the system? I already tried the app, the direct IP, SSH, and Samba. None of these work.
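Since the Pi answers ping but none of the usual routes work, it can help to check which TCP services still accept connections at all. Below is a minimal sketch in Python. The port numbers (8123 for the HA web UI, 22 for SSH, 445 for Samba) are assumed defaults and may differ on a given setup, and `probe` and `port_open` are illustrative names, not part of any Home Assistant tooling.

```python
import socket

# Assumed default ports; adjust them to whatever the installation really uses.
DEFAULT_PORTS = {"ha_ui": 8123, "ssh": 22, "samba": 445}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def probe(host, ports=DEFAULT_PORTS, timeout=3.0):
    """Map each service name to whether its port accepted a connection."""
    return {name: port_open(host, port, timeout) for name, port in ports.items()}
```

Calling `probe("192.168.1.50")` (a placeholder address) over the VPN returns a dict of service names and booleans. If every port is closed while ping still works, the network stack is alive but the services are not, which matches the situation described here.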

Unfortunately I never set up debug access (Debugging the Home Assistant Operating System | Home Assistant Developer Docs).

How are you accessing HA? Nabu Casa? Something else?

I use port forwarding and Let’s Encrypt for SSL encryption.

I can’t access HA locally (via VPN) either, though.

core_2023.7.2 came out half a day ago, though I wouldn’t call it "erroneous".

Don’t you have a way to reboot your RPi 4 through the VPN?

Not that I am aware of. There were a few addon updates. Rather brave of you to update them automatically while you are away.

So it is on the network but no services appear to be running.

Your only option in this case is to connect a keyboard and monitor to see what is going on.

A hard reboot sometimes helps in those cases. I’ve had it happen, especially on OS updates. (BTW, there was a new OS beta release the other day.)

For exactly that reason I have a power plug that runs on a system other than HA and is reachable from outside the network (for when I’m away from home).

I’m aware this doesn’t help you now, though… maybe ask someone who is taking care of your home to pull the plug…

Already ordered a Wi-Fi plug to be able to power cycle the RPi without HA :slight_smile:

Yes, connecting a keyboard and monitor now would be helpful…

I guess I’ll leave it be for now and will see in a week whether I lost data. Neither my life nor that of my home depends on HA, so everything will be fine.

Future self will turn off automatic updates!
Will report here as soon as I know what happened. As mentioned, HA has been running for 1.5 years now without any hiccups.

Came back home today. Connected a screen and a keyboard to the RPi - no output whatsoever.

The RPi was quite hot. Did a power cycle and the RPi incl. HA came up again like nothing happened.

Any chance of still finding out what happened?
I now have an HA-independent Shelly plug in front of the RPi, but I don’t want to ever have to use it, and I’m curious to find out what caused the so-far-rock-solid HA to fail :slight_smile:

Could be related to Homeassistant container memory leak - Out of Memory · Issue #93713 · home-assistant/core · GitHub - updated to 2023.7.3 now where this is fixed: Fix task leak on config entry unload/retry by bdraco · Pull Request #96981 · home-assistant/core · GitHub

Look in the log.

Checked the logs of supervisor & core using journalctl (with help of How to get to your log after restart/restore & Debugging the Home Assistant Operating System | Home Assistant Developer Docs).
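When reading journal output that spans a crash, the interesting lines are the ones right before each `-- Boot <id> --` marker. The sketch below assumes the journalctl output has been saved as plain text in the format shown in this thread and pulls out the last few lines logged before each boot. The function name and the `count` parameter are illustrative.

```python
import re

# Matches the boot separator lines that journalctl prints between boots.
BOOT_MARKER = re.compile(r"^-- Boot ([0-9a-f]+) --$")


def last_lines_before_boots(journal_text, count=3):
    """Return {boot_id: last `count` non-empty lines logged before that boot}."""
    result = {}
    buffer = []
    for line in journal_text.splitlines():
        match = BOOT_MARKER.match(line.strip())
        if match:
            # Everything buffered so far belongs to the previous boot.
            result[match.group(1)] = buffer[-count:]
            buffer = []
        elif line.strip():
            buffer.append(line)
    return result
```

Running this over the combined output of several units makes it easier to see which service logged last before the hang, rather than comparing excerpts by eye.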

The following are the last few log messages from Supervisor & Core, plus all log messages preceding the hiccup.

Supervisor

Seems to be irrelevant, since the system was still running 1 min later (see “Preceding”)

Jul 21 09:36:54 homeassistant hassio_supervisor[569]: 23-07-21 11:36:54 INFO (MainThread) [supervisor.homeassistant.api] Updated Home Assistant API token
Jul 21 09:39:26 homeassistant hassio_supervisor[569]:
Jul 21 09:39:34 homeassistant hassio_supervisor[569]: 23-07-21 11:39:34 INFO (MainThread) [supervisor.homeassistant.secrets] Request secret backup_password
Jul 21 09:39:47 homeassistant hassio_supervisor[569]: 23-07-21 11:39:47 INFO (MainThread) [supervisor.updater] Fetching update data from https://version.home-assistant.io/stable.json
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state CoreState.RUNNING
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.MULTIPLE_DATA_DISKS/ContextType.SYSTEM
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.PWNED/ContextType.ADDON
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.NO_CURRENT_BACKUP/ContextType.SYSTEM
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.DNS_SERVER_IPV6_ERROR/ContextType.DNS_SERVER
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.DNS_SERVER_FAILED/ContextType.DNS_SERVER
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.IPV4_CONNECTION_PROBLEM/ContextType.SYSTEM
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.FREE_SPACE/ContextType.SYSTEM
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.DOCKER_CONFIG/ContextType.SYSTEM
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.TRUST/ContextType.SUPERVISOR
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.SECURITY/ContextType.CORE
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.check] System checks complete
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.RUNNING
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state CoreState.RUNNING
Jul 21 09:42:42 homeassistant hassio_supervisor[569]: 23-07-21 11:42:42 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
Jul 21 09:44:26 homeassistant hassio_supervisor[569]:
Jul 21 09:44:43 homeassistant hassio_supervisor[569]: 23-07-21 11:44:43 INFO (MainThread) [supervisor.homeassistant.secrets] Request secret backup_password
-- Boot 6fc9a46b0f764aa78a1ee9315dcb89a4 --
Jul 29 18:24:25 homeassistant hassio_supervisor[553]: s6-rc: info: service s6rc-oneshot-runner: starting

Core

Seems to be completely unrelated

Jul 21 09:36:41 homeassistant homeassistant[569]: 2023-07-21 11:36:41.986 ERROR (MainThread) [homeassistant.components.automation.getter_heizungsmodus] Heizung: Modus get: Error executing script. Error for call_service at pos 1: Invalid option: unknown (possible options: aus, nur WW, Heizen u
Jul 21 09:36:41 homeassistant homeassistant[569]: 2023-07-21 11:36:41.995 ERROR (MainThread) [homeassistant.components.automation.getter_heizungsmodus] Error while executing automation automation.getter_heizungsmodus: Invalid option: unknown (possible options: aus, nur WW, Heizen und WW)
Jul 21 09:37:42 homeassistant homeassistant[569]: 2023-07-21 11:37:42.879 ERROR (MainThread) [homeassistant.components.automation.getter_heizungsmodus] Heizung: Modus get: Error executing script. Error for call_service at pos 1: Invalid option: unknown (possible options: aus, nur WW, Heizen u
Jul 21 09:37:42 homeassistant homeassistant[569]: 2023-07-21 11:37:42.888 ERROR (MainThread) [homeassistant.components.automation.getter_heizungsmodus] Error while executing automation automation.getter_heizungsmodus: Invalid option: unknown (possible options: aus, nur WW, Heizen und WW)
-- Boot 6fc9a46b0f764aa78a1ee9315dcb89a4 --
Jul 29 18:25:51 homeassistant homeassistant[553]: s6-rc: info: service s6rc-oneshot-runner: starting

Preceding

Jul 21 09:45:29 homeassistant addon_core_configurator[569]: INFO:2023-07-21 11:45:29,281:hass_configurator.configurator:127.0.0.1 - "GET / HTTP/1.1" 200 -
Jul 21 09:45:31 homeassistant hassio_dns[569]: [INFO] 127.0.0.1:48945 - 8213 "PTR IN XXX.XXX.XXX.XXX.in-addr.arpa. udp 56 true 2048" NXDOMAIN qr,rd,ra 45 0.025264369s
Jul 21 09:45:31 homeassistant hassio_dns[569]: [INFO] 172.30.32.1:54066 - 8213 "PTR IN XXX.XXX.XXX.XXX.in-addr.arpa. udp 45 false 512" NXDOMAIN qr,rd,ra 45 0.040048083s
Jul 21 09:45:31 homeassistant hassio_dns[569]: [INFO] 127.0.0.1:48945 - 21260 "PTR IN XXX.XXX.XXX.XXX.in-addr.arpa. udp 55 true 2048" NXDOMAIN qr,rd,ra 44 0.024549987s
Jul 21 09:45:31 homeassistant hassio_dns[569]: [INFO] 172.30.32.1:39321 - 21260 "PTR IN XXX.XXX.XXX.XXX.in-addr.arpa. udp 44 false 512" NXDOMAIN qr,rd,ra 44 0.038033026s
Jul 21 09:45:31 homeassistant hassio_dns[569]: [INFO] 127.0.0.1:48945 - 49743 "PTR IN XXX.XXX.XXX.XXX.in-addr.arpa. udp 55 true 2048" NXDOMAIN qr,rd,ra 44 0.007673401s
-- Boot 6fc9a46b0f764aa78a1ee9315dcb89a4 --
Apr 04 10:55:35 homeassistant systemd-timesyncd[544]: System clock time unset or jumped backwards, restoring from recorded timestamp: Sat 2023-07-29 18:21:40 UTC

Do you have an automated backup that would be asking for that secret?

Yes, this is a secret used by Home Assistant Google Drive Backup. A backup is done daily at 3 AM.

That seems to be the last thing logged.

hassio-dns was still running 1 min later (see “Preceding”)

I guess I was fooled by the term “preceding”.

hassio-dns is a separate container though.

This happened again, and again while I was away from home.

Did some more research (e.g. HA on Rasperry Pi 4 keeps hanging up) and found out it might be connected to me running HA on a Raspberry Pi with an SSD connected via USB.

Turns out this setup does not always deliver enough power to the SSD, and then HA (or whatever else would be running on the Pi) fails in weird ways, even when using the official RPi 3A power supply.
Still kinda funny that this always happened when I wasn’t home :smiley:

So I reactivated an old Intel NUC, turned its TDP down and moved my HA setup over.

Now

  • power consumption is roughly the same
  • speed is much faster than on the RPi
  • HA is rock solid - let’s see when I go away for a while again…
  • HA is running in a VM and I have other stuff running in separate VMs now - finally concerns are separated and HA is solely used for home automation
  • I have a RPi available for tinkering

So not an HA problem after all. :slight_smile:

Not at all :slight_smile:

I guess the old adage is still true: check power first!
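For anyone who wants to check power on a Pi before swapping hardware: the Raspberry Pi firmware exposes a throttling/under-voltage bitmask via `vcgencmd get_throttled`. The sketch below only decodes that command's output string. It assumes `vcgencmd` is available on the system, which may not be the case in a stock HA OS shell, and it would only reveal under-voltage seen by the Pi itself, not necessarily a shortfall on the USB side of an SSD. The function name is illustrative.

```python
# Bit meanings of the value reported by `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected (now)",
    1: "ARM frequency capped (now)",
    2: "currently throttled",
    3: "soft temperature limit active (now)",
    16: "under-voltage has occurred since boot",
    17: "ARM frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
    19: "soft temperature limit has occurred since boot",
}


def decode_throttled(output):
    """Decode a string such as 'throttled=0x50005' into a list of set flags."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [
        description
        for bit, description in sorted(THROTTLE_FLAGS.items())
        if value & (1 << bit)
    ]
```

For example, `decode_throttled("throttled=0x50005")` lists both current and past under-voltage and throttling, while `throttled=0x0` decodes to an empty list.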