Hello, I have Home Assistant installed in a Proxmox VM. Yesterday something happened and Home Assistant stopped working. The only change I made was adding a Zigbee device to my Zigbee2MQTT instance. All automations have stopped and connections are down. The web UI goes unresponsive about 20-30 seconds after a reboot. I think I was able to stop and disable Zigbee2MQTT, but that didn't help, and since the web UI 'crashes' so quickly I'm unable to get logs. The "Home Assistant Observer" page states: Supervisor: Connected, Supported: Supported, Healthy: Healthy. SSH attempts are rejected.
I can access the CLI in the Proxmox console, and I will note that "CIFS: VFS: cifs_mount failed w/return code = -2" is printed every few minutes, but I think this is a 'Music Assistant' config issue and predates the HA issue. I've tried many different restart commands, and I tried swapping to the 'other' boot slot, but the issue occurs on both boot slots and also occurs when restarting in safe mode. Is there any way to figure out what is causing the issue and fix it?
Thanks
How much RAM does your VM have?
I suggest at least 4 GB.
8 GB RAM, 2 CPUs, 32 GB boot disk
One of my automations ran this morning, so it is either partially working or it partially recovered overnight. I was able to pull up the logs this morning, and in the Supervisor logs I see, in red text:
2024-12-04 06:34:23.609 ERROR (MainThread) [asyncio] Task exception was never retrieved
future: <Task finished name='Task-49929' coro=<HomeAssistantWebSocket.async_supervisor_event() done, defined at /usr/src/supervisor/supervisor/homeassistant/websocket.py:322> exception=AttributeError("'NoneType' object has no attribute 'close'")>
Traceback (most recent call last):
File "/usr/src/supervisor/supervisor/homeassistant/websocket.py", line 267, in async_send_message
await self._client.async_send_command(message)
File "/usr/src/supervisor/supervisor/homeassistant/websocket.py", line 95, in async_send_command
return await self._futures[message["id"]]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
supervisor.exceptions.HomeAssistantWSConnectionError: Connection was closed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/src/supervisor/supervisor/homeassistant/websocket.py", line 327, in async_supervisor_event
await self.async_send_message(
File "/usr/src/supervisor/supervisor/homeassistant/websocket.py", line 269, in async_send_message
await self._client.close()
^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'close'
2024-12-04 06:34:23.610 ERROR (MainThread) [asyncio] Task exception was never retrieved
future: <Task finished name='Task-49930' coro=<HomeAssistantWebSocket.async_supervisor_event() done, defined at /usr/src/supervisor/supervisor/homeassistant/websocket.py:322> exception=AttributeError("'NoneType' object has no attribute 'close'")>
Traceback (most recent call last):
File "/usr/src/supervisor/supervisor/homeassistant/websocket.py", line 267, in async_send_message
await self._client.async_send_command(message)
File "/usr/src/supervisor/supervisor/homeassistant/websocket.py", line 95, in async_send_command
return await self._futures[message["id"]]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
supervisor.exceptions.HomeAssistantWSConnectionError: Connection was closed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/src/supervisor/supervisor/homeassistant/websocket.py", line 327, in async_supervisor_event
await self.async_send_message(
File "/usr/src/supervisor/supervisor/homeassistant/websocket.py", line 269, in async_send_message
await self._client.close()
^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'close'
2024-12-04 06:34:23.620 ERROR (MainThread) [supervisor.homeassistant.api] Error on call http://172.30.32.1:8123/api/core/state: Cannot connect to host 172.30.32.1:8123 ssl:False [Connect call failed ('172.30.32.1', 8123)]
For the last bit, "Error on call http://172.30.32.1:8123/api/core/state", that IP appears to have something to do with Docker ("Hassio supervisor: Cannot connect to host 172.30.32.1:8123 ssl:None [Connection refused]"), but if I run hostnamectl it says I'm running Home Assistant OS? I'm going to take a backup and try updating Home Assistant, since it looks like it is out of date anyway.
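For reference, I'm planning to do both from the "ha >" prompt in the Proxmox console; as far as I can tell the relevant CLI commands are roughly these (at that prompt the leading "ha" can apparently be dropped, and I haven't verified every one on my version):
ha backups new    # take a full backup first
ha core update    # update Home Assistant Core
ha os update      # update the OS too, if it is also behind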
Can logs be accessed from the CLI, since the web UI is inaccessible? I've tried looking in /usr/share/hassio/homeassistant/homeassistant.log and /var/lib/docker/containers/*/*json.log (where * is my Docker container ID), but these do not exist.
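Edit: it turns out the built-in CLI can show logs directly, and typing "login" at the ha > prompt drops to a root shell on the host. These are the commands that appear to do it; the "homeassistant" container name is my assumption, so check "docker ps" for the real one:
ha core logs                 # recent Home Assistant Core log
ha supervisor logs           # recent Supervisor log
login                        # root shell on the HAOS host
docker logs homeassistant    # raw log of the Core container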
After updating it is still crashing, but I see these lines in the logs before it crashes:
2024-12-04 08:19:23.370 INFO (MainThread) [supervisor.core] Supervisor is down - 0
2024-12-04 08:19:23.371 INFO (MainThread) [__main__] Closing Supervisor
[13:19:23] WARNING: Halt Supervisor
[13:19:23] INFO: Supervisor restart after closing
A NoneType error usually means something is referencing an object that is not yet defined.
It can be an integration, but it can also be an automation, script, or card with Jinja2.
The common mistake is a script that references the earlier value of something. That works while HA is already running, but once you restart HA there is no earlier value, and it fails with the error you describe.
OK, I think that error is from before the Supervisor crashes. I was able to get logs from the CLI with the command "supervisor logs", and what I get is the above errors, but then at the bottom are several lines of:
ERROR (MainThread) [supervisor.homeassistant.api] Timeout on call http://172.30.32.1:8123/api/core/state.
intermingled with
WARNING (MainThread) [supervisor.misc.tasks] Watchdog missed an Home Assistant Core API response.
then
ERROR (MainThread) [supervisor.misc.tasks] Watchdog missed 2 Home Assistant Core API responses in a row. Restarting Home Assistant Core!
followed by
INFO (SyncWorker_2) [supervisor.docker.manager] Restarting homeassistant
then it repeats. Using that, I found this open bug report on GitHub: "My supervisor stop working several times #5448". I wonder if this is the same issue I'm having.
I also ran 'docker ps' from the CLI (after login) and noticed that "homeassistant/amd64-addon-configurator:5.8.0" has its status listed as (unhealthy). Is there a way to reinstall this add-on?
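(My guess is that something like this from the CLI would reinstall it; "core_configurator" is my assumption for the add-on slug, so I'd double-check it against the add-on list first. I haven't tried this yet:
ha addons uninstall core_configurator
ha addons install core_configurator
ha addons start core_configurator
)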
I think you need to try to extract homeassistant.log.1 from the config folder.
It is the log file from the previous run.
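If you can get a root shell on the host (type "login" at the ha > prompt), something like this should print it. I'm assuming the standard HAOS data path here, and the file on disk may actually be named with a hyphen (home-assistant.log.1):
cat /mnt/data/supervisor/homeassistant/home-assistant.log.1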
OK, I did find the homeassistant.log.1 file, but I couldn't figure out a way to move it to my desktop computer, since you can't move files between a Proxmox VM and the desktop, and Home Assistant OS apparently doesn't allow mounting a USB drive (at least not easily; I tried mounting it in bash, but it kept saying access denied or read-only file system).
I did, however, manage to get a full backup downloaded before the web UI crashed.
I tried restoring the backup to a new VM with a fresh Home Assistant OS install (which took several hours to restore) and experienced the exact same 'crashing'. Once again, "docker ps" listed "unhealthy" next to 'homeassistant/amd64-addon-configurator:5.8.0'.
Next I opened the backup in 7-Zip and removed "core_configurator.tar.gz" from the archive, then restored that to a new VM with a fresh HAOS install. After several more hours of restoring (I just let it go overnight), I am 75% back up and running. The only issue I have now is that Zigbee2MQTT refuses to start; it says it cannot find the Zigbee dongle, but the dongle is there when I run 'ls -l /dev/serial/by-id', so I'll have to dig into that more.
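(Note for later: from what I've read, the Zigbee2MQTT serial "port" setting has to point at a path that still exists after the move, and the /dev/ttyUSBx style paths can change on a new VM, so the stable /dev/serial/by-id path is the one to use. Before changing anything I'm going to compare the add-on's configured port against what the host reports:
ls -l /dev/serial/by-id    # stable path for the dongle
dmesg | grep -i tty        # confirm the VM actually sees the adapter
)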
If you're on HAOS, install the Samba share add-on to get access to the HA filesystem over SMB. Easy peasy file access.
Yes, I know you're troubleshooting, but you can activate and deactivate it at will and only turn it on when you're trying to transfer files.
The backup file is just a compressed file, so WinZip, WinRAR, or 7-Zip should be able to open it.
I use 7-Zip for it.
That way you can extract the homeassistant.log.1 file for further error hunting.
Can the SMB add-on be installed from the command line? The web UI on that HAOS VM is basically unusable.
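(For anyone who finds this later: I believe the CLI equivalent is roughly the following; "core_samba" is my guess at the add-on slug, and you would still need to set a username and password in its options before it will start:
ha addons install core_samba
ha addons start core_samba
)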
I think everything is up and running on the ‘new’ HAOS VM.