After many tests it turned out (at least in my case, with my setup) that the problem was caused by MQTT …
I removed it and the system is again much more stable; it doesn't restart spontaneously so often.
Hi Stefano, is your issue completely solved?
I have the same issue in my installation, with the container constantly being restarted by the watchdog, and the system is really slow and unresponsive.
Did you find any root cause?
Thanks
Yes… now my system is absolutely much more stable (no reboot in 24 hours…)
I just disabled (removed) the MQTT broker and integration.
I think this is just a symptom of the lack of resources an RPi 3 has for what HA currently needs.
I don't have an MQTT broker running and still have occasional hangs and reboots.
I bet nobody has this issue when running an RPi 4 with 4 GB or more.
Sure… but now that error about the API call is not happening… so there must have been something wrong that was causing that error (and then the reboots).
I am having the same issue for 2 months now.
I thought it was a minor fix that would be addressed in a following Supervisor update, but there has been no Supervisor update after 2023.12.0.
On my side I also get multiple other errors/warnings.
Searching online points me to the scenario that my network is misbehaving… I don't know whether that should cause HA to produce so many errors in the Supervisor logs.
Pasting some logs here:

```
    return await self._method(obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/supervisor/supervisor/store/git.py", line 178, in pull
    raise StoreGitError() from err
supervisor.exceptions.StoreGitError
24-01-04 08:59:31 ERROR (MainThread) [asyncio] Task exception was never retrieved
future: <Task finished name='Task-1147821' coro=<Repository.update() done, defined at /usr/src/supervisor/supervisor/store/repository.py:104> exception=StoreGitError()>
Traceback (most recent call last):
  File "/usr/src/supervisor/supervisor/store/git.py", line 136, in pull
    await self.sys_run_in_executor(
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/git/remote.py", line 1014, in fetch
    res = self._get_fetch_info_from_stderr(proc, progress, kill_after_timeout=kill_after_timeout)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/git/remote.py", line 853, in _get_fetch_info_from_stderr
    proc.wait(stderr=stderr_text)
  File "/usr/local/lib/python3.11/site-packages/git/cmd.py", line 600, in wait
    raise GitCommandError(remove_password_if_present(self.args), status, errstr)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git fetch -v --update-shallow --depth=1 -- origin
  stderr: 'fatal: unable to access 'https://github.com/allenporter/stream-addons/': Could not resolve host: github.com'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/supervisor/supervisor/store/repository.py", line 108, in update
    await self.git.pull()
  File "/usr/src/supervisor/supervisor/jobs/decorator.py", line 296, in wrapper
    raise err
  File "/usr/src/supervisor/supervisor/jobs/decorator.py", line 289, in wrapper
    return await self._method(obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/supervisor/supervisor/store/git.py", line 178, in pull
    raise StoreGitError() from err
supervisor.exceptions.StoreGitError
24-01-04 09:07:32 INFO (MainThread) [supervisor.homeassistant.core] Home Assistant Core state changed to RUNNING
24-01-04 09:07:32 INFO (MainThread) [supervisor.homeassistant.core] Detect a running Home Assistant instance
....
....
24-01-04 10:20:59 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
24-01-04 10:31:26 INFO (MainThread) [supervisor.api.middleware.security] /supervisor/info access from cebe7a76_hassio_google_drive_backup
24-01-04 10:31:30 INFO (MainThread) [supervisor.api.middleware.security] /backups access from cebe7a76_hassio_google_drive_backup
24-01-04 10:32:02 ERROR (MainThread) [supervisor.homeassistant.api] Error on call https://172.30.32.1:8123/api/core/state:
```
New logs from today, when my system rebooted again:
```
24-01-05 12:24:19 INFO (MainThread) [supervisor.api.middleware.security] /backups access from cebe7a76_hassio_google_drive_backup
24-01-05 12:28:51 INFO (MainThread) [supervisor.api.proxy] Home Assistant WebSocket API connection is closed
24-01-05 12:30:28 WARNING (MainThread) [supervisor.misc.tasks] Watchdog miss API response from Home Assistant
24-01-05 12:31:00 INFO (MainThread) [supervisor.api.proxy] Home Assistant WebSocket API request initialize
24-01-05 12:31:00 INFO (MainThread) [supervisor.api.proxy] WebSocket access from a0d7b954_nodered
24-01-05 12:31:01 INFO (MainThread) [supervisor.api.proxy] Home Assistant WebSocket API request running
24-01-05 12:34:00 ERROR (MainThread) [supervisor.homeassistant.api] Error on call https://172.30.32.1:8123/api/core/state:
24-01-05 12:34:00 ERROR (MainThread) [asyncio] Task exception was never retrieved
future: <Task finished name='Task-9397' coro=<WSClient.start_listener() done, defined at /usr/src/supervisor/supervisor/homeassistant/websocket.py:97> exception=ConnectionResetError('Cannot write to closing transport')>
Traceback (most recent call last):
  File "/usr/src/supervisor/supervisor/homeassistant/websocket.py", line 104, in start_listener
    await self._receive_json()
  File "/usr/src/supervisor/supervisor/homeassistant/websocket.py", line 113, in _receive_json
    msg = await self._client.receive()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/client_ws.py", line 280, in receive
    await self.pong(msg.data)
  File "/usr/local/lib/python3.11/site-packages/aiohttp/client_ws.py", line 160, in pong
    await self._writer.pong(message)
  File "/usr/local/lib/python3.11/site-packages/aiohttp/http_websocket.py", line 709, in pong
    await self._send_frame(message, WSMsgType.PONG)
  File "/usr/local/lib/python3.11/site-packages/aiohttp/http_websocket.py", line 675, in _send_frame
    self._write(header + mask + message)
  File "/usr/local/lib/python3.11/site-packages/aiohttp/http_websocket.py", line 702, in _write
    raise ConnectionResetError("Cannot write to closing transport")
ConnectionResetError: Cannot write to closing transport
24-01-05 12:34:00 ERROR (MainThread) [asyncio] Task exception was never retrieved
future: <Task finished name='Task-12556' coro=<WebSocketWriter.ping() done, defined at /usr/local/lib/python3.11/site-packages/aiohttp/http_websocket.py:711> exception=ConnectionResetError('Cannot write to closing transport')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/aiohttp/http_websocket.py", line 715, in ping
    await self._send_frame(message, WSMsgType.PING)
  File "/usr/local/lib/python3.11/site-packages/aiohttp/http_websocket.py", line 675, in _send_frame
    self._write(header + mask + message)
  File "/usr/local/lib/python3.11/site-packages/aiohttp/http_websocket.py", line 702, in _write
    raise ConnectionResetError("Cannot write to closing transport")
ConnectionResetError: Cannot write to closing transport
24-01-05 12:34:01 INFO (MainThread) [supervisor.api.proxy] Home Assistant WebSocket API error: Cannot write to closing transport
24-01-05 12:34:01 INFO (MainThread) [supervisor.api.proxy] Home Assistant WebSocket API connection is closed
24-01-05 12:34:14 INFO (MainThread) [supervisor.api.proxy] Home Assistant WebSocket API request initialize
24-01-05 12:34:15 INFO (MainThread) [supervisor.api.proxy] WebSocket access from a0d7b954_nodered
24-01-05 12:34:15 INFO (MainThread) [supervisor.api.proxy] Home Assistant WebSocket API request running
24-01-05 12:39:35 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state running
24-01-05 12:39:35 INFO (MainThread) [supervisor.resolution.checks.base] Run check for multiple_data_disks/system
24-01-05 12:39:35 INFO (MainThread) [supervisor.resolution.checks.base] Run check for ipv4_connection_problem/system
24-01-05 12:39:35 INFO (MainThread) [supervisor.resolution.checks.base] Run check for trust/supervisor
24-01-05 12:39:36 INFO (MainThread) [supervisor.resolution.checks.base] Run check for dns_server_ipv6_error/dns_server
24-01-05 12:39:36 INFO (MainThread) [supervisor.resolution.checks.base] Run check for docker_config/system
24-01-05 12:39:36 INFO (MainThread) [supervisor.resolution.checks.base] Run check for security/core
24-01-05 12:39:36 INFO (MainThread) [supervisor.resolution.checks.base] Run check for dns_server_failed/dns_server
24-01-05 12:39:36 INFO (MainThread) [supervisor.resolution.checks.base] Run check for pwned/addon
24-01-05 12:39:36 INFO (MainThread) [supervisor.resolution.checks.base] Run check for free_space/system
24-01-05 12:39:36 INFO (MainThread) [supervisor.resolution.check] System checks complete
24-01-05 12:39:36 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state running
24-01-05 12:39:41 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
24-01-05 12:39:41 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state running
24-01-05 12:39:41 INFO (MainThread) [supervisor.resolution.fixups.store_execute_reset] Reset corrupt Store: 3d360630
24-01-05 12:39:44 INFO (MainThread) [supervisor.store.git] Cloning add-on https://github.com/allenporter/stream-addons repository
24-01-05 12:39:49 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
```
I’ve managed to mitigate the issue this way:
```yaml
shell_command:
  disable_watchdog: "curl -sSL -H \"Authorization: Bearer $SUPERVISOR_TOKEN\" -H \"Content-Type: application/json\" -d '{\"watchdog\": false }' http://supervisor/core/options"
```
triggered by an automation that kicks in at HA boot:
```yaml
- alias: HA startup
  trigger:
    platform: homeassistant
    event: start
  action:
    - delay: 00:01:00
    - service: shell_command.disable_watchdog
      data: {}
```
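The shell command above simply POSTs `{"watchdog": false}` to the Supervisor's `/core/options` endpoint. For completeness, here is a sketch of a matching command to turn the watchdog back on once things look stable again; this assumes the same endpoint accepts `watchdog: true` (untested here, so verify against your Supervisor version):

```yaml
shell_command:
  # Assumed mirror of disable_watchdog above: re-enables the Core watchdog.
  enable_watchdog: "curl -sSL -H \"Authorization: Bearer $SUPERVISOR_TOKEN\" -H \"Content-Type: application/json\" -d '{\"watchdog\": true }' http://supervisor/core/options"
```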
After having had multiple reboots every day again for many days… starting yesterday, after the latest updates, my Pi 3A+ has NOT rebooted for over ONE entire day (which is a miracle)…
I have to assume that the latest version really solved the problem…
Now the question is: should I skip future updates to avoid risking the problem coming back?
Current versions are:
- Core 2024.1.6
- Supervisor 2024.01.1
- Operating System 11.5
- Frontend 20240104.0
Updated to these versions and it is still very stable:
- Core 2024.2.2
- Supervisor 2024.01.1
- Operating System 11.5
- Frontend 20240207.1
Very happy…
I hope it will continue like this…
Sometimes I only see these… but the system is not restarting as frequently as in the past:

```
Login attempt or request with invalid authentication from supervisor (172.30.32.2). Requested URL: '/api/core/state'. (HomeAssistantSupervisor/2024.01.1 aiohttp/3.9.3 Python/3.12)
```
and
```
24-02-19 09:42:38 INFO (MainThread) [supervisor.homeassistant.api] Updated Home Assistant API token
24-02-19 09:50:15 ERROR (MainThread) [supervisor.homeassistant.api] Timeout on call https://172.30.32.1:8123/api/core/state.
```
I have a similar problem to the one you guys are facing. It got worse when I played around with dashboards today. I was creating and modifying a markdown template dashboard item, where I looped over something like this:
```
{{
  expand(states.sensor)
  | selectattr('state', 'eq', 'unavailable')
  | map(attribute='entity_id')
  | join('\n')
}}
```
when I saw the UI freeze and the Supervisor log saying:

```
24-02-25 16:23:04 ERROR (MainThread) [supervisor.homeassistant.api] Timeout on call http://172.30.32.1:8123/api/core/state.
```

So I guess something about the state call is going wrong, which makes it freeze. I'll try to fix it now by deleting that part of the dashboard.
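If the goal is just to list unavailable sensors, the `expand()` call may be unnecessary: it resolves group members and touches every entity's state object, which is extra work on a constrained device. A lighter sketch of the same template, assuming no group entities actually need expanding:

```
{{ states.sensor
   | selectattr('state', 'eq', 'unavailable')
   | map(attribute='entity_id')
   | join('\n') }}
```

If the freeze persists even with this cheaper version, the template is probably not the real culprit.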
I recently had the same issue, and the root cause was the CalDAV integration connected to an iCloud email account. Removing it fixed everything.
In my case… frequent reboots are happening again (:-() and it seems they are caused by core_mosquitto…
I already had this feeling in the past (that core_mosquitto was the cause of the reboots), but then with the latest updates it didn't create any problems…
Now it has started again…
```
24-03-01 10:54:37 ERROR (MainThread) [supervisor.homeassistant.api] Timeout on call https://172.30.32.1:8123/api/core/state.
24-03-01 10:54:42 ERROR (MainThread) [supervisor.homeassistant.api] Timeout on call https://172.30.32.1:8123/api/core/state.
24-03-01 10:54:43 WARNING (MainThread) [supervisor.misc.tasks] Watchdog missed an Home Assistant Core API response.
24-03-01 10:55:33 WARNING (MainThread) [supervisor.misc.tasks] Watchdog found a problem with core_mosquitto application!
24-03-01 10:55:45 INFO (SyncWorker_7) [supervisor.docker.manager] Stopping addon_core_mosquitto application
24-03-01 10:56:24 INFO (SyncWorker_7) [supervisor.docker.manager] Cleaning addon_core_mosquitto application
24-03-01 10:56:47 ERROR (MainThread) [supervisor.misc.tasks] Watchdog missed 2 Home Assistant Core API responses in a row. Restarting Home Assistant Core API!
```
@spanzetta Did you solve it? I haven't been able to get to the bottom of it for weeks now.
My problems start when I launch Node-RED (which uses core-mosquitto). But if I use only core-mosquitto (for another add-on that relies on it), I don't get these errors.
Unfortunately the situation is extremely variable… and I don't think it will ever have a definitive solution.
With every update the situation can improve or get worse… and when it seems stable (e.g. after updating Core to 2024.3), at some point, for no reason, it stops responding and starts rebooting over and over… then it may become stable again… (it happened to me just yesterday, and it wasn't the first time).
From the logs, the only thing I noticed is that the two add-ons that seem to cause the freezes (and therefore the watchdog-triggered reboots) more than others are Core Mosquitto (which I need, so it stays ON) and File Editor (which I keep off and turn on only when needed)…
But that could also be misleading… the problems could be caused by something else… hard to find out…
Up to one reboot every one or two days I consider normal (for an RPi with 512 MB of RAM)… but rebooting 10 times in two hours, no…
I don't have Node-RED…
Something new from a few days ago (and it happened again today): after a restart I see that memory usage is around 75% instead of the usual 85–88%…
When that happens it's even more unstable… and it reboots… I can't understand why it has lower memory usage yet is more unstable…
Only after further reboots does the situation go back to "normal", with memory usage of 85–88%…
A mystery!!
@spanzetta Please remember the HA community is English only
(Please remember that the HA community is English only.)
Yes… sorry…
No worries we just want everyone to be able to understand the answer