Trying to work out why HA is restarting every few minutes

I’m trying to debug why my Home Assistant has started restarting regularly.
It is running on a Pi 4 with 4GB RAM and a 500GB SSD.

Home Assistant 2022.12.8
Supervisor 2022.12.1
Operating System 9.4
Frontend 20221213.1 - latest

Currently it is restarting as often as every few minutes.

The logs seem to show:

Logger: homeassistant.components.websocket_api.http.connection
Source: components/websocket_api/http.py:132
Integration: Home Assistant WebSocket API (documentation, issues)
First occurred: 23:45:51 (333 occurrences)
Last logged: 23:45:53

[548337876272] Client exceeded max pending messages [2]: 2048

Looking in the logs, with the http api in debug mode, there are a lot of messages:

2022-12-29 23:35:01.796 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection] [547422689776] Sending {"id":218,"type":"event","event":{"event_type":"state_changed","data":{"entity_id":"sensor.rubicson_42_01_temperature","old_state":{"entity_id":"sensor.rubicson_42_01_temperature","state":"4.2","attributes":{"state_class":"measurement","event":"0a520a1e4201002a150269","unit_of_measurement":"°C","assumed_state":true,"device_class":"temperature","friendly_name":"Fridge"},"last_changed":"2022-12-29T23:34:04.774128+00:00","last_updated":"2022-12-29T23:34:04.774128+00:00","context":{"id":"01GNG3RN76QCQ24VG3CYAM2VBD","parent_id":null,"user_id":null}},"new_state":{"entity_id":"sensor.rubicson_42_01_temperature","state":"4.2","attributes":{"state_class":"measurement","event":"0a520a224201002a150279","unit_of_measurement":"°C","assumed_state":true,"device_class":"temperature","friendly_name":"Fridge"},"last_changed":"2022-12-29T23:35:01.765385+00:00","last_updated":"2022-12-29T23:35:01.765385+00:00","context":{"id":"01GNG3TCW5YNTA5RS6NTN2AEA4","parent_id":null,"user_id":null}}},"origin":"LOCAL","time_fired":"2022-12-29T23:35:01.765385+00:00","context":{"id":"01GNG3TCW5YNTA5RS6NTN2AEA4","parent_id":null,"user_id":null}}}
2022-12-29 23:35:01.797 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection] [547422689776] Sending {"id":218,"type":"event","event":{"event_type":"rfxtrx_event","data":{"packet_type":82,"sub_type":10,"type_string":"Rubicson","id_string":"42:01","data":"0a520a224201002a150279","values":{"Temperature":4.2,"Humidity":21,"Humidity status":"normal","Humidity status numeric":2,"Battery numeric":9,"Rssi numeric":7},"device_id":"fbd0b476735c0ec363eb99e836811bb6"},"origin":"LOCAL","time_fired":"2022-12-29T23:35:01.785305+00:00","context":{"id":"01GNG3TCWSETTASRXXMWMS6RTX","parent_id":null,"user_id":null}}}
2022-12-29 23:35:01.798 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection] [547377908608] Sending [{"id":21,"type":"event","event":{"c":{"sensor.rubicson_42_01_battery":{"+":{"lc":1672356901.763172,"c":"01GNG3TCW33FS79TNW8A86ACG2","a":{"event":"0a520a224201002a150279"}}}}}},{"id":21,"type":"event","event":{"c":{"sensor.rubicson_42_01_signal_strength":{"+":{"s":"-64","lc":1672356901.764008,"c":"01GNG3TCW4RYA777R0EFFMH1ZN","a":{"event":"0a520a224201002a150279"}}}}}},{"id":21,"type":"event","event":{"c":{"sensor.rubicson_42_01_humidity":{"+":{"lc":1672356901.764475,"c":"01GNG3TCW46686WR7M3A1N0DGN","a":{"event":"0a520a224201002a150279"}}}}}},{"id":21,"type":"event","event":{"c":{"sensor.rubicson_42_01_humidity_status":{"+":{"lc":1672356901.764951,"c":"01GNG3TCW441Y3RGTK6KPG5647","a":{"event":"0a520a224201002a150279"}}}}}},{"id":21,"type":"event","event":{"c":{"sensor.rubicson_42_01_temperature":{"+":{"lc":1672356901.765385,"c":"01GNG3TCW5YNTA5RS6NTN2AEA4","a":{"event":"0a520a224201002a150279"}}}}}}]
2022-12-29 23:35:01.836 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection] [547422689776] Sending {"id":218,"type":"event","event":{"event_type":"call_service","data":{"domain":"homeassistant","service":"restart","service_data":{}},"origin":"LOCAL","time_fired":"2022-12-29T23:35:01.835752+00:00","context":{"id":"01GNG3TCYB0PYB29XQFH38BRJV","parent_id":null,"user_id":"09f6687f64a04b91afb8a1f69d609b0a"}}}
2022-12-29 23:35:01.992 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection] [547422689776] Sending {"id":218,"type":"event","event":{"event_type":"homeassistant_stop","data":{},"origin":"LOCAL","time_fired":"2022-12-29T23:35:01.991781+00:00","context":{"id":"01GNG3TD37H50FZHQV0TTAWFWA","parent_id":null,"user_id":null}}}

But nothing in there looks obviously invalid.

The supervisor logs show the restart, but so far I can’t track down what triggered it:

22-12-29 23:10:07 INFO (MainThread) [supervisor.api.middleware.security] /backups access from cebe7a76_hassio_google_drive_backup

Then nothing for 5 minutes, then: 

22-12-29 23:15:02 INFO (SyncWorker_7) [supervisor.docker.interface] Restarting ghcr.io/home-assistant/raspberrypi4-64-homeassistant

The vast majority of the API calls in the log are state updates for all the parameters of all the devices in the house

In the first log section, there is a call_service event for the homeassistant restart service.

Could an automation or script be causing this?
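To make that check repeatable, here is a minimal Python sketch that scans websocket_api debug lines like the ones above and pulls out every call to homeassistant.restart together with the user_id from its context. The function name and regex are my own, not anything shipped with Home Assistant, and it only handles the single-object "Sending {...}" lines (batched "Sending [...]" lines are skipped). As far as I understand it, a non-null user_id suggests the call came from a logged-in user or an API token rather than from an internal automation, so treat that as a hint, not proof.

```python
import json
import re

# Captures the timestamp and the JSON object that follows "Sending " in a debug line.
SENDING_RE = re.compile(r"^(?P<ts>\S+ \S+) DEBUG .*? Sending (?P<payload>\{.*\})\s*$")


def find_restart_calls(lines):
    """Yield (timestamp, user_id) for each homeassistant.restart call_service event."""
    for line in lines:
        match = SENDING_RE.match(line)
        if not match:
            continue  # not a single-object "Sending {...}" debug line
        try:
            message = json.loads(match.group("payload"))
        except json.JSONDecodeError:
            continue  # truncated or otherwise unparseable payload
        event = message.get("event", {})
        if event.get("event_type") != "call_service":
            continue
        data = event.get("data", {})
        if data.get("domain") == "homeassistant" and data.get("service") == "restart":
            yield match.group("ts"), event.get("context", {}).get("user_id")


if __name__ == "__main__":
    # The call_service line from the log excerpt earlier in the thread.
    sample = (
        '2022-12-29 23:35:01.836 DEBUG (MainThread) '
        '[homeassistant.components.websocket_api.http.connection] [547422689776] '
        'Sending {"id":218,"type":"event","event":{"event_type":"call_service",'
        '"data":{"domain":"homeassistant","service":"restart","service_data":{}},'
        '"origin":"LOCAL","time_fired":"2022-12-29T23:35:01.835752+00:00",'
        '"context":{"id":"01GNG3TCYB0PYB29XQFH38BRJV","parent_id":null,'
        '"user_id":"09f6687f64a04b91afb8a1f69d609b0a"}}}'
    )
    for timestamp, user_id in find_restart_calls([sample]):
        print(timestamp, user_id)
```

In practice you would pass it the open log file instead of the sample list, then compare the reported user_id against the users and tokens configured on the instance.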

Yes - that was my first thought. I checked all my automations (I don’t have very many), and none of them call a restart, nor were any shown as having been triggered recently.

There are no scripts defined when I check the web UI scripts section

Most of the ~30 devices in the house run Tasmota and had a status update interval of 10 seconds. That feels like it should be fine - but I set it back to the default of 300 seconds to see if that had any effect. It doesn’t look like it helped though.

The supervisor logs show:

22-12-29 23:56:30 INFO (MainThread) [supervisor.api.proxy] Home Assistant WebSocket API request running
22-12-30 00:05:02 INFO (SyncWorker_6) [supervisor.docker.interface] Restarting ghcr.io/home-assistant/raspberrypi4-64-homeassistant
**22-12-30 00:05:02 INFO (MainThread) [supervisor.api.proxy] Home Assistant WebSocket API error: Received message 8:1000 is not str**
22-12-30 00:05:02 INFO (MainThread) [supervisor.api.proxy] Home Assistant WebSocket API connection is closed
22-12-30 00:05:34 INFO (MainThread) [supervisor.homeassistant.core] Wait until Home Assistant is ready

Not sure if the highlighted line is relevant…

Ah - hang on…

I have a watchdog on another machine that can, in theory, restart HA via the API if certain sensors are not responding. I have disabled its ability to do that, and increased its logging verbosity so that I can see if it would otherwise have tried to…

And yes. I’m almost certain now that was indeed the problem. The remote script was being much too trigger-happy, and was restarting HA via the API when it really should not have.
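For anyone with a similar setup, one way to make a watchdog less trigger-happy is to require several failed checks in a row and enforce a cooldown between restarts. The sketch below is not my actual script; the class name, thresholds and injected clock are all hypothetical, and the restart call itself (for example a POST to /api/services/homeassistant/restart on the HA REST API) is deliberately left out so the logic can be tested on its own.

```python
import time


class RestartGuard:
    """Allow a restart only after N consecutive failed checks and a cooldown."""

    def __init__(self, failures_needed=3, cooldown_s=1800, clock=time.monotonic):
        self.failures_needed = failures_needed  # hypothetical default
        self.cooldown_s = cooldown_s            # hypothetical default
        self.clock = clock                      # injectable so tests can fake time
        self.consecutive_failures = 0
        self.last_restart = None

    def record(self, healthy):
        """Record one health check; return True if a restart should be requested now."""
        if healthy:
            # Any successful check resets the failure streak.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures < self.failures_needed:
            return False
        now = self.clock()
        if self.last_restart is not None and now - self.last_restart < self.cooldown_s:
            # Still inside the cooldown window: log it, but do not restart again.
            return False
        self.last_restart = now
        self.consecutive_failures = 0
        return True


if __name__ == "__main__":
    guard = RestartGuard(failures_needed=3, cooldown_s=1800)
    for sensor_ok in (False, False, True, False, False, False):
        if guard.record(sensor_ok):
            print("would request a restart here")
```

The cooldown is the important part here: with restarts only a few minutes apart, HA may barely have finished starting before the next check fails, so the watchdog can end up causing the outage it is meant to fix.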