Home Assistant randomly crashes

daschmidt · June 22, 2022, 8:01pm

Good evening,

I run Home Assistant on a Raspberry Pi 4 with an SSD.

settings

System Health

version: core-2022.6.6
installation_type: Home Assistant OS
dev: false
hassio: true
docker: true
user: root
virtualenv: false
python_version: 3.9.12
os_name: Linux
os_version: 5.15.32-v8
arch: aarch64
timezone: Europe/Berlin

GitHub API: ok
GitHub Content: ok
GitHub Web: ok
GitHub API Calls Remaining: 4888
Installed Version: 1.25.5
Stage: running
Available Repositories: 1061
Downloaded Repositories: 18

logged_in: true
subscription_expiration: June 25, 2022 at 02:00
relayer_connected: true
remote_enabled: true
remote_connected: true
alexa_enabled: true
google_enabled: false
remote_server: eu-central-1-0.ui.nabu.casa
can_reach_cert_server: ok
can_reach_cloud_auth: ok
can_reach_cloud: ok

host_os: Home Assistant OS 8.2
update_channel: stable
supervisor_version: supervisor-2022.05.3
agent_version: 1.2.1
docker_version: 20.10.14
disk_total: 222.8 GB
disk_used: 26.5 GB
healthy: true
supported: true
board: rpi4-64
supervisor_api: ok
version_api: ok
installed_addons: File editor (5.3.3), Log Viewer (0.14.0), Home Assistant Google Drive Backup (0.108.2), Z-Wave JS (0.1.62), Terminal & SSH (9.4.0), Check Home Assistant configuration (3.10.2), Samba share (9.7.0), MariaDB (2.4.0), Vaultwarden (Bitwarden) (0.17.0), Nginx Proxy Manager (0.12.0), InfluxDB (4.5.0), Duck DNS (1.15.0), ESPHome (2022.6.1), Mosquitto broker (6.1.2), Studio Code Server (5.1.1), Node-RED (12.0.2), Traccar (0.17.0), Network UPS Tools (0.11.0), Zigbee2MQTT (1.25.2-1)

dashboards: 1
resources: 12
views: 2
mode: storage

oldest_recorder_run: June 12, 2022 at 09:16
current_recorder_run: June 22, 2022 at 19:06
estimated_db_size: 1457.46 MiB
database_engine: sqlite
database_version: 3.34.1

api_endpoint_reachable: ok

The problem is that randomly (every 2-3 weeks) the “server” goes down. The Raspberry Pi still responds to a ping, and if I go to the IP without a port, the congratulations page from the Nginx proxy server shows up, but the web interface doesn’t load when I add the port. A connection over SSH doesn’t work either.
For uptime monitoring I use UptimeRobot, which reported today that Home Assistant went down at 15:22 (3:22pm). The only solution then is to unplug the power and plug it in again. After doing that I saved the home-assistant.log.1

log

2022-06-22 01:15:24 ERROR (MainThread) [homeassistant.components.shelly] Timeout fetching Kühlschrank data
2022-06-22 03:02:57 ERROR (MainThread) [homeassistant.components.shelly] Timeout fetching Kühlschrank data
2022-06-22 03:19:33 ERROR (MainThread) [homeassistant.components.xiaomi_miio] Timeout fetching Roborock S6 data
2022-06-22 03:50:13 ERROR (MainThread) [homeassistant.components.shelly] Timeout fetching Tiefkühler data
2022-06-22 04:22:58 ERROR (MainThread) [homeassistant.components.shelly] Timeout fetching Tiefkühler data
2022-06-22 09:17:17 ERROR (MainThread) [homeassistant.components.shelly] Timeout fetching Zirkulationspumpe data
2022-06-22 14:38:43 ERROR (MainThread) [custom_components.hacs] <Integration fred-oranje/rituals-genie> GitHub returned 404 for https://api.github.com/repos/fred-oranje/rituals-genie

This log shows nothing special, because the last message was logged about an hour before the crash.

How can I find the problem?

tom_l · June 23, 2022, 12:21am

Watch your CPU and RAM usage.
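
One way to follow this advice between crashes is to write a load and memory snapshot to a file at regular intervals, so there is something to look at after a hang. A minimal sketch, assuming a Linux shell where /proc is readable; the function name snapshot is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print one line with a timestamp, the 1-minute load
# average and the available memory in kB.
snapshot() {
  local load mem_kb
  # The first field of /proc/loadavg is the 1-minute load average.
  read -r load _ < /proc/loadavg
  # MemAvailable is reported in kB in /proc/meminfo.
  mem_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
  echo "$(date '+%Y-%m-%d %H:%M:%S') load1=${load} mem_available_kb=${mem_kb}"
}

snapshot
```

Appending the output to a file every few minutes would show whether load climbs or memory shrinks in the run-up to a crash.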

daschmidt · June 23, 2022, 4:03am

This looks normal

mhoogenbosch · June 24, 2022, 6:24am

I had a similar problem. My solution was to replace the power supply. Now it is really stable again.
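
If you can get a shell on the host, the kernel log may show whether the Pi has been seeing low voltage. A rough sketch: the helper name is made up, and the exact wording of the kernel message is an assumption that can differ between kernel versions, so the pattern accepts both "under-voltage" and "undervoltage":

```shell
#!/bin/bash
# Hypothetical helper: count log lines that mention under-voltage.
# Reads log text on stdin, so it can be fed from dmesg or from a saved file.
count_undervoltage() {
  # -i: ignore case, -c: print the number of matching lines.
  # "-?" makes the hyphen optional so both spellings match.
  # grep exits non-zero when the count is 0, hence the "|| true".
  grep -Eic 'under-?voltage' || true
}

# Usage (needs permission to read the kernel log):
#   dmesg | count_undervoltage
```

A count above zero would point toward the power supply or cable. A count of zero does not rule it out, since the log is lost on a hard power cycle.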

daschmidt · June 24, 2022, 1:29pm

Thanks, but the Pi is already on the original 5.1 V 3 A power supply.
I’ll try another 3 A power supply now; maybe it works better.

Tuomas · July 6, 2022, 3:53pm

Hi! I’m having similar issues. Did you find a solution? Versions before OS 8.0 had been rock solid, but now I’ve been having these random crashes on a Pi 4 with SSD.

daschmidt · July 6, 2022, 4:26pm

I replaced the power supply, and so far there has been no crash. But I can’t say with 100% certainty that the new power supply was the solution.

j.parker · July 6, 2022, 4:43pm

I have been quite frustrated as well. I have two identical setups and both exhibit the “crash”, but not always at the same time. My issue is that the timers don’t work while the system is in its hiatus state. Not good. I discovered through Portainer that the HA container was down. Using the docker restart command gets things working again until the next failure, which could be 24 hours later or maybe even 3 weeks. So the solution I came up with was to write a script and run it from crontab: basically it tests port 8123 and, if the port is non-responsive, does a docker container restart. It essentially puts my systems back online but doesn’t solve the problem of why they fail. I have a suspicion it has something to do with the backend engine doing system analysis with the mother ship (the failures seem to occur around the same time of day).

here is the script: test_port.sh
#-----------------start-------------------
# script to test port status (run it with bash; /dev/tcp is a bash feature)
HOST=$1
PORT=$2
if [ -z "$HOST" ]; then
  echo "host needs to be established"
  exit 1
fi
# try to open a TCP connection; discard any error output
(echo > "/dev/tcp/${HOST}/${PORT}") > /dev/null 2>&1
if [ $? -eq 0 ]; then
  # uncomment this next line; however, it fills up the log too fast - all we really care about is failure
  # echo "the port $2 is open on:" $(date) >> /usr/share/hassio/homeassistant/test_8123.log
  exit 0
else
  echo "the port $2 is closed on:" $(date) >> /usr/share/hassio/homeassistant/test_8123.log
  # restart every container, running or stopped
  docker restart $(docker ps -a -q)
  # this puts the log data in the directory available for the file editor for easy troubleshooting
  echo "docker restart ran:" $(date) >> /usr/share/hassio/homeassistant/test_8123.log
  exit 0
fi
#----------------end----------------

Next, run crontab -e and schedule the script to run as often as you want (in my case, every 15 minutes).
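
For reference, a 15-minute schedule looks like the commented crontab line at the top of the sketch below. The rest is a variant of the port test with a timeout added, since a plain /dev/tcp connect can hang for a long time when the host silently drops packets. The script path, the 5-second limit and the function name port_open are assumptions for illustration; timeout is the coreutils command:

```shell
#!/bin/bash
# Example crontab entry (the script path is hypothetical), runs every 15 minutes:
#   */15 * * * * /root/test_port.sh 127.0.0.1 8123

# Return 0 if HOST:PORT accepts a TCP connection within 5 seconds,
# non-zero otherwise (connection refused, unreachable, or timed out).
port_open() {
  local host=$1 port=$2
  # A separate bash is started so that timeout can kill a hanging connect.
  # /dev/tcp is interpreted by bash itself; it is not a real device file.
  timeout 5 bash -c 'echo > "/dev/tcp/$1/$2"' _ "$host" "$port" > /dev/null 2>&1
}

if port_open "${1:-127.0.0.1}" "${2:-8123}"; then
  echo "port is open"
else
  echo "port is closed"
fi
```

The restart and logging steps from the script above would go in the else branch.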

Tuomas · July 6, 2022, 6:10pm

I realized later that today’s second crash, at least, was due to a power outage, but it’s weird that HA didn’t come back automatically when the power returned. It seems the problem does in fact occur at random intervals as well. My instance downgraded automatically to HA OS v8.0, so I’m sticking with that for a while to see if the problem recurs with the older version.

svante · November 28, 2022, 4:57pm

My problem was probably due to an overly verbose logging setting. I found a very large home-assistant.log.1. I can’t give the exact size since it was rotated on a subsequent HA restart, but I’ve reduced the log level since.
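
A quick way to check for the same thing is to look for oversized log files in the config directory. A small sketch, assuming the logs follow the home-assistant.log* naming seen in this thread; the function name big_logs, the /config path and the 100 MiB limit are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: list Home Assistant log files in a directory that are
# larger than a given limit in MiB.
big_logs() {
  local dir=$1 limit_mib=$2
  # -size +NM matches files bigger than N MiB (find rounds sizes up to whole MiB).
  find "$dir" -maxdepth 1 -type f -name 'home-assistant.log*' -size +"${limit_mib}"M
}

# Usage, assuming the config directory is /config:
#   big_logs /config 100
```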

Johnny911 · October 31, 2023, 8:55pm

Hi, thanks for sharing the script. A new-user question: how can you set it up with HAOS? (I flashed a Raspberry Pi 3 with it.)
Would the script work the same in this case as with a Core installation with Docker (more advanced, I guess)?

Regards

j.parker · December 18, 2023, 12:34am

Hi Johnny911. In an effort to answer your question: I doubt it would. This script looks for an open port and, if it’s not there, assumes the container(s) are not running and issues a docker restart command.

The HAOS version is not containerized.

good luck.
jp