How to troubleshoot crashing Hass.io on Pi

Hello,

I have been running HA on a Raspberry Pi for over a year. Before Hass.io came out, I could just SSH onto the Pi and check the logs whenever HA stopped responding. Now that I have Hass.io with the SSH plugin etc., all that is gone. When I can’t reach my Home Assistant, all I can do is crawl under the TV and reconnect the power supply to restart it. I can’t even see in the logs what could have gone wrong, since they are gone after every reboot…

My Hass.io seems to become unreachable every time it loses WiFi connectivity (i.e. every time my son pulls out the WiFi extender, even for a very short time), and also at irregular intervals at other times. All in all, it’s sadly not very stable :/

Any ideas how to handle this?

The SSH plugin allows you to see the HA logs. If you enable SSH for the host itself you can also monitor that.

https://developers.home-assistant.io/docs/en/hassio_debugging.html

The whole purpose of hassio is to hide the host system. If you want more control, you could run hass in Docker on Raspbian or install hassbian. Or, even more elaborate, set up a virtualenv yourself. Choices, choices…

The problem is that I cannot reach the Pi at all once the system is gone: since switching to Hass.io, I can’t ping it and can’t SSH into it. I have been thinking about switching to a Docker setup, but I don’t know if the Pi 3 has enough resources for that and if it’s worth the hassle… I am going to try a LAN connection in a few days; maybe that helps a bit. Multiple hard restarts per month are not really satisfying…

My hassio is wired and I don’t have any issues connecting to it.

I once had a failed update, but in normal running mode no problems whatsoever. (Knock on wood)

I am having the exact same issue with my Pi: lights are on but no one is home. SMB and SSH both drop off, and I need to power cycle my Pi to get it going. It lasts a few hours, then does it again. The issue only started for me with the .70 update; I’m on .71 now with the same issue. It seemed rock solid before then, but that could be a coincidence.

I am having the same problem. I start with a fresh install. Everything works perfectly for a few hours or a day, then it becomes unresponsive. The only choice is to “recycle” the power. But something is damaged, and it does not work well until another fresh install. I think the root of the problem is that my Pi shows up with two IP addresses. One is the static one that I created. The other seems to be one generated by DHCP. It seems that hassio uses one IP for some things and the other for other things. After a while, it gets very confused.
Before the latest release it ran 24/7 with no issues. Also, it cannot use my RING doorbell anymore.

Same problem here. I appreciate the benefit of hassio “hiding” the underlying host from us. But if it isn’t reliable that’s not a help, but a hindrance. I would like some advice / recommendations as to how to troubleshoot hassio to see if there’s something I can fix or workaround.

I can confirm having the same issue. I have had to put it on a Sonoff plug timer to restart it once per day. It seems to happen around 12:00-2:00 every day.

Strange? I’m using a good SD card and power supply. It used to be rock solid before.

It seems to have started happening since I updated to the new HassOS.

As far as I can tell, HassOS (v 1.12) as configured has no ability to re-connect to WiFi after a dropped connection. I’ve spent a few hours researching it this morning and haven’t found a viable workaround yet. Here’s an issue I posted in the hassos repo:

I initially thought I would just hack up a quick ping script to restart the network interface (like an if-down script) on loss of ping-ability of the router, but found out that HassOS has nothing like crond running on it.
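For reference, the kind of ping watchdog described above could look roughly like the following Python sketch. It is only an illustration under assumptions: the router address, the thresholds, and all function names are made up here; it needs to run somewhere that has Linux `ping`, `nmcli`, and Python available (a regular Linux install, not necessarily HassOS); and `nmcli networking off` / `on` is just one possible way to bounce the connection.

```python
import subprocess
import time
from typing import Tuple

ROUTER_IP = "192.168.1.1"      # assumption: replace with your router's address
FAILURES_BEFORE_RESTART = 3    # assumption: tolerate short dropouts
CHECK_INTERVAL_S = 60          # assumption: one check per minute


def router_reachable(ip: str) -> bool:
    """Send a single ping (Linux flags, 5 s timeout); True if it was answered."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "5", ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def restart_networking() -> None:
    """Toggle NetworkManager networking off and back on."""
    subprocess.run(["nmcli", "networking", "off"], check=False)
    time.sleep(5)
    subprocess.run(["nmcli", "networking", "on"], check=False)


def next_state(failures: int, reachable: bool, threshold: int) -> Tuple[int, bool]:
    """Return (new consecutive-failure count, whether to restart now)."""
    if reachable:
        return 0, False
    failures += 1
    if failures >= threshold:
        return 0, True  # reset the counter after triggering a restart
    return failures, False


def watchdog_loop() -> None:
    """Check the router forever and bounce networking after repeated failures."""
    failures = 0
    while True:
        failures, restart = next_state(
            failures, router_reachable(ROUTER_IP), FAILURES_BEFORE_RESTART
        )
        if restart:
            restart_networking()
        time.sleep(CHECK_INTERVAL_S)
```

Calling `watchdog_loop()` starts the check. Counting consecutive failures, rather than reacting to a single lost ping, avoids bouncing the interface on a momentary hiccup.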

I then spent a few hours reading through the NetworkManager and nmcli documentation, trying to figure out a way to get NetworkManager to, you know, manage the network… but I couldn’t figure out any way to do that, and a lot of googling didn’t turn up anyone else who had done so without a cron script.

I think that HassOS is running systemd and so there may be a way to create a systemd script that could monitor network connectivity and issue an nmcli network restart command if it goes down, but HassOS also seems to have a lot of the filesystem locked down as read-only and you can’t simply create new systemd scripts.

As of now, if you are running HA with HassOS and you need it to be reliably on your LAN, it appears that the only workaround for this issue is to plug the Pi into your router via an Ethernet cable. For an embedded system that is supposed to be the hub of a home automation system, that seems like a really bad constraint to me.

All my dumb “smart” devices know how to reliably reconnect to WiFi after they lose connectivity. That seems like a pretty basic thing for an embedded smart device.

2 years later. I’m on hassio and my entire RPi seems to lose connectivity every now and then (every 1-2 weeks, although it might be related to power outages, which are common). Every other device connected to the router over WiFi comes back online just fine; they never have an issue. But the RPi loses connectivity (no ping and no DHCP client list entry), even though it has a MAC-IP reservation and an Ethernet connection directly to the router.

I’ve also run into this problem. It started happening around v2020.12. Either that, or the install got corrupted when I let the battery it runs off run down.

I’m running Hassio on a hardwired Pi 3 B+ and a USB SSD.

My wifi router is scheduled to reboot itself once a week.

I doubt it was the battery corrupting the install, as I already reverted to a very old backup (0.117) on a new USB SSD and had the same issue after I upgraded to the 2020/2021 versions.

I’ll try a proper power supply and see if perhaps my buck converter has become unstable.

Edit1:
Using the approved power supply did not make any difference.

I also added a grandfather clock automation to send me a Telegram message every hour. What I can see is that the Telegram messages arrive late a few hours before I would lose connectivity. (However this could just be Telegram and not necessarily HA.)
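A small sketch of how such hourly heartbeat messages could be checked for lateness, in case it helps anyone trying the same trick. It assumes the messages are sent exactly on the hour and arrive less than an hour late, and that you have collected the arrival timestamps somehow; the function names and the two-minute tolerance are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def lateness(arrival: datetime) -> timedelta:
    """Delay of a message relative to the top of the hour it arrived in."""
    on_the_hour = arrival.replace(minute=0, second=0, microsecond=0)
    return arrival - on_the_hour


def late_heartbeats(
    arrivals: List[datetime],
    tolerance: timedelta = timedelta(minutes=2),  # assumption: normal jitter
) -> List[Tuple[datetime, timedelta]]:
    """Return (arrival time, delay) for every message later than the tolerance."""
    return [(a, lateness(a)) for a in arrivals if lateness(a) > tolerance]


# Example with made-up timestamps: the third message is 7.5 minutes late.
arrivals = [
    datetime(2021, 1, 10, 8, 0, 5),
    datetime(2021, 1, 10, 9, 0, 4),
    datetime(2021, 1, 10, 10, 7, 30),
]
for when, delay in late_heartbeats(arrivals):
    print(f"{when.isoformat()} arrived {delay} late")
```

As noted above, a late message could also be Telegram's doing, so a single late entry proves little; a run of increasingly late ones before a lock-up would be more telling.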

The Telegram messages allow me to catch the issue quicker, and what I noticed is that the IP still responds to pings but does not respond to SSH, and of course the HA web interface and app are unresponsive.

I’m going to start uninstalling some unsupported/custom add-ons even though the instability started before I added them.

Edit2:
Removing the unsupported add-ons did not make any difference either. It still crashed a few days later.

If it persists, I will have to build the install from scratch again.

While having the same type of issues I stumbled on this older thread.
Once in a while my Pi/HA “locks up” completely; it is not even possible to ping the Pi anymore.

But how to troubleshoot? After a restart (power off/on), all the logs I can reach from HA are clean.
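One way to get a record that survives the reboot is to log reachability from a second machine on the same network. The Python sketch below is an assumption-laden illustration, not an official tool: the host address, log file name, and function names are hypothetical, the `ping` flags are the Linux ones, and the ports are a guess at a typical setup (8123 is HA's default web port; adjust the SSH port to whatever your add-on uses).

```python
import socket
import subprocess
from datetime import datetime
from typing import Dict

HA_HOST = "192.168.1.50"            # assumption: address of the Pi
PORTS = {"ssh": 22, "web": 8123}    # assumption: adjust to your setup
LOG_FILE = "ha_probe.log"           # hypothetical log file on the monitoring machine


def ping_ok(host: str) -> bool:
    """One ping with a 3 s timeout (Linux flags); True if answered."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "3", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def tcp_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def format_line(when: datetime, ping: bool, ports: Dict[str, bool]) -> str:
    """Build one log line such as '2021-01-10T03:00:00 ping=up ssh=down web=down'."""
    parts = [when.isoformat(timespec="seconds"), f"ping={'up' if ping else 'down'}"]
    parts += [f"{name}={'up' if ok else 'down'}" for name, ok in ports.items()]
    return " ".join(parts)


def probe_once(host: str = HA_HOST, log_file: str = LOG_FILE) -> str:
    """Probe ping and TCP ports once and append the result to the log file."""
    states = {name: tcp_open(host, port) for name, port in PORTS.items()}
    line = format_line(datetime.now(), ping_ok(host), states)
    with open(log_file, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return line
```

Running `probe_once()` every minute from cron on the other machine gives a timeline showing when the Pi went quiet and whether ping, SSH, and the web interface dropped together or one after another.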

My only suspect now is the power supply, as I do sometimes see an “undervoltage” message; however, I’m not sure how serious these messages are. I read that stable power is crucial for the Pi, but I’m not sure whether the original power supply would never show these messages. Also, reading this thread, replacing it does not solve the issue (for some).
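To judge how serious the undervoltage messages are, the Pi firmware's throttling flags can be decoded. The sketch below assumes the `vcgencmd` tool is available in a shell on the Pi (it is on Raspberry Pi OS; whether it can be reached from a Hass.io install is not something this thread establishes). The function names are hypothetical; the bit meanings follow the Raspberry Pi documentation for `vcgencmd get_throttled`.

```python
import subprocess
from typing import List

# Bit positions reported by `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output: str) -> List[str]:
    """Decode a line like 'throttled=0x50005' into readable flag descriptions."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [text for bit, text in THROTTLE_FLAGS.items() if value & (1 << bit)]


def read_throttled() -> List[str]:
    """Run vcgencmd on the Pi itself and decode its answer."""
    result = subprocess.run(
        ["vcgencmd", "get_throttled"],
        capture_output=True,
        text=True,
        check=True,
    )
    return decode_throttled(result.stdout)


# Example with a sample value: under-voltage now and since boot.
print(decode_throttled("throttled=0x50005"))
```

The "has occurred" bits (16 and up) stay set until the next reboot, so they show whether power dipped at any point since boot even if everything looks fine at the moment.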

Of course I can (and probably will) just replace the power supply, but I would like to check some logs to get a real feeling for what’s happening.

This thread did not answer the fundamental question asked at the beginning: HOW to troubleshoot hass.io? I see lots of posts like this; lots of people have issues and are trying to fix them blind. It would be good if the HA team took care of this issue.