Home Assistant just stops working!

I have the latest version of Hassio loaded.
Tried both Pi 2B and 3B+
Different Ethernet Cables
Different Power Supplies
Different SD Cards (32GB)
It runs for a few days and then NOTHING. No errors, nothing. It either stops responding to web requests and SSH logins, or says
Unknown Error
RELOAD UI
Which doesn’t do anything.
I have to unplug power and then plug it back in. And then it works again for a few days.
This has been going on for Months and I’m getting really frustrated.
What is wrong?

PS. The add-ons I am running (all current) are:
File Editor
MariaDB
Mosquitto Broker
Samba Share
Terminal & SSH

On this latest reboot, I have disabled File Editor, Samba Share, and Terminal & SSH just to see if it makes any difference.

So you have a version that is 14 months old?

That was the last time it was called hassio. If so, I suggest you update.

If you mean Home Assistant OS (HA OS for short), it would help if you provided full logs after the error occurs. The system log will be overwritten by a restart, so obtain it from the SD card before restarting, using a Linux PC or a Windows PC running DiskInternals Linux Reader (free).

“I think there are CLI commands for the core and supervisor logs that can look at the previous journaled logs but I don’t know what they are.” tom_l

What is CLI and where are these log files that you keep telling me to look at?

Yes, I’m 0.0.2 release behind on the Core, but I’ve been updating for months now with every new minor update and nothing has fixed this. So really not interested in doing it again. And the release notes don’t say anything about fixing a freeze or lockup

CLI is command line interface, like when you log into the host machine. The suggestion is not to take a look at the HA or supervisor logs, but rather the Linux syslog, to see if there was something like a kernel panic that took the host down

Ok, great. Where are the Linux syslogs?

In /var/log. It will have the current log, as well as logs from previous boots.
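
For anyone following along: once a log has been copied off the machine as a plain text file, a quick scan for crash-related lines can save a lot of scrolling. This is only a rough Python sketch. The marker strings are examples of messages Linux kernels commonly print (an assumption, not an exhaustive list), and `find_crash_markers` and `scan_file` are made-up helper names.

```python
# Illustrative marker strings; adjust them to whatever your own logs contain.
CRASH_MARKERS = (
    "kernel panic",
    "out of memory",
    "oom-killer",
    "under-voltage detected",
    "i/o error",
)


def find_crash_markers(lines, markers=CRASH_MARKERS):
    """Return (line_number, text) pairs for lines containing any marker.

    Matching is case-insensitive; line numbers start at 1.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(marker in lowered for marker in markers):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path):
    """Scan a text log file, replacing bytes that are not valid UTF-8."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_crash_markers(handle)
```

Calling `scan_file("syslog.txt")` (a hypothetical file name) returns the matching lines with their line numbers, which gives a starting point for reading the surrounding context by hand.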

Thank you! Since they are not overwritten (as previously stated in another post), I will take a look as soon as my system is done rebooting from an update I just did to core-2021.4.5, just in case anyone suggests I update it…again…sigh.

There are other logs that are overwritten, such as the main HA python process log. I was considering modifying the main HA python init code to make a copy of the previous log before it erases it… I am confounded why that is not the default behavior

Strange. There is nothing in /var/log

Do you see ever-increasing CPU or memory usage?
You can track that with a simple integration or add-on (System monitor - Home Assistant).
It’s worth checking.

This happened to me some time ago, and it seems to be happening to some other people currently as well:

In my experience, the load was ever increasing, until the core just failed.
It was always possible to bring it back up manually without restarting the whole machine, by logging into the CLI via SSH and restarting the Core.
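
If you export the CPU or memory readings, a crude check for that "ever increasing" pattern could look like the Python sketch below. It simply compares the average of the first third of the samples with the average of the last third. The function name and the default threshold of 10 are arbitrary choices for illustration, not anything Home Assistant provides.

```python
def is_steadily_rising(samples, min_increase=10.0):
    """Compare the average of the first and last third of the samples.

    Returns True when the later average exceeds the earlier one by at
    least `min_increase` (same unit as the samples, e.g. percent).
    """
    third = len(samples) // 3
    if third == 0:
        return False  # not enough data to compare
    early = sum(samples[:third]) / third
    late = sum(samples[-third:]) / third
    return late - early >= min_increase
```

A flat or merely noisy series returns False, so this only flags the slow climb described above, not a sudden spike.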

There are several reasons this can happen… it wasn’t easy to troubleshoot. I ended up making a VERY minimal config.yaml just to test. Also take note of this line in your config:

default_config:

This includes several components which get activated by default. If you delete this line, you may have to add them one by one. Check the linked thread for some more insight.
For me it was once an issue with my network config, and another time I had issues with the RTSP stream component.

Examples:

automation:
history:
logbook:
map:
mobile_app:
person:
ssdp:
sun:
system_health:
updater:
zeroconf:

Hope that helps!

You are looking at the filesystem of the container, not the host operating system

When it freezes, SSH doesn’t respond, Web doesn’t respond. And from the RPI sensor, it does not appear anything was being overtaxed.

If the system is actually frozen, it will not log stats during that period, so the graph will probably just show the last known value. The CPU might actually have been at 100% that whole time. Either way, this does not look like resource usage climbing steadily until a crash.
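
One way to see what the machine was doing right before a freeze is to write each sample to disk immediately, so the last line survives a power cycle. Below is a minimal Python sketch, assuming it runs somewhere `os.getloadavg()` is available (Linux and other Unix-like systems). The helper names and the one-minute interval are hypothetical choices, and it records only the load average, not memory.

```python
import os
import time


def format_sample(timestamp, load_1m, load_5m, load_15m):
    """Build one CSV line: UTC timestamp plus the three load averages."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(timestamp))
    return f"{stamp},{load_1m:.2f},{load_5m:.2f},{load_15m:.2f}\n"


def append_sample(path, line):
    """Append a line and force it to disk so it survives a hard power-off."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line)
        handle.flush()
        os.fsync(handle.fileno())


def record_forever(path, interval_seconds=60):
    """Sample the load average at a fixed interval until interrupted."""
    while True:
        append_sample(path, format_sample(time.time(), *os.getloadavg()))
        time.sleep(interval_seconds)
```

After the next freeze, the timestamp on the last line of the file shows roughly when the machine stopped, and the load values show whether it was busy at that moment.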

Just got same issue on rpi4 with SSD, latest updates and sqlite DB.


Cannot connect through SSH :frowning: I am not at home, so I cannot physically restart it.
I will need to set up a remote restart for such cases.
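
A remote restart usually means something outside the Pi notices it has stopped answering and cuts the power, for example through a smart plug. The Python sketch below covers only the detection half and is meant to run on a second machine. The function names are made up, the threshold of three failed checks is arbitrary, and the part that actually toggles the power is left out because it depends entirely on your hardware.

```python
import http.client
import urllib.error
import urllib.request


def is_reachable(url, timeout=5.0):
    """Return True if the URL answers with any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied (e.g. 401 or 404), so the host is alive.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Refused, timed out, unresolvable, or garbled: treat as down.
        return False


def should_power_cycle(history, threshold=3):
    """True when the most recent `threshold` checks all failed.

    `history` is a list of booleans, oldest first (True = reachable).
    """
    if len(history) < threshold:
        return False
    return not any(history[-threshold:])
```

In use, you would append the result of `is_reachable("http://<your-pi-address>:8123/")` to a list every minute or so (8123 is Home Assistant's default web port; the address is a placeholder) and act only when `should_power_cycle` returns True, so a single dropped request does not trigger a reboot.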

Update #1: now looking at the router. HA is still active on the network.

The last 7 days show increased activity on the 12th of June. I was not at home on the 12th of June :frowning: and I have not enabled any auto updates.
On the 10th of June I did an update to the latest version and also changed the user configuration to show time in 24-hour format. Maybe it’s related?

Update #2: unplugging the Raspberry Pi 4 power supply and plugging it back in solved the issue.
[Image: processor and memory use before (~08:00), during, and after (~13:45) the outage]

@DaHai did you resolve this issue? I have been having it roughly weekly for the last two months.
Setup: RPi 4 with SSD, always the latest updates, and an SQLite DB. The DB size is currently ~600 MB.
I don’t even know where to start, as power cycling loses all the logs. During an outage I cannot access the system via Samba, and other add-ons, for example Plex, also stop working.

I also sometimes get a notification from my custom automation reporting that an IKEA light connected through ConBee/deCONZ has gone offline, but the logbook does not show any such event.
Any thoughts?

Had this issue again. But it looks like I am the only one experiencing it.

I have a similar issue almost daily with an SD card. I moved to an SSD to try to fix it, but now it has happened again.

Can’t access anything to check logs and logs are overwritten when rebooting to fix it.

I have the same problem. I also moved to an SSD to try to fix it, and it still happens.

Mine happens sporadically. Sometimes it is twice in a week, sometimes everything is fine for two whole weeks.
How can I troubleshoot this?
RPi 4 with SSD, plus the Plex server add-on, so it is also acting as a media server.
I always apply the newest updates.

Hi,
Has anyone found a solution? I have had the same problem for about a month now. Every three days I have to reboot (by disconnecting the power supply) because the system freezes. I can load the frontend and sometimes can still switch lights, but that’s it. Configuration.yaml cannot be found and the Supervisor cannot be loaded. I tried reinstalling every add-on, with no result. With a basic configuration, I get the same result.