Some experience fixing random HA restarts

Hi all,
I'm writing this post after fixing two random-restart issues with HA. It took me several weeks to find the root causes. It was stressful and cost me a lot of time, because I don't have any experience with the CLI and Google couldn't help. I'd like to share what happened and how I fixed both issues, in the hope that it helps someone facing a similar problem. If you have any experience fixing restart issues, please share it in the comments.
FYI, I run HA on a NUC with VirtualBox (Debian). Now on to the issues:

1st random restart issue: this came out of nowhere after more than a year of use. I could not find anything abnormal in the logs, and I hadn't installed or updated HA or changed any settings beforehand. I read every restart-related issue I could find through Google and none of them helped. When I had almost given up, I accidentally noticed something abnormal in my VirtualBox settings, under Storage.

At that point I suspected a storage volume problem: after long use, the log files and database grow, and once the disk's actual size reaches its virtual size, HA may restart. I googled how to increase the virtual size and, bam, eureka, issue fixed. You can refer to the picture below to increase the virtual size.
Root cause: the actual storage size had reached the virtual size.
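
For anyone who prefers checking this from the command line instead of the Storage dialog, here is a rough Python sketch. It assumes that `VBoxManage showmediuminfo disk <file>` prints `Capacity:` and `Size on disk:` lines in MBytes (the sample text and the file name `haos.vdi` are made up for illustration, not taken from my machine), and it only builds the `VBoxManage modifymedium disk ... --resize` command rather than running it.

```python
import re

def parse_medium_info(text):
    """Return (capacity_mb, size_on_disk_mb) from showmediuminfo-style text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()

    def mbytes(key):
        match = re.match(r"(\d+)\s*MBytes", fields.get(key, ""))
        if match is None:
            raise ValueError(f"could not find '{key}' in MBytes")
        return int(match.group(1))

    return mbytes("Capacity"), mbytes("Size on disk")

def disk_fill_ratio(capacity_mb, size_on_disk_mb):
    """Fraction of the virtual size that the actual disk file already uses."""
    return size_on_disk_mb / capacity_mb

def resize_command(disk_path, new_size_mb):
    """Build (but do not run) the command that grows the virtual disk."""
    return ["VBoxManage", "modifymedium", "disk", disk_path,
            "--resize", str(new_size_mb)]

# Hypothetical sample output, trimmed to the lines the parser needs.
SAMPLE = """\
Storage format: VDI
Capacity:       32768 MBytes
Size on disk:   32100 MBytes
"""

if __name__ == "__main__":
    cap, used = parse_medium_info(SAMPLE)
    print(f"{used}/{cap} MB used ({disk_fill_ratio(cap, used):.0%})")
    # Doubling the size is an arbitrary choice for the example.
    print(" ".join(resize_command("haos.vdi", cap * 2)))
```

If the ratio is close to 100%, the disk is in the same state mine was in. Shut the VM down and take a backup before running the printed resize command.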

2nd random restart issue: this one was more complicated than the first. Before it appeared I had made several changes: hardware (replaced the CPU fan), an upgrade of the Debian OS, and an HA update. I didn't actually notice the problem until I happened to look at my HA uptime, shown below. And when I tried to update the HA OS/host/Supervisor, it restarted automatically but did not update to the new version.
[image: HA uptime screenshot]

I googled again and, as with the first issue, nothing I found helped. I changed the config and restarted. Sometimes after restarting it showed config-related errors, and even "hardware not supported" (WTH!!! I have used this NUC for more than 2 years and now it says the hardware is not supported :rofl::rofl::rofl:). I did everything I could to rule out the hardware and then focused only on the software. I checked for the latest updates for everything from Debian to VirtualBox, and they all reported the latest version, but the issue was still there :((.

I felt hopeless, so I decided to back up everything and install a fresh HA. But the issue didn't let me go that easily and the nightmare continued. I couldn't install the new HA: it restarted automatically during installation, and I couldn't even reach the welcome/login screen of the new HA.

I wanted to give up and rebuild everything from scratch by reinstalling Debian and HA. But I wasn't satisfied with that. Reinstalling everything might help, but what if it happened again in the future and I had to reinstall everything all over again??? So I made one last attempt: I uninstalled VirtualBox and reinstalled it. Luckily that was the right move, and the issue was fixed.
--> The root cause was probably that Debian had been updated but VirtualBox had not. The tricky part is that I had already used the command line to update and check VirtualBox (before uninstalling it), and it said it was the latest version (Version 6.1.22xx), but after I removed and reinstalled it, the version was 6.1.44xx.
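
Since the package manager told me I was up to date when I wasn't, it may be worth comparing the version VirtualBox itself reports against the latest release listed on the VirtualBox download page. Below is a small Python sketch of that comparison. The helper names are my own, the version strings are just the numeric part of the two versions above (the build suffix is left out), and `installed_virtualbox_version()` assumes `VBoxManage` is on the PATH.

```python
import re
import subprocess

def parse_version(text):
    """Turn a string starting with 'X.Y.Z' into a comparable tuple of ints."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    return tuple(int(part) for part in match.groups())

def is_outdated(installed, latest):
    """True if the installed version is older than the latest known release."""
    return parse_version(installed) < parse_version(latest)

def installed_virtualbox_version():
    """Ask VirtualBox itself for its version (needs VBoxManage on the PATH)."""
    result = subprocess.run(["VBoxManage", "--version"],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()

if __name__ == "__main__":
    # Numeric prefixes of the two versions from this post.
    print(is_outdated("6.1.22", "6.1.44"))  # True
```

On a real machine you would pass `installed_virtualbox_version()` as the first argument and type in the latest version number by hand.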
