Struggling to make my system stable (HASS.IO on RPi)

I have been playing around with Home Assistant for more than a year by now. I’m using HASS.IO on a Raspberry Pi 3 and have connected two TP-Link switches, four Sonoff devices with Tasmota, five Homematic IP radiator thermostats, a Chromecast and a Google Assistant Mini. I’m using the Samba, SSH and Mosquitto add-ons.

All in all I’m very happy with my setup, and so is my family. Since they all got used to it, I owe it to my wife and kids to keep it running smoothly. This is increasingly becoming a problem, as my setup is just not stable enough. Every couple of days I wake up and the RPi/HA combo is down. No connectivity, and I cannot use HTTP/SSH/Samba to access it. A simple reset and all is back to normal.

I have gone through the forums and Reddit, so I know it might be an SD card problem, that turning off history might help, or that one of the components I’m using could simply be causing the issue. I have ruled out a particular HA version, as this has been going on for quite a while and I have been through a couple of HA updates with no difference in stability.

What I still don’t know is how to go about tracking down the issue. The log files are empty after I reset, so I have no idea what happened when the system went down. Should I push my log files somewhere so I can spot the issue brewing? Should I simply change the SD card and hope it goes away? Is this normal for an RPi/SD card, so that I should move to a Linux PC and host HA there? What are your thoughts?

As mentioned in a number of other threads on here, a good quality SD card is highly recommended, as is a good quality power supply.

The things that kill SD cards quickly are lots of writes (log files), a subpar power supply, and frequent power cuts (blackouts, pulling the cord). Having the “official” RasPi PSU does not guarantee you won’t have issues, as many others have found.

If you have a component that is not set up correctly and is constantly spamming your logs, it “could” be causing your SD card an issue. Periodically check your log files while it is running and see if anything unusual is spamming them.
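
If a noisy component does turn up, one way to cut down on log writes is the logger integration in configuration.yaml. This is only a minimal sketch: the component name below is a placeholder and not something from this thread, so substitute whatever is actually filling your log.

```yaml
# Sketch only: log warnings and above by default to reduce SD card writes.
logger:
  default: warning
  logs:
    # Placeholder name: replace with the component that is spamming your log.
    homeassistant.components.example_component: error
```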

With Pis it is almost always the PSU or the SD card that is causing the issue. Take regular backups of your config and store them on Google Drive, a thumb drive, etc.

Thanks, kanga_who.

I do have the official PSU, so I wasn’t paying attention to that. I have another official one connected to an RPi 2 running Kodi 24/7 that is rock solid, so I’ll swap them and see if it makes any difference. Thanks for the tip!

The quality of the SD card should also be fine, but it is a used one, so this might be it. Again, I can easily try another one and see the difference. I’ll do that after I have run on the other PSU for a couple of days.

My log is not being bombarded by any of the components, just a couple of events per day. Another thing is history. It is huge, with weather info, heating data, etc. Do you guys turn this off? It too is bombarding the SD card.

Also, thanks for the suggestion. I do take good care of my configuration; it’s neatly stored on github.com.

If you don’t have the recorder set up to purge, that could help. This is mine.

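recorder:
  # Run the purge job every 7 days
  purge_interval: 7
  # Keep 15 days of history in the database
  purge_keep_days: 15
  # Record only the domains listed here
  include:
    domains:
      - sensor
      - binary_sensor
      - switch
      - light
      - automation
      - device_tracker
      - vacuum
      - group
      - input_boolean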

I recently pointed the recorder to a Postgres instance on my NAS. Logbook, History and HA restart times are blazing fast now. It might not help you with your problem directly, but it definitely offloads the Pi.
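
For anyone wanting to try the same, pointing the recorder at an external PostgreSQL server comes down to the db_url option. This is a rough sketch, not the poster’s actual config: the user, password, IP address and database name are all placeholders.

```yaml
recorder:
  # Placeholders: user, password, server address and database name are assumptions.
  db_url: postgresql://hass_user:hass_password@192.168.1.10/homeassistant
```

The database and user have to exist on the server before Home Assistant is restarted with this setting.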

I like this idea! Actually, I just provisioned a MariaDB instance in my Azure tenant. I’m going to give this a try and report back.

MariaDB made a world of difference for me. I had a really wonky setup for a long time, but it is finally working great.
Things that made a difference for me:
Got rid of DuckDNS, since it made my router and connections go bananas.
Moved from ResinOS to HassOS; more specifically, I “recently” updated HassOS and it’s very stable.
Disabled Bluetooth (it was interfering with my Zigbee stick).
Switched to MariaDB.
Bought a full UniFi setup (USG, PoE switch, 2 AC Lite APs) and got rid of my ISP router.

The last step might have been overkill, but I suspect the ISP router was the culprit that made DuckDNS not work well. In particular, whenever a device’s DHCP lease was renewed, my whole network could tank for an hour or so. Hassio would be unresponsive, etc.

Good luck!

How do you do presence detection? Bluetooth proved to be the most reliable for me, and I intend to add a Zigbee stick soon. I wouldn’t be happy if they were mutually exclusive.

If you are asking me, I have tried MyTracks on Android but I found it lacking and buggy. On iOS I use the iOS app.

For MQTT I use a local Mosquitto. How is a local MQTT server for SD card health? Does it do a lot of writes, or is it mainly held in memory? I know, I should google it, but this just popped into my mind.

@davidv no, sorry. I was asking @wrenchse because he mentioned turning off bluetooth to avoid interference with zigbee.

Oh, sorry.

But my question still stands - if best practice is to move the history DB out, does it make sense to have a local MQTT server on the Raspberry Pi, or is this yet another thing that is hammering the SD card and better moved elsewhere?

It would be best to contact the maintainer of the MQTT add-on you intend to use.
MQTT uses a publish/subscribe pattern.
It should run entirely in memory. Only if the number of active messages exceeds the number/size allowed in RAM should it write to persistent storage.
That, and if it uses logging extensively.

Presence is done via the iOS app and UniFi, and it is working great so far. I don’t allow outside access, though, so I can’t see someone’s exact whereabouts when they are away.

Dear @davidv, have you solved your issue? It looks like most people running HA on an RPi with an SD card have the same issue. For me it started with the upgrade to HA OS beyond 4.8, but I cannot go back. I suspect the upgrade because, when I was able to go back for a while (there was a bug in 4.9 causing the same problem), it worked. Since then I have been updating regularly, and the freezes happen anywhere from minutes to two days apart. I also added a ton of new integrations and devices, so I think it is linked to the recorder/database.

I can try to set up MariaDB on my NAS, but unfortunately it is not in the same flat. Nevertheless, if moving the DB is the solution, I would love to know. Thanks.

I no longer run HASS on a Raspberry Pi. I moved to running it as a virtual machine on my home server. But from what I understand from other posts in the forum, a lot of progress has been made with using an SSD on the Raspberry Pi to host the data, and that should be a solution in itself.

Thanks for the quick reply. I am trying to point the HA DB to a MariaDB server on my NAS, which is 500 km away through a VPN tunnel. We will see. If that does not work, I will move to an external SSD. However, I do not like the idea of buying an external SSD. Do you think a USB stick would do the same?

So I redirected the recorder to the MariaDB server on my NAS 500 km away, and the whole HA instance is speedy like a rocket. I will come back in several days to report whether it is also stable.
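
Roughly, the recorder redirect looks like the sketch below. It is not my exact config: the user, password, host and database name are placeholders, and the address is assumed to be whatever the NAS is reachable at (here, through the VPN).

```yaml
recorder:
  # Placeholders: credentials, host and database name are assumptions.
  db_url: mysql://hass_user:hass_password@10.8.0.2/homeassistant?charset=utf8mb4
```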

A similar approach to mine can be found here: RPi Hassio + Synology NAS Maria DB For logging

I suspect you will have the same issues with a USB stick as with an SD card.