Home assistant hangs after few hours

Hi,
same issue here.

RPI4 8GB Ram.

Few entities, few automation, deconz and conbee setup.

Hangs every 2 weeks. May be some memory leak going on?

This is the type of issue that make me think that HA is “not ready for production”.

Same problem over here:
Raspberry 4 w 4G RAM, boot from SSD.
Hang occurs daily :frowning:
Minimal addons (SMB, Grafana, InfluxDB).

Have you guys checked your logs for entries along the lines of ‘malformed database’? Have you tried stopping HA, deleting the database file, restarting HA?

I checked the logs ( ha core logs ) and grepped for malformed.
Nothing there …
(tried top copy/paste the output, but the terminal addon doesnt facilitate copy/paste)

I added the logger config. Think there’s more info over there.

Check out https://github.com/home-assistant/operating-system/issues/1119
Lots of people with similar issues. When it freezes it looks like most of the time the logs don’t give a clue. Most in the 1119 issue can run fine on os below 5.5. This has been an issue for a lot of people after the kernel was change to make 5.5.

Yesterday I changed the recorder backend to MariaDB. Since then no hang :slight_smile:
A good video on how to achive this: https://youtu.be/0Nf70avId0w

And one day later, HA is dead again :unamused:
This is not good. Stability is crucial.

Have your tried the OS downgrade to 5.4. I have yet to have this crash. I just tried this morning an update to the firmware on my StarTech controller. I rebooted, went to 5.12 and 4 hours later it crashed. 5.4 is very stable for me.

RPI-4 clean install, no temp og memory problems - basically a decent installation but still experiencing hangs once a day for the last 2 months appx.
Any hints og news if someone is looking into this rather common issue?

I have the same issue now for several months, Systems freezes after 1 day - simple installation, new RaspberryPI with SD card, original power supply 2% CPU load, 58 degrees celcius, Disk 8%, Mem use 12%

recorder:
  purge_keep_days: 1
  commit_interval: 10

im still trying to resolve this issue unresponsive HA once everyday. I have set automation to reboot my HA every 6hours hoping this will solve the issue. I will update this ticket once it eliminates the issue.

Same here - tried almost everything. SD->SSD, cooling, disabled integration etc. etc.

I’m wondering if the devs are aware of this thread. I would post logs if I knew what they needed to diagnose the issue.

I’ve tried excluding sensors from the recorder, and I now have a sneaking suspicion, that it is an input_text value that is tripping up the recorder process.

My setup is now working again, after excluding it, purging the database, and restarting HA.
Will post again, once a day has past, to let you know it that worked.

Same here :man_facepalming:

Everything startet after switching from RPI3 to 4

Meanwhile i changed to the 3rd SSD adapter
I tested the SSD with blackmagic and crystaldiskmark nearby to the death
I use the original rpi power supply
I can’t see anything in the logs at all

Reboot / Fresh Install → 24 - 72h :: FREEZE !

actual i switched over to a A2 SD card instead of SSD
the next step would a try with 32Bit instead of 64Bit
lets see

really annoying :sleepy:

I had the same problem and a completely fresh install (no snapshot restore) seems to have fixed it. Details here: HassIO stops responding every so often - #75 by bradymholt

Hi,
This sounds just like my case. Did you find a fix to this?
Thanks in advance!

Worked for me to exclude that custom input_text entity from the recorder!

1 Like

Hi,

Could you share an example on how to apply this workaround?

Sure!

I started with excluding all domain types from being recorded (see below), restarting HA, and checking the logs to see if stuff was being recorded again.

Then I progressively enabled domains again, by commenting out those lines (see below).

Finally, I figured out it had to do with one of my custom input_text fields. So leaving that excluded got everything stable and working again.

recorder:
  [...]
  exclude:
    domains:
      - automation
      # - binary_sensor
      - camera
      # - climate
      - device_tracker
      # - group
      # - input_boolean
      - input_text
      # - light
      # - person
      # - sensor
      - sun
      # - switch
      - water_heater
      - weather
      - weblink
      - updater
      - zone

Thx, I had the same issue and it’s better now with your trick.