HA crashes every few days because of Netatmo camera in HomeKit

My HA crashes after a few days. No idea why. If I force restart HA every day at night it “never” crashes.

Any idea what that could be or any advice on how to handle this?

By the way, I guess homeassistant.reload_all is not enough to fix that and I need to use homeassistant.restart.

Please provide (a lot more) info about your system and hardware.

I am using HA Core on a Mac mini M1.

Installed latest versions of “everything”. macOS, HA, etc.

Log files?

They are overwritten when restarting. Do I need to change something for that?

There should be a home-assistant.log.1 file which is from the last boot.


This 👆.

But in addition, if rebooting fixes your problem, then it could be a bad integration. I’ve had this before and had to resolve it in exactly the same way, by rebooting daily or weekly. Even now I have a gremlin I’m trying to track down that jacks my CPU up to 70% after a few days, and currently I auto-reboot weekly to work around it. You can disable anything you can do without for a day and see if your problem goes away. If it does, you know an integration is not playing well with your system.


Thanks, I restarted in the meantime, so this is gone as well.

It might well be an integration, because the system sometimes became a bit slow. HomeKit in particular responded very slowly. I will keep restarting once a day. I guess there is little disadvantage to that. Even if I find out it is an integration, I probably still want to use it anyway. 🙂

I get “Too many open files” errors that refer to some shell scripts I run every minute. I assume it gets into trouble if a script doesn’t finish (although I don’t know why it wouldn’t; it just takes some values from HA and makes an image using ImageMagick).

I have now set the script mode to single instead of restart. In that case, if the script hangs, it doesn’t just start another one. I will check whether that works better. For now it does.
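For what it's worth, a script that can hang is easier to contain if the external command gets a hard timeout, so a stuck run is killed instead of sitting there holding its files open. A minimal Python sketch of that idea, assuming the image step could be wrapped this way; `run_with_timeout` and the stand-in command are hypothetical, not anything from the setup described here:

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return True on exit code 0, False on failure or timeout."""
    try:
        # subprocess.run kills the child and waits for it if the timeout
        # expires, so a hung command cannot pile up across one-minute runs.
        result = subprocess.run(cmd, timeout=timeout_s, capture_output=True)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical stand-in for the real image command: sleeps 5 s, limit is 1 s.
    hung = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(hung, timeout_s=1))
```

Note that this only stops the direct child process. If the command is a shell script that starts further processes, those may need their own handling.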

Does this make any sense?

Is there something specific I should look for in the logs? I have multiple errors appearing. Most of them are about too many open files, from scripts and HACS.
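One way to see which component produces most of the errors is to tally the ERROR lines per logger. A rough Python sketch, assuming the log lines follow the usual `date time LEVEL (thread) [logger] message` layout of home-assistant.log; `tally_errors` is a made-up helper name and the script filename in the usage comment is just an example:

```python
import re
import sys
from collections import Counter

# Matches e.g. "2024-06-13 18:39:31.501 ERROR (MainThread) [homeassistant] ..."
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ "
    r"(?P<level>[A-Z]+) \([^)]*\) \[(?P<logger>[^\]]+)\]"
)


def tally_errors(lines):
    """Count ERROR entries per logger name; traceback lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("level") == "ERROR":
            counts[match.group("logger")] += 1
    return counts


if __name__ == "__main__":
    # Example usage: python tally_errors.py home-assistant.log.1
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
            for logger, count in tally_errors(fh).most_common(10):
                print(count, logger)
```

The logger with the highest count is a reasonable first candidate to disable for a day.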

It now crashes very often after it was ok for days.

Any ideas what could cause this?

2024-06-13 18:39:31.501 ERROR (MainThread) [homeassistant] Error doing job: socket.accept() out of system resource (None)
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/selector_events.py", line 178, in _accept_connection
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/socket.py", line 295, in accept
OSError: [Errno 24] Too many open files
2024-06-13 18:39:32.205 ERROR (MainThread) [homeassistant] Error doing job: socket.accept() out of system resource (None)
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/selector_events.py", line 178, in _accept_connection
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/socket.py", line 295, in accept
OSError: [Errno 24] Too many open files

It’s likely one of the following, though certainly not limited to these cases:

  • You have an integration that is leaking file descriptors
  • Your custom script never finishes, so copies keep piling up in the background until the system runs out of resources
  • Your ulimits are set too low
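To narrow down which of these applies, it can help to compare the per-process limit on open files with how many descriptors are actually open. A small Python sketch using the standard `resource` module (Unix-only, so it should work on macOS); note that it inspects the process it runs in, not Home Assistant itself, and `nofile_limits` and `open_fd_count` are hypothetical helper names:

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count():
    """Count descriptors currently open in this process via /dev/fd."""
    # /dev/fd exists on macOS and Linux; listing it briefly opens one extra fd.
    return len(os.listdir("/dev/fd"))


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
    print(f"open descriptors in this process: {open_fd_count()}")
```

For the running Home Assistant process, `lsof -p <pid>` lists its open files. A count that keeps growing over hours points at a leak rather than a limit that is simply too low.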

I do find hundreds of log entries like that for watchman and xiaomi-miot:

Detected blocking call to open inside the event loop by custom integration 'xiaomi_miot' at custom_components/xiaomi_miot/__init__.py, line 208: with open(os.path.dirname(__file__) + '/core/miot_specs_extend.json') as file: (offender: /config/custom_components/xiaomi_miot/__init__.py, line 208: with open(os.path.dirname(__file__) + '/core/miot_specs_extend.json') as file:), please create a bug report at https://github.com/al-one/hass-xiaomi-miot/issues
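As far as I understand it, this warning means the integration reads a file directly on the event loop, which stalls everything else while the read is in progress. Because the file is opened in a `with` block it is closed again afterwards, so on its own this looks more like a slowness issue than a descriptor leak. The usual remedy on the integration side is to push the read into a worker thread (inside Home Assistant, integrations typically use `hass.async_add_executor_job` for that). A generic asyncio sketch of the pattern, not the actual xiaomi_miot code; `load_json_blocking`, `load_json`, and the temp file are made up for illustration:

```python
import asyncio
import json
import os
import tempfile


def load_json_blocking(path):
    """Ordinary blocking file read; fine in a thread, not on the event loop."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


async def load_json(path):
    """Run the blocking read in the default thread pool and await its result."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, load_json_blocking, path)


if __name__ == "__main__":
    # Hypothetical demo file standing in for a bundled JSON spec file.
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
        json.dump({"ok": True}, tmp)
    try:
        print(asyncio.run(load_json(tmp.name)))
    finally:
        os.remove(tmp.name)
```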

This might be the cause. I didn’t disable both integrations at the same time, so it still crashed. Now I will check with both disabled.

HomeKit became very slow to respond after HA had been running for a while. So for now I have disabled all plugins that create Homebridge instances (camera, devices).

No crashes so far. Could it be related? I get lots of messages telling me the instances have not been paired before.

So maybe re-pairing or reinstalling might help. But if I reinstall all of them I have to set the devices up again. And I suspect it won’t help me anyway. Does anyone have experience with that?

Today I got an error while installing/upgrading zwavejs2mqtt telling me that node is not the correct version. I can’t find it anymore, but the error message mentioned homebridge for some reason.

I downgraded from the latest node.js to node.js LTS. The error is gone and it seems stable even using the cameras in HomeKit.

So maybe this is related. For now it is good. Maybe this helps someone or someone could explain what happened.

Seems like the Netatmo cameras using Homebridge were the issue.

No crash since I disabled their Homebridge entities. Once I enable them it crashes once a day at least.

Update after two days:
Still no crashes and HomeKit is lightning fast.

Note: Netatmo cameras are still integrated in HA, but I am not using Homebridge for these cameras. Other cameras are working with Homebridge though.

My guess is that the stream of the Netatmo Presence or the Smart Doorbell causes the crash.