My HA instance regularly becomes unavailable, which is frustrating because it takes a core restart to get it back up and running. Whilst it is in this state, automations do not trigger and integrations such as Z2M do not work, effectively crippling my house.
I’m trying to diagnose the cause, but I don’t really know where to start, as it usually happens in the middle of the night or whilst we are out. My theory is that the Synology integration is responsible. I use it to start and stop the NAS based on occupancy, and I suspect that when the NAS is turned off, the integration does something that leads to HA failing.
Where do I begin diagnosing this? Once HA crashes or becomes unresponsive, I have no way to access the logs, so where should I look? Are there any pointers for tracking down exactly when HA becomes unresponsive?
I also wonder if there’s a way to monitor HA externally and trigger a core restart if it crashes, although this would be a sticking plaster rather than a fix.
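A rough sketch of what such an external watchdog could look like, run from a separate always-on machine (not the NAS and not the HA host). It probes the HA web front end once a minute, prints a timestamp whenever a probe fails or HA recovers (which would also pin down when the outage began), and runs a restart command after three failed probes in a row. The address, the interval, the threshold, and the SSH-based restart command are all assumptions for illustration; `ha core restart` is the HA CLI command, but whether it can be reached over SSH depends on the install type.

```python
import subprocess
import time
import urllib.error
import urllib.request
from datetime import datetime

# All of these values are assumptions; adjust them to your own setup.
HA_URL = "http://homeassistant.local:8123/"   # hypothetical address
CHECK_INTERVAL = 60                           # seconds between probes
FAILURES_BEFORE_RESTART = 3                   # consecutive failures before acting
# Hypothetical restart command; requires SSH access to the HA host.
RESTART_CMD = ["ssh", "root@homeassistant.local", "ha", "core", "restart"]


def is_up(url, timeout=10):
    """Return True if the server sends back any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status (e.g. 401) still means the server is answering.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return False


def next_state(failures, ok, threshold=FAILURES_BEFORE_RESTART):
    """Return (new_failure_count, should_restart) after one probe.

    A restart is requested on every `threshold`-th consecutive failure,
    so a single outage does not trigger a restart on every probe.
    """
    if ok:
        return 0, False
    failures += 1
    return failures, failures % threshold == 0


def watch():
    """Probe forever, logging outages and recoveries with timestamps."""
    failures = 0
    while True:
        ok = is_up(HA_URL)
        if not ok or failures:
            # Log every failed probe, plus the first success after an outage.
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} up={ok}", flush=True)
        failures, restart = next_state(failures, ok)
        if restart:
            subprocess.run(RESTART_CMD, check=False)
        time.sleep(CHECK_INTERVAL)
```

To use it, add a call to `watch()` at the end of the file and redirect the output to a log file. The first `up=False` line after a quiet period gives the approximate time of failure, which can then be compared against when the NAS was switched off.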
Any advice would be greatly appreciated.