I recently upgraded to 0.64.0 on an RPi2 and started getting frequent "Timer got out of sync" errors. I only made minor changes to my config to get it running after the upgrade. I had been running with this config for 6 months without any Timer errors.
I also tried a fresh image on an RPi3 using version 0.65.6 and I am having the same issue. The errors start after about 8 hours of uptime and then recur about once a minute. The issue causes a lot of lag. I am using the following components:
Floorplan
AppDaemon
MQTT Switches
MQTT Sensors
MQTT Light
Automations
Generic MJPEG Camera
Orvibo S20 Switch
Shell Command
Chromecast
Denon Media Player
Does anyone know how I can troubleshoot this without disabling components and waiting 8 hours?
This sounds like a memory or other resource leak that causes the system to start swapping after about 8 hours. If you have a way of tracking memory and swap usage over time, you should be able to tell whether a change has stopped the leak well before the 8 hours are up.
I use a combination of telegraf/influxdb/grafana to keep track of system resources for situations like this, but you might be able to do the same by simply keeping an eye on a running top command.
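If you don't want to set up a full metrics stack or babysit top, a small script can log the same numbers for you. This is only a rough sketch, assuming a Linux system that exposes /proc/meminfo (as Raspbian does). The function names and the mem_log.csv file name are made up for illustration and are not part of HA.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines of the form "Name:   12345 kB" (the unit is optional).
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def swap_used_kb(info):
    """Swap currently in use, in kB (0 if the swap fields are missing)."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


def log_memory(samples, interval_s=60, out_path="mem_log.csv",
               meminfo_path="/proc/meminfo"):
    """Append `samples` rows of 'unix_time,mem_available_kb,swap_used_kb' to a CSV file."""
    with open(out_path, "a") as out:
        for i in range(samples):
            with open(meminfo_path) as f:
                info = parse_meminfo(f.read())
            # MemAvailable is absent on very old kernels; fall back to 0 there.
            row = "%d,%d,%d\n" % (time.time(),
                                  info.get("MemAvailable", 0),
                                  swap_used_kb(info))
            out.write(row)
            out.flush()  # make each sample visible immediately
            if i < samples - 1:
                time.sleep(interval_s)
```

Calling `log_memory(samples=120, interval_s=60)` records two hours of data. If available memory drops steadily or swap usage climbs during that window, the leak is still there, and you don't need to wait for the Timer errors to come back to know it.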
@gpbenton @jwelter I set up a cron job to restart HA every 6 hours. It has been working well for the past week. This is definitely not ideal, but it works.
That’s one way to solve it… but if you have a big Z-Wave network, the discovery time can be significant, and you are effectively running “offline” during each restart.
@gpbenton @jwelter I solved the "Timer got out of sync" errors by installing the Mosquitto MQTT broker. I have not had a single error since, the system is much more responsive, it uses little memory and CPU, and I have not restarted HA in 265 hours.