AppDaemon crashing

tormagj · June 6, 2020, 10:48am

Hi all,

For the last couple of weeks, AppDaemon has been crashing or hanging completely: everything seems to stop, and so far the only fix I’ve found is to restart the AD plugin, after which everything returns to normal. It does not happen daily; it is more like weekly.

I’m not sure whether this is an AD issue, a problem in one or several of my apps, or a Home Assistant issue. Either way it is really annoying, because it makes my system unreliable.

I’ve just updated everything today, but I was on HA v110.3, HassOS 3.13 and AD 0.2.5 when this happened. I’m on an RPi 3. It is hard to give exact log info, but this is what the AD log shows when it happens:

2020-06-06 09:34:57.259916 INFO AppDaemon: --------------------------------------------------
2020-06-06 09:34:57.261254 INFO AppDaemon: Threads
2020-06-06 09:34:57.262315 INFO AppDaemon: --------------------------------------------------
2020-06-06 09:34:57.265195 INFO AppDaemon: Currently busy threads: 0
2020-06-06 09:34:57.267564 INFO AppDaemon: Most used threads: 3 at 2020-06-04 07:05:01+02:00
2020-06-06 09:34:57.270415 INFO AppDaemon: Last activity: 2020-06-04T07:05:01+02:00
2020-06-06 09:34:57.273238 INFO AppDaemon: Total Q Entries: 409
2020-06-06 09:34:57.275270 INFO AppDaemon: --------------------------------------------------
2020-06-06 09:34:57.278314 INFO AppDaemon: thread-0 - qsize: 0 | current callback: idle | since 2020-06-06T09:05:00+02:00, | alive: True, | pinned apps: ['Garbage']
2020-06-06 09:34:57.280501 INFO AppDaemon: thread-1 - qsize: 0 | current callback: idle | since 2020-06-06T09:18:21+02:00, | alive: True, | pinned apps: ['RSS']
2020-06-06 09:34:57.283109 INFO AppDaemon: thread-2 - qsize: 204 | current callback: idle | since 2020-06-06T06:15:08+02:00, | alive: True, | pinned apps: ['HeatPump']
2020-06-06 09:34:57.285532 INFO AppDaemon: thread-3 - qsize: 0 | current callback: idle | since 2020-06-06T09:18:20+02:00, | alive: True, | pinned apps: ['Yr']
2020-06-06 09:34:57.290411 INFO AppDaemon: thread-4 - qsize: 205 | current callback: idle | since 2020-06-06T06:15:13+02:00, | alive: True, | pinned apps: ['AlarmClock']
2020-06-06 09:34:57.292129 INFO AppDaemon: --------------------------------------------------
2020-06-06 09:34:57.294597 CRITICAL AppDaemon: Thread thread-2 has died
2020-06-06 09:34:57.296758 CRITICAL AppDaemon: Pinned apps were: ['HeatPump']
2020-06-06 09:34:57.298434 CRITICAL AppDaemon: Thread will be restarted
2020-06-06 09:34:57.300516 INFO AppDaemon: Adding thread 2

And so it continues, telling me thread by thread that they died, restarting them, and so forth. These messages fill up my log until I restart the AD plugin. I’ve got 5 apps running, and they all seem to be doing fine until everything crashes at the same time. That makes me believe it is not caused by a specific bug in one of my apps, since they’re all different. It may very well be, though, and I have seen some “random” concurrent.futures._base.TimeoutError errors every now and then.
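
For anyone comparing dumps like the one above, here is a minimal Python sketch that pulls out the threads with a non-empty queue. It only parses the log format shown in this post; the helper name `backed_up_threads` and the regular expression are illustrative assumptions, not part of AppDaemon. In the dump above it picks out thread-2 and thread-4, whose queue sizes (204 and 205) add up to the reported total of 409 entries and which both show "idle" since about 06:15.

```python
import re

# Matches the per-thread status lines as they appear in the log excerpt above.
# The pattern is an assumption based on that excerpt, not an AppDaemon API.
THREAD_LINE = re.compile(
    r"(?P<thread>thread-\d+) - qsize: (?P<qsize>\d+) \| "
    r"current callback: (?P<callback>[^|]+?) \|"
    r".*pinned apps: \[(?P<apps>[^\]]*)\]"
)


def backed_up_threads(log_text, min_qsize=1):
    """Return (thread, qsize, callback, apps) for threads with at least min_qsize queued entries."""
    results = []
    for line in log_text.splitlines():
        match = THREAD_LINE.search(line)
        if match is None:
            continue  # not a per-thread status line
        qsize = int(match.group("qsize"))
        if qsize >= min_qsize:
            apps = [a.strip().strip("'") for a in match.group("apps").split(",") if a.strip()]
            results.append((match.group("thread"), qsize, match.group("callback").strip(), apps))
    return results


# Three lines copied from the dump above.
SAMPLE = """\
2020-06-06 09:34:57.278314 INFO AppDaemon: thread-0 - qsize: 0 | current callback: idle | since 2020-06-06T09:05:00+02:00, | alive: True, | pinned apps: ['Garbage']
2020-06-06 09:34:57.283109 INFO AppDaemon: thread-2 - qsize: 204 | current callback: idle | since 2020-06-06T06:15:08+02:00, | alive: True, | pinned apps: ['HeatPump']
2020-06-06 09:34:57.290411 INFO AppDaemon: thread-4 - qsize: 205 | current callback: idle | since 2020-06-06T06:15:13+02:00, | alive: True, | pinned apps: ['AlarmClock']
"""

for thread, qsize, callback, apps in backed_up_threads(SAMPLE):
    print(thread, qsize, callback, apps)
```

Running it against a full log file (for example by passing the file's contents as `log_text`) would show whether the same apps are the ones backing up each time.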

I’ve also noticed that HA seems to be having issues, but I didn’t find anything specific in the logs, only warnings and errors from “all” of my stuff in there at one specific time, after which it seems to recover. The CPU and memory sensors also showed a sudden peak in load, but I’m not sure whether that is a cause or an effect. At least twice it happened at more or less the same time, around 06:15 in the morning (and neither my system nor my apps are set up to do anything particular at that time). Not sure if this is relevant, but… SD card issues came to mind, though isn’t it odd that everything works fine after a restart then?

Can anyone guide me in the right direction for further troubleshooting, or has anyone had similar issues?

Thanks in advance.

javerre · July 12, 2020, 10:26am

Same here. Everything runs fine for a few days and then I get:

2020-07-12 11:20:21.186659 CRITICAL AppDaemon: Thread thread-0 has died
2020-07-12 11:20:21.187663 CRITICAL AppDaemon: Pinned apps were: ['front_border_moisture']
2020-07-12 11:20:21.190158 CRITICAL AppDaemon: Thread will be restarted

filling up my log

tormagj · July 15, 2020, 6:18am

Hi,

I hadn’t seen the problem since I wrote this, but it happened once yesterday morning and again this morning. I don’t understand what is wrong: it worked for a month and then crashed twice in a row.

tormagj · August 27, 2020, 3:58pm

Hi @javerre! Did you figure anything out? Still having the issue?

I am. And it’s gotten worse lately. I’m not at all sure, but I feel I’m onto something. My system is on an RPi 3B+ with a 32GB SD card, so quite basic hardware. Do you have the same or similar, or are you on more advanced hardware?

I’ve seen my memory usage spike at times, and when it doesn’t spike it slowly builds up until it seems way too high. Swap usage is also very high, and in my case the major consumer seems to be InfluxDB. Things seem better if I temporarily disable it.

I can’t confirm it, but I suspect this may be causing AppDaemon to fail as well. It is at least causing lots of other things to fail, including access to the web interface, and in my case that has happened quite often lately. Perhaps my InfluxDB database is getting too big…
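
To put numbers on the memory and swap suspicion, a small Python sketch along these lines could be run periodically on a Linux host. It assumes the standard `/proc/meminfo` format is readable from wherever the script runs; the function names are hypothetical, and the sample values at the bottom are made up for illustration, not measurements from this system.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for sized fields)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Key: value" shape
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_pressure(info):
    """Return (fraction of RAM in use, fraction of swap in use)."""
    mem_total = info["MemTotal"]
    # MemAvailable is the better estimate; fall back to MemFree if it is missing.
    mem_used = mem_total - info.get("MemAvailable", info.get("MemFree", 0))
    swap_total = info.get("SwapTotal", 0)
    swap_used = swap_total - info.get("SwapFree", 0)
    swap_fraction = swap_used / swap_total if swap_total else 0.0
    return mem_used / mem_total, swap_fraction


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live values on a Linux host (not called in this sketch)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


# Made-up sample values, only to show the calculation.
SAMPLE = """\
MemTotal:        1000000 kB
MemAvailable:     150000 kB
SwapTotal:        200000 kB
SwapFree:          50000 kB
"""

ram, swap = memory_pressure(parse_meminfo(SAMPLE))
print(f"RAM in use: {ram:.0%}, swap in use: {swap:.0%}")
```

Logging these two fractions with a timestamp every few minutes would show whether usage climbs before the thread messages start, which would help separate cause from effect.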

javerre · September 8, 2020, 7:32pm

I’m actually running in Docker on a QNAP NAS, so quite different. It has lots of resources, so a memory issue is unlikely.

I’ve disabled one of the three instances of this app that I was running and it has not crashed since. I’m not really any nearer to understanding what the issue was, I’m afraid.

For what it’s worth I am doing a lot of database logging in the app (I have a sensor that records every minute but only connects to Wi-Fi once an hour to save on battery).
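
If the database writes turn out to be a factor (nothing in this thread confirms that), one thing to try is writing the hourly backlog in a single transaction instead of one commit per reading, so the callback spends less time in the database. The sketch below is only an illustration: it assumes SQLite via Python's built-in `sqlite3` module, and the table name `moisture`, the function `store_readings` and the generated readings are all hypothetical.

```python
import sqlite3


def store_readings(conn, readings):
    """Insert all buffered (timestamp, value) readings in one transaction."""
    # Using the connection as a context manager commits once on success
    # and rolls back if any insert fails.
    with conn:
        conn.executemany(
            "INSERT INTO moisture (ts, value) VALUES (?, ?)", readings
        )


# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE moisture (ts TEXT, value REAL)")

# Hypothetical hour of per-minute readings delivered in one batch.
readings = [
    (f"2020-07-12T10:{minute:02d}:00", 40.0 + minute * 0.1)
    for minute in range(60)
]
store_readings(conn, readings)

count = conn.execute("SELECT COUNT(*) FROM moisture").fetchone()[0]
print(count)
```

With a different database the same idea applies, but the calls would differ, so treat this as a pattern to adapt.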