AppDaemon not receiving events after HA restart

TheEggi · August 18, 2018, 1:29pm

I have the latest HA and AppDaemon versions running in Docker, and it all works fine until I restart HA. AD detects the restart, which is also perfectly fine. It then reports that it called terminate on all apps, and the initialize methods get executed. The only issue is that AD no longer has a real connection to HA afterwards: no events or anything else are received.

Is there anything I have to do to make AD work correctly after an HA restart?

ReneTode · August 18, 2018, 3:40pm

normally AD should just reconnect to HA when HA is up and running again.
can you show me the logs from the point you restart HA until there is a new connection?

TheEggi · August 18, 2018, 4:40pm

Here is a log of a restart:

https://gist.github.com/TheEggi/08bbf5a2af4cf4915d25900b3833fbad

restart.log

Aug 18 18:37:08 :  2018-08-18 18:37:08.228587 WARNING AppDaemon: HASS: Disconnected from Home Assistant, retrying in 5 seconds
Aug 18 18:37:13 :  2018-08-18 18:37:13.229784 WARNING AppDaemon: HASS: Disconnected from Home Assistant, retrying in 5 seconds
Aug 18 18:37:18 :  2018-08-18 18:37:18.231338 WARNING AppDaemon: HASS: Disconnected from Home Assistant, retrying in 5 seconds
Aug 18 18:37:23 :  2018-08-18 18:37:23.233106 WARNING AppDaemon: HASS: Disconnected from Home Assistant, retrying in 5 seconds
Aug 18 18:37:28 :  2018-08-18 18:37:28.234770 WARNING AppDaemon: HASS: Disconnected from Home Assistant, retrying in 5 seconds
Aug 18 18:37:33 :  2018-08-18 18:37:33.238357 INFO AppDaemon: HASS: Connected to Home Assistant 0.76.0
Aug 18 18:37:33 :  2018-08-18 18:37:33.258088 INFO AppDaemon: Processing restart for HASS
Aug 18 18:37:33 :  2018-08-18 18:37:33.258678 INFO AppDaemon: Calling terminate() for climate_automation_living_room
Aug 18 18:37:33 :  2018-08-18 18:37:33.258613 INFO AppDaemon: Terminating climate_automation_living_room
Aug 18 18:37:33 :  2018-08-18 18:37:33.258840 INFO AppDaemon: Terminating bathroom_window_alert

ReneTode · August 18, 2018, 6:00pm

seems like a normal restart of a very active AD.
are none of your apps working after the restart, or just some?

you say no events. can you be more specific?

TheEggi · August 18, 2018, 6:16pm

All of them. Everything that does not depend on the HA connection seems to work (like scheduling), but events are not triggered and calls to get_state return None. As soon as I restart AD, everything is back to normal.

ReneTode · August 18, 2018, 6:27pm

when did this start to happen?
did you update HA when it started to happen? or anything else?

i think that this is something that @aimc will ask you some more questions about.

TheEggi · August 19, 2018, 7:28am

This started with the upgrade from the latest 2.x to 3.x. The only other change was that I had 2.0 installed directly on my machine, and with 3 I switched to a Docker setup (AD and HA both use host networking).

ReneTode · August 19, 2018, 9:39am

which version of AD are you running?

aimc · August 19, 2018, 3:39pm

Seems a little weird - the terminates should have occurred before the reconnecting messages - can you show me the whole log from before the point where HASS restarted please?

TheEggi · August 19, 2018, 4:51pm

Sure - here is another log (this one also shows the error because of get_state returning None).

https://gist.github.com/TheEggi/dbe6395fd4740a5b7a558d5d670e6625

ReneTode · August 19, 2018, 7:32pm

that shows that the sensor sensor.angela_fahrzeit isn't initialised in HA at the moment that AD is reinitialising.
so it is probably a race condition.

i suspect from lines like this:

Aug 19 18:48:08 :  2018-08-19 18:48:08.142242 INFO climate_dressing_room: Changed temperature of Heizkörper Ankleidezimmer to 4.5 -> adjust slider Heizkörper Ankleidezimmer

that the app climate_dressing_room does change the value of an entity in HA.
but it doesn't give an error, so the entity must exist.

it seems to me that you do all kinds of checks in your initialize without errors.
why do you think there is no connection to HA?

and still the question: which version of AD are you running?

TheEggi · August 19, 2018, 7:51pm

Thank you for the analysis - AD version is the latest one published to docker. Guess I will add some code in all my scripts to get some kind of a state value and see which automations are affected. Will post the results when I have them.

ReneTode · August 19, 2018, 8:05pm

don't get me wrong, but in lots of cases where people say they have the latest version, it ends up not being the latest version. that's why i ask again: which version?

you can find out which version by looking at the logs directly after a restart of AD.

it seems like you have a lot going on in AD, so i suspect you also have a lot going on in HA. it could be that some HA components are too slow. but there have been some changes in the latest versions of AD that help against race conditions, that's why i keep asking about the version.

the problem could also be in your mqtt.

but if it's just this one error you base your question on, then i can say that AD is working as expected.
add a test app with a listen_state on several devices, set it to priority 1, and let it just log the state. then you know whether the entities exist and what values they have at the start of the reinitialisation.
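
A minimal sketch of the test app suggested above, assuming the `appdaemon.plugins.hass.hassapi` import path. The class name `StartupProbe` and the entity list are hypothetical. It logs each entity's state during initialize and again on every change, which should show whether the entities exist when AD reinitialises. The `try`/`except` fallback is only there so the file can be imported where AppDaemon is not installed.

```python
try:
    import appdaemon.plugins.hass.hassapi as hass
    Base = hass.Hass
except ImportError:
    # Fallback for reading or importing the sketch without AppDaemon installed.
    Base = object


class StartupProbe(Base):
    """Logs the state of a few entities each time the app is (re)initialized."""

    # Hypothetical list: replace with entities your own apps depend on.
    ENTITIES = ["sensor.angela_fahrzeit", "sun.sun"]

    def initialize(self):
        for entity in self.ENTITIES:
            # get_state() returns None when the entity is not (yet) known to AD.
            self.log("{} at init: {!r}".format(entity, self.get_state(entity)))
            self.listen_state(self.on_change, entity)

    def on_change(self, entity, attribute, old, new, kwargs):
        self.log("{} changed: {!r} -> {!r}".format(entity, old, new))
```

The priority mentioned above would be set in the app's entry in apps.yaml, so that this app is initialized before the others.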

TheEggi · August 19, 2018, 8:27pm

Aug 19 18:51:45 : 2018-08-19 18:51:45.261445 INFO AppDaemon Version 3.0.1 starting

That is what I see in the logs. And yes, I will probably have to analyze it a bit more and see if I can get any results to improve the situation.

ReneTode · August 19, 2018, 8:57pm

ok, that is indeed the latest version

TheEggi · August 20, 2018, 5:19pm

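The following code seems to have resolved it…

import time

# Both functions are methods of the app class.
# MAX_RETRY is a module-level constant (its value is not shown in this post).

def initialize(self):
    # Poll until the sensor is readable, giving up after MAX_RETRY attempts.
    count = 0
    while (not self.check_state()) and count < MAX_RETRY:
        time.sleep(0.3)
        count += 1

def check_state(self):
    state = True
    if not self.get_state('sensor.test'):
        self.error('>> ERROR: SENSOR NOT FOUND!!')
        state = False
    else:
        self.log('>> OK: FOUND!!')

    return state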

The interesting thing is that since I added it, it never even reaches the “ERROR” part, and I have already tried it about ten times.

ReneTode · August 20, 2018, 7:25pm

could be that that sensor was slow just once.
but your code is wrong.

in your log you got the error while the state was None.
with your check, a None state is also treated as "not found", whether or not the entity exists.

a sensor could also have the state False, and then your check would report that the sensor was not found.

a better way to go would be:

def check_state(self):
    state = True
    if not self.entity_exists('sensor.test'):
        self.error('>> ERROR: SENSOR NOT FOUND!!')
        state = False
    else:
        self.log('>> OK: FOUND!!')

    return state

but you would still get the error you got, because it seems that the sensor you used normally gives an int, but in some cases returns None, and you didn't check for that.
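
One way to guard against the None case described above is to convert the state defensively before doing arithmetic on it. This is a sketch only: the helper name `state_as_int` is hypothetical and not part of AppDaemon.

```python
def state_as_int(state, default=None):
    """Convert a get_state() result to int, or return default if it is unusable."""
    if state is None:
        # Entity missing or not yet initialised after the HA restart.
        return default
    try:
        # States usually arrive as strings; go through float so "4.5" is accepted.
        return int(float(state))
    except (TypeError, ValueError):
        # Covers non-numeric states such as "unknown" or "unavailable".
        return default
```

In an app this would be called as `state_as_int(self.get_state('sensor.test'))`, with the caller deciding what to do when the result is None.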

TheEggi · August 21, 2018, 2:39am

Thank you for the improvement - will change it.

JKW · June 16, 2020, 5:54am

Hi, I have also been seeing the same issue for quite some time now (~3 weeks?).

Log file:

https://gist.github.com/KoljaWindeler/25165921fb8a9180f655834ea7b1ad72

log

16.06.2020 04:37:15 WARNING HASS: Disconnected from Home Assistant, retrying in 5 seconds
16.06.2020 04:37:20 WARNING HASS: Disconnected from Home Assistant, retrying in 5 seconds
16.06.2020 04:37:25 WARNING HASS: Disconnected from Home Assistant, retrying in 5 seconds
16.06.2020 04:37:30 WARNING HASS: Disconnected from Home Assistant, retrying in 5 seconds
16.06.2020 04:37:35 WARNING HASS: Disconnected from Home Assistant, retrying in 5 seconds
16.06.2020 04:37:40 INFO HASS: Connected to Home Assistant 0.111.3
16.06.2020 04:37:40 INFO HASS: Evaluating startup conditions
16.06.2020 04:37:40 INFO AppDaemon: Processing restart for HASS
16.06.2020 04:37:40 INFO AppDaemon: Terminating owntracks
16.06.2020 04:37:40 INFO AppDaemon: Terminating cube

Version
HA 0.111.3
16.06.2020 07:27:21 INFO AppDaemon: AppDaemon Version 4.0.3 starting
16.06.2020 07:27:21 INFO AppDaemon: Python version is 3.8.2

I’m running HA and AD in Docker containers, which are updated via Watchtower.
For the last year or so this worked flawlessly. Whenever HA was updated, AD reconnected
and sent me a message via Pushbullet that one of my sensors had timed out (that sensor has been dead for a long time … but that was how I noticed that HA had been updated).

This stopped about 2-3 weeks ago … instead, many automations (AD scripts) stopped working at the same time. I restarted the AD container and all was good. Tonight I saw the same behavior. Reading the log made me dig up this very old thread … but it’s the same issue.

I get a lot of errors after the init, looking like this:

https://gist.github.com/KoljaWindeler/fbc389f8922dbf76075baa3cedbdbd51

log2


16.06.2020 04:39:04 WARNING xiaomi_vac: Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/appdaemon/threading.py", line 777, in worker
    funcref(entity, attr, old_state, new_state,
  File "/conf/apps/xiaomi_vac.py", line 40, in cleaning
    self.log("new vacuum ("+str(entity)+") status: "+new+". old was "+old+". tct: "+str(self.g_tct(self.vacs.index(entity))))
TypeError: can only concatenate str (not "NoneType") to str

Pretty much every script goes crazy. About a minute later it all works fine again … at least, all scripts that are still alive and do some processing on a time basis report regular behaviour, meaning that they can read the state of the sensor again …

Is it possible to delay the start of AD once it reconnects without rewriting all scripts? I guess I could add a “try to read state of sun.sun for 5 min in a loop until that works” in each init function … but that's not very elegant …
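
The TypeError in the traceback comes from concatenating None to a string with `+` when the state callback receives None for the old or new state. A sketch of a message builder that tolerates None, using a hypothetical helper name and a shortened version of the original message:

```python
def vacuum_status_message(entity, old, new):
    """Build the log line without failing when old or new is None."""
    # str.format() renders None as the text "None" instead of raising TypeError.
    return "new vacuum ({}) status: {}. old was {}.".format(entity, new, old)
```

This only stops the log call from failing. Whether the app should act on a None state at all is a separate decision.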

any ideas?
Thanks, JKW

ReneTode · June 16, 2020, 7:53am

HA rearranged its startup.
HA now reports as up and running before all integrations are up and running.

so the only way to deal with this is a delay.
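
A sketch of the "wait until a state is readable" idea raised earlier in the thread, written as a plain function so it is not tied to one app. The name `wait_until_ready` and its parameters are hypothetical. The `sleep` and `clock` arguments exist only so the loop can be tested without real waiting.

```python
import time


def wait_until_ready(is_ready, timeout=300.0, interval=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll is_ready() until it returns True or timeout seconds have passed.

    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Inside an app's initialize this could be called as `wait_until_ready(lambda: self.get_state("sun.sun") is not None)`. Note that it blocks the thread running initialize for up to the full timeout. The "Evaluating startup conditions" line in the log above suggests AppDaemon 4 has its own startup handling for the HASS plugin, so its documentation is worth checking before adding a loop like this to every app.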