Auto restart app on failure

I have a couple of applications that fail when they start, but if you go out and just touch the file to re-run it, it runs fine. I think it’s due to some timeouts that are just a side effect of having to go over the internet for information. Is it possible to get an auto-restart limit set in the appdaemon config file?

something like
[myapp]
module=myappmodule
class=myappclass
auto-start = number of attempts to start (0: do not try to start the app at all; 1: try once and no more; 2 or more: retry up to that many times if a start fails)

This way we could suspend an app like Christmas automations without just commenting out the app, or we could try restarting twice or more if we know it sometimes takes our app more than once to actually get started.

The counter becomes irrelevant once the app has completed its initialize function, so apps that fail after initialize would fail and not be restarted.

how about a failsafe in your initialize?
apps that i don't want to start during a period (like Christmas) get a constraint boolean.

Yeah, but that doesn’t take care of the error restarts that happen randomly. Just thought it would be a nice feature. I could check for the failures and loop in my initialize, but since the initialize is single threaded, that increases the time spent in the initialize function, slowing things down. If we handled it outside, where the app is kicked off, the app could be added to the end of the stack so that it gets restarted clean without holding up other things.

the total initialize time across all apps wouldn't get any faster that way.
i already once asked andrew for a possibility to decide in which order the apps start.

then you could place those apps at the end of the stack.

Ok, let me approach this from a different direction. Is there a way in HA that I can re-launch a program from its error handler as it exits?
try:
    something that might blow up
except:
    <schedule app to relaunch>
    raise

or alternatively is there an error that we could raise from our app that would tell AD to kill, cleanup and relaunch the app?

in your init you could try:

def initialize(self):
  started = False
  while not started:
    try:
      # do all your init statements
      started = True
    except Exception:
      self.log("restarting init")

you could even add a max amount of tries by putting in a counter and stop if counter is above your max.
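The counter idea can be sketched like this. It is only a minimal illustration: `init_with_retries`, `do_init` and `max_tries` are made-up names, not part of AppDaemon, and the loop still blocks whatever thread calls it for as long as the retries take.

```python
def init_with_retries(do_init, max_tries=3, log=print):
    """Call do_init() until it succeeds or max_tries attempts have been used.

    Returns the attempt number that succeeded, or None if every attempt failed.
    (Hypothetical helper for illustration, not an AppDaemon API.)
    """
    for attempt in range(1, max_tries + 1):
        try:
            do_init()
            return attempt
        except Exception as exc:
            # Log the failure and fall through to the next attempt, if any is left.
            log("init attempt {} of {} failed: {}".format(attempt, max_tries, exc))
    return None
```

Inside an app you would pass a method holding the real init statements as `do_init` and `self.log` as `log`, then decide what to do when the result is `None`.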

I believe all app initialization is single threaded, so this would stop all apps from starting until this completed?

does it make a difference which app takes longer or shorter?
appdaemon is only completely started once all apps are completely started.

if you want the holdup to be at the end, i guess you also want to be able to set a startup sequence :wink:

edit: but it's also possible to take the functions to a next level and take them out of the init.

But Rene, I may not be in the init when I error and want to restart it. If it’s a data error that should clean itself up, or let’s say my internet goes down so I can’t get to google calendar, I don’t want to have to go and restart everything that relies on an internet connection. It would be nice if they could just exit with a state that tells AD to restart them.
It seems like you are very opposed to this. Can I understand the source of your resistance, or am I just misunderstanding it?

And in answer to your question about what difference does it matter as to whether the holdup is at the end or in the middle. If I understand correctly, the initializations are all done in a single thread so anything that slows down that thread keeps other apps from kicking off. I would rather have a problem app, get rescheduled for the end, allowing the others to kick off and be running while the app that is having problems tries to restart itself. That way my other automations would be running and only the one having problems would be delayed longer.

for that we have the terminate function.

I would rather have a problem app, get rescheduled for the end

if we can schedule the apps you can put the problem app at the end to begin with :wink:

Rene,
This is getting us nowhere. I think what we are saying is getting lost in translation somehow. Let’s wait for Andrew to chime in on this and see what his thoughts are.

Sorry - I missed this.

So you mean you want a way to recover from a failure in your initialize? You can do that right from the App. AppDaemon itself doesn’t know if an App is running or not, in fact they never “run” as such, they just get callbacks. Try something like this: (pseudo code)


def initialize(self):
  self.do_the_real_initialization({})

def do_the_real_initialization(self, kwargs):
  try:
    # do all your init stuff
    pass
  except Exception:
    # oh noes, we have an error, let's retry in 10 seconds
    self.run_in(self.do_the_real_initialization, 10)
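To make the rescheduling idea concrete, here is a minimal sketch that runs on its own. `FakeScheduler` is a hypothetical stand-in written only for this example; it mimics what I understand `run_in(callback, delay, **kwargs)` to do (queue the callback and later call it with a kwargs dict) without actually waiting. `RetryingApp`, `setup`, `MAX_TRIES` and `RETRY_DELAY` are likewise made-up names.

```python
class FakeScheduler:
    """Hypothetical stand-in for AppDaemon's scheduler so the sketch is runnable."""

    def __init__(self):
        self.pending = []

    def run_in(self, callback, delay, **kwargs):
        # The real scheduler would wait `delay` seconds; here we only record the request.
        self.pending.append((callback, delay, kwargs))

    def run_pending(self):
        # Fire everything queued so far, as if the delays had elapsed.
        queued, self.pending = self.pending, []
        for callback, _delay, kwargs in queued:
            callback(kwargs)


class RetryingApp(FakeScheduler):
    MAX_TRIES = 3      # assumed limit, pick whatever suits the app
    RETRY_DELAY = 10   # seconds between attempts

    def __init__(self, setup):
        super().__init__()
        self.setup = setup   # the init work that may raise
        self.ready = False

    def initialize(self):
        self.real_initialize({"attempt": 1})

    def real_initialize(self, kwargs):
        attempt = kwargs["attempt"]
        try:
            self.setup()
            self.ready = True
        except Exception:
            if attempt < self.MAX_TRIES:
                # Return right away and let the scheduler call us again later,
                # so other apps are not held up while this one waits.
                self.run_in(self.real_initialize, self.RETRY_DELAY,
                            attempt=attempt + 1)
```

The point of the design is that `initialize` returns immediately even when the setup fails, and the attempt counter travels along in the callback's kwargs.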

I did this instead

  ###########################
  #
  # Restart application by touching the app
  ###########################
  def restart_app(self):
    import os
    import fnmatch
    import subprocess

    matches=[]
    module=self.args["module"]                           #  Who am i
    module=module+".py"
    path=self.config["AppDaemon"]["app_dir"]             # find py file
    for root,dirnames,filenames in os.walk(path):
      #self.log("root={} dirnames={} filenames={}".format(root,dirnames,filenames))
      for filename in fnmatch.filter(filenames, module):
        matches.append(os.path.join(root,filename))
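    if not matches:                                      # nothing found, nothing to touch
      self.log("no file named {} under {}".format(module,path))
      return
    self.log("restarting {}".format(matches))            # found matching files
    subprocess.call(["touch",matches[0]])                # touch the first match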
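For what it's worth, the same touch can be done without shelling out. This is a sketch of that variation, not a change to the approach: `touch_module` is a hypothetical helper name, and `app_dir` and `module_name` are assumed to come from the same places as in the snippet above.

```python
import fnmatch
import os
from pathlib import Path


def touch_module(app_dir, module_name):
    """Update the mtime of every <module_name>.py under app_dir.

    Returns the list of paths that were touched (empty if nothing matched).
    """
    pattern = module_name + ".py"
    touched = []
    for root, _dirnames, filenames in os.walk(app_dir):
        for filename in fnmatch.filter(filenames, pattern):
            path = Path(root) / filename
            path.touch()  # same effect as the `touch` command, no subprocess needed
            touched.append(str(path))
    return touched
```

Returning the list lets the caller log what was touched and notice when nothing matched.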

Do you run that periodically?

If you want a way to re-initialize apps on a schedule I think we can find a cleaner way than that to do it, but I think that is a bandaid - I am struggling to understand why you need this - can you give me a for instance?

There are a couple of errors that really screw up my app (particularly from google calendars). When those happen, the easiest way to deal with it is just to re-launch the app. Yes, it would be nice to figure out why google is bombing out (I think it’s actually a timeout) and work around it. But for now, and for use at home, this is the easy way to make sure everyone wakes up in the morning.

OK, well you have a solution for now but I’ll give you a cleaner way of doing it at some stage along with some introspection :slight_smile:

Eventually cleaning up and figuring out the source of the error, or how to trap and handle it better, will move up the priority list too. But right now my priority is playing with dashboards :slight_smile: :slight_smile: :slight_smile:


I’m down with that!

I think @aimc’s suggestion is the cleanest way, and allows you to log your error appropriately. I did a little test … version one of my app to test the theory below:

import appdaemon.appapi as appapi
import requests

class HelloWorld(appapi.AppDaemon):

    def initialize(self):
        self.utils = self.get_app('utils')
        self._initalize()

    def _initalize(self):

        for web_address in ['google', 'golllllllllogle']:

            try:
                r = requests.get('http://www.{}.com'.format(web_address))
            except Exception as e:
                self.log('Something went wrong, but I\'m not sure what...\n{}'.format(e))

            self.log('[Response {0.status_code}] {0.text:.15}...'.format(r))

I don’t use requests often, so I honestly had no idea what kind of Exception to look for. Here’s my log output…

2017-03-04 19:18:29.025283 INFO Reloading Module: /home/homeassistant/.homeassistant/appdaemon/conf/apps/hello_world.py
2017-03-04 19:18:29.031341 INFO Loading Object hello_world using class HelloWorld from module hello_world
2017-03-04 19:18:29.198727 INFO hello_world: [Response 200] for google <!doctype html>...
2017-03-04 19:18:29.214649 INFO hello_world: Something went wrong, but I'm not sure what...
HTTPConnectionPool(host='www.golllllllllogle.com', port=80): Max retries exceeded with url: / (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x6fef3390>: Failed to establish a new connection: [Errno -2] Name or service not known',))

Great! So it looks like with a bit of research, I could be catching ConnectionError and handling appropriately. Let’s do that.

import appdaemon.appapi as appapi
import requests

class HelloWorld(appapi.AppDaemon):

    def initialize(self):
        self.utils = self.get_app('utils')
        self._initalize()

    def _initalize(self):

        for web_address in ['google', 'golllllllllogle']:

            try:
                r = requests.get('http://www.{}.com'.format(web_address))
            except requests.ConnectionError:
                self.log('I\'m not sure "{}" exists... :('.format(web_address))
                continue
            except Exception as e:
                self.log('Something went wrong, but I\'m not sure what...\n{}'.format(e))
                continue

            self.log('[Response {0.status_code}] for {1} {0.text:.15}...'.format(r, web_address))

and my output is …

2017-03-04 19:25:09.025929 INFO Reloading Module: /home/homeassistant/.homeassistant/appdaemon/conf/apps/hello_world.py
2017-03-04 19:25:09.032042 INFO Loading Object hello_world using class HelloWorld from module hello_world
2017-03-04 19:25:09.319005 INFO hello_world: [Response 200] for google <!doctype html>...
2017-03-04 19:25:09.333225 INFO hello_world: I'm not sure "golllllllllogle" exists... :(

So, now that we have a new entry point that doesn’t keep AppDaemon from loading the app, you can essentially redirect your entry point and handle Exceptions appropriately. I used a simple for-loop to exercise something I knew would error out, but you can very much do something similar. This way you’re not dealing with OS calls, which can be messy. :confused:
