[AppDaemon] Tutorial #2 Errorlog Notifications

Hi everybody!

Recently I gave a user some feedback who was looking to convert their YAML Automation file to the app-based Python structure of AppDaemon. It dawned on me that a lot of users who are using AppDaemon are seeking the extra flexibility that Python can afford to offer, but may not necessarily be familiar with how to write Python in the first place! Some may have a CS background, while others may not. This post is part of a tutorial series of sorts wherein I tackle a problem and show both a simple and complex way to write an AppDaemon app! The simple version will be ready-to-use right out of the box. The complex version will also run “out of the box”, but you’ll likely want to tweak it to work specifically with your own setup! I will go over more advanced concepts and style guide mentions here.

##Tutorials

  1. Tracker-Notifier - monitor devices’ on state, and receive a notification after a user-specified length of time
  2. Errorlog Notifications - have a persistent notification appear on the dash any time AppDaemon errors out

In the easy version, I will try to break the solution down in simple terms so that someone who knows absolutely nothing about Python can understand them. The more complex example will also have a brief description explaining general logic, but will be targeted towards those of you who have a better grasp of the fundamentals.

I hope you enjoy!


#Tutorial #2 - Errorlog Notifications

— simple —

import appdaemon.appapi as appapi
import os
from datetime import datetime

#
# App to display a Persistent Notification on the Front End whenever AppDaemon has encountered
# an error
#
# Args: (set these in appdaemon.cfg)
# path_to_errorlog = full path of location of errorlog
# refresh_interval = time in seconds to check for new errors
#
#
# EXAMPLE appdaemon.cfg entry below
# 
# # Apps
# 
# [error_notifier]
# module = error_notifier
# class = ErrorNotifier
# path_to_errorlog = /home/homeassistant/.homeassistant/appdaemon/conf/errfile.log
# refresh_interval = 5
#

class ErrorNotifier(appapi.AppDaemon):

    def initialize(self):
        self.last_update = datetime(2013, 9, 13)
        
        self.run_every(self.get_last_file_update, start=self.datetime(), 
                                                  interval=int(self.args['refresh_interval']))

    def get_last_file_update(self, kwargs):
        last_update_seconds = os.stat(self.args['path_to_errorlog']).st_mtime
        last_update = datetime.fromtimestamp(last_update_seconds)

        if last_update > self.last_update:
            self.last_update = last_update

            if os.stat(self.args['path_to_errorlog']).st_size > 0:
                self.notify_frontend()

    def notify_frontend(self):
        pretty_timestamp = self.last_update.strftime('%A, %x @ %X')

        self.call_service('persistent_notification/create',
            title="[AppDaemon] Something went wrong! :(",
            message=("On {}, we found at least one new error in the error log!\nYou should go check to"
                     "see what went wrong.".format(pretty_timestamp)))

In all AppDaemon apps, the first thing you always want to do is import the AppDaemon api! This is core to AppDaemon and allows us to access values within HomeAssistant. You’ll also notice there are two other import related statements here. The really cool thing about AppDaemon is that it allows you to import any code into your apps that you could import into any old python file. This really allows us to push the limits of what “an AppDaemon app” can be. os* and datetime* are both “standard” libraries in Python. This means you do not need to go and download or install them, they come with EVERY version of Python that’s out on that market, so your users can essentially plug’n’play your app! Very cool. So let’s get started! :slight_smile:

As always, it’s great to put some documentation into your apps so your users knows how to use them. These should be general purpose notes aimed at answering most questions your users might have. We’ll give users a description of what to expect that app to do, and then define our arguments, or args. In this app, we’ll look to the user to specify the location of their error logfile, as well as how often they’d like to check for updates. I always like to end my documentation with an example of how the configuration of this app should look in the user’s appdaemon.cfg file.

###What are we trying to do here?
Just like in maths, there are many many ways to come to a final solution, however it’s helpful to write out how you’ll get to that solution. In very broad, general terms, the steps we’ll need to take to bring this app alive are…

  • Find the time the last time the errorlog was modified
  • Keep track of and compare these changes to each other at a regular interval
  • Notify the user of this occurence, possibly multiple occurrences if they occur less frequently than our refresh_interval)

def initialize

The first thing we need to do in our app is set an app-level variable that keeps track of the last time we saw the error file. So we’ll set last_update to a default value of datetime(2013, 9, 13). Well wait a second SN, what the heck is a datetime? To explain it suuuuper simply, you should remember the rule “Everything in Python is an Object”. Essentially, this is a date-time Python Object, and it has some fancy properties that we can play with. We’re actually setting the date and time of this specific instance of a datetime object to September 9th, 2013! For now, take it at face value and let’s move on. It’s important for us to have a “default value”, so to speak, and this is what we’ll use.

We’ll then want set a timer on the schedule to “run every” so often. In this case, we’ve asked the user to specify a refresh_interval and you’ll see that run_every has a spot for us to put this value. It also asks us for when the timer should start. For this, we’ll put in a helper function that our good friend @aimc has coded up for us. It’s helpful to understand (although confusing… maybe for another lesson!) that self.datetime() and datetime() are two different things in our script. Again, take it at face value and let’s move on. :wink:

The callback for this timer is going to help us update our self.last_update variable.

def get_last_file_update

So great! We’ve accomplished some of our initial goals already: keep track of ... the last time the errorlog was modified ... at a regular interval. This next function will help us get the last time the error logfile was updated, and it is so aptly named!

This first line is a bit long, so let’s break it down: os.stat('/path/to/error.log').st_mtime is a function from the os library, much like our get_last_file_update is a function in the app ErrorNotifier. What stat() does is it gets “the status of a file or a file descriptor.” You’ll see that we have st_mtime trailing after it. This little bit simply means “time of most recent content modification expressed in seconds.”. So all together: "Get the time of the most recent content modification from the file self.args['path_to_errorlog']". Hey great! We’ll set that result to the variable last_update_seconds. See? That wasn’t so bad. :slight_smile:

Next up? We have this variable that is essentially a timestamp of the number of seconds since epoch, which isn’t very useful us in that format. Luckily, that datetime library we used earlier? It has just the function to help us out! fromtimestamp() will turn our seconds-since-epoch timestamp into a datetime object (hey, remember self.last_update??) – so we’ll do just that.

datetime is a really neat library, and you’re about to find out why: it can actually do arithmetic on datetime objects. For that reason, we’ll compare our new last_update datetime object to the last recorded self.last_update datetime object, if and the new object is greater than, or “later in time” than the last recorded time, we’ll update self.last_update with the new value! Sweet! We’ve just accomplished all but the last bullet on our list above.

We come back to another os.stat() function call, but this time, we’re looking to see what the size of the error logfile is (in bytes). This is just a nice-to-do, because if for some reason we updated the file by deleting all the contents in the log, we’d have a new update on the file, but no content and thus no error. So by evaluating the size of the file as “greater than zero”, we’re ensuring that there is an error in our file. So again, if our errorlog has something in it, we’ll want to send a notification to our user!

def notify_frontend

For this app, I figured I would teach you something new instead of just repeating the notification bit. :slight_smile: First off though, our while our datetime object can be “looked at” as a string, we want to make it a bit prettier. Here’s a nice little tool to help you understand what the heck %A, %x @ %X could possibly mean, and what you could customize that value to be! strftime is a handly little function that “formats a time object into a string” str - f - time. :slight_smile:

After that, we’ll simply call a persistent_notification/create service through @aimc handy helper function self.call_service. This works identically to if you filled out the data in your Developer Tools on the Front End itself.

Here’s an examples of the output! That means our app checked the file and recognized that the last error written to the app happeend at 21:08:54 and then pushed a notification to the frontend. How cool is that?? :smiley:

**Editorial Note - I wrote this post much quicker than I did with Tutorial#1, and as with anything done quickly, it might not exactly be your best work. This is evident by @gpbenton’s comment about my complex example app. It’s important to note that part of the developing process is looking back on code you’ve written in the past and improving it. Sometimes you miss something obvious, sometimes you could simplify or refactor it, and that’s exactly what we’ve done. It’s important to remember not to be ashamed of your oversight, take them as an opportunity to learn and grow! Thanks @gpbenton :slight_smile:

#####*You can read up more about os and datetime at their respective links below.
os - provides a portable way of using operating system dependent functionality; more specifically: get the status of a file or a file descriptor.
datetime - supplies classes for manipulating dates and times in both simple and complex ways.


— complex —

import appdaemon.appapi as appapi
import os
import re
from datetime import datetime

#
# App to display a Persistent Notification on the Front End whenever AppDaemon has encountered
# an error
#
# Args: (set these in appdaemon.cfg)
# path_to_errorlog = full path of location of errorlog
# refresh_interval = time in seconds to check for new errors
#
#
# EXAMPLE appdaemon.cfg entry below
# 
# # Apps
# 
# [error_notifier]
# module = error_notifier
# class = ErrorNotifier
# path_to_errorlog = /home/homeassistant/.homeassistant/appdaemon/conf/errfile.log
# refresh_interval = 5
#

class ErrorNotifier(appapi.AppDaemon):

    def initialize(self):
        self.seen_errors = []
        self.last_update = datetime(2013, 9, 13)

        self.run_every(self.get_last_file_update, start=self.datetime(), 
                                                  interval=int(self.args['refresh_interval']))

    def get_last_file_update(self, kwargs):
        last_update_seconds = os.stat(self.args['path_to_errorlog']).st_mtime
        last_update = datetime.fromtimestamp(last_update_seconds)

        if last_update > self.last_update:
            self.last_update = last_update

            if os.stat(self.args['path_to_errorlog']).st_size > 0:
                self.compare_errors()
            else:
                self.seen_errors = []

    def compare_errors(self):
        re_time = re.compile('(.*). WARNING Traceback .*')
        re_app = re.compile('apps/(.*).py')
        re_func = re.compile(', in (.*)')
        re_line = re.compile('line (\d{1,})')
        re_error = re.compile('(.*(Error|Iteration|Warning)): (.*)')

        for error in self.get_errorlog_content():

            dt_string = re_time.search(error).group(1)
            dt_object = datetime.strptime(dt_string, '%Y-%m-%d %H:%M:%S.%f')

            app = re_app.search(error).group(1)
            func = re_func.findall(error)[-1]            
            line = re_line.findall(error)[-1]
            exc = re_error.search(error).group(1)
            desc = re_error.search(error).group(3)

            full_error = '{}.{}:{} {}: {}'.format(app, func, line, exc, desc)

            if full_error not in self.seen_errors:
                self.seen_errors.append(full_error)
                self.notify_frontend(dt=dt_object,
                                     app=app,
                                     func=func,
                                     line=line, 
                                     exc=exc,
                                     desc=desc)

    def get_errorlog_content(self):
        with open(self.args['path_to_errorlog']) as f:
            all_errors = f.read()
            all_errors_list = all_errors.split('\n\n')

            for error in all_errors_list[:-1]:
                if error is not None:
                    yield error

    def notify_frontend(self, dt, app, func, line, exc, desc):
        pretty_timestamp = dt.strftime('%X on %x')

        self.call_service('persistent_notification/create',
            title="[AppDaemon] {}".format(exc),
            message=("At {}, in app {}, in def {}, on line {}, {}"
                     .format(pretty_timestamp, app, func, line, desc)))

If you read my last example, then you got see how much fun you can have with RegEx. Well, this app will take that to the next level. :wink: Everything in the complex app is functionally the same up until we get to initialize so I will not go over it.

###def initialize
Here we’ll instantiate a list called seen_errors. This will be important so that we don’t spam the front-end if we’ve got 10 of the same exact error at different times. We’re trying to let the user know something has gone wrong, not beat it into them that they’re a bad programmer! :stuck_out_tongue:

###def get_last_file_update
Get-last-file sees a small improvement as well. If the filesize is zero, we’ll blank out the errors that we’ve seen. This just makes sense to do, doesn’t it?

###def compare_errors
This function has a LOT of different RegEx strings in it. I try to label them by what they are searching for or finding all-of.

  • re_time is searching for a line that has the time the Traceback occurred.
  • re_app is searching for a python file that is in the \apps\ directory.
  • re_func is finding all of the in-functions in the Traceback, but you’ll see when we assign func a value, we’re actually taking the last-found value.
  • We employ the same line of thought to find the infringing line.
  • re_error is actually searching for multiple groups, which we eventually set to exc for Exception and desc for the description.

We then compile a unique identifier called full_error for this type of error, and if we don’t find it in the app-level variable seen_errors, we’ll append it and shoot off a notification to the Front End.

###def get_errorlog_content
… I missed explaining this initially, so here we go :blush:. We’ll access the contents of the errorlog file and split up the log entries by a double new-line. This can be done as it’s conveniently how logs are written. Whenever AppDaemon encounters an error, it writes the full Traceback to file, throws a double new-line at the end, and begins the next line with a WARNING statement. You could certainly split your errors up by this warning statement, but I find it easier to use a double new-line and then cast off the last entry (again, which would be this warning statement). You’ll see this is exactly what we’re doing by the slice of all_errors_list[:-1]. If we’re not running into a “blank” error (for whatever reason, as I was debugging, I ran into this situation - this is not None evaluation might actually be superfluous) and the error hasn’t already been seen, we’ll yield that error back to the loop that’s calling get_errorlog_content.

###def notify_frontend
The function also saw significant improvements! It’s entirely possibly to track that source of the error through the notification on the Front End now. This should give your users a way to help identify what the heck is going on and communicate this back to you, the developer -or- in the case that the developer and user are the same person, this should help you more quickly identify where you need to be looking. The full Traceback is certainly still in your errorlog, but this should give you plenty clues anyway. :slight_smile:

Here’s an example of the complex output!


Notes on reading through your error log… @awitty

2017-02-28 21:08:33.060896 WARNING ------------------------------------------------------------
2017-02-28 21:08:54.029595 WARNING ------------------------------------------------------------
2017-02-28 21:08:54.030111 WARNING Unexpected error during loading of at_sunset:
2017-02-28 21:08:54.030418 WARNING ------------------------------------------------------------
2017-02-28 21:08:54.031577 WARNING Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/appdaemon/appdaemon.py", line 996, in read_app
    init_object(name, class_name, module_name, config[name])
  File "/usr/local/lib/python3.4/dist-packages/appdaemon/appdaemon.py", line 690, in init_object
    conf.objects[name]["object"].initialize()
  File "/home/homeassistant/.homeassistant/appdaemon/conf/apps/at_sunset.py", line 10, in initialize
    test2
NameError: name 'test2' is not defined

2017-02-28 21:08:54.031992 WARNING ------------------------------------------------------------

Throughout the course of writing apps, you will undoubtedly come across errors. While all aspects of reading a Traceback might be a bit out of scope for our discussion today, I can ceratinly help you decipher a little bit of what you’ll see!

I have an app called at_sunset where I intentionally caused an error, in order to test today’s lesson. :slight_smile: The most relevant part of the Traceback is the part with the error.

NameError: name 'test2' is not defined

Here we see the error type NameError and a short description. 'test2' is not defined. What does that mean? Well, the variable test2 doesn’t actually exist in at_sunset. Well shoot. That could be anywhere, couldn’t it? Well, luckily if you trace back through the error … :wink: you’ll see "/home/homeassistant/.homeassistant/appdaemon/conf/apps/at_sunset.py", line 10, in initialize … well shoot, that looks familiar!

Start here. This is where your root cause is. Fix this error first. Google what the heck NameError even means. Learn how to ask a question. If all else fails, come back and ask us a question here in the AppDaemon forum.


I would like to give a small shoutout to user @awitty for giving me the idea and inspiration that has brought you our second tutorial! I am open to anyone shooting out ideas for new apps for me to develop and explain. I want this series to benefit you, the user! The more we learn together, the cooler ideas you all will come up with which will in turn help the community grow! :slight_smile:

Another shout out to @aimc for developing this awesome platform we can use to write our complex automations in, and @yawor for helping to clean up and standardize the codebase. What we’re doing here would not be possible without these two, so thank you greatly.

I’ve gone back and forth on what medium to use, including various blogging platforms as well as YouTube. I might experiment with solving a problem “live” and thinking my way through the problem out loud as well.

Feedback is important! This series will only be as successful as you all make it to be! Let me know your thoughts and what you all would like to see, and I’ll consider all my options.

Happy Automating!

  • SN
9 Likes

can you explain why you need idx here? It looks to me like you could just do

for error in all_errors_list

but there may be something about enumerate that I’m missing.

Sure! Let’s use the following log record as an example. Take this example as the full entry in the errorlog.

2017-02-28 21:08:33.060896 WARNING ------------------------------------------------------------
2017-02-28 21:08:54.029595 WARNING ------------------------------------------------------------
2017-02-28 21:08:54.030111 WARNING Unexpected error during loading of at_sunset:
2017-02-28 21:08:54.030418 WARNING ------------------------------------------------------------
2017-02-28 21:08:54.031577 WARNING Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/appdaemon/appdaemon.py", line 996, in read_app
    init_object(name, class_name, module_name, config[name])
  File "/usr/local/lib/python3.4/dist-packages/appdaemon/appdaemon.py", line 690, in init_object
    conf.objects[name]["object"].initialize()
  File "/home/homeassistant/.homeassistant/appdaemon/conf/apps/at_sunset.py", line 10, in initialize
    test2
NameError: name 'test2' is not defined

2017-02-28 21:08:54.031992 WARNING ------------------------------------------------------------

To frame my explanation a bit… in the complex example, I made the assumption that you, the dev, know what a generator is. For the sake of brevity, it’s basically an on-demand list. So when we iterate through the generator with for error in self.get_errorlog_content():, what we’re doing is getting all the instances of anything that looks like an error.

The full errorlog is above, and we’ll split it on a double-new line via all_errors_list = all_errors.split('\n\n'). So what does all_errors_list “look” like?

all_errors_list = [
    "2017-02-28 21:08:33.060896 WARNING ------------------------------------------------------------
2017-02-28 21:08:54.029595 WARNING ------------------------------------------------------------
2017-02-28 21:08:54.030111 WARNING Unexpected error during loading of at_sunset:
2017-02-28 21:08:54.030418 WARNING ------------------------------------------------------------
2017-02-28 21:08:54.031577 WARNING Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/appdaemon/appdaemon.py", line 996, in read_app
    init_object(name, class_name, module_name, config[name])
  File "/usr/local/lib/python3.4/dist-packages/appdaemon/appdaemon.py", line 690, in init_object
    conf.objects[name]["object"].initialize()
  File "/home/homeassistant/.homeassistant/appdaemon/conf/apps/at_sunset.py", line 10, in initialize
    test2
NameError: name 'test2' is not defined",
    "2017-02-28 21:08:54.031992 WARNING ------------------------------------------------------------"
]

all_errors_list[0]
>>> "2017-02-28 21:08:33.060896 WARNING ------------------------------------------------------------
2017-02-28 21:08:54.029595 WARNING ------------------------------------------------------------
2017-02-28 21:08:54.030111 WARNING Unexpected error during loading of at_sunset:
2017-02-28 21:08:54.030418 WARNING ------------------------------------------------------------
2017-02-28 21:08:54.031577 WARNING Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/appdaemon/appdaemon.py", line 996, in read_app
    init_object(name, class_name, module_name, config[name])
  File "/usr/local/lib/python3.4/dist-packages/appdaemon/appdaemon.py", line 690, in init_object
    conf.objects[name]["object"].initialize()
  File "/home/homeassistant/.homeassistant/appdaemon/conf/apps/at_sunset.py", line 10, in initialize
    test2
NameError: name 'test2' is not defined"

all_errors_list[1]
>>> "2017-02-28 21:08:54.031992 WARNING ------------------------------------------------------------"

Great, we see the errors! But one thing we don’t like … all_errors_list[1] isn’t ACTUALLY a real error. What we do know is that it always comes last in this list though. Why does @aimc do it this way? Who cares, we’ll account for it but grabbing the length of all_errors_list and stopping one index (or idx) before it.

See: if error is not None and idx < len(all_errors_list):

So it is a bit hacky? Sure. But it works for now. This is normally where I would leave myself a comment and say something to the effect of… #last idx of all_errors_list is always a non-error, so ignore it.

GREAT question!

  • SN

I do it that way so you know where the error ends, this was supposed to be a logfile, not a super-duper notifying system :slight_smile:

If you are interested in this kind of error handling I could add code to call back into apps as well as logging the error if you like,

I don’t see it as necessary [yet] :wink: An app is a workable solution and not too difficult to implement. It also doesn’t require anything additional from the user.

The app is actually separate of the logging itself - the process shown is that you can essentially tail a file and send out notifications based on updates. If the user wanted to extend more functionality to the logger itself, that can certainly be done as well.

1 Like

Good explanation, I get it now, which leads to another question. Could this be done with a slice?

for error in all_errors_list[0..-2]:

which I think would be easier to understand.

Certainly! See this is where my background as someone not in CS comes into play. Your solution is certainly more explicit… however it is technically wrong the way you have it. Use this instead…

for error in all_errors_list[:-1]:

I’m actually going to update the example as you’re right, it’s much easier to understand.

ed/

This actually led me to see that we are tracking an error twice, which seems a bit silly and overdone. So I’ve cleaned up get_errorlog_content() and compare_errors() to fit more appropriately. To all the users who are less familiar with process of developing – debugging and refactoring happen constantly! There’s certainly never a reason to be afraid of improving! Great catch @gpbenton!

I added the following function to clear these persistent notifications after five minutes.

But I couldn’t figure out how to get notification id.

Any ideas?

def clear_messages(self, kwargs):
    self.log('timer passed..clear messages')
    time = datetime.fromtimestamp(appdaemon.conf.now)
    self.log("Time now : {}".format(time))

    # oop through all persistent_notification messages
    #need  notification_id ..RESEARCH
    self.call_service('persistent_notification/dismiss', notification_id="notification")

I tried to clear notification from developer tools but that didn’t work either.

image

there are 2 ways:

  1. you can look at developers tools>states if you want to find it manually
  2. you can define the number when you create the notification, so that its always the same number

The first screenshot is from Developer Tools only. There was no information displayed there regarding notification id. I will try your second suggestion.

The reason for clearing out messages was the other day appdaemon app threw bunch persistent messages from error notifier. That was due to some sensor didn’t report proper information and I didn’t catch in that code. I thought it would be better to clear any persistent notifications after a predefined time.

i am still on HA 0.40 and have no dismiss.

in my version it is like this:


notification2

Yikes :open_mouth: I am up to date on HA versions. Earlier I upgraded to 0.57.1

There are lots of changes in last few months.

i know, ill update soon, but i dont want to update every 2 weeks.
i wouldnt have a problem with that if they would chose to avoid breaking changes.
i have a working environment with lots of stuff running.
and if i create a problem by updating, i upset my wife :wink:

Looks like latest HA 0.57.1 broke persistent_notification. It’s displaying ‘state’ information instead of ‘message’

I opened up a ticket.

Never mind. I had to clear browser cache, site data and refreshed to work again.

There was PR in 0.57 release relevant to persistent_notification.

Will this work in Hassio or are changes required?

changes are in this case not required because of hassio, but you are probably running appdaemon 3 (because appdaemon 2 is deprecated) and this is written for appdaemon 2.
so you need to take care of the breaking changes from AD 2 to AD 3.