Hooking HA Database Transactions

Overview

I’m working on an integration in which I would like to respond to transactions on the Home Assistant database. More specifically, I would like to be able to run a function every time a particular event is written to the database (i.e., once the event is committed to the database or at least flushed to the database session).

I’m not sure if this is even possible.

I have examined the source code of the recorder component which is responsible for writing events to the database. And I have read about sqlalchemy event listeners which allow one to hook database transaction events. So now I think I may have a plan for achieving my goal. However, there are a few pieces of my plan that I am unsure about - I’ll get to these in a moment.

But first, my plan so far is to use sqlalchemy.event.listen to hook INSERTs or COMMITs to the database using the sqlalchemy session from the recorder component thread. I think this approach should allow me to call code from a hook whenever an INSERT/COMMIT transaction contains the event that I am tracking.

Hypothetical Examples

Hook INSERT Transactions

from sqlalchemy.event import listen
from sqlalchemy.orm import object_session
from homeassistant.components.recorder.models import Events

tracked_event = # This is the event whose insertion I want to track

def hook(mapper, connection, db_event):
    session = object_session(event)
    event = Events.to_native(db_event)
    if event == tracked_event:
        # Do stuff with the event now that I know it has been written to the database

# Register a listener for database INSERTs
listen(Events, 'after_insert', hook)

Hook COMMIT Transactions

from sqlalchemy.event import listen
from homeassistant.components import recorder

def hook(session):
    # If possible, examine the session to see if tracked_event was inserted
    # Then do stuff with the event

session = self.hass.data[recorder.DATA_INSTANCE].get_session()
listen(session, 'after_commit', hook)

So this might be workable up to this point, but here are the things I’m unsure of…

Questions

Can I use the Recorder’s database session from a different thread?

The Recorder runs in its own thread and uses sqlalchemy.orm.scoping.scoped_session which is apparently thread-safe. However, the documentation makes it sound like a scoped_session won’t contain the same transaction data if it’s used by a different thread.

The scoped_session object by default uses [thread local storage], so that a single Session is maintained for all who call upon the scoped_session registry, but only within the scope of a single thread. Callers who call upon the registry in a different thread get a Session instance that is local to that other thread.

If I’m reading this correctly, data that is flushed (but not committed) will not be available in the shared session of a different thread - it is only visible in the Recorder thread’s session. :confused: If this is the case, then I don’t think it’s possible to hook transactions made by the Recorder.

What is the scope and thread of execution for sqlalchemy.event.listen ?

I’ve read through a ton of sqlalchemy docs and examples, and it’s still no clearer:

  • Which thread will run the function hook below?
  • Can hook modify state outside of its local scope (like foo.bar below)?
foo = # say foo is some object that lives in the main thread

def hook(mapper, connection, db_event):
    foo.bar = 123

listen(Events, 'after_insert', hook)

If hook can’t modify state outside of its local scope, then I can’t use sqlalchemy.event.listen to do anything interesting when a database transaction event occurs.

Summary

I want to be able to react to changes in the HA database, but I’m not sure how to do it or if it’s possible. There seems to be at least one potential solution to my problem, but I don’t know quite enough about sqlalchemy to be sure.

If anyone could shed some light on this topic, I would appreciate it. Thanks! :v:

References

P.S.
I should mention that I am not interested in listening for events on the HA event bus. I can do that, but it is not suited to my needs because the event listeners are called before the events are written to the database.

What about using the MySQL node in NodeRED to connect and query the data?

My plan is to extend my variable component so that var entities can update their value based on a user-defined database query whenever a user-specified event fires.

As an example, suppose I want a variable to count the average number of door_open events in the past week. I can achieve that by querying the database, but I have to rerun that query every time a door_open event occurs. I’d like to rerun that query automatically whenever a door_open event occurs. Now, I can listen for door_open events on the hass event bus, but the event listener fires before the Recorder component writes the door_open event to the database. Thus, when I rerun my query, it misses the latest door_open event.

So the only solution I can think of that I’m happy with is to hook the database itself. An alternative solution that I don’t like is to poll the database. Yuck.

I like the idea of custom queries.

There already is a component that uses data from the history db that might be helpful, though it doesn’t hook into the sql directly. The existing component can already count state changes as your example use case.

I’m familiar with the history statistics, but it is fairly limited in what it can track. For example, it can’t do an average of a count over a duration. There are ways to achieve an average through hacks or combining sensors, but that makes the yaml ugly and unmaintainable.

My goal with my var integration is to create a single component with enough power to essentially replace all of these other limited sensor components that need to be hacked and strung together just to accomplish some simple data processing.

If I can get these database hooks in place, then I’m essentially there.

I experimented with snagging the database session of the recorder from a separate thread. The session in the separate thread did not appear to have the same flushed db entries as the recorder thread. Therefore, I’m left to conclude that it is not possible to get visibility into the recorder’s session from a separate component.

This is unfortunate as I can only think of a few workarounds to solving this problem, and they’re all lackluster.

Workarounds

Poll the database for changes

The workaround I’ve adopted is to listen for events in Home Assistant and then poll the database until the events have been committed to the database by the recorder. This seems to work, but it’s not efficient and there is a small delay due to the polling.

Maintain a separate database session mirror

Another workaround I considered was to maintain a separate database session that mirrored the session of the recorder. This mirror session would listen for home assistant events and “write” them to the database session, but the session would be configured so that the events are never actually committed to the database. This approach might be workable, but it seems like a poor approach in general.

I really wish there was a better way to hook the Home Assistant database…