[Solved] Options to execute SQL inserts in Hass.io (for a Datamart)

Hi HA gurus.

First post here (but many weeks as a ‘read only’ user :slight_smile:). So first, thanks to the creators and developers for such an awesome software.

I currently run a small installation of Hass.io on an RPi3 with some sensors and switches. The short list of what I’m running:

  • Hass.io 0.91.4
  • MariaDB 1.2
  • DuckDNS 1.6

It’s working flawlessly, and now I’m looking into how to get good insights from the historical sensor data.

Simply setting a higher purge_keep_days won’t work, as I want several months of data and that would take too much space.

I created some SQL queries that insert aggregated data from the ‘states’ table so I can keep a long period of data. However, I’m struggling to find a way to execute the queries automatically (let’s say daily).

So far I tried:

  • Execute the SQL files using a shell command: it doesn’t work because “mysql” is not available to the shell.
  • Execute the queries using python_script: it doesn’t work because I cannot import libraries.
  • Execute them through the SQL sensor (wrapping the insert in a stored function): it doesn’t commit to the DB, so even though I can see it execute the function, the records are not committed.

And then I ran out of ideas… I think I could add AppDaemon and try there, but honestly that seems like a bit much for such a simple task as executing some inserts…

So, mates, any other ideas or workarounds for executing SQL commands? (I mean inserts, not only selects.)

Thanks in advance!

So finally I tried AppDaemon and it works flawlessly.

It requires a bit of time to read the docs/examples carefully, and some trial and error, but no doubt it’s awesome.

I created the inserts using PyMySQL, and it now runs daily, summing all the events into an aggregated table, so I will be able to keep years of info and compare easily.

Hi, @jxmartin. Just ran across your post. I’m trying to do something similar with AppDaemon. Can you show some of your AppDaemon code? I can’t figure out what to import (I tried MySQLdb and got an error that it can’t be found).

Any help to head me in the right direction is much appreciated!

The code that I have right now is:

import appdaemon.plugins.hass.hassapi as hass
import datetime
import mysql.connector as mariadb

class GetSQLData(hass.Hass):
    def initialize(self):
        # mysql.connector expects keyword arguments for the connection
        database = mariadb.connect(host="core-mariadb", user="MY_USER",
                                   password="MY_PWD", database="homeassistant")
        cursor = database.cursor()
        # Average of the daily averages over the last 24 hours
        # (MariaDB interval syntax, not SQLite's datetime('now', ...))
        query = "SELECT ROUND(AVG(avg_per_hour),2) 'value' FROM (SELECT AVG(state) AS avg_per_hour \
                 FROM states \
                 WHERE entity_id = 'sensor.dark_sky_temperature' AND state != 'unknown' AND state != '' \
                 AND created > NOW() - INTERVAL 24 HOUR GROUP BY DATE(created) ) avgs;"
        cursor.execute(query)
        self.log("SQL Results: {}".format(cursor.fetchall()))
        database.close()

The error is (which means I can’t even get to the good part):

  File "/config/appdaemon/apps/python files/sql_mariadb.py", line 3, in <module>
    import mysql.connector as mariadb
ModuleNotFoundError: No module named 'mysql'

I’ve figured out that I need to add Python packages to the AppDaemon configuration under Hass.io. I’ve done that with the following:

{
  "disable_auto_token": false,
  "system_packages": [],
  "python_packages": [
    "SQLAlchemy",
    "mysql-connector-python",
    "mysql",
    "mysqlclient"
  ],
  "log_level": "info"
}

I’ve simplified my AppDaemon code to:

import appdaemon.plugins.hass.hassapi as hass
import datetime
from urllib.parse import urlparse
from typing import List
import sqlalchemy
import mysql.connector as mariadb

class GetSQLData(hass.Hass):
    def initialize(self):
        url = "mysql://user_id:password@core-mariadb/homeassistant?charset=utf8"
        fetch_entities = True
        self.db_type = self.get_db_type(url)
        self.log("db type {}".format(self.db_type))
        self.log("Parsed url {}".format(urlparse(url)))
        try:
            mariadb_connection = mariadb.connect(user='haas_admin', password='kirby', host=url, database='homeassistant')
            self.log("If we made it here, the mariadb.connect command worked")
            self.cursor = mariadb_connection.cursor()
            self.log("Successfully connected to database {}".format(self.stripped_db_url(url)))
            if fetch_entities:
                self.fetch_entities()
        except Exception as exc:
            if isinstance(exc, ImportError):
                raise RuntimeError(
                    "The right dependency to connect to your database is "
                    "missing. Please make sure that it is installed."
                )
            print(exc)
            raise

The error I get now has to do with the host parameter in the mariadb.connect command. Can anyone help with that?
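
(I suspect the host parameter wants a bare hostname rather than the full URL, something like the sketch below with the add-on hostname from earlier, though I haven’t confirmed it yet:)

mariadb_connection = mariadb.connect(user='haas_admin', password='kirby',
                                     host='core-mariadb', database='homeassistant')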

Thanks!!

Hi Kirby,

Sorry, I’ve been out for a couple of weeks and couldn’t get to my environment to grab the code before now.

Here is an example of a test script I used to try the inserts into MariaDB. Not sure about the SQLAlchemy or mysql packages… I used the “PyMySQL” package and it worked like a charm.

import appdaemon.plugins.hass.hassapi as hass
import pymysql.cursors

#
# DB Test App.
#
# I'll try to insert a record on the MariaDB
#
# Args:
#

class db_test(hass.Hass):

    def initialize(self):
        self.log("Trying to insert a record on MariaDB Table")

        # Connect to the database
        connection = pymysql.connect(host='192.168.X.XX',
                                     user='hass',
                                     password='*******',
                                     db='homeassistant',
                                     charset='utf8mb4',
                                     cursorclass=pymysql.cursors.DictCursor)
        self.log("Connected")
        try:
            with connection.cursor() as cursor:
                # Create a new record
                sql = "INSERT INTO ha_datamart_states(entity_id,friendly_name) VALUES (%s, %s)"
                cursor.execute(sql, ('my_first_insert', 'HelloWorld'))

            # connection is not autocommit by default, so you must commit
            # to save your changes.
            self.log("Inserted")
            connection.commit()
            self.log("Committed!")

            with connection.cursor() as cursor:
                # Read a single record back to verify the insert
                sql = "SELECT entity_id,friendly_name FROM ha_datamart_states"
                cursor.execute(sql)
                result = cursor.fetchone()
            self.log(result)
        finally:
            connection.close()
        self.log("DBTest Done")

And as a kind of example, what I run is a series of SQL inserts, depending on the type of entity I want to track. Each SQL statement summarizes/aggregates the values I consider important. For instance, this is the SQL I use for “binary” entities:

     sql1=("INSERT INTO ha_datamart_states (cday,chour,entity_id,friendly_name,sensor_type,state,num_changes,state_seconds,kW_consumed,avg_kWh_consumption,max_value,min_value,avg_value) "
          " SELECT concat (curdate() - INTERVAL 1 DAY, ' 00:00:00') cday,tmplt.hour chour,tmplt.entity_id,%s friendly_name,%s sensor_type,tmplt.state,SUM(tmplt.state_changed) num_changes,SUM(tmplt.state_seconds) state_seconds,null kW_consumed,null avg_kWh_consumption,null max_value,null min_value,null avg_value "
          " FROM ( SELECT ttime.hour,tevents.entity_id,tevents.previous_state state, CASE WHEN date_format(tevents.created, '%%H') = ttime.hour AND tevents.state <> 'last' AND tevents.state <> tevents.previous_state THEN 1 ELSE 0 END state_changed, TIME_TO_SEC( ADDTIME(TIMEDIFF( CASE WHEN date_format(tevents.created, '%%H') <> ttime.hour THEN date_add(concat (date_format(tevents.previous_created, '%%Y-%%m-%%d '), hour, ':00:00'), INTERVAL 60 MINUTE) ELSE tevents.created END, CASE WHEN date_format(tevents.previous_created, '%%H') <> ttime.hour THEN concat (date_format(tevents.previous_created, '%%Y-%%m-%%d '), hour, ':00:00') ELSE tevents.previous_created END), CASE WHEN tevents.state='last' and ttime.hour='23' THEN 1 ELSE 0 END )) state_seconds "
          " FROM( SELECT st.entity_id, st.state_id, st.state, st.created, st.previous_state_id, st.previous_state, CASE WHEN date_format(st.previous_created, '%%D') = date_format((NOW() - INTERVAL 2 DAY), '%%D') THEN concat (curdate() - INTERVAL 1 DAY, ' 00:00:00') ELSE st.previous_created END previous_created "
          " FROM (SELECT e.entity_id,e.state_id,e.state,e.created, ( SELECT e2.state_id FROM states e2 WHERE e2.state_id < e.state_id AND e2.entity_id = e.entity_id AND date_format(created, '%%D') = date_format((NOW() - INTERVAL 1 DAY), '%%D') ORDER BY state_id DESC LIMIT 1) previous_state_id, "
          " ( SELECT e2.state FROM states e2 WHERE e2.state_id < e.state_id AND e2.entity_id = e.entity_id ORDER BY state_id DESC LIMIT 1) previous_state, ( SELECT e2.created FROM states e2 WHERE e2.state_id < e.state_id AND e2.entity_id = e.entity_id ORDER BY state_id DESC LIMIT 1) previous_created "
          " FROM states e WHERE entity_id = %s AND date_format(created, '%%D') = date_format((NOW() - INTERVAL 1 DAY), '%%D') ORDER BY e.entity_id ) st UNION ALL "
          " SELECT * FROM (SELECT e.entity_id,e.state_id,'last' state,concat (curdate() - INTERVAL 1 DAY , ' 23:59:59') created,e.state_id previous_state_id,e.state previous_state,e.created previous_created FROM states e WHERE entity_id = %s AND date_format(created, '%%D') = date_format((NOW() - INTERVAL 1 DAY), '%%D') ORDER BY state_id DESC LIMIT 1) st2 ) tevents, "
          " ( SELECT '00' HOUR FROM dual UNION ALL SELECT '01' FROM dual UNION ALL SELECT '02' FROM dual UNION ALL SELECT '03' FROM dual UNION ALL SELECT '04' FROM dual UNION ALL SELECT '05' FROM dual UNION ALL SELECT '06' FROM dual UNION ALL SELECT '07' FROM dual UNION ALL SELECT '08' FROM dual UNION ALL SELECT '09' FROM dual UNION ALL SELECT '10' FROM dual UNION ALL SELECT '11' FROM dual UNION ALL SELECT '12' FROM dual UNION ALL SELECT '13' FROM dual UNION ALL SELECT '14' FROM dual UNION ALL SELECT '15' FROM dual UNION ALL SELECT '16' FROM dual UNION ALL SELECT '17' FROM dual UNION ALL SELECT '18' FROM dual UNION ALL SELECT '19' FROM dual UNION ALL SELECT '20' FROM dual UNION ALL SELECT '21' FROM dual UNION ALL SELECT '22' FROM dual UNION ALL SELECT '23' FROM dual ) ttime "
          " WHERE date_format(tevents.previous_created, '%%H') = ttime.hour OR date_format(tevents.created, '%%H') = ttime.hour OR (date_format(tevents.previous_created, '%%H') < ttime.hour AND date_format(tevents.created, '%%H') > ttime.hour) "
          " ) tmplt GROUP BY   tmplt.hour, tmplt.entity_id, tmplt.state ")

This query counts the number of times (and the amount of time) that the entity was in each state. For example, for the kitchen switch it will tell me how many times it was ON or OFF, and for how long, grouped per hour. The advantage of this approach is that I have only 48 records per day for one switch (24 hours × 2 states) instead of potentially dozens or hundreds, so I can keep quite a lot of historical information.
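
If you need to create the target table yourself, something along these lines should work. The column list comes from the insert above, but the types here are plausible guesses, not the exact original schema:

create_table_sql = """
CREATE TABLE IF NOT EXISTS ha_datamart_states (
    cday                DATETIME,     -- day bucket (midnight of the aggregated day)
    chour               CHAR(2),      -- hour bucket, '00' to '23'
    entity_id           VARCHAR(255),
    friendly_name       VARCHAR(255),
    sensor_type         VARCHAR(64),
    state               VARCHAR(255),
    num_changes         INT,
    state_seconds       INT,
    kW_consumed         DOUBLE,
    avg_kWh_consumption DOUBLE,
    max_value           DOUBLE,
    min_value           DOUBLE,
    avg_value           DOUBLE
)
"""

Executing it once (e.g. with cursor.execute(create_table_sql) from the test app above) sets up the table for the daily inserts.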

For other entities, like light bulbs or power relays, I also keep total and average consumption, and the same goes for temperature sensors…

It has been a long time since I played with SQL, so I’m sure there’s plenty of room to improve the query performance and the process, but so far it has worked well for me :slight_smile:

Thanks for the response! And of course, no worries. We all have day jobs :slight_smile: I’ll give this a shot today. I’ll switch to the PyMySQL package in AppDaemon - I wasn’t sure which one to use.

This worked perfectly. And I really love your summary SQL statement for boiling down the data! Again, thanks for the help!

Now that you’ve used this for two years, do you still recommend the MariaDB approach? I’m guessing you set up MariaDB via an add-on? It seems like that would work and be the best approach.

For context, I’m pulling pieces of images out of security camera feeds, then sending tailored MQTT and Notify messages. Next, I want to show them in an AppDaemon dashboard and do processing on parts of them. I’m currently saving them all to the filesystem and exchanging URLs, while also thinking through other approaches… especially something that’s more reusable for other people.