Taking Home Assistant to the next level. Home Assistant.. Assistant? - AI-powered Machine Learning for HA

dadaloop82 · August 27, 2021, 6:38am

Good morning everyone, like many I am very interested in the issue of equipping our Home Assistant with some form of intelligence.

However, I must start with a small premise: artificial intelligence does not really exist. There is a program that, based on certain algorithms, can detect events and deduce an action, but at this time it is not possible for a machine to perform actions spontaneously.

So more than artificial intelligence, we talk about predicted automatisms.

I have read about many people who have tried, with some interesting methods, but in the end they were too complicated or the results were not satisfactory.

Some time ago I started to think about my own solution that I would like to share.
My concept of data modeling was based on the situations that triggered the change of state of a particular switch, for example.

( ex: prediction of switch_light_kitchen )

1. Retrieve the history of all switch, binary_switch and person entities.
2. Look for the moments when the state of the switch (switch_light_kitchen) changes, and attach a snapshot of the situation to an object that uses the state as its key. This way I have all the situations in which the switch was turned on or off.
3. Examine the situations, giving a “weight” to the entities that changed state before the event, and assign a higher score if the change happened at that same moment.
4. Create a data model based on the entities with the greatest “weight”, then build a listener on those entities, so that I can calculate whether “switch_light_kitchen” should be activated or not (it would send an event to Home Assistant, which would then handle it).
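The weighting step in the list above could be sketched roughly like this. It is only an illustration, not the code from the project: the history format (tuples of timestamp, entity, new state), the 10-minute look-back window, the linear scoring and the entity names other than switch_light_kitchen are all assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def weigh_entities(history, target, window=timedelta(minutes=10)):
    """Score entities by how close their changes are to changes of the target.

    history: list of (timestamp, entity_id, new_state) tuples (assumed format).
    Returns {target_state: {entity_id: weight}}.
    """
    weights = defaultdict(lambda: defaultdict(float))
    for when, entity, state in history:
        if entity != target:
            continue
        # Look at every other entity that changed shortly before this event.
        for other_when, other, _ in history:
            if other == target:
                continue
            age = when - other_when
            if timedelta(0) <= age <= window:
                # 1.0 for a change at the same moment, falling linearly to 0.0.
                weights[state][other] += 1.0 - age / window
    return {state: dict(scores) for state, scores in weights.items()}


# Hypothetical history, only for illustration.
history = [
    (datetime(2021, 8, 1, 19, 0), "binary_sensor.motion_kitchen", "on"),
    (datetime(2021, 8, 1, 19, 5), "switch_light_kitchen", "on"),
    (datetime(2021, 8, 1, 21, 0), "person.daniel", "not_home"),
    (datetime(2021, 8, 1, 21, 0), "switch_light_kitchen", "off"),
]
weights = weigh_entities(history, "switch_light_kitchen")
print(weights)
```

The entities with the highest scores per state would then be the candidates for the listener mentioned in the last step.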

That was my idea, but it was probably wrong.
What do you guys think?
Could this help in the evolution of Home Assistant ?

Bye!

Pirol62 · August 27, 2021, 9:27am

This is exactly the right approach because that’s the way we think about automating repetitive activities.
The interesting entities are motion sensors, person entities, calendars, switches (in the generic sense that everything that can be turned on or off is a switch), cameras, and environment sensors (like illumination, temperature, weather conditions).

The most difficult thing is to detect who is where when there is more than one person in the house, and which of them intends the state change. That can’t even be solved manually. I simply don’t know exactly who is in a certain room, but that’s important for different profiles (light intensity etc.). I think this has to be solved reliably before approaching a really well-working kind of intelligence.

Alexa is a light on the horizon because it can detect who is speaking once the different voices have been defined. We tested it and it works.

dadaloop82 · August 27, 2021, 9:46am

Hi Pirol62 !

What you have written gives me so much comfort, I thought I was completely off track.
Relying on the sensors to create a snapshot from the moment a switch is activated and learning the data model to create the intuition seemed like a good idea to me.

This could also solve the problem of who performed the action, based on the presence sensor and thus working out a prediction model.

Sure, it won’t be perfect, but it could work.

Alexa recognizes voices based on an analysis model. It doesn’t perform any prediction logic, it’s simply a condition, and it can’t be sure it’s me because a friend of mine may have a similar voice.

Pirol62 · August 27, 2021, 2:17pm

Yes, right. Alexa needs at least an activity (speech directed at Alexa) in order to know who is there, hence it can only help when Alexa is called for an action and the person is passed along.
Prediction is another story.

But how will you guess who is currently initiating the state change?
At a certain point you have to get the data on the person’s location. I have no idea other than to store it via a tag or something similar.

dadaloop82 · August 28, 2021, 3:13pm

[ POST UPDATED! ]

Good morning everyone,
convinced by your kind words, I picked up the code I was talking about, written in Python for AppDaemon.

I have modified and adapted it, and I currently get a well-defined data model based on the situations that triggered the event.

As a test, I tried to create a model of the switching of my robot vacuum (switch.neato_botvac_arturo) based on 30 days of history, on my presence in the house (person.daniel_2) and on that of my girlfriend (person.giuliana). At the moment I have this output:

predict_switch: [ starting ]
predict_switch: [ parsing and analyze history data (days: 0/10) ]
predict_switch: [ parsing and analyze history data (days: 10/20) ]
predict_switch: [ parsing and analyze history data (days: 20/30) ]
predict_switch: [ parsing and analyze history data (days: 30/30) ]
predict_switch: [ history data to analyze: 71 ]
predict_switch:  [ {'bt': {'periodon': {0: [{'on': '14:00', 'off': '14:26', 'probs': 23.33, 'count': 1}, {'on': '19:38', 'off': '20:04', 'probs': 23.33, 'count': 1}], 1: [{'on': '14:03', 'off': '14:25', 'probs': 23.33, 'count': 1}, {'on': '19:07', 'off': '19:11', 'probs': 23.33, 'count': 1}], 2: [{'on': '13:16', 'off': '14:25', 'probs': 23.33, 'count': 1}], 3: [{'on': '10:33', 'off': '11:35', 'probs': 46.67, 'count': 2}, {'on': '14:01', 'off': '14:25', 'probs': 23.33, 'count': 1}, {'on': '08:43', 'off': '09:04', 'probs': 23.33, 'count': 1}, {'on': '11:10', 'off': '11:35', 'probs': 23.33, 'count': 1}, {'on': '14:01', 'off': '14:23', 'probs': 23.33, 'count': 1}], 4: [{'on': '10:46', 'off': '11:10', 'probs': 23.33, 'count': 1}, {'on': '13:49', 'off': '14:26', 'probs': 23.33, 'count': 1}], 5: [{'on': '10:10', 'off': '10:58', 'probs': 46.67, 'count': 2}, {'on': '13:35', 'off': '14:03', 'probs': 46.67, 'count': 2}], 6: [{'on': '09:21', 'off': '10:29', 'probs': 23.33, 'count': 1}, {'on': '15:57', 'off': '16:22', 'probs': 23.33, 'count': 1}]}}, 'bo': {'person.daniel_2': {'home': {'count': 51, 'probs': 71.83}, 'not_home': {'count': 20, 'probs': 28.17}}, 'person.giuliana': {'not_home': {'count': 48, 'probs': 67.61}, 'home': {'count': 23, 'probs': 32.39}}}} ]

Explanation:

(bt) = BaseSwitch (time analysis)
(bo) = BasedonSwitch

Results:
(prob. = calculated probability)

**ON periods during the week:**
- Sunday (0): 14:00-14:26 (23.33% prob) 19:38-20:04 (23.33% prob)
- Monday (1): 14:03-14:25 (23.33% prob) 19:07-19:11 (23.33% prob) 
- Tuesday (2): 13:16-14:25 (23.33% prob) 
- Wednesday (3): 10:33-11:35 (**46.67% prob**) 14:01-14:25 (23.33% prob) ...(and so on)
- Thursday (4): 10:46-11:10 (23.33% prob) 13:49-14:26 (23.33% prob)
- Friday (5): 10:10-10:58 (**46.67% prob**) 13:35-14:03 (**46.67% prob**)
- Saturday (6): 09:21-10:29 (23.33% prob) 15:57-16:22 (23.33% prob)

The data are grouped with a margin of one hour, so if I have two entries, for example 13:05-14:10 and 13:15-14:35, they become a single entry, 13:05-14:35.
I only have historical data for the month of August; the model will become much more accurate with the passage of time.
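To make the grouping concrete, here is a minimal sketch of one way it could work. It assumes that two ON periods belong together when their start times are at most one hour apart, and that the merged entry keeps the earliest start and the latest end; the real code may group differently, and the function names are made up for this example.

```python
def to_minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hhmm(total):
    """Convert minutes since midnight back to 'HH:MM'."""
    return f"{total // 60:02d}:{total % 60:02d}"


def group_periods(periods, margin=60):
    """Merge ON periods whose start lies within `margin` minutes of a group's first start."""
    merged = []
    for on, off in sorted((to_minutes(a), to_minutes(b)) for a, b in periods):
        if merged and on - merged[-1]["on"] <= margin:
            # Same group: keep the earliest start, extend to the latest end.
            merged[-1]["off"] = max(merged[-1]["off"], off)
            merged[-1]["count"] += 1
        else:
            merged.append({"on": on, "off": off, "count": 1})
    return [
        {"on": to_hhmm(m["on"]), "off": to_hhmm(m["off"]), "count": m["count"]}
        for m in merged
    ]


grouped = group_periods([("13:05", "14:10"), ("13:15", "14:35"), ("19:38", "20:04")])
print(grouped)
```

With the two example entries from the text, the first group comes out as 13:05-14:35 with a count of 2, while the evening period stays on its own.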

Conditions:

Daniel at home? → 71.83% yes | 28.17% no
Giuliana at home? → 32.39% yes | 67.61% no
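These percentages appear to be simple shares of the 71 analysed history entries (51 + 20 = 71 and 23 + 48 = 71). A small sketch that reproduces the numbers, assuming that is indeed how they are calculated; the function name is made up for this example:

```python
from collections import Counter


def state_probabilities(states):
    """Return count and percentage share for each observed state."""
    counts = Counter(states)
    total = sum(counts.values())
    return {
        state: {"count": count, "probs": round(100 * count / total, 2)}
        for state, count in counts.items()
    }


# State of person.daniel_2 at each of the 71 analysed events (counts from the output above).
daniel = ["home"] * 51 + ["not_home"] * 20
result = state_probabilities(daniel)
print(result)
```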

If this interests you I’ll keep you updated, and maybe open a GitHub repository so you can follow or contribute to the project.

What do you think about it?

Pirol62 · August 29, 2021, 11:59am

That looks definitely promising.
1st question:
As there will be tons of data, how do you plan to store it?
2nd question:
We now see, on different days, a certain probability that the switch will be changed in a given timeframe.
But how should these values be interpreted? If the probability is more than 90%, then create an automation or let Alexa ask “Do you want to switch this device on?”
Maybe a GUI is needed that allows the user to increase the number of sensors or modify probability thresholds or other parameters.

dadaloop82 · August 29, 2021, 1:38pm

Thank you so very much for your interest, this definitely invites me to continue.

I will answer your questions right away:

1. The amount of data for creating the initial model should not be a big problem; the problem will probably be the processing time. I have seen that with a 30-day data set and 4 sensors connected, the computation time is about 20 seconds. I assume that for 1 year the time will be about 8 minutes. Obviously, one of my goals is to store the models and update them only with new data.
2. You absolutely hit the nail on the head! In any case, when the conditions are met, an event will be sent to Home Assistant with the probability percentage. It will then be up to Home Assistant, if the probability is low, to ask for confirmation through Alexa (alexaActions).
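The decision on the Home Assistant side could look roughly like this. It is only a sketch: the 90% limit comes from the question above, while the lower limit, the “ignore” case and the function name are assumptions. In AppDaemon the probability itself would presumably travel in the event data (for instance via `fire_event`).

```python
def decide(probability, act_above=90.0, ask_above=50.0):
    """Map a predicted probability (in percent) to an action.

    'act'    -> switch directly
    'ask'    -> ask for confirmation (e.g. through Alexa)
    'ignore' -> too unlikely, do nothing (assumed extra case)
    """
    if probability >= act_above:
        return "act"
    if probability >= ask_above:
        return "ask"
    return "ignore"


for prob in (95.0, 71.83, 23.33):
    print(prob, decide(prob))
```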

I have opened a repository on Github, if anyone would like to give me some advice, programming help or anything else, I really thank you very much!

Pirol62 · August 30, 2021, 10:59am

Ok, sounds good.
I would implement it like this: the event triggers an automation, which lets Alexa/Google ask: “You switched this entity with 90% probability during the last week. Do you want to switch it now?” If yes, a second question should be “Shall I create an automation?” If yes, Alexa never has to ask again.

A second issue comes up when thinking about a yearly time period. Behavior changes, e.g. over the seasons. In summertime a switch is changed at a different time than in winter, or never. So the prediction has to be calculated over a long time period (at least a year) and will provide different results over this time.
So my first suggestion, creating an automation, is not that good at an early stage. On the other hand, letting Alexa always ask questions can lead to annoying situations where Alexa asks too often. Maybe there should be an option to restrict this support to certain types of switches (lights, covers, heating etc.).

dadaloop82 · August 30, 2021, 1:42pm

Hallo!
Could I please ask you to write your suggestions directly on the project’s GitHub?
That way we can continue the discussion there.
Thanks!

Pirol62 · August 30, 2021, 2:10pm

Yes, sure. Will do.

klacol · October 20, 2021, 6:10am

This is a great idea @chrisvacc.

My use case is to predict the amount of energy that can be produced by my solar system today, based on the weather situation (sunny, cloudy, rainy) and the remaining time until sunset. I need this amount of energy (in kWh) to decide whether to turn on the charger of the electric car or whether to postpone it until later (if possible). I have changed the use case from the one that @dadaloop82 provided; perhaps it helps to see another one.
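Just to illustrate the idea, a very crude baseline could average the past yield per daylight hour for each weather condition and multiply it by the hours left until sunset. This is a sketch under strong assumptions (solar output is treated as uniform over the day, and the sample values and function names are invented); a real model would need to be much more careful.

```python
from collections import defaultdict


def fit_hourly_yield(samples):
    """samples: (condition, daylight_hours, kwh_produced) tuples from past days.

    Returns the average kWh per daylight hour for each weather condition.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # condition -> [kWh, hours]
    for condition, hours, kwh in samples:
        totals[condition][0] += kwh
        totals[condition][1] += hours
    return {c: kwh / hours for c, (kwh, hours) in totals.items() if hours > 0}


def expected_kwh(model, condition, hours_until_sunset):
    """Estimate the energy still to be produced today."""
    return model[condition] * hours_until_sunset


def should_charge_now(model, condition, hours_until_sunset, needed_kwh):
    """Charge now only if the remaining expected yield covers the demand."""
    return expected_kwh(model, condition, hours_until_sunset) >= needed_kwh


# Invented history: (condition, daylight hours, kWh produced that day).
samples = [("sunny", 10, 20.0), ("sunny", 8, 16.0), ("cloudy", 10, 6.0), ("rainy", 10, 2.0)]
model = fit_hourly_yield(samples)
print(expected_kwh(model, "sunny", 4))
print(should_charge_now(model, "cloudy", 4, 5.0))
```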

My thoughts on this, after reading the posts carefully:

1. Prediction of sensor states

The prediction of sensor states based on other sensor states is a very good way to start this. I would not limit it to switches. The sensor state of a switch is “on” or “off”. It could be possible to predict the state of any sensor with a probability of x percent.

2. Configuration

The configuration of such a component in the configuration.yaml could be like this:

machine-learning:
  models:
    - name: "Erwarteter Solarertrag heute"
      unique_id: expected_solar_output_today
      state_class: measurement
      device_class: energy
      unit_of_measurement: kWh
      start_prediction: 2021-05-01 00:00 +02:00
      add_future_values: true
      target_entity: sensor.solar_energy_produced
      input_entities:
        - entity: weather.myaddress
        - entity: sun.next_setting
      include_context: true
    - name: "my second model"
      unique_id: my_second_model
      ...

The context, explained by @balloob, could be the way to relate the states and the events, but that is not clear in my head right now.

3. Execution environment

The execution of such a prediction can be done by the execution environment of https://appdaemon.readthedocs.io/. Perhaps we can ask @ Andrew Cockburn what he thinks about this concept. The execution can be fast or slow, depending on several external factors (CPU, time frame, number of input_entities, and so on), so we would expect the results only after the calculations are done (1 minute, 10 minutes, 2 hours). The question is whether we need some feedback about the status of the calculations.

First there will be the initial execution of the prediction (the calculation of the first model). That will take a while, since the (recorded) history of all included entities since start_prediction has to be processed.

Then the model has to be stored somewhere persistent, so that it survives reboots.

After that, the existing model would be extended whenever one of the input entities triggers a change: the model calculation would be triggered again, but now it should run in a shorter time, since only one change of an included entity has to be processed. The updated model would have to be stored again, and the previous model can be overwritten.
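The store-and-extend cycle described above could be sketched like this. The “model” here is just a dictionary of state counts kept in a JSON file, which is an assumption made to keep the example small; the file name and function names are invented as well.

```python
import json
import tempfile
from pathlib import Path


def load_model(path):
    """Load the stored model, or start with an empty one on the first run."""
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text())
    return {"counts": {}, "total": 0}


def update_model(model, new_states):
    """Extend the model with new observations only, without re-reading old history."""
    for state in new_states:
        model["counts"][state] = model["counts"].get(state, 0) + 1
        model["total"] += 1
    return model


def save_model(model, path):
    """Write to a temporary file first, then overwrite the previous model."""
    tmp = Path(str(path) + ".tmp")
    tmp.write_text(json.dumps(model))
    tmp.replace(path)


with tempfile.TemporaryDirectory() as folder:
    model_file = Path(folder) / "model.json"
    # First run: build the model from the recorded history and store it.
    save_model(update_model(load_model(model_file), ["home", "home", "not_home"]), model_file)
    # Later (e.g. after a reboot): reload and add only the new change.
    reloaded = update_model(load_model(model_file), ["home"])
    print(reloaded)
```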

4. Output

The output of such an ML prediction could be a sensor, based on the configuration above, with attributes, e.g. like this:

entity: expected_solar_output_today
state: 2.1
attributes:
  type: ml-sensor
  probability: 79 %
  state_class: measurement
  device_class: energy
  unit_of_measurement: kWh
  friendly_name: "Erwarteter Solarertrag heute"
  trace:
    algorithm: "The name of the ML algorithm"
    calculations:
      first_prediction: 2021-08-19T19:57:27+01:00
      last_prediction: 2021-10-19T19:57:27+01:00
      number_of_predictions: 567
    ...

If one of the input entities trigger a change, the model calculation itself would be triggered and the ml-sensor would be updated as described.

The trace addresses the very important point that @robmarkcole made: “Why does it do this?” People will create further automations on the ml-sensor, and they need to know something about the quality of the sensor. It makes a difference whether the ml-sensor has 10 input values or 1000. It is also important which time frame the values come from.

5. Naming

The name of such a component could be machine-learning. This is what is done here. It could be a custom component.

dadaloop82 · October 20, 2021, 6:58am

Hi @klacol!

I really thank you so much for the very valuable information you provided.

At the moment, due to force majeure, the project is temporarily shelved but I would like to resume it absolutely as soon as possible.
I should resolve some personal issues within a couple of weeks and then get back to work.
I’d love it if you’d join my GitHub channel to collaborate, even if it’s just with ideas.

Thanks so much!
Daniel

wtv1211 · January 17, 2022, 10:24am

Hello! The “Compensation” integration in HA needs to be more advanced. I would like a button to feed in data from entities, to do some faster (or longer) calibrating. Or maybe having more than pairs of numbers… Case in point: predicting the power usage of a split heat pump so that it can closely match the available solar power, based on factors like interior temperature, exterior temperature, their delta, etc., so that an automation can select the best scenario.
Or, even more awesome, showing the real-time heat pump COP and finding out whether it is better to crank up the heat output now at a high exterior temperature (thus better COP) or leave it as is, based on some weather forecast or even the local rate of temperature drop.
Update: I used https://stats.blue to get a model (formula) for COP. I templated a power*cop formula and then applied Riemann integration to the result to get the equivalent gas in kWh for a heat pump.
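For anyone wondering what the last step does: integrating power*COP over time gives the delivered heat in kWh. A minimal left Riemann sum, assuming samples of (hours since start, electrical power in kW, COP); the sample values and the function name are made up, and in HA this would normally be done by the Riemann sum integration rather than by hand.

```python
def heat_energy_kwh(samples):
    """Left Riemann sum of power * COP over time.

    samples: (hours_since_start, electrical_power_kw, cop) tuples, sorted by time.
    Each value is held constant until the next sample arrives.
    """
    total = 0.0
    for (t0, power, cop), (t1, _, _) in zip(samples, samples[1:]):
        total += power * cop * (t1 - t0)
    return total


# Invented readings over one hour.
samples = [(0.0, 1.0, 3.0), (0.5, 2.0, 2.5), (1.0, 1.0, 4.0)]
energy = heat_energy_kwh(samples)
print(energy)
```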

2max2pax · February 15, 2022, 5:42pm

@dadaloop82, this is really interesting.
I’ve started collecting data in a SQLite database with the idea to create an algorithm that could predict next actions.
I used pyscript (that is much easier for me) for handling the database.
In order to limit the number of records, I created two automations that trigger when the status of my devices changes, one for covers and one for lights.
I’m now changing the database, trying to make it more meaningful, with more columns (data): each record should be a picture of the status of all entities in the whole house.
So right now the database is really stupid, but once I have decided how to set it up, I could share it with you so that you can evaluate your system with different data.
I’m open to suggestions for the set-up of the database.
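As one possible starting point for the set-up, each record could store what triggered it plus a JSON snapshot of all entity states at that moment. This is only a sketch: the table name, the columns and the entity names are assumptions, and it uses plain sqlite3 rather than pyscript.

```python
import json
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the snapshot table if it is missing."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        "id INTEGER PRIMARY KEY, "
        "taken_at TEXT NOT NULL, "
        "trigger_entity TEXT NOT NULL, "
        "new_state TEXT NOT NULL, "
        "states_json TEXT NOT NULL)"
    )
    return db


def record_snapshot(db, taken_at, trigger_entity, new_state, all_states):
    """Store one record: the change that happened plus the state of the whole house."""
    db.execute(
        "INSERT INTO snapshots (taken_at, trigger_entity, new_state, states_json) "
        "VALUES (?, ?, ?, ?)",
        (taken_at, trigger_entity, new_state, json.dumps(all_states, sort_keys=True)),
    )
    db.commit()


db = open_db()
record_snapshot(
    db,
    "2022-02-15T17:42:00",
    "light.kitchen",
    "on",
    {"light.kitchen": "on", "cover.bedroom": "closed"},
)
rows = db.execute("SELECT trigger_entity, new_state, states_json FROM snapshots").fetchall()
print(rows)
```

Keeping the snapshot as JSON avoids changing the table every time an entity is added, at the cost of less convenient queries on individual entities.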

EDIT:
@chrisvacc , sorry I forgot to tag you.

vingerha · March 24, 2022, 5:24pm

@chrisvacc Hi, I see a lot of interest in the posts but very little response from your side. Is this still active, i.e. are you still working on this?

dadaloop82 · April 6, 2022, 2:00pm

Good morning everyone, sorry for my long absence.

I’ve been thinking a lot about applying AI to Home Assistant and I’ve done many experiments in this regard, but unfortunately they never gave me satisfaction.

I tried to change my point of view and wanted to describe my experiences in this thread I just wrote in this forum. I would be very pleased to receive your comments and proposals, your thoughts and concerns, so that together we can arrive at a possible solution in a constructive way.

Thank you!

veonua · April 30, 2022, 7:59pm

I am a data scientist with a background in computer vision. I currently want to play with time-series data for my next project. It would also be interesting to do an ML project in the IoT/smart home area.

There are several ideas I would like to try, but I am not an HA user and have only a week’s worth of data from my setup (about 40 sensors and devices).

If you can share your data set, it would be a great start.

sygys · May 20, 2022, 8:55pm

I know what you mean. I just started thinking of building a system with machine learning, but making it truly learn something is terribly complicated. Letting HA suggest automations based on stuff you do every day is fun but not really smart, because it will only know this if you do the exact same thing every day. In that case it’s easier to just write an automation for it and be done with it.

In my opinion AI needs to have a data set and be able to make the connections on its own. So if I get out of bed one hour later, it still needs to know what to do. And when I don’t work at home on Wednesday this week but on Thursday, it needs to recognize that and act on it based on its findings. So it needs to recognize patterns that can differ in 1000s of ways and still know what to do. I guess if you really want to go this far, it’s simply impossible to write with a few lines of code. Google is spending 1000s of man-hours a week on this and they can’t even set my alarm clock to go off earlier when there is a traffic jam. It’s very complicated. You have 1000s of variables that need to be checked every second.

Jpsy · May 21, 2022, 5:02am

@sygys I so totally agree with you. I am following this topic out of curiosity. But I often shook my head reading the high hopes that were expressed here. I am convinced that a neural network will rarely be able to infer the deeper motivations of our actions from the patterns of the sensors and actuators in our houses. It would surely produce occasional hits, but they would be embedded in a large number of wrong actions. These failures would be frustrating to all users and would totally ruin the acceptance of the system. Sorry for disenchanting you guys, but IMHO this is not feasible in the magical way that you are envisioning it.

computermaster0101 · June 1, 2022, 4:58pm

Hey Team! I started this same project earlier this year. Is there a Slack channel I can join to participate in the discussion and work?