Taking Home Assistant to the next level. Home Assistant.. Assistant? - AI-powered Machine Learning for HA

What attributes would we store for a data point?

Timestamp, Sensor Value, Measurement Unit,…

What else?
Execution context?
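
For illustration, such a data point could look something like this as a small Python structure (the field names are just an example, not an existing HA data model):

from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class DataPoint:
    """One recorded observation of an entity (illustrative only)."""
    entity_id: str                    # e.g. "sensor.outside_temperature"
    timestamp: datetime               # when the value was recorded
    value: str                        # HA states are strings ("21.3", "on", ...)
    unit: Optional[str] = None        # measurement unit, e.g. "°C"
    context_id: Optional[str] = None  # execution context: what caused the change

point = DataPoint(
    entity_id="sensor.outside_temperature",
    timestamp=datetime(2021, 8, 19, 19, 57),
    value="21.3",
    unit="°C",
    context_id="automation.morning_routine",
)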

Sounds interesting. There was already a talk about this topic during the conference last year: https://youtu.be/ErWK0wV6uMQ

1 Like

Hi, fantastic idea! I found this thread while searching for such a solution.

I’m not a dev. But I’ll have an interesting setup in a few weeks: photovoltaic system, heat pump, weather station, dishwasher, washing machine, roller shutters etc. all connected to home assistant.

My input: think about a solution that can start when there is hardly any training data available, e.g. in my situation with a new house and a new home automation setup. I would also like to be able to predefine some rules to avoid breaking things, e.g. “do not close the shades in strong winds”. So while I understand how ML and training work, I need a way to start without a system that has already been collecting training data for months or even longer.
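
One possible shape for those predefined rules (a minimal, made-up sketch; the entity name and the wind threshold are assumptions, not an existing HA feature): hard safety rules are checked first and can veto whatever a learned model suggests, so nothing depends on months of training data.

# Hypothetical guard layer: hand-written safety rules always win over ML suggestions.
SAFETY_RULES = [
    # (description, predicate on current states, action that gets vetoed)
    ("no shutters closed in strong wind",
     lambda states: float(states.get("sensor.wind_speed", "0")) > 50.0,
     "cover.close_cover"),
]

def allowed(action: str, states: dict) -> bool:
    """Return False if any safety rule vetoes the proposed action."""
    for description, predicate, vetoed_action in SAFETY_RULES:
        if action == vetoed_action and predicate(states):
            print(f"Vetoed by rule: {description}")
            return False
    return True

# The learned model would only be consulted if allowed(...) returns True:
states = {"sensor.wind_speed": "62.5"}
print(allowed("cover.close_cover", states))  # False -> keep the shutters open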

I’m not a developer, so I can’t help, but…
WOW. What an idea

How could non-programmers help out? I’m not rich, but if there were a crowdfund I would contribute: a fund to commission help from outside the HA community when necessary.

Sounds very interesting. I find it kind of frustrating to store a lot of data but not be able to make anything intelligent out of it beyond immediate, reactive automations.

Which makes me also suggest that any such development could (should?) leverage historical data from InfluxDB. Personally I keep maybe 5 days of data in HA’s local db only, but my InfluxDB has years of historical data. From an AI perspective, the more data the better, no?
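
For example, with the InfluxDB 2.x Python client the history could be pulled out roughly like this (the bucket, the tag names and the entity are placeholders that depend on how the HA InfluxDB integration is configured; a 1.x setup would use InfluxQL instead):

# Sketch: pulling long-term history out of InfluxDB 2.x as training data.
from influxdb_client import InfluxDBClient

query = '''
from(bucket: "home_assistant")
  |> range(start: -365d)
  |> filter(fn: (r) => r["entity_id"] == "kitchen_light")
  |> keep(columns: ["_time", "_field", "_value"])
'''

client = InfluxDBClient(url="http://influxdb:8086", token="MY_TOKEN", org="home")
df = client.query_api().query_data_frame(query)  # one row per recorded value
client.close()

print(df.head())  # ready for feature extraction / model training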

Great topic. Wanted to add my thoughts on a possible direction using AI.

I would love to be able to type my intent and for hassio to generate the necessary yaml automation.

I have tried the Almond addon to achieve this, but the results were limited at best.

My proposal would be to evaluate OpenAI Codex to see if it could (in the near future) be used to achieve this. I recently got beta access and it’s pretty amazing. We’re not getting rid of programmers anytime soon, but for low/no-code situations it could deliver real value.

If you want to get an idea of what’s possible this video shows me converting English language to SQL. Keen to know what people think.
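
To make the idea concrete, here is a rough sketch of the round trip (the engine name and the exact client interface depend on your OpenAI library version and API access; this follows the old Completion API from the Codex beta, and any generated YAML would of course need review before loading it into HA):

import openai

openai.api_key = "sk-..."  # your key

PROMPT = """# Convert the request into a Home Assistant automation in YAML.
# Request: turn on the kitchen light when motion is detected after sunset
automation:"""

response = openai.Completion.create(
    engine="davinci-codex",   # Codex beta engine name at the time
    prompt=PROMPT,
    max_tokens=200,
    temperature=0,
    stop=["# Request:"],
)

print("automation:" + response.choices[0].text)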

Good morning everyone. Like many, I am very interested in the question of equipping our Home Assistant with some form of intelligence.

However, I must make a small premise: artificial intelligence does not really exist as such. There is a program that, based on certain algorithms, can detect events and deduce an action, but at the moment it is not possible for the machine to perform actions spontaneously.

So rather than artificial intelligence, we are really talking about predictive automations.

I have read about many people who have tried, with methods of some interest, but they were either too complicated or the results were not satisfactory.

Some time ago I started to think about my own solution that I would like to share.
My concept of data modeling was based on the situations that triggered the change of state of a particular switch, for example.

( ex: prediction of switch_light_kitchen )

  • retrieve the history of all switch, binary_switch and person entities

  • look for when the state of the switch (switch_light_kitchen) changes and attach a snapshot of the situation at that moment to an object keyed by the new state: this way I have all the situations in which the switch was turned on or off

  • I examine the situations, giving a “weight” to the entities whose state changed before the event, assigning a higher score the closer the change was to that moment

  • I create a data model based on the entities with the greatest “weight”, and then build a listener on each of those entities, so I can calculate whether “switch_light_kitchen” should be activated or not (it would send an event to Home Assistant, which is then handled); a rough sketch of this weighting idea follows below
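
A very rough sketch of the weighting idea in plain Python (the history records and the lookback window are invented, just to illustrate the logic; the real data would come from the HA recorder / AppDaemon history):

from collections import defaultdict
from datetime import datetime, timedelta

# Fake history: (entity_id, new_state, timestamp)
HISTORY = [
    ("binary_sensor.kitchen_motion", "on",   datetime(2021, 8, 1, 19, 58)),
    ("person.daniel",                "home", datetime(2021, 8, 1, 19, 55)),
    ("switch.light_kitchen",         "on",   datetime(2021, 8, 1, 20, 0)),
    ("binary_sensor.kitchen_motion", "on",   datetime(2021, 8, 2, 20, 1)),
    ("switch.light_kitchen",         "on",   datetime(2021, 8, 2, 20, 3)),
]

TARGET = "switch.light_kitchen"
LOOKBACK = timedelta(minutes=10)   # how far back a change still "counts"

def weight_entities(history, target, lookback):
    """Score the entities that changed shortly before each target change."""
    weights = defaultdict(float)
    for entity, state, when in history:
        if entity != target:
            continue
        # look at every other change inside the lookback window before this event
        for other, other_state, other_when in history:
            if other == target:
                continue
            delta = when - other_when
            if timedelta(0) <= delta <= lookback:
                # the closer to the event, the higher the score
                weights[(other, other_state)] += 1.0 - delta / lookback
    return dict(weights)

print(weight_entities(HISTORY, TARGET, LOOKBACK))
# The highest-weighted entities would get a listener that can later fire a
# "should switch.light_kitchen be turned on?" event towards Home Assistant.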

That was my idea, but it was probably wrong.
What do you guys think?
Could this help in the evolution of Home Assistant?

Bye!

2 Likes

This is exactly the right approach, because that’s the way we think about automating repeating activities.
The interesting entities are motion sensors, person entities, calendars, switches in the generic sense (everything that can be turned on or off is a switch), cameras, and environment sensors (like illumination, temperature, weather conditions).

The most difficult thing is to detect who is where when there is more than one person in the house, and which of them intends the state change. That can’t even be solved manually. I simply don’t know exactly who is in a certain room, but that’s important for different profiles (light intensity etc.). I think this has to be solved reliably before approaching a really well-working kind of intelligence.

Alexa is a light on the horizon because it can detect who is speaking once the different voices have been defined. We tested it and it works.

Hi Pirol62 !

What you have written gives me so much comfort; I thought I was completely off track.
Relying on the sensors to create a snapshot from the moment a switch is activated, and learning a data model from that to create the “intuition”, seemed like a good idea to me.

This could also solve the problem of who performed the action, based on the presence sensor and thus working out a prediction model.

Sure, it won’t be perfect, but it could work.

Alexa recognizes voices based on its analysis model; it doesn’t perform any prediction logic, it’s simply a condition. It can’t be sure it’s me, because a friend of mine may have a similar voice.

Yes, right. Alexa needs at least an activity (speech to Alexa) in order to know who, hence it can only help when Alexa is called for an action and the person is passed through.
Predictive is another story :wink:

But how will you guess who is currently initiating the state change?
At a certain point you have to get data on the person’s location. I have no idea other than storing it via a tag or something similar.

[ POST UPDATED! ]

Good morning everyone,
convinced by your nice words, I picked up the code I was talking about, written in Python for AppDaemon.

I have modified and adapted it, and currently I get a well-defined data model based on the situations that triggered the event.

I ran a test, trying to create a model for the switching of my robot vacuum (switch.neato_botvac_arturo) based on 30 days of history and on my presence in the house (person.daniel_2) and that of my girlfriend (person.giuliana). At the moment I have this output:

predict_switch: [ starting ]
predict_switch: [ parsing and analyze history data (days: 0/10) ]
predict_switch: [ parsing and analyze history data (days: 10/20) ]
predict_switch: [ parsing and analyze history data (days: 20/30) ]
predict_switch: [ parsing and analyze history data (days: 30/30) ]
predict_switch: [ history data to analyze: 71 ]
predict_switch:  [ {'bt': {'periodon': {0: [{'on': '14:00', 'off': '14:26', 'probs': 23.33, 'count': 1}, {'on': '19:38', 'off': '20:04', 'probs': 23.33, 'count': 1}], 1: [{'on': '14:03', 'off': '14:25', 'probs': 23.33, 'count': 1}, {'on': '19:07', 'off': '19:11', 'probs': 23.33, 'count': 1}], 2: [{'on': '13:16', 'off': '14:25', 'probs': 23.33, 'count': 1}], 3: [{'on': '10:33', 'off': '11:35', 'probs': 46.67, 'count': 2}, {'on': '14:01', 'off': '14:25', 'probs': 23.33, 'count': 1}, {'on': '08:43', 'off': '09:04', 'probs': 23.33, 'count': 1}, {'on': '11:10', 'off': '11:35', 'probs': 23.33, 'count': 1}, {'on': '14:01', 'off': '14:23', 'probs': 23.33, 'count': 1}], 4: [{'on': '10:46', 'off': '11:10', 'probs': 23.33, 'count': 1}, {'on': '13:49', 'off': '14:26', 'probs': 23.33, 'count': 1}], 5: [{'on': '10:10', 'off': '10:58', 'probs': 46.67, 'count': 2}, {'on': '13:35', 'off': '14:03', 'probs': 46.67, 'count': 2}], 6: [{'on': '09:21', 'off': '10:29', 'probs': 23.33, 'count': 1}, {'on': '15:57', 'off': '16:22', 'probs': 23.33, 'count': 1}]}}, 'bo': {'person.daniel_2': {'home': {'count': 51, 'probs': 71.83}, 'not_home': {'count': 20, 'probs': 28.17}}, 'person.giuliana': {'not_home': {'count': 48, 'probs': 67.61}, 'home': {'count': 23, 'probs': 32.39}}}} ]

Explanation:

(bt) = BaseSwitch (time analysis)
(bo) = BasedonSwitch

Results:
(prob. = calculated probability)

**ON periods during the week:**
- Sunday (0): 14:00-14:26 (23.33% prob) 19:38-20:04 (23.33% prob)
- Monday (1): 14:03-14:25 (23.33% prob) 19:07-19:11 (23.33% prob) 
- Tuesday (2): 13:16-14:25 (23.33% prob) 
- Wednesday (3): 10:33-11:35 (**46.67% prob**) 14:01-14:25 (23.33% prob) ...(and so on)
- Thursday (4): 10:46-11:10 (23.33% prob) 13:49-14:26 (23.33% prob)
- Friday (5): 10:10-10:58 (**46.67% prob**) 13:35-14:03 (**46.67% prob**)
- Saturday (6): 09:21-10:29 (23.33% prob) 15:57-16:22 (23.33% prob)
  • the data are grouped with a margin of one hour, so if I have two intervals, for example 13:05-14:10 and 13:15-14:35, they become a single interval, 13:05-14:35 (see the sketch after these notes)
  • I only have historical data for the month of August; the model will become much more accurate with the passage of time
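
To illustrate the one-hour grouping mentioned above, here is a small standalone sketch (not the actual AppDaemon code):

from datetime import datetime, timedelta

MARGIN = timedelta(hours=1)

def merge_periods(periods):
    """Merge on/off periods whose start times lie within one hour of each other."""
    def to_dt(hhmm):
        return datetime.strptime(hhmm, "%H:%M")

    merged = []
    for on, off in sorted(periods, key=lambda p: to_dt(p[0])):
        if merged and to_dt(on) - to_dt(merged[-1][0]) <= MARGIN:
            last_on, last_off = merged[-1]
            merged[-1] = (last_on, max(last_off, off, key=to_dt))
        else:
            merged.append((on, off))
    return merged

print(merge_periods([("13:05", "14:10"), ("13:15", "14:35")]))
# [('13:05', '14:35')]  -> two nearby periods collapsed into one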

Conditions:

Daniel at home? → 71.83% yes | 28.17% no
Giuliana at home? → 32.39% yes | 67.61% no

If this interests you I’ll keep you updated, and maybe open a GitHub repository so you can follow or contribute to the project.

What do you think about it?

2 Likes

That looks definitely promising.
1st question:
As there will be tons of data, how do you plan to store it?
2nd question:
We now see, for different days, a certain probability that the switch will be changed in a given timeframe.
But how do we interpret these values? If the probability is more than 90%, do we create an automation, or let Alexa ask “do you want to switch this device on?”
Maybe a GUI is needed that allows the user to increase the number of sensors or to modify probability thresholds and other parameters.

Thank you so much for your interest; this definitely encourages me to continue.

I will answer your questions right away:

  1. The amount of data for creating the initial model should not be a big problem; the problem will probably be the processing time. I have seen that with a 30-day data set and 4 connected sensors, the computation time is about 20 seconds, so I assume that for 1 year it will be about 8 minutes. It is obviously among my goals to store the models and update them only with new data (see the sketch after this list).

  2. You absolutely hit the nail on the head! In any case, when the conditions are met, an event will be sent to Home Assistant with the probability percentage. It will then be Home Assistant that, if the probability is low, asks for confirmation through Alexa (alexaActions).
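
A minimal sketch of what I mean by storing the model and folding in only new data (the file name, event fields and counting scheme are just placeholders, not the actual code in the repository):

import json
from pathlib import Path

MODEL_FILE = Path("predict_switch_model.json")

def load_model():
    """Load the stored model, or start from an empty one."""
    if MODEL_FILE.exists():
        return json.loads(MODEL_FILE.read_text())
    return {"last_processed": None, "counts": {}}

def update_model(model, new_events):
    """Fold only the new history events into the stored counts."""
    for event in new_events:
        key = f"{event['entity_id']}|{event['state']}"
        model["counts"][key] = model["counts"].get(key, 0) + 1
        model["last_processed"] = event["when"]
    return model

def save_model(model):
    MODEL_FILE.write_text(json.dumps(model, indent=2))

# Usage: fetch only events newer than model["last_processed"] from the recorder, then:
model = load_model()
model = update_model(model, [{"entity_id": "person.daniel_2", "state": "home",
                              "when": "2021-08-31T10:00:00+02:00"}])
save_model(model)

This way the expensive full-history pass only has to happen once.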

I have opened a repository on GitHub; if anyone would like to give me some advice, programming help, or anything else, I would be really grateful!

2 Likes

Ok, sounds good.
I would implement it like this: the event triggers an automation, which lets Alexa/Google ask: “You switched this entity 90% of the time during the last week. Do you want to switch it now?” If yes, a second question should be “Shall I create an automation?” If yes, Alexa never has to ask again.

A second issue comes up when thinking about a yearly time period. The behavior changes, e.g. over the seasons: in summer a switch is changed at another time, or never, compared to winter. So the prediction has to be calculated over a long time period (at least a year) and will provide different results over this time.
So my first suggestion, creating an automation, is not that good as a first step. On the other hand, letting Alexa always ask questions can lead to annoying situations where Alexa asks too often. Maybe there should be an option to limit this support to certain types of switches (lights, covers, heating etc.).
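
Roughly, the receiving side could look like this in AppDaemon (just a sketch; the event name, the payload fields and the notify service are assumptions, and the “ask via Alexa/Google” part could replace the notification):

import appdaemon.plugins.hass.hass as hass

class PredictionPrompt(hass.Hass):

    def initialize(self):
        # the predictor would fire this event with entity/probability data
        self.listen_event(self.on_prediction, "predict_switch_suggestion")

    def on_prediction(self, event_name, data, kwargs):
        entity = data.get("entity_id")
        prob = data.get("probability", 0)
        if prob >= 90:
            # high confidence: switch directly (or offer to create an automation)
            self.turn_on(entity)
        else:
            # lower confidence: ask first
            self.call_service(
                "notify/mobile_app_phone",
                message=f"I usually see {entity} turned on now ({prob}%). Turn it on?",
            )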

Hello!
Can I please ask you to write your suggestions directly on the project’s GitHub?
So we can continue the discussion there.
Thanks!

Yes, sure. Will do.

This is a great idea @chrisvacc.

My use case is to predict the amount of energy that can be produced by my solar system today, based on the weather situation (sunny, cloudy, rain) and the remaining time until sunset. I need this amount of energy (in kWh) to decide whether to turn on the charger of the electric car, or whether to postpone it to later (if possible). I have changed the use case from the one that @dadaloop82 has provided; perhaps it helps to see another one.

My thoughts on this, after reading the posts carefully:

1. Prediction of sensor states

The prediction of sensor states based on other sensor states is a very good way to start this. I would not reduce it only to switches; the state of a switch is just “on” or “off”. It should be possible to predict the state of any sensor with a probability of x percent.

2. Configuration

The configuration of such a component in the configuration.yaml could be like this:

machine-learning:
  models:
    - name: "Erwarteter Solarertrag heute"
      unique_id: expected_solar_output_today
      state_class: measurement
      device_class: energy
      unit_of_measurement: kWh
      start_prediction: 2021-05-01 00:00 +02:00
      add_future_values: true
      target_entity: sensor.solar_energy_produced
      input_entities:
        - entity: weather.myaddress
        - entity: sun.next_setting
      include_context: true
    - name: "my second model"
      unique_id: my_second_model
      ...

The context, explained by @balloob, could be the way to relate the states and the events; but that is not clear in my head right now.

3. Execution environment

The execution of such a prediction can be done by the execution environment of https://appdaemon.readthedocs.io/. Perhaps we ask @ Andrew Cockburn what he thinks about this concept. The execution can be fast or slow, depending on several external factors (CPU, time frame, number of input_entities, and so on), so we would expect the results after the calculations are done (1 minute, 10 minutes, 2 hours). The question is whether we need some feedback on the status of the calculations.

There will be a first execution of the prediction (the calculation of the first model). That would take a while, since the (recorded) history of all included entities since start_prediction has to be processed.

Then the model has to be stored somewhere persistent, so that it also stays alive during reboots.

After that, the existing model would be extended whenever one of the input entities triggers a change: the model calculation would be triggered again, but it should now run in a much shorter time, since only one change of an included entity has to be processed. The updated model would be stored again, and the predecessor model can be overwritten.
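
For the solar example, a minimal sketch of the “train once, persist, reload after reboot” idea (scikit-learn is just one possible choice; the feature columns and numbers are made up):

import pickle
from pathlib import Path

from sklearn.linear_model import LinearRegression

MODEL_PATH = Path("expected_solar_output_today.pkl")

# Historical features: [cloud_coverage_percent, hours_until_sunset]
X = [[10, 8.0], [80, 8.0], [10, 3.0], [50, 5.5]]
# Target: energy produced for the rest of that day, in kWh
y = [6.2, 1.4, 2.1, 2.8]

if MODEL_PATH.exists():
    model = pickle.loads(MODEL_PATH.read_bytes())   # survives restarts
else:
    model = LinearRegression().fit(X, y)            # first, slow training run
    MODEL_PATH.write_bytes(pickle.dumps(model))

# Later, when an input entity changes:
prediction = model.predict([[30, 6.0]])[0]
print(f"expected_solar_output_today: {prediction:.1f} kWh")

For truly incremental updates after each input change, an estimator with partial_fit (e.g. SGDRegressor) would fit better than retraining from scratch; the sketch above only covers the persistence part.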

4. Output

The output of such an ML prediction could be a sensor, based on the configuration above, with attributes, e.g. like this:

entity: expected_solar_output_today
state: 2.1
attributes:
  - type: ml-sensor
  - probability: 79 %
  - state_class: measurement
  - device_class: energy
  - unit_of_measurement: kWh
  - friendly_name: "Erwarteter Solarertrag heute"
  - trace:
    algorithm: "The name of the ML algorithm"
    - calculations:
      first_prediction: 2021-08-19T19:57:27+01:00
      last_prediction: 2021-10-19T19:57:27+01:00
      number_of_predictions: 567
    ...:

If one of the input entities triggers a change, the model calculation itself would be triggered and the ml-sensor would be updated as described.

The trace addresses the very important point that @robmarkcole made: “why does it do this?” People will create further automations on top of the ml-sensor, and they need to know something about the quality of the sensor. It makes a difference whether the ml-sensor has 10 input values or 1000, and it is important to know from which time frame the values come.
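
As a sketch of how such an ml-sensor could be published from AppDaemon, including the trace attributes (the entity names, the trigger and the values are illustrative, not a finished integration):

import appdaemon.plugins.hass.hass as hass

class ExpectedSolarOutput(hass.Hass):

    def initialize(self):
        # recalculate whenever an input entity changes
        self.listen_state(self.recalculate, "weather.myaddress")

    def recalculate(self, entity, attribute, old, new, kwargs):
        prediction, probability = 2.1, 79   # would come from the stored model
        self.set_state(
            "sensor.expected_solar_output_today",
            state=round(prediction, 1),
            attributes={
                "type": "ml-sensor",
                "probability": probability,
                "state_class": "measurement",
                "device_class": "energy",
                "unit_of_measurement": "kWh",
                "friendly_name": "Erwarteter Solarertrag heute",
                "trace": {
                    "algorithm": "linear regression",
                    "number_of_predictions": 567,
                },
            },
        )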

5. Naming

The name of such a component could be machine-learning. That is what is being done here. It could be a custom component.

3 Likes

Hi @klacol!

I really thank you so much for the very valuable information you provided.

At the moment, due to force majeure, the project is temporarily shelved, but I would absolutely like to resume it as soon as possible.
I should resolve some personal issues within a couple of weeks and then get back to work.
I’d love it if you joined my GitHub repository to collaborate, even if it’s just with ideas.

Thanks so much!
Daniel

Hello! The “Compensation” integration in HA needs to be more advanced. I would like a button to feed in data from entities, to do some faster (or longer) calibrating, or maybe to have more than pairs of numbers… Case in point: predicting the power usage of a split heat pump so that it can closely match the available solar power, based on factors like interior temperature, exterior temperature, their delta, etc., so that an automation can select the best scenario.
Or, even more awesome, showing the real-time heat pump COP and finding out whether it is better to crank up the heat output now, at a high exterior temperature (and thus a better COP), or to leave it as is, based on a weather forecast or even the rate at which the local temperature is falling.
Update: I used https://stats.blue to get a model (formula) for COP, templated a power*COP formula, and then applied Riemann integration to the result to get the equivalent gas consumption in kWh for the heat pump.
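
For anyone who wants to do the curve fitting locally instead of via stats.blue, here is a small sketch with numpy (the sample points are made up):

import numpy as np

# (outdoor temperature in °C, measured COP)
temps = np.array([-10, -5, 0, 5, 10, 15])
cops  = np.array([2.1, 2.5, 2.9, 3.4, 3.9, 4.5])

coeffs = np.polyfit(temps, cops, deg=2)   # quadratic fit, like a stats.blue model
cop_model = np.poly1d(coeffs)

outdoor_now = 7.0
print(f"estimated COP at {outdoor_now}°C: {cop_model(outdoor_now):.2f}")
# heat output ≈ electrical power * COP; integrating that over time (Riemann sum)
# gives the thermal kWh described above.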

1 Like