This is a great idea @chrisvacc.
My use case is to predict the amount of energy that can be produced by my solar system today, based on the weather situation (sunny, cloudy, rainy) and the remaining time until sunset. I need this amount of energy (in kWh) to decide whether to turn on the charger of the electric car or whether to postpone charging until later (if possible). I have changed the use case from the one that @dadaloop82 has provided; perhaps it helps to see another one.
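To make that decision rule concrete, here is a minimal sketch in plain Python (the function name and the required-energy value are made up; only the predicted kWh corresponds to the sensor described below):

def decide_charging(predicted_kwh, required_kwh):
    """Decide whether to charge the car now, based on the expected solar yield."""
    if predicted_kwh >= required_kwh:
        return "charge_now"        # enough solar energy expected for today
    return "postpone_charging"     # wait, or charge from the grid later

# Example: 2.1 kWh predicted, 6.0 kWh needed for the car -> postpone
print(decide_charging(2.1, 6.0))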
My thoughts on this, after reading the posts carefully:
1. Prediction of sensor states
The prediction of sensor states based on other sensor states is a very good way to start this. I would not reduce it only to switches. The sensor state of a switch is “on” or “off”. It could be possible to predict the state of a sensor with a probability of x percent.
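To illustrate what a prediction “with a probability of x percent” could look like technically, here is a small sketch. It uses scikit-learn purely as an example library (my assumption, the concept does not depend on it), and the features (cloud cover, hours until sunset) and training rows are made up:

from sklearn.linear_model import LogisticRegression

# Made-up history: each row is (cloud cover in %, hours until sunset),
# the label is whether the predicted entity was "on" (1) or "off" (0).
X_train = [[10, 8], [90, 2], [50, 5], [80, 1], [20, 6]]
y_train = [1, 0, 1, 0, 1]

model = LogisticRegression()
model.fit(X_train, y_train)

# Predict the state for the current situation and report the probability.
current = [[30, 4]]                              # 30 % clouds, 4 h until sunset
probability_on = model.predict_proba(current)[0][1]
print(f"predicted 'on' with a probability of {probability_on:.0%}")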
2. Configuration
The configuration of such a component in configuration.yaml could be like this:
machine-learning:
  models:
    - name: "Erwarteter Solarertrag heute"
      unique_id: expected_solar_output_today
      state_class: measurement
      device_class: energy
      unit_of_measurement: kWh
      start_prediction: 2021-05-01 00:00 +02:00
      add_future_values: true
      target_entity: sensor.solar_energy_produced
      input_entities:
        - entity: weather.myaddress
        - entity: sun.next_setting
      include_context: true
    - name: "my second model"
      unique_id: my_second_model
      ...
The context, explained by @balloob, could be the way to relate the states and the events; but that is not clear in my head right now.
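To give an idea of how such a configuration could be validated by the component, here is a rough schema sketch. I am assuming voluptuous here because Home Assistant integrations commonly use it; the schema itself is only my interpretation of the example above:

import voluptuous as vol

# Hypothetical schema for one entry under "models:".
MODEL_SCHEMA = vol.Schema({
    vol.Required("name"): str,
    vol.Required("unique_id"): str,
    vol.Optional("state_class", default="measurement"): str,
    vol.Optional("device_class"): str,
    vol.Optional("unit_of_measurement"): str,
    vol.Optional("start_prediction"): str,            # timestamp, parsed later
    vol.Optional("add_future_values", default=False): bool,
    vol.Required("target_entity"): str,
    vol.Required("input_entities"): [{vol.Required("entity"): str}],
    vol.Optional("include_context", default=False): bool,
})

CONFIG_SCHEMA = vol.Schema({
    vol.Required("machine-learning"): {
        vol.Required("models"): [MODEL_SCHEMA],
    }
})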
3. Execution environment
The execution of such a prediction can be done by the execution environment of https://appdaemon.readthedocs.io/. Perhaps we could ask @Andrew Cockburn what he thinks about this concept. The execution can be fast or slow, depending on several external factors (CPU, time frame, number of input_entities, and so on); so we would expect the results only after the calculations are done (1 minute, 10 minutes, 2 hours). The question is whether we need some feedback on the status of the calculations.
There will be a first execution of the prediction (the calculation of the first model). That would take a while, since the (recorded) history of all included entities since start_prediction has to be processed.
Then, the model has to be stored somewhere persistent, so that it survives reboots.
After that, the existing model would be extended whenever one of the input entities triggers a change: the model calculation would be triggered again, but now it will probably run in a shorter time, since only one change of an included entity has to be processed. The updated model would have to be stored again, and the predecessor model can be overwritten.
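As a rough sketch of that flow in AppDaemon (only an illustration under my assumptions: the entity IDs come from the configuration example above, while the model file path, the train_model / update_model / predict helpers and the use of pickle for persistence are made up):

import pickle
import appdaemon.plugins.hass.hassapi as hass

MODEL_PATH = "/config/ml_models/expected_solar_output_today.pkl"   # hypothetical location

class ExpectedSolarOutput(hass.Hass):

    def initialize(self):
        # Load the persisted model if it exists, otherwise run the long first calculation.
        try:
            with open(MODEL_PATH, "rb") as f:
                self.model = pickle.load(f)
        except FileNotFoundError:
            self.model = self.train_model()
            self.save_model()

        # Re-run the prediction whenever one of the input entities changes.
        for entity in ("weather.myaddress", "sun.next_setting"):
            self.listen_state(self.on_input_change, entity)

    def on_input_change(self, entity, attribute, old, new, kwargs):
        # Incremental update: only the latest change has to be processed.
        self.model = self.update_model(self.model, entity, new)
        self.save_model()                  # overwrite the predecessor model
        predicted_kwh, probability = self.predict()
        self.set_state(
            "sensor.expected_solar_output_today",
            state=round(predicted_kwh, 1),
            attributes={"probability": probability, "unit_of_measurement": "kWh"},
        )

    def save_model(self):
        with open(MODEL_PATH, "wb") as f:
            pickle.dump(self.model, f)

    # The actual ML logic is out of scope here; these are placeholders.
    def train_model(self): ...
    def update_model(self, model, entity, new_state): ...
    def predict(self): ...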
4. Output
The output of such an ML prediction could be a sensor, based on the configuration above, and with attributes, e.g. like this:
entity: expected_solar_output_today
state: 2.1
attributes:
  - type: ml-sensor
  - probability: 79 %
  - state_class: measurement
  - device_class: energy
  - unit_of_measurement: kWh
  - friendly_name: "Erwarteter Solarertrag heute"
  - trace:
      algorithm: "The name of the ML algorithm"
  - calculations:
      first_prediction: 2021-08-19T19:57:27+01:00
      last_prediction: 2021-10-19T19:57:27+01:00
      number_of_predictions: 567
  - ...
If one of the input entities triggers a change, the model calculation itself would be triggered and the ml-sensor would be updated as described.
The trace addresses the very important point that @robmarkcole made: “why does it do this?” People will create further automations based on the ml-sensor, and they need to know something about the quality of the sensor. It makes a difference whether the ml-sensor is based on 10 input values or on 1000. And it is important from which time frame the values come.
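For example, a consuming automation could gate on that information. A minimal AppDaemon sketch, assuming a hypothetical switch.car_charger, a 70 % probability threshold, a 6 kWh requirement, and that the probability attribute is stored as a plain number:

import appdaemon.plugins.hass.hassapi as hass

class SolarCharging(hass.Hass):

    def initialize(self):
        self.listen_state(self.on_prediction, "sensor.expected_solar_output_today")

    def on_prediction(self, entity, attribute, old, new, kwargs):
        probability = self.get_state(entity, attribute="probability")
        # Only act on predictions the model is reasonably confident about.
        if probability is not None and float(probability) >= 70 and float(new) >= 6.0:
            self.call_service("switch/turn_on", entity_id="switch.car_charger")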
5. Naming
The name of such a component could be machine-learning. This is what is done here. It could be a custom component.