Machine learning models for time series forecast

Yes, all of these. I first thought it must be something to set up in the add-on config, or that I should set an entity_id in the predict call, but that didn't work either. Attached is the log for each of those:

2023-07-16 13:21:54,898 - web_server - INFO - Setting up needed data
2023-07-16 13:21:54,904 - web_server - INFO - Retrieve hass get data method initiated...
2023-07-16 13:21:54,925 - web_server - ERROR - The retrieved JSON is empty, check that correct day or variable names are passed
2023-07-16 13:21:54,925 - web_server - ERROR - Either the names of the passed variables are not correct or days_to_retrieve is larger than the recorded history of your sensor (check your recorder settings)
2023-07-16 13:21:54,926 - web_server - ERROR - Exception on /action/forecast-model-predict [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/flask/app.py", line 2190, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.9/dist-packages/flask/app.py", line 1486, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.9/dist-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.9/dist-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/usr/local/lib/python3.9/dist-packages/emhass/web_server.py", line 174, in action_call
    input_data_dict = set_input_data_dict(config_path, str(data_path), costfun,
  File "/usr/local/lib/python3.9/dist-packages/emhass/command_line.py", line 146, in set_input_data_dict
    rh.get_data(days_list, var_list)
  File "/usr/local/lib/python3.9/dist-packages/emhass/retrieve_hass.py", line 147, in get_data
    self.df_final = pd.concat([self.df_final, df_day], axis=0)
UnboundLocalError: local variable 'df_day' referenced before assignment

As mentioned, I changed and checked the recorder settings; they keep 30 days of history.

Ok. To be sure, when you launch the fit method, provide a custom name for the model type and the name of your sensor, for example "model_type": "my_custom_model" and "var_model": "sensor.my_sensor".
Then, when using the predict method, provide exactly the same keys for model_type and var_model.
Since you have already checked that you have enough data on your sensor, the only other possible problem is the name of the sensor itself, so double-check that it is correct.

This finally solved my issues. The following config is now working:

  forecast_ml_fit: "curl -i -H \"Content-Type:application/json\" -X POST -d '{\"days_to_retrieve\": 7, \"var_model\": \"sensor.tanken_grand\",\"model_type\": \"my_custom_model\",\"num_lags\": 48}' http://homeassistant:5000/action/forecast-model-fit"
  forecast_ml_pred: "curl -i -H \"Content-Type:application/json\" -X POST -d '{\"model_predict_publish\": \"True\",\"model_predict_entity_id\": \"sensor.tanken_prediction\",\"model_type\": \"my_custom_model\", \"var_model\": \"sensor.tanken_grand\"}' http://homeassistant:5000/action/forecast-model-predict"

And a follow-up question: everything is working perfectly now, besides the fact that the sensor in Home Assistant only has one value for all timepoints. On the web interface I can see nicely predicted values between 1.7 and 2, but the sensor only holds 2 as the value for every point.

Any clue what I could have done wrong here?

Do you mean that in the attributes of the new forecast sensor all the future values are the same, but in the graph on port 5000 you have continuously varying values?
If this is the case, it is a known issue that was fixed just recently in the core code. I'll publish a new release of the add-on with this fix this week.

Yes, that's exactly the case: the prediction is a constant value for all dates, while the port 5000 website shows continuous values. Looking forward to your release. This is really awesome and I was eagerly awaiting it for a long time. Also a big thanks for your help in getting everything running =)

Sure don’t worry :+1:
The power of this is that you can use it as many times as you want and with as many sensors as you want, as long as you have enough data. Then build your automations based on the forecasts.
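
For illustration, fitting two independent models on two different sensors could look something like this (the sensor and model names here are hypothetical; the endpoint and parameters are the same ones used elsewhere in this thread):

curl -i -H "Content-Type:application/json" -X POST -d '{"days_to_retrieve": 10, "var_model": "sensor.house_load_power", "model_type": "load_model", "num_lags": 48}' http://homeassistant:5000/action/forecast-model-fit
curl -i -H "Content-Type:application/json" -X POST -d '{"days_to_retrieve": 10, "var_model": "sensor.water_consumption", "model_type": "water_model", "num_lags": 48}' http://homeassistant:5000/action/forecast-model-fit

Each model is then predicted with its own matching model_type/var_model pair, exactly as in the fit/predict examples earlier in the thread.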

Done! Just released the new patched version; it should be available soon (~45 mins).

@davidusb I have been running mlforecaster successfully over the last few weeks, but started having issues in MPC optimization due to array sizes being different and not working well with the prediction horizon. I started investigating and published the predict results from the mlforecaster model… it only publishes 12 hours. Is this normal, or how can I push the predict function to output a longer prediction, for example the next 48 hours?

2023-08-14 20:29:50,274 - web_server - INFO - >> Performing a machine learning forecast model predict…
2023-08-14 20:29:50,309 - web_server - INFO - Successfully posted to sensor.p_load_forecast_custom_model = 1720.83

unit_of_measurement: W
friendly_name: Load Power Forecast custom ML model
scheduled_forecast:

  - date: '2023-08-14 19:00:00+00:00'
    p_load_forecast_custom_model: '1720.47'
  - date: '2023-08-14 20:00:00+00:00'
    p_load_forecast_custom_model: '1815.17'
  - date: '2023-08-14 21:00:00+00:00'
    p_load_forecast_custom_model: '1732.66'
  - date: '2023-08-14 22:00:00+00:00'
    p_load_forecast_custom_model: '1807.06'
  - date: '2023-08-14 23:00:00+00:00'
    p_load_forecast_custom_model: '1587.01'
  - date: '2023-08-15 00:00:00+00:00'
    p_load_forecast_custom_model: '1681.45'
  - date: '2023-08-15 01:00:00+00:00'
    p_load_forecast_custom_model: '1585.14'
  - date: '2023-08-15 02:00:00+00:00'
    p_load_forecast_custom_model: '1586.99'
  - date: '2023-08-15 03:00:00+00:00'
    p_load_forecast_custom_model: '1482.97'
  - date: '2023-08-15 04:00:00+00:00'
    p_load_forecast_custom_model: '1425.45'
  - date: '2023-08-15 05:00:00+00:00'
    p_load_forecast_custom_model: '1529.29'
  - date: '2023-08-15 06:00:00+00:00'
    p_load_forecast_custom_model: '1406.04'

Started playing with this add-on for my water sensor (I want to compare actual consumption with forecasted data and trigger a notification if there's a big deviation). First suggestion to the community: don't run forecast-model-tune on a Pi 4, as it will bring it down :sweat_smile:

Thanks for the feedback. There was a warning about this, but I guess it is worth trying.
For reference, with how much history data did that computation crash?
I don't see why it can't work for short, and hence small, data sets.

Indeed, I read the warning, but since I'm waiting for the Pi 5 to be released, I decided to give it a try.
I ran it for a sensor with 7 days of history.

I may have run it wrongly, as I passed the same parameters to model-tune as I previously passed to model-fit:
{"days_to_retrieve":7,"var_model":"sensor.agua_streamlabs","model_type":"load_forecast","sklearn_model":"KNeighborsRegressor","num_lags":48,"split_date_delta":"48h","perform_backtest":"False"}

So I will give it another try, passing only the sensor to model-tune after running model-fit:
{"var_model":"sensor.agua_streamlabs"}

@davidusb I just did as explained above, and I do get a (big) array of data (almost immediately after sending the POST command):

But the logs said differently:
2023-10-19 09:40:54,103 - web_server - ERROR - Either the names of the passed variables are not correct or days_to_retrieve is larger than the recorded history of your sensor (check your recorder settings)

I then added a new parameter in the request:
"days_to_retrieve": 7

… as I guessed the model was otherwise using what I had in the EMHASS configuration (30 days of history), and got results:

[screenshot]
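
For reference, the full tune request that produced these results would look something like this in curl form (same host and port as the working config earlier in the thread; the payload combines the var_model from the first attempt with the added days_to_retrieve):

curl -i -H "Content-Type:application/json" -X POST -d '{"days_to_retrieve": 7, "var_model": "sensor.agua_streamlabs"}' http://homeassistant:5000/action/forecast-model-tune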

These are the EMHASS logs now:

And optimized model:

For reference, yesterday (with the request from my post above) this was the Pi 4 processor usage (you can see it increasing and then dropping when it shut down), and today's:
[screenshot]

Ok great!
That seems like a good improvement for that model.
Your graph is not showing processor usage but memory use instead…

Sorry, Pi 4 memory, you're right.

I've set up the following to compare actual water consumption with the predicted values:

  1. Configured the following flows in Node-RED (still running manually and with debug nodes; later I will automate this and delete the debug nodes):

Code here:
[{"id":"0cfd43a5b58423a3","type":"inject","z":"d487ddbb6fe9ee48","name":"","props":[{"p":"payload"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"{\"days_to_retrieve\":7,\"var_model\":\"sensor.agua_streamlabs\",\"model_type\":\"my_custom_model\",\"sklearn_model\":\"KNeighborsRegressor\",\"num_lags\":48,\"split_date_delta\":\"48h\",\"perform_backtest\":\"False\"}","payloadType":"json","x":230,"y":180,"wires":[["20f58003f042c8f1"]]},{"id":"20f58003f042c8f1","type":"http request","z":"d487ddbb6fe9ee48","name":"Forecast Model Fit","method":"POST","ret":"bin","paytoqs":"ignore","url":"http://localhost:5000/action/forecast-model-fit","tls":"","persist":false,"proxy":"","insecureHTTPParser":false,"authType":"","senderr":false,"headers":[],"x":450,"y":180,"wires":[["68cb84c07b5a0361"]]},{"id":"68cb84c07b5a0361","type":"debug","z":"d487ddbb6fe9ee48","name":"debug 25","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","statusVal":"","statusType":"auto","x":700,"y":180,"wires":[]},{"id":"d55ee4456bcf294e","type":"inject","z":"d487ddbb6fe9ee48","name":"","props":[{"p":"payload"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"{\"days_to_retrieve\":7,\"var_model\":\"sensor.agua_streamlabs\"}","payloadType":"json","x":230,"y":300,"wires":[["a3ba44439ed1c0da"]]},{"id":"a3ba44439ed1c0da","type":"http request","z":"d487ddbb6fe9ee48","name":"Forecast Model Tune","method":"POST","ret":"bin","paytoqs":"ignore","url":"http://localhost:5000/action/forecast-model-tune","tls":"","persist":false,"proxy":"","insecureHTTPParser":false,"authType":"","senderr":false,"headers":[],"x":460,"y":300,"wires":[["d725e885e47387d2"]]},{"id":"d725e885e47387d2","type":"debug","z":"d487ddbb6fe9ee48","name":"debug 27","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","statusVal":"","statusType":"auto","x":700,"y":300,"wires":[]},{"id":"36076dcd02bce423","type":"inject","z":"d487ddbb6fe9ee48","name":"","props":[{"p":"payload"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"{\"model_predict_publish\":\"True\",\"model_predict_entity_id\":\"sensor.agua_streamlabswater_prediction_emhass\",\"var_model\":\"sensor.agua_streamlabs\",\"model_type\":\"my_custom_model\",\"days_to_retrieve\":7}","payloadType":"json","x":230,"y":420,"wires":[["6753f2abee4df3e2"]]},{"id":"6753f2abee4df3e2","type":"http request","z":"d487ddbb6fe9ee48","name":"Forecast Model Predict","method":"POST","ret":"bin","paytoqs":"ignore","url":"http://homeassistant:5000/action/forecast-model-predict","tls":"","persist":false,"proxy":"","insecureHTTPParser":false,"authType":"","senderr":false,"headers":[],"x":470,"y":420,"wires":[["820856ac20763445"]]},{"id":"820856ac20763445","type":"debug","z":"d487ddbb6fe9ee48","name":"debug 28","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","statusVal":"","statusType":"auto","x":700,"y":420,"wires":[]},{"id":"3ac6d85e1ed0dadc","type":"comment","z":"d487ddbb6fe9ee48","name":"----- Machine Learning Forecast Água Streamlabswater ----","info":"","x":380,"y":80,"wires":[]},{"id":"1c593b38d1ec0858","type":"comment","z":"d487ddbb6fe9ee48","name":"Passo 1: Executar o Model Fit","info":"","x":280,"y":140,"wires":[]},{"id":"b9207e450e10f6c6","type":"comment","z":"d487ddbb6fe9ee48","name":"Passo 2: Executar o Model Tune (optimização)","info":"","x":330,"y":260,"wires":[]},{"id":"95f6dcd21fe773ab","type":"comment","z":"d487ddbb6fe9ee48","name":"Passo 3: 
Publicar resultados do Model Tune no sensor de previsão","info":"","x":400,"y":380,"wires":[]}]

  2. Created a new template sensor, since the one created by Step 3 (Passo 3) returns "Power" as state_class and "W" as unit of measurement (not sure why):
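
The template sensor itself isn't shown in the post; a minimal sketch of what it could look like (the source entity comes from the Node-RED predict node above, the new name is chosen so that the resulting entity matches the one used in the graph below, and the unit is an assumption since the water sensor's real unit isn't stated in the thread):

template:
  - sensor:
      - name: "Agua Streamlabswater Forecast Model Prediction EMHASS"
        unit_of_measurement: "L"  # assumption: replace with the actual unit of sensor.agua_streamlabs
        state: "{{ states('sensor.agua_streamlabswater_prediction_emhass') }}"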

  3. Created a graph where I will be comparing actuals vs. predictions:
    [screenshot]

Code here:

type: custom:apexcharts-card
graph_span: 2d
span:
  start: day
  offset: '-24h'
now:
  show: true
  label: now
header:
  show: false
  title: Consumo Água (real vs previsto)
  show_states: true
  colorize_states: true
series:
  - entity: sensor.agua_streamlabswater_forecast_model_prediction_emhass
    name: Prediction
  - entity: sensor.agua_streamlabs
    name: Actuals

I was looking for something like this to predict family members' presence in the house, in order to adapt the heating schedule without the hassle of managing different and changing schedules.

Can it be used with a presence sensor with states like home/not_home, or should I rather create a sensor with the number of people present at a given time?

Hi, it will only work with numerical values.
You can use it with a sensor that holds the number of people present.
I'm interested to see your results with this.
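
A minimal sketch of such a people-counting sensor, as a Home Assistant template sensor (the name is illustrative):

template:
  - sensor:
      - name: "People at home"
        state: "{{ states.person | selectattr('state', 'eq', 'home') | list | count }}"

That numeric sensor can then be passed as var_model to the fit and predict calls shown earlier in the thread.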

My understanding of this forecast model is that it only takes a single input over time into account and then forecasts based on that input? Is it possible to have multiple inputs and forecast one output? For example, I'd like to forecast my AC usage based on outside temperature, set point, and time. Is this use case supported already?

This should be possible with the ML regressor class. You will need to set up an automation to store your data in a CSV file, then use the ML regressor. See the documentation for this.
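
A rough sketch of what such a data-logging setup could look like, using a file notifier to append one CSV row per sample (the entity names, file path, and interval below are illustrative, not from the thread; the regressor call itself is described in the EMHASS documentation):

notify:
  - platform: file
    name: ac_csv
    filename: /config/ac_usage.csv
    timestamp: false

automation:
  - alias: "Log AC data for the ML regressor"
    trigger:
      - platform: time_pattern
        minutes: "/30"  # one row every 30 minutes
    action:
      - service: notify.ac_csv
        data:
          message: >-
            {{ now().isoformat() }},{{ states('sensor.outside_temperature') }},{{ state_attr('climate.living_room', 'temperature') }},{{ states('sensor.ac_power') }}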