Machine learning models for time series forecast

davidusb · March 11, 2023, 8:15am

Some time ago I developed an add-on called EMHASS that can be used to optimize your home energy bill by providing an optimal schedule for the controllable loads in your home. For this optimization to work, some forecasts are needed. Among those is the home load power consumption. To improve the precision of the optimization I developed a new class within the EMHASS module that uses a machine learning model to forecast the load consumption time series.

This works well and I’m very satisfied with the results.
But I realized that this module can actually be used to forecast any Home Assistant sensor, as long as its time series is stationary and you have enough data to train the ML model.

This is a quick procedure to get started and to use the new machine learning feature:

1.- Install the EMHASS add-on by using the repository: GitHub - davidusb-geek/emhass-add-on: The Home Assistant Add-on for EMHASS: Energy Management Optimization for Home Assistant

2.- Open the add-on webui and fit the default model with the default parameters by using the buttons there:

3.- After clicking on “ML forecast model fit” you can check the add-on logs to see the fit metrics results. In my case the logs show:

And refreshing the webui now shows the graph with the results:

Zooming in:

This used the default parameters which are these:

runtimeparams = {
    "days_to_retrieve": 30,
    "model_type": "load_forecast",
    "var_model": "sensor.power_load_no_var_loads",
    "sklearn_model": "KNeighborsRegressor",
    "num_lags": 48,
    "split_date_delta": '48h',
    "perform_backtest": False
}

As we can see, this uses a default sensor name that suits the energy optimization purpose of the add-on. But in fact you can change any of these parameters by passing a data dictionary in the curl call.
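As a minimal sketch of what such a data dictionary can look like, the snippet below compares user-supplied values against the defaults listed above and serializes only the changed keys as a JSON body. The helper name `build_payload` is hypothetical and not part of EMHASS; it only covers the fit parameters shown in this post.

```python
import json

# Defaults copied from the runtimeparams listing in this post.
DEFAULT_RUNTIMEPARAMS = {
    "days_to_retrieve": 30,
    "model_type": "load_forecast",
    "var_model": "sensor.power_load_no_var_loads",
    "sklearn_model": "KNeighborsRegressor",
    "num_lags": 48,
    "split_date_delta": "48h",
    "perform_backtest": False,
}


def build_payload(**overrides):
    """Return a JSON body holding only the keys that differ from the defaults.

    Hypothetical helper: it rejects names that are not in the listing above,
    which mainly helps to catch typos in parameter names.
    """
    unknown = set(overrides) - set(DEFAULT_RUNTIMEPARAMS)
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    changed = {
        key: value
        for key, value in overrides.items()
        if DEFAULT_RUNTIMEPARAMS[key] != value
    }
    return json.dumps(changed)


payload = build_payload(days_to_retrieve=150, var_model="sensor.home_temperature")
print(payload)
# → {"days_to_retrieve": 150, "var_model": "sensor.home_temperature"}
```

The printed string is the kind of body that goes after `-d` in the curl call.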

For example we can change to another sensor and extend the quantity of data that will be used to train the ML model. Like this:

curl -i -H "Content-Type:application/json" -X POST -d '{"days_to_retrieve": 150, "var_model": "sensor.home_temperature"}' http://localhost:5000/action/forecast-model-fit

This will fetch 150 days of data for that sensor from Home Assistant. Try to provide as much data as you can. Check and change your recorder settings if you don’t have enough data. The recorder retains only 10 days of data by default (the purge_keep_days setting).
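If you prefer Python over curl, the same request can be sketched with the standard library. This is only an illustration: the base URL is taken from the curl example above, the helper name `make_action_request` is hypothetical, and the actual send is left commented out because it needs a running EMHASS instance.

```python
import json
import urllib.request

# Assumed base URL, taken from the curl example; adjust host and port to your setup.
EMHASS_URL = "http://localhost:5000"


def make_action_request(action, params):
    """Build (but do not send) a JSON POST request for an EMHASS action endpoint."""
    return urllib.request.Request(
        url=f"{EMHASS_URL}/action/{action}",
        data=json.dumps(params).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = make_action_request(
    "forecast-model-fit",
    {"days_to_retrieve": 150, "var_model": "sensor.home_temperature"},
)
print(request.get_method(), request.full_url)

# To actually send it (requires a reachable EMHASS instance):
# with urllib.request.urlopen(request, timeout=600) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Fitting on several months of data can take a while, so a generous timeout is a reasonable choice when sending the request.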

4.- The “ML forecast model tune” button can be used to launch a hyperparameter optimization using Bayesian optimization. Check the logs again; in my case they show:

We get a better metric, and backtesting was used to validate it.
Refreshing the webui will give us the graphic results after this model tuning:

5.- You can automate all this by defining the curl commands using the shell_command service and some custom automations in Home Assistant.

The add-on exposes these endpoints for ML forecasting:

forecast-model-fit: to fit the ML model using a train/test split and backtesting.
forecast-model-predict: to obtain predictions with the previously trained model. There is an option to publish the forecasted series to a new Home Assistant sensor.
forecast-model-tune: to tune the model hyperparameters using Bayesian optimization.

Check the complete documentation of this feature here: The machine learning forecaster — emhass 0.4.8 documentation

Refit the model often (once a week?) to keep up with any changes in your consumption dynamics. Be careful with the tuning routine, which can be computationally intensive for RPi devices. Limit the days_to_retrieve parameter if the optimization takes too long.

Cao_Hoa · March 13, 2023, 1:50am

Hello. I really like your project, but I’m using Docker Compose. Can you provide a Docker Compose file for use with Home Assistant Core?

davidusb · March 13, 2023, 5:53am

Hi thanks.
Read the installation instructions. You can pull a standalone Docker image or even build one yourself. The docker command to launch the app is also given there, so there is no need for Docker Compose. I guess you could put all that inside a Docker Compose file if you want, but the result would be completely equivalent.

Cao_Hoa · March 13, 2023, 7:31am

Can you be more specific and write a sample for me? I’m not very familiar with Docker.

davidusb · March 13, 2023, 7:40am

It is already written in the documentation here: Intro / Quick start — emhass 0.4.4 documentation

Installation Method 2

Cao_Hoa · March 13, 2023, 7:47am

Thank you very much. I will write again if I need your help.

ydestord · March 21, 2023, 3:01pm

Hello,

I have been using EMHASS for a while now, running the day-ahead optimization for some scenarios, and it works fine. I use the add-on on a HassOS-based install, so the configuration variables exposed through the add-on web UI have been enough so far. When I started exploring the ML forecasting, I ran into an issue when triggering the ML fit from the EMHASS web UI:

2023-03-21 15:49:50,269 - web_server - INFO - Setting up needed data
2023-03-21 15:49:50,273 - web_server - INFO - Retrieve hass get data method initiated...
2023-03-21 15:49:50,288 - web_server - ERROR - The retrieved JSON is empty, check that correct day or variable names are passed
2023-03-21 15:49:50,288 - web_server - ERROR - Either the names of the passed variables are not correct or days_to_retrieve is larger than the recorded history of your sensor (check your recorder settings)
2023-03-21 15:49:50,289 - web_server - ERROR - Exception on /action/forecast-model-fit [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/flask/app.py", line 2528, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.9/dist-packages/flask/app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.9/dist-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.9/dist-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/usr/local/lib/python3.9/dist-packages/emhass/web_server.py", line 170, in action_call
    input_data_dict = set_input_data_dict(config_path, str(data_path), costfun,
  File "/usr/local/lib/python3.9/dist-packages/emhass/command_line.py", line 146, in set_input_data_dict
    rh.get_data(days_list, var_list)
  File "/usr/local/lib/python3.9/dist-packages/emhass/retrieve_hass.py", line 147, in get_data
    self.df_final = pd.concat([self.df_final, df_day], axis=0)
UnboundLocalError: local variable 'df_day' referenced before assignment

Reading the "The machine learning forecaster" page of the emhass 0.4.8 documentation, I see there are now some extra configuration values that are not exposed through the web UI, so I guess I need a YAML-based configuration file with these additional statements. It looks like that YAML file is not present in my Home Assistant config directory. I also tried connecting directly to the EMHASS container, but I can’t find where the YAML config file is located so that I can update it.

Regards

Yves

davidusb · March 21, 2023, 7:22pm

The problem that you have is most probably caused by your Home Assistant recorder settings. By default the recorder keeps only 10 days of history data. Here we are trying to fetch 30 days of data to train the model, so it fails because you don’t have the data. See the Home Assistant recorder documentation to change the settings and keep a longer history. In the recorder you can also specify which sensors are recorded. Use that option to keep your database from growing too much.

Those extra parameters are not passed in a YAML file but directly in the curl command, which you can define using the shell command integration of Home Assistant. See the link that you shared from the documentation; there are some examples that do exactly that.
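The UnboundLocalError at the end of the traceback is consistent with the "retrieved JSON is empty" messages logged just before it, whichever of the two logged causes applies. The sketch below is a hypothetical simplification of a per-day retrieval loop, not the actual EMHASS source: a variable that is assigned only when a response contains data is still undefined if the very first response is empty.

```python
def concat_days(responses):
    """Simplified stand-in for a per-day retrieval loop (not the EMHASS source)."""
    collected = []
    for response in responses:
        if response:  # df_day is assigned only when the response has data
            df_day = list(response)
        # If the first response is empty, df_day has never been assigned here.
        collected.extend(df_day)
    return collected


print(concat_days([[1, 2], [3]]))  # history available for every day

try:
    concat_days([[]])  # empty history: wrong sensor name or not enough recorded days
except UnboundLocalError as err:
    print(f"UnboundLocalError: {err}")
```

In other words, the Python exception is a side effect; the useful information is in the two ERROR lines about the empty JSON.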

ydestord · March 21, 2023, 8:48pm

Thank you @davidusb I will have a look but was under the impression that due to my recorder setting

recorder:
  purge_keep_days: 45

I was OK on this one. Which exact sensor is being used as input?

KR

Yves

davidusb · March 21, 2023, 9:37pm

Take a look at the docs.

These are the default parameters:

runtimeparams = {
    "days_to_retrieve": 30,
    "model_type": "load_forecast",
    "var_model": "sensor.power_load_no_var_loads",
    "sklearn_model": "KNeighborsRegressor",
    "num_lags": 48,
    "split_date_delta": '48h',
    "perform_backtest": False
}

But you can change any of those parameters to whatever you want during the curl call like this:

curl -i -H "Content-Type:application/json" -X POST -d '{"var_model": "sensor.my_own_sensor"}' http://localhost:5000/action/forecast-model-fit

alexandrechoske · March 29, 2023, 1:34pm

This is a very cool solution, I will give it a try later.

I was wondering if it is possible to do a similar thing, but with linear regression. For example:

I would like to know the correlation between how many times I open the windows and the temperature.
Also, this is very cool \õ/

davidusb · March 29, 2023, 9:44pm

The sklearn model can be changed to LinearRegression. However, additional covariates that are not known in the future are not supported here, because the goal is to build a real forecast. What you are describing is a simple regression for data analysis. You can certainly do that outside of a forecasting context.
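To make the distinction concrete, here is a minimal sketch, in plain Python, of forecasting from past values only: every training row is built from earlier samples of the same series, so nothing that is unknown in the future is needed. The function name `make_lagged_dataset` and the sample values are hypothetical; this illustrates the general lag-feature idea behind a parameter like num_lags and is not EMHASS's actual implementation.

```python
def make_lagged_dataset(series, num_lags):
    """Turn a 1D series into (X, y), where each row of X holds the num_lags
    values that immediately precede the matching target in y."""
    if len(series) <= num_lags:
        raise ValueError("Need more samples than lags")
    X = [series[i - num_lags:i] for i in range(num_lags, len(series))]
    y = series[num_lags:]
    return X, y


temperatures = [20.0, 20.5, 21.0, 21.5, 22.0, 22.5]
X, y = make_lagged_dataset(temperatures, num_lags=3)
print(X[0], "->", y[0])
# → [20.0, 20.5, 21.0] -> 21.5
```

A regressor trained on such pairs can then be fed its own latest values to predict the next step, which is why a covariate such as "how many times the window was opened" would first need its own forecast.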

alexandrechoske · April 8, 2023, 9:01pm

David, is the curl command the only way to change the sensor, or can it also be done via YAML in the add-on’s config?
I changed the sensor there and got this error:
ValueError: Expected a 1D array, got an array with shape (101, 2)

davidusb · April 9, 2023, 8:03am

The most generic way is by providing the sensor name in the curl command.

Use this:

curl -i -H "Content-Type:application/json" -X POST -d '{"var_model": "sensor.my_own_sensor"}' http://localhost:5000/action/forecast-model-fit

alexandrechoske · April 9, 2023, 4:24pm

it is in the HomeAssistant terminal, right?

I got this error.

davidusb · April 9, 2023, 7:35pm

No. That won’t work.

Just define a shell command in your Home Assistant yaml configuration file. See here: Shell Command - Home Assistant
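Getting the nested quotes right in that YAML entry is the fiddly part. As a rough sketch, the snippet below generates one shell_command line from a parameter dictionary. The entry name `forecast_ml_fit` and the helper `shell_command_entry` are hypothetical, and the host and port are the ones used in the curl examples in this thread; paste the output under `shell_command:` in your configuration and adapt it to your setup.

```python
import json
import shlex


def shell_command_entry(name, action, params, host="localhost", port=5000):
    """Return one 'name: "curl ..."' line for the shell_command section."""
    body = shlex.quote(json.dumps(params))  # JSON body wrapped in single quotes
    command = (
        'curl -i -H "Content-Type:application/json" -X POST '
        f"-d {body} http://{host}:{port}/action/{action}"
    )
    # A JSON string literal is also a valid YAML double-quoted scalar,
    # so json.dumps takes care of escaping the inner double quotes.
    return f"  {name}: {json.dumps(command)}"


entry = shell_command_entry(
    "forecast_ml_fit",
    "forecast-model-fit",
    {"var_model": "sensor.my_own_sensor"},
)
print("shell_command:")
print(entry)
```

The generated command can then be called like any other shell command from an automation.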

Flipso · July 10, 2023, 5:34pm

Hey there,
I just stumbled over your add-on and it really is awesome, something I have been searching for for a long time. I can easily imagine multiple use cases for this, and my aim is to get estimates of good days for refilling my car’s gasoline.

Up until now I have the shell command with “forecast-model-fit” running, but when I try to publish the data afterwards the console throws an error saying that an optimization has to be run. Is there a way to set all of this up in shell commands so that I can run a prediction and publish it to a Home Assistant sensor, for any source sensor? It would also be great to set my sensor in the base config, so the add-on just runs to predict my gas prices. On top of that, it would be awesome to have a routine that pushes the predictions for multiple source sensors to their corresponding prediction sensors.
As far as I was able to follow, I have to call a shell command with “forecast-model-fit”, then somehow run a “dayahead-optim”, and finally push the result to a sensor with “publish-data”, but I can’t figure out how to call any of these except the “forecast-model-fit” shell command.

Bests,

davidusb · July 13, 2023, 11:55am

Hi.
If you just want to use the machine learning part of the add-on then you don’t need to worry about optimization and publish commands.
Just use forecast-model-fit and then forecast-model-predict. This second command will publish the prediction to a Home Assistant sensor when the publish option is enabled.
Check the documentation here: The machine learning forecaster — emhass 0.4.13 documentation
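A rough sketch of that two-step sequence is shown below. The parameter names `model_predict_publish` and `model_predict_entity_id` are the ones that appear in the predict commands later in this thread; the sensor names and the helper `fit_then_predict_steps` are hypothetical. Repeating `var_model` in the predict step is an assumption (the idea being that the predict call also has to fetch recent history for the same sensor), so check the documentation linked above to confirm.

```python
import json

BASE_URL = "http://localhost:5000"  # assumed; use your EMHASS host and port


def fit_then_predict_steps(source_sensor, target_sensor, days_to_retrieve=30):
    """Return the two (endpoint, parameters) steps in the order they must run."""
    fit_params = {
        "var_model": source_sensor,
        "days_to_retrieve": days_to_retrieve,
    }
    predict_params = {
        "var_model": source_sensor,  # assumption: same sensor as in the fit step
        "model_predict_publish": "True",
        "model_predict_entity_id": target_sensor,
    }
    return [
        ("forecast-model-fit", fit_params),
        ("forecast-model-predict", predict_params),
    ]


steps = fit_then_predict_steps("sensor.my_own_sensor", "sensor.my_own_sensor_forecast")
for endpoint, params in steps:
    print(f"POST {BASE_URL}/action/{endpoint} {json.dumps(params)}")
```

No optimization or publish-data call appears in this sequence.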

Flipso · July 16, 2023, 10:18am

hey there,
Thanks for pointing me to the documentation. I tried to get it running with the fit and predict commands.

First i use

curl -i -H \"Content-Type:application/json\" -X POST -d '{\"days_to_retrieve\": 7, \"var_model\": \"sensor.tanken_grand\",\"num_lags\": 48}' http://homeassistant:5000/action/forecast-model-fit

which nicely reports that the model was fitted, along with the R2 value of the regression. This is also reflected on the web page, which shows me a beautiful graph. I also checked the recorder settings and have values for 30 days; I changed the days_to_retrieve parameter in the curl command and tested with both 7 and 30 days. Afterwards I tried these three predict commands, but they all throw errors mentioning “df_day”.

  forecast_ml_pred_load: "curl -i -H \"Content-Type:application/json\" -X POST -d '{\"model_type\": \"load_forecast\"}' http://homeassistant:5000/action/forecast-model-predict"
  forecast_ml_pred: "curl -i -H \"Content-Type:application/json\" -X POST -d '{\"model_predict_publish\": \"True\",\"model_predict_entity_id\": \"sensor.tanken_prediction\"}' http://homeassistant:5000/action/forecast-model-predict"
  forecast_ml_pred_empty: "curl -i -H \"Content-Type:application/json\" -X POST -d '{}' http://homeassistant:5000/action/forecast-model-predict"

I’m sorry for really not getting the point here. Nobody in the issues on GitHub seems to be getting this wrong either, so I’m really lost at this point.

davidusb · July 16, 2023, 3:12pm

All these three commands are failing?
Can you please post the logger errors?