I have just published a first usable version of an integration that uses machine learning to predict when entities should be turned on or off.
To use it, you define a target entity (which can currently only be a binary entity, i.e., a light, switch, or input_boolean) and a bunch of feature entities (ideally ones whose states are correlated with the target entity). The integration then grabs past combinations of these entity states from the database and begins observing future changes of these entities.
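To make the "feature entities" idea a bit more concrete, here is a minimal sketch of how a snapshot of entity states could be turned into a numeric feature vector. This is not the integration's actual code: the `ON_STATES` set, the function names, and the entity IDs in the usage note are all hypothetical, and the real encoding may differ.

```python
import numpy as np

# Hypothetical set of state strings treated as "on"; an assumption for illustration.
ON_STATES = {"on", "home", "open", "true"}

def encode_state(state: str) -> float:
    """Turn a state string into a number a model can consume."""
    s = state.strip().lower()
    if s in ON_STATES:
        return 1.0
    try:
        return float(s)   # numeric sensors such as temperature or illuminance
    except ValueError:
        return 0.0        # "off", "unavailable", "unknown", ...

def encode_snapshot(snapshot: dict[str, str], feature_ids: list[str]) -> np.ndarray:
    """Build one feature vector, in a fixed entity order, from a state snapshot."""
    return np.array([encode_state(snapshot.get(eid, "unknown")) for eid in feature_ids])
```

For example, `encode_snapshot({"binary_sensor.motion": "on", "sensor.lux": "12.5"}, ["binary_sensor.motion", "sensor.lux"])` would give one row of the training data, and the target entity's state at the same moment would be the label.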
The integration works in two modes: training and production. In training mode, the integration records all state changes (target and feature entities) in an additional table (currently a CSV file, which is likely not the best idea in the long run). Once you have more than 10 observations, you can start training a logistic regression model to predict whether the target entity should be on or off. As long as you’re in training mode, you don’t get actual predictions, but performance scores (as a sensor entity, with accuracy as the state and precision/recall/F1 score as attributes).
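For anyone curious what the training step and the scores look like in principle, here is a minimal numpy sketch of logistic regression with batch gradient descent, plus the four scores mentioned above. It is an illustration under my own assumptions (learning rate, epoch count, 0.5 threshold, function names), not the implementation shipped in the integration.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow in exp for very large |z|.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)           # predicted probability of "on"
        w -= lr * (X.T @ (p - y)) / n    # gradient w.r.t. the weights
        b -= lr * np.mean(p - y)         # gradient w.r.t. the bias
    return w, b

def predict(X, w, b, threshold=0.5):
    return (sigmoid(X @ w + b) >= threshold).astype(int)

def scores(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = on)."""
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": float(np.mean(y_pred == y_true)),
            "precision": precision, "recall": recall, "f1": f1}

# Toy data: the target simply follows the first feature.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 0, 1, 1])
w, b = fit_logistic(X, y)
result = scores(y, predict(X, w, b))
```

Note that this toy example scores the model on the same rows it was trained on. With real data you would want to hold some observations back for evaluation, otherwise the scores will look better than they are.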
If these scores are high enough for your liking, you can switch to production mode. Once you have trained a model in this mode, you’ll get a sensor entity that shows the predicted state. If you want to actually switch the target entity based on this, you can create an automation (which could also include additional guardrails to prevent lights turning on in the middle of the night).
Caveats / Future Work
- At the moment, the integration only supports binary entities as targets.
- The collected data is stored as a CSV file, and there is no built-in way to control its size. It’s on the list to do something about it.
- Logistic regression is the only machine learning method currently integrated. The original plan was to use scikit-learn for the machine learning part, but it seems to be impossible to get it installed. That’s why the integration contains a numpy-based implementation of logistic regression. It can probably be optimized further, but my experiments so far (on a Raspberry Pi CM4) suggest that compute power is not a problem at the moment.
- I hope to integrate a decision tree implementation at some point (because of its transparency).
- GitHub Copilot and Anthropic’s Claude have been involved in the implementation.
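On the CSV size caveat above: one simple option would be to periodically trim the file to the most recent observations. The sketch below is only one possible approach, not something the integration does today. It assumes the file has a header row, and `trim_csv` and `max_rows` are hypothetical names.

```python
import csv
from collections import deque
from pathlib import Path

def trim_csv(path, max_rows):
    """Keep the header and the last `max_rows` data rows; returns rows kept."""
    path = Path(path)
    with path.open(newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)           # assumes the first row is a header
        tail = deque(reader, maxlen=max_rows)  # only the newest rows survive
    if header is None:
        return 0                               # empty file, nothing to do
    # Write to a temporary file first, then swap it in place of the original.
    tmp = path.with_suffix(path.suffix + ".tmp")
    with tmp.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(tail)
    tmp.replace(path)
    return len(tail)
```

Dropping the oldest rows also means the model gradually forgets old habits, which may or may not be what you want.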
