[Release] HA Energy Forecast — 48-hour household electricity forecast using your own data (early beta)

Hi all,
I’ve been running this for a few weeks and it seems stable enough to share — but treat it as early beta. There are likely rough edges, and I’d appreciate feedback before calling it production-ready.

A couple of honest caveats upfront: this app was predominantly written with AI assistance (Claude), and development time is very limited. I won’t be monitoring this thread consistently — if you have bug reports, feature requests, or extended feedback, please use GitHub Discussions rather than this thread, as that’s where I’m more likely to see it. Issues and feature requests will be addressed as time permits.

What it does
Trains a LightGBM model on your historical grid-import energy data and local weather, then publishes a 48-hour hourly forecast as native HA sensor entities — updated every hour, retrained weekly.

Sensors published

  • sensor.energy_forecast_today / _tomorrow — daily totals
  • sensor.energy_forecast_today_06_09 etc. — 3-hour block forecasts (×8 per day)
  • sensor.energy_forecast_next_3h — rolling 3-hour ahead
  • sensor.energy_forecast_model_mae — model accuracy diagnostic
  • EV charging detected kWh (today / yesterday)

Notable features

  • No cloud service required — runs entirely local inside AppDaemon
  • Automatic fallback to scikit-learn GBR if LightGBM won’t build on your platform (armv7, etc.)
  • Swiss weather via SRG-SSR API (optional) or Open-Meteo (free, no key)
  • EV charging detection — subtracts charger load before training so EV sessions don’t skew the household baseline
  • Persistent CSV history cache — survives HA database purges
  • Backfill tool to seed the model from years of existing HA recorder data

Requirements

  • AppDaemon 4.x add-on
  • A cumulative grid-import sensor (state_class: total_increasing, kWh)
  • A few weeks of history (backfill tool helps if you’re starting cold)

Repo: m-zenker/ha-energy-forecast on GitHub. An AppDaemon app for Home Assistant that forecasts household electricity consumption for the next 48 hours using a machine-learning model trained on your own historical grid-import data and local weather.

v0.3.0 — accuracy fixes and prediction intervals

Following up on the original post — v0.3.0 is out. Same caveats apply: early beta, AI-assisted development, limited support bandwidth. GitHub Discussions for bug reports and feedback.

Honest note on the bug fixes

A few of these are significant enough to call out directly, because they affected accuracy from day one:

  • Sunshine duration was hardcoded to zero on the Open-Meteo path — anyone without SRG-SSR credentials has been training and predicting without sunshine data. Fixed; Open-Meteo now returns real values.
  • Forecast timestamps were shifted +1 hour inside _update_sensors, causing the weather merge to find zero matches on every sensor update. All weather features (temperature, sunshine, precipitation) were silently imputed from training-set medians instead of the actual forecast. Fixed.
  • Rolling features (24h/7d mean, 24h std) were broadcast as a single scalar across all 48 prediction hours rather than sliding per-hour. This created a systematic train/predict mismatch. Fixed.
  • SRG-SSR API v1 was decommissioned — anyone using SRG credentials would have been falling back to Open-Meteo silently (or crashing under pandas 3.x). Migrated to v2.

If you’ve been running a previous version and your MAE sensor looked suspiciously flat or high, these are likely why.

New in v0.3.0

New sensors

  • sensor.energy_forecast_{next_3h,today,tomorrow}_low / _high — 80% prediction intervals from quantile regression (α=0.1 / α=0.9)
  • sensor.energy_forecast_today and today_HH_HH block sensors now blend actuals for elapsed hours — past hours show measured consumption, future hours show the forecast
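
To make the blending of actuals and forecast concrete, here is a minimal sketch. The function name and the dict-based inputs are hypothetical and only illustrate the idea; the app's internal data structures may look different.

```python
def blend_daily_total(actual_by_hour, forecast_by_hour, current_hour):
    """Sum measured kWh for elapsed hours and forecast kWh for the rest.

    actual_by_hour / forecast_by_hour: dicts mapping hour (0-23) to kWh.
    current_hour: the first hour of the day that has not finished yet.
    """
    total = 0.0
    for hour in range(24):
        if hour < current_hour and hour in actual_by_hour:
            total += actual_by_hour[hour]    # elapsed hour: measured consumption
        else:
            total += forecast_by_hour[hour]  # future hour, or a missing reading
    return total
```

With two elapsed hours measured at 1.0 and 2.0 kWh and a flat forecast of 0.5 kWh for every hour, the blended daily total is 1.0 + 2.0 + 22 × 0.5 = 14.0 kWh.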

Model improvements

  • Log-transform on the training target — reduces the influence of high-consumption outliers
  • LightGBM early stopping per CV fold — avoids over-fitting on longer histories
  • lag_72h autoregressive feature (same time 3 days ago)
  • Cloud cover percentage and direct radiation (W/m²) added as features; available to all users via Open-Meteo
  • 72h of measured Open-Meteo history now anchors the temp_rolling_3d feature at predict time
  • Bridge-day proximity: days_to_next_holiday / days_since_last_holiday features
  • likely_ev_hour binary feature — marks hour-of-week slots with a historical EV charging pattern
  • Cantonal holiday support: set holiday_canton: "ZH" (or any Swiss canton code) in apps.yaml
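
For anyone wondering what the log-transform on the target does in practice, a small sketch follows. The forward transform uses log1p, which the later numpy 2.x fix also mentions; the inverse via expm1 and the clamp at zero are my assumptions about how predictions are mapped back to kWh.

```python
import math


def to_log_target(kwh):
    """Compress the training target: log(1 + kWh) is defined at 0 kWh too."""
    return math.log1p(kwh)


def from_log_target(prediction):
    """Map a model output in log space back to kWh (assumed inverse)."""
    return max(0.0, math.expm1(prediction))  # clamp: consumption cannot be negative


typical, outlier = 1.0, 20.0
print(outlier / typical)                                # 20x apart in kWh
print(to_log_target(outlier) / to_log_target(typical))  # much closer in log space
```

Because the high-consumption hour is far less extreme in log space, it pulls the fitted model around less during training.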

Operational

  • Adaptive retraining: if live day-ahead MAE exceeds adaptive_retrain_threshold × cv_MAE (default 2×) with sufficient matched pairs, an early retrain is triggered automatically
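
The trigger condition can be written down in a few lines. This is only a sketch of the rule as described above: the function name and the min_pairs value of 24 are hypothetical placeholders, since the notes only say "sufficient matched pairs".

```python
def should_retrain_early(live_mae, cv_mae, n_pairs, threshold=2.0, min_pairs=24):
    """Return True when live day-ahead error has drifted well above the CV error.

    threshold mirrors adaptive_retrain_threshold (default 2x per the notes);
    min_pairs is an assumed placeholder for "sufficient matched pairs".
    """
    if n_pairs < min_pairs:
        return False  # not enough prediction-vs-actual pairs to trust live_mae
    return live_mae > threshold * cv_mae
```

For example, with a cross-validation MAE of 0.30 kWh, a live MAE of 0.70 kWh over 48 matched pairs would trigger an early retrain, while 0.55 kWh would not.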

Upgrading

No schema changes. Drop in the new files and restart AppDaemon. Existing pickle files will be used as-is until the next scheduled retrain; the log-transform and new features activate automatically after that.

Test suite: 114 tests, all passing.

Feedback welcome in GitHub Discussions as before.

v0.5.2 — Accuracy improvements, distribution UX, pandas 3.x fix

v0.5.2 picks up where v0.3.0 left off. It stacks seven development stages of accuracy and distribution improvements.

What’s new since v0.3.0

Sub-energy sensors (v0.4.0)

  • Track hourly consumption of custom HA cumulative kWh sensors (heat pump, dishwasher, etc.) as lag_24h / lag_168h features
  • Configure via sub_energy_sensors list in apps.yaml; fully optional — no behaviour change for existing deployments

Accuracy improvements (v0.4.1–v0.4.5)

  • Short-horizon lags lag_1h/2h/6h/12h — improves near-term precision (#27)
  • Feature importance + CV fold std logged after every training run (#29/#30)
  • Day-of-year cyclical features (doy_sin/doy_cos) for smoother seasonal signal (#33)
  • hours_ahead horizon feature — model learns horizon-specific bias (#34)
  • num_leaves sweep on final CV fold (LightGBM only) (#28)
  • Sub-sensor binary activity flag {prefix}_active_24h (#35)
  • Sub-sensor rolling run count {prefix}_runs_7d (#36)
  • Per-hour-of-week NaN fill medians for lag/rolling columns (#31)
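
As an illustration of the day-of-year cyclical features, here is a minimal sketch. The doy_sin / doy_cos names come from the list above; the exact normalisation (day 1 mapped to angle 0, leap years using 366 days) is an assumption on my part.

```python
import calendar
import math
from datetime import date


def doy_features(d):
    """Return (doy_sin, doy_cos) so that 31 Dec and 1 Jan end up close together."""
    day_of_year = d.timetuple().tm_yday
    days_in_year = 366 if calendar.isleap(d.year) else 365
    angle = 2 * math.pi * (day_of_year - 1) / days_in_year
    return math.sin(angle), math.cos(angle)


print(doy_features(date(2025, 1, 1)))
print(doy_features(date(2025, 12, 31)))
```

A raw day-of-year number jumps from 365 back to 1 at New Year; the sine/cosine pair removes that jump, which is why it gives a smoother seasonal signal.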

Distribution / UX (v0.5.0–v0.5.1)

  • Setup checker sensor sensor.energy_forecast_setup_status — diagnose install issues from HA Developer Tools (#17)
  • CSV append-only writes — hourly updates no longer rewrite the full history file (#19)
  • Config validation: warn when ev_threshold >= ev_charger_kw (#20)
  • Adaptive retrain cooldown uses correct timezone (DST-safe, H1)
  • SRG OAuth token cached for 55 min — reduces API calls ~24×/day → ~1/day (M1)
  • Missing cloud/radiation keys fall back to NaN instead of 0 (M2)
  • Holiday vectorisation via np.searchsorted (#32)
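
The token caching idea is generic enough to sketch. The class below is hypothetical and not the app's actual code: it wraps any callable that fetches a fresh token and reuses the result for 55 minutes, with an injectable clock so the behaviour can be tested without waiting.

```python
import time


class TokenCache:
    """Reuse a fetched token until a time-to-live has elapsed."""

    def __init__(self, fetch_token, ttl_seconds=55 * 60, clock=time.monotonic):
        self._fetch = fetch_token      # callable returning a fresh token string
        self._ttl = ttl_seconds
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch()          # only hit the API when stale
            self._expires_at = now + self._ttl
        return self._token
```

Keeping the time-to-live a little below the token's real lifetime means a cached token is never handed out just as it expires.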

Sensor UX (v0.5.2)

  • MDI icons on all published sensors (mdi:lightning-bolt, mdi:car-electric, etc.)
  • unique_id attribute on every sensor (groundwork for future MQTT Discovery)
  • Bugfix: pandas 3.x ValueError on date-only midnight entries in energy_history.csv — without this fix all lag/rolling features silently degraded to medians for the rest of the affected day

Upgrade notes

  • No breaking changes. All new config keys are optional with safe defaults.
  • Existing meta.pkl / energy_model.pkl files are backward-compatible.
  • After restart, AppDaemon will publish the setup status sensor on first initialise.

v0.7.1 — Anomaly dashboard cards, MQTT fixes, startup cleanup

What’s new since v0.6.0

See further down for the v0.6.0 release notes.

This release bundles v0.7.0 and v0.7.1.


v0.7.1 — 2026-03-24

Fixed

  • 404 DELETE spam on startup: _cleanup_legacy_states now guards every remove_entity call with entity_exists, eliminating ~30 [404] HTTP DELETE: Not Found log errors on fresh installs where legacy entities were never created (fixes #47).
  • Anomaly binary sensor attributes missing in MQTT mode: _publish now routes the four anomaly attributes (residual_kwh, residual_std_kwh, sigma_threshold, n_pairs) to a dedicated binary_sensor/.../attributes MQTT topic. Discovery payload for energy_forecast_unusual_consumption now includes json_attributes_topic. State topic path corrected from sensor/ to binary_sensor/ (fixes #48).
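
The startup cleanup fix boils down to a guard around each removal. Here is a rough sketch of that pattern: entity_exists and remove_entity are the AppDaemon calls named above, while the standalone function and its app argument are hypothetical so the logic can be exercised without a running AppDaemon instance.

```python
def cleanup_legacy_states(app, legacy_entity_ids):
    """Remove only those legacy entities that actually exist.

    app: any object exposing entity_exists(entity_id) and
    remove_entity(entity_id), for example an AppDaemon Hass app instance.
    Returns the list of entity ids that were removed.
    """
    removed = []
    for entity_id in legacy_entity_ids:
        if app.entity_exists(entity_id):   # skip entities never created
            app.remove_entity(entity_id)   # no DELETE is sent for missing ones
            removed.append(entity_id)
    return removed
```

On a fresh install none of the legacy entities exist, so no DELETE requests are issued and the 404 log lines disappear.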

Added

  • Dashboard card — anomaly detection (dashboard/anomaly-detection.yaml): vertical-stack with mushroom state card + conditional attribute detail that expands when the sensor is ON.
  • Dashboard card — SHAP feature importance (dashboard/shap-importance.yaml): native Lovelace markdown card using a Jinja2 template — no custom card dependency.
  • dashboard/dashboard.yaml updated: anomaly mushroom card inserted after the MAE mini-graph card.

v0.7.0 — 2026-03-23

Added

  • Quantile interval calibration (CQR): prediction intervals (10th–90th percentile) are calibrated via split conformal prediction. Gives ≥ 80% marginal coverage on held-out data. q_hat correction is persisted to energy_model_interval_correction.json.
  • SHAP feature importance (#42): top-N driving features exposed as shap_top_features attribute on sensor.energy_forecast_today. LightGBM uses native TreeSHAP; sklearn GBR falls back to global feature_importances_. New config key: shap_top_n (default 5; set to 0 to disable).
  • Anomaly detection sensor (#39): binary_sensor.energy_forecast_unusual_consumption fires when actual consumption deviates more than anomaly_sigma_threshold σ from the day-ahead prediction. Cold-start state is off until 10 matched pairs accumulate. Attributes: residual_kwh, residual_std_kwh, sigma_threshold, n_pairs.
  • Rolling MAE sensors (#41): sensor.energy_forecast_mae_7d and sensor.energy_forecast_mae_30d track live forecast accuracy using stored prediction-vs-actual pairs. Published in both set_state and MQTT Discovery modes.
  • Vacation / away flag (#25): new is_away binary feature lets the model learn reduced consumption during vacations. Optional config keys: away_mode_entity, away_return_entity. Fully backward-compatible.
  • ApexCharts dashboard (dashboard/energy-today.yaml, dashboard/dashboard.yaml): copy-paste Lovelace YAML showing 48-hour forecast vs actuals with prediction-interval shading.
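
For readers unfamiliar with CQR, here is a minimal sketch of the usual split-conformal recipe for quantile intervals. It follows the textbook procedure rather than the app's actual code; the function names are hypothetical, and clamping the rank for very small calibration sets is a simplification.

```python
import math


def cqr_correction(lows, highs, actuals, alpha=0.2):
    """Compute q_hat from a held-out calibration set.

    lows / highs: uncalibrated 10th / 90th percentile predictions.
    alpha=0.2 targets 80% coverage.
    """
    # Conformity score: how far the actual falls outside the interval
    # (negative when it lies inside).
    scores = sorted(max(lo - y, y - hi) for lo, hi, y in zip(lows, highs, actuals))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    # Simplification: the strict guarantee needs rank <= n; clamp for small n.
    return scores[min(rank, n) - 1]


def apply_correction(low, high, q_hat):
    """Widen (or narrow, if q_hat is negative) an interval by q_hat on both sides."""
    return low - q_hat, high + q_hat


lows, highs = [1.0] * 10, [3.0] * 10
actuals = [2.0] * 7 + [3.25, 3.5, 4.0]
q_hat = cqr_correction(lows, highs, actuals)
print(q_hat, apply_correction(1.0, 3.0, q_hat))
```

In this toy calibration set three actuals fall above the upper bound, so q_hat comes out at 0.5 and the interval [1.0, 3.0] is widened to [0.5, 3.5].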

Fixed

  • _load_interval_correction stale-value bug: _interval_correction is reset to 0.0 before parsing the JSON file, preventing a stale value from persisting across corrupt-file events.
  • SHAP summary early-day fallback: shap_summary now falls back to all 48 prediction rows when fewer than 3 rows match today’s date slice.

Upgrade notes

  • No breaking changes. Existing apps.yaml and model pickles continue to work.
  • New optional config keys: shap_top_n, anomaly_sigma_threshold, away_mode_entity, away_return_entity.
  • MQTT users with the anomaly sensor: json_attributes_topic is now included in the discovery payload — HA will pick up attributes automatically on the next restart.
  • Dashboard YAMLs are in dashboard/ — copy the cards you want into your Lovelace config.
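
The anomaly sensor rule from the v0.7.0 notes can be sketched as follows. The argument names mirror the published attributes (residual_kwh, residual_std_kwh, sigma_threshold, n_pairs); the function itself and the zero-spread guard are hypothetical.

```python
def is_unusual(residual_kwh, residual_std_kwh, sigma_threshold, n_pairs, min_pairs=10):
    """Return True when actual consumption deviates unusually from the forecast.

    residual_kwh: actual minus day-ahead prediction.
    residual_std_kwh: standard deviation of past residuals.
    """
    if n_pairs < min_pairs:
        return False  # cold start: stay off until 10 matched pairs exist
    if residual_std_kwh <= 0:
        return False  # assumed guard: no spread estimate, nothing to compare against
    return abs(residual_kwh) > sigma_threshold * residual_std_kwh
```

With a residual standard deviation of 0.4 kWh and a threshold of 2.5 σ, the sensor would turn on once the actual differs from the prediction by more than 1.0 kWh in either direction.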

Test suite

243 tests, all passing (python -m pytest tests/ -v).

v0.6.0 — MQTT Discovery + next_1h sensor

Added

  • MQTT Discovery (#37) (energy_forecast.py): opt-in entity registration via MQTT Discovery. Set mqtt_discovery: true in apps.yaml to register all ~29 sensors in the HA entity registry, enabling area assignment and labels. Requires the AppDaemon MQTT plugin and a running MQTT broker. Config keys: mqtt_discovery (default false), mqtt_namespace (default mqtt), mqtt_discovery_prefix (default homeassistant). All sensors grouped under a single HA Energy Forecast device. Prediction interval sensors (*_low/*_high) are registered lazily on the first update cycle where quantile models exist. Availability topic publishes "online" at startup and "offline" on AppDaemon shutdown. Existing set_state() behaviour is unchanged when mqtt_discovery: false.
  • sensor.energy_forecast_next_1h: 1-hour-ahead point forecast sensor.
  • README: added MQTT Discovery section (prerequisites, appdaemon.yaml snippet, apps.yaml example, sensor count table, availability behaviour, revert instructions); added mqtt_discovery / mqtt_namespace / mqtt_discovery_prefix to parameter reference.
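
To show roughly what an MQTT Discovery registration looks like, here is a sketch that builds the config topic and payload for one entity. The payload keys (name, unique_id, state_topic, json_attributes_topic, availability_topic, device) are standard Home Assistant MQTT Discovery fields; the energy_forecast base topic, the topic layout, and the device identifier are assumptions for illustration only.

```python
import json


def build_discovery(object_id, name, component="sensor",
                    prefix="homeassistant", base_topic="energy_forecast"):
    """Return (config_topic, json_payload) for one discovered entity."""
    root = f"{base_topic}/{component}/{object_id}"    # hypothetical topic layout
    config_topic = f"{prefix}/{component}/{object_id}/config"
    payload = {
        "name": name,                                  # short label, e.g. "Model MAE"
        "unique_id": object_id,
        "state_topic": f"{root}/state",
        "json_attributes_topic": f"{root}/attributes",
        "availability_topic": f"{base_topic}/availability",
        "device": {
            "identifiers": ["ha_energy_forecast"],     # assumed identifier
            "name": "HA Energy Forecast",
        },
    }
    return config_topic, json.dumps(payload)


topic, payload = build_discovery(
    "energy_forecast_unusual_consumption", "Unusual Consumption",
    component="binary_sensor",
)
print(topic)
print(payload)
```

Sharing one device block across all entities is what groups them under a single device in the HA UI, and publishing the config message with the retain flag lets HA rediscover the entities after a broker or HA restart.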

Fixed

  • Doubled “Energy Forecast” prefix in MQTT Discovery sensor names: Discovery name values are now short labels ("Model MAE", "Today", "Setup Status", etc.); set_state() paths are unchanged.
  • Doubled sensors after enabling MQTT Discovery: _cleanup_legacy_states() removes ghost set_state entities on startup when mqtt_discovery=True, without requiring an HA restart.
  • MQTT publish broken on HASS apps: replaced self.mqtt_publish() with self.call_service("mqtt/publish", ...), which works from any AppDaemon HASS app.
  • numpy 2.x retraining error: df["gross_kwh"].to_numpy(dtype=float) forces float64 before np.log1p, fixing "loop of ufunc does not support argument 0 of type float" on numpy 2.x.
  • Align hourly sensor updates to XX:01:00 wall-clock time using run_hourly; eliminates startup-time drift.
  • Downgrade prediction-time sub-sensor NaN log from WARNING to DEBUG; training-time WARNING (weekly) is sufficient.
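
The wall-clock alignment is easy to picture with a small helper that computes the next XX:01:00 slot. This is a standalone sketch with a hypothetical function name; inside AppDaemon the scheduling itself is handled by run_hourly, and the helper ignores DST transitions.

```python
from datetime import datetime, timedelta


def next_minute_one(now):
    """Return the next XX:01:00 strictly after `now` (naive datetimes, no DST handling)."""
    candidate = now.replace(minute=1, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(hours=1)  # this hour's slot has already passed
    return candidate


# In an AppDaemon app the equivalent scheduling call would look roughly like
#   self.run_hourly(self.update_callback, datetime.time(0, 1, 0))
# where update_callback is a hypothetical method name.
print(next_minute_one(datetime(2026, 3, 24, 14, 37, 12)))
```

Anchoring to the clock instead of "one hour after startup" means updates land at the same minute regardless of when AppDaemon was last restarted.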

Your math may benefit an integration I’ve been working on: alternative energy monitoring and power consumption protection/enforcement. I’ve been running it for a few months to work out the bugs, and I needed just what you made, among a few other things, before I start releasing it.



Also, I feel you may benefit from this short read: “Communicating from Addon to Home Assistant: MQTT vs Registered Integration” (post #2, by zodyking).

Hi zodyking

Great to hear that you can use part of the work. Your UI screenshots look interesting and it seems like you already have a good data foundation for the predictions. Let me know how it works once it’s running!

Actually, this is also part of a bigger project on my end. I’m working on a custom energy management system for solar energy usage optimization.

And I will definitely have a read through your summary.

I mention it strictly because I see you’re using MQTT to communicate from the Addon/App to HA and had duplicate-sensor trouble, though you fixed it. I was going down the same path when I was eager to get away from the Opower integration and decided to use the same underlying API library from the GitHub gist to make an Addon/App that does what Opower does plus a lot more, using advanced scraping and Python logic, and makes HA service calls automatically when it detects an event on my Con Edison account.

One of the special things I wanted the app to do was send actionable notifications, which is easy until you want it to know what the user selected from the notification choices. MQTT can’t do this, and neither can the HA API alone, as an Addon/App has no access to the HA event loop. My approach is always to take apart other well-known apps and integrations, and the solution was in plain sight: you can ditch MQTT altogether and have your App/Addon install an integration, so that part of your project runs fully inside HA while your Addon does the other work that an integration running in HA cannot.


I’ll release them soon with a more detailed post, but overall I feel we’ve been holding back on what integrations/apps can be. Rather than just a data display, I see and build them as an integral part of the home automation system, with built-in logic to run their own automations, make service calls, and interact with the user via things like embedded TTS or actionable notifications.

The duplicate sensors came from the migration from direct AppDaemon publication to MQTT. The deletion logic simply removes the sensors originating from the previous implementation.

From what I’ve seen so far, I’m pretty happy with the decision to migrate to MQTT. Sensors and controls show up grouped under one device and are manageable through the HA UI.

In another module I’m working on, I’m now also able to send commands back over the MQTT integrated device. However, I haven’t tried actionable notifications yet.

Service calls will work via the API, but actionable notifications won’t: you can only send the initial notification, not receive the response. Regardless, keep me updated.