Randomised trial to test effectiveness of energy automation

Hi folks,
I’ve got 6 kW of solar panels, a 5 kW inverter, and an Enphase system for monitoring. For about 5 months I’ve been using the Enphase monitoring to drive some theoretically energy-efficient automations in my home. These automations do three things: turn on heating, dehumidification or cooling to soak up excess solar; run ceiling fans to circulate air when there is excess solar; and recommend actions like opening windows.

This has been nice, and the house has been subjectively more comfortable while power costs have stayed affordable. But it hasn’t been possible to really measure whether I’m making a saving. So I’m doing a randomised trial. Every day at midnight the input boolean that enables the HVAC and recommendation automations is randomly set on or off, and the automation that controls ceiling fans is separately set on or off at random. I want to compare the effects of those separately:

- id: energy_tests
  alias: "Perform energy tests"
  trigger:
  # Fires once a day at midnight
  - at: '00:00:00'
    platform: time
  condition: []
  action:
  # Independently pick on or off for the ceiling fan automation...
  - service: "automation.turn_{{ ['on','off'] | random }}"
    data:
      entity_id: automation.control_air_circulation
  # ...and for the HVAC/recommendation automations
  - service: "input_boolean.turn_{{ ['on','off'] | random }}"
    data:
      entity_id: input_boolean.climate_control_alerts
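Because the two switches are set independently, each day lands in one of four groups (fans on/off crossed with climate automation on/off). As a rough illustration only, the Python sketch below simulates that assignment to show how many days each group might collect. The function name and seed are hypothetical, and it assumes each choice is an independent 50/50 pick, which is what the `random` filter in the automation above should give.

```python
import random
from collections import Counter

def simulate_assignments(n_days, seed=0):
    """Simulate n_days of independent on/off picks for fans and climate automation."""
    rng = random.Random(seed)  # seeded so the illustration is repeatable
    counts = Counter()
    for _ in range(n_days):
        fans = rng.choice(["on", "off"])
        climate = rng.choice(["on", "off"])
        counts[(fans, climate)] += 1
    return counts

if __name__ == "__main__":
    # Roughly a quarter of days should fall in each group, but not exactly.
    for group, n in sorted(simulate_assignments(180).items()):
        print(group, n)
```

The groups will rarely be the same size, so the n for each group is worth reporting alongside any average.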

To track the results of this experiment I needed an SQL query that groups my consumption, import and export of energy by day, and marks whether the fans or HVAC/recommendations were automated that day. This is the query:

-- Daily energy totals plus markers for automation state changes.
-- state is stored as text, so it is cast to a number before MAX/MIN to avoid string comparison.
SELECT DATE(FROM_UNIXTIME(last_updated_ts)) AS `Day`,
CAST(MAX(IF(entity_id = 'sensor.enphase_lifetime_energy_consumption', CAST(state AS DECIMAL(15,3)), NULL)) - MIN(IF(entity_id = 'sensor.enphase_lifetime_energy_consumption', CAST(state AS DECIMAL(15,3)), NULL)) AS INT) AS `Consumed`,
CAST((MAX(IF(entity_id = 'sensor.grid_export_energy', CAST(state AS DECIMAL(15,3)), NULL)) - MIN(IF(entity_id = 'sensor.grid_export_energy', CAST(state AS DECIMAL(15,3)), NULL))) * 1000 AS INT) AS `Exported`,
CAST((MAX(IF(entity_id = 'sensor.grid_import_energy', CAST(state AS DECIMAL(15,3)), NULL)) - MIN(IF(entity_id = 'sensor.grid_import_energy', CAST(state AS DECIMAL(15,3)), NULL))) * 1000 AS INT) AS `Imported`,
SUM(CASE WHEN entity_id = 'automation.control_air_circulation' AND state = 'on' THEN 1 WHEN entity_id = 'automation.control_air_circulation' AND state = 'off' THEN -1 ELSE 0 END) AS `Fans`,
SUM(CASE WHEN entity_id = 'input_boolean.climate_control_alerts' AND state = 'on' THEN 1 WHEN entity_id = 'input_boolean.climate_control_alerts' AND state = 'off' THEN -1 ELSE 0 END) AS `Climate automation`
FROM states
WHERE states.entity_id IN ('automation.control_air_circulation','input_boolean.climate_control_alerts','sensor.enphase_lifetime_energy_consumption','sensor.grid_import_energy','sensor.grid_export_energy')
AND last_updated_ts > UNIX_TIMESTAMP('2023-03-06') AND state NOT IN ('unavailable','unknown')
GROUP BY `Day`;

The SQL query could do with some improving. As written, on days when the fan automation is enabled the “Fans” column is a value > 0 (each time the automation triggers counts as 1). On the day it gets turned off the value is -1, and if it stays off for multiple days in a row the value is 0 after the first day. Similarly, the “Climate automation” column is 1, 0, or -1: 1 is a day it was turned on after being off; -1 is a day it was turned off after being on; 0 is a day it stayed in the same state as the day before. I’d rather each of these columns were just 1 or 0 for on or off, but I couldn’t figure out how to accomplish that.
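One way to get a plain 1 or 0 per day is to carry the last known state forward outside of SQL. The Python sketch below is only an illustration of that idea, not a tested replacement for the query: `daily_flags` is a hypothetical helper, it expects the `(timestamp, state)` rows for a single entity to have been exported from the `states` table already, and it assumes the state only changes via the midnight automation.

```python
from datetime import date, datetime, timedelta

def daily_flags(rows, first_day, last_day, initial=None):
    """Return {day: 1 or 0} for one entity, carrying the last known state forward.

    rows: iterable of (datetime, state) pairs for ONE entity, state 'on' or 'off'.
    initial: state to assume before the first recorded row (None if unknown).
    """
    last_state_on_day = {}
    for ts, state in sorted(rows):
        if state in ("on", "off"):
            # Later rows on the same day overwrite earlier ones.
            last_state_on_day[ts.date()] = 1 if state == "on" else 0

    flags = {}
    current = initial
    day = first_day
    while day <= last_day:
        # A day with no rows keeps whatever state the previous day had.
        current = last_state_on_day.get(day, current)
        flags[day] = current
        day += timedelta(days=1)
    return flags

if __name__ == "__main__":
    rows = [
        (datetime(2023, 3, 6, 0, 0), "on"),
        (datetime(2023, 3, 6, 14, 0), "on"),   # automation triggered, still on
        (datetime(2023, 3, 7, 0, 0), "off"),
        (datetime(2023, 3, 9, 0, 0), "on"),    # nothing recorded on the 8th
    ]
    for day, flag in daily_flags(rows, date(2023, 3, 6), date(2023, 3, 9)).items():
        print(day, flag)
```

Repeated triggers on the same day collapse to a single 1, and days with no recorded change inherit the previous day's value instead of showing 0.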

Anyway, I’ll come back with some results when I have collected enough data for them to be meaningful. Just thought I’d share in case it was a useful idea to anyone else.

It’s currently hot weather where I live, so dehumidifying and AC are needed most days. I have my doubts that running my fans automatically when no one is in a room is useful in this weather, but I want to find out. I’ve read that, in theory, this can save energy in winter, I guess by evening out the temperature gradient between floor and ceiling.

Okay, I’m back with some results after a full autumn and winter season of running these tests. I modified my SQL so that a day runs from when solar generation first starts. Automation actions during solar generation hours proved to have a big impact on the following morning’s energy usage, so I wanted to group early-morning consumption with the previous day’s automated actions.
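The shifted day boundary can be sketched in Python as below. This is a simplification and not the modified SQL itself: it assumes a fixed 9am cut-off as a stand-in for "when solar generation first starts", and `experiment_day` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

def experiment_day(ts, boundary_hour=9):
    """Map a timestamp to the experiment day it belongs to.

    Readings before boundary_hour are counted with the previous calendar day,
    so early-morning usage is grouped with the prior day's automation settings.
    """
    return (ts - timedelta(hours=boundary_hour)).date()

if __name__ == "__main__":
    print(experiment_day(datetime(2023, 6, 2, 7, 30)))  # 2023-06-01
    print(experiment_day(datetime(2023, 6, 2, 9, 0)))   # 2023-06-02
```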
Energy costs with my provider are $0.33 per kWh, with an export tariff of $0.13 per kWh for excess solar. For that whole six-month period, here are the results:

So $6.88 per day running no automations, $6.53 automating just fans, $6.29 automating just a reverse cycle AC system for heating, cooling or dehumidifying, $6.65 per day combining fans and AC.
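For anyone wanting to reproduce per-day dollar figures like these from the query output, here is a minimal Python sketch. It assumes the cost is simply imports at $0.33 per kWh minus export credit at $0.13 per kWh, that the Imported and Exported columns are in Wh, and that fixed supply charges are ignored; `daily_cost` is a hypothetical name.

```python
IMPORT_RATE = 0.33  # dollars per kWh imported from the grid
EXPORT_RATE = 0.13  # dollars per kWh credited for exported solar

def daily_cost(imported_wh, exported_wh):
    """Net cost in dollars for one day, given import and export in Wh (assumed units)."""
    return imported_wh / 1000 * IMPORT_RATE - exported_wh / 1000 * EXPORT_RATE

if __name__ == "__main__":
    # Example figures are made up purely to show the arithmetic.
    print(round(daily_cost(20000, 5000), 2))
```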

Looking just at winter, there’s a much more pronounced effect. Please note that for all the charts other than the first one the reported n is incorrect. I only just noticed that now and don’t have the energy to correct it right now:

$7.94 per day with no automation, $6.73 per day with just automation of fans, $6.82 per day with just AC automation, and $7.58 per day combining fan and AC automations.

But it’s the next chart where it really gets interesting: what happens to energy consumption in the 24 hours after a particular condition? I originally looked at this because of that morning energy use issue, before I thought to start each day from around 9am. I expected that once I made that change this chart would show no relationship between energy use and the previous day’s actions. However, the effect still looks compelling:

So $7.19/day the day after there were no automations, $7.05/day the day after there were fan automations, $6.18/day the day after AC automation and $5.83/day the day after combining fan and AC automation.
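The "day after" grouping can be sketched in Python as follows. It is an illustration only: `next_day_means` is a hypothetical helper, the condition labels are made up, and it assumes the input is a chronologically ordered list of (condition, cost) pairs with no missing days.

```python
from collections import defaultdict
from statistics import mean

def next_day_means(days):
    """Average each day's cost grouped by the PREVIOUS day's condition.

    days: chronologically ordered list of (condition, cost) tuples, one per day.
    """
    buckets = defaultdict(list)
    # Pair every day with the day that follows it.
    for (cond_today, _), (_, cost_tomorrow) in zip(days, days[1:]):
        buckets[cond_today].append(cost_tomorrow)
    return {cond: mean(costs) for cond, costs in buckets.items()}

if __name__ == "__main__":
    sample = [("none", 7.0), ("ac", 6.0), ("none", 5.0), ("fans", 8.0)]
    print(next_day_means(sample))
```

The last day in the list has no following day, so its condition contributes nothing unless it also appears earlier.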

This might all point to my data being random and the effects being illusions. I would normally run a statistical analysis, but I don’t know the correct technique for data like this where the samples aren’t really independent: energy usage on any given day is likely to correlate with the prior day, since the house is insulated reasonably well. Temperature charts do suggest a difference of up to 0.7 degrees between the experimental groups, but those differences don’t match the cost differences. I’m doing my research to work out how to analyse it all properly.
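One option that may suit this setup is a randomisation (permutation) test, since the on/off assignment really was random each day. The idea is to keep the daily costs fixed, redraw the on/off flags many times the same way the midnight automation does, and see how often a difference at least as large as the observed one appears by chance. The Python sketch below is a starting point under stated assumptions, not a definitive analysis: `randomization_test` is a hypothetical helper, it looks at one factor at a time, it assumes independent 50/50 assignment each day, and the null hypothesis it tests includes there being no carry-over into the next day.

```python
import random
from statistics import mean

def randomization_test(costs, treated, n_draws=2000, seed=0):
    """Two-sided randomisation test for a difference in mean daily cost.

    costs: daily cost per day; treated: matching list of booleans (automation on?).
    Returns (observed_difference, p_value).
    """
    rng = random.Random(seed)

    def diff(flags):
        on = [c for c, f in zip(costs, flags) if f]
        off = [c for c, f in zip(costs, flags) if not f]
        if not on or not off:
            return None  # one group is empty, so no difference can be computed
        return mean(on) - mean(off)

    observed = diff(treated)
    if observed is None:
        raise ValueError("need at least one day in each group")

    extreme = valid = 0
    for _ in range(n_draws):
        # Redraw the assignment the way the automation does: a coin flip per day.
        flags = [rng.random() < 0.5 for _ in costs]
        d = diff(flags)
        if d is None:
            continue
        valid += 1
        if abs(d) >= abs(observed):
            extreme += 1
    return observed, extreme / valid

if __name__ == "__main__":
    # Made-up costs purely to show the call; not real results.
    costs = [5.0] * 10 + [8.0] * 10
    treated = [True] * 10 + [False] * 10
    print(randomization_test(costs, treated))
```

A small p-value would suggest the difference is unlikely to come from the random assignment alone, but it says nothing about how long an effect lingers into the following days.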