Can someone share insight into why InfluxDB wasn’t used for the Energy work?
I’m not reading the HA devs’ minds, but here are two likely reasons:
- Influx support of the SQL language is either “none” or “SELECT only” depending on the version. HA uses SQLAlchemy to talk to the database, and that needs SQL.
- The need to ship a lightweight database. SQLite is absolutely fine for this use and makes for a simple installation.
If you only intend to log energy use, once per hour is fine, and SQLite will handle that.
I realize this is a vague question, but I’m curious whether people have gone down this thought path and where they landed.
I installed solar panels (12kW), inverters, batteries, and an EV charger. Here’s the EV charger (pink) responding to clouds.
All the hardware speaks Modbus, so this is controlled by an Orange Pi Lite with Waveshare USB-RS485 interfaces. There are 8 RS485 ports.
It’s a bunch of Python scripts talking to each other via MQTT.
Most of the data is queried 2-10 times/s and published to MQTT with slight rate limiting, which results in a very low throughput of 50-100 messages/s.
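Simplified, each polling script is just a loop like this (a sketch; the topic name and `read_inverter()` are made up for the example, and it assumes paho-mqtt 2.x):

```python
# Poll-and-publish loop with rate limiting (simplified sketch).
# read_inverter() stands in for the real Modbus register reads over RS485.
import json
import time
import paho.mqtt.client as mqtt

def read_inverter():
    return {"pv_power_w": 0, "battery_soc": 0.0}  # placeholder values

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.connect("localhost")
client.loop_start()

MIN_INTERVAL = 0.5      # publish at most ~2 messages/s on this topic
last_pub = 0.0
last_payload = None

while True:
    data = read_inverter()              # polled several times per second
    now = time.monotonic()
    # rate limit: publish only on change and not more often than MIN_INTERVAL
    if data != last_payload and now - last_pub >= MIN_INTERVAL:
        client.publish("solar/inverter", json.dumps(data))
        last_pub, last_payload = now, data
    time.sleep(0.1)
```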
This tiny Pi has a redundant power supply (AC mains and PV battery via a 48V isolated DC-DC converter). I didn’t find any package to persist MQTT traffic, so another Python script grabs all the MQTT traffic, timestamps it, and stuffs it into zstd-compressed files, which it stores and serves.
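The persister is about this simple (a sketch; the real one would also handle reconnects and file rotation; assumes paho-mqtt 2.x and the zstandard package):

```python
# Subscribe to all MQTT traffic, timestamp each message, and append it
# to a zstd-compressed log file.
import json
import time
import zstandard as zstd
import paho.mqtt.client as mqtt

out = open("mqtt-log.jsonl.zst", "wb")
writer = zstd.ZstdCompressor(level=10).stream_writer(out)

def on_connect(client, userdata, flags, reason_code, properties):
    client.subscribe("#")               # grab all the MQTT traffic

def on_message(client, userdata, msg):
    record = {
        "t": time.time(),               # receive-side timestamp
        "topic": msg.topic,
        "payload": msg.payload.decode("utf-8", errors="replace"),
    }
    writer.write((json.dumps(record) + "\n").encode())
    writer.flush(zstd.FLUSH_FRAME)      # crash-safe, at some cost in ratio

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.on_connect = on_connect
client.on_message = on_message
client.connect("localhost")
client.loop_forever()
```

Flushing a frame per message keeps the log readable after a crash but hurts the compression ratio; batching writes compresses noticeably better.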
The Linux PC serving as the house NAS queries that and stores everything in ClickHouse. If it is shut down, it grabs the MQTT backlog from the Pi on power-up and catches up.
I considered InfluxDB; in the end I went with ClickHouse for its compression features. Each MQTT message stored as a row in the table uses about 1 byte of disk space, so keeping all the data costs basically nothing.
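The table is along these lines (simplified; the column names and codec choices here are illustrative, shown via the clickhouse-driver package):

```python
# A MergeTree table whose per-column codecs (delta + zstd, Gorilla for floats)
# are what squeeze repetitive time-series rows down to ~1 byte each.
from datetime import datetime
from clickhouse_driver import Client

ch = Client("localhost")
ch.execute("""
    CREATE TABLE IF NOT EXISTS mqtt_log (
        ts    DateTime64(3) CODEC(Delta, ZSTD),
        topic LowCardinality(String),
        value Float64 CODEC(Gorilla, ZSTD)
    )
    ENGINE = MergeTree
    ORDER BY (topic, ts)
""")
ch.execute(
    "INSERT INTO mqtt_log (ts, topic, value) VALUES",
    [(datetime.now(), "solar/inverter/pv_power_w", 8423.0)],
)
```

Sorting by (topic, ts) keeps near-identical values adjacent on disk, which is what lets the delta codecs do their job.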
I used materialized views to automatically compute hourly/daily min/max/avg, another convenient ClickHouse feature. These exist mostly to avoid overwhelming the plotting engine when zooming out.
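Something like this (a sketch, matching the hypothetical mqtt_log table above):

```python
# A materialized view that maintains hourly min/max/avg automatically
# as rows are inserted into mqtt_log.
from clickhouse_driver import Client

ch = Client("localhost")
ch.execute("""
    CREATE MATERIALIZED VIEW IF NOT EXISTS mqtt_hourly
    ENGINE = AggregatingMergeTree
    ORDER BY (topic, hour)
    AS SELECT
        topic,
        toStartOfHour(ts) AS hour,
        minState(value)   AS v_min,
        maxState(value)   AS v_max,
        avgState(value)   AS v_avg
    FROM mqtt_log
    GROUP BY topic, hour
""")

# When zoomed out, query the rollup with the matching -Merge combinators:
rows = ch.execute("""
    SELECT topic, hour, minMerge(v_min), maxMerge(v_max), avgMerge(v_avg)
    FROM mqtt_hourly
    GROUP BY topic, hour
    ORDER BY hour
""")
```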
I used Bokeh for plotting. I couldn’t figure out how to tell Grafana to combine streaming MQTT data with older data from SQL, or to switch between queries depending on zoom level so the materialized views keep things fast.
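The zoom-dependent part doesn’t need to be clever; picking a table by visible span is enough (a sketch; table names follow the hypothetical rollups above, with mqtt_daily being a second view built the same way):

```python
# Choose raw data or a rollup based on the visible time span, so zoomed-out
# plots hit the small aggregate tables instead of millions of raw rows.
HOUR, DAY = 3600, 86400

def source_for_span(span_seconds: float) -> str:
    if span_seconds <= 6 * HOUR:
        return "mqtt_log"        # raw rows, fine for short windows
    if span_seconds <= 14 * DAY:
        return "mqtt_hourly"     # hourly min/max/avg rollup
    return "mqtt_daily"          # daily rollup for long ranges

# In a Bokeh server app this would run from an x_range change callback,
# re-querying ClickHouse and replacing the ColumnDataSource contents.
```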
This uses about 4% CPU on the Pi and 0.5% on the NAS.
Now you ask, which one should you use?
I think if you just want to log energy, once per hour is fine; that’s 8,760 rows/year. SQLite will work absolutely fine. One state row uses about 80 bytes, so call it 1 MB/year/sensor. No problem at all.
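The arithmetic, if you want to check it:

```python
rows_per_year = 24 * 365                    # hourly samples -> 8,760 rows
bytes_per_row = 80                          # rough size of one HA state row
print(rows_per_year * bytes_per_row / 1e6)  # ~0.7 MB/year/sensor; call it 1 MB
```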
If you want real time, then you will need a time-series database like ClickHouse or Influx, and a functional plotting package (i.e., not the HA default one).
But IoTaWatt only updates every 5 seconds, so it’s not exactly “real time”; not optimal for solar power routing, for example. And unless you want to control devices in real time, why would you care about plotting power at 1-second granularity? 1 hour is fine. The reason I store everything in the database is to debug the stuff, and controlling the damn EV charger took a lot of debugging.
I also display the solar stuff in HA, but I excluded it from the recorder. So the display updates in real time, but it doesn’t choke the poor SQLite database. I use my energy dashboard mainly to check that everything is fine, look at the home battery state, and input settings for the EV charger.
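For reference, the recorder exclusion is just this in configuration.yaml (the globs are examples; match them to your entity names):

```yaml
# keep the fast-updating solar entities out of the recorder database
recorder:
  exclude:
    entity_globs:
      - sensor.solar_*
      - sensor.ev_charger_*
```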