InfluxDB2.0: database size sensor

I’ve just upgraded to influxdb2. Was fairly easy with docker and the great HA documentation.
However, I also need to reconfigure my database size sensor as it no longer works with influxdb2.

I’m an infant when it comes to queries and/or the flux language. Does anyone already have a working influxdb2 size sensor? If so, please share! I’m unable to find anything with google.

I measure the size on disk. As I am using Hass OS I had to resort to this convoluted workaround:

Thanks Tom! Yeah I noticed your post. However, that seems really rather unnecessarily complicated. Also, I run HA Core so I can’t use that addon.

It shouldn’t be too complicated to write a proper flux query, right?

If you run HA core you can use the du command to return the disk size in a command line sensor.

sensor:
  - platform: command_line
    name: 'Influxdb Size'
    command: du -s /data/influxdb/data/homeassistant # fix the path to your database, this is mine.
    value_template: "{{ (value.split('\t')[0]|int/1000)|round(3) }}"
    unit_of_measurement: 'MB'

The unnecessarily complicated workaround was only because I am using HA OS and don’t have access to the file system in the Influxdb addon container.

This will return the actual size of your database. The shard method is only an approximation.

You can then turn off the influxdb internal database for a performance improvement.

Hi tom. Thanks for your reply. I only just got around to testing this. However, it does not work. The resulting value from the command in HomeAssistant is ‘unknown’. Even though it works fine when I directly enter the same command on the host system.

Probably a docker thing. Might work to add the data folder to my docker-compose file, but I don’t really want to do that.

I guess I’ll wait for someone else to figure out the proper command in the Flux-language!

Wait, what?

You said you run HA Core.

Yes, HA Core in a Docker container. Apologies if that wasn’t clear!

Good news though. Guy on another forum thought of a nice workaround: create a cronjob that writes data to the HA-config folder and then use a cat command to read the data. Like this.

*/10 * * * * /usr/bin/du -s /home/aeiou/docker/influxdb2/engine/data/0ffc2fcbc0fac7db | cut -f -1 > /home/aeiou/docker/hass/influxdb2size

And the sensor:

#-----InfluxDB size --- See cronjob for du command
  - platform: command_line
    name: 'InfluxDB Size'
    command: cat /config/influxdb2size # read data from du cronjob
    value_template: "{{ (value.split('\t')[0]|int/1000)|round(1) }}"
    unit_of_measurement: 'MB'

Hope this helps someone! No credit for me though!

Thanks a lot! I made it work and it’s much better. This way I will also be able to get the disk stats from the df command for the USB mounted disks.

Just a small correction to the Bytes math… the value_template should have int/1024 not int/1000, so this is right:

value_template: "{{ (value.split('\t')[0]|int/1024)|round(1) }}"

Just a note:
I had to add the parameter -b to make sure I get bytes and not blocks.
According to the readme:

Display values are in units of the first available SIZE from --block-size

And then also divide it by (1024*1024) instead of 1024
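A minimal sketch of that arithmetic, assuming GNU du (where `-b` reports apparent size in bytes). The helper name `bytes_to_mib` is made up for illustration and is not part of du or Home Assistant:

```shell
#!/bin/sh
# bytes_to_mib BYTES
# Hypothetical helper: converts a byte count to MiB, rounded to one decimal.
bytes_to_mib() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b / (1024 * 1024) }'
}
```

With a real path it could be called as `bytes_to_mib "$(du -sb /path/to/your/database | cut -f1)"`, since `du -sb` prints the byte count, a tab, and then the path. The same division can be done in the value_template instead if you prefer to keep the raw bytes in the file.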

Again, this is not usable for HA OS (or Supervised) users, right?

No I don’t think so.

What do you mean with that? What is it used for (at all)? As far as I can see I have those:

  • _internal with 7.2 GB (!!!)
  • chronograf with 33 MB
  • homeassistant with 5.5 GB

I’m really wondering why (the hell) _internal seems to use SO MUCH space! Need to take care of that… somehow…

I think it’s also because of the huge retention policy. Default seems to be 7 days…

Update: found this InfluxDB _internal 1.x measurements and fields | InfluxData Platform Documentation

So according to this, _internal is used for metrics. Disabling _internal would mean the following few things no longer work (they were useful from time to time, especially for maintenance):

  • SELECT numMeasurements FROM "_internal"."monitor"."database" WHERE "database"='homeassistant' ORDER BY time DESC LIMIT 1
  • SELECT numSeries FROM "_internal"."monitor"."database" WHERE "database"='homeassistant' ORDER BY time DESC LIMIT 1
  • Also shard is not available anymore. But those results were never useful, miles away from actual disk usage (e. g. SELECT diskBytes FROM "_internal".."shard" WHERE "database"='homeassistant' order by time desc limit 1)

Update:
I set the retention policy back to 7 days, waited roughly one hour and checked the folder size again: it had decreased to <300 MB, so the unnecessary storage issue is fixed. I’ll keep _internal enabled for now. The only motivation to disable it would be

  • actual performance increases or
  • especially less system utilization (CPU, RAM).
    Can you maybe say some words on those two things @tom_l ?

To anyone who stumbles onto this post looking for a way to track InfluxDB storage use (perhaps someone can point the InfluxDB integration page to this thread, as it REALLY should be linked; or feel free to lift my version of the code here to save someone else a day of testing and struggles)…

The basis for this was Aeiou’s previous post. The format for executing a command_line (bash) command has changed, and likewise there are now much better/easier ways to set up and maintain the value template/sensor, so without further ado…

Requirements:

  1. Access to the host’s NATIVE file system to run the ‘du’ command (and, of course, to add said command to the system’s cron for updates)
  2. Access to files in HA’s config directory (though in theory you can store this file ANYWHERE HA can read). ‘/config’ is the default location/home directory where command line jobs run.

First add simple cron job on the host running your HA instance. In my case I use HA core running in Docker but regardless the only thing needed is access to the ACTUAL host’s file system and location where your HA config directory lies. I chose to update the file every 10 min. Remember to restart & reload the cron task as needed for your environment.

*/10 * * * * /usr/bin/du -sb  "<<path_to_persistent_Influx_storage>>/Influx/engine/data/<<Your_actual database_Directory>>/" | cut -f -1 > "<<path_to_persistent_HA_data>>/HomeAssistant/config/influxdb2size"
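If you would rather have cron call a small script than the one-liner, here is a rough sketch. It assumes GNU du, and `write_size` is a made-up helper name. The idea is to write to a temporary file first and then rename it, so HA never reads a half-written file, and to skip the update when du returns nothing (for example, on a wrong path):

```shell
#!/bin/sh
# write_size SRC_DIR OUT_FILE
# Hypothetical helper: stores the size of SRC_DIR in bytes in OUT_FILE.
write_size() {
    src="$1"
    out="$2"
    tmp="$out.tmp"
    # du -sb prints "<bytes><TAB><path>"; keep only the byte count.
    bytes=$(du -sb "$src" | cut -f1)
    # Leave the existing file untouched if du produced no output.
    [ -n "$bytes" ] || return 1
    # Write next to the target, then rename so readers see a complete file.
    printf '%s\n' "$bytes" > "$tmp" && mv "$tmp" "$out"
}
```

The cron entry would then source or wrap this function and call it with the same two paths used above. Whether the extra care is worth it for a value read every few minutes is a judgment call; the one-liner works too.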

Next you need to add the sensor. I chose to do this old skool by simply cat’ing the file. You MAY also be able to do this with the file read primitive in HA, but since there are many other possible applications of bash-based sensors I went this route.

As mentioned, the format for command_line has changed (it is no longer platform based). I also added a unique_id so you can set the area and see it by default in Lovelace. I set both device_class AND unit_of_measurement so you can change the units and precision as you wish, all from the UI, without having to deal with the complications of customizing the value_template. I did leave the original rounding in, though it’s no longer needed.

command_line:
  - sensor:                 #-----InfluxDB size --- See cronjob for du command (writes file size in bytes every 10 min)
      name: HA InfluxDB Size
      command: "cat /config/influxdb2size"       # read local data from du cronjob
      value_template: "{{ (value.split('\t')[0]|int)|round(1) }}"   
      unit_of_measurement: 'B'
      device_class: data_size
      icon: mdi:database
      scan_interval: 300                                       # refresh sensor every 5 min from file                     
      unique_id: <xxxxxx>                                      # https://www.uuidgenerator.net

Then restart. And find your sensor and config in UI as desired.
SIMPLE!

As I said, hopefully this helps someone… Now to clean up my MariaDB sensor…