HA disk size and InfluxDB

I am studying the latest HA release to install as a Linux KVM virtual machine. My goal is to replace an old version that I currently have running at home.

I saw that the original qcow2 image haos_ova-12.1.qcow2 has a dynamically allocated (thin-provisioned) size of 32 GB. However, when we check the disk from within the HA CLI, we get the info in the screenshot below, which says the disk is a lot larger than 32 GB.

I would like to increase the OVA size because I want to use InfluxDB to retain my data longer, let's say for a minimum of 3 years.

So I have the following questions:

  • If I use “qemu-img resize disk.qcow2 +XXXG” to increase the disk size, will HA then recognize and use the new disk size automatically, or do I have to grow the partitions from within the HA CLI?

  • As there are several mounted partitions (see the screenshot), if I do have to grow the partitions from the CLI, which one should I increase to support a larger InfluxDB?

  • How do I configure HA to exclusively use InfluxDB as its database and where do I define the InfluxDB retention time?

(screenshot: 2024-03-30 10-33-47)

Just shut the VM down, expand the drive, and boot it back up. HA will claim the new space and make adjustments as necessary.
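For reference, growing the image from the KVM host can be sketched like this. The filename is the one from the original post; the `+100G` delta is just an example, so size it to your own needs:

```shell
# Shut the HA VM down first, then grow the qcow2 image on the host.
qemu-img resize haos_ova-12.1.qcow2 +100G

# Optional sanity check: confirm the new virtual size before booting.
qemu-img info haos_ova-12.1.qcow2
```

On the next boot, HAOS grows the data partition into the new space on its own, so no manual partitioning from the HA CLI should be needed.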

You don’t. That’s not what InfluxDB is meant for.


Consider moving InfluxDB to another machine.
Your backups will be huge and slow, and InfluxDB will run less optimally in a VM or container (here it is actually both a VM and a container, so it is a double hit).
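For anyone going that route, pointing HA at an external InfluxDB 1.x instance is a small `configuration.yaml` fragment. A minimal sketch, assuming made-up host, database, and entity names:

```yaml
influxdb:
  host: 192.168.1.50        # example IP of the standalone InfluxDB machine
  port: 8086
  database: home_assistant  # must already exist on the InfluxDB side
  include:
    entity_globs:
      - sensor.solar_*      # only ship the power/energy sensors you care about
```

Retention is then defined on the InfluxDB side, not in HA, e.g. with a retention policy like `CREATE RETENTION POLICY "three_years" ON "home_assistant" DURATION 156w REPLICATION 1 DEFAULT` in the influx shell (156 weeks is roughly 3 years).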


Thank you! So the only thing I need to do is increase the qcow2 OVA file before installing it.

Regarding InfluxDB, the documentation clearly states that HA supports InfluxDB only for sensors, and that is why I am concerned. Everything that cannot be defined as a sensor will be held inside HA. If my goal is 3 years of data retention, how big should the HA qcow2 disk be to accomplish this?

As a last question, is it possible to extract the data from the internal HA database? That could also be a good way to keep historical data.

Yes… that is what I thought would definitively remove any HA limitation on database size. However, data backup is a pain, and so are the resources for 2 VMs: one for HA, another for InfluxDB.

I’ll second that you should run InfluxDB as a standalone process. At worst, maybe in a Docker container OR a VM, but never both. An instance hosted as an add-on in HAOS in a VM can perform pretty badly under stress, and you will see it suffer because of it.

The question I have is why do you need 3 years of retention? Do you really need to know that a door was opened or closed 3 years ago? If you’re going to go this route, I would avoid HA’s sqlite3 internal database and set up a MariaDB instance as well, then point your Recorder to that. It can handle storing the data you want and will be faster on most lookups. That, along with InfluxDB will give you the retention that you want while still keeping HA on a moderately sized volume.
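A minimal sketch of what that Recorder pointer looks like; `db_url` is the real recorder option, while the host, user, and password here are placeholders:

```yaml
recorder:
  # Point HA's Recorder at an external MariaDB instead of the built-in SQLite.
  db_url: mysql://homeassistant:YOUR_PASSWORD@192.168.1.51:3306/homeassistant?charset=utf8mb4
```

The MariaDB database and user have to be created on the database server first; HA creates its own tables on first start.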

Thinking about what you have said, you’re right! It does not make sense to keep a lot of “junk” information. But everything that relates to power I want to keep, even beyond 3 years. You may ask why, so let me explain my use case:

I have 12 kW of solar production that I may expand to above 20 kW. Production is expected to decline over time. If I keep a record of it, I can know how much I was producing and evaluate production losses, since the solar panels carry warranties on production levels. On the other hand, this data is just numbers… You may want, for example, to compare your system’s production over the last 5 years in January (winter here).

This said, what would be your recommendation: just InfluxDB for the sensors’ numerical history?

I didn’t know that we could use an external MariaDB. Thinking about what you said (keeping “junk”), what is the use case for an external MariaDB?

Thank you!


You might not need to do anything at all. HA stores certain data (like power statistics) indefinitely.

Search around for Long Term Statistics (LTS) and you’ll soon discover what type of data is considered “junk” (deleted after 10 days by default) and what is LTS (never deleted).

At most, you might need to make sure your entities are configured correctly, but other than that, there is no need for an external database.
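“Configured correctly” here mostly means the entity has a `state_class` (and, for energy, a `device_class`), since that is what makes it eligible for long-term statistics. A sketch for an MQTT-based inverter sensor, with a made-up name and topic:

```yaml
mqtt:
  sensor:
    - name: "Solar energy total"
      state_topic: "inverter/energy_total"  # hypothetical MQTT topic
      unit_of_measurement: "kWh"
      device_class: energy
      state_class: total_increasing         # flags it for LTS (kept forever)
```

Entities without a `state_class` only get the short-term history that the recorder purges.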


That’s exactly where I was going with it. Thanks for reading my mind! lol


Thank you all!

HA is fantastic… I guess every problem I might run into has already been addressed during the product’s lifetime.

Just out of curiosity, one final question:

I have several devices sending a lot of MQTT messages. Some of them publish over 150 data fields in a single message (e.g., my inverter). However, I just don’t use/care about most of the sent data. The fields I collect via “sensor” definitions will be kept in history. Now, what happens to the ones I do not define as sensors? Are they discarded? (I hope so…)



If no sensor is created, the data is discarded.


And if it is, you can still exclude it from recording by configuring the recorder integration, which you probably should do anyway. :wink:
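A sketch of trimming the recorder that way; the entity glob is a made-up example:

```yaml
recorder:
  purge_keep_days: 10           # the default short-term history window
  exclude:
    entity_globs:
      - sensor.inverter_raw_*   # hypothetical noisy entities you never chart
```

Excluded entities still update live in the UI; they just never hit the database.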
