InfluxDB v2
Always do a backup before deleting data… Right… You have been warned. Now: continue on your own. I’ve tested the query, but nothing more.
For people with InfluxDB v2.x and newer, here’s the Flux query I’m using in the Data Explorer to list every entity_id stored in the bucket:
import "influxdata/influxdb/schema"
schema.tagValues(
bucket: "homeassistant",
tag: "entity_id",
)
If you’d like to confirm that you can query data, try and run:
from(bucket: "homeassistant")
|> range(start: -20y, stop: now()) // Last 20 years
|> filter(fn: (r) => r["entity_id"] == "0x90fd9ffffe0d1286_linkquality")
Here’s another one:
from(bucket: "homeassistant")
|> range(start: 2023-06-01T00:00:00Z, stop: 2024-02-01T00:00:00Z)
|> filter(fn: (r) => r["entity_id"] == "sm_s906b_battery_level" and r["_field"] == "value" and r["_measurement"] == "%" and r["domain"] == "sensor" and r["_value"] <= 20.0)
You can also add more filters, like:
from(bucket: "homeassistant")
|> range(start: -5y, stop: now()) // Last 5 years
|> filter(fn: (r) => r["_measurement"] == "lqi")
|> filter(fn: (r) => r["entity_id"] == "0x90fd9ffffe0d1286_linkquality")
Here’s a way to find the first timestamps from a given entity:
from(bucket: "homeassistant")
|> range(start: 0, stop: now()) // Change the start time if necessary
|> filter(fn: (r) => r["entity_id"] == "0x90fd9ffffe0d1286_linkquality")
|> first()
But with InfluxDB v2.x you’d need to run the delete request through the API or in the CLI.
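For the API route, here is a minimal sketch of what such a request could look like with curl. It assumes InfluxDB answers on http://localhost:8086 and that an API token is exported as INFLUX_TOKEN; the function names build_delete_body and delete_via_api, and the ORG_ID and BUCKET variables, are made up for illustration. I have not run it against a live instance, so treat it as a starting point and check it against the docs.

```shell
#!/usr/bin/env bash
# Sketch: send a delete request to the InfluxDB v2 HTTP API instead of the CLI.
# Assumptions (not from the post): the host/port below, and every function and
# variable name except INFLUX_TOKEN and INFLUX_HOST.

INFLUX_HOST="${INFLUX_HOST:-http://localhost:8086}"
ORG_ID="${ORG_ID:-YOUR_ORG_ID}"
BUCKET="${BUCKET:-homeassistant}"

# Build the JSON body for POST /api/v2/delete.
build_delete_body() {
  local start="$1" stop="$2" predicate="$3"
  # Escape backslashes and double quotes so the predicate is a valid JSON string.
  predicate="$(printf '%s' "$predicate" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')"
  printf '{"start":"%s","stop":"%s","predicate":"%s"}' "$start" "$stop" "$predicate"
}

# Send the request; INFLUX_TOKEN must be exported beforehand.
delete_via_api() {
  curl --silent --show-error --fail --request POST \
    "${INFLUX_HOST}/api/v2/delete?orgID=${ORG_ID}&bucket=${BUCKET}" \
    --header "Authorization: Token ${INFLUX_TOKEN}" \
    --header "Content-Type: application/json" \
    --data "$(build_delete_body "$1" "$2" "$3")"
}

# Example call (commented out on purpose, since it deletes data):
# delete_via_api '1970-01-01T00:00:00Z' '2022-10-22T00:00:00Z' \
#   '_measurement="lqi" AND entity_id="0x90fd9ffffe0d1286_linkquality"'
```

Nothing is sent until delete_via_api is actually called, so the body can be printed and inspected with build_delete_body first.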
Deleting in InfluxDB v2 (Docker container, not a HASS.io solution)
Here’s one example where I’ve jumped into my InfluxDB container with docker exec -it influxdb /bin/bash.
First I exported my API token, which I created in the InfluxDB Web UI.
Do that like this:
export INFLUX_TOKEN=ADD_TOKEN_HERE
Next get the org ID with:
influx org list
And now run the command:
influx delete \
--org-id "YOUR_ORG_ID" \
--bucket homeassistant \
--start '1970-01-01T00:00:00Z' \
--stop '2022-10-22T00:00:00Z' \
--predicate '_measurement="lqi" AND entity_id="0x90fd9ffffe0d1286_linkquality"'
Remember to adjust both start and stop.
Result before delete:
Result after delete:
And I know… I need to update those exclude keys, because… what would I ever use link quality for…
I’d like to point out that regex can be used, too:
from(bucket: "homeassistant")
|> range(start: -1y, stop: now()) // Last year
|> filter(fn: (r) => r["_measurement"] == "lqi")
|> filter(fn: (r) => r["entity_id"] =~ /_linkquality$/)
You cannot use regex in the delete predicate, but it’s possible to extract all entity IDs like this:
from(bucket: "homeassistant")
|> range(start: 0, stop: now()) // All time
|> filter(fn: (r) => r["_measurement"] == "lqi")
|> filter(fn: (r) => r["entity_id"] =~ /_linkquality$/)
|> distinct(column: "entity_id")
After that, continue deleting them one by one, with a shell script, a Python script, or whatever else you’d like.
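A shell script for that could look roughly like the sketch below. It assumes the entity IDs have been saved one per line in a file, and that the "lqi" measurement and the time range from the earlier delete example still apply. The function names, the variables, and the DRY_RUN switch are my own illustration, not anything InfluxDB provides. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: delete a list of entities one by one with `influx delete`.
# Assumptions (not from the post): the IDs live one per line in a file, and
# ORG_ID, BUCKET, START, STOP and DRY_RUN are illustrative variable names.

ORG_ID="${ORG_ID:-YOUR_ORG_ID}"
BUCKET="${BUCKET:-homeassistant}"
START="${START:-1970-01-01T00:00:00Z}"
STOP="${STOP:-2022-10-22T00:00:00Z}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really delete

# Build the delete predicate for a single entity ID.
build_predicate() {
  printf '_measurement="lqi" AND entity_id="%s"' "$1"
}

# Run (or, in dry-run mode, print) one delete per ID listed in the given file.
delete_entities() {
  local file="$1" entity_id predicate
  while IFS= read -r entity_id || [ -n "$entity_id" ]; do
    [ -z "$entity_id" ] && continue   # skip blank lines
    predicate="$(build_predicate "$entity_id")"
    if [ "$DRY_RUN" = "1" ]; then
      echo "influx delete --org-id $ORG_ID --bucket $BUCKET --start $START --stop $STOP --predicate '$predicate'"
    else
      # --org-id expects the ID from `influx org list`; stdin is redirected
      # so the command cannot swallow the rest of the ID file.
      influx delete \
        --org-id "$ORG_ID" \
        --bucket "$BUCKET" \
        --start "$START" \
        --stop "$STOP" \
        --predicate "$predicate" < /dev/null
    fi
  done < "$file"
}
```

Source the file, run delete_entities with the path to your ID list, read through the printed commands, and only then set DRY_RUN=0.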
But don’t listen to me. I’d always recommend taking a look at the docs :D
Updating the Data Retention Policy
Again, you’ll find the information in the InfluxDB docs too, but here are the steps.
First, make sure you’re authenticated.
Next, list the buckets in your organization: influx bucket list --org-id SET_ORG_ID_HERE
Now set the new Data Retention rule with influx bucket update --retention 730d0h0m0s --id SET_BUCKET_ID_HERE
Personally I had no restriction on Home Assistant entries, but now it’s capped at two years. You’ll notice that InfluxDB at some point will start cleaning up the data. I’m not 100% sure how that’s scheduled, but you can always check the latest logs with docker logs -f influxdb --tail 100.
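If you change the retention more than once, the update step can be wrapped in a small helper. This is only a sketch: BUCKET_ID, RETENTION_DAYS, DRY_RUN and the two function names are illustrative, and it uses the short 730d form of the duration. It prints the command by default instead of running it.

```shell
#!/usr/bin/env bash
# Sketch: set a bucket's retention period from a number of days.
# Assumptions (not from the post): all variable and function names below.

BUCKET_ID="${BUCKET_ID:-SET_BUCKET_ID_HERE}"
RETENTION_DAYS="${RETENTION_DAYS:-730}"   # two years, as in the post
DRY_RUN="${DRY_RUN:-1}"                   # 1 = only print the command

# Turn a day count into a duration string such as "730d".
retention_duration() {
  printf '%dd' "$1"
}

# Update the bucket, or just show the command in dry-run mode.
set_retention() {
  local duration
  duration="$(retention_duration "$RETENTION_DAYS")"
  if [ "$DRY_RUN" = "1" ]; then
    echo "influx bucket update --id $BUCKET_ID --retention $duration"
  else
    influx bucket update --id "$BUCKET_ID" --retention "$duration"
  fi
}
```

Running influx bucket list again afterwards should show the new retention value for the bucket.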
Happy housekeeping, everyone!