I needed a solution for my very large Home Assistant database of 27 GB (growing by the minute).
After a quick search I decided to go with InfluxDB and Grafana.
Unfortunately, Home Assistant doesn’t have a built-in way to migrate historical data to InfluxDB. Since I couldn’t find any recent discussions or solutions for this, I teamed up with ChatGPT to create a Python script that copies Home Assistant historical data from an SQLite database to an InfluxDB bucket.
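In a nutshell, the approach looks like this (a self-contained sketch using a throwaway in-memory database that mimics the recorder layout; the table and column names follow the current recorder schema, everything else here is illustrative — the real script opens `home-assistant_v2.db` and writes each row to InfluxDB instead of printing):

```python
import sqlite3
from datetime import datetime, timezone

# Throwaway in-memory database with the recorder's layout
# (states joined to states_meta via metadata_id).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE states_meta (metadata_id INTEGER PRIMARY KEY, entity_id TEXT);
    CREATE TABLE states (state TEXT, last_updated_ts REAL, metadata_id INTEGER);
    INSERT INTO states_meta VALUES (1, 'sensor.outside_temperature');
    INSERT INTO states VALUES ('21.5', 1728900000.0, 1);
""")

# The kind of query the migration runs: one row per recorded state,
# skipping unusable values.
rows = conn.execute("""
    SELECT sm.entity_id, s.state, s.last_updated_ts
    FROM states AS s
    JOIN states_meta AS sm ON s.metadata_id = sm.metadata_id
    WHERE s.state NOT IN ('unknown', 'unavailable')
""").fetchall()

for entity_id, state, ts in rows:
    # Epoch timestamps in the recorder are UTC by definition
    utc_dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    print(entity_id, state, utc_dt.isoformat())
```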
Tested with Home Assistant Core 2024.10.1 and InfluxDB v2.7.10.
I run HA, InfluxDB, and Grafana in separate Docker containers.
You can check it out on GitHub:
My next step is to create a nice Grafana Dashboard.
Comments or hints are welcome!
I’ve been using InfluxDB for more than a year now. Could I use your script to import only the data that isn’t in InfluxDB yet? Does your script check the existing data in InfluxDB, or does it duplicate it?
Thank you for your work and for sharing it, @eldigo.
I have installed and configured InfluxDB and Home Assistant and ran your script to migrate all data from SQLite to InfluxDB.
I set the source for the InfluxDB integration to HomeAssistant instead of HA.
After the migration I have two sets of sensor data for outside temperature: one with source HA, the other with source HomeAssistant.
(No problem with that, will correct it later and restart.)
But I have detected an overlap between the two graphs in Grafana.
Do you have an idea how to correct it when I rerun the script?
(green = source HA)
The overlap starts at 15:24 and ends at 18:32, so it does not seem to be the UTC-to-local-time difference, which is 2 h.
But when I look at the temperature at the start and find a point where the curve could line up, the offset does seem to be 2 h. I think I have a problem with the time setting somewhere.
@eldigo
I had a conversation with Gemini about my findings, and it came up with a modification to your script, which I have adopted and tested.
It seems to remedy my problem.
The timestamp from the SQLite database is now converted to UTC before being inserted into InfluxDB.
Here is the modified section for inspection and adoption:
# Imports needed at the top of the script:
# from datetime import datetime
# import pytz
# from influxdb_client import Point

if unit_of_measurement == '':
    unit_of_measurement = 'count'
try:
    ## Old conversion (naive local time, written as if it were UTC):
    # last_updated_dt = datetime.fromtimestamp(float(last_updated_ts))

    # Convert the Unix epoch timestamp to a naive datetime (local time)
    local_dt = datetime.fromtimestamp(float(last_updated_ts))
    # Make the datetime timezone-aware for the local timezone (CEST)
    local_tz = pytz.timezone('Europe/Berlin')  # use the correct timezone for your HA
    local_dt_aware = local_tz.localize(local_dt)
    # Convert to UTC before writing to InfluxDB
    utc_dt = local_dt_aware.astimezone(pytz.utc)
    # Create an InfluxDB point with tags and fields
    point = Point(unit_of_measurement).tag("source", "HA").tag("domain", domain)
    point.tag("entity_id", entity_id_short).tag("friendly_name", friendly_name).time(utc_dt)
    # ... (rest of the try block unchanged: add the state field and write the point)
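As a side note (not part of the modified script above): since Unix epoch timestamps are timezone-independent, the same UTC conversion can be done in one step with the standard library alone, which also avoids any dependence on the machine’s local timezone setting:

```python
from datetime import datetime, timezone

last_updated_ts = 1728900000.0  # example value; in the script this comes from SQLite
# Epoch seconds count from 1970-01-01 UTC, so the timestamp can be
# interpreted as UTC directly, without pytz and the local-time detour:
utc_dt = datetime.fromtimestamp(float(last_updated_ts), tz=timezone.utc)
print(utc_dt.isoformat())  # 2024-10-14T10:00:00+00:00
```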