Can I use Timescale DB as an alternative to Influx DB in Homeassistant for Grafana?

Hi @hastarin, thanks for the hint. Very interesting!
Think I will try that ans post the results here.

Hi @hastarin,

Today I took some time to install Timescale DB via Docker. After that I made a backup of my old Postgres database via “pg_dump --schema-only -f old_db.bak old_db”. Then I created the corresponding database in Timescale and restored the Postgres backup.
After that I wanted to create a hypertable using “SELECT create_hypertable('events', 'time_fired');”, which unfortunately didn’t work :frowning: BTW, according to the Timescale data migration instructions, the table must be empty before the hypertables are created.

After some searching and reading I found out that the Hypertable could not be created because the PRIMARY KEY must contain all dimensions of the partitioning of the Hypertable: https://github.com/timescale/timescaledb/issues/447#issuecomment-399510172

So I decided to delete the previous PRIMARY KEY:
alter_table

And to recreate it, including the time value desired for the partitoning:
alter_table_constraint

After that I created an index on it, and created the desired hypertable without any problems:
create_index
create_hypertable

This process includes that the foreign key between the events table and the states table must also be deleted and cannot be recreated afterwards:

Unfortunately, I don’t know if this has any effect on Homassistant, and the tables can still be used. I will test that again and get back to you.

Best regards

Replying from mobile here so I will keep it short. First great work.

You could add a unique constraint to the events table for event_id then you’ll be able to recreate the foreign key.

I unfortunately haven’t had time to even update my home assistant setup lately let alone play with Timescale so thanks for doing the experiment and sharing it with the community.

Hi, thank you very much for you Feedback
Unfortunately, creating a unique constraint on the event_id that doesn’t seem to work anymore:

indexnnt

Hi,
For anyone interested: https://github.com/freol35241/ltss

Its not meant to replace the recorder component, rather to be an alternative to the InfluxDB component.

5 Likes

Thank you very much @freol!
I tried your custom component today and so far everything seems to work. :slight_smile:

Maybe one more clue. For the initial setup of the postgis extension the postgres superuser is apparently needed. Otherwise you get the following error message:

Protokolldetails ( ERROR )
Mon Dec 23 2019 08:44:26 GMT+0100 (Mitteleuropäische Normalzeit)
Error during connection setup: (psycopg2.errors.InsufficientPrivilege) permission denied to create extension "postgis"
HINT:  Must be superuser to create this extension.

[SQL: CREATE EXTENSION IF NOT EXISTS postgis CASCADE]
(Background on this error at: http://sqlalche.me/e/f405) (retrying in 3 seconds)

After that you can switch back to the actual homeassistant user for postgres

Nice catch! Will add a note about this in the readme.

And thanks for reporting back, good to hear that it seems to work fine.

Thank you so much for this! finally able to migrate all into a single time series db that supports proper grouping by month! Awesome!

To aid this solution, i’ve created an addon for running timescaledb:

Custom repo addon url:

Please feel free to review/suggest/testdrive.

1 Like

because I use Home Assistant Core as a docker container, I can’t test your addon. I already use Timescaledb as a standalone Docker Container.
In any case, thanks for your efforts! I think it can be very helpful for others.

Best regards and thanks again.

No probs!
Thank YOU for the integration!
TimescaleDb and HA LTSS feels like a perfect fit. It fulfills all my data and monitoring needs.

Hey there,
I am intrigued by the proposed performance increase when compared to InfluxDB, but has anyone of you guys actually noticed a difference? I have used InfluxDB for over 4 years now and only when querying a lot of things at once over a multiple year time span is when I can actually notice the time it takes to load (in Grafana). My services run on a full blown PC though, not a tiny RPI, so that may help InfluxDB to still perform.

Did one of you migrate their existing InfluxDB data to TimescaleDB?
Did you notice performance improvements? CPU, Memory and Latency?

The query performance in Grafana is felt to be slightly worse compared with Influx.
Unfortunately I can’t say anything about the other things, because I neither migrated my data nor do I have a special look on the system performance, because I also have a real computer with enough resources.

I use the TimescaleDB mainly because of the DB functions it offers me.

Here is an interesting comparison of the two databases: https://blog.timescale.com/blog/timescaledb-vs-influxdb-for-time-series-data-timescale-influx-sql-nosql-36489299877/

2 Likes

Thank you.
I already read this comparison, but its written by the creators of TimescaleDB, which is why I do not trust their findings that much. While it may be true that TimescaleDB is more suited for extremely big pools of data, I am not sure InfluxDB is the worse choice for a relatively small dataset produced by something like Home Assistant. They also ramble quite a lot about the fact that TimescaleDB is way more stable because it relies on the PostgresQL foundation while Influx had to rewrite everything from scratch, and while that is certainly not irrelevant, I have never ever experienced anything even remotely bad while using InfluxDB. It just worked flawlessly, all of the time, and I don’t even run it on linux but FreeBSD.

So, all in all, I am not sure if the simple fact that TimescaleDB is a SQL based DB is enough for me to justify a DB switch. If someone comes across an in-depth and more importantly independent performance comparison of both, please share it. I have searched quite a bit but have not been able to find something of relevance.

Why don’t you just do a comparison test yourself and share your findings with us? :wink:

If you have a computer with enough system resources, it should be no problem for you to run both databases in parallel.

If you do not benefit from the full SQL compatibility of Timescale and have been fully satisfied with Influx so far, there is probably no reason for you to change.

Here you maybe have an also “less biased” DB Benchmark: https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b

Because I don’t have the time to invest hours into installing an unknown database, testing it in a scientifically accurate way and create a valid summary that other people can actually make sense of. That’s the exact reason I used a search engine to find this thread and ask you guys about it :wink:

That other test you linked has the exact same problem as the one you posted before. I didn’t know about that one though, so thx :slightly_smiling_face:

And, in addition to that, yes, I feel like InfluxDB is not only the “ok” choice for Home Assistant, but actually the better one. But that’s just a feeling, so I might be totally wrong. My conclusion for now is that I will happily continue to use InfluxDB until it breaks and I can finally test my backup strategy in a real scenario :crazy_face:

Of course. Feel free to use whatever you feel most comfortable with :slight_smile:

I think the performance differences only become clear if you have an extremely large set of sensors logging a lot of data. So from the performance-perspective, I think there is no reason to switch.

In this light, Influx makes perfect sense for home assistant.

The reasons I switched over are:

1- I have vast sql knowledge. So using SQL gives me far quicker results then learning Flux and spend hours and hours to find out how a certain query is written in Flux.

2- Joins! I was baffled reading this: https://github.com/influxdata/influxdb/issues/13498
Joins were not supported like 6 months ago, and are a technical-preview kind of thing.
But… only within influx itself. Say I wanted to join a table containing a lookup-set of the holidays? Good luck!

3- Updates. So in my case my data was slightly off. I wanted to rename an entity because I made a typo when setting it up, and noticed after a month. But this leads to old data still being stored on the old entity-id, and the new data on the new entity-id.
Sorry bro! Update does not exist! You have to remove your old data, edit it somehow somewhere outside of influx, and re-insert in again.

4- Indexes. Influx stores just a blob of data and indexes only it’s tags. So say you want to add an index to a field in your measurement-data? Sorry! Cannot do. Remove your previous data and reinsert it all again using the correct tags.
With timescale (or any postgresql), you can create an index on any json-field you like, or simply create a computed column and index that one. 1 line of SQL. Done!

5- Gap-fill data. Say I missed 1 day of measurements because I was updating home assistant. I would like to ‘copy’ the previous day, so my monthly totals will be somewhat OK. Easypeasy with a SQL-query.

6- Client tools. This last one was the final deal-breaker.
There are simply no client tools available, except for a few command-line tools: https://docs.influxdata.com/influxdb/v1.8/tools/
Where is InfluxDb studio??
So, there you have a nice timeseries db, and no good way to explore your data before you paste your query into a Grafana chart. It’s like having a great engine but no car around it with seats and a steering wheel. And no, the explore-window in influx is not equivalent. Your query won’t be saved, and you cannot have multiple queries open. It’s feels like programming through a keyhole.

This list goes on and on. Influx is nice!. SQL is better. Because it’s more mature.
So, let me turn this question around now. What has Influx to offer that cannot be done by TimescaleDb? Performance and memory certainly not, because I run both side by side on an RPI4

Ofcourse my opinion is biased, and yes, I am the creator of the timescale-db addon.
But the reason for creating this addon is because Influx let me down when I needed it.
Not on performance, it’s fast enough. But on functionality.
I want Influxdb to be better, because it sounds cooler and PostgreSql sounds so 1974…
But in the end, I just wanted my dashboards, and I wanted them quick.

(Btw, I never worked with PostgreSql before in my entire life, I’m a Microsoft SqlServer expert).

So the moral of the story: It took me less time learning how to create a hassio-addon ánd imlement timescaledb ánd learn PostgreSql ánd learn TimescaleDb ánd build my dashboard on top, then learning flux and trying to let it sing without proper tooling.

I am more then happy to give Influx another spin someday when above concerns have been properly adressed. And maybe this time it will blow my socks off and I will never look back at the 1970’s again :wink:

In the mean time, I am enjoying my dashboards:

3 Likes

Thx for the detailed insight. These are very valid points, if applicable. I guess I just don’t need most of the things you listed. I only use InfluxDB through Grafana, nothing else, and although renaming old entity data would be nice, its not worth the effort to make the switch. Its not just the data that I would have to migrate (which would probably be the easiest task), its also all of my grafana dashboards, and sadly thats not as easy as running a migration script :frowning_face:. Also, there is nothing I could name off the top of my head that I dislike in my Grafana setup, that is caused by InfluxDB’s limiting abilities.

Indeed, you should have a very compelling reason to rewrite all your dashboards.
If influx works for you now and in the near future, don’t switch.

But In your original question you asked of there were any people who had made the switch.
Well, I have… only not for performance reasons.
And later you stated you feel that Influx is actually the better choice for HA.
That claim depends on what you are trying to achieve with it I suppose.

In the dashboard above, I calculate the costs of the last charge.
But the electricity-price depends on the time of day (and weekend, and high low tarifs).
So, If I start charging on friday 21:30 PM until saturday 2 AM, what are my real charging costs?
Or what if that friday is a national day-off (so it falls in a low-tariff).
Its exactly that complexity where SQL really shines.

But I really hope that Influx one day gets this super awesome uprade that makes this kind of stuff trivial. That day is the day I will switch back to Influx.

1 Like