Hi! I finally sank some hours into it myself, and I think I found a working solution!
After trying some minimal approaches & your full Dockerfile, I created an MWE that should be applicable to your Dockerfile as well.
The builds took ~1.5h each, so I didn't want to spend more time on that, especially since you are much more proficient in your repo than I am.
After that I tried to get your Dockerfile to run w/o the toolkit build - it did run, but I had to update a few versions and remove a few version pins to get it to work. All in all, the build process took too long, so I switched to the MWE approach for the toolkit build.
Here is the final Dockerfile, which builds timescaledb + the toolkit w/o any problems on Alpine (but in 5915 seconds, so quite a while!):
FROM ghcr.io/hassio-addons/base/aarch64:14.3.2
# Install Postgresql & build dependencies
RUN apk update \
&& apk add --no-cache cargo clang cmake curl gcc gcompat make musl-dev git openssl-dev pkgconfig postgresql postgresql-dev rust rustfmt
# Build timescaledb
RUN git clone https://github.com/timescale/timescaledb \
&& cd timescaledb \
&& ./bootstrap \
&& cd build \
&& make install
# Set up Postgresql extension build environment
RUN cargo install --version '=0.10.2' --force cargo-pgrx \
&& cargo pgrx init --pg15 pg_config
# Build toolkit
RUN git clone https://github.com/timescale/timescaledb-toolkit \
&& cd timescaledb-toolkit/extension \
&& cargo pgrx install --release \
&& cargo run --manifest-path ../tools/post-install/Cargo.toml -- pg_config
# Prepare folder for socket
RUN mkdir /run/postgresql
RUN chown postgres:postgres /run/postgresql/
# Initialise DB
USER postgres
RUN initdb -D /var/lib/postgresql/data
## Poor man's test with a nasty sleep :)
RUN (postgres -D /var/lib/postgresql/data &) \
&& sleep 2 \
&& echo "CREATE DATABASE testdb;" | psql \
&& psql testdb -v ON_ERROR_STOP=on -c "CREATE EXTENSION timescaledb_toolkit; CREATE TABLE test(ts timestamp, value float); SELECT time_weight('Linear', ts, value) FROM test;"
ENTRYPOINT postgres -D /var/lib/postgresql/data
I went down the gcompat route once while I was borrowing binaries from the official Docker image, but it never occurred to me that I could also need it to fix the SIGKILL issue during the build!
So with that out of the way, the next thing, I guess, should be to split the build times over different pipelines. I was almost thinking about creating an Alpine package with timescaledb-toolkit myself in a separate repo+pipeline, and only pulling in the package during the build of the add-on.
I mitigated this by splitting the steps up across multiple Dockerfiles.
For non-failing pipelines, I guess you could still stick to one single Dockerfile if you arrange it efficiently (ordering the steps from least to most frequently changed) and if pipeline caches are properly set up.
Build times should then be quite low as well.
I guess you can't parallelize much here, since things depend on each other as far as I understood.
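For what it's worth, a multi-stage build might be a middle ground: keep the expensive toolkit compile in a builder stage and only copy the artefacts into the final image, so the long step stays cached as long as its layers don't change. This is just a sketch based on the Dockerfile above - the stage name and the copied paths are assumptions, the real locations come from pg_config --pkglibdir / --sharedir on the base image:

FROM ghcr.io/hassio-addons/base/aarch64:14.3.2 AS toolkit-builder
# Same build dependencies as above, incl. gcompat for the pgrx build
RUN apk add --no-cache cargo clang cmake curl gcc gcompat make musl-dev git openssl-dev pkgconfig postgresql postgresql-dev rust rustfmt
RUN cargo install --version '=0.10.2' --force cargo-pgrx \
&& cargo pgrx init --pg15 pg_config
RUN git clone https://github.com/timescale/timescaledb-toolkit \
&& cd timescaledb-toolkit/extension \
&& cargo pgrx install --release \
&& cargo run --manifest-path ../tools/post-install/Cargo.toml -- pg_config
# Final stage only installs postgresql and copies the compiled artefacts
FROM ghcr.io/hassio-addons/base/aarch64:14.3.2
RUN apk add --no-cache postgresql
# Assumed locations - check pg_config --pkglibdir and --sharedir on the base image first
COPY --from=toolkit-builder /usr/lib/postgresql /usr/lib/postgresql
COPY --from=toolkit-builder /usr/share/postgresql /usr/share/postgresql

With BuildKit layer caching (or by pushing the builder stage to a registry as its own image), the ~1.5h compile should only rerun when the toolkit or cargo-pgrx version changes.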
No, it's not. The recorder and the LTSS addon store the same data and have separate settings. The difference is that LTSS stores it long-term, while the standard recorder stores it for at most a few days.
Creating sensor entities does not seem to be the correct approach. Instead, I use MQTT and configure an MQTT sensor in Home Assistant. This works fine.
What a great addon!!
I have installed the HA addons for Postgres + TimescaleDB, LTSS and Grafana instead of using separate servers for InfluxDB and Grafana. I am very happy to say goodbye to Influx and enthusiastic about the ability to store long-term data in an SQL environment, which seems to be something that time-series DBs are returning to, or embracing. But I have two questions for which I can't find the answers.
Question 1) Postgres is now running as a Docker-based addon; how do I get external access to it? External access to long-term data is very important for future analytics and for moving data into other AI pipelines. These are separate concerns that need to be addressed outside of HA and addons. I want to run other SQL applications outside of the HASS containers, on a desktop machine on the same network, and attach to Postgres to access my LTSS data, but I can't find how to expose the Postgres port or the correct URL to use.
Question 2) I want to use the configuration options to include and exclude entities from being sent to the Timescale LTSS table. BUT I cannot find examples of how to construct the filters to filter entity names other than using a '*' wildcard. So what filter options do I have to do this? Can I use Jinja2 templates for this? If so, an example of a more complex filter matching part of an entity name would be nice to see. I want to rename all my entities with prefixes and postfixes that would allow me to group and manage my entities more precisely by function and control what is sent to LTSS. I want to know that they can be filtered by LTSS before I do this.
Hi, with TimescaleDB and LTSS my database is growing huge, and linearly. It is now at 10 GB. I have ~2000 entities in total and can handle 20 GB or more of DB size, but I am worried about how much it will keep growing. Can anyone help me with my config? Is anything wrong? Is including all sensor and person values bad practice?
You can open up the HA container addon to the outside world on any port you like, in the config of the addon in HA.
But… I also had another guy who wanted to run the container addon without Home Assistant. I am currently making that a bit easier, so you could use exactly the same Docker image as a Home Assistant addon, and/or as a postgresql/postgis/timescale/timescaledbtools image for running outside of Home Assistant (like on Kubernetes or just Docker).
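Once the port is exposed in the addon config, connecting from another machine is just a standard PostgreSQL connection. A quick sketch - hostname, user, password and database name are placeholders for whatever you configured in the addon:

# Connect from a desktop on the same network (values are placeholders)
psql "postgresql://homeassistant:YOURPASSWORD@homeassistant.local:5432/homeassistant"
# or equivalently
psql -h homeassistant.local -p 5432 -U homeassistant -d homeassistant

The same URL works as a connection string for most SQL tools and analytics libraries.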
Filtering of entities works the same way as for the recorder component (Recorder - Home Assistant (home-assistant.io)), or you could use a pgagent job to remove the data you don't want once every so often.
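As a rough example of the cleanup route - assuming the default ltss table name, that it is a TimescaleDB hypertable, and with the database name as a placeholder - a pgagent (or cron) job could periodically drop old chunks, or you could let TimescaleDB's built-in retention policy handle it:

# Drop LTSS data older than a year (table/database names and interval are assumptions)
psql homeassistant -c "SELECT drop_chunks('ltss', INTERVAL '365 days');"
# Or schedule it inside TimescaleDB itself, no pgagent needed
psql homeassistant -c "SELECT add_retention_policy('ltss', INTERVAL '365 days');"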
I have a fundamental question about this add-on and the need in a modern world.
I just recently moved to PG16 with TimescaleDB and wanted to know: what is the value of LTSS when TimescaleDB is the backend? Does TSDB not already do a good enough job to make LTSS less needed in this day and age?
I am not trying to be a jerk, I'm just trying to determine the value of this if the backend is already TSDB via the recorder.
I think this is more a question of how HA is designed, rather than of this component. Since HA needs to support SQLite DBs on SD cards in Raspberry Pis etc., it takes a design approach that works well for that, and we expand from there. So yes, TSDB could do it all quite effectively, but since using TSDB is an alternative backend, it has to fit the existing design requirements; hence we have the separate recorder and ltss datastores.
On my system I point both the recorder and ltss to the same TSDB database, and it works great. Having the recorder and ltss in separate tables does mean that bad query designs in HA for the recorder don't bog down the UI, which is a plus (it's not always easy to keep query optimisation high when supporting multiple DB backends).
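If anyone wants to sanity-check that split on their own instance, comparing the footprint of the two stores is straightforward - the table names below are the defaults (states for the recorder, ltss for LTSS) and the database name is a placeholder:

# Size of the recorder's states table vs. the LTSS hypertable in the shared database
psql homeassistant -c "SELECT pg_size_pretty(pg_total_relation_size('states'));"
psql homeassistant -c "SELECT pg_size_pretty(hypertable_size('ltss'));"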