Dump event/state data periodically to duckdb file

I frequently find myself thinking “if only I had the data for this sensor over the last few months”. I know I can increase the HA DB’s retention period, but I’m worried that’s not really what the SQLite DB is for, and other options, like sending the data to InfluxDB or installing a separate instance of MariaDB or another DB server, seem like overkill for this.
Being heavily involved with Snowflake at work, I know the power of purpose-built analytical databases (columnar, well-partitioned, etc.), and wanted to try using DuckDB to hold my data. My thought process is that if HA can do without a DB server for transactional data, why should I need a DB server for my single-or-very-few-users analytical needs?

So I built this small script and systemd unit/timer, and will start scraping my HA DB every hour into a DuckDB file on my host server. I simply copy the HA SQLite DB to the host via ssh to avoid impacting the HA instance, then do light processing to save the data in an easy-to-use format.

It’s very simple, and doesn’t yet allow for filtering entities, nor does it automatically manage data retention (I’ll have to trim it from time to time), but I think it will do the trick for a few months.

Does anyone see any obvious downside or issue with this approach?
Any ideas on what fun things you could do with this type of rich and granular data collected over a few months?

Not strictly a downside or issue, more of a heads up:

Home Assistant already stores long-term statistics for measurement & metered devices (e.g. temperature sensors or energy meters). That (numeric) data is then aggregated & never purged from the system, assuming these are set up correctly.
The only things that are purged from the recorder are non-numeric entity states, like lights going off & on or binary sensor states.

Therefore, by storing all data indiscriminately, you might end up with duplicate data in duckdb due to the presence of LTS entries.

Not using it myself, but also check out the InfluxDB integration for this purpose. People then use this with Grafana, for example.