Online percentile estimation

I’m looking to implement a view of energy usage distributions by time-of-day window (e.g. 5-minute windows), aggregated across all days. To do this, I need a good way to estimate percentiles, ideally one that still works for long data sets.
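For reference, the exact (non-streaming) version of what I mean would look roughly like the sketch below. The names `WINDOW_MINUTES`, `window_index` and `exact_percentiles` are just ones I made up for illustration, and the input is assumed to be an iterable of `(timestamp, value)` pairs. It keeps every reading in memory, which is exactly what gets awkward for longer data sets.

```python
from collections import defaultdict
from datetime import datetime
from statistics import quantiles

WINDOW_MINUTES = 5  # assumed window size; 288 windows per day


def window_index(ts: datetime) -> int:
    """Return the 0-based time-of-day window that a timestamp falls into."""
    return (ts.hour * 60 + ts.minute) // WINDOW_MINUTES


def exact_percentiles(readings):
    """Group (timestamp, value) pairs by window, ignoring the date,
    and return the 5th/50th/95th percentiles for each window."""
    by_window = defaultdict(list)
    for ts, value in readings:
        by_window[window_index(ts)].append(value)

    result = {}
    for idx, values in by_window.items():
        if len(values) < 2:
            continue  # statistics.quantiles needs at least two data points
        cuts = quantiles(values, n=100, method="inclusive")  # 99 cut points
        result[idx] = (cuts[4], cuts[49], cuts[94])
    return result
```

Memory here grows with the number of days recorded, since each window keeps its full list of values.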

I was thinking about this, and thought it might be useful for several other use cases (e.g. temperature across the day), and then wondered if this was something to add as an integration in some way.

I came across (and tested) Datadog's DDSketch library, which seems to produce accurate estimates from a fairly compact structure that could be stored and then updated, e.g. once per day.

It might also be nice to aggregate by month (i.e. energy usage by 5-minute windows across the day for February), and also to combine these into longer periods (e.g. season, year, all-time). With DDSketch, it seems you could create one estimator per window and month, then store, update, and combine them as necessary.
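To make the store-and-combine idea concrete, here is a minimal sketch in plain Python. To be clear, this is not the Datadog library: `LogSketch` is a simplified stand-in I wrote that uses the same log-spaced bucket idea, handles non-negative values only, and never collapses buckets. The `record` and `combine` helpers and the `(month, window)` key layout are hypothetical too.

```python
import math
from collections import Counter


class LogSketch:
    """Simplified DDSketch-style quantile sketch for non-negative values."""

    def __init__(self, relative_accuracy=0.01):
        self.gamma = (1 + relative_accuracy) / (1 - relative_accuracy)
        self.log_gamma = math.log(self.gamma)
        self.buckets = Counter()  # bucket key -> number of values
        self.zero_count = 0
        self.count = 0

    def add(self, value):
        if value < 0:
            raise ValueError("this simplified sketch only handles values >= 0")
        if value == 0:
            self.zero_count += 1
        else:
            # Bucket k covers the interval (gamma**(k-1), gamma**k].
            self.buckets[math.ceil(math.log(value) / self.log_gamma)] += 1
        self.count += 1

    def merge(self, other):
        if other.gamma != self.gamma:
            raise ValueError("sketches must share the same relative accuracy")
        self.buckets.update(other.buckets)  # Counter.update adds counts
        self.zero_count += other.zero_count
        self.count += other.count

    def quantile(self, q):
        if self.count == 0:
            return None
        rank = q * (self.count - 1)
        seen = self.zero_count
        if seen > rank:
            return 0.0
        for key in sorted(self.buckets):
            seen += self.buckets[key]
            if seen > rank:
                # Representative value chosen to bound the relative error.
                return 2 * self.gamma ** key / (self.gamma + 1)


def record(store, month, window, value):
    """Add one reading to the sketch for (month, window), creating it if needed."""
    store.setdefault((month, window), LogSketch()).add(value)


def combine(store, months, window):
    """Merge the per-month sketches for one window into a longer period."""
    total = LogSketch()
    for month in months:
        if (month, window) in store:
            total.merge(store[(month, window)])
    return total
```

With this layout, a "winter" view of window 100 would be `combine(store, [12, 1, 2], 100).quantile(0.95)`, and the stored monthly sketches are left unchanged by the merge. The size of each sketch depends on the range of values seen rather than on how many readings went into it, which is what makes keeping one per window and month look feasible.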

I’m not familiar enough with the Home Assistant structure to know if I’ve missed something that exists already, or if I’m suggesting something that would be too difficult to implement directly.

I might well start to implement an integration or add-on for this to see if I can make it work conceptually, but thought I’d post here to see if I’ve missed something that already exists?