Maximum number of source sensor measurements stored.
The problem with this definition is that it does not mention the sampling frequency (every 10 seconds, 1 minute, 1 hour, etc.). So what time period does the stored data represent if you have a sampling_size of 50, for example?
Where do I find the frequency of the sensor? The recorder is the only place I can find a frequency (commit_interval), and I don’t think that’s it.
Look at the history of the sensor you are feeding to the statistics sensor. Find the smallest time gap between two consecutive updates.
Use that in your sample size calculations. Add a small fudge factor. Then keep an eye on the statistics sensor attributes. There’s an attribute that monitors the sample buffer (buffer_usage_ratio). You want this to be close to but not at 1.
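As a rough sketch of that calculation in Python: the function name, the 10% fudge factor and the example timestamps are all made up for illustration and are not part of Home Assistant. It takes the update times you read off the history graph, finds the smallest gap, and works out how many samples would be needed to cover a given time window at that rate.

```python
import math
from datetime import datetime, timedelta


def estimate_sampling_size(timestamps, window, fudge=1.1):
    """Estimate a sampling_size covering `window` at the fastest observed update rate."""
    ordered = sorted(timestamps)
    # Gaps between consecutive updates; zero-length gaps are ignored.
    gaps = [b - a for a, b in zip(ordered, ordered[1:]) if b > a]
    if not gaps:
        raise ValueError("need at least two distinct timestamps")
    smallest = min(gaps)
    # Samples needed if the sensor updated at its fastest rate for the whole
    # window, padded by the fudge factor. round() guards against float noise.
    return math.ceil(round(window / smallest * fudge, 6))


# Hypothetical update times copied from a sensor's history.
start = datetime(2024, 1, 1, 12, 0, 0)
history = [start + timedelta(seconds=s) for s in (0, 10, 25, 35, 60)]
print(estimate_sampling_size(history, timedelta(minutes=30)))  # 198
```

Here the smallest gap is 10 seconds, so 30 minutes needs 180 samples, or 198 with the 10% margin. If the sensor mostly updates more slowly than that, the buffer will simply never fill up.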
It’s not an exact science. Sensors update either at a constant rate or in an event-driven way, depending on external influences. In addition, Home Assistant does not propagate new values to the statistics integration when they are equal to the previous value.
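To illustrate why equal values make the sample count hard to predict, here is a simplified Python model of that behaviour. It is only an assumption-based sketch (the function name and readings are invented), not Home Assistant's actual code: any reading equal to the one before it is dropped.

```python
def propagated_values(readings):
    """Return only the readings that differ from the one immediately before them."""
    kept = []
    for value in readings:
        if not kept or value != kept[-1]:
            kept.append(value)
    return kept


# Six readings from the device, but only three reach the statistics sensor.
readings = [21.0, 21.0, 21.0, 21.5, 21.5, 21.0]
print(len(readings), len(propagated_values(readings)))  # 6 3
```

So a sensor that reports every 10 seconds but rarely changes will fill the buffer far more slowly than its reporting rate suggests.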
The sample size of the source sensor is not always the most useful setting. How you configure your sensor depends on the use case. For most use cases you are interested in a given time period (e.g. the minimum value over the last 30 minutes), and sampling size shouldn’t concern you too much.
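The difference between limiting by count and limiting by age can be sketched with a toy buffer in Python. The class and its methods are hypothetical and only borrow the sampling_size and max_age names; this is not how the statistics integration is implemented.

```python
from collections import deque
from datetime import datetime, timedelta


class SampleBuffer:
    """Toy buffer limited by sample count and/or sample age (illustration only)."""

    def __init__(self, sampling_size=None, max_age=None):
        self.max_age = max_age
        # deque(maxlen=...) drops the oldest sample once the buffer is full.
        self.samples = deque(maxlen=sampling_size)

    def add(self, when, value):
        self.samples.append((when, value))
        self._purge(when)

    def _purge(self, now):
        # Remove samples older than max_age, if an age limit is set.
        if self.max_age is None:
            return
        while self.samples and now - self.samples[0][0] > self.max_age:
            self.samples.popleft()

    def minimum(self, now):
        self._purge(now)
        return min((value for _, value in self.samples), default=None)


start = datetime(2024, 1, 1, 12, 0)
buf = SampleBuffer(max_age=timedelta(minutes=30))
buf.add(start, 18.0)
buf.add(start + timedelta(minutes=20), 21.0)
buf.add(start + timedelta(minutes=40), 22.0)
# The 18.0 reading is now 40 minutes old, so it no longer counts.
print(buf.minimum(start + timedelta(minutes=40)))  # 21.0
```

With only a count limit, the 18.0 reading would stay in the buffer until enough newer samples pushed it out, however long that took.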
Discussions are going on in other threads about the relevance of sampling_size vs max_age and you will see improvements to the statistics component soon.