Integration tracking is done on each installed HA instance, and the totals are shown on each integration page. For example: Piper - Home Assistant shows that 8.5% of users have this enabled.
This kind of tracking is wonderful for loads of reasons: deciding where to invest, improving areas, finding weaknesses, etc.
Can we also count the voices used in Piper (e.g. nl_BE_XXX or nl_NL_YYY)?
Most of the time the selected voice and the country one lives in would match, but within each language there could be more variations or dialects (or, on the other hand, a lack of them).
As he mentioned nl_be: Belgium has 3 official languages, Dutch, French and German. A lot of countries have more than one official language, such as Switzerland, Canada, Italy, etc.
I know countries can be multilingual, but you can infer a lot without being too far off. You want to know about French? Look up the percentage of the population speaking it in each country where it is spoken, and estimate based on HA usage in those countries.
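To make that estimate concrete, here is a rough sketch in Python. The install counts and speaker shares below are made-up placeholders, not real HA analytics or census figures, and the function name is just something I picked for illustration. The idea is simply: per country, multiply the number of installs by the share of the population speaking the language, then add it all up.

```python
def estimate_language_installs(installs_by_country, speaker_share_by_country):
    """Estimate installs for one language as sum(installs * speaker share).

    Countries without a known speaker share contribute nothing.
    """
    return sum(
        installs * speaker_share_by_country.get(country, 0.0)
        for country, installs in installs_by_country.items()
    )


# Placeholder numbers for illustration only (NOT real data).
installs = {"FR": 1000, "BE": 500, "CH": 400}
french_share = {"FR": 0.9, "BE": 0.4, "CH": 0.2}

estimate = estimate_language_installs(installs, french_share)
print(f"Estimated French-speaking installs: {estimate:.0f}")
```

This assumes HA users in a country speak each language in the same proportion as the general population, which will not be exactly true, but it should be close enough to get some idea.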
How actionable will the data be to HA devs? The more detailed the data HA analytics collects, the more I want to opt out. One NL_nl less. And I think I might not be alone in this, as HA users tend to be privacy-minded. The percentage of people opting in to analytics is not high, so expecting precise results is not an option anyway. There’s enough data to get some idea.
And if you have the numbers, then what? You will see higher usage statistics on Piper languages working well, because languages not working well don’t get used. So where would the effort go? To languages not being used? Or will it be like Google News, gravitating into a funnel? And how would one get more/better training data based on the results?
So here’s a tip: contribute to the training data. Did you know you can have Piper use your own voice?