For the record, sending data to a third party is not the only concern.
Many users would like to opt out to prevent the hourly nag and the additional overhead and logging caused by all this checking. This is not a function which everyone needs, and certainly not a function which anyone needs hourly.
It’s not actually the only data being sent out by Supervisor, but it might be the only data currently sent to a third party; I’m not sure. Supervisor does communicate with home-assistant.io, though, in order to obtain version information and (if you allow it) track some usage information by way of updater.
I ask because, if you don’t have an issue with that, would an acceptable compromise be for HA to check these passwords indirectly and anonymously? Instead of having Supervisor call the Have I Been Pwned service directly, what if Supervisor passed this information to home-assistant.io, where it was scrubbed of all identifying info before being checked against Have I Been Pwned? Would that be an acceptable compromise of privacy and security, or is a full opt-out required?
The difference is quite clear. In order for homeassistant to physically operate it must know the correct time, and therefore communicate with an NTP server. In order for homeassistant to be updatable, it needs to check what the latest available version is.
This check is not in any way related to the operation of homeassistant, and therefore should be optional.
As @anon43302295 stated, the difference here is that implicit consent is given to allow HA to communicate with its own backend (in this case, home-assistant.io) by leaving default_config: and/or adding updater: to configuration.yaml.
With this “feature”, we are not being given the ability to approve or deny communication with a 3rd party outside our own networks. To be honest, the release notes described it in literally a one-liner, with no details about what is being sent, how often, or under what circumstances:
"* Passwords and secrets in add-on configurations are checked against known breaches with https://haveibeenpwned.com/"
Believe me, I love the HA devs and I appreciate everything they do. But this is just shady and not at all what I expect from them.
Well, I don’t believe it’s shady. I disagree with the policy whole-heartedly but it’s likely well intentioned.
Personally I think a developer’s policy needs to be stated “officially”. Kind of like “all integrations need to support the UI, yaml optional” but in this case with respect to user information sent over the network.
As mentioned above, this is already done with information shared via the updater.
As for NTP, I personally run a local server that all machines on my network point to. So ideally it would be nice to have that configurable as well. But there’s a large difference between pinging for time and sharing items from my configuration over the 'net.
It’s extremely troublesome to learn that my passwords have been harvested and transmitted without my knowledge or consent.
I can’t understand why something like this would be forced upon users and not be an opt-in.
The idea “trust us, cause they’re hashed and there’s NO WAY we messed up and compromised your passwords that we’re harvesting and processing then transmitting through the internet without your knowledge or consent” is quite shocking from an organization I’ve grown to trust and send money to every month.
Yea sorry if I wasn’t clear. I agree, this is very different from those things. I was just curious if more people would be ok with it if the API call did not come directly from their HA instance but rather after being anonymized further via home-assistant.io.
It’s a moot point though; it’s pretty clear there are issues with the feature overall and an opt-out is strongly desired.
For example, the SHA-1 for the poor password “admin” is: d033e22ae348aeb5660fc2140aec35850c4da997
A k-anonymity request sends the first five characters of the hash, d033e, along with your IP to HIBP. There it can be logged where the response was returned to, at what time, and which hashes were returned.
It returns about 580 hashes. The hash d033e22ae348aeb5660fc2140aec35850c4da997 has 51040 hits. What would be interesting here is to collect the IPs of everyone who has sent d033e, since the hosts behind them might be vulnerable. Some of them would not be, but some of them would.
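To make the mechanics above concrete, here is a minimal Python sketch of the client side of a range lookup. It assumes the public Pwned Passwords range endpoint (`https://api.pwnedpasswords.com/range/<prefix>`) and its `SUFFIX:COUNT` response lines. The function names are made up for illustration and this is not Supervisor's actual code. The sample response body is invented, apart from the matching line, which reuses the hit count quoted above. No network call is made.

```python
import hashlib

# Assumed endpoint of the public range API; only the 5-char prefix is sent.
RANGE_URL = "https://api.pwnedpasswords.com/range/{prefix}"


def split_hash(password: str) -> tuple[str, str]:
    """Return (prefix, suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def range_url(password: str) -> str:
    """URL a client would request; the server sees this plus the caller's IP."""
    prefix, _ = split_hash(password)
    return RANGE_URL.format(prefix=prefix)


def hits_in_response(password: str, body: str) -> int:
    """Look for our own suffix in the returned list, locally."""
    _, suffix = split_hash(password)
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


# Invented response body: one filler line plus the line for "admin".
sample_body = "\n".join([
    "0" * 35 + ":3",
    "22AE348AEB5660FC2140AEC35850C4DA997:51040",
])

print(range_url("admin"))                      # .../range/D033E
print(hits_in_response("admin", sample_body))  # 51040
```

The password and the full hash never leave the machine, but the prefix, the source IP and the request time are all visible to whoever runs or logs the endpoint, which is the concern raised here.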
Looking through the logs of HIBP, a recurring pattern of requests from the same IP would indicate that this is a service that runs on a schedule and not a user application. So with pattern recognition, you can start to single out which services are requesting what.
There are a couple of other angles to this, but they would take more time to go into.
I think, however, this boils down to what Home Assistant was about and what the community expected from the development team. It comes down more to the feeling of being let down on one of the key promises, the reason why many have chosen HA instead of other, more “cloud based” solutions.
This kind of crap keeps happening. They add whatever change from some pull request seems like a good idea at the time, and then nothing. Any objections or concerns, they say, have to be raised as a new issue alongside the thousands upon thousands of already open issues, or you hope for some public outcry like the one this is generating.
Otherwise you’re either digging into the code to fix it locally in your own installation (and having to maintain that at every release) or you’re just SOL.
Tbh it’s getting really frustrating how the devs are ignoring so much feedback from people who are unable to submit said feedback as a pull request.
I agree it was probably well-intentioned. And far be it from me to denigrate any of the wonderful developers who contribute so much of their time and effort to this project.
But I can’t help but wonder what sort of developer would think it’s a good idea to perform this kind of function hourly, and not give any option to throttle it or opt out of it.
Even disregarding the security concerns, the whole reason HA exists is to give the world something better than appliance-type solutions which allow only limited local control and force the vendor’s preferences on every user.
Every other part of HA allows the administrator to configure it to their own needs, or skip installing it in the first place. Where would the idea to do this one differently come from?
It’s worse than that. They’re sending the partial hash for all your secrets, in the same order every time. No randomization, no salting of the list. This is a textbook example of fingerprinting the calling device. And the third party could then start tracking when and how you change your passwords, whether you change them to vulnerable ones, and your IPs via the fingerprint (are you using VPNs from time to time, are you changing your ISP, etc).
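A rough Python sketch of the fingerprinting idea, from the observer's side. The secrets, the function names and the choice of SHA-256 to condense the sequence are all assumptions made for illustration. It only shows that an unchanged, ordered run of 5-character prefixes yields the same identifier regardless of which IP it arrives from.

```python
import hashlib
import random


def prefixes(secrets: list[str]) -> list[str]:
    """The 5-char SHA-1 prefixes a client would send, in list order."""
    return [hashlib.sha1(s.encode("utf-8")).hexdigest()[:5] for s in secrets]


def fingerprint(observed: list[str]) -> str:
    """Identifier an observer could derive from one ordered run of requests."""
    return hashlib.sha256("|".join(observed).encode("ascii")).hexdigest()[:16]


# Made-up secrets standing in for an add-on configuration.
secrets = ["admin", "hunter2", "correct horse battery staple"]

run_from_home = prefixes(secrets)  # e.g. seen from the home IP
run_from_vpn = prefixes(secrets)   # same device, seen from another IP

# Same secrets in the same order give the same fingerprint on every run.
print(fingerprint(run_from_home) == fingerprint(run_from_vpn))  # True

# Shuffling breaks an order-based fingerprint only by chance, and the
# set of prefixes is unchanged, so it can still be matched after sorting.
shuffled = run_from_home[:]
random.shuffle(shuffled)
print(sorted(shuffled) == sorted(run_from_home))  # True
```

Note that randomizing the order alone would not be enough under these assumptions, since the observer can simply sort the prefixes before comparing runs.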
Worst case, settle on a version and fork it. Of course that’s far from ideal if you have cloud based services, etc.
Add me to the “Yes I would like to opt out” list.
The Pwnd site is too much of a “one size fits all”. For example, if your password is “asdfghjkl”, that is very common (over 400K hits). If your password is “LexusES”, that is still Pwnd even though it is only on their list 3 times. Both generate a warning, but one is considerably better than the other (and I’m not sure it is LexusES; a longer password is almost always better than a short one).
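As a small illustration of the point, a check could grade the result by hit count instead of raising the same warning for every match. The function name, the labels and the threshold of 1000 below are invented for the example and do not come from HIBP or from Home Assistant.

```python
def severity(hits: int, common_threshold: int = 1000) -> str:
    """Grade a password by how often it appears in breach data.

    common_threshold is an arbitrary, hypothetical cut-off.
    """
    if hits <= 0:
        return "not found"
    if hits < common_threshold:
        return "found, rare"
    return "found, common"


print(severity(400_000))  # found, common  (the "asdfghjkl" case)
print(severity(3))        # found, rare    (the "LexusES" case)
print(severity(0))        # not found
```

A count-based grade still says nothing about length or guessability, so it would only soften the one-size-fits-all warning rather than replace a proper strength check.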
Ironically, “LexusES!” is OK and “passes”, but every hacker knows that putting an ! at the end of a password makes it no more secure (many argue that it is slightly less secure in some cases). Sites/systems that require a “special character” just cause 80% of users to put an ! at the end (and now the hacker knows one of the chars to guess).
About 5 years ago Microsoft’s head office did a corporate audit of their employees’ passwords… Over half were “Seahawks##”, where the ## was the number of the month, 01-12 (those are all Pwnd too, most with low double digits). They forced a special character, did another audit, and most of the passwords were “Seahawks##!”.
I guess it is better than nothing, but only slightly.