Why is HA Voice Preview going to an outside IP address in San Francisco when using Local Assistant?

I received my HA Voice Preview (yay!) and happily set it up. I connected it to a 2.4 GHz SSID that is firewalled so it cannot reach the internet. Setup wouldn’t work, so I temporarily unblocked it until setup was complete and the device had been updated.

However, even after setup is complete, it only answers spoken queries if it has Internet access. In particular, it seems to reach out to an IP address in San Francisco, California, every time I make a query. The following two firewall log entries each occurred when I asked it, “Hey Mycroft, what time is it?”:

Jan 9 15:16:38 UDM-Pro [LAN_IN-RET-20010] DESCR=“Allow Study HA Voice preview to reac” IN=br30 OUT=eth9 MAC=f4:92:bf:74:e5:29:20:f8:3b:09:2e:c3:08:00 SRC=192.168.30.96 DST=104.21.48.243 LEN=48 TOS=00 PREC=0x00 TTL=63 ID=1061 PROTO=TCP SPT=61741 DPT=443 SEQ=2686062293 ACK=0 WINDOW=65535 SYN URGP=0 MARK=1a0000

Jan 9 15:19:32 UDM-Pro [LAN_IN-RET-20010] DESCR=“Allow Study HA Voice preview to reac” IN=br30 OUT=eth9 MAC=f4:92:bf:74:e5:29:20:f8:3b:09:2e:c3:08:00 SRC=192.168.30.96 DST=104.21.48.243 LEN=48 TOS=00 PREC=0x00 TTL=63 ID=1224 PROTO=TCP SPT=61742 DPT=443 SEQ=1189115325 ACK=0 WINDOW=65535 SYN URGP=0 MARK=1a0000
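
For anyone who wants to triage entries like these in bulk, here is a minimal sketch in Python. It assumes the KEY=value layout seen in the two entries above; the sample line is trimmed from the first entry, and the helper names `parse_fields` and `leaves_lan` are made up for illustration.

```python
import ipaddress
import re

# Trimmed from the first log entry above (only some KEY=value fields kept).
LOG_LINE = (
    "IN=br30 OUT=eth9 SRC=192.168.30.96 DST=104.21.48.243 "
    "LEN=48 PROTO=TCP SPT=61741 DPT=443"
)


def parse_fields(line):
    """Collect the KEY=value tokens of a firewall log line into a dict."""
    return dict(re.findall(r"\b([A-Z]+)=(\S+)", line))


def leaves_lan(line):
    """Return True if the DST address is outside the private IP ranges."""
    fields = parse_fields(line)
    return not ipaddress.ip_address(fields["DST"]).is_private


fields = parse_fields(LOG_LINE)
print(fields["SRC"], "->", fields["DST"], "port", fields["DPT"])
print("leaves the LAN:", leaves_lan(LOG_LINE))
```

A destination in 192.168.0.0/16, such as the Home Assistant box, would come back as not leaving the LAN, which is what a purely local setup should show.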

When I block access to the Internet, it stops responding. What’s going on here? I thought the whole point of HA and HA Voice Preview was privacy and keeping things local.

Are you using Home Assistant Cloud for your Assistant speech-to-text & text-to-speech engines?

No, I am using Home Assistant + “Local Assistant” as the Assistant setting. The HA Preview device (192.168.30.96) should only be talking to my Home Assistant box (192.168.1.9), shouldn’t it?

I also note it is reaching out for DNS to 8.8.8.8 and 1.1.1.1 (both on port 53) instead of using the DHCP-vended 192.168.1.1 address. I opened the firewall for those two to allow that much to work, but it shouldn’t be hardcoding DNS IP addresses to use either.

Interesting. I agree with you. I’m surprised it would be communicating outside for every command you speak. It might seem reasonable to check periodically for updates & such, but every time?
That IP looks like it’s owned by, or at least hosted at, Cloudflare.

What if you just ask it to control a local light or switch? Do you still see the device looking outside? I’ll check to see what mine is doing over the weekend.

I’m assuming you have not added the ChatGPT conversation agent as the fallback for unknown commands.

8.8.8.8 is Google.

Both 8.8.8.8 and 1.1.1.1 are public DNS services. The Cloudflare one is typically known as Quad-1.

Could be just pinging them to see if the internet is available.

I have yet to run tcpdump to see what the Voice Preview devices I bought are doing, but I can tell you that they have local network access, do not have internet access, and still work fine. From my network observations of the device so far, it needs to be able to talk to Home Assistant’s internal address, which usually runs on port 8123, to obtain the voice responses stored in HA by your text-to-speech integration. In addition, because it is an ESPHome device, Home Assistant needs to be able to reach the Voice Preview and connect to a specific TCP port that the device is listening on. No further networking should be required for the device to work.
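
A rough way to confirm those two paths are open is a plain TCP connect test, sketched below in Python. The addresses are the ones mentioned earlier in this thread, and port 6053 is an assumption (it is the usual ESPHome native API default, but check your own device). Each check has to be run from a machine on the matching network segment, since firewall rules depend on the source.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example usage (addresses from this thread; 6053 is an assumed port):
#   From the voice device's VLAN, check Home Assistant:
#       can_connect("192.168.1.9", 8123)
#   From the Home Assistant host, check the voice device:
#       can_connect("192.168.30.96", 6053)
```

This only proves the firewall lets the TCP handshake through; it says nothing about what is spoken on top of it.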

Maybe you have some funky internal URL configuration for Home Assistant, which requires devices to connect over the internet through a tunnel?

I would advise increasing the packet capture size (-s1500) and dumping the connection packets the device is sending as plain text (-A), so that you can see the Server Name Indication and determine what host name it is trying to connect to.
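
If reading the ASCII dump by eye is awkward, the host name can also be pulled out of the captured bytes programmatically. The following is only a sketch in Python: it assumes you already have the raw TCP payload of the first TLS packet (the ClientHello) as bytes, that the whole ClientHello fits in that one segment, and that the name is not hidden by Encrypted Client Hello. The function name `extract_sni` is made up for illustration.

```python
import struct


def extract_sni(record):
    """Return the SNI host name from a raw TLS ClientHello record, or None."""
    # TLS record header: type(1) version(2) length(2); 0x16 means handshake.
    if len(record) < 5 or record[0] != 0x16:
        return None
    pos = 5
    # Handshake header: type(1) length(3); 0x01 means ClientHello.
    if len(record) < pos + 4 or record[pos] != 0x01:
        return None
    pos += 4
    pos += 2 + 32  # skip client_version and random
    try:
        pos += 1 + record[pos]  # session_id
        (suites_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + suites_len  # cipher_suites
        pos += 1 + record[pos]  # compression_methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = min(len(record), pos + ext_total)
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:  # server_name extension
                # Layout: list length(2), name type(1), name length(2), name.
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                return record[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None
    return None
```

If the name that comes back is your public host name rather than a local one, that points at the URL the device was told to use.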

No, it’s clearly not a ping. It’s sending an HTTPS connection attempt to port 443. That is not what a ping looks like.

You are right - it was an internal HA URL issue. I had pointed the internal URL to my public URL (which is different and is behind a reverse proxy). The correct internal URL also uses TLS and a reverse proxy, but a local AdGuard instance resolves ..com to the local Nginx Proxy Manager.

Once I fixed the internal URL on HA, updated the firewall rule for the HA Preview box to allow it to reach the AdGuard DNS server and the Nginx Proxy Manager reverse proxy, and changed the DNS server handed out by DHCP on the firewalled network to point to AdGuard, everything worked and HA Preview was strictly using local addresses.
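
For anyone wanting to sanity-check a similar setup, here is a small Python sketch that tests whether the host in an internal URL resolves only to private addresses. The URL `https://ha.example.com` is a placeholder, not my real host name, and `resolves_locally` is a made-up helper. It has to be run against the same DNS server the voice device uses, otherwise the answer can differ.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def resolves_locally(url, resolve=None):
    """Return True if every address the URL's host resolves to is private."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host name in {url!r}")
    if resolve is None:
        # Default: ask the system resolver for all addresses of the host.
        def resolve(name):
            infos = socket.getaddrinfo(name, None)
            return sorted({info[4][0] for info in infos})
    addresses = resolve(host)
    return bool(addresses) and all(
        ipaddress.ip_address(addr).is_private for addr in addresses
    )


# Example usage with a placeholder URL:
#   resolves_locally("https://ha.example.com")
```

The optional `resolve` argument exists so the check can be tried with a fixed answer instead of a live lookup, for example `resolve=lambda name: ["192.168.1.9"]`.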

I realize the above is a lot of gibberish to non-networking nerds, who can ignore it. But if somebody is using a reverse proxy they can probably understand it and hopefully this will help them. Thanks all for your pointers!

That is the 8.8.8.8 Google public DNS, but DNS is all it is, and it is often used because it has been pretty rock solid. I think they also run a backup at 8.8.4.4.
1.1.1.1 is the Cloudflare DNS.

Both are on port 53 and are global DNS services that are supposedly more reliable than whatever DNS your ISP hands out over DHCP.

DNS itself is not really an issue: without a domain name server, no URL will work, because its host name cannot be resolved to an IP address.
Google and Cloudflare run these global public DNS services as part of their internet redundancy and performance tools, and both have very good privacy because the services are not a core part of any account tracking.

You have to be seriously tin-foil-hat to start worrying about 8.8.8.8, a bedrock DNS service that has been around since 2009 and that I have had as an alternate DNS for years.

Cisco, I think, also has public DNS.
The Umbrella IPv4 addresses are:

  • 208.67.222.222
  • 208.67.220.220

The Umbrella IPv6 addresses are:

  • 2620:119:35::35
  • 2620:119:53::53

If you’re further perplexed by just those public port 53 queries, then you should be running unplugged from the internet, as they are just the first-base services, and if they are a problem then any connection is…

Without those rock-solid, privacy-aware public DNS servers, any error by your ISP or your own DHCP will not be resolved and will likely create a ton more support questions.

A person’s DNS lookups are great metadata about their interests. The US and Chinese governments have both had publicly documented programs where they capture metadata about people’s Internet usage. The NSA has previously compromised Google servers.

Putting privacy aside, I think the bigger reason one would not want the HA Preview box going to 8.8.8.8 (or indeed, any outside-the-home address) is that it should be an absolute requirement that it work when one’s Internet is down. This is one of the fundamental problems of Siri, Alexa, Google, etc. - if one’s internet goes down, all voice control is broken.

In any case, the use of 1.1.1.1 and 8.8.8.8 was due to a misconfigured DHCP option being served to the HA Preview box. Once I fixed that, the device happily used local DNS servers.

My ISP’s DNS is quite alright. I don’t need Big Tech DNS.

Really gotta ask, is this sarcasm?

Nope. It’s earnest sentiment.

(I worked for Big Tech in the past.)

What are your thoughts on DoH?

It’s fine if I control the DNS resolver. If not… might as well use ISP DNS. Or a local recursive resolver, which is what I do.