Improve Privacy, Stop using hardcoded DNS

MustangMatt · August 24, 2021, 3:51pm

Cloudflare is a perfectly reasonable DNS provider but forcing people to use it is not.

In the unlikely event that Cloudflare goes down, you’re screwed. (Probably not, because the cache would last until it came back, but still.) It does happen sometimes.

mvv · August 24, 2021, 6:32pm

The main issue for me is that you should never override the DHCP-supplied DNS by default. This breaks things and is a security issue. With Firefox wanting DoH enabled by default, this would break resolution in split-DNS environments. For instance, if a user connects a device running Firefox with default DoH enabled to a network I manage, they cannot access internal webmail, chat, etc.

Also, regarding Cloudflare: Americans are somewhat protected from spying at the endpoint; the rest of the world is not.

Tommmii · August 24, 2021, 7:14pm

just don’t try to mention this to a dev or project leader, you’ll get muzzled.

electrobento · August 24, 2021, 7:29pm

Hardcoding DNS is, at best, against best practices. I hope the devs see how silly this choice is.

mvv · August 24, 2021, 7:54pm

For some reason they consider not having it hardcoded to be a feature, so people who point out the error are treated as demanding a feature. I would think you just mark it as the bug it is and fix it when you can.

electrobento · August 24, 2021, 8:51pm

It’s not a bug since it’s intentional. Just a bad decision.

Knobee · August 25, 2021, 1:09am

I live on a sailboat. I use HA to control various things aboard the boat. Right now, I have full Internet access. When I go offshore, I’m not going to have access to anything outside of my little network.

Having a (breaking) reliance on external DNS is a travesty.

rimo · August 28, 2021, 7:34pm

Hi, I’m new to Home Assistant. I installed the VM to test it, and the first thing I noticed was how it was hammering my edge firewall non-stop trying to reach Cloudflare DoT servers.
I read the thread and all the issues posted, and it is sad to see the devs do not consider this a bug.
Hardcoding DNS is a no-go, and it is difficult to imagine a reason for it other than simplifying development. If that was the case, they should not close the issues but leave them open for future improvements.

My current solution is to block traffic to 1.1.1.1 and 1.0.0.1 at the host using iptables. It is a hack and does not survive a reboot, but so far it works. I guess I could make it persistent, but I want to test it for a few days first.

I took a quick look at the DNS Docker container and could not find the obvious source of the requests. If someone knows, I’d like a pointer, because I would prefer to point this at my own servers rather than blocking them.

If someone wants to do the same, it is very simple. Use the “SSH & Web Terminal” addon to SSH to the host and then execute the following:

iptables -I FORWARD -d 1.1.1.1 -j REJECT
iptables -I FORWARD -d 1.0.0.1 -j REJECT

Be careful, because if you mess with iptables the Docker communications might fail.

Tommmii · August 28, 2021, 7:55pm

have a look at /usr/share/tempio/corefile inside of the hassio_dns container, I think you’ll see the reason why CloudFlare DNS is getting hammered.

rimo · August 28, 2021, 7:56pm

too fast, I found the configuration file in /etc/corefile

bash-5.1# vi /etc/corefile 

.:53 {
    log {
        class error
    }
    errors
    loop

    hosts /config/hosts {
        fallthrough
    }
    template ANY AAAA local.hass.io hassio {
        rcode NOERROR
    }
    mdns
    forward . dns://192.168.1.3 dns://192.168.1.3 dns://127.0.0.1:5553 {
        except local.hass.io
        policy sequential
        health_check 1m
    }
    fallback REFUSED,SERVFAIL,NXDOMAIN . dns://127.0.0.1:5553
    cache 600
}

.:5553 {
    log {
        class error
    }
    errors

    forward . tls://1.1.1.1 tls://1.0.0.1  {
        tls_servername cloudflare-dns.com
        except local.hass.io
        health_check 5m
    }
    cache 600
}

As you can see, all requests are sent to my local server (192.168.1.3), but also to a local instance on port 5553. This local instance sends every request it receives, plus any that the local DNS server rejects, to Cloudflare using DNS over TLS.

Basically it is an anti-“pi-hole” feature. It does not matter if you filter your local DNS, because your filters are always bypassed via Cloudflare unless you block port 853 at your edge firewall.

I confirmed that if I change 1.1.1.1 to another IP, the requests go to that IP instead.
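
For anyone who wants to see at a glance where a config like this sends queries, here is a rough Python sketch that lists the upstreams named on the forward and fallback lines of each server block. It is only an illustration: the COREFILE string is a trimmed copy of the config above, the upstreams helper is a made-up name, and the line-based parsing is a simplification that assumes the layout shown here, not a real Corefile parser.

```python
import re

# Trimmed copy of the config posted above (only the lines that name upstreams).
COREFILE = """\
.:53 {
    forward . dns://192.168.1.3 dns://192.168.1.3 dns://127.0.0.1:5553 {
        policy sequential
    }
    fallback REFUSED,SERVFAIL,NXDOMAIN . dns://127.0.0.1:5553
}

.:5553 {
    forward . tls://1.1.1.1 tls://1.0.0.1  {
        tls_servername cloudflare-dns.com
    }
}
"""


def upstreams(corefile):
    """Map each server block (e.g. '.:53') to the upstreams it names.

    Hypothetical helper: assumes block headers start in column 0 and end
    with '{', as in the config above.
    """
    result = {}
    current = None
    for raw in corefile.splitlines():
        line = raw.strip()
        if raw and not raw[0].isspace() and line.endswith("{"):
            # Unindented line ending in '{' opens a new server block.
            current = line[:-1].strip()
            result[current] = []
        elif current and line.split(" ", 1)[0] in ("forward", "fallback"):
            # Collect every dns:// or tls:// target on the directive line.
            result[current] += re.findall(r"\b(?:dns|tls)://\S+", line)
    return result


for block, targets in upstreams(COREFILE).items():
    print(block, "->", ", ".join(targets))
```

The point it makes is the same as above: the block on port 53 names 127.0.0.1:5553 both in its forward list and as its fallback, and the block on port 5553 only ever talks to the two Cloudflare addresses.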

rimo · August 28, 2021, 8:00pm

Thanks,
I need to find a way to make the configuration changes persistent

Tommmii · August 28, 2021, 8:02pm

if you’re not running hassos, you could mount your own corefile into that path.

rimo · August 28, 2021, 8:29pm

I do.
Also I need a docker crash course.

Tommmii · August 29, 2021, 8:36am

If you’re on HassOS, then don’t bother. As lowly users, we have no write access to the Docker config.

HeyImAlex · August 29, 2021, 10:42am

To be perfectly honest, if you want to modify the Docker config for the DNS to point to your own, heck if you even know what Docker is, then you’re not really part of the target demographic for Home Assistant OS. HAOS is supposed to be a plug’n’play appliance mostly for non-technical users or people who just don’t care about the specifics. When I buy a new dishwasher, I don’t really care about having write access to the firmware. I just want it to, you know, clean my dishes.

If you want to change these more advanced settings, there are other HA installation methods much better suited for advanced users. With HA core, I have full access to everything, I configure everything the way I want and I modify the HA source if I feel I have to.

(And for the record, I agree that hardcoding a DNS is indeed very bad practice and there should be an option to change it in the UI).

kaosmagix · August 29, 2021, 1:18pm

As far as I understand, you actually can change this in the interface, in Supervisor under the System tab in Host Settings (I can change my IP settings for IPv4/IPv6 to manual and change the DNS settings there). Or is HA then still using the other DNS servers? (PS: using HassOS 6.2 / supervisor-2021.08.0 / core-2021.9.0b3)

tescophil · August 29, 2021, 9:56pm

You can’t change it because the DNS ‘fallback’ is hard coded, and if you block the fallback the system goes into meltdown.

tom_l · August 30, 2021, 12:46am

But if you have a correctly configured DNS, it won’t have a reason to use the fallback.

Tommmii · August 30, 2021, 8:22am

This is wrong. I have gone through detailed troubleshooting of this with two devs. Both agreed my DNS is configured as it should be. One decided it must therefore be an issue with CoreDNS; the other told me to live with it.

CoreDNS will, for no detectable reason, start using the hardcoded fallback. (This is bad, but workable.)
The real gotcha is that it will never revert to the original configuration, but stays stuck on the fallback.

This breaks local DNS, leaks local hostnames to CloudFlare, and is only fixable by restarting the DNS container.

All because 1.1.1.1 is hardcoded as a fallback? Yet it is the user’s fault?

But don’t take my word for it… anyone who is having these issues: remove the fallback from the container and watch rock-solid DNS for days on end, until you need to restart HA, which reverts your edits, and DNS breaks again within 24 hours.

How about: I do not want to maintain the OS, so I’ll opt for the simplest and most recommended way to use Home Assistant. Who could’ve guessed something as basic as DNS is broken in this appliance?

tescophil · August 30, 2021, 9:45am

The problem is the following config in CoreDNS

.:53 {
    log {
        class error
    }
    errors
    loop

    hosts /config/hosts {
        fallthrough
    }
    template ANY AAAA local.hass.io hassio {
        rcode NOERROR
    }
    mdns
    forward . dns://192.168.1.3 dns://192.168.1.3 dns://127.0.0.1:5553 {
        except local.hass.io
        policy sequential
        health_check 1m
    }
    fallback REFUSED,SERVFAIL,NXDOMAIN . dns://127.0.0.1:5553
    cache 600
}

.:5553 {
    log {
        class error
    }
    errors

    forward . tls://1.1.1.1 tls://1.0.0.1  {
        tls_servername cloudflare-dns.com
        except local.hass.io
        health_check 5m
    }
    cache 600
}

Here, the ‘fallback’ is included in the primary list of DNS servers to query:

forward . dns://192.168.1.3 dns://192.168.1.3 dns://127.0.0.1:5553

So this is one problem: this list should include only the DNS servers configured by the user. The local service defined at dns://127.0.0.1:5553 is hard coded to use Cloudflare DNS-over-TLS.

The next issue is this:

fallback REFUSED,SERVFAIL,NXDOMAIN . dns://127.0.0.1:5553

So, for starters, REFUSED and NXDOMAIN are not errors. REFUSED means that for whatever reason, usually policy restrictions on the network, the DNS request is rejected. NXDOMAIN means the queried domain does not exist. This may mean it doesn’t actually exist, or that it is being specifically blocked by a service like PiHole or AdGuard. Queries that return these response codes should not be forwarded to a fallback.

In the case of SERVFAIL a (thin) argument could be made to pass these queries to a fallback, however, hard coding the fallback service goes against best practice.
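
To make the difference concrete, here is a small Python sketch comparing the response codes the shipped config hands to the fallback with a narrower SERVFAIL-only policy. The SHIPPED set is copied from the fallback line above; SERVFAIL_ONLY and the uses_fallback helper are hypothetical names for illustration, and this models only the decision, not how CoreDNS implements it.

```python
# Response codes the shipped Corefile hands to the fallback (from the config above).
SHIPPED = {"REFUSED", "SERVFAIL", "NXDOMAIN"}

# Hypothetical narrower policy: only a genuine upstream failure triggers the fallback.
SERVFAIL_ONLY = {"SERVFAIL"}


def uses_fallback(rcode, policy):
    """Return True if a reply with this rcode would be retried via the fallback."""
    return rcode in policy


for rcode in ("NOERROR", "NXDOMAIN", "REFUSED", "SERVFAIL"):
    print(rcode, uses_fallback(rcode, SHIPPED), uses_fallback(rcode, SERVFAIL_ONLY))
```

Assuming a local blocker answers NXDOMAIN for a filtered name, the shipped policy retries that name via Cloudflare, while the narrower policy would accept the local answer.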

The thin argument offered by the developers is that “All users are idiots”, and if a user incorrectly configures the HA DNS settings then the system will cease to function. If the hard coded fallback is removed, they believe they will get more ‘problems’ i.e. users complaining that the system isn’t working when they haven’t configured the DNS correctly.

Hard-coding DNS servers is more common than you would think, and is done by small and large organisations alike. E.g. all Google devices (Google Home Hub, Google Speakers, Nest Hubs, Chromecast, Android Phones, Nest Thermostats, etc.) are hard coded to use Google’s public DNS servers. The difference is that when, for example, you block access to Google public DNS on your local network, these devices carry on working without issues using the locally configured DNS. HA does not.

If you block access to Cloudflare DNS-over-TLS (say by blocking outgoing traffic on port 853), then instead of simply accepting that the request was blocked, the CoreDNS module sends the request again, and again, and again, until it gets to the point where it’s sending so many requests that all other HA services cease to function. (My HA instance was sending 1.3 million DNS requests to Cloudflare every 24h; that’s 15/second.)
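
As a quick sanity check on that rate, the arithmetic in Python (using only the 1.3 million per 24h figure quoted above):

```python
requests_per_day = 1_300_000      # figure quoted in the post
seconds_per_day = 24 * 60 * 60    # 86,400 seconds

rate = requests_per_day / seconds_per_day
print(f"{rate:.1f} requests/second")  # prints "15.0 requests/second"
```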

So, in summary, this behaviour exists because the developers believe more problems/support requests will be generated if they remove it, and they are probably right. But those of us who are ‘advanced’ users believe there should be an ‘advanced’ option to prevent this behaviour that ‘normal’ users would be very unlikely to access.