Local DNS!

I’m using HA OS and updated to 2022.5.0 this morning as well as Supervisor 2022.05.0. I wasn’t having any issues using local hostnames in the configuration prior to today. I tried going to my backup of 2022.04.07, but still having issues, so I don’t think it is specifically 2022.5.0 that is the issue.

I have an Adguardhome LXC running on the network with DNS Rewrites for all the local hostnames and like I said that has been working fine until today.

I have gone through all the sites I can find for a solution. I even added the DNSMasq add-on to see if HA would work better with that. I checked ha dns info: both locals and servers include the 192. address for the AdGuardHome server. I also set fallback to false. I still could not get configs to work with hostnames, so I installed tcpdump on HA OS, filtered on udp port 53, and noticed a couple of things:

  1. If I run nslookup mosquitto.jnetinc.local, it fails, and tcpdump shows traffic between core-ssh.local.hass.io.39905 > hassio_dns.hassio.53
  2. If I run nslookup mosquitto.jnetinc.local 192.168.X.X using the AdGuardHome IP, I get the appropriate response and tcpdump shows the traffic between core-ssh.local.hass.io and 192.168.X.X
  3. If I go into the HA Dashboard and configure MQTT using mosquitto.jnetinc.local as the hostname, it fails to connect and tcpdump doesn’t show any traffic at all, so it’s like it’s not even asking

Am I missing something?

In adguard home did you add a DNS rewrite for mosquitto.jnetinc.local?

Yes, they have been there for a while and I was using the hostname in the MQTT configuration without an issue for several months now. Just stopped working today when I restarted Core as part of the 2022.5.0 update. All configs using local hostnames stopped working. It’s possible to use the IP addresses for some, but that’s messy. And there are a few that use certificates that fail as they are tied to the hostname.



Yes, so that actually won’t work anymore. People submitted CVE-2020-36517 saying that we were forwarding local domain names inappropriately, so we did an audit. One of the things we realized was that we were in fact handling .local poorly. This was changed in this PR:

We looked at what systemd-resolved did for handling LLMNR and MDNS names and noticed that it flatly refused to forward these types of queries to resolvers as they are reserved for LLMNR and MDNS, you can see that code here. So now we do this as well.

If you need to use a DNS rewrite for .local then something is not working properly in your network. .local is exclusively for mdns and DNS resolvers are not supposed to answer queries for those. Something either isn’t broadcasting mdns correctly or is handling mdns queries incorrectly.

Or alternatively you can switch to using a domain like .lan or one of the other reserved for local TLDs. But not .local since that is reserved for multicast only.

Using mosquitto.jnetinc.home.arpa and adding that to the DNS Rewrites fixed the issue. Never knew that about .local. Now to start the process of updating my self signed certs and configs…


Yea that will work. Sorry about that but it was handled as a security issue. Those warrant breaking changes sometimes. We do need to work on communication of some of these though.

Just as an aside, I have a neat trick for SSL within my LAN. This guide shows how to set up a completely private vaultwarden instance but with publicly trusted (not self-signed) certificates and can be used for almost anything (not really about vaultwarden). Basically you do this:

  1. Buy a domain
  2. Transfer that domain to cloudflare (others may work for this, not sure, cloudflare definitely does)
  3. Create a subdomain and assign it a local IP address
  4. Use certbot with the DNS challenge to get a Let’s Encrypt certificate for that subdomain
  5. Use that subdomain as your internal URL everywhere. SSL on your LAN without any warnings, uploading certificates or really any out of the ordinary certificate management.
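Step 4 can be sketched roughly as follows. This is only an illustration, assuming the certbot-dns-cloudflare plugin is installed; the domain ha.example.com and the credentials path are placeholders, not values from the guide.

```python
import shlex
from typing import List


def certbot_dns_command(domain: str, credentials_file: str) -> List[str]:
    """Build the argv for a DNS-01 certificate request via the Cloudflare plugin.

    Both arguments are placeholders: use your own subdomain and the path to
    an ini file holding your Cloudflare API token.
    """
    return [
        "certbot", "certonly",
        "--dns-cloudflare",
        "--dns-cloudflare-credentials", credentials_file,
        "-d", domain,
    ]


# Hypothetical values, for illustration only.
cmd = certbot_dns_command("ha.example.com", "/etc/letsencrypt/cloudflare.ini")
print(shlex.join(cmd))
```

Because the DNS challenge is answered through the DNS provider’s API, the subdomain itself never has to be reachable from the internet, which is why it can point at a private IP.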

I will need to look into this. I’ve just gotten started with certs for internal traffic, nothing external, so self-signed hasn’t been too bad, but I am considering opening some things to the external world and definitely want real certs for that.

Ah yea. Well when you are dealing with a publicly facing URL that’s usually pretty straightforward. There’s lots of guides for getting a certificate in those cases. Plus the Let’s Encrypt addon makes it really easy.

The tricky bit is if you want SSL for a private URL only accessible within your network. Self-signed certificates work for that, they’re just kind of a pain. You have to upload that certificate to every device that wants to connect. And some devices simply don’t have that option, so you either have to hope they ignore SSL warnings or you’re screwed.

Normally trusted public CA’s won’t give certificates for private domains but with that guide you can make it happen.

@CentralCommand Hey man, do you have any idea if this change will break sonofflan?

It uses mDNS to receive status updates from sonoff devices, as that’s the only way they respond to any commands.

I mean it shouldn’t? mdns doesn’t involve DNS resolvers. Devices that should respond on mdns names broadcast that on your network and others use that information to resolve queries. It has nothing to do with whatever DNS server you’re using, it’s a different mechanism entirely.

If you’re worried you can test it though. I use these commands for testing mdns queries on my network:

# Query an mdns name
alias dig-mdns='dig -p 5353 @224.0.0.251'

# Query an LLMNR name
alias dig-llmnr='dig +noedns -p 5355 @224.0.0.252'

# Browse and inspect broadcasted mdns services
alias mdns-browse='dns-sd -B _services._dns-sd._udp.'
alias mdns-query='dns-sd -B' # Type (ex. "_home-assistant")
alias mdns-inspect='dns-sd -L' # Name (ex. "Home") Type (ex. "_home-assistant")

Those ports and ip addresses are not unique to my network, they are standard. Same commands should work on yours.
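If dig isn’t available, roughly the same one-shot mdns lookup as the dig-mdns alias can be sketched in Python. This is a minimal illustration and not how Home Assistant does it; the function names are made up for the example and the reply is returned as raw bytes without parsing.

```python
import socket
import struct
from typing import Optional

MDNS_GROUP = "224.0.0.251"  # standard mDNS IPv4 multicast address
MDNS_PORT = 5353            # standard mDNS port


def build_query(name: str, qtype: int = 1) -> bytes:
    """Encode a single-question DNS query (qtype 1 = A record, class IN)."""
    # Header: ID 0, flags 0, one question, no answer/authority/additional records.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    # Each label is written as a length byte followed by its ASCII text.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!2H", qtype, 1)


def query_mdns(name: str, timeout: float = 2.0) -> Optional[bytes]:
    """Send the query to the mDNS group; return the first raw reply or None."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (MDNS_GROUP, MDNS_PORT))
        try:
            reply, _addr = sock.recvfrom(4096)
            return reply
        except socket.timeout:
            return None
```

Calling query_mdns("homeassistant.local") from a machine on the same network segment should return bytes if something answers for that name, and None if nothing does within the timeout.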

Cool cheers, I’ve literally just got a pi this week (haven’t set it up yet, using core right now) - will find out when I set it up :slight_smile:

Excuse my ignorance but I don’t get it. No internal name is resolvable anymore since that change. Before, everything was working nicely when I manually removed the fallback and my DNS servers were just handling it. How can I just have my DNS servers handle my DNS lookups? It can’t be that complicated, other apps/servers/systems can do it too.

Now my ha dns log is full of NXDOMAIN; however, nslookup and ping from any other machine work fine.

To be clear, there is no issue with giving local names to your systems. Just don’t use .local as that is reserved for mdns, any other tld will work fine. Feel free to read rfc 6762 if you want to learn more about why .local is special.

I can’t speak for how the other machines on your system work but I can assure you that the host HA is running on cannot resolve your .local names if you’ve simply added them to your DNS server and aren’t actually using mdns. You’re welcome to ssh into the host and try it yourself. The DNS plugin isn’t the one resolving .local names, it is asking systemd-resolved on the host to do that for it. That was one of the big changes. So when plugin DNS returns NXDOMAIN for a .local address, that’s because the host system is doing that.


Mike, sorry for my rude post.

I was frustrated that none of my domain.local addresses were working anymore.
My LAN is two decades old, so its name ends in .local, which was even the recommended choice at that time.
And most probably I am not alone with that.

But there is a solution in coexistence with mDNS when the DNS requests are handled first.

I dived into coredns, and as you are cleaning up supervisor anyway, I have a working solution for the configs and an idea to implement it.

I understand that HASS is not intended to be used in a corporate or more business-styled environment; however, as many cheap NAS providers also offer LDAP and DNS services, many of us have a proper DNS setup at home. I can understand that the fallback has a very good reason, and I can imagine there were some support requests based on misconfiguration. I am happy to take on that support for some time and to change the documentation accordingly.

Supervisor currently just asks for the hostname, there should be either the option for a FQDN, or another field to specify the domain.
If a domain is specified, this domain part is added to /usr/share/tempio/corefile in the coredns container as the first entry, along with its DNS servers:

mydomain.local:53 {
    log
    errors
    forward . 192.168.111.11 192.168.111.12 192.168.103.13
    cache 30
}

If no domain is specified, we stay at the usual solution.
mDNS people happy, no DNS people happy, DNS people happy :slight_smile:

Is that an option we can explore?
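To make the proposal concrete, here is a rough sketch of how such a stanza could be generated from a domain and a list of name servers. It is purely illustrative: render_forward_zone is a made-up helper, not part of supervisor or plugin-dns, and the stanza uses the errors plugin name as CoreDNS spells it.

```python
from typing import Sequence


def render_forward_zone(domain: str, servers: Sequence[str], cache_ttl: int = 30) -> str:
    """Render a CoreDNS server block that forwards one zone to fixed upstreams.

    Hypothetical helper for illustration; the real corefile is produced from
    a template inside the DNS plugin.
    """
    if not servers:
        raise ValueError("at least one upstream DNS server is required")
    lines = [
        f"{domain}:53 {{",
        "    log",
        "    errors",
        f"    forward . {' '.join(servers)}",
        f"    cache {cache_ttl}",
        "}",
    ]
    return "\n".join(lines) + "\n"


print(render_forward_zone(
    "mydomain.local",
    ["192.168.111.11", "192.168.111.12", "192.168.103.13"],
))
```

When no domain is configured, nothing would be rendered and the corefile would stay as it is today.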

So just to clarify, the fallback isn’t involved in mdns queries. On a system with supervisor, here’s a rough overview of what happens:

  1. When you do ha dns info, the host system knows about the DNS server(s) listed in locals. The ones listed in servers are not known to it, plugin-dns/coredns handles those.
  2. When a query that ends in .local or a single-label name comes in to plugin-dns, it does not attempt to resolve it on its own. Instead it simply asks the host’s systemd-resolved for an answer over dbus. Keep in mind the host knows mdns, llmnr and the dns servers listed in locals. It can use any and all of this information in answering queries (though it may choose not to, we have no control over it)
  3. Plugin-dns returns whatever answer is provided by systemd-resolved in this case. It does not ask any other dns servers or answers regardless of the status code. Not the ones in servers, not fallback, nothing else is queried.
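The decision described in the steps above can be modeled roughly like this. It is only a sketch of the described behavior, not the plugin’s actual code; the function name and the two return labels are made up for illustration.

```python
def route_query(name: str) -> str:
    """Model which component answers a query, per the overview above.

    Names ending in .local and single-label names go to the host's
    systemd-resolved; everything else is handled by plugin-dns/coredns.
    The return labels are illustrative only.
    """
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    if len(labels) == 1 or (labels and labels[-1] == "local"):
        return "systemd-resolved"
    return "coredns-forward"


for example in ("mosquitto.jnetinc.local", "homeassistant", "mosquitto.jnetinc.home.arpa"):
    print(example, "->", route_query(example))
```

In this model only the last label matters, so a rewrite for mosquitto.jnetinc.local on an upstream DNS server is never consulted by plugin-dns itself, while mosquitto.jnetinc.home.arpa is forwarded as usual.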

What you’re suggesting is actually pretty complex. I mean you’re asking for all the options required to configure a domain with totally separate handling from everything else. And even if we did that we probably wouldn’t allow .local since according to the spec we shouldn’t. The authority for .local on a network is mdns. I think if you need this much control over the network then tbh you would probably be better served with a container install than one with supervisor.

One thing that I suppose we could do here is an option to simply disable mdns/llmnr. So it would look kind of like this:

  1. Add a new option for “disable multicast dns” to supervisor’s DNS API
  2. Supervisor writes this option into config file for dns plugin
  3. DNS plugin omits this line based on this new option. Then no special treatment for MDNS and LLMNR queries is given.
  4. Update the CLI to support this new option
  5. Update API doc for this new option

In addition there would also need to be a check to mark the system as unsupported for usage of this setting in supervisor. There are definitely a bunch of dependencies on mdns that may break with this setting such as:

  1. ESPHome devices broadcast their name using mdns by default
  2. Home Assistant broadcasts its name via mdns and llmnr
  3. MDNS discovery enabled by default in HA
  4. Local Google Assistant relies on mdns (not sure if this would break or not)

This is a shortlist off the top of my head. Basically either the setting would simply mark a system as unsupported or an author would need to research all the combinations of things with that setting which would break and mark those combinations as unsupported. Since we don’t want issues caused by folks disabling mdns while also enabling features that rely on mdns.

If you or someone wants to PR this then I think that could work. I can add it to my list but tbh there’s a lot on there already unrelated to DNS so I’m not sure when I’ll get to it. That and I’m not 100% convinced we should be doing it. If you scan this thread and the others around local DNS issues you’ll see a common refrain is “Why is Home Assistant doing its own DNS and not just asking the host system?” Well .local is now a case where that’s exactly what we are doing. We’re not doing our own DNS, we’re simply asking the host system. Seems very counterintuitive to remove that.

Puh, now you create ideas :slight_smile:

From a development POV I would love to do the ultimate configuration option, but based on the complexity of its range of settings and my experience with half educated users this may create unnecessary headache and should be IMHO, if, a subbutton aka Advanced Settings.

But should we really be able to disable mDNS? From my technical point of view more and more “smart” and not so smart devices will support it, Apple pushed it not just into home networks but also into enterprise networks, and as it can co-exist I do not see an advantage when it is disabled, except maybe a somewhat less noisy network.
I know that the RFC has meanwhile reserved .local for mDNS, but Apple for example integrated it in a way that does not interfere with a mydomain.local that is additionally in place. The key is to respect the name server(s) for each zone, and one of them can be mydomain.local.

You may disable the possibility to set a XYZ.local domain but from my POV this is not needed, because in that solution they do not interfere and co-exist. And the interface does not prevent me from adding a wrong IP address, some certain idiocracy/liberty may be left on the user’s shoulders. In this case I can imagine that either a good quick start manual for settings with DHCP, own DNS and own domain can resolve many issues. I am up for that docu work as well.

Regarding your clarification. I understand that the FB is not involved in mDNS, and neither should be the local domain. I read through the possible configuration options of core DNS and found often in examples and sites about coreDNS the way of answering a specific zone with a forward to certain DNS servers, which I then adapted. Also, I addressed that in many configurations people do not run a TLS enabled server at home, but a simple bind or dnsmasq on their routers, NAS or small hypervisor, but I already thought about the button under the Domain Name: Enable DoH or DoT and Hostname

Yes, the authority for .local is mDNS, which does not exclude possible authority of mydomain.local to another NS, each subdomain can have its own name server(s).

Regarding possible solutions in case of (mis-)configurations. Improper DNS settings ruin pretty much everything, that’s clear. What could go wrong? Assuming we have the new “domain name” option under supervisor - system - network:

  • User enters .local as domain, system says .local is not allowed because of mDNS
  • User enters domain.local but his domain is called mydomain.local and therefore his DNS server refuses to answer for example esp-kitchen.domain.local, and all other internal requests will end in NXDOMAIN and the log will contain
    172.30.32.1:57044 - 24079 “A IN espkitchen.domain.local. udp 38 false 512” NXDOMAIN qr,aa,rd 52 0.011588231s
    and obviously many other NXDOMAIN…
  • user enters his correct mydomain.local and DNS queries to his domain are served as well as mDNS.
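The first case is the only one that can be caught at input time. A minimal sketch of that check follows; it is a hypothetical helper with made-up messages, not an existing supervisor function.

```python
def check_domain_option(domain: str) -> str:
    """Classify a user-supplied domain for the proposed "domain name" option.

    Only the bare .local TLD is refused here; a subdomain such as
    mydomain.local is accepted under this proposal.
    """
    labels = [label for label in domain.lower().strip(".").split(".") if label]
    if not labels:
        return "rejected: empty domain"
    if labels == ["local"]:
        return "rejected: .local is not allowed because of mDNS"
    return "accepted"


for candidate in (".local", "domain.local", "mydomain.local"):
    print(candidate, "->", check_domain_option(candidate))
```

Note that the second case (a typo such as domain.local instead of mydomain.local) passes this check. It can only be noticed afterwards, through the NXDOMAIN entries in the log.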

A handy reset network settings button could be helpful. Resetting to DHCP or a new manual address and just restoring the original config, applying the newly entered settings and rewriting the domain (if applicable) into corefile… hm, is there some way of loading files for zone info, but a quick google did not bring too much on split config or load additional configs - it would be nice to not touch the corefile at all but load the zone forwarder additionally if needed. I will try to find a way for that or do you know?

What do you think, leaving everything as it is, just adding DNS forward for the domain set to the NS set?

And yes, I would PR that, I just need help connecting the supervisor UI to the config files…
And I would maybe ask the users extensively if they want it that way. I am experienced in heterogeneous networks but I may miss a certain constellation that is very much in use, and then it could be taken care of.

Tbh I’m not sure I totally follow. But we are open to PRs here if you want to give it a go. I guess here’s the general ground rules for a PR in this area:

  1. We won’t add an option that’s basically just “inject custom corefile here”. We want specific options in the APIs for the features we support so we can ensure a finite support space.
  2. Start by adding new options to supervisor’s API here. Probably in dns.py. You should also reflect current values of these new options in the info API
  3. Generally all options for the plugin are captured in one or more properties here. You can see in there that is also where the config is written out that is passed to plugin-dns.
  4. In plugin DNS you can assume the option(s) are available and use that information to write the core file. Take a look at the template here to see how that works.
  5. New DNS API options must also be added to the CLI here. And documented here.
  6. You’ll need good test coverage for everything added to supervisor.
  7. Unfortunately we don’t have great options for automated testing of the plugins right now other than ensuring they start, so you’ll have to do more manual testing there. I find it’s easiest to just run it either by pretending it’s an addon or building the image and running it locally and then test it with dig. You can also build the image and then retag it so it displaces the current image for DNS on a dev system and gets fully exercised as a DNS plugin somewhere.

I would recommend starting with a PR for supervisor and use that to lay out your plan then wait for feedback before proceeding with the other parts.

We may also ask for a new unsupported check depending on the content of the PR and expected interactions with the rest of the ecosystem. I don’t think we would do that in this case but others may think of something I forgot.

Yes, it got pretty mixed up, a short declutter:

  • Adding the option to handle the personal local domain eg. mydomain.home with the provided DNS server(s) while leaving mDNS and the fallback function untouched.
  • Optionally adding the option to reset the network settings to resolve misconfigurations.

I went back to your post about the rough overview about systemd-resolved, checked resolvectl and tried a couple of local lookups, which all failed. Adding the search domain with resolvectl domain 3 mydomain.home resolved that issue, all mDNS and DNS requests are performed now properly. I used the enp0s3 interface.

Tbh I am now a bit lost with the architecture. I am pretty sure there are reasons for that kind of setup but I have rarely seen that kind of complexity for name resolution:

  • Why are there two more or less competing name lookup services? There is coreDNS, which is configured differently than systemd-resolved.
  • Why do I need to set a search domain to properly look up an FQDN of the same domain? Because even though resolved knows about the name servers, it does not look it up properly until the search domain for that domain is set.

I can’t imagine this is practical from a maintenance and support point of view.
I would now go the other way round and clean it up from the requirements side, instead of fixing the existing setup: what NS service is needed, which service can take care of it (coreDNS, resolved, dnsmasq or whatever), and keep that ONE service configured for all major setups and not two more or less competing ones:

  • MUST resolve names for the internal docker network
  • MUST resolve mDNS
  • MUST resolve DNS with all variants (DoT/DoH/DNS)
  • SHOULD respect local search domains

I can surely take care of the configuration and extensive testing with all major client platforms. I can also test most hypervisors and I have a NUC to run it baremetal, I just can’t test on RPI and I am not experienced in python.

What do you think?


So getting this output - not seeing the two errors you mentioned - is it safe to proceed?
image