Local DNS!

@CentralCommand Hey man, do you have any idea if this change will break sonofflan?

It uses mDNS to receive status updates from sonoff devices, as that’s the only way they respond to any commands.

I mean it shouldn’t? mdns doesn’t involve DNS resolvers. Devices that should respond on mdns names broadcast that on your network and others use that information to resolve queries. It has nothing to do with whatever DNS server you’re using, it’s a different mechanism entirely.

If you’re worried you can test it though. I use these commands for testing mdns queries on my network:

# Query an mdns name
alias dig-mdns='dig -p 5353 @224.0.0.251'

# Query an LLMNR name
alias dig-llmnr='dig +noedns -p 5355 @224.0.0.252'

# Browse and inspect broadcasted mdns services
alias mdns-browse='dns-sd -B _services._dns-sd._udp.'
alias mdns-query='dns-sd -B' # Type (ex. "_home-assistant")
alias mdns-inspect='dns-sd -L' # Name (ex. "Home") Type (ex. "_home-assistant")

Those ports and ip addresses are not unique to my network, they are standard. Same commands should work on yours.
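
For anyone curious what those aliases put on the wire, here is a minimal Python sketch (not taken from any Home Assistant code). It builds a one-question DNS query and sends it to the standard mDNS group and port from the first alias. The function names and the example hostname are assumptions made for illustration.

```python
import socket
import struct
from typing import Optional

MDNS_GROUP = "224.0.0.251"  # standard mDNS IPv4 multicast address
MDNS_PORT = 5353            # standard mDNS UDP port


def build_query(name: str, qtype: int = 1) -> bytes:
    """Build a minimal one-question DNS query (qtype 1 = A record)."""
    # Header: ID 0, flags 0, 1 question, 0 answer/authority/additional records.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    # Name: each label is length-prefixed, the name ends with a zero byte.
    labels = [label.encode("ascii") for label in name.rstrip(".").split(".")]
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    # Question trailer: record type, then class 1 (IN).
    return header + qname + struct.pack("!2H", qtype, 1)


def query_mdns(name: str, timeout: float = 2.0) -> Optional[bytes]:
    """Send the query to the mDNS group; return the first raw reply or None."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (MDNS_GROUP, MDNS_PORT))
        try:
            reply, _sender = sock.recvfrom(4096)
        except socket.timeout:
            return None
        return reply
```

Calling `query_mdns("homeassistant.local")` (hostname is just an example) should return raw reply bytes if some device on the network answers for that name, and `None` otherwise. No DNS server is contacted at any point, which is the point being made above.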

Cool cheers, I’ve literally just got a pi this week (haven’t set it up yet, using core right now) - will find out when I set it up :slight_smile:

Excuse my ignorance but I don’t get it. No internal name is resolvable anymore since that change. Before, everything was working nicely when I manually removed the fallback and my DNS servers were just handling it. How can I just have my DNS servers handle my DNS lookups? It can’t be that complicated, other apps/servers/systems can do it too.

Now my ha dns log is full of NXDOMAIN, however nslookup and ping from any other machine work fine.

To be clear, there is no issue with giving local names to your systems. Just don’t use .local as that is reserved for mdns, any other TLD will work fine. Feel free to read RFC 6762 if you want to learn more about why .local is special.

I can’t speak for how the other machines on your system work but I can assure you that the host HA is running on cannot resolve your .local names if you’ve simply added them to your DNS server and aren’t actually using mdns. You’re welcome to ssh into the host and try it yourself. The DNS plugin isn’t the one resolving .local names, it is asking systemd-resolved on the host to do that for it. That was one of the big changes. So when plugin DNS returns nxdomain for a .local address that’s because the host system is doing that.

Mike, sorry for my rude post.

I was frustrated that now none of my domain.local addresses were working anymore.
My LAN is 2 decades old, so the name ends in .local, which was even the recommendation at that time.
And most probably I am not alone with that.

But there is a solution that coexists with mDNS, provided the DNS requests are handled first.

I dived into coredns, and as you are cleaning up supervisor anyway, I have a working solution for the configs and an idea to implement it.

I understand that HASS is not intended to be used in a corporate or more business-styled environment. However, as many cheap NAS providers also offer LDAP and DNS services, many of us have a proper DNS setup at home. I can understand that the fallback has a very good reason, I can imagine there were some support requests based on misconfiguration. I am happy to take that support for some time and to change the documentation accordingly.

Supervisor currently just asks for the hostname, there should be either the option for an FQDN, or another field to specify the domain.
If a domain is specified, this domain part would be added to /usr/share/tempio/corefile in the coredns container as the first entry along with its DNS servers:

mydomain.local:53 {
    log
    errors
    forward . 192.168.111.11 192.168.111.12 192.168.103.13
    cache 30
}

If no domain is specified, we stay at the usual solution.
mDNS people happy, no DNS people happy, DNS people happy :slight_smile:
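
To make the proposal a bit more concrete, here is a rough Python sketch of how such a block could be generated from a domain and its name servers. The function name and parameters are hypothetical. It is not how Supervisor or plugin-dns write the corefile today. It uses `errors`, which is the name of the CoreDNS error-logging plugin.

```python
from typing import Sequence


def render_domain_block(domain: str, servers: Sequence[str], cache_ttl: int = 30) -> str:
    """Return a Corefile server block forwarding one zone, or "" if unset."""
    # No domain (or no servers) configured: emit nothing, so the existing
    # configuration stays exactly as it is.
    if not domain or not servers:
        return ""
    lines = [
        f"{domain}:53 {{",
        "    log",
        "    errors",
        f"    forward . {' '.join(servers)}",
        f"    cache {cache_ttl}",
        "}",
    ]
    return "\n".join(lines) + "\n"
```

With `render_domain_block("mydomain.local", ["192.168.111.11", "192.168.111.12"])` this produces a block of the shape shown above, and with an empty domain it returns an empty string, which matches the "if no domain is specified, we stay at the usual solution" idea.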

Is that an option we can explore?

So just to clarify, the fallback isn’t involved in mdns queries. On a system with supervisor, here’s a rough overview of what happens:

  1. When you do ha dns info, the host system knows about the DNS server(s) listed in locals. The ones listed in servers are not known to it, plugin-dns/coredns handles those.
  2. When a query that ends in .local or a single-label name comes in to plugin-dns, it does not attempt to resolve it on its own. Instead it simply asks the host’s systemd-resolved for an answer over dbus. Keep in mind the host knows mdns, llmnr and the dns servers listed in locals. It can use any and all of this information in answering queries (though it may choose not to, we have no control over it)
  3. Plugin-dns returns whatever answer is provided by systemd-resolved in this case. It does not ask any other DNS servers for answers, regardless of the status code. Not the ones in servers, not fallback, nothing else is queried.
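
The decision in step 2 can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not the actual plugin-dns code, and the function name is made up.

```python
def handled_by_host(name: str) -> bool:
    """True if a query for `name` would be handed to the host's systemd-resolved.

    Per the overview above: names ending in .local and single-label names go
    to the host. Everything else is handled by plugin-dns/coredns itself.
    """
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    if len(labels) <= 1:
        return True  # single-label name such as "printer"
    return labels[-1] == "local"
```

So `mydomain.local` names land on the host resolver just like true mdns names do, while something like `ha.home` never takes that path.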

What you’re suggesting is actually pretty complex. I mean you’re asking for all the options required to configure a domain with totally separate handling from everything else. And even if we did that we probably wouldn’t allow .local since according to the spec we shouldn’t. The authority for .local on a network is mdns. I think if you need this much control over the network then tbh you would probably be better served with a container install than one with supervisor.

One thing that I suppose we could do here is an option to simply disable mdns/llmnr. So it would look kind of like this:

  1. Add a new option for “disable multicast dns” to supervisor’s DNS API
  2. Supervisor writes this option into config file for dns plugin
  3. DNS plugin omits this line based on this new option. Then no special treatment for MDNS and LLMNR queries is given.
  4. Update the CLI to support this new option
  5. Update API doc for this new option

In addition there would also need to be a check to mark the system as unsupported for usage of this setting in supervisor. There are definitely a bunch of dependencies on mdns that may break with this setting such as:

  1. ESPHome devices broadcast their name using mdns by default
  2. Home Assistant broadcasts its name via mdns and llmnr
  3. MDNS discovery enabled by default in HA
  4. Local Google Assistant relies on mdns (not sure if this would break or not)

This is a shortlist off the top of my head. Basically either the setting would simply mark a system as unsupported or an author would need to research all the combinations of things with that setting which would break and mark those combinations as unsupported. Since we don’t want issues caused by folks disabling mdns while also enabling features that rely on mdns.

If you or someone wants to PR this then I think that could work. I can add it to my list but tbh there’s a lot on there already unrelated to DNS so I’m not sure when I’ll get to it. That and I’m not 100% convinced we should be doing it. If you scan this thread and the others around local DNS issues you’ll see a common refrain is “Why is Home Assistant doing its own DNS and not just asking the host system?” Well .local is now a case where that’s exactly what we are doing. We’re not doing our own DNS and are simply asking the host system. Seems very counterintuitive to remove that.

Puh, now you create ideas :slight_smile:

From a development POV I would love to do the ultimate configuration option, but based on the complexity of its range of settings and my experience with half educated users this may create unnecessary headache and should be IMHO, if, a subbutton aka Advanced Settings.

But should we really be able to disable mDNS? From my technical point of view more and more “smart” and not so smart devices will support it, Apple pushed it not just into home networks but also into enterprise networks, and as it can co-exist I do not see an advantage when it is disabled - except maybe a somewhat less noisy network.
I know that the RFC has meanwhile reserved .local for mDNS. Apple, for example, integrated it in a way that does not interfere with a mydomain.local that is additionally in place. The key is to respect the name server(s) for each zone, and one of them can be mydomain.local.

You may disable the possibility to set a XYZ.local domain but from my POV this is not needed, because in that solution they do not interfere and co-exist. And the interface does not prevent me from adding a wrong IP address, some certain idiocracy/liberty may be left on the user’s shoulders. In this case I can imagine that either a good quick start manual for settings with DHCP, own DNS and own domain can resolve many issues. I am up for that docu work as well.

Regarding your clarification. I understand that the FB is not involved in mDNS, and neither should be the local domain. I read through the possible configuration options of core DNS and found often in examples and sites about coreDNS the way of answering a specific zone with a forward to certain DNS servers, which I then adapted. Also, I addressed that in many configurations people do not run a TLS enabled server at home, but a simple bind or dnsmasq on their routers, NAS or small hypervisor, but I already thought about the button under the Domain Name: Enable DoH or DoT and Hostname

Yes, the authority for .local is mDNS, which does not exclude possible authority of mydomain.local to another NS, each subdomain can have its own name server(s).

Regarding possible solutions in case of (mis-)configurations. Improper DNS settings ruin pretty much everything, that’s clear. What could go wrong? Assuming we have the new “domain name” option under supervisor - system - network:

  • User enters .local as domain, system says .local is not allowed because of mDNS
  • User enters domain.local but his domain is called mydomain.local, and therefore his DNS server refuses to answer for, for example, espkitchen.domain.local. All other internal requests will end in NXDOMAIN as well and the log will contain
    172.30.32.1:57044 - 24079 "A IN espkitchen.domain.local. udp 38 false 512" NXDOMAIN qr,aa,rd 52 0.011588231s
    and obviously many other NXDOMAIN…
  • user enters his correct mydomain.local and DNS queries to his domain are served as well as mDNS.
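
The first case in the list could be caught by a small input check like the Python sketch below. The function and its error messages are hypothetical and not part of Supervisor. The second case (a well-formed but wrong domain) cannot be detected this way, it only shows up as NXDOMAIN in the log.

```python
import re

# One DNS label: letters, digits and hyphens, 1 to 63 characters,
# not starting or ending with a hyphen.
_LABEL = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def validate_domain(domain: str) -> None:
    """Raise ValueError if `domain` should not be accepted as the local domain."""
    labels = domain.lower().strip(".").split(".")
    if labels == ["local"]:
        # Bare .local belongs to mDNS. mydomain.local is still let through.
        raise ValueError(".local is not allowed because it is reserved for mDNS")
    if not all(_LABEL.fullmatch(label) for label in labels):
        raise ValueError(f"not a valid domain name: {domain!r}")
```

Here `validate_domain(".local")` raises, while `validate_domain("mydomain.local")` passes silently.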

A handy reset network settings button could be helpful: resetting to DHCP or a new manual address, restoring the original config, applying the newly entered settings and rewriting the domain (if applicable) into the corefile. Hm, is there some way of loading separate files for zone info? A quick google did not bring up much on split configs or loading additional configs. It would be nice to not touch the corefile at all but load the zone forwarder additionally if needed. I will try to find a way for that, or do you know?

What do you think, leaving everything as it is, just adding DNS forward for the domain set to the NS set?

And yes, I would PR that, I just need help connecting the supervisor UI to the config files…
And I would maybe ask the users extensively if they want it that way. I am experienced in heterogeneous networks but I may miss a certain constellation that is very much in use and could be taken care of.

Tbh I’m not sure I totally follow. But we are open to PRs here if you want to give it a go. I guess here’s the general ground rules for a PR in this area:

  1. We won’t add an option that’s basically just “inject custom corefile here”. We want specific options in the APIs for the features we support so we can ensure a finite support space.
  2. Start by adding new options to supervisor’s API here. Probably in dns.py. You should also reflect current values of these new options in the info API
  3. Generally all options for the plugin are captured in one or more properties here. You can see in there that is also where the config is written out that is passed to plugin-dns.
  4. In plugin DNS you can assume the option(s) are available and use that information to write the core file. Take a look at the template here to see how that works.
  5. New DNS API options must also be added to the CLI here. And documented here.
  6. You’ll need good test coverage for everything added to supervisor.
  7. Unfortunately we don’t have great options for automated testing of the plugins right now other than ensuring they start so you’ll have to do more manual testing there. I find it’s easiest to just run it either by pretending it’s an addon or building the image and running it locally and then test it with dig. You can also build the image and then retag it so it displaces the current image for DNS on a dev system and gets fully exercised as a DNS plugin somewhere.

I would recommend starting with a PR for supervisor and use that to lay out your plan then wait for feedback before proceeding with the other parts.

We may also ask for a new unsupported check depending on the content of the PR and expected interactions with the rest of the ecosystem. I don’t think we would do that in this case but others may think of something I forgot.

Yes, it got pretty mixed up, a short declutter:

  • Adding the option to handle the personal local domain eg. mydomain.home with the provided DNS server(s) while leaving mDNS and the fallback function untouched.
  • Optionally adding the option to reset the network settings to resolve misconfigurations.

I went back to your post about the rough overview about systemd-resolved, checked resolvectl and tried a couple of local lookups, which all failed. Adding the search domain with resolvectl domain 3 mydomain.home resolved that issue, all mDNS and DNS requests are performed now properly. I used the enp0s3 interface.

Tbh I am now a bit lost with the architecture. I am pretty sure there are reasons for that kind of setup but I have rarely seen that kind of complexity for name resolution:

  • Why are there two more or less competing name lookup services? There is coreDNS, which is configured differently than systemd-resolved.
  • Why do I need to set a search domain to properly look up an FQDN of the same domain? Because even though resolved knows about the name servers it does not look it up properly until the search domain for that domain is set.

I can’t imagine this is practical from a maintenance and support point of view.
I would now go the other way round and clean it up from the requirements side, instead of fixing the existing setup: What NS service is needed, which service can take care of it (coreDNS, resolved, dnsmasq or whatever) and keep that ONE service configured for all major setups and not two more or less competing ones:

  • MUST resolve names for the internal docker network
  • MUST resolve mDNS
  • MUST resolve DNS with all variants (DoT/DoH/DNS)
  • SHOULD respect local search domains

I can surely take care of the configuration and extensive testing with all major client platforms. I can also test most hypervisors and I have a NUC to run it baremetal, I just can’t test on RPI and I am not experienced in python.

What do you think?

So getting this output - not seeing the two errors you mentioned - is it safe to proceed?

ipv6 error

This is exactly the issue I mentioned. Your DNS server is not handling ipv6 correctly. If you disable the fallback then your system will be marked as unsupported because it’s likely many things will not work correctly, particularly updates.

In general you should never have anything in issues. If you do there is something wrong you need to look into. Really wish we had a UI here, someday…

I just added an entry for ha.home to my router and then popped into the host shell of an HAOS system and did this:
[Screenshot of the host shell session, 2022-05-16]

This didn’t work for you without additional changes? .home shouldn’t require anything extra to function, it’s only .local that’s got special treatment due to mdns.

If you had to change the settings of systemd-resolved to get mydomain.home to work there may be another issue in your setup. You should be able to add that as an entry to your local DNS server and have no issues resolving that from the host HA is running on and the containers that make up HA.

We need resolved because it’s the only one that handles multicast DNS that’s readily available on HAOS and stock Debian. DNS resolvers like coreDNS and dnsmasq don’t do that OOTB because those kinds of queries generally aren’t supposed to be directed at them. And we’re not implementing all the logic for that ourselves. That was tried in coreDNS already, it isn’t a good idea, multicast DNS is complicated and we don’t want to own that logic. Hence why the mdns plugin we made now simply asks resolved.

Theoretically we could use resolved for everything. The change that would need to be done is everything in network settings comes from dbus and gets handed over dbus to systemd-resolved. We already do that for some of the things, would need to do it for everything. It’s not impossible, it’s mostly just work.

Although I should note I actually discussed that idea with pvizelli before making the mdns changes and he mentioned that we used to do that a while ago and had to change it. He couldn’t remember all the reasons at the time and I didn’t press it. But this is part of why I suggested just opening an initial PR to supervisor laying out your plan to get the opinions of a few others. Although this is starting to sound like it might warrant an ADR first instead.

Thanks. I was able to update my firewall (sophos xg) from v18 to v19 and it appears they resolved the issue. Appreciate all the work that went into this!

Sure was fun going through my entire network changing .local to .somethingelse.

Glad this useful breaking change was added.

Was there anything wrong with local DNS first, then fallback to mdns (other than a slight delay for the failed resolution)?

You did?

In my case it is not impossible, but as Mike put it: lots of work, and many breaking points to forget:
certificates, connection strings, Active Directory, Exchange, DFS, scripts, ACLs…

Thank you for the insights!
That’s where I ended up too. Isn’t it time to verify the architecture of name resolution in general?

Regarding: "entry for ha.home"… I found another “bug” - well, actually I already pointed it out in a somewhat hidden way. To reproduce:

  • set a DHCP IP in the UI under /config/network
  • run resolvectl status, you will get Scopes, Protocols, Current and available DNS Servers AND the DNS Domain (if the search domain is part of the DHCP offer, which it is in my case)
  • a resolvectl query ha.home will work

  • now change to static IP in the UI
  • run resolvectl status, the DNS Domain is now missing
  • a resolvectl query ha.home will not work anymore
    The issue here is the missing search domain, and that explains also why I had to set it manually, because with a static IP it is not part of the DHCP offer and cannot be set through the UI.
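
To compare the two states without eyeballing the output, something like the following Python sketch could be used. The two sample texts are illustrative stand-ins that I wrote by hand for this post, not captured output, and the function name is made up. Real `resolvectl status` output may differ in detail between systemd versions.

```python
from typing import List


def search_domains(status_text: str) -> List[str]:
    """Collect the values of every "DNS Domain:" line in resolvectl status output."""
    domains: List[str] = []
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "DNS Domain":
            domains.extend(value.split())
    return domains


# Illustrative samples only (assumed layout, hand-written).
DHCP_SAMPLE = """\
Link 3 (enp0s3)
    Current Scopes: DNS LLMNR/IPv4
Current DNS Server: 192.168.111.11
       DNS Servers: 192.168.111.11 192.168.111.12
        DNS Domain: mydomain.home
"""

STATIC_SAMPLE = """\
Link 3 (enp0s3)
    Current Scopes: DNS LLMNR/IPv4
Current DNS Server: 192.168.111.11
       DNS Servers: 192.168.111.11 192.168.111.12
"""
```

On a real system the text would come from running `resolvectl status` on the host. An empty list after switching to a static IP would confirm that the search domain got lost.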

I will post a bug report for above and mention it in the ADR discussion.

Yeah, going through 23 servers and updating connection strings wasn’t fun. I set all my hosts to resolve to x.local and x.somethingelse during the transition so I could gradually find problems. I could then remove .local from each host one-by-one until they were all moved over.

Luckily, I don’t have anything in AD that’s named “.local”. I’d learned my lesson before setting up an AD, so picked something else by then!

I still don’t understand why this change needed to happen. HA had been working fine the way it was for years.

I learned the lesson a long time ago but the network has existed even a bit longer: AD on Windows Server 2000 with a clear recommendation: companyname.local
Even in Windows SBS 2011 the recommendation was still .local
Long story short, the amount of .local out there kept growing even after Apple wrote the RFC and begged IANA to register the name.

Well, it is like it is. For all who arrive here now and plan to invest nights in changing domain names, connection strings and certificates the proper solution is close and can partially already work now, see: Name lookup service in Home Assistant - switch to systemd-resolved · Discussion #768 · home-assistant/architecture · GitHub

hello,
is there any fix for that issue? I am running the latest haas and seeing the same issue …
ie
coredns
→ coredns -conf/corefile
is using 30-50% cpu constantly.

seems the fix is >

thanks