Local DNS!

Sure was fun going through my entire network changing .local to .somethingelse.

Glad this useful breaking change was added.

Was there anything wrong with local DNS first, then fallback to mdns (other than a slight delay for the failed resolution)?

You did?

In my case it is not impossible, but as Mike found, it is lots of work with many breaking points that are easy to forget:
certificates, connection strings, Active Directory, Exchange, DFS, scripts, ACLs…

Thank you for the insights!
That’s where I ended up recently. Isn’t it time to review the architecture of name resolution in general?

Regarding "entry for ha.home"… I found another “bug” (well, actually I already pointed it out, though a bit hidden). To reproduce:

  • set a DHCP IP in the UI under /config/network
  • run resolvectl status, you will get Scopes, Protocols, Current and available DNS Servers AND the DNS Domain (if the search domain is part of the DHCP offer, which it is in my case)
  • a resolvectl query ha.home will work

  • now change to static IP in the UI
  • run resolvectl status, the DNS Domain is now missing
  • a resolvectl query ha.home will not work anymore
    The issue here is the missing search domain. That also explains why I had to set it manually: with a static IP it is not delivered by a DHCP offer, and it cannot be set through the UI.
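The check in the steps above (is a "DNS Domain" line present or not?) can be sketched as a small parser. This is only an illustration: the two sample outputs are hypothetical, modelled on the layout of `resolvectl status`, and the interface name, addresses and the `home` domain are assumptions rather than captured data.

```python
import re
from typing import List


def dns_domains(resolvectl_output: str) -> List[str]:
    """Return every search domain found on 'DNS Domain:' lines."""
    domains: List[str] = []
    for line in resolvectl_output.splitlines():
        match = re.match(r"\s*DNS Domain:\s*(.+)", line)
        if match:
            # A single line may list several space-separated domains.
            domains.extend(match.group(1).split())
    return domains


# Hypothetical output while the interface is configured via DHCP.
WITH_DHCP = """\
Link 2 (eth0)
    Current Scopes: DNS
Current DNS Server: 192.168.1.1
       DNS Servers: 192.168.1.1
        DNS Domain: home
"""

# Hypothetical output after switching to a static IP: no DNS Domain line.
WITH_STATIC = """\
Link 2 (eth0)
    Current Scopes: DNS
Current DNS Server: 192.168.1.1
       DNS Servers: 192.168.1.1
"""

print(dns_domains(WITH_DHCP))
print(dns_domains(WITH_STATIC))
```

On a system that has `resolvectl`, the same function could be fed the real output, for example from `subprocess.run(["resolvectl", "status"], capture_output=True, text=True).stdout`.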

I will post a bug report for above and mention it in the ADR discussion.

Yeah, going through 23 servers and updating connection strings wasn’t fun. I set all my hosts to resolve to x.local and x.somethingelse during the transition so I could gradually find problems. I could then remove .local from each host one-by-one until they were all moved over.

Luckily, I don’t have anything in AD that’s named “.local”. I’d learned my lesson before setting up an AD, so picked something else by then!

I still don’t understand why this change needed to happen. HA had been working fine the way it was for years.

I learned that lesson a long time ago, but my network has existed even longer: AD on Windows Server 2000, with a clear recommendation to use companyname.local.
Even in Windows SBS 2011 the recommendation was still .local.
Long story short, the amount of .local out there kept growing even after Apple wrote the RFC and begged IANA to register the domain.

Well, it is what it is. For all who arrive here now and plan to invest nights in changing domain names, connection strings and certificates: the proper solution is close and can partially work already, see: Name lookup service in Home Assistant - switch to systemd-resolved · Discussion #768 · home-assistant/architecture · GitHub

hello,
is there any fix for that issue? I am running the latest haas and seeing the same issue …
ie
coredns
→ coredns -conf/corefile
is using 30-50% cpu constantly.

seems the fix is >

thanks

I could sure use some advice on what I am overlooking with HASS local DNS resolution. (HASS OS 2022.8.5)

My primary router’s DHCP is configured with a domain of “home.arpa” and static DNS host name mapping for several cameras - (lowerdeck, upperdeck, etc.) These hosts resolve locally but not within HASS.

Within HASS, I’ve configured IPV4 with static mapping to local gateway/DNS and there are no HASS resolution issues noted…

➜  ~ ha dns info   
fallback: true
host: 172.30.32.3
llmnr: true
locals:
- dns://192.168.0.1
mdns: true
servers:
- dns://192.168.0.1
update_available: false
version: 2022.04.1
version_latest: 2022.04.1
➜  ~ ha resolution info  
checks:
- enabled: true
  slug: supervisor_trust
- enabled: true
  slug: network_interface
- enabled: true
  slug: addon_pwned
- enabled: true
  slug: free_space
- enabled: true
  slug: dns_server_ipv6
- enabled: true
  slug: core_security
- enabled: true
  slug: dns_server
issues: []
suggestions: []
unhealthy: []
unsupported: []
HASS OS Terminal  
➜  ~ nslookup lowerdeck.home.arpa 
Server:         172.30.32.3
Address:        172.30.32.3#53

Non-authoritative answer:
** server can't find lowerdeck.home.arpa: NXDOMAIN

Note: I have IPv6 disabled in the HASS settings because my ISP has it disabled and I don’t see any way to enable it in the router. Also, changing the fallback to false did not help.

Thanks for any advice!

Please share the response of the following commands:

dig A lowerdeck.home.arpa
dig AAAA lowerdeck.home.arpa

Particularly the second one. If your DNS server is returning anything other than NOERROR as the status then you’re going to have an issue. I know you don’t use IPv6, but musl systems still care about this. If an NXDOMAIN response is received for a domain on one protocol, it is considered non-existent on all protocols. I explained why in more detail before.
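For anyone comparing the two dig runs by hand, the status can also be pulled out of the output programmatically. A minimal sketch, assuming the usual `->>HEADER<<-` line that dig prints; the two header lines below are illustrative samples in that format, with made-up query ids.

```python
import re
from typing import Optional


def dig_status(dig_output: str) -> Optional[str]:
    """Return the response code (e.g. NOERROR, NXDOMAIN) from dig's header line."""
    match = re.search(r"->>HEADER<<-.*?status:\s*([A-Z]+)", dig_output)
    return match.group(1) if match else None


# Illustrative header lines in dig's format (the ids are hypothetical).
A_HEADER = ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11111"
AAAA_HEADER = ";; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 22222"

for label, text in (("A", A_HEADER), ("AAAA", AAAA_HEADER)):
    status = dig_status(text)
    verdict = "ok" if status == "NOERROR" else "will cause trouble"
    print(f"{label}: {status} ({verdict})")
```

The full text printed by `dig A <name>` or `dig AAAA <name>` can be passed in unchanged, since the function only looks for the header line.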

Well, but anyway you must remove the fallback to make sure only local resolution is taking place in case your DNS does not answer quickly enough or at all.

Additionally, please set the search domain in your router for the DHCP discovery and set HA to DHCP instead of a fixed address, as there is still no other way to use your own search domain. I found out that in my case HA ignores the search domain if it is not pushed by DHCP (see: https://github.com/home-assistant/operating-system/issues/1916). That issue might also help you find the right commands to troubleshoot this further.

Anyway, IMHO the entire DNS, search domain and multi-DNS service situation in HA is, politely put, suboptimal.

Edit: Can you check whether your DNS is replying properly in general by using nslookup on a computer:

nslookup lowerdeck.home.arpa 192.168.0.1

Just because the name can be resolved from computers or phones does not necessarily mean your DNS is resolving it.

Thanks Mike and Alexander for all your troubleshooting tips!

Ok, so I made the following changes:

  • configured DNS option --fallback=false and restarted DNS
  • reconfigured the HASS IPv4 network to use DHCP (the router assigns a static IP/hostname)

From my local PC:

nslookup  lowerdeck.home.arpa 192.168.0.1
Server:  GTC-Router.home.arpa
Address:  192.168.0.1

Non-authoritative answer:
Name:    lowerdeck.home.arpa
Address:  192.168.0.17

From HASS OS terminal:

nslookup lowerdeck.home.arpa 192.168.0.1
Server:         192.168.0.1
Address:        192.168.0.1#53

Non-authoritative answer:
Name:   lowerdeck.home.arpa
Address: 192.168.0.17
** server can't find lowerdeck.home.arpa: NXDOMAIN

➜  ~ nslookup lowerdeck.home.arpa
Server:         172.30.32.3
Address:        172.30.32.3#53

** server can't find lowerdeck.home.arpa: NXDOMAIN
➜  ~ dig A lowerdeck.home.arpa   

; <<>> DiG 9.16.29 <<>> A lowerdeck.home.arpa
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60386
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 2
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: a304903ad30059fd (echoed)
;; QUESTION SECTION:
;lowerdeck.home.arpa.           IN      A

;; ADDITIONAL SECTION:
lowerdeck.home.arpa.    104     IN      A       192.168.0.17

;; Query time: 0 msec
;; SERVER: 172.30.32.3#53(172.30.32.3)
;; WHEN: Tue Sep 06 12:51:16 CDT 2022
;; MSG SIZE  rcvd: 95

➜ ~ dig AAAA lowerdeck.home.arpa

; <<>> DiG 9.16.29 <<>> AAAA lowerdeck.home.arpa
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 42856
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 5fba0bf400a25ab2 (echoed)
;; QUESTION SECTION:
;lowerdeck.home.arpa.           IN      AAAA

;; AUTHORITY SECTION:
home.arpa.              300     IN      SOA     prisoner.iana.org. hostmaster.root-servers.org. 2002040800 1800 900 604800 604800

;; Query time: 177 msec
;; SERVER: 172.30.32.3#53(172.30.32.3)
;; WHEN: Tue Sep 06 12:52:27 CDT 2022
;; MSG SIZE  rcvd: 149

I’m running HASS OS 2022.8.5, which I understand is Alpine based. I can see “NOERROR” status for the DiG A command, and status “NXDOMAIN” for the DiG AAAA command. Is the recursion not available warning a problem?

The other troubleshooting steps require running resolvectl, which apparently isn’t installed on HASSOS.

docker exec -t -i homeassistant /bin/bash
bash-5.1# resolvectl status
bash: resolvectl: command not found

I’m not clear on what this involves: "please set the search domain in your router for the DHCP discovery".

In your case the search domain is the Domain Name, so that’s set up properly.

Edit: Your PC resolves it correctly, and so do the nslookup and the dig A, though with the recursion warning. 172.30.32.3 is AFAIK CoreDNS, and that seems to be where it fails. It’s been a while since I checked the CoreDNS config. Mike, do you have in mind what in the CoreDNS config might make it fail? Otherwise I’ll check it quickly.

Edit: Does HA get the search domain? Unfortunately I only know how to find that out with resolvectl:

resolvectl flush-caches
resolvectl status 3

I am wondering now why you do not have resolvectl in HASS OS. It seems the HA CLI does not have resolvectl. Please try the same in the core-ssh (https://github.com/hassio-addons/addon-ssh) container.

This is your problem. It doesn’t matter whether or not you disable the fallback; it’s not going to work because of this.

Since your DNS server responds with NXDOMAIN for AAAA requests rather than NOERROR, all the Alpine-based containers (read: most of HA) are going to say NXDOMAIN for that domain. This is exactly what I was looking for and talking about in my post above.

Technically this means your DNS server is behaving incorrectly. If a name exists it is always supposed to respond with NOERROR. If there are no answers for that name on the particular type of query (AAAA in this case) then it should return NOERROR with no answers. But if it returns NXDOMAIN then all the alpine containers (core, supervisor, most addons, etc.) will treat that as the answer for all queries on that domain (A and AAAA) due to this commit.
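The effect described above can be modelled in a few lines. This is a simplified sketch of the behaviour as explained in this thread, not musl's actual code; `combined_lookup` and the tuple layout are hypothetical names chosen for illustration.

```python
from typing import List, Tuple

# A response is modelled as (rcode, list of addresses).
Response = Tuple[str, List[str]]


def combined_lookup(a: Response, aaaa: Response) -> List[str]:
    """Merge A and AAAA responses the way this thread describes Alpine/musl:
    NXDOMAIN on either query type means the name does not exist at all."""
    for rcode, _addresses in (a, aaaa):
        if rcode == "NXDOMAIN":
            return []
    return a[1] + aaaa[1]


# Spec-compliant server: the name exists but has no AAAA record,
# so the AAAA query returns NOERROR with an empty answer.
print(combined_lookup(("NOERROR", ["192.168.0.17"]), ("NOERROR", [])))

# Misbehaving server: NXDOMAIN for AAAA hides the perfectly good A record.
print(combined_lookup(("NOERROR", ["192.168.0.17"]), ("NXDOMAIN", [])))
```

The first call keeps the IPv4 address, the second returns nothing, which matches the symptom of a host that resolves from a PC but not from inside the containers.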

You’ll need to adjust your DNS server to handle this correctly according to spec or else you will have issues with it. If there are no options for this in your DNS server, then here are some other options to consider:

  1. Enable ipv6 on your network
  2. Install and use a different DNS server
  3. Don’t use a local-only internal domain. Use the same domain for internal and external access but tell your DNS server to resolve that domain to a local IP on your LAN. Also enable the fallback DNS so that when your DNS server returns NXDOMAIN for AAAA requests, the DNS plugin asks Cloudflare for a different answer and gets a proper NOERROR response.

This isn’t true, at least the quickly enough part. I fixed that a while ago. The only time there can be a race condition between DNS servers is if you have multiple DNS servers configured. The fallback is only tried after everything else fails.

If your DNS server doesn’t respond at all then yes the fallback will happen. But that could actually work in your favor, see my solution #3 above.

Good to know, thanks for that, sadly I fall under the last category:

# resolvectl status 3
Link 3 (enp0s3)
    Current Scopes: DNS LLMNR/IPv4 mDNS/IPv4
         Protocols: +DefaultRoute +LLMNR +mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 192.168.103.33
       DNS Servers: 192.168.103.33 192.168.103.32 192.168.103.31
        DNS Domain: <bla>.local

Oh sorry to be clear I meant there’s a race between your configured DNS servers. Even if you enable the fallback it will only be tried after all of those. But I couldn’t tell you which one of those 3 you’ll get an answer from, it should be whichever responds first.

EDIT: Actually I forgot, there are no races. It uses policy sequential:

I forgot we had to leave that so any DNS overrides added in HA by users via ha dns options --servers were used first. But either way the fallback is not in the forward list anymore, it only happens if all else fails.
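As a rough mental model of that ordering (configured servers tried one after another, the fallback only once all of them have failed), here is a small sketch. It is a simplification and not CoreDNS's implementation: each "server" is a hypothetical callable that returns an address or `None` when it gives no usable answer, and the names and addresses are made up.

```python
from typing import Callable, Dict, List, Optional

# A resolver returns an address, or None if it gave no usable answer.
Resolver = Callable[[str], Optional[str]]


def resolve_sequential(
    name: str, servers: List[Resolver], fallback: Optional[Resolver] = None
) -> Optional[str]:
    """Ask the servers strictly in list order; use the fallback only at the end."""
    for ask in servers:
        answer = ask(name)
        if answer is not None:
            return answer
    if fallback is not None:
        return fallback(name)
    return None


def fake_server(records: Dict[str, str]) -> Resolver:
    """Build a stand-in resolver backed by a plain dictionary."""
    return lambda name: records.get(name)


first = fake_server({})
second = fake_server({"lowerdeck.home.arpa": "192.168.0.17"})
public = fake_server({"example.org": "203.0.113.10"})

print(resolve_sequential("lowerdeck.home.arpa", [first, second], public))
print(resolve_sequential("example.org", [first, second], public))
print(resolve_sequential("example.org", [first, second]))
```

With the fallback disabled (third call) a name unknown to the configured servers simply stays unresolved.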

This rural ISP here has pretty awful tech support. I’ll try to see if they can enable IPv6 or even have firmware updates that may address this DNS issue. Given that they refused to enable bridge mode on their provided required router so that I can use my ASUS/Merlin routers to manage connections, I have little hope they will help.

Sounds like my best option is to install the DNSMASQ addon and point everything HASS to its own DNS.

Thanks again for all your help.

Following up… The local ISP updated the Calix Gigacenter router firmware. No change. The agent then switched my router to IPv6-only Internet connectivity, saying (incorrectly) that having both IPv4 and IPv6 (dual-stack) wasn’t possible, and that now that it had been switched to IPv6, they would lose remote connectivity for support.

It didn’t work. Internet access was dead. They had to send out a technician to come on site to diagnose it. He then configured it for dual-stack so IPv4 was back online, but IPv6 was still not working. So back to IPv4-only. Painful, but worth trying. Hello DNSMASQ.

Can this be automated when the container starts? It is really annoying to do this every time you restart HAOS.

I suppose so, yes; a copy command into the container could be scripted. But I changed my router to an OpenWrt one and since then the problem is gone for me.

This suggestion is quite old. The ability to disable the fallback DNS was added a while ago. Not sure why you’d still need to do this.

EDIT: Oh I see, this isn’t about fallback. Guessing your local DNS server is responding with NXDOMAIN for AAAA queries on your local domain? Is there a bug open for the DNS software you use? That goes against the DNS spec so it should be one.

I think I’m facing a similar issue. While trying to set up TTS notifications, I noticed that Home Assistant is unable to connect to its own hostname. I’m using Duck DNS to get a certificate and a hostname that points to my dynamic WAN IP. My Home Assistant is reachable from outside my LAN and from the inside, as I have a local DNS resolver (Pi-hole) that translates that Duck DNS hostname into the local LAN IP. However, Home Assistant itself seems unable to resolve its own FQDN using the internal DNS resolver. Home Assistant is configured to use Pi-hole as the DNS server.

From an SSH shell on my Home Assistant box (Pi4), I tried an explicit

nslookup <homeassistant-fqdn> <pi-hole-dns-server-ip> 

but there is no corresponding query in the Pi-hole logs, and Home Assistant’s Duck DNS FQDN gets resolved to its WAN IP.
On every other machine in my network, this works and resolves correctly to the internal LAN IP.
I assume that Home Assistant is still forwarding its queries to an external resolver, although it has been configured to use the internal one and the fallback option has been disabled.

I just got it to work by using the “SSH & Web Terminal” add-on to enter the Docker container “hassio_dns” and edit the hosts file. I just added the Duck DNS FQDN for my installation, pointing to its local LAN IP.
However, I don’t know if this has any other implications, nor if it will survive reboots and updates.
There should be a simpler solution than these obscure black-magic hacks.
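If the hosts-file workaround has to be re-applied after restarts, the edit itself can at least be made repeatable. The sketch below only shows the text manipulation: `ensure_hosts_entry`, the hostname `myha.duckdns.org` and the address `192.168.0.10` are hypothetical, and how the result gets written back into the container (and whether it survives updates) is left open, just as in the post above.

```python
def ensure_hosts_entry(hosts_text: str, ip: str, hostname: str) -> str:
    """Return hosts_text with exactly one line mapping hostname to ip.

    Running it again on its own output changes nothing (idempotent).
    """
    kept = []
    for line in hosts_text.splitlines():
        # Ignore trailing comments when looking for the hostname.
        fields = line.split("#", 1)[0].split()
        if hostname in fields[1:]:
            # Keep any other names that shared the old line.
            others = [name for name in fields[1:] if name != hostname]
            if others:
                kept.append(fields[0] + "\t" + " ".join(others))
            continue
        kept.append(line)
    kept.append(f"{ip}\t{hostname}")
    return "\n".join(kept) + "\n"


original = "127.0.0.1\tlocalhost\n"
once = ensure_hosts_entry(original, "192.168.0.10", "myha.duckdns.org")
twice = ensure_hosts_entry(once, "192.168.0.10", "myha.duckdns.org")
print(once, end="")
print(once == twice)
```

Because the function replaces rather than appends, it is safe to call from a startup script without the file growing on every restart.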

Is there any documentation on how homeassistant’s name-resolving works in detail and how/when it uses the local dns-server that is configured via webgui and/or SSH?