I fell into this hassio core-dns rabbit hole a few days ago. I never had name resolution problems with HA. TBH I didn’t even know HA is using it’s own dns until then. In my case this seems to be triggered by switching from a “dnsmasq classic” name server to pihole for local name resolution. Since then, ha-dns seems to be regularly failing to do proper name resolution for local hosts after some time and is stuck with the hardcoded cloudflare dns until the dns container is restarted.
I don’t know, why ha-dns seems to be so finicky about this change. Maybe pihole is answering DNS requests a bit more slowly…? Or just because? I don’t know.
Unreliable name resolution in a local network, that is well-curated with proper host names and static dhcp leases is a huge PITA. Failing name resolution virtually breaks everything. In the worst case it’s leading to me sitting next to the broken DNS server with a laptop connected directly via ethernet cable and debugging it, because even wifi access points stop working properly after a couple of minutes when name resolution is down.
Because of that I’m deliberately running two name servers with failover functionality in my local network to avoid a SPOF for name resolution. HA breaks this redundancy by insisting of using it’s own nameserver. If that breaks, a lot of things in HA are going sideways, even HA core functionality like writing values to the recorder and influxdb databases.