HAOS mDNS failed last night

Just in case it helps anyone here, here’s how I detected, found and fixed the issue - it was in the end entirely my fault, but it was particularly tricky to diagnose.
I have homeassistant set up in a virtualbox VM running on a Mac Mini on my home network. It had all been working fine for many months. I have a unifi network. I upgraded my router from a USG to a UCG a few weeks ago, which also included separating devices on the network into VLANs. I made this change and throughout HA continued working fine. Yesterday I noticed homeassistant.local didn’t resolve. I chased my tail thinking it was a broader mDNS issue for some time and tried changing all the Unifi mDNS settings. I had set up mDNS proxy across VLANs because some of the IOT devices I use need it. But eventually I realised that printer.local and other mDNS endpoints were all working fine - it was just HA.
I updated HA OS and core, rebooted. The problem went away. For about 4 minutes. Then came back. In the HA Host logs I discovered a tight loop of systemd resolved conflict detected homeassistant12345 renamed to homeassistant12346. I then googled the error and came across a chain of closed HA github issues all of which referenced each other and each pointing out fixes that contradicted each other. I didn’t understand why this was all broken now.
Eventually, I found the issue when I tried to ssh into the Mac Mini running the HA VM. I blame Apple:
When I was setting up my new Unifi router, I created new VLANs and new SSIDs to separate devices and traffic. In order to set up devices on all these networks, I had to join my phone (an iPhone) to each of the networks to use the many different apps to configure all the IOT stuff I own. Because Apple has the nice ‘feature’ of saving all the network SSIDs and logins between all devices on your iCloud account, my Mac Mini also saved the SSID and login for every separate network on each VLAN. The Mac Mini also has a wired connection. So, randomly, the Mac Mini would sometimes connect to a different SSID putting it on a different VLAN to the wired connection. So HA host would see mDNS reflected queries coming from each interface and it interpreted this as a conflict. Probably causing lots of UDP churn traffic on the network at the same time. The fix was simply to turn wifi off on the Mac Mini.

2 Likes