Help: HA OS installation error/network DNS

Hi everyone,

So I decided to switch from Supervised to HAOS. For that I got a new SSD, installed the OS using the RPi Imager, and kept the old SSD just in case. The installation runs on an RPi4, which is why I assumed I could just remove the old SSD and connect the new one with HAOS.

Doing so, HA starts with the usual message “Setup can take up to 20min”. But a couple of minutes later my internet stops working. My router can no longer connect to the ISP (SIM card).

Upon checking the logs of the router (TP-Link Archer MR600V1) I find this:

On my router only the default DNS addresses are configured 212…15 + .14.
The other one, 8.8.8.8, is not assigned anywhere. This error occurs every time I start up the Pi (wired connection) with the new HA OS SSD.
As soon as I remove the Pi (ethernet cable) the router starts functioning again.

Afterwards I tried powering up the Pi without any internet connection, waited 20+ min, and reconnected the Pi to the internet, but the installation wasn’t finished. It still showed the setting-up page.

My questions:

  • Do I have to reset the Pi completely? → reinstall the bootloader?
  • Does HA OS need Google’s DNS?
  • Does the installation process need an internet connection?

How can I resolve this?
Where do I look for further clues?

Thanks in advance!

HAOS requires a valid DNS server, and all IP settings must be correct, as in the example screenshot below (Settings → System → Network). Most likely your DHCP server is not configured properly.

I can’t even configure those settings because the installation doesn’t go past the onboarding screen.
I doubt the router is the problem. Every other device connects perfectly to the internet.
Besides, the DNS servers (ISP) of my router are not configurable.

Please check your DHCP server, as per my suggestion above.

Here’s a screenshot of my settings:

Do you see something off?

It looks like you’re using either DoH or DoT since you have a DNS call to resolve dns.msftnsci.com. I’m not familiar with that particular DNS server but the name plus the sequence of calls looks like a typical DoH or DoT setup.

When you use DoH or DoT, there needs to be another DNS server, specified as a plain IP address, that resolves the hostname of the DoT or DoH server. Typically this is called the bootstrap DNS server in the settings of software that supports DoT or DoH. The bootstrap DNS often defaults to either 8.8.8.8 or 1.1.1.1, since those are well-known, always-available DNS servers, and it is only used once to resolve the DoH or DoT server’s hostname.

All this being said, HAOS does not support DoH or DoT, at least not user-configured DoH or DoT. Supervisor uses Cloudflare’s DNS server via DoT as its fallback DNS by default and uses 1.1.1.1 as its bootstrap for this. But this isn’t configurable. Other than the fallback, the only DNS servers come from DHCP or from the user entering an IP address. Either way it’s using normal, unencrypted DNS on port 53.
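To make the port distinction concrete, here is a minimal Python sketch that probes whether a TCP connection to a resolver can be opened on port 53 (plain DNS) or port 853 (DoT). The helper name `tcp_port_open` and the 3-second timeout are illustrative assumptions, not part of HAOS or Supervisor, and the check only shows TCP reachability, not that the server actually answers DNS queries.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False
```

Run from any machine on the same network, `tcp_port_open("1.1.1.1", 53)` and `tcp_port_open("1.1.1.1", 853)` would indicate whether the router or ISP lets plain DNS and DoT traffic through to Cloudflare's resolver.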

If you’re certain that traffic is coming from the HA machine then it seems like it might not have been properly reflashed. Something unknown must be running in that case.

EDIT: actually I’m not sure about 1.1.1.1 being the bootstrap DNS for supervisor, I’ll have to look that up. Either way it would only be to resolve dns.cloudflare.com, not dns.msftnsci.com

Your DHCP server DNS entries are blank. Try putting a public DNS server address (8.8.8.8 etc.) in there.

Interesting observation.
I checked the router settings but didn’t find anything related to DNS over HTTPS (DoH). The manufacturer’s manual doesn’t mention it either.
When you say “HA machine might not be properly flashed”, do you mean that RPi might be causing the issue? Should I flash the bootloader again?

I left those blank since it says (optional).
Tomorrow I’ll try 1.1.1.1 & 8.8.8.8 and report back.

Yeah, I guess I was basically just wondering if the original operating system was still in operation somehow. Honestly it’s pretty confusing if you’re certain that traffic is coming from the HAOS machine and it has been flashed with HAOS. Debian definitely supports DoH with some of the software that can be installed and configured, but HAOS really doesn’t. Unless you installed the AdGuard add-on or something, but you clearly can’t get that far.

For the record, I didn’t really think it was coming from the router. The settings you showed are clearly just asking for an IP address so it just wants to know what DNS server to provide for DHCP clients. That can only be a normal IP address to an unencrypted DNS server reachable on port 53. If the router had DoH or DoT settings then that would be in a separate spot. It would also only work if the router was providing its IP address as the DNS server to all DHCP clients and then forwarding that traffic to a separate DoT or DoH server.

It’s more likely some client on the network has DoH or DoT settings and is ignoring the DNS server provided by the router. Hence why I was wondering if it really was the HAOS machine or else if perhaps the original system was still running? Otherwise I’m really not sure how that could happen.

I believe your problem stems from the fact that your DHCP address pool is too small (10-24). You only allow for up to 15 machines to get a DHCP address, which is low unless all your hosts use static IPs. I would guess your HAOS cannot get a DHCP address because your IP address pool is exhausted.

Check the list of DHCP leases handed out to confirm or rule out my suspicion.
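As a rough way to check this kind of suspicion, the sketch below counts the addresses in an inclusive DHCP range and compares that with the number of active leases shown by the router. The function names and the `192.168.1.x` addresses in the usage note are placeholders chosen for illustration; the lease count has to be read off the router's DHCP client list by hand.

```python
import ipaddress


def dhcp_pool_size(first: str, last: str) -> int:
    """Count the addresses in an inclusive DHCP range."""
    start = ipaddress.IPv4Address(first)
    end = ipaddress.IPv4Address(last)
    if end < start:
        raise ValueError("last address must not be lower than first")
    # Both endpoints can be handed out, hence the +1.
    return int(end) - int(start) + 1


def pool_exhausted(first: str, last: str, active_leases: int) -> bool:
    """True when every address in the range is already leased."""
    return active_leases >= dhcp_pool_size(first, last)
```

For example, `pool_exhausted("192.168.1.10", "192.168.1.24", active_leases=9)` returns `False`, meaning a new client should still be able to obtain a lease.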

Suggested changes in red

It didn’t help. Still the same issue after configuring those DNS addresses.
Restarted the router afterwards and connected the Pi with the HAOS SSD but still fails.
I’m out of ideas…

EDIT: forgot to mention that the pool is actually from .100 to .240, so the Pi does receive an IP.

I reset the Pi’s EEPROM to defaults using an SD card and then tried connecting the SSD again, but the problem remains.
The Pi is directly wired to the router.
Do you think that even then another network device could be causing the trouble?

I suggest you attempt to install Debian or Ubuntu and see if the problem persists.

Also, after your connection drops, try using an IP scanner to check what you can see on the network.

Also enable IGMP Snooping on the DHCP server. This setting is related to multicasting, but sometimes vendors implement different things behind settings.

Hi, thanks for the suggestions. I implemented them but I’m still having issues.
Recently I checked the DNS logs on HA and saw this:

[INFO] 127.0.0.1:60303 - 61733 "NS IN . udp 17 false 512" NOERROR - 0 30.000412266s
[ERROR] plugin/errors: 2 . NS: dial tcp 1.1.1.1:853: i/o timeout
[INFO] 127.0.0.1:48031 - 40551 "NS IN . udp 17 false 512" NOERROR - 0 30.000554793s
[ERROR] plugin/errors: 2 . NS: dial tcp 1.0.0.1:853: i/o timeout
[INFO] 127.0.0.1:54727 - 45280 "NS IN . udp 17 false 512" NOERROR - 0 30.000337217s
[ERROR] plugin/errors: 2 . NS: dial tcp 1.1.1.1:853: i/o timeout
[INFO] 127.0.0.1:44864 - 14011 "NS IN . udp 17 false 512" NOERROR - 0 30.000352951s
[ERROR] plugin/errors: 2 . NS: dial tcp 1.0.0.1:853: i/o timeout
[INFO] 127.0.0.1:49387 - 32137 "NS IN . udp 17 false 512" NOERROR - 0 30.000445924s
[ERROR] plugin/errors: 2 . NS: dial tcp 1.0.0.1:853: i/o timeout

Is this somehow related to my problem?
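The `[ERROR]` lines show the DNS plugin timing out while opening TCP connections to 1.1.1.1 and 1.0.0.1 on port 853, the port used for DNS over TLS. To tally which upstream endpoints fail, a small Python sketch like the following can be run over a saved copy of the log. The function name `count_upstream_timeouts` and the regular expression are illustrative assumptions based only on the line format pasted above.

```python
import re
from collections import Counter

# Matches lines such as:
# [ERROR] plugin/errors: 2 . NS: dial tcp 1.1.1.1:853: i/o timeout
TIMEOUT_RE = re.compile(
    r"dial (tcp|udp) (\d{1,3}(?:\.\d{1,3}){3}):(\d+): i/o timeout"
)


def count_upstream_timeouts(log_text: str) -> Counter:
    """Tally timeouts per (protocol, ip, port) found in the log text."""
    hits = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            proto, ip, port = match.groups()
            hits[(proto, ip, int(port))] += 1
    return hits
```

If every timeout is on port 853 and none on port 53, that would point at DoT traffic being dropped somewhere between the Pi and the internet rather than at plain DNS.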

When running:

ha network info

this is the output:

docker:
  address: 172.30.32.0/23
  dns: 172.30.32.3
  gateway: 172.30.32.1
  interface: hassio
host_internet: null
interfaces:
- connected: true
  enabled: true
  interface: eth0
  ipv4:
    address:
    - 192.168.1.115/24
    gateway: 192.168.1.1
    method: static
    nameservers:
    - 192.168.1.1
    - 1.1.1.1
    - 8.8.8.8
    ready: true
  ipv6:
    address:
    - fe80::xxxx:xxxx:xxxx:40a/64
    gateway: null
    method: disabled
    nameservers: []
    ready: true
  primary: true
  type: ethernet
  vlan: null
  wifi: null
supervisor_internet: false

Do you see anything misconfigured?
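Two things in this output can be sanity-checked mechanically: whether the gateway lies inside the interface's subnet, and whether the static address falls inside the router's DHCP pool, where it could collide with a lease given to another device. The sketch below does both with Python's `ipaddress` module. The function name is hypothetical, and the pool bounds assume the ".100 to .240" range mentioned earlier is on the same 192.168.1.0/24 network.

```python
import ipaddress


def check_static_config(address_cidr: str, gateway: str,
                        pool_first: str, pool_last: str) -> dict:
    """Report basic consistency checks for a static IPv4 setup."""
    iface = ipaddress.ip_interface(address_cidr)
    gw = ipaddress.ip_address(gateway)
    first = ipaddress.ip_address(pool_first)
    last = ipaddress.ip_address(pool_last)
    return {
        # The gateway must be reachable on the interface's own subnet.
        "gateway_in_subnet": gw in iface.network,
        # A static address inside the DHCP pool may clash with a lease.
        "static_ip_inside_dhcp_pool": first <= iface.ip <= last,
    }


# Values taken from the output above; pool bounds are an assumption.
report = check_static_config(
    "192.168.1.115/24", "192.168.1.1", "192.168.1.100", "192.168.1.240"
)
```

Neither check explains `supervisor_internet: false` on its own, but a static address inside the DHCP range is worth ruling out as a source of conflicts.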