“I use the Hass.io Add-On” is the one that actually connects. But it always seems to fail eventually. I even deleted Node Red entirely and rebuilt it from the ground up (I only had 4 flows, which I had previously exported). Same thing: after 15-20 minutes (it seems to be on 5-minute socket resets?), the connection fails. All using the same auth key from the ENV of the container, HASSIO_TOKEN.
I am fully aware of that. My router doesn’t support reverse NAT addressing, but I do have my FQDN set up as an internally mapped host for internal name resolution, i.e. mysite.com resolves to the internal IP of my hassio box.
I tried the internal IP, the FQDN, the hassio docker host. I’m trying everything. The only thing that actually connects is the hassio alias. But, it doesn’t stay connected.
My SSL certificate is bound to *.mydomain.com, so I also tried the FQDN in case the “Accept Unauthorized SSL Certificates” option wasn’t taking effect.
I’m flipping all the switches - nothing’s working. NR fails after some number of reconnections, but that number varies and I’m not seeing any pattern in it.
Ah, it did accept that URL, but it’s still stuck in a loop trying to connect. I used both the FQDN and the IP address with “Accept Unauthorized SSL Certificates” checked.
No luck. I even tried insecure, since it’s using ws:// anyway:
The only thing that actually connects is using “I have Hass.io.” It just doesn’t stay connected. It is using the HASSIO_TOKEN env variable. It’s not the token, either - it does connect. But it fails on reconnection:
1 Jan 20:31:03 - [debug] [server:Home Assistant] WebSocket Connecting http://hassio/homeassistant
1 Jan 20:31:03 - [debug] [server:Home Assistant] config server event listener connecting
1 Jan 20:31:03 - [info] Server now running at http://127.0.0.1:46836/
1 Jan 20:31:03 - [info] [server:Home Assistant] WebSocket Connected to http://hassio/homeassistant
1 Jan 20:31:03 - [debug] [server:Home Assistant] config server event listener connected
[20:31:04] INFO: Starting NGinx...
I don’t have a hass.io environment set up at the moment to do any testing. If you have the ability, you could try using a normal docker install of NR and the ha-websocket package and see if that stays connected.
Given your hit on hassio_supervisor, I decided to look at that. We might have something… Apparently, a failed connection is crashing supervisor:
20-01-02 04:21:42 INFO (MainThread) [hassio.api.proxy] WebSocket access from a0d7b954_nodered
20-01-02 04:21:42 INFO (MainThread) [hassio.api.proxy] Home Assistant WebSocket API request running
20-01-02 04:23:30 INFO (MainThread) [__main__] Stopping Hass.io
20-01-02 04:23:30 INFO (MainThread) [hassio.misc.forwarder] Stop DNS forwarding
20-01-02 04:23:30 INFO (MainThread) [hassio.api.proxy] Home Assistant WebSocket API error: Received message 257:None is not str
20-01-02 04:23:30 INFO (MainThread) [hassio.api.proxy] Home Assistant WebSocket API connection is closed
20-01-02 04:23:30 INFO (MainThread) [hassio.api] Stop API on 172.30.32.2
20-01-02 04:23:30 INFO (MainThread) [hassio.core] Hass.io is down
20-01-02 04:23:30 INFO (MainThread) [__main__] Close Hass.io
It crashed?! And then restarted. It happened right as NR got the failed auth connection.
PS: Actually, it might be the other way around: hassio_supervisor is crashing, disconnecting the sockets, causing NR to try and reconnect. The multiple attempts happen as hassio_supervisor restarts.
There’s a window while supervisor is restarting during which NR gets back auth_invalid.
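To make the auth_invalid window concrete, here is a minimal sketch of the auth phase of the Home Assistant WebSocket protocol as I understand it: the server sends auth_required, the client answers with an auth message carrying the token, and the server replies auth_ok or auth_invalid. The function name next_auth_step and the "retry_later" action are my own assumptions for illustration. This is not how Node Red or the HA nodes actually implement it.

```python
import json


def next_auth_step(raw_message, token):
    """Decide what a client should do with one message from the auth phase.

    Returns an (action, payload) tuple. The action names are hypothetical.
    """
    msg = json.loads(raw_message)
    kind = msg.get("type")
    if kind == "auth_required":
        # Answer the challenge with the access token.
        return ("send", json.dumps({"type": "auth", "access_token": token}))
    if kind == "auth_ok":
        return ("ready", None)
    if kind == "auth_invalid":
        # Assumption: if the proxy is mid-restart this may be transient,
        # so a client could wait and reconnect instead of giving up.
        return ("retry_later", msg.get("message"))
    return ("ignore", None)


if __name__ == "__main__":
    print(next_auth_step('{"type": "auth_invalid", "message": "Invalid access"}', "TOKEN"))
```

The point is only that auth_invalid does not necessarily mean the token is bad. If the thing answering the socket is restarting, a perfectly good token can still be rejected.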
Except it’s the other way around. The proxy is crashing while HA stays online:
pz@hermes:~$ sudo docker ps | grep ass
bbfbcaa86c43 homeassistant/amd64-hassio-dns:1 "coredns -conf /conf…" 1 second ago Up Less than a second hassio_dns
b36eb4b06a34 homeassistant/amd64-hassio-supervisor "/bin/entry.sh pytho…" 3 seconds ago Up 2 seconds hassio_supervisor
ee9d83c273c3 hassioaddons/node-red-amd64:5.0.7 "/init" About an hour ago Up About an hour addon_a0d7b954_nodered
c1460fe27f0c sabeechen/hassio-google-drive-backup-amd64:0.98.3 "python3 -m backup" 5 hours ago Up 5 hours 0.0.0.0:1627->1627/tcp, 8099/tcp addon_cebe7a76_hassio_google_drive_backup
1e7fe4e5f7b2 homeassistant/amd64-addon-samba:9.0 "/run.sh" 5 hours ago Up 5 hours addon_core_samba
dab2cb7832b7 hassioaddons/aircast-amd64:2.2.1 "/init" 5 hours ago Up 5 hours addon_a0d7b954_aircast
e1c88b2e7b5c homeassistant/qemux86-64-homeassistant:0.103.5 "/bin/entry.sh pytho…" 2 days ago Up 5 hours homeassistant
My HA has been online for 5+ hours. hassio_supervisor and hassio_dns continue to “crash” and restart for some reason. I don’t know why. It happens right around the 5 minute mark, too, same time NR loses its socket.
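If anyone wants to spot this kind of restart loop without eyeballing docker ps, here is a rough sketch that flags containers whose uptime is below a threshold. It assumes the input comes from something like docker ps --format '{{.Names}}\t{{.Status}}' (name, tab, status) and that the status uses the usual "Up 5 hours" / "Up About an hour" wording. The function names and the 300-second threshold are my own choices, and the sample rows are trimmed from the listing above.

```python
import re

# Approximate seconds per unit; months and years are rough on purpose.
UNIT_SECONDS = {
    "second": 1, "minute": 60, "hour": 3600, "day": 86400,
    "week": 604800, "month": 2592000, "year": 31536000,
}
STATUS = re.compile(r"(?:About an?|(\d+)) (second|minute|hour|day|week|month|year)s?")


def uptime_seconds(status):
    """Turn a status like 'Up 5 hours' into approximate seconds.

    Returns None for containers that are not running or for unknown wording.
    """
    if not status.startswith("Up "):
        return None
    text = status[3:]
    if text.startswith("Less than a second"):
        return 0
    match = STATUS.match(text)
    if match is None:
        return None
    count = int(match.group(1)) if match.group(1) else 1
    return count * UNIT_SECONDS[match.group(2)]


def recently_restarted(ps_output, threshold=300):
    """List container names that have been up for less than `threshold` seconds."""
    names = []
    for line in ps_output.strip().splitlines():
        name, status = line.split("\t", 1)
        seconds = uptime_seconds(status)
        if seconds is not None and seconds < threshold:
            names.append(name)
    return names


SAMPLE = "\n".join([
    "hassio_dns\tUp Less than a second",
    "hassio_supervisor\tUp 2 seconds",
    "addon_a0d7b954_nodered\tUp About an hour",
    "homeassistant\tUp 5 hours",
])

if __name__ == "__main__":
    print(recently_restarted(SAMPLE))  # ['hassio_dns', 'hassio_supervisor']
```

Run on a schedule, a check like this would have shown hassio_supervisor and hassio_dns never getting past a few minutes of uptime while homeassistant kept running.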
I finally found the root cause of my issue: it was my watchtower container.
I took a look at its logs and boom, found it:
time="2020-01-02T16:38:41Z" level=info msg="Creating /hassio_supervisor"
Failed to send notification email: dial tcp 172.217.212.108:25: connect: connection timed out
time="2020-01-02T16:43:21Z" level=info msg="Found new homeassistant/amd64-hassio-supervisor:latest image (sha256:032972a1a170fba081dc67f359866683e23101010f50c73bc657d72d15b20b88)"
time="2020-01-02T16:43:27Z" level=info msg="Stopping /hassio_supervisor (9310315752ceaec2ae84db7c53c77fefd7d6b361d442e887ffffad3918936cb0) with SIGTERM"
time="2020-01-02T16:43:28Z" level=info msg="Creating /hassio_supervisor"
Failed to send notification email: dial tcp 172.217.212.108:25: connect: connection timed out
time="2020-01-02T16:48:29Z" level=info msg="Found new homeassistant/amd64-hassio-supervisor:latest image (sha256:032972a1a170fba081dc67f359866683e23101010f50c73bc657d72d15b20b88)"
time="2020-01-02T16:48:35Z" level=info msg="Stopping /hassio_supervisor (ca7ea8346e051c6b34c2e9489db2f1f88fc00559fc221a3a4e522706adff9369) with SIGTERM"
time="2020-01-02T16:48:51Z" level=info msg="Creating /hassio_supervisor"
SMTP isn’t working, which is why I was never informed of this (time to fix that). But it looks like all this time the issue was Watchtower not properly pulling down the new image for the rebuild.
Talk about a chain of events: Node Red goes offline, caused by an auth_invalid token, caused by hassio_supervisor restarting every 5 minutes, caused by Watchtower failing to pull down the latest hassio_supervisor docker image, resulting in a constant destroy and rebuild every 5 minutes.
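For anyone who wants to confirm the 5-minute cycle from the Watchtower logs rather than by eye, here is a small sketch that pulls the "Stopping /<container>" lines out of the log text and prints the gap between consecutive stops. It assumes the time="..." level=... msg="..." layout shown in the excerpt above. The function name stop_intervals is made up for this example, and the sample lines are copied from that excerpt.

```python
import re
from datetime import datetime

# Matches Watchtower lines of the form: time="..." level=info msg="Stopping /name (...)"
STOP_LINE = re.compile(r'time="([^"]+)" level=\w+ msg="Stopping /(\S+) ')


def stop_intervals(log_text, container):
    """Return the seconds between consecutive stops of `container`."""
    stamps = []
    for line in log_text.splitlines():
        match = STOP_LINE.match(line)
        if match and match.group(2) == container:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%SZ"))
    return [(later - earlier).total_seconds() for earlier, later in zip(stamps, stamps[1:])]


SAMPLE_LOG = "\n".join([
    'time="2020-01-02T16:43:27Z" level=info msg="Stopping /hassio_supervisor (9310315752ceaec2ae84db7c53c77fefd7d6b361d442e887ffffad3918936cb0) with SIGTERM"',
    'time="2020-01-02T16:43:28Z" level=info msg="Creating /hassio_supervisor"',
    'Failed to send notification email: dial tcp 172.217.212.108:25: connect: connection timed out',
    'time="2020-01-02T16:48:35Z" level=info msg="Stopping /hassio_supervisor (ca7ea8346e051c6b34c2e9489db2f1f88fc00559fc221a3a4e522706adff9369) with SIGTERM"',
])

if __name__ == "__main__":
    print(stop_intervals(SAMPLE_LOG, "hassio_supervisor"))  # [308.0]
```

A gap of 308 seconds lines up with the roughly 5-minute socket resets NR was seeing.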
I’m a bit late to this thread, but have also just run into this issue. I’m running HA, Mosquitto and Node Red in Docker, all on a named bridge network.
HA Version core-2021.3.4
19 Apr 15:16:31 - [info] Node-RED version: v1.3.2
I set a long-lived access token for Node Red and then it works great for a few minutes (i.e. not just a single call: all the states are updated and it can make service API calls back into HA).
Then I get WebSocket Closed to my server address.
I’m sure it’s the auth issue that is discussed above, since to ‘fix’ it I need to restart the Node Red container, after which it works again for a few minutes.
See my last reply. The root of my issue was a third-party application (Watchtower) updating the containers. This resulted in mixed up networking that supervisor could not handle.
My only solution was to stop HA completely, delete all existing containers, and then reboot the box, letting supervisor restore everything back to the way it expected. I also had to re-add my addons, since they’re managed individually by supervisor.
Thank you Philip for the suggestion. I read that with interest but I am not using Watchtower nor a supervisor process that I am aware of. Unless this is part of Docker itself?
I am using Portainer to manage my docker containers.
The HA stack does not like third parties interfering with it. HA creates static assignments within its network, and when a third party starts stopping and restarting the containers, the container IP addresses start changing, resulting in the errors I had.