Home Assistant cloud and Home Assistant container

I recently moved my HA installation from QEMU to Docker due to a hardware change and all the challenges that come with having LibVirt and Docker on the same machine (looking at you, NetworkManager!).

Anyway, being familiar with HA, the migration has been smooth sailing so far, and I’ve had the opportunity to redo many automations, scripts and UIs that have ‘evolved’ over the years.

When the big moment arrived, I expected to simply unregister the Nabu Casa registration on the old instance and register the new instance to enable the remote UI for the new installation. It registers fine and the registration shows up on the Nabu Casa website, but as disconnected. Home Assistant says the cloud status is connected, but that remote access is being prepared. It’s been stuck in this state for more than 24 hours, and the troubleshooting guide doesn’t help much either (wait, or disable IPv6, which isn’t even enabled). The log shows that one of the Let’s Encrypt endpoints can’t be reached. I wouldn’t mind for now, but this error prevents iOS clients from registering with this instance.

Any suggestions? Is there a way to use an alternate letsencrypt endpoint?

Here’s the log:

Logger: hass_nabucasa.remote
Source: /usr/local/lib/python3.12/site-packages/hass_nabucasa/remote.py:533
First occurred: 12:01:07 (1 occurrences)
Last logged: 12:01:07

Unexpected error in Remote UI loop
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/usr/local/lib/python3.12/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
TimeoutError: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 404, in _make_request
    self._validate_conn(conn)
  File "/usr/local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 1058, in _validate_conn
    conn.connect()
  File "/usr/local/lib/python3.12/site-packages/urllib3/connection.py", line 363, in connect
    self.sock = conn = self._new_conn()
                       ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/urllib3/connection.py", line 179, in _new_conn
    raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x7fc165736930>, 'Connection to acme-v02.api.letsencrypt.org timed out. (connect timeout=45)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 799, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='acme-v02.api.letsencrypt.org', port=443): Max retries exceeded with url: /directory (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fc165736930>, 'Connection to acme-v02.api.letsencrypt.org timed out. (connect timeout=45)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/hass_nabucasa/remote.py", line 495, in _certificate_handler
    if not await self.load_backend():
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/hass_nabucasa/remote.py", line 264, in load_backend
    await self._acme.issue_certificate()
  File "/usr/local/lib/python3.12/site-packages/hass_nabucasa/acme.py", line 422, in issue_certificate
    await self.cloud.run_executor(self._create_client)
  File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/hass_nabucasa/acme.py", line 238, in _create_client
    directory = client.ClientV2.get_directory(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/acme/client.py", line 330, in get_directory
    return messages.Directory.from_json(net.get(url).json())
                                        ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/acme/client.py", line 705, in get
    self._send_request('GET', url, **kwargs), content_type=content_type)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/acme/client.py", line 647, in _send_request
    response = self.session.request(method, url, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/requests/adapters.py", line 507, in send
    raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='acme-v02.api.letsencrypt.org', port=443): Max retries exceeded with url: /directory (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fc165736930>, 'Connection to acme-v02.api.letsencrypt.org timed out. (connect timeout=45)'))
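
The traceback boils down to a plain TCP connect on port 443 timing out, before any TLS or ACME traffic happens. A minimal sketch for reproducing just that step, for example with the Python interpreter inside the container; the function name `can_connect` and the 5-second default timeout are assumptions for illustration, not part of `hass_nabucasa`:

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        # Same low-level step that times out in the traceback (sock.connect).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS failures alike.
        return False
```

Comparing `can_connect("letsencrypt.org")` with `can_connect("acme-v02.api.letsencrypt.org")` from inside the container, and then again from the host, should show whether only this one endpoint is affected and whether the problem is specific to the container network.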

edit:
A few more observations:

  1. IPv6 is disabled

    0bb67c2416d6:/config# ip addr
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    47: eth0@if48: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
        link/ether 02:42:ac:38:05:0a brd ff:ff:ff:ff:ff:ff
        inet 172.56.5.10/16 brd 172.56.255.255 scope global eth0
           valid_lft forever preferred_lft forever
    
  2. name resolution and ICMP work for any other target in the container

    0bb67c2416d6:/config# ping 9.9.9.9
    PING 9.9.9.9 (9.9.9.9): 56 data bytes
    64 bytes from 9.9.9.9: seq=0 ttl=59 time=13.625 ms
    64 bytes from 9.9.9.9: seq=1 ttl=59 time=13.518 ms
    64 bytes from 9.9.9.9: seq=2 ttl=59 time=13.699 ms
    64 bytes from 9.9.9.9: seq=3 ttl=59 time=15.253 ms
    
    0bb67c2416d6:/config# nslookup letsencrypt.org
    Server:         127.0.0.11
    Address:        127.0.0.11#53
    
    Non-authoritative answer:
    Name:   letsencrypt.org
    Address: 52.58.254.253
    Name:   letsencrypt.org
    Address: 35.156.224.161
    Name:   letsencrypt.org
    Address: 2a05:d014:58f:6202::64
    Name:   letsencrypt.org
    Address: 2a05:d014:275:cb01::c8
    
  3. ping does not work for acme-v02.api.letsencrypt.org endpoint, but name resolution does

    0bb67c2416d6:/config# ping acme-v02.api.letsencrypt.org
    PING acme-v02.api.letsencrypt.org (172.65.32.248): 56 data bytes
    ^C
    --- acme-v02.api.letsencrypt.org ping statistics ---
    5 packets transmitted, 0 packets received, 100% packet loss
    
    0bb67c2416d6:/config# nslookup acme-v02.api.letsencrypt.org
    Server:         127.0.0.11
    Address:        127.0.0.11#53
    
    Non-authoritative answer:
    acme-v02.api.letsencrypt.org    canonical name = prod.api.letsencrypt.org.
    prod.api.letsencrypt.org        canonical name = ca80a1adb12a4fbdac5ffcbc944e9a61.pacloudflare.com.
    Name:   ca80a1adb12a4fbdac5ffcbc944e9a61.pacloudflare.com
    Address: 172.65.32.248
    Name:   ca80a1adb12a4fbdac5ffcbc944e9a61.pacloudflare.com
    Address: 2606:4700:60:0:f53d:5624:85c7:3a2c
    
  4. I can ping the (cloudflare protected) endpoint from the docker host

    ping -I enp1s0f0 172.65.32.248
    PING 172.65.32.248 (172.65.32.248) from 192.168.65.2 enp1s0f0: 56(84) bytes of data.
    64 bytes from 172.65.32.248: icmp_seq=1 ttl=56 time=105 ms
    64 bytes from 172.65.32.248: icmp_seq=2 ttl=56 time=105 ms
    64 bytes from 172.65.32.248: icmp_seq=3 ttl=56 time=105 ms
    

tl;dr: From the latest Docker image, I can’t establish a connection to one particular Let’s Encrypt endpoint, which is required to sign the Nabu Casa remote UI certificates for my instance. Every other connection from my Home Assistant container (cloud integrations, HACS) works.

What name servers are you using in /etc/resolv.conf?
I use this

nameserver 1.1.1.1
nameserver 192.168.31.103

container

cat /etc/resolv.conf
search plogas.dmz plogas.local plogas.iot
nameserver 127.0.0.11
options edns0 trust-ad ndots:0

host

cat /etc/resolv.conf
nameserver 127.0.0.53
options edns0 trust-ad
search plogas.dmz plogas.local plogas.iot

I did not specify any additional NS for the container.

Try setting Cloudflare DNS like I did and reboot the computer. You could just restart the service, but I would reboot.
I messed this up a few days ago while playing around and got problems similar to yours.

I prefer not to use Cloudflare’s DNS; I’m using Quad9 at the network level.
It’s not NS related, as ping resolves the name perfectly fine. Pinging the address simply does not work from within the container:

8517f9268977:/config# ping -I eth0 172.65.32.248
PING 172.65.32.248 (172.65.32.248): 56 data bytes
^C
--- 172.65.32.248 ping statistics ---
68 packets transmitted, 0 packets received, 100% packet loss

I can ping the Cloudflare-protected Let’s Encrypt endpoint from the Docker host, however:

ping -I enp4s0f4u1u3u2 172.65.32.248
PING 172.65.32.248 (172.65.32.248) from 192.168.56.2 enp4s0f4u1u3u2: 56(84) bytes of data.
64 bytes from 172.65.32.248: icmp_seq=1 ttl=56 time=105 ms
64 bytes from 172.65.32.248: icmp_seq=2 ttl=56 time=105 ms
64 bytes from 172.65.32.248: icmp_seq=3 ttl=56 time=105 ms
64 bytes from 172.65.32.248: icmp_seq=4 ttl=56 time=105 ms
64 bytes from 172.65.32.248: icmp_seq=5 ttl=56 time=105 ms
64 bytes from 172.65.32.248: icmp_seq=6 ttl=56 time=105 ms

So, why is this particular host not accessible from inside the container, but every other host on the internet?
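
One local cause that may be worth ruling out when a single public address is unreachable from containers is a Docker network whose subnet happens to cover that address, so the traffic is routed into a bridge instead of out to the internet. The container address shown earlier, 172.56.5.10/16, lies outside the RFC 1918 private range 172.16.0.0/12, which suggests custom subnets are in use. A minimal sketch of such a check follows; the function name is made up, the list of subnets is an assumption that would have to be filled in from `docker network inspect` on the host, and 172.65.0.0/16 in the usage note is purely hypothetical:

```python
import ipaddress


def overlapping_subnets(target_ip: str, subnets: list[str]) -> list[str]:
    """Return the subnets (CIDR strings) that contain target_ip."""
    target = ipaddress.ip_address(target_ip)
    return [
        cidr
        for cidr in subnets
        # strict=False also accepts host addresses such as 172.56.5.10/16
        if target in ipaddress.ip_network(cidr, strict=False)
    ]
```

With only the subnet visible in this thread, `overlapping_subnets("172.65.32.248", ["172.56.5.10/16"])` returns an empty list, so that network alone does not explain the problem. A hypothetical second network on 172.65.0.0/16 would be reported as a match and would be a candidate for the cause.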

I tried pinging some outside hosts from my Docker containers; one is in host mode, another is in bridge mode. Both work fine.
It might be your Docker internal network. I use bridge or host networking for my containers.
Check whether the other containers on that Docker network work…

It might be your docker internal network.

That was my initial suspicion too, but I can ping / resolve ANY other host from inside the container except this particular one.

I can even curl the regular letsencrypt.org website from within the container

8517f9268977:/config# curl -v https://letsencrypt.org/
* Host letsencrypt.org:443 was resolved.
* IPv6: 2a05:d014:275:cb02::c8, 2a05:d014:58f:6200::64
* IPv4: 3.72.140.173, 52.58.254.253
*   Trying 3.72.140.173:443...
* Connected to letsencrypt.org (3.72.140.173) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / X25519 / id-ecPublicKey
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=lencr.org
*  start date: Jan 27 21:30:22 2024 GMT
*  expire date: Apr 26 21:30:21 2024 GMT
*  subjectAltName: host "letsencrypt.org" matched cert's "letsencrypt.org"
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify ok.
*   Certificate level 0: Public key type EC/prime256v1 (256/128 Bits/secBits), signed using sha256WithRSAEncryption
*   Certificate level 1: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
*   Certificate level 2: Public key type RSA (4096/152 Bits/secBits), signed using sha256WithRSAEncryption
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://letsencrypt.org/
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: letsencrypt.org]
* [HTTP/2] [1] [:path: /]
* [HTTP/2] [1] [user-agent: curl/8.5.0]
* [HTTP/2] [1] [accept: */*]
> GET / HTTP/2
> Host: letsencrypt.org
> User-Agent: curl/8.5.0
> Accept: */*
...


Leave it for 24 hours. Maybe there were too many queries.

Probably the best advice. According to a very similar issue reported in the letsencrypt forums I likely ran into a blocklist with my attempts.

So I waited 48h just to be sure. I just tried again and I still see the same error message, so it seems to be container related.

Did you try contacting Nabu Casa about this problem?
I don’t think the two of us will find a solution for it.

Yes, I opened a support ticket yesterday

I run the Home Assistant container in host mode. If it’s the same for you, the only thing that crosses my mind is AdGuard or Pi-hole, or maybe something else that is blocking access.
I don’t know what else it could be.