Hey there. I’m facing a slight issue with the callbacks from my Homematic CCU when running Home Assistant in Kubernetes. But let me start from the beginning:
My Goal
I’m running my Home Assistant instance inside my local Kubernetes cluster. I also have a Homematic CCU3 running alongside the cluster. Now I want to integrate the CCU3 into my Home Assistant instance.
My setup
I’m running a K3s cluster locally on a bunch of Raspberry Pis. For the sake of network segmentation, the cluster lives in its own network, 10.22.50.0/24, while all my home-automation devices sit in 10.22.40.0/24.
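For reference, a minimal sketch (Python, run from a node or pod in 10.22.50.0/24) of how basic reachability of the CCU from the cluster network can be checked; address and port are the ones from the configuration shown further below:

# Quick sanity check: can a host in the cluster network (10.22.50.0/24)
# open a TCP connection to the CCU's HmIP XML-RPC port?
import socket

CCU_HOST = "10.22.40.5"   # CCU3 in the home-automation network
CCU_PORT = 2010           # HmIP XML-RPC interface

with socket.create_connection((CCU_HOST, CCU_PORT), timeout=5) as sock:
    print("CCU reachable from", sock.getsockname()[0])

This direction works fine in my setup, as you'll see below.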
I deployed Home Assistant 2023.8.4 in my Kubernetes cluster, and everything works fine so far except for the Homematic callbacks.
The deployment of my Home Assistant instance looks like the following:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: home-assistant
spec:
  serviceName: home-assistant
  selector:
    matchLabels:
      app.kubernetes.io/component: home-automation
      app.kubernetes.io/name: home-assistant
  replicas: 1
  minReadySeconds: 30
  template:
    metadata:
      name: home-assistant
      labels:
        app.kubernetes.io/component: home-automation
        app.kubernetes.io/name: home-assistant
    spec:
      containers:
        - name: home-assistant
          image: ghcr.io/home-assistant/home-assistant:2023.8.4
          ports:
            - name: home-assistant
              containerPort: 8123
              protocol: TCP
            - name: callbacks-tcp
              containerPort: 8060
              protocol: TCP
          env:
            - name: TZ
              value: "Europe/Berlin"
          securityContext:
            privileged: true
            runAsUser: 0
          resources:
            {{- toYaml (.Values.resources) | nindent 12 }}
          volumeMounts:
            - name: home-assistant-config
              mountPath: /config
            - name: localtime
              mountPath: /etc/localtime
              readOnly: true
          livenessProbe:
            {{- toYaml (.Values.livenessProbe) | nindent 12 }}
          readinessProbe:
            {{- toYaml (.Values.readinessProbe) | nindent 12 }}
          startupProbe:
            {{- toYaml (.Values.startupProbe) | nindent 12 }}
      volumes:
        - name: home-assistant-config
          persistentVolumeClaim:
            claimName: home-assistant-config
        - name: localtime
          hostPath:
            path: /etc/localtime
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              preference:
                matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: DoesNotExist
      terminationGracePeriodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: home-assistant
spec:
  selector:
    app.kubernetes.io/component: home-automation
    app.kubernetes.io/name: home-assistant
  type: ClusterIP
  ports:
    - port: 8123
      targetPort: 8123
---
apiVersion: v1
kind: Service
metadata:
  name: home-assistant-callbacks
spec:
  selector:
    app.kubernetes.io/component: home-automation
    app.kubernetes.io/name: home-assistant
  type: LoadBalancer
  ports:
    - port: 8060
      targetPort: 8060
      name: callback-tcp
---
kind: Ingress
apiVersion: networking.k8s.io/v1
metadata:
  name: home-assistant-ui
  annotations:
    kubernetes.io/ingress.class: "traefik"
spec:
  rules:
    - host: home-assistant.home-automation.hansa-net.intra
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: home-assistant
                port:
                  number: 8123
  tls:
    - hosts:
        - home-assistant.home-automation.hansa-net.intra
      secretName: home-assistant-domain-cert
(The Secret containing the TLS certs is omitted as it's not relevant here. Access is restricted to the intranet only, hence no Let's Encrypt involved.)
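One more note on the callback Service: the external IP the load balancer hands out has to be exactly the callback_ip used in the Home Assistant configuration below (10.22.50.22 in my case). A small sketch for reading it out via the Kubernetes API (assuming the Python kubernetes client and that the Service lives in the default namespace):

# Sketch: read the external IP of the callback Service and compare it to the
# callback_ip configured in Home Assistant (expected: 10.22.50.22).
from kubernetes import client, config

config.load_kube_config()  # local kubeconfig; use load_incluster_config() inside a pod
v1 = client.CoreV1Api()
svc = v1.read_namespaced_service("home-assistant-callbacks", "default")  # namespace assumed
ingress = svc.status.load_balancer.ingress or []
print("external IPs:", [i.ip for i in ingress])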
The configuration.yaml of my Home Assistant instance looks like the following:
# Loads default set of integrations. Do not remove.
default_config:

# Load frontend themes from the themes folder
frontend:
  themes: !include_dir_merge_named themes

automation: !include automations.yaml
script: !include scripts.yaml
scene: !include scenes.yaml

http:
  use_x_forwarded_for: true
  trusted_proxies:
    - 10.42.0.0/16
    - 10.43.0.0/16
    - 10.22.50.0/24

homematic:
  local_port: 8060
  interfaces:
    ip:
      host: 10.22.40.5
      port: 2010
      ssl: false
      callback_ip: 10.22.50.22
      callback_port: 8060
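As far as I understand it, the homematic integration registers itself at the CCU via XML-RPC: it calls init() on the HmIP interface and hands over the callback URL built from callback_ip and callback_port. Roughly sketched (simplified, not the actual integration code):

# Rough sketch of what the callback registration boils down to: an XML-RPC
# init() call against the CCU's HmIP interface, telling the CCU where to push
# events to. The interface id is the one showing up in the CCU log below.
from xmlrpc.client import ServerProxy

ccu = ServerProxy("http://10.22.40.5:2010")
ccu.init("http://10.22.50.22:8060", "homeassistant-ip")

So after the registration the CCU has to be able to open a connection towards 10.22.50.22:8060 on its own.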
The issue
In my Home Assistant instance I can see all the Homematic IP devices with their values, so the Homematic integration basically works. Issues appear as soon as I want to, for example, turn a Homematic switch on or off from Home Assistant: the callback from the CCU providing the new state fails.
The Home Assistant logs don't tell much:
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service legacy-services: starting
services-up: info: copying legacy longrun home-assistant (no readiness notification)
s6-rc: info: service legacy-services successfully started
2023-09-07 08:26:58.478 WARNING (MainThread) [aiounifi.models.message] Unsupported message key session-metadata:sync
2023-09-07 08:26:58.488 WARNING (MainThread) [aiounifi.models.message] Unsupported message {'meta': {'message': 'session-metadata:sync', 'rc': 'ok'}, 'data': {'client_mac': 'd8:3a:dd:0e:8c:62'}}
2023-09-07 08:27:46.922 WARNING (MainThread) [homeassistant.config_entries] Config entry 'Radio Browser' for radio_browser integration not ready yet: Could not connect to Radio Browser API; Retrying in background
2023-09-07 08:27:57.173 ERROR (MainThread) [homeassistant.components.homeassistant_alerts] Timeout fetching homeassistant_alerts data
2023-09-07 08:43:17.133 ERROR (MainThread) [homeassistant.components.analytics] Timeout sending analytics to https://analytics-api.home-assistant.io/v1
(Srsly, that’s all)
Looking at the logs of my Homematic CCU, I see the following:
2023-09-07 09:07:10,659 de.eq3.cbcs.legacy.bidcos.rpc.LegacyServiceHandler INFO [vert.x-worker-thread-1] (un)registerCallback on LegacyServiceHandler called from url: http://10.22.50.22:8060
2023-09-07 09:07:10,664 de.eq3.cbcs.legacy.bidcos.rpc.internal.LegacyBackendNotificationHandler INFO [homeassistant-ip_WorkerPool-1] SYSTEM: LegacyBackendNotificationHandler Verticle or Worker started
2023-09-07 09:07:10,666 de.eq3.cbcs.legacy.bidcos.rpc.LegacyServiceHandler INFO [homeassistant-ip_WorkerPool-1] init finished
2023-09-07 09:07:10,667 de.eq3.cbcs.legacy.bidcos.rpc.internal.InterfaceInitializer INFO [vert.x-worker-thread-3] Added InterfaceId: homeassistant-ip
...
2023-09-07 09:09:22,971 de.eq3.cbcs.legacy.bidcos.rpc.internal.InterfaceInitializer ERROR [vert.x-worker-thread-3] IO Exception: Could not add interface: homeassistant-ip
de.eq3.cbcs.legacy.communication.rpc.RpcIOException: java.net.ConnectException: Connection timed out (Connection timed out)
at de.eq3.cbcs.legacy.communication.rpc.internal.transport.http.HttpTransport.sendRequest(HttpTransport.java:110) ~[HMIPServer.jar:?]
...
Caused by: java.net.ConnectException: Connection timed out (Connection timed out)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_202]
...
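That stack trace is just a plain TCP connect timeout from the CCU towards the callback endpoint, so it should be reproducible without the CCU at all. A sketch of such a check, to be run from any host in 10.22.40.0/24:

# Sketch: try to open the same connection the CCU attempts. If this times out
# from a host in 10.22.40.0/24, the path towards the LoadBalancer IP is the
# problem rather than Home Assistant itself.
import socket

try:
    with socket.create_connection(("10.22.50.22", 8060), timeout=10):
        print("callback endpoint reachable")
except OSError as err:
    print("callback endpoint NOT reachable:", err)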
What I did to track down the issue
My first thought was that my network segmentation was causing trouble. But the logs didn't show any issues, and my Node-RED instance, which runs in the same Kubernetes cluster and talks to the CCU in the same way, works like a charm. So, to double-check, I placed a second Home Assistant instance with the same configuration directly in the cluster network, running as a Docker container inside a VM on Proxmox. There the callbacks from the CCU are registered, and everything works fine with this second instance.
So I turned to the Kubernetes deployment. There I can see that the callbacks from the CCU do land inside the home-assistant-0 pod, so the deployment doesn't seem to be the problem either. I also switched the callback Service from a LoadBalancer to a NodePort to rule out possible issues caused by the bridging under the hood, but that didn't change anything.
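To fully separate the Kubernetes networking from Home Assistant itself, one could also temporarily replace Home Assistant on port 8060 with a bare-bones XML-RPC receiver (a sketch; the method names follow the Homematic XML-RPC callback API as far as I know):

# Sketch: a stand-in for the Home Assistant callback server. If the CCU's
# events show up here, the Kubernetes side (Service, LoadBalancer, routing)
# is fine and the issue sits inside Home Assistant.
from xmlrpc.server import SimpleXMLRPCServer


def event(interface_id, address, value_key, value):
    # the CCU pushes state changes via event()
    print("event:", interface_id, address, value_key, value)
    return ""


def listDevices(interface_id):
    # the CCU asks which devices are already known; an empty list is fine here
    return []


server = SimpleXMLRPCServer(("0.0.0.0", 8060), allow_none=True)
server.register_function(event)
server.register_function(listDevices)
server.register_introspection_functions()  # system.listMethods for the CCU
server.register_multicall_functions()      # the CCU batches events via system.multicall
server.serve_forever()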
That leaves the Home Assistant instance itself as the likely root cause.
My thoughts
Might it have something to do with the proxying? In order to get access to the frontend, I had to add the cluster-internal CIDR ranges as trusted proxies. Might that filtering also apply to the callback calls?