Hey there. I’m facing a slight issue with the callbacks from my Homematic CCU when running Home Assistant in Kubernetes. But let me start from the beginning:
My Goal
I’m running my Home Assistant instance inside my local Kubernetes cluster. I also have a Homematic CCU3 running alongside the cluster. Now I want to integrate the CCU3 into my Home Assistant instance.
My setup
I’m running a K3s cluster locally on a bunch of Raspberry Pis. For the sake of network segmentation, the cluster lives in its own network, 10.22.50.0/24, while all my home-automation devices sit in 10.22.40.0/24.
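For reference, a minimal sketch (Python, run from a node or pod in 10.22.50.0/24) of how basic reachability of the CCU from the cluster network can be checked; address and port are the ones from the configuration shown further below:

# Quick sanity check: can a host in the cluster network (10.22.50.0/24)
# open a TCP connection to the CCU's HmIP XML-RPC port?
import socket

CCU_HOST = "10.22.40.5"   # CCU3 in the home-automation network
CCU_PORT = 2010           # HmIP XML-RPC interface

with socket.create_connection((CCU_HOST, CCU_PORT), timeout=5) as sock:
    print("CCU reachable from", sock.getsockname()[0])

This direction works fine in my setup, as you'll see below.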
I deployed Home Assistant 2023.8.4 in my Kubernetes cluster, and everything works fine so far except for the Homematic callbacks.
The deployment of my Home Assistant instance looks like the following:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: home-assistant
spec:
  serviceName: home-assistant
  selector:
    matchLabels:
      app.kubernetes.io/component: home-automation
      app.kubernetes.io/name: home-assistant
  replicas: 1
  minReadySeconds: 30
  template:
    metadata:
      name: home-assistant
      labels:
        app.kubernetes.io/component: home-automation
        app.kubernetes.io/name: home-assistant
    spec:
      containers:
        - name: home-assistant
          image: ghcr.io/home-assistant/home-assistant:2023.8.4
          ports:
            - name: home-assistant
              containerPort: 8123
              protocol: TCP
            - name: callbacks-tcp
              containerPort: 8060
              protocol: TCP
          env:
            - name: TZ
              value: "Europe/Berlin"
          securityContext:
            privileged: true
            runAsUser: 0
          resources:
            {{- toYaml (.Values.resources) | nindent 12 }}
          volumeMounts:
            - name: home-assistant-config
              mountPath: /config
            - name: localtime
              mountPath: /etc/localtime
              readOnly: true
          livenessProbe:
            {{- toYaml (.Values.livenessProbe) | nindent 12 }}
          readinessProbe:
            {{- toYaml (.Values.readinessProbe) | nindent 12 }}
          startupProbe:
            {{- toYaml (.Values.startupProbe) | nindent 12 }}
      volumes:
        - name: home-assistant-config
          persistentVolumeClaim:
            claimName: home-assistant-config
        - name: localtime
          hostPath:
            path: /etc/localtime
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              preference:
                matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: DoesNotExist
      terminationGracePeriodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: home-assistant
spec:
  selector:
    app.kubernetes.io/component: home-automation
    app.kubernetes.io/name: home-assistant
  type: ClusterIP
  ports:
    - port: 8123
      targetPort: 8123
---
apiVersion: v1
kind: Service
metadata:
  name: home-assistant-callbacks
spec:
  selector:
    app.kubernetes.io/component: home-automation
    app.kubernetes.io/name: home-assistant
  type: LoadBalancer
  ports:
    - port: 8060
      targetPort: 8060
      name: callback-tcp
---
kind: Ingress
apiVersion: networking.k8s.io/v1
metadata:
  name: home-assistant-ui
  annotations:
    kubernetes.io/ingress.class: "traefik"
spec:
  rules:
    - host: home-assistant.home-automation.hansa-net.intra
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: home-assistant
                port:
                  number: 8123
  tls:
    - hosts:
        - home-assistant.home-automation.hansa-net.intra
      secretName: home-assistant-domain-cert
(The Secret containing the TLS certs is omitted as it's not relevant here. Access is restricted to the intranet only, hence no Let's Encrypt involved.)
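One more note on the callback Service: the external IP the load balancer hands out has to be exactly the callback_ip used in the Home Assistant configuration below (10.22.50.22 in my case). A small sketch for reading it out via the Kubernetes API (assuming the Python kubernetes client and that the Service lives in the default namespace):

# Sketch: read the external IP of the callback Service and compare it to the
# callback_ip configured in Home Assistant (expected: 10.22.50.22).
from kubernetes import client, config

config.load_kube_config()  # local kubeconfig; use load_incluster_config() inside a pod
v1 = client.CoreV1Api()
svc = v1.read_namespaced_service("home-assistant-callbacks", "default")  # namespace assumed
ingress = svc.status.load_balancer.ingress or []
print("external IPs:", [i.ip for i in ingress])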
The configuration.yaml of my Home Assistant instance looks like the following:
# Loads default set of integrations. Do not remove.
default_config:

# Load frontend themes from the themes folder
frontend:
  themes: !include_dir_merge_named themes

automation: !include automations.yaml
script: !include scripts.yaml
scene: !include scenes.yaml

http:
  use_x_forwarded_for: true
  trusted_proxies:
    - 10.42.0.0/16
    - 10.43.0.0/16
    - 10.22.50.0/24

homematic:
  local_port: 8060
  interfaces:
    ip:
      host: 10.22.40.5
      port: 2010
      ssl: false
      callback_ip: 10.22.50.22
      callback_port: 8060
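As far as I understand it, the homematic integration registers itself at the CCU via XML-RPC: it calls init() on the HmIP interface and hands over the callback URL built from callback_ip and callback_port. Roughly sketched (simplified, not the actual integration code):

# Rough sketch of what the callback registration boils down to: an XML-RPC
# init() call against the CCU's HmIP interface, telling the CCU where to push
# events to. The interface id is the one showing up in the CCU log below.
from xmlrpc.client import ServerProxy

ccu = ServerProxy("http://10.22.40.5:2010")
ccu.init("http://10.22.50.22:8060", "homeassistant-ip")

So after the registration the CCU has to be able to open a connection towards 10.22.50.22:8060 on its own.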
The issue
In my Home Assistant instance I can see all the Homematic IP devices with their values, so the Homematic integration basically works. Issues appear as soon as I want to, for example, turn a Homematic switch on or off from Home Assistant: the callback from the CCU providing the new state fails.
The Home Assistant logs don't tell much:
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service legacy-services: starting
services-up: info: copying legacy longrun home-assistant (no readiness notification)
s6-rc: info: service legacy-services successfully started
2023-09-07 08:26:58.478 WARNING (MainThread) [aiounifi.models.message] Unsupported message key session-metadata:sync
2023-09-07 08:26:58.488 WARNING (MainThread) [aiounifi.models.message] Unsupported message {'meta': {'message': 'session-metadata:sync', 'rc': 'ok'}, 'data': {'client_mac': 'd8:3a:dd:0e:8c:62'}}
2023-09-07 08:27:46.922 WARNING (MainThread) [homeassistant.config_entries] Config entry 'Radio Browser' for radio_browser integration not ready yet: Could not connect to Radio Browser API; Retrying in background
2023-09-07 08:27:57.173 ERROR (MainThread) [homeassistant.components.homeassistant_alerts] Timeout fetching homeassistant_alerts data
2023-09-07 08:43:17.133 ERROR (MainThread) [homeassistant.components.analytics] Timeout sending analytics to https://analytics-api.home-assistant.io/v1
(Srsly, that’s all)
Looking at the logs of my Homematic CCU, I see the following:
2023-09-07 09:07:10,659 de.eq3.cbcs.legacy.bidcos.rpc.LegacyServiceHandler INFO [vert.x-worker-thread-1] (un)registerCallback on LegacyServiceHandler called from url: http://10.22.50.22:8060
2023-09-07 09:07:10,664 de.eq3.cbcs.legacy.bidcos.rpc.internal.LegacyBackendNotificationHandler INFO [homeassistant-ip_WorkerPool-1] SYSTEM: LegacyBackendNotificationHandler Verticle or Worker started
2023-09-07 09:07:10,666 de.eq3.cbcs.legacy.bidcos.rpc.LegacyServiceHandler INFO [homeassistant-ip_WorkerPool-1] init finished
2023-09-07 09:07:10,667 de.eq3.cbcs.legacy.bidcos.rpc.internal.InterfaceInitializer INFO [vert.x-worker-thread-3] Added InterfaceId: homeassistant-ip
...
2023-09-07 09:09:22,971 de.eq3.cbcs.legacy.bidcos.rpc.internal.InterfaceInitializer ERROR [vert.x-worker-thread-3] IO Exception: Could not add interface: homeassistant-ip
de.eq3.cbcs.legacy.communication.rpc.RpcIOException: java.net.ConnectException: Connection timed out (Connection timed out)
at de.eq3.cbcs.legacy.communication.rpc.internal.transport.http.HttpTransport.sendRequest(HttpTransport.java:110) ~[HMIPServer.jar:?]
...
Caused by: java.net.ConnectException: Connection timed out (Connection timed out)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_202]
...
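That stack trace is just a plain TCP connect timeout from the CCU towards the callback endpoint, so it should be reproducible without the CCU at all. A sketch of such a check, to be run from any host in 10.22.40.0/24:

# Sketch: try to open the same connection the CCU attempts. If this times out
# from a host in 10.22.40.0/24, the path towards the LoadBalancer IP is the
# problem rather than Home Assistant itself.
import socket

try:
    with socket.create_connection(("10.22.50.22", 8060), timeout=10):
        print("callback endpoint reachable")
except OSError as err:
    print("callback endpoint NOT reachable:", err)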
What I did to track down the issue
My first thought was that my network segmentation was causing trouble. But the logs didn't show any issues, and my Node-RED instance, which runs in the same Kubernetes cluster and talks to the CCU in the same way, works like a charm. So, to double-check, I placed a second Home Assistant instance with the same configuration directly in the cluster network, running as a Docker container inside a VM on Proxmox. There the callbacks from the CCU are registered, and everything works fine with this second instance.
So I turned to the Kubernetes deployment. There I can see that the callbacks from the CCU do land inside the home-assistant-0 pod, so the deployment doesn't seem to be the problem either. I also switched the callback Service from a LoadBalancer to a NodePort to rule out possible issues caused by the bridging under the hood, but that didn't change anything.
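To fully separate the Kubernetes networking from Home Assistant itself, one could also temporarily replace Home Assistant on port 8060 with a bare-bones XML-RPC receiver (a sketch; the method names follow the Homematic XML-RPC callback API as far as I know):

# Sketch: a stand-in for the Home Assistant callback server. If the CCU's
# events show up here, the Kubernetes side (Service, LoadBalancer, routing)
# is fine and the issue sits inside Home Assistant.
from xmlrpc.server import SimpleXMLRPCServer


def event(interface_id, address, value_key, value):
    # the CCU pushes state changes via event()
    print("event:", interface_id, address, value_key, value)
    return ""


def listDevices(interface_id):
    # the CCU asks which devices are already known; an empty list is fine here
    return []


server = SimpleXMLRPCServer(("0.0.0.0", 8060), allow_none=True)
server.register_function(event)
server.register_function(listDevices)
server.register_introspection_functions()  # system.listMethods for the CCU
server.register_multicall_functions()      # the CCU batches events via system.multicall
server.serve_forever()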
That leaves the Home Assistant instance itself as the likely root cause.
My thoughts
Might it have something to do with the proxying? In order to get access to the frontend, I had to add the cluster-internal CIDR ranges as trusted proxies. Might that filtering also apply to the callback calls?