Unreliable MQTT Messaging - Pulling My Hair Out

After two years of working on my own home-brew smart home projects, this is the first time I have posted a request for help on a support forum. Until now, I have always been able to find my question already answered somewhere on the internet. It has taken me 2 days to convince myself that I am out of options. I have been troubleshooting this issue for nearly 2 weeks, and I would greatly appreciate anyone who could shed some light on my problem.

NATURE OF PROBLEM
MQTT Messaging Appears VERY Unstable / Unreliable. It HAS WORKED, but it won’t STAY working.

WHAT I SUSPECT MAY BE CAUSING IT:
Some sort of conflict with networking / IP / DNS / Authentication. I believe it can be solved with configuration, but I cannot find the knob that fixes it.

HARDWARE AND SOFTWARE

Brand New Dedicated “Generic” Intel Mini PC
Celeron J4125 4 core
8GB DDR4 RAM
256GB mSATA SSD

BIOS American Megatrends 5.13
BIOS Mode: UEFI

IP Address of Mini PC: 192.168.1.101
Connected to network via LAN port, not Wi-Fi.

Windows 10 Pro 10.0.19043
VirtualBox 6.1.34

Home Assistant OS 7.6
Home Assistant Core 2022.4.7

IP Address of Virtual Machine: 192.168.1.108
I configured reserved DHCP addresses on my router so that the PC and the HA VM always have the same IP addresses.

ADD-ONS
Studio Code Server 5.0.1
Mosquitto Broker 6.0.2
Node-RED 11.1.2

BACKGROUND (TL;DR below)

I’ve been tinkering with electronics and code for decades. That said, I am not arrogant about my knowledge. I know that when I get something to work properly it’s partly a happy accident, because I have an idea of how it operates, but I can’t “show my work”, as my math teacher used to say.

A few years ago, I started tinkering with Arduino and Raspberry Pi projects. Eventually that led me to start building DIY 8266 IoT projects using Raspbian (or Raspberry Pi OS), Mosquitto, and Node-RED (WITHOUT HOME ASSISTANT).

For a couple of years, I had a VERY STABLE automation system that included temperature sensors, HVAC controls, a fairly complex irrigation system w/pump & sensors, motorized window roller shades and mini-blinds, cooktop burner control, and what I called the “FrankenWasher”, where I retrofitted the failing electromechanical timer switch in my clothes washer with an Arduino/ESP-01 based interface using a relay board.

None of these projects used “plug-and-play” solutions. I bought the components, breadboarded the design, debugged the code, and then soldered a permanent board when I had it working. I even modeled from scratch and then 3D printed the NEMA 17 stepper motor housing I used to power the roller shade.

Then… 2 months ago, I moved.

After the usual shuffle of finding the right place for the router and other networking gear in the new house, I went to fire up the Raspberry Pi (3 B+), and the SD card was corrupted.

I had cloned the card to create a copy, but the backup was old enough that it didn’t include most of my recent (and fairly complex) Node-RED flows for several newer projects.

TL;DR? – You can start here. Sorry, I know it’s still long. (Synopsis: I was using Mosquitto and Node-RED on a Raspberry Pi 3 B+ without HA for almost 2 years with almost no problems.)

I wasn’t all that upset, because I had been planning to move to Home Assistant anyway. This was a new house, a fresh start, and as good a reason as any to dive in with HA.

So, I flashed the HA OS image to a (new) SD card, fired it up and installed the Mosquitto and Node-RED add-ons right off the bat. Everything appeared to be configured fine.

I was starting to build my first flow for testing purposes in Node-RED, but things weren’t working smoothly. For instance, the simplest part of the system, the temperature sensors, weren’t showing up in the debug window as frequently as they should have been. Of the 4 sensors, none reported consistently on the programmed interval. It was as if the broker was hard of hearing, catching the messages the sensors were publishing only occasionally.

Further, I have several Feit Electric smart outlets (that I hacked, replacing the factory chip with an ESP-01), that weren’t responding to the MQTT on/off messages. And this wasn’t sporadic. They were completely unresponsive.

After spending several hours attempting to troubleshoot the MQTT problem to no avail, I decided to give myself a break. One of the MAIN REASONS I wanted to migrate to Home Assistant was because of ESPHome. So I decided to install the Add-On and see what it was all about. I was able to get to the point of compiling my first device configuration, and the Raspberry Pi… how do I say this nicely? The Pi shat the bed.

Well, that’s not good.

After some research, it appeared that unless you were running the most basic of Home Assistant setups, a Pi really wasn’t up to the challenge. So, I landed on buying a dedicated Intel/Windows 10 Mini-PC and running Home Assistant in a VirtualBox VM.

Unfortunately, the same MQTT messaging issues I was having on the Pi followed me to the Mini PC.

Thinking that I did something wrong during install, I blew away the virtual machine (a couple of times), and started from scratch. The problems persist, and I can’t figure out why.

Since I am using 100% home-brew 8266 devices (using PubSubClient.h and ESP8266WiFi.h), I began to think that maybe it was my code causing the problems.

Previously, my 8266 code DID NOT include an MQTT username and password, or the ESP8266mDNS.h library. After updating my sketches to code that did, Node-RED started seeing messages more consistently. In fact, I updated 3 temperature sensors and one of the smart outlets and it was working beautifully. Each sensor reported the temperature every 15 seconds FLAWLESSLY overnight (at least as near as I could tell by the Node-RED debug window) and the smart outlet worked every time I turned it on or off. Note that these were the only 4 smart devices connected to Home Assistant.

So, AS A TEST, I re-flashed ONE of the 8266 temperature sensors with my OLD code (without user, pass, and the ESP8266mDNS.h library), and Home Assistant instantly started acting erratically and missing messages again.

It seems like this person’s question more or less describes the behavior I am seeing, but nobody answered the question.

https://community.home-assistant.io/t/mosquitto-allows-my-devices-to-connect-but-will-publish-sub-messages-once-per-3-8-connections/395185

I am at my wits’ end. I am happy to post any code snippets, logs, or screen grabs anyone would like to see. Just ask!

Supervisor log from most recent boot.

22-05-03 15:32:41 INFO (MainThread) [supervisor.dbus.manager] Load dbus interface org.freedesktop.timedate1
22-05-03 15:32:41 INFO (MainThread) [supervisor.dbus.manager] Load dbus interface org.freedesktop.NetworkManager
22-05-03 15:32:41 INFO (MainThread) [supervisor.dbus.manager] Load dbus interface de.pengutronix.rauc
22-05-03 15:32:41 INFO (MainThread) [supervisor.dbus.manager] Load dbus interface org.freedesktop.resolve1
22-05-03 15:32:41 INFO (MainThread) [supervisor.host.info] Updating local host information
22-05-03 15:32:41 INFO (MainThread) [supervisor.host.services] Updating service information
22-05-03 15:32:42 INFO (MainThread) [supervisor.host.sound] Updating PulseAudio information
22-05-03 15:32:42 INFO (MainThread) [supervisor.host.manager] Host information reload completed
22-05-03 15:32:42 INFO (MainThread) [supervisor.host.network] Updating local network information
22-05-03 15:32:42 WARNING (MainThread) [supervisor.host.network] Requested to update interface enp0s3 which does not exist or is disabled.
22-05-03 15:32:42 INFO (MainThread) [supervisor.host.apparmor] Loading AppArmor Profiles: {'hassio-supervisor'}
22-05-03 15:32:42 INFO (SyncWorker_0) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/amd64-hassio-cli with version 2022.05.0
22-05-03 15:32:42 INFO (MainThread) [supervisor.plugins.cli] Starting CLI plugin
22-05-03 15:32:43 INFO (SyncWorker_0) [supervisor.docker.cli] Starting CLI ghcr.io/home-assistant/amd64-hassio-cli with version 2022.05.0 - 172.30.32.5
22-05-03 15:32:43 INFO (SyncWorker_1) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/amd64-hassio-dns with version 2021.06.0
22-05-03 15:32:43 INFO (MainThread) [supervisor.plugins.dns] Starting CoreDNS plugin
22-05-03 15:32:44 INFO (SyncWorker_1) [supervisor.docker.dns] Starting DNS ghcr.io/home-assistant/amd64-hassio-dns with version 2021.06.0 - 172.30.32.3
22-05-03 15:32:44 INFO (MainThread) [supervisor.plugins.dns] Updated /etc/resolv.conf
22-05-03 15:32:44 INFO (SyncWorker_0) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/amd64-hassio-audio with version 2022.04.0
22-05-03 15:32:44 INFO (MainThread) [supervisor.plugins.audio] Starting Audio plugin
22-05-03 15:32:44 INFO (SyncWorker_0) [supervisor.docker.audio] Starting Audio ghcr.io/home-assistant/amd64-hassio-audio with version 2022.04.0 - 172.30.32.4
22-05-03 15:32:44 INFO (SyncWorker_1) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/amd64-hassio-observer with version 2021.10.0
22-05-03 15:32:44 INFO (SyncWorker_1) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/amd64-hassio-multicast with version 2022.02.0
22-05-03 15:32:44 INFO (MainThread) [supervisor.plugins.multicast] Starting Multicast plugin
22-05-03 15:32:45 INFO (SyncWorker_0) [supervisor.docker.multicast] Starting Multicast ghcr.io/home-assistant/amd64-hassio-multicast with version 2022.02.0 - Host
22-05-03 15:32:45 INFO (MainThread) [supervisor.updater] Fetching update data from https://version.home-assistant.io/stable.json
22-05-03 15:32:47 INFO (MainThread) [supervisor.homeassistant.secrets] Loaded 1 Home Assistant secrets
22-05-03 15:32:47 INFO (SyncWorker_0) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/qemux86-64-homeassistant with version 2022.4.7
22-05-03 15:32:47 INFO (MainThread) [supervisor.os.manager] Detect Home Assistant Operating System 7.6 / BootSlot A
22-05-03 15:32:47 INFO (MainThread) [supervisor.store.git] Loading add-on /data/addons/git/a0d7b954 repository
22-05-03 15:32:47 INFO (MainThread) [supervisor.store.git] Loading add-on /data/addons/core repository
22-05-03 15:32:47 INFO (MainThread) [supervisor.store.git] Loading add-on /data/addons/git/5c53de3b repository
22-05-03 15:32:48 INFO (MainThread) [supervisor.store] Loading add-ons from store: 65 all - 65 new - 0 remove
22-05-03 15:32:48 INFO (MainThread) [supervisor.addons] Found 3 installed add-ons
22-05-03 15:32:48 INFO (SyncWorker_2) [supervisor.docker.interface] Attaching to homeassistant/amd64-addon-mosquitto with version 6.0.2
22-05-03 15:32:48 INFO (SyncWorker_0) [supervisor.docker.interface] Attaching to ghcr.io/hassio-addons/node-red/amd64 with version 11.1.2
22-05-03 15:32:48 INFO (SyncWorker_1) [supervisor.docker.interface] Attaching to ghcr.io/hassio-addons/vscode/amd64 with version 5.0.1
22-05-03 15:32:48 INFO (MainThread) [supervisor.backups.manager] Found 0 backup files
22-05-03 15:32:48 INFO (MainThread) [supervisor.discovery] Loaded 1 messages
22-05-03 15:32:48 INFO (MainThread) [supervisor.ingress] Loaded 8 ingress sessions
22-05-03 15:32:48 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state CoreState.SETUP
22-05-03 15:32:48 INFO (MainThread) [supervisor.resolution.check] System checks complete
22-05-03 15:32:48 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.SETUP
22-05-03 15:32:48 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
22-05-03 15:32:48 INFO (MainThread) [supervisor.jobs] 'ResolutionFixup.run_autofix' blocked from execution, system is not running - CoreState.SETUP
22-05-03 15:32:48 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.SETUP
22-05-03 15:32:48 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
22-05-03 15:32:48 INFO (MainThread) [__main__] Running Supervisor
22-05-03 15:32:48 INFO (MainThread) [supervisor.os.manager] Rauc: A - marked slot kernel.0 as good
22-05-03 15:32:48 INFO (MainThread) [supervisor.addons] Phase 'AddonStartup.INITIALIZE' starting 0 add-ons
22-05-03 15:32:48 INFO (MainThread) [supervisor.addons] Phase 'AddonStartup.SYSTEM' starting 1 add-ons
22-05-03 15:32:49 INFO (SyncWorker_2) [supervisor.docker.addon] Starting Docker add-on homeassistant/amd64-addon-mosquitto with version 6.0.2
22-05-03 15:32:50 INFO (MainThread) [supervisor.services.modules.mqtt] Set core_mosquitto as service provider for mqtt
22-05-03 15:32:54 INFO (MainThread) [supervisor.addons] Phase 'AddonStartup.SERVICES' starting 1 add-ons
22-05-03 15:32:54 INFO (SyncWorker_1) [supervisor.docker.addon] Starting Docker add-on ghcr.io/hassio-addons/vscode/amd64 with version 5.0.1
22-05-03 15:32:59 INFO (MainThread) [supervisor.core] Start Home Assistant Core
22-05-03 15:33:00 INFO (SyncWorker_2) [supervisor.docker.interface] Starting homeassistant
22-05-03 15:33:00 INFO (MainThread) [supervisor.homeassistant.core] Wait until Home Assistant is ready
22-05-03 15:33:05 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.STARTUP
22-05-03 15:33:05 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
22-05-03 15:33:06 INFO (MainThread) [supervisor.store.git] Update add-on https://github.com/hassio-addons/repository repository
22-05-03 15:33:06 INFO (MainThread) [supervisor.store.git] Update add-on https://github.com/esphome/home-assistant-addon repository
22-05-03 15:33:06 INFO (MainThread) [supervisor.store.git] Update add-on https://github.com/home-assistant/addons repository
22-05-03 15:33:07 INFO (MainThread) [supervisor.store] Loading add-ons from store: 65 all - 0 new - 0 remove
22-05-03 15:33:07 INFO (MainThread) [supervisor.store] Loading add-ons from store: 65 all - 0 new - 0 remove
22-05-03 15:33:10 INFO (MainThread) [supervisor.homeassistant.api] Updated Home Assistant API token
22-05-03 15:33:15 INFO (MainThread) [supervisor.homeassistant.core] Detect a running Home Assistant instance
22-05-03 15:33:15 INFO (MainThread) [supervisor.addons] Phase 'AddonStartup.APPLICATION' starting 1 add-ons
22-05-03 15:33:16 INFO (SyncWorker_0) [supervisor.docker.addon] Starting Docker add-on ghcr.io/hassio-addons/node-red/amd64 with version 11.1.2
22-05-03 15:33:21 INFO (MainThread) [supervisor.misc.tasks] All core tasks are scheduled
22-05-03 15:33:21 INFO (MainThread) [supervisor.core] Supervisor is up and running
22-05-03 15:33:21 INFO (MainThread) [supervisor.host.info] Updating local host information
22-05-03 15:33:21 INFO (MainThread) [supervisor.updater] Fetching update data from https://version.home-assistant.io/stable.json
22-05-03 15:33:21 INFO (MainThread) [supervisor.resolution.check] Starting system checks with state CoreState.RUNNING
22-05-03 15:33:21 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.SECURITY/ContextType.CORE
22-05-03 15:33:21 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.FREE_SPACE/ContextType.SYSTEM
22-05-03 15:33:21 INFO (MainThread) [supervisor.resolution.module] Create new suggestion SuggestionType.CREATE_FULL_BACKUP - ContextType.SYSTEM / None
22-05-03 15:33:21 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.PWNED/ContextType.ADDON
22-05-03 15:33:21 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.DNS_SERVER_IPV6_ERROR/ContextType.DNS_SERVER
22-05-03 15:33:21 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.DNS_SERVER_FAILED/ContextType.DNS_SERVER
22-05-03 15:33:21 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.TRUST/ContextType.SUPERVISOR
22-05-03 15:33:22 INFO (MainThread) [supervisor.host.services] Updating service information
22-05-03 15:33:22 INFO (MainThread) [supervisor.host.network] Updating local network information
22-05-03 15:33:22 INFO (MainThread) [supervisor.host.sound] Updating PulseAudio information
22-05-03 15:33:22 INFO (MainThread) [supervisor.host.manager] Host information reload completed
22-05-03 15:33:23 INFO (MainThread) [supervisor.resolution.check] System checks complete
22-05-03 15:33:23 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.RUNNING
22-05-03 15:33:24 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
22-05-03 15:33:24 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state CoreState.RUNNING
22-05-03 15:33:24 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
22-05-03 15:33:30 INFO (MainThread) [supervisor.auth] Auth request from 'core_mosquitto' for 'mqtt_user'
22-05-03 15:33:31 INFO (MainThread) [supervisor.auth] Successful login for 'mqtt_user'
22-05-03 15:33:35 INFO (MainThread) [supervisor.api.proxy] Home Assistant WebSocket API request initialize
22-05-03 15:33:35 INFO (MainThread) [supervisor.api.proxy] WebSocket access from a0d7b954_nodered
22-05-03 15:33:35 INFO (MainThread) [supervisor.api.proxy] Home Assistant WebSocket API request running

Mosquitto Broker Log (Forgot to remove the quotes from mqtt_client_name in the 8266 sketch. Oops! :roll_eyes:)

1651615401: New connection from 172.30.32.2 on port 1883.
1651615401: Socket error on client <unknown>, disconnecting.
1651615408: Socket error on client mqtt_client_name, disconnecting.
1651615408: New connection from 192.168.1.115 on port 1883.
1651615408: New client connected from 192.168.1.115 as mqtt_client_name (p2, c1, k15).
1651615413: Socket error on client Upstairs_Temp, disconnecting.
1651615413: New connection from 192.168.1.124 on port 1883.
1651615413: New client connected from 192.168.1.124 as Upstairs_Temp (p2, c1, k15).
1651615417: Socket error on client Bedroom_Temp, disconnecting.
1651615417: New connection from 192.168.1.116 on port 1883.
1651615417: New client connected from 192.168.1.116 as Bedroom_Temp (p2, c1, k15).
1651615440: Socket error on client mqtt_client_name, disconnecting.
1651615440: New connection from 192.168.1.115 on port 1883.
1651615440: New client connected from 192.168.1.115 as mqtt_client_name (p2, c1, k15).
1651615444: Socket error on client Upstairs_Temp, disconnecting.
1651615444: New connection from 192.168.1.124 on port 1883.
1651615444: New client connected from 192.168.1.124 as Upstairs_Temp (p2, c1, k15).
1651615448: Socket error on client Bedroom_Temp, disconnecting.
1651615448: New connection from 192.168.1.116 on port 1883.
1651615448: New client connected from 192.168.1.116 as Bedroom_Temp (p2, c1, k15).
1651615471: Socket error on client mqtt_client_name, disconnecting.
1651615471: New connection from 192.168.1.115 on port 1883.
1651615471: New client connected from 192.168.1.115 as mqtt_client_name (p2, c1, k15).
1651615475: Socket error on client Upstairs_Temp, disconnecting.
1651615475: New connection from 192.168.1.124 on port 1883.
1651615475: New client connected from 192.168.1.124 as Upstairs_Temp (p2, c1, k15).
1651615480: Socket error on client Bedroom_Temp, disconnecting.
1651615480: New connection from 192.168.1.116 on port 1883.
1651615480: New client connected from 192.168.1.116 as Bedroom_Temp (p2, c1, k15).
1651615502: Socket error on client mqtt_client_name, disconnecting.
1651615502: New connection from 192.168.1.115 on port 1883.
1651615502: New client connected from 192.168.1.115 as mqtt_client_name (p2, c1, k15).
1651615506: Socket error on client Upstairs_Temp, disconnecting.
1651615506: New connection from 192.168.1.124 on port 1883.
1651615506: New client connected from 192.168.1.124 as Upstairs_Temp (p2, c1, k15).
1651615511: Socket error on client Bedroom_Temp, disconnecting.
1651615511: New connection from 192.168.1.116 on port 1883.
1651615511: New client connected from 192.168.1.116 as Bedroom_Temp (p2, c1, k15).
1651615521: New connection from 172.30.32.2 on port 1883.
1651615521: Socket error on client <unknown>, disconnecting.
1651615533: Socket error on client mqtt_client_name, disconnecting.
1651615533: New connection from 192.168.1.115 on port 1883.
1651615533: New client connected from 192.168.1.115 as mqtt_client_name (p2, c1, k15).
1651615537: Socket error on client Upstairs_Temp, disconnecting.
1651615537: New connection from 192.168.1.124 on port 1883.
1651615537: New client connected from 192.168.1.124 as Upstairs_Temp (p2, c1, k15).
1651615542: Socket error on client Bedroom_Temp, disconnecting.
1651615542: New connection from 192.168.1.116 on port 1883.
1651615542: New client connected from 192.168.1.116 as Bedroom_Temp (p2, c1, k15).
1651615564: Socket error on client mqtt_client_name, disconnecting.
1651615564: New connection from 192.168.1.115 on port 1883.
1651615564: New client connected from 192.168.1.115 as mqtt_client_name (p2, c1, k15).
1651615568: Socket error on client Upstairs_Temp, disconnecting.
1651615568: New connection from 192.168.1.124 on port 1883.
1651615568: New client connected from 192.168.1.124 as Upstairs_Temp (p2, c1, k15).
1651615573: Socket error on client Bedroom_Temp, disconnecting.
1651615573: New connection from 192.168.1.116 on port 1883.
1651615573: New client connected from 192.168.1.116 as Bedroom_Temp (p2, c1, k15).
1651615595: Socket error on client mqtt_client_name, disconnecting.
1651615595: New connection from 192.168.1.115 on port 1883.
1651615595: New client connected from 192.168.1.115 as mqtt_client_name (p2, c1, k15).
1651615599: Socket error on client Upstairs_Temp, disconnecting.
1651615599: New connection from 192.168.1.124 on port 1883.
1651615599: New client connected from 192.168.1.124 as Upstairs_Temp (p2, c1, k15).
1651615604: Socket error on client Bedroom_Temp, disconnecting.
1651615604: New connection from 192.168.1.116 on port 1883.
1651615604: New client connected from 192.168.1.116 as Bedroom_Temp (p2, c1, k15).
1651615626: Socket error on client mqtt_client_name, disconnecting.
1651615626: New connection from 192.168.1.115 on port 1883.
1651615626: New client connected from 192.168.1.115 as mqtt_client_name (p2, c1, k15).
1651615630: Socket error on client Upstairs_Temp, disconnecting.
1651615630: New connection from 192.168.1.124 on port 1883.
1651615630: New client connected from 192.168.1.124 as Upstairs_Temp (p2, c1, k15).
1651615635: Socket error on client Bedroom_Temp, disconnecting.
1651615635: New connection from 192.168.1.116 on port 1883.
1651615635: New client connected from 192.168.1.116 as Bedroom_Temp (p2, c1, k15).
1651615641: New connection from 172.30.32.2 on port 1883.
1651615641: Socket error on client <unknown>, disconnecting.
1651615657: Socket error on client mqtt_client_name, disconnecting.
1651615657: New connection from 192.168.1.115 on port 1883.
1651615657: New client connected from 192.168.1.115 as mqtt_client_name (p2, c1, k15).
1651615661: Socket error on client Upstairs_Temp, disconnecting.
1651615661: New connection from 192.168.1.124 on port 1883.
1651615661: New client connected from 192.168.1.124 as Upstairs_Temp (p2, c1, k15).
1651615666: Socket error on client Bedroom_Temp, disconnecting.
1651615666: New connection from 192.168.1.116 on port 1883.
1651615666: New client connected from 192.168.1.116 as Bedroom_Temp (p2, c1, k15).
1651615688: Socket error on client mqtt_client_name, disconnecting.
1651615688: New connection from 192.168.1.115 on port 1883.
1651615688: New client connected from 192.168.1.115 as mqtt_client_name (p2, c1, k15).
1651615692: Socket error on client Upstairs_Temp, disconnecting.
1651615692: New connection from 192.168.1.124 on port 1883.
1651615692: New client connected from 192.168.1.124 as Upstairs_Temp (p2, c1, k15).
1651615697: Socket error on client Bedroom_Temp, disconnecting.
1651615697: New connection from 192.168.1.116 on port 1883.
1651615697: New client connected from 192.168.1.116 as Bedroom_Temp (p2, c1, k15).
1651615719: Socket error on client mqtt_client_name, disconnecting.
1651615719: New connection from 192.168.1.115 on port 1883.
1651615719: New client connected from 192.168.1.115 as mqtt_client_name (p2, c1, k15).
1651615723: Socket error on client Upstairs_Temp, disconnecting.
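
The timestamps in the broker log above are Unix epoch seconds, so the spacing between reconnects can be read straight off the log. Below is a minimal sketch in plain desktop C++ (not ESP8266 code) that computes those gaps for one client. The function name `reconnectGaps` and the constant `kClientConnects` are hypothetical; the sample values are the first five "New client connected ... as mqtt_client_name" timestamps copied from the log.

```cpp
#include <cstddef>
#include <vector>

// Returns the number of seconds between consecutive connect timestamps.
// Assumes the input is sorted in ascending order, as it is in the broker log.
std::vector<long> reconnectGaps(const std::vector<long>& connectTimes) {
    std::vector<long> gaps;
    for (std::size_t i = 1; i < connectTimes.size(); ++i) {
        gaps.push_back(connectTimes[i] - connectTimes[i - 1]);
    }
    return gaps;
}

// Connect times for the client logged as mqtt_client_name (192.168.1.115).
const std::vector<long> kClientConnects = {
    1651615408, 1651615440, 1651615471, 1651615502, 1651615533
};
```

For these values the gaps come out to 32, 31, 31, and 31 seconds, which is roughly twice the 15-second keepalive (`k15`) each client announces when it connects.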

Node-RED Log

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 00-banner.sh: executing... 
-----------------------------------------------------------
 Add-on: Node-RED
 Flow-based programming for the Internet of Things
-----------------------------------------------------------
 Add-on version: 11.1.2
 You are running the latest version of this add-on.
 System: Home Assistant OS 7.6  (amd64 / qemux86-64)
 Home Assistant Core: 2022.4.7
 Home Assistant Supervisor: 2022.05.0
-----------------------------------------------------------
 Please, share the above information when looking for help
 or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
[cont-init.d] 00-banner.sh: exited 0.
[cont-init.d] 01-log-level.sh: executing... 
[cont-init.d] 01-log-level.sh: exited 0.
[cont-init.d] customizations.sh: executing... 
[cont-init.d] customizations.sh: exited 0.
[cont-init.d] nginx.sh: executing... 
[cont-init.d] nginx.sh: exited 0.
[cont-init.d] node-red.sh: executing... 
patching file nodes/ui_base.html
Hunk #1 succeeded at 1164 (offset 633 lines).
up to date, audited 1 package in 1s
found 0 vulnerabilities
[cont-init.d] node-red.sh: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
[15:33:21] INFO: Starting Node-RED...
> start
> node $NODE_OPTIONS node_modules/node-red/red.js "--settings" "/etc/node-red/config.js"
3 May 15:33:24 - [info] 
Welcome to Node-RED
===================
3 May 15:33:24 - [info] Node-RED version: v2.2.2
3 May 15:33:24 - [info] Node.js  version: v16.14.2
3 May 15:33:24 - [info] Linux 5.10.108 x64 LE
3 May 15:33:25 - [info] Loading palette nodes
3 May 15:33:29 - [info] Dashboard version 3.1.6 started at /endpoint/ui
Traceback (most recent call last):
  File "/opt/node_modules/node-red-node-pi-gpio/testgpio.py", line 3, in <module>
    import RPi.GPIO as GPIO
  File "/usr/lib/python3.9/site-packages/RPi/GPIO/__init__.py", line 23, in <module>
    from RPi._GPIO import *
RuntimeError: This module can only be run on a Raspberry Pi!
3 May 15:33:30 - [warn] rpi-gpio : Raspberry Pi specific node set inactive
3 May 15:33:30 - [info] Settings file  : /etc/node-red/config.js
3 May 15:33:30 - [info] Context store  : 'default' [module=memory]
3 May 15:33:30 - [info] User directory : /config/node-red/
3 May 15:33:30 - [warn] Projects disabled : editorTheme.projects.enabled=false
3 May 15:33:30 - [info] Flows file     : /config/node-red/flows.json
3 May 15:33:30 - [info] Server now running at http://127.0.0.1:46836/
3 May 15:33:30 - [info] Starting flows
[15:33:30] INFO: Starting NGinx...
3 May 15:33:30 - [info] Started flows
3 May 15:33:30 - [info] [mqtt-broker:e3b7652ba154c5a4] Connected to broker: mqtt://192.168.1.108:1883
3 May 15:33:35 - [info] [server:Home Assistant] Connecting to http://supervisor/core
3 May 15:33:35 - [info] [server:Home Assistant] Connected to http://supervisor/core

From /config/info/

System Health

version: core-2022.4.7
installation_type: Home Assistant OS
dev: false
hassio: true
docker: true
user: root
virtualenv: false
python_version: 3.9.9
os_name: Linux
os_version: 5.10.108
arch: x86_64
timezone: America/Denver


logged_in: false
can_reach_cert_server: ok
can_reach_cloud_auth: ok
can_reach_cloud: ok


host_os: Home Assistant OS 7.6
update_channel: stable
supervisor_version: supervisor-2022.05.0
docker_version: 20.10.9
disk_total: 31.3 GB
disk_used: 3.8 GB
healthy: true
supported: true
board: ova
supervisor_api: ok
version_api: ok
installed_addons: Studio Code Server (5.0.1), Mosquitto broker (6.0.2), Node-RED (11.1.2)


dashboards: 1
resources: 0
views: 1
mode: storage

Here’s what the Node-RED debug window looks like. I powered up these 3 temp sensors each 5 seconds apart. Each one is supposed to be sending the temperature every 15 seconds. So we should be seeing them report in the same sequence over and over again, each message 5 seconds apart. But as you can see, messages are getting missed.
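
The expectation described above (one report per sensor every 15 seconds) can be checked mechanically instead of by eye. This is a minimal sketch in plain desktop C++, assuming the arrival times for a single sensor have been copied out of the debug window as seconds; `countMissed` is a hypothetical helper, and any numbers fed to it here are illustrative, not taken from the screenshot.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many reports appear to be missing from one sensor's arrival times.
// arrivalsSec: arrival times in seconds, ascending. periodSec: expected interval.
// Each gap is rounded to the nearest whole number of periods; every period
// beyond the first in a gap counts as one missed report.
int countMissed(const std::vector<long>& arrivalsSec, long periodSec) {
    assert(periodSec > 0);
    int missed = 0;
    for (std::size_t i = 1; i < arrivalsSec.size(); ++i) {
        long gap = arrivalsSec[i] - arrivalsSec[i - 1];
        long periods = (gap + periodSec / 2) / periodSec;
        if (periods > 1) {
            missed += static_cast<int>(periods - 1);
        }
    }
    return missed;
}
```

For example, arrival times of 0, 15, 45, and 60 seconds with a 15-second period would count as one missed report. Rounding each gap to the nearest period is a design choice that tolerates a second or so of jitter per message.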

NEW 8266 temp sensor code (works better-ish)

#include <ESP8266WiFi.h>
#include <PubSubClient.h>
#include <ESP8266mDNS.h>
#include <OneWire.h>
#include <DallasTemperature.h>

const char* ssid = "realSSID";
const char* password = "realPassword";
const char* mqtt_server = "192.168.1.108";
const int mqtt_port = 1883;
const char *mqtt_user = "real_mqtt_user";
const char *mqtt_pass = "real_mqtt_pass";
const char *mqtt_client_name = "Bedroom_Temp";
const char *mqtt_topic = "indoor/temperature/bedroom";

WiFiClient bedroomTemp;
PubSubClient client(bedroomTemp);

const int oneWireBus = 2;
OneWire oneWire(oneWireBus);
DallasTemperature sensors(&oneWire);
int temp;


void setup_wifi() {
  WiFi.begin(ssid, password);
  WiFi.mode(WIFI_STA);
  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
  }
}


void reconnect() {
  while (!client.connected()) {
    if (client.connect(mqtt_client_name, mqtt_user, mqtt_pass)) {
    } else {
      delay(2500);
    }
  }
}

void setup() {
  
  sensors.begin();  
  setup_wifi();
  client.setServer(mqtt_server, mqtt_port);

}

void loop() {

  if (!client.connected()) {
    reconnect();
  }

  if (!client.loop()) {
    client.connect(mqtt_client_name);
  }

  sensors.requestTemperatures();
  temp = sensors.getTempFByIndex(0);

  static char sensorTemp[3];
  dtostrf(temp, 2, 0, sensorTemp);

  client.publish(mqtt_topic, sensorTemp);

  delay(15000);
} 
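
A side note on the sketch above: `sensorTemp` is 3 bytes, which holds two digits plus the terminating null. A reading of 100 or more, a negative two-digit reading, or the DallasTemperature library's disconnected-sensor value (-196.6 °F, as far as I know) would need more room, and `dtostrf` does not check the buffer size. The desktop C++ sketch below uses `snprintf` as a stand-in for `dtostrf`, which is not available off the microcontroller, to show which values fit. `formatTemp` is a hypothetical helper, not part of the original sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Formats a whole-degree temperature into buf as decimal text.
// Returns true only if the text and its terminating '\0' fit in bufSize bytes;
// on false, buf holds a truncated (but still null-terminated) string.
bool formatTemp(int temp, char* buf, std::size_t bufSize) {
    assert(buf != nullptr && bufSize > 0);
    int needed = std::snprintf(buf, bufSize, "%d", temp);
    return needed >= 0 && static_cast<std::size_t>(needed) < bufSize;
}
```

With a 3-byte buffer, `formatTemp(72, ...)` succeeds and `formatTemp(-196, ...)` reports that the value does not fit; an 8-byte buffer handles both. On the ESP8266 the simplest equivalent is probably just enlarging `sensorTemp`.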

OLD 8266 temp sensor code (mostly doesn’t work at all)

#include <ESP8266WiFi.h>
#include <PubSubClient.h>
#include <OneWire.h>
#include <DallasTemperature.h>

const char* ssid = "realSSID";
const char* password = "realPassword";
const char* mqtt_server = "192.168.1.108";

WiFiClient bedroomTemp;
PubSubClient client(bedroomTemp);

const int oneWireBus = 12;
OneWire oneWire(oneWireBus);
DallasTemperature sensors(&oneWire);
int temp;


void setup_wifi() {
  delay(10);
  WiFi.begin(ssid, password);
  WiFi.mode(WIFI_STA);
  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
  }
}


void reconnect() {
  while (!client.connected()) {
    if (client.connect("ESP_Bedroom_Temp")) {
    } else {
      delay(2500);
    }
  }
}

void setup() {
  sensors.begin();  
  setup_wifi();
  client.setServer(mqtt_server, 1883);
}

void loop() {

  if (!client.connected()) {
    reconnect();
  }

  if (!client.loop()) {
    client.connect("ESP_Bedroom_Temp");
  }

  sensors.requestTemperatures();
  temp = sensors.getTempFByIndex(0);

  static char sensorTemp[3];
  dtostrf(temp, 2, 0, sensorTemp);

  client.publish("indoor/temperature/bedroom", sensorTemp);

  delay(15000);

} 

If anyone makes it this far, I greatly appreciate your willingness to help.

THANK YOU!!

Jason

And since, as a noob, I only get one image per post, here’s another: broker-options

And another. For whatever good it will do. :slight_smile:
HA_command_line

After 3 more days of troubleshooting, I figured out what it was. It was the 8266 sketch. :roll_eyes:

The delay(15000); call was the culprit.

Forgive my lack of understanding, as I’m not a network engineer, but as far as I can tell, the host was losing communication with the client while the delay blocked the sketch (client.loop() cannot run during a delay), and the connection was being closed. This caused some of the MQTT messages to get lost.

I used this instead, which lets client.loop() keep running and maintains a persistent connection between the host and the client.

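// Declared at file scope, alongside the other globals:
unsigned long elapsed = 0;   // millis() timestamp of the last publish
bool boot = true;            // true until the first reading has been published

void loop() {

  // Reconnect (with credentials, via reconnect()) if the connection has dropped.
  if (!client.connected()) {
    reconnect();
  }

  // Service the MQTT connection on every pass so keepalive traffic is handled.
  if (!client.loop()) {
    reconnect();
  }

  // Publish once right after boot, then every 15 seconds, without blocking.
  unsigned long millisNow = millis();
  if (boot || millisNow - elapsed > 15000UL) {
    sensors.requestTemperatures();
    temp = sensors.getTempFByIndex(0);

    static char sensorTemp[8];   // room for a sign, several digits, and the null
    dtostrf(temp, 2, 0, sensorTemp);

    client.publish(mqtt_topic, sensorTemp);
    elapsed = millisNow;
    boot = false;
  }

}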
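
One way to see why the blocking delay matters is a toy model of a client-side keepalive check, written in plain desktop C++. This is a sketch under stated assumptions, not PubSubClient's actual source: it assumes the client sends a ping once the keepalive interval has passed without inbound traffic, only notices the reply on a later pass through loop(), and gives up if the interval passes again while the ping is still unprocessed. `KeepaliveModel` and `firstTimeoutMs` are hypothetical names.

```cpp
#include <cassert>

// Toy model of a client-side MQTT keepalive check (assumption: general
// "ping, then give up if still unanswered" pattern, not a real library).
class KeepaliveModel {
public:
    explicit KeepaliveModel(unsigned long keepAliveMs)
        : keepAliveMs_(keepAliveMs) {}

    // Call once per pass through loop(). Returns false when the client
    // would give up and drop the connection.
    bool service(unsigned long nowMs) {
        if (nowMs - lastInboundMs_ > keepAliveMs_) {
            if (pingOutstanding_) {
                return false;              // earlier ping was never processed
            }
            pingOutstanding_ = true;       // send a ping, restart the timer
            pingSentMs_ = nowMs;
            lastInboundMs_ = nowMs;
        }
        if (pingOutstanding_ && nowMs > pingSentMs_) {
            pingOutstanding_ = false;      // reply is read on a later pass
            lastInboundMs_ = nowMs;
        }
        return true;
    }

private:
    unsigned long keepAliveMs_;
    unsigned long lastInboundMs_ = 0;
    unsigned long pingSentMs_ = 0;
    bool pingOutstanding_ = false;
};

// Returns the time (ms) of the first dropped connection, or 0 if none occurs
// within runForMs. loopPeriodMs is how long one pass through loop() takes.
unsigned long firstTimeoutMs(unsigned long loopPeriodMs,
                             unsigned long keepAliveMs,
                             unsigned long runForMs) {
    assert(loopPeriodMs > 0);
    KeepaliveModel model(keepAliveMs);
    for (unsigned long t = 0; t <= runForMs; t += loopPeriodMs) {
        if (!model.service(t)) {
            return t;
        }
    }
    return 0;
}
```

With a 15,000 ms keepalive, `firstTimeoutMs(15750, 15000, 120000)` returns 31500 in this model, while `firstTimeoutMs(10, 15000, 120000)` returns 0 (no drop). The 15,750 ms loop time assumes the 15-second delay plus roughly 750 ms for a temperature conversion; that figure is an assumption, not a measurement. The result is at least consistent with the broker log, where each client reconnects about every 31 seconds.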

Keep in mind this discovery came after I spent hours trying to install Mosquitto directly on the Windows machine hosting the HA VirtualBox installation. It turned out that missing .dll files were preventing the Mosquitto installation (error 0xc000007b was the primary one). I’m not sure which fix actually solved it, but I updated the .NET Framework, Visual C++, and DirectX, and manually installed vcruntime140.dll. I found this to be a very thorough tutorial:

https://appuals.com/fix-error-0xc00007b-application-was-unable-to-start-correctly/

Once I had Mosquitto installed on the Windows host, I used this method of configuration:

Then I pointed Home Assistant’s MQTT integration and Node-RED’s MQTT broker to port 1883 of the host machine’s IP address.

But the MQTT messaging problem still persisted. I installed MQTT Explorer and Wireshark, and after looking at the logs, a friend of mine (much smarter than I) pointed me toward the client as the root of the problem.

Once we figured out the 8266 coding issue, MQTT messages and the Wireshark logs looked pristine.

So I moved Mosquitto back inside Home Assistant using the add-on, changed the broker IP address (port 1883) back, and everything was fine.

At least for now. I still have about 10 more 8266-based devices yet to bring online, but I have a feeling that won’t happen without some troubleshooting.