ESPHome update 2024.6.1 device keeps crashing

Hi,

I’ve updated my ESPHome to the newest version, 2024.6.1, and now have a weird issue. One of my devices keeps rebooting after a few seconds with no clear error. I updated the config for the new OTA platform and the one_wire and dallas_temp changes, nothing else. It was working perfectly fine before the update.

It’s a climate device with 4 fans connected to the PWM output. When the device boots, the fans start spinning, but after a few seconds (not always the same amount of time) they stop, the device disconnects as shown in the logs and in HA, reboots, the fans start spinning again, and the whole cycle repeats.

Any suggestions what this might be? I’m kinda lost here…

My (new) code is the following:

esphome:
  name: patchkast-ventilatie
  friendly_name: Patchkast Ventilatie

esp8266:
  board: nodemcuv2

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "######"

ota:
  password: "#####"
  platform: esphome

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Patchkast-Ventilatie"
    password: "#####"

one_wire:
  - platform: gpio
    pin: GPIO4
  #TEMP1: 0x15a7dd5309646128
  #TEMP2: 0xa0f6c35309646128
    
sensor:
  - platform: dallas_temp
    address: 0x15a7dd5309646128
    name: "Kast temperatuur"
    update_interval: 10s
    id: Temp1
  - platform: dallas_temp
    address: 0xa0f6c35309646128
    name: "Input temperatuur"
    update_interval: 10s
    id: Temp2

output:
  - platform: esp8266_pwm
    pin: D5
    frequency: 25000 Hz
    id: ventilatie_output

fan:
  - platform: speed
    output: ventilatie_output
    name: "Ventilatie"
    id: fan_1

climate:
  - platform: thermostat
    name: "Patchkast Ventilatie"
    sensor: Temp1
    min_cooling_off_time: 300s
    min_cooling_run_time: 300s
    min_fanning_off_time: 0s
    min_fanning_run_time: 0s
    min_fan_mode_switching_time: 0s 
    min_idle_time: 30s
    cool_overrun: 1
      
#Climate actions
    cool_action:
      - fan.turn_on: fan_1
      - output.set_level:
          id: ventilatie_output
          level: 90%
    off_mode:
      - fan.turn_off: fan_1
    idle_action:
      - fan.turn_on: fan_1
      - output.set_level:
          id: ventilatie_output
          level: 30%
    fan_only_action:
      - fan.turn_on: fan_1
      - output.set_level:
          id: ventilatie_output
          level: 50%
#Ventilatie niveau's
    fan_mode_on_action:
      - output.set_level:
          id: ventilatie_output
          level: 40%
    fan_mode_off_action:
      - fan.turn_off: fan_1
    fan_mode_quiet_action:
      - output.set_level:
          id: ventilatie_output
          level: 40%
    fan_mode_low_action:
      - output.set_level:
          id: ventilatie_output
          level: 60%
    fan_mode_medium_action:
      - output.set_level:
          id: ventilatie_output
          level: 80%  
    fan_mode_high_action:
      - output.set_level:
          id: ventilatie_output
          level: 100%
#Preset levels
    default_preset: Auto
    preset:
      - name: Auto
        default_target_temperature_high: 26 °C
        mode: COOL
        fan_mode: QUIET
      - name: PC aan
        default_target_temperature_high: 26 °C
        mode: COOL
        fan_mode: MEDIUM
      - name: Denon aan
        default_target_temperature_high: 26 °C
        mode: COOL
        fan_mode: LOW
      - name: MAX
        default_target_temperature_high: 26 °C
        mode: COOL
        fan_mode: HIGH


captive_portal:
    

Hi,
The evidence of increased current load before a lock-up could suggest a hardware issue. Don’t know why an ESP update could increase the problem, but I’ve seen stranger things happen.

Could the fans drawing a higher current be causing the power supply voltage to “droop”, resulting in a brown-out? The higher motor speeds could also cause RF noise / spikes on the power rails and crash the ESP.

Adding 1000nF + 100pF decoupling capacitors directly on the ESP Vcc + Gnd might help, but I’d measure the ESP voltage on both DC and AC ranges. DC might show the “droop”, and AC might show a ripple caused by the fans.

Other than that, could the timing or set-up of the GPIO have changed in the new release? That the issue doesn’t happen the same every time suggests hardware at least in part.

Can you connect a console and get a log of the full boot log? (Be careful with different power supplies and voltages.)
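
If a serial console is awkward to hook up, a rough alternative is to let the device report why it last restarted. This is only a sketch using ESPHome's debug component; the entity name is a placeholder I made up, and the fragment has to be merged into the existing config rather than pasted in as-is.

```yaml
# Fragment to merge into the existing config.
# "Reset Reason" is a hypothetical entity name, not from the original post.
debug:
  update_interval: 5s

text_sensor:
  - platform: debug
    reset_reason:
      name: "Reset Reason"
```

A reset reason pointing at power or a watchdog would hint at hardware or a blocking component respectively, though it is a hint rather than proof.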

If this helps, :heart: this post!

Hi,

Thanks for the quick reply. I measured the VCC-Gnd and this stays rock solid when the fans switch on/off. During testing the ESP rebooted like before.

I have no 1000nF and 100pF decoupling capacitors lying around, so I can’t try that now.

Is there a software way to boost the esp chip? Maybe if I tweak this a bit it stays on?

Maybe something to do with this issue ?

Others are having crashing issues

I measured the VCC-Gnd and this stays rock solid when the fans switch on/off.

Notice the suggestion to measure both DC and AC volts to look for ripple - it could be RF noise rather than droop / voltage drop.

Is there a software way to boost the esp chip? Maybe if I tweak this a bit it stays on?

There might be ways to alter the timing of a watchdog timer, but it’s hard to fix hardware issues in software!

You might be able to alter the PSU load by testing with fewer fans / a lower PWM frequency / a lower PWM duty cycle.
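
As a sketch of what such a test could look like in the config above: the values below are arbitrary test values I picked, not recommendations, and the fans may not like a low PWM frequency, so treat it purely as a diagnostic step.

```yaml
output:
  - platform: esp8266_pwm
    pin: D5
    frequency: 1000 Hz   # assumed test value, down from 25000 Hz
    max_power: 0.5       # assumed test value: scales output so a 100% request gives 50% duty
    id: ventilatie_output
```

If the disconnects stop with these settings, raising one value at a time should show whether frequency or load is the trigger.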

@FloatingBoater, yeah, my bad, I also measured in AC: no change at all. @Holdestmade, I also looked at that post, and although some people there have problems I can’t really connect the two. I’m going to follow that one anyway, thanks.

I took the device to the desk and I’m beginning to think it’s not really crashing but has some API interference or something. It seems that the ESP itself doesn’t reboot. (The blue light doesn’t start to flicker.)
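
One way to check this without watching the LED would be an uptime sensor: if the value keeps counting up across a disconnect, the ESP did not reboot. A minimal sketch, with a sensor name I made up, to be added to the existing sensor list:

```yaml
sensor:
  - platform: uptime
    name: "Patchkast Uptime"   # hypothetical name, not in the original config
```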

I had to alter the part for OTA in the code, might this be connected?

Interesting find - after using DS18x20 and similar digital serial number chips for {checks Dallas data sheet books} 30 years :flushed:, I’m surprised the code is wobbly! (suppose it was just re-factored to handle the non-temp sensors)

Do you get any boot logs connecting via USB (again, assuming isolated PSUs, not mains, etc)? How about if you disable 1-Wire and reflash?
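
Since the thermostat needs a sensor with the id Temp1, disabling 1-Wire means putting something in its place. A rough sketch, assuming a template sensor with a fixed dummy value is acceptable for the test (the name and the 22.0 value are placeholders):

```yaml
# For the test, comment out the one_wire: block and both dallas_temp sensors,
# then add this stand-in so the climate component still has a Temp1 to read.
sensor:
  - platform: template
    id: Temp1
    name: "Kast temperatuur (dummy)"   # hypothetical name
    unit_of_measurement: "°C"
    update_interval: 10s
    lambda: |-
      return 22.0;  // fixed placeholder value, below the 26 °C target
```

If the disconnects continue with the fans running and no 1-Wire bus configured, the new one_wire code is unlikely to be the cause.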

I’ve had to connect via serial then manually button the uP to find stuff before. Typically debug hardware setup took an hour chasing serial level issues, then a minute to see the rather obvious root cause! :slight_smile: :man_mage:

I get multiple warnings saying the API and dallas_temp are slow. I already had the temperature sensor warnings before the update, but the API delay is new.


[16:29:15][I][thermostat.climate:1021]: Custom preset Auto requested
[16:29:15][D][climate:396]: 'Patchkast Ventilatie' - Sending state:
[16:29:15][D][climate:399]:   Mode: COOL
[16:29:15][D][climate:401]:   Action: IDLE
[16:29:15][D][climate:404]:   Fan Mode: QUIET
[16:29:15][D][climate:413]:   Custom Preset: Auto
[16:29:15][D][climate:419]:   Current Temperature: 21.75°C
[16:29:15][D][climate:425]:   Target Temperature: 26.00°C
[16:29:15][I][thermostat.climate:1031]: Custom preset Auto applied
[16:29:15][D][climate:396]: 'Patchkast Ventilatie' - Sending state:
[16:29:15][D][climate:399]:   Mode: COOL
[16:29:15][D][climate:401]:   Action: IDLE
[16:29:15][D][climate:404]:   Fan Mode: QUIET
[16:29:15][D][climate:413]:   Custom Preset: Auto
[16:29:15][D][climate:419]:   Current Temperature: 21.75°C
[16:29:15][D][climate:425]:   Target Temperature: 26.00°C
[16:29:15][W][component:237]: Component api took a long time for an operation (84 ms).
[16:29:15][W][component:238]: Components should block for at most 30 ms.
[16:29:15][D][dallas.temp.sensor:054]: 'Kast temperatuur': Got Temperature=21.8°C
[16:29:15][D][sensor:093]: 'Kast temperatuur': Sending state 21.75000 °C with 1 decimals of accuracy
[16:29:15][D][climate:396]: 'Patchkast Ventilatie' - Sending state:
[16:29:15][D][climate:399]:   Mode: COOL
[16:29:15][D][climate:401]:   Action: IDLE
[16:29:15][D][climate:404]:   Fan Mode: QUIET
[16:29:15][D][climate:413]:   Custom Preset: Auto
[16:29:15][D][climate:419]:   Current Temperature: 21.75°C
[16:29:15][D][climate:425]:   Target Temperature: 26.00°C
[16:29:15][W][component:237]: Component dallas_temp.sensor took a long time for an operation (57 ms).
[16:29:15][W][component:238]: Components should block for at most 30 ms.
[16:29:17][D][climate:011]: 'Patchkast Ventilatie' - Setting
[16:29:17][D][climate:028]:  Custom Preset: PC aan
[16:29:17][I][thermostat.climate:1021]: Custom preset PC aan requested
[16:29:17][D][climate:396]: 'Patchkast Ventilatie' - Sending state:

BTW, it’s no longer shutting down the fans after a few seconds… :face_with_raised_eyebrow:

I have the same problem on ESP32S3. After getting all the code adjusted for the update to one_wire, it seems to work (with multiple hubs and sensors), however after some time, the ESP32S3 will crash. Don’t think it is related to bad quality of the ESP32S3 as I also tested with another chip and Power Supply. Also don’t forget, before the update the ESP32S3 did NOT crash.

Another test on an ESP32-WROOM-32E seems to be stable.

I had to roll back the update as this needs to be solved first.

Jupp, I really think it has something to do with the backend of the update. Somehow I only have it on 1 NodeMCUv2 ESP8266. Have 10 more in the house that haven’t had a single loss in connection since the update.

Don’t want to go back, because I had to adjust all my files for the new OTA, one_wire and dallas_temp, and it took me a long time to adjust them all… :expressionless:

Ok, the problem is still here, unfortunately. I left the fans off for a day and the connection was never lost. Turned them back on and immediately got the same disconnects. So indeed, @FloatingBoater, it has something to do with the fans.

What is visible in the charts are misreadings in the temperature just before or after each disconnect. See attached image. How to go further from this though, no idea. :thinking:
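
To at least keep the spikes out of the thermostat while the root cause is being hunted, a median filter on the sensor might help. This is only a sketch; the window values are guesses, and it hides the glitches rather than fixing whatever causes them.

```yaml
sensor:
  - platform: dallas_temp
    address: 0x15a7dd5309646128
    name: "Kast temperatuur"
    update_interval: 10s
    id: Temp1
    filters:
      - median:
          window_size: 5     # assumed value: median over the last 5 readings
          send_every: 1      # publish a value on every new reading
          send_first_at: 1   # do not wait for a full window after boot
```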

Is RF noise from the fans being picked up via the temp sensor cabling?

Shielded cabling and decoupling caps can help, but a quick hack for a test would be some aluminium foil wrapped around and grounded - just be careful about shorting something out.

The old telecoms trick is to get an AM radio and tune it to static, then move the radio about and listen for the RF interference.

My own projects suffered more from noise via the PSU rather than sensor cabling, hence the previous thoughts. As Dave Jones (EEVBlog) says - “Measure voltages First!”
