Wireless update fails every time

I’m having trouble getting my ESPHome device to do anything after a wireless update. Updating via USB works OK, but I’m going to have to send someone a bill to replace the screw terminals soon!

The update appears to go as planned, but the device never wakes up.

The wireless update log (captured with the device connected via USB while updating wirelessly, to see what’s actually happening):

[D][api:102]: Accepted 192.168.1.105
[D][api.connection:861]: Home Assistant 2022.6.6 (192.168.X.XXX): Connected successfully
[D][ota:139]: Starting OTA Update from 192.168.X.XXX...
[D][ota:308]: OTA in progress: 0.3%
[D][ota:308]: OTA in progress: 16.6%
[D][ota:308]: OTA in progress: 36.6%
[D][ota:308]: OTA in progress: 54.2%
[D][ota:308]: OTA in progress: 74.2%
[D][ota:308]: OTA in progress: 94.1%
[I][ota:341]: OTA update finished!
[I][app:133]: Rebooting safely...
[W][wifi_esp8266:482]: Event: Disconnected ssid='XXXXXXX' bssid=76:5A:B0:XX:XX:XX[redacted] reason='Association Leave'

 ets Jan  8 2013,rst cause:2, boot mode:(3,6)

load 0x4010f000, len 3460, room 16 
tail 4
chksum 0xcc
load 0x3fff20b8, len 40, room 4 
tail 4
chksum 0xc9
csum 0xc9
v0007a100
@cp:B0
ld
e:3
 ets Jan  8 2013,rst cause:3, boot mode:(3,6)

So, still struggling with one device and this issue.

Looking at this list, there is no listing for reset cause 3. Digging in the ESP docs, it’s possibly a software watchdog reset.

not sure where to go from here…
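
For reference, the two numbers in the boot ROM line can be pulled out and labelled with a small script. This is only a sketch: `parse_boot_line` is a hypothetical helper, and the label table is an assumption based on the boot-message table in the ESP8266 Arduino core documentation as recalled here. That numbering is separate from the SDK’s `rst_info` reason codes, so both are worth checking before drawing conclusions.

```python
import re

# ASSUMPTION: labels as recalled from the ESP8266 Arduino core boot-message docs.
# Verify against the current documentation before relying on them.
RST_CAUSE = {
    0: "unknown",
    1: "normal boot",
    2: "reset pin",
    3: "software reset",
    4: "watchdog reset",
}

# Matches e.g. "rst cause:2, boot mode:(3,6)" anywhere in a log line.
BOOT_LINE = re.compile(r"rst cause:\s*(\d+),\s*boot mode:\s*\((\d+),\s*(\d+)\)")


def parse_boot_line(line):
    """Return (cause number, cause label, boot mode tuple), or None if no match."""
    match = BOOT_LINE.search(line)
    if match is None:
        return None
    cause = int(match.group(1))
    mode = (int(match.group(2)), int(match.group(3)))
    return cause, RST_CAUSE.get(cause, "not in table"), mode


# The two boot lines from the log above.
for log_line in (
    " ets Jan  8 2013,rst cause:2, boot mode:(3,6)",
    " ets Jan  8 2013,rst cause:3, boot mode:(3,6)",
):
    print(parse_boot_line(log_line))
```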

Did you try safe mode?

What is your router model?

What’s your board?

board is a d1-mini

The device doesn’t boot at all after the OTA update, so safe mode doesn’t work either… I do use safe mode on other devices and it normally works well, but there the situation is slightly different: the device does connect, but I’m spamming the logs and OTA updates fail because of that.

The board does work when uploading via USB. The OTA update also appears to complete correctly; it just doesn’t boot afterwards. For testing there is nothing connected to the pins at all, and I’ve tried defining other pins to make sure it’s not something in the code.

I have another d1-mini running exactly the same code now, so I’m wondering if it’s a flash issue. It’s just strange that I can’t decipher what’s going on from the UART log above…

I think the appropriate thing to do is call the board junk and hit it with a blowtorch…

Hi @chris.huitema

We know your code is good as you use it on another device. :heavy_check_mark:
We know it responds because you connect via USB. :heavy_check_mark:

The thing we are unsure of is the Wi-Fi connection… so :thinking:

When you upload via USB, does it then connect via Wi-Fi? Or do you have it plugged directly into the device running HA, so it is connecting via USB?

or

When doing OTA update do you see the upload from 0 to 100%. If you do then Wi-Fi is good.

If it connects via Wi-Fi then we know that is working. If not, here are some things to check.

I know you’ve probably done all of this below, but maybe check it again and go through the motions.

If not connecting via Wi-Fi

1/ Maybe check your Wi-Fi info. We’ve all done it before: just one letter or something stupid. Maybe copy / paste from another ESP.
2/ Do you have an OTA password? It could be that. Maybe remove it… try it
3/ Do you have an API password? It could be that. Maybe remove it… try it

Things to try after Wi-Fi or if Wi-Fi is good

1/ Clean Build Files. Click the 3 dots for that ESP and select Clean Build Files, just in case something is not right.
2/ Delete it and reinstall it. Make sure you remove it both in ESPHome and in Settings - Devices & Services - ESPHome
3/ Your appropriate option “Blowtorch” :pensive:

OTA flashing requires at least half the flash memory to be free.
That is because the currently running instance has to write the new one to flash, set a pointer to boot to it next time, and then reboot.
USB/serial flashing relies on only the bootloader running, and paves over the old code with your new code.
So, if your new image is a tiny bit too large, it may be just barely passing the “is it too big?” test that OTA does, but is just enough too big to be unbootable.
So, try leaving a component or feature (e.g. web server) out of the new code, just to see if it (being smaller) can OTA flash successfully.
Another possible cause for failure to flash that I’ve seen is weak power. Note that the device will require more power when WiFi is running (e.g. during OTA flash) than it does when only the bootloader is running (e.g. during USB flash).
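
A rough way to sanity-check the space rule described above. This is a simplified sketch, not how ESPHome itself checks: `ota_fits` is a hypothetical helper, the byte counts are made-up illustration values, and the real ESP8266 layout also reserves space for other things, so treat the result as a first-pass estimate only.

```python
def ota_fits(current_bytes, new_bytes, app_space_bytes):
    """Return True if the running image and the new image fit side by side.

    Simplified model: during OTA the running firmware stays in place while
    the new image is written next to it, so both must fit at the same time.
    With equal image sizes this reduces to "at least half the space free".
    """
    if current_bytes < 0 or new_bytes < 0 or app_space_bytes <= 0:
        raise ValueError("sizes must be non-negative and the space positive")
    return current_bytes + new_bytes <= app_space_bytes


# Hypothetical numbers: 1,000,000 bytes of usable space.
print(ota_fits(480_000, 480_000, 1_000_000))  # both images at 48% of the space
print(ota_fits(480_000, 530_000, 1_000_000))  # new image slightly too large
```

The first call returns `True` and the second `False`, which is the "just barely too big" case where dropping a component from the new build would be worth trying.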

So when performing the OTA update, it progresses from 0 to 100% without any issue, and Wi-Fi is solid.

I had tried a clean build a few times, with the same result. I also tried an erase using esptool to clear anything left behind on the board.

The size of the code was fairly small: 42% RAM, 48% flash.

Interestingly, I created a new device in ESPHome and loaded the default blank code via USB. I can then upload the other code via OTA successfully, but subsequent OTA updates always fail.

I did manage to get some additional logs: right after doing the install via USB, I closed the window and connected via USB to see the logs. After the exception it seems to restart and work OK. The exception seems to be a ‘StoreProhibitedCause’, i.e. an invalid write address. Given it’s just this one device this happens on, I’m guessing some Alzheimer’s in the flash…

--------------- CUT HERE FOR EXCEPTION DECODER ---------------

Exception (29):
epc1=0x40260cbf epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000040 depc=0x00000000

>>>stack>>>

ctx: sys
sp: 3fffd580 end: 3fffffb0 offset: 0190
3fffd710:  2c642520 6f6f7220 6425206d 00000a20  
3fffd720:  73616c66 65722068 65206461 202c7272  
1000's of lines redacted as it probably contains sensitive information
<<<stack<<<

--------------- CUT HERE FOR EXCEPTION DECODER ---------------

 ets Jan  8 2013,rst cause:2, boot mode:(3,6)

load 0x4010f000, len 3460, room 16 
tail 4
chksum 0xcc
load 0x3fff20b8, len 40, room 4 
tail 4
chksum 0xc9
csum 0xc9
v0007b500
~ld
[I][logger:243]: Log initialized
[C][ota:461]: There have been 1 suspected unsuccessful boot attempts.
[I][app:029]: Running through setup()...
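
The two stack lines that survived the redaction above can be turned back into text, which helps judge whether a dump holds anything sensitive. A minimal sketch, assuming each word is a little-endian 32-bit value as on the ESP8266; `stack_words_to_text` is a hypothetical helper name.

```python
import struct


def stack_words_to_text(line):
    """Decode one 'addr:  w0 w1 w2 w3' stack-dump line into printable ASCII.

    Each hex word is repacked as a little-endian 32-bit value to recover the
    byte order in memory. Non-printable bytes are shown as '.'.
    """
    _addr, _sep, words = line.partition(":")
    raw = b"".join(struct.pack("<I", int(word, 16)) for word in words.split())
    return "".join(chr(byte) if 32 <= byte < 127 else "." for byte in raw)


# The two unredacted lines from the dump above.
print(repr(stack_words_to_text("3fffd710:  2c642520 6f6f7220 6425206d 00000a20")))
print(repr(stack_words_to_text("3fffd720:  73616c66 65722068 65206461 202c7272")))
```

Both lines decode to fragments of message format strings (`' %d, room %d ...'` and `'flash read err, '`) rather than anything private. Finding such a string in memory does not by itself show that a flash read error occurred.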

And the code I’m trying to load:

esphome:
  name: test

esp8266:
  board: d1_mini

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "xxxxx"

ota:
  password: "xxxxx"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Test Fallback Hotspot"
    password: "xxxxx"

captive_portal:
    
switch:
  - platform: gpio
    pin:
      number: D3
      inverted: false
    id: relay_pin
    # If ESP reboots, do not attempt to restore switch state
    restore_mode: always off
    on_turn_on:
    - delay: 500ms
    - switch.turn_off: relay_pin
    
binary_sensor:
  - platform: gpio
    pin:
      number: D5
      mode:
        input: true
        pullup: true
    name: "Test Open Reed"
    id: open_switch
    filters:
      - delayed_on: 150ms
      - delayed_off: 150ms
    on_release:
      - cover.template.publish:
          id: test_door
          current_operation: "closing"
    on_press:
      - cover.template.publish:
          id: test_door
          current_operation: "idle" 
  - platform: gpio
    pin:
      number: D7
      mode:
        input: true
        pullup: true
    name: "Test Close Reed"
    id: close_switch
    filters:
      - delayed_on: 150ms
      - delayed_off: 150ms
    on_release:
      - cover.template.publish:
          id: test_door
          current_operation: "opening"
    on_press:
      - cover.template.publish:
          id: test_door
          current_operation: "idle"    
      
cover:
  - platform: template
    name: "Test Door"
    id: test_door
    device_class: garage
    lambda: |-
      if (id(close_switch).state) {
        return COVER_CLOSED;
      } else if (id(open_switch).state) {
        return COVER_OPEN;
      } else {
        return {};
      }
    open_action:
      - switch.turn_on: relay_pin
    close_action:
      - switch.turn_on: relay_pin
    stop_action:
      - switch.turn_on: relay_pin
    optimistic: true
    
sensor:
  - platform: uptime
    name: test Uptime Sensor
    id: uptime_sensor
    update_interval: 60s

Not sure if mine is the same issue, but when I update an ESP8266 over Wi-Fi it causes the whole of Home Assistant to reboot. Plugging in directly is fine.

This can happen even when just writing the default firmware for the first time.

Can’t say I’ve ever had Home Assistant reboot when uploading via Wi-Fi; it’s just the ESP that’s not booting up…