Is there any way to auto-update the firmware on all ESP devices?

Edwin_D · August 5, 2024, 8:08pm

That is an assumption. The RCA is not fully completed afaik, but the preliminary report suggests they did test to a level that was considered to be adequate. But tests too may fail to detect everything.

What is telling, though, is that even when only a small percentage of Windows machines is affected, the disruption can be so invasive. It shows how dependent we’ve all become on technology that is vulnerable. And it is ironic that a swift response to a threat has now become a threat itself.

CaptTom · August 5, 2024, 11:52pm

It seems to me that the fact that it went so catastrophically badly for these organizations is prima facie evidence that their testing wasn’t adequate - for them.

Yes, we’re all pretty heavily dependent on technology. But we know this. Each organization knows which are their mission-critical devices. Maybe times have changed, but blasting out a defective change to mission-critical devices, especially to all of them at once, would have gotten me fired back when I was doing this kind of work.

Chrisalbertson · August 6, 2024, 12:50am

Lifetime write limit? If you update every week for 100 years there would only be 5200 writes. At that rate I’m good for almost 2,000 years.

ESP32 docs say the flash ROM is good for over 100,000 updates. So the device should last until the year 3947. But I bet it will last to the year 4,000. We just have to wait and see.
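
Spelled out, assuming the 100,000-cycle figure quoted above and exactly one update per week, counting from 2024:

$$
\frac{100{,}000\ \text{writes}}{52\ \text{writes/year}} \approx 1{,}923\ \text{years}, \qquad 2024 + 1{,}923 = 3947
$$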

WallyR · August 6, 2024, 1:24am

I do not know when you did this work, but in the 30 years I have worked in IT security, a security breach has always been a worse event than a service failure.
The world is more dependent on digital services today, so the impact is bigger today, but back then the systems were generally of a poorer security standard, so the risk of a security breach was higher.
So the decision to protect against zero-day security risks in the wild would be the same, and that means there is no time to test, so you install without testing.

tmjpugh · August 6, 2024, 1:32am

Sadly many were fired.

Chrisalbertson · August 6, 2024, 1:38am

The problem is they were using MS Windows with auto-update for critical infrastructure. I posted this to another forum and some IT “expert” said the only other option is macOS. He was unfamiliar with any other OS. Many of the IT systems that failed were maintained by people with very little formal training; they thought that if it was patched and up to date, it must be OK. The entire idea of stress testing in an isolated sandbox, followed by a gradual rollout, was foreign to them. Notice that people who used anything other than “Windows on auto-update” were unaffected.

WallyR · August 6, 2024, 2:24am

You have two choices here.

1. Stress test the updates and avoid a service failure, but risk a security breach.
2. Do not stress test and prevent the security breach, but risk a service failure.

Option 2 is by far the lesser evil here, especially with focus on privacy today.
You do not only risk losing brand value today, but also getting fined by data protection agencies, especially if you deal in Europe.

And the outage had nothing to do with Windows updates.
It was an update to the CrowdStrike software, and only users of that software were affected. CrowdStrike just happens to have the leading product in that field, so many big firms use it.
The problem could just as well have occurred on macOS.

juronja · August 6, 2024, 7:59am

oh ok, no worries about Cloudflare then.

Thank you, I saw this, yes, but I think I’ll try the approach suggested here: not updating all the time just for the sake of the firmware being up to date.

I’ll mark a solution that gave me the best idea to do things differently, but thank you for the ideas to everyone. Really helpful.

- I’ll stop flashing firmware updates ALL the time
- just keep the yaml updated (but not push it to the ESP if there are no functional changes)
- I have made a crude automation for skipping updates (will refine this in the future)
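
For anyone wanting a starting point, an automation along those lines might look roughly like the sketch below. This is not the automation mentioned above, just an illustration. The alias, the nightly trigger time, the `pending` variable name and the `update.esphome_update` entity ID are all assumptions, and the `title` filter is borrowed from the scripts posted later in this thread.

```yaml
# Sketch only: alias, schedule and entity IDs are assumptions, adjust to your setup.
alias: Skip pending ESP firmware updates (sketch)
description: ""
trigger:
  # Hypothetical schedule: check once a night.
  - platform: time
    at: "03:00:00"
variables:
  # All ESPHome update entities that currently offer an update,
  # excluding the (assumed) update entity of the ESPHome add-on itself.
  pending: >-
    {{ states.update
       | selectattr('attributes.title', 'eq', 'ESPHome')
       | selectattr('state', 'eq', 'on')
       | map(attribute='entity_id')
       | reject('eq', 'update.esphome_update')
       | list }}
condition:
  # Do nothing when no ESP update is pending.
  - condition: template
    value_template: "{{ pending | count > 0 }}"
action:
  # Mark the pending firmware versions as skipped.
  - service: update.skip
    target:
      entity_id: "{{ pending }}"
mode: single
```

A skipped version stays skipped until a newer one appears or the skip is cleared, so the devices can still be flashed deliberately when a functional change is needed.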

much obliged

–

Oh, and I found this podcast today with Keith. It’s quite on topic for what you guys are talking about:

Hellis81 · August 6, 2024, 8:05am

I don’t think that will work.
I know when I was adding IR remote commands I used the debug to get the codes then I added them to the yaml.
I went back to the debug window to get more codes, but then it refused to start the debug because the device and the yaml were not the same.
If I recall this correctly.

WallyR · August 6, 2024, 8:15am

I do that all the time, but maybe I do not change some of the basic connection settings, like api keys, password, IP address and so on.

Hellis81 · August 6, 2024, 8:22am

I did not change that either. I just added more remote codes.
I just wanted to not have to flash the device after each code I copied but that didn’t work.
Perhaps it only causes issues when the change is something that will be reflected in HA.
When I added a remote code, it was added as a button. If that button does not exist on the board, then I guess that is the issue.

I don’t want to make any changes to my devices just to prove the point, since most of my devices have not been updated in a year or more. It will most likely cause a lot of issues.

WallyR · August 6, 2024, 8:32am

I will have to test that later.
It should be possible to just make a copy of the YAML before the test and then see if changes make the debug fail. Pasting the original code back should just revert it to the initial working state, and if not, then I just flash that single devboard.

Edwin_D · August 6, 2024, 9:15am

For those who have dozens of similar ESPs and dislike going through a YouTube video and pausing it to see the code: here are some convenience scripts.

Note that you need to use the ESPHome addon for this, otherwise you won’t have the update entities. I took a slightly different approach from the video and made sure to exclude updating ESPHome itself, should you have a reason not to want to do that.

Also note: there is no need, nor is it wise, to update your ESP every time. So maybe use the skip script more often, and the update script less.

Remember to test updating one ESP of each type first, before you update all, to avoid Crowdstriking all the devices you depend on at once.

alias: Skip all ESP updates
sequence:
  - service: update.skip
    metadata: {}
    data: {}
    target:
      entity_id: >-
        {{ states.update | selectattr('attributes.title','eq','ESPHome') |
        selectattr('state','eq','on') | map(attribute='entity_id') |
        reject('eq','update.esphome_update') | list }}
description: ""
icon: mdi:memory

alias: Unskip all ESP updates
sequence:
  - service: update.clear_skipped
    metadata: {}
    data: {}
    target:
      entity_id: >-
        {{ states.update | selectattr('attributes.title','eq','ESPHome') |
        map(attribute='entity_id') | reject('eq','update.esphome_update') | list
        }}
description: ""
icon: mdi:memory

alias: Install all non-skipped ESP updates
sequence:
  - service: update.install
    metadata: {}
    data: {}
    target:
      entity_id: >-
        {{ states.update | selectattr('attributes.title','eq','ESPHome') |
        selectattr('state','eq','on') | map(attribute='entity_id') |
        reject('eq','update.esphome_update') | list }}
description: ""
icon: mdi:memory
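
The "one ESP of each type first" advice could be folded into a script as well. The following is a minimal sketch, not something tested in this thread: `update.test_esp_firmware` is a hypothetical entity for a single test device, the 15-minute timeout is an arbitrary choice, and it assumes the test device has an update pending when the script starts.

```yaml
# Sketch only: the canary entity ID and the timeout are assumptions.
alias: Staged ESP update (sketch)
sequence:
  # 1. Flash a single test device first.
  - service: update.install
    target:
      entity_id: update.test_esp_firmware
  # 2. Wait until its update entity no longer reports a pending update.
  #    Stop the script if that does not happen within the timeout.
  - wait_template: "{{ is_state('update.test_esp_firmware', 'off') }}"
    timeout: "00:15:00"
    continue_on_timeout: false
  # 3. Only then install the remaining non-skipped ESP updates.
  - service: update.install
    target:
      entity_id: >-
        {{ states.update | selectattr('attributes.title','eq','ESPHome') |
        selectattr('state','eq','on') | map(attribute='entity_id') |
        reject('eq','update.esphome_update') | list }}
description: ""
icon: mdi:memory
```

Note that the `off` state only shows that the new version was installed, not that the device behaves correctly, so a manual check of the test device before step 3 is still the safer route. With several device types, one such canary per type would be needed.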

juronja · August 6, 2024, 12:22pm

Thank you for the script ideas. So I tried the second one, since all esps are currently up-to-date. But I get this error.

I’ll probably rather go with a label and use a label entity {{ label_entities('ID') }}.

Although that does not seem to work either. I have to dig into the documentation more.

Ellcon · August 6, 2024, 12:23pm

I don’t think there is a right or wrong answer to this. I have 20+ devices and happily update with ESP updates. I do check for breaking updates and test accordingly. My clean HA instance is on an rpi but I use my backup instance in a VM for test and debugging.

I also have my YAML split into separate files for wifi, device general settings, and device-specific settings, so hopefully only one file needs updating for breaking changes.
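
As an illustration of that kind of split, ESPHome's `packages` with `!include` is one way to do it. Whether this matches the setup described above is an assumption; the file names, the substitution and the example sensor below are all hypothetical.

```yaml
# livingroom-sensor.yaml (hypothetical device file)
substitutions:
  device_name: livingroom-sensor

packages:
  # Assumed shared files: wifi.yaml would hold the wifi: block,
  # base.yaml the general settings such as api:, ota: and logger:.
  wifi: !include common/wifi.yaml
  base: !include common/base.yaml

esphome:
  name: ${device_name}

esp32:
  board: esp32dev

# Device-specific part: only this file changes per device.
binary_sensor:
  - platform: gpio
    pin: GPIO4
    name: "Motion"
```

With this layout, a breaking change in, say, the `ota:` syntax would only need fixing in the shared base file.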

Edwin_D · August 6, 2024, 12:42pm

Sorry, I use the beta. I tested it with that. The coming release renames service to action, and this was the new format. I’ll rename it back for now. If you update HA tomorrow evening the action in the script will work (and so will the ‘old’ service)

I guess this makes me guilty of ‘insufficient’ testing and releasing beta code to production
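
To make the difference concrete, here is a rough sketch of the same step in both spellings. The entity ID is a made-up example.

```yaml
sequence:
  # Current release: the step uses the "service" key.
  - service: update.skip
    target:
      entity_id: update.example_esp_firmware  # hypothetical entity
  # Beta / coming release: the same step can be written with "action" instead.
  # - action: update.skip
  #   target:
  #     entity_id: update.example_esp_firmware
```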

juronja · August 6, 2024, 1:38pm

With local packages? (I just googled that this is possible.)

oh makes sense, thank you! Your script works flawlessly. Great solution for skipping updates.

Have you ever tried with labels? Mine do not seem to work. The label entity list values are the same though when testing.

alias: ESPHome Skip Updates
sequence:
  - service: update.skip
    metadata: {}
    data: {}
    target:
      label_id: esphome
description: ""
icon: mdi:chip
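
One possible explanation, not confirmed in this thread: a `label_id` target hits every labelled update entity, including those with no update pending, and `update.skip` may refuse those. A sketch that combines the label with the state filter from the scripts above could look like this (the alias is made up; the label ID `esphome` is taken from the script above):

```yaml
# Sketch only: assumes the label ID is "esphome", as in the script above.
alias: ESPHome Skip Updates (label sketch)
sequence:
  - service: update.skip
    target:
      # Only update entities that carry the label AND have an update pending.
      entity_id: >-
        {{ states.update
           | selectattr('entity_id', 'in', label_entities('esphome'))
           | selectattr('state', 'eq', 'on')
           | map(attribute='entity_id')
           | list }}
description: ""
icon: mdi:chip
```

If the label is attached to the devices rather than to the update entities themselves, `label_entities` may not return the update entities, so it is worth checking the template output in Developer Tools first.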