WTH Check for Breaking Changes before installing Updates

The title says it all.
Informing users in the UI about what broke was a big step.
For some time now, a warning is sometimes issued when a breaking change is foreseeable, including the version in which it will break existing setups.

If the goal is an HA that is always up and running, it would be perfect if an update could only be installed once each break in functionality is either resolved or actively dismissed.

That information is only tracked in the blog posts; the code has no idea what’s breaking and what isn’t. I doubt this will ever get added beyond the current repair system, i.e. the code only knows it’s broken after the update occurs.

I interpreted the WTH more as a kind of brainstorming: what would be really needed, really awesome.

Sure, it isn’t trivial, and I can’t provide solid solutions.
But, ignoring that it might take quite some CPU time, couldn’t all the YAML code just be scanned for statements that indicate a problem? Couldn’t the versions of add-ons etc. be checked for compatibility?
Or could the breaking changes be simulated: if the new call works, it’s fine; if it fails but the old one works, that indicates a breaking change?
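
To make the YAML-scanning idea a bit more concrete, here is a minimal sketch. It assumes someone maintains a list of text patterns for outdated statements; the patterns, messages, and function names below are made up for illustration and do not come from Home Assistant. It does a plain line-by-line text match, not a real YAML parse.

```python
import re
from pathlib import Path

# Hypothetical rules: regex for an outdated YAML statement -> hint for the user.
# Neither the patterns nor the messages are real Home Assistant breaking changes.
RULES = {
    r"^\s*old_option\s*:": "'old_option' was renamed (made-up example)",
    r"^\s*(-\s*)?platform\s*:\s*legacy_sensor\b": "'legacy_sensor' was removed (made-up example)",
}


def scan_text(text, rules=RULES):
    """Return (line_number, message) for every line that matches a rule."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for pattern, message in rules.items():
            if re.search(pattern, line):
                findings.append((number, message))
    return findings


def scan_config_dir(config_dir):
    """Scan every *.yaml file below config_dir and return {path: findings}."""
    report = {}
    for path in sorted(Path(config_dir).rglob("*.yaml")):
        findings = scan_text(path.read_text(encoding="utf-8"))
        if findings:
            report[str(path)] = findings
    return report
```

A scan like this can only flag things somebody has written a rule for, so it does not remove the need to know about the breaking change in the first place.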

We chose to go to the Moon. Not because it’s easy, but because it’s hard :smiley:

It’s not a CPU issue, it’s a management issue. Who’s going to track all these changes, how will the data be gathered, and where will it be gathered from? Right now it takes 1 person a week to tabulate the breaking changes (and we often get it wrong), right before the release is created. You’re essentially asking this person to double their work within an already limited time frame.

I am not asking anybody who is already short on time to have even less.
I’m describing what I consider a big problem or, put positively, what might be a solution.
If we only think about time, then everything in Home Assistant consumes time.

Maybe a breaking-change tracker (basically a list) would be an improvement? It has to be noted somewhere anyway.
Why not a checklist where all possible breaking changes have to be noted, as a mandatory check for a PR to pass?
And if it is already known what is breaking, could that be fed into the list?

Well, that’s the thing: you are. Even if the process is automated and a required step in the PR, people still have to know it’s a breaking change and add it to a list. Secondly, PRs are cherry-picked into builds, and someone manages that. One large list that is built before the build is made will not satisfy your WTH, because the breaking change isn’t linked to a version. So making it a requirement for the PR doesn’t help, because someone has to go through after the fact and mark which version it gets merged into.

The solution that comes to my mind is:

  • If a PR has a breaking change, require that it also list the change in some kind of pre-update-check system.
  • Automation can pull those together for the build based on what PRs are included.
  • Before updating, HA runs the pre-update checks listed in the target build. If any fail, abort the update unless the failure gets explicitly ignored by the user.
  • The check isn’t present in just one build; it’s present in all future builds for some time period (a year?) to ensure if a user jumps over this build they still run the check.
  • After the metadata is in the PR, no manual effort is needed from anyone to collect breaking changes for a release; it should be 100% automated.
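
A rough sketch of how the steps above could fit together, purely to illustrate the idea. Everything here is hypothetical: the `PreUpdateCheck` structure, the function names, the 12-month retention, and the assumption that builds are identified by a "YEAR.MONTH" prefix. None of it is an existing HA API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class PreUpdateCheck:
    check_id: str                    # hypothetical id, e.g. derived from the PR
    introduced_in: str               # build containing the breaking change, "YEAR.MONTH"
    description: str
    passes: Callable[[Dict], bool]   # gets the user's config, True means safe


def parse_build(build):
    """Turn '2024.4' or '2024.4.1' into (2024, 4)."""
    year, month = build.split(".")[:2]
    return int(year), int(month)


def months_between(older, newer):
    (y1, m1), (y2, m2) = parse_build(older), parse_build(newer)
    return (y2 - y1) * 12 + (m2 - m1)


def checks_for_update(all_checks, installed, target, keep_months=12):
    """Checks introduced after the installed build, up to the target build,
    that are still inside the retention window (so skipped builds are covered)."""
    selected = []
    for check in all_checks:
        after_installed = months_between(installed, check.introduced_in) > 0
        age = months_between(check.introduced_in, target)
        if after_installed and 0 <= age <= keep_months:
            selected.append(check)
    return selected


def blocking_checks(checks, config, ignored):
    """Checks that fail and were not explicitly ignored by the user."""
    return [c for c in checks if c.check_id not in ignored and not c.passes(config)]
```

The update would proceed only if `blocking_checks(...)` comes back empty. The hard part stays the same as described above: someone has to write a meaningful `passes` function for each breaking PR.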

However, setting up a new pre-update-check system is a lot of work. It’s difficult to enforce that all PRs with breaking changes comply. And it can be confusing to new users if their update is blocked by a pre-update check.

The upside is that diligent users like me don’t need to read the “Breaking Changes” section and compare it to our own memory of what we use. The system could handle it for us. It would also make me personally feel safer setting up automatic updates. But implementation is a large undertaking.

I think the problem is that a breaking change in HA can cause issues in third-party integrations, and many of them might be unknown to the HA devs.
Even if a third-party integration uses the HA code that contains the breaking change, it might not use the exact part that the breaking change affects.

For core stuff, or known breaking changes, I think the MVP and reasonably achievable level of functionality here would be that the update gives a confirmation pop-up that lists the in-use integrations that have breaking changes, and asks the user to confirm before upgrading or back out to resolve the breaking changes.
I don’t think it would be reasonable for HA to have to determine if any of the specific breaking changes will impact dashboards, automations, etc.
It wouldn’t be a guarantee that the breaking changes would affect the user’s instance, just that they’re using things that do have breaking changes. It can be easy to miss something when scanning the release notes.

I like your approach! Besides being constructive, it shows the balance between investment and outcome.
So, I understand it might be a lot of work (maybe a hell of a lot) to accomplish, and it will never be 100%, if only because of unofficial add-ons.
From my perspective as a user:
The more fail-safe a system is, the better.
The more confidence a user has in their system, the sooner an update gets installed.
The more up to date the system is, the safer it is.

Sure, at some point we need to talk about how high the ‘price’ is, in terms of both one-time and recurring investment, and how much the system as a whole benefits from it.

Maybe the investment is too high.
But maybe, in the end, it no longer takes a week per release but only 2 days (as it is semi-automated) for manual quality checks. And, at least for the core and official system, 90% of the changes that break the system would be a thing of the past, as they couldn’t get installed.

Having said that I am no dev but just an average user, this might be a really stupid idea, but I’d like to share it anyway :smiley:

What if a second instance of the same HA were running (at lowest priority), logging a fresh boot, then installing the update and logging again?
Then those two logs could be compared. We could identify problems by showing only the differences, while the main system would still be up and running.
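
Just to illustrate the comparison step (not the second instance itself), a minimal sketch. It assumes each log line looks like "date time LEVEL message"; the real log format may differ, and the function names are my own.

```python
import re

# Assumed log line shape: "<date> <time> <LEVEL> <message>"; real logs may differ.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(\.\d+)?\s+")


def problem_lines(log_text):
    """Set of WARNING/ERROR lines with the timestamp stripped."""
    lines = set()
    for line in log_text.splitlines():
        stripped = TIMESTAMP.sub("", line).strip()
        if stripped.startswith(("WARNING", "ERROR")):
            lines.add(stripped)
    return lines


def new_problems(log_before, log_after):
    """Warnings and errors that appear only in the boot log after the update."""
    return sorted(problem_lines(log_after) - problem_lines(log_before))
```

Stripping the timestamps matters here; otherwise every line would count as a difference.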

That would mean the checking process alone would probably take, let’s say, 12 hours. Personally, I would be happy to accept that if, at the end, I know it’s safe, or at least where I should be looking.

Just an out-of-the-box idea that came to my mind :smiley:

It’s nice to know I’m not alone on this planet with my feature request; I maintain that this is along similar lines. And it’s not about testing the 2000 HACS integrations, but maybe installing the 30 most used ones and then installing the next update to see if HA survives :wink:

So the Alexa Media Player doesn’t work again, luckily I don’t use the thing, but it’s going in the same direction again, “HA breaks the player”, but it’s not HA that’s the problem, it’s the third party addon.

HA Core/Supervisor/OS Update Safeguard Holds - Feature Requests - Home Assistant Community

The first half of what you described is very common in the software industry for things like web apps: start a new instance (receiving no traffic), make sure it starts up okay, then cut over traffic to the new instance and shut down the old.

Unfortunately, software has to be written in a specific way to make this possible. In most cases (and definitely with HA), running multiple instances together will cause problems for both of the instances. They’ll be trying to do the same things at the same time and will conflict with each other. Refactoring the code to deal with this would be a very very large undertaking. Maybe even a ground-up rewrite.

So that’s probably not going to happen, haha!

What if they are sandboxed? A second, low-performance container?

But it’s linked to a pull request, right? Compiling a list of PRs tagged with “breaking changes” seems like exactly the kind of thing that could be automated via a github actions job that’s added to the release process. And maybe that’s already happening?

One plausible path to implementing something like this could be:

  • When building each release, something in CI compiles a list of PRs in this release which were tagged with breaking changes, and stores a list of [integration tagged in the PR, PR number] pairs somewhere like a json file that becomes one of the github release artifacts. I suspect something very close to this is already automated, but I don’t know the details.
  • Whatever it is that shows available updates in HAOS (I use the docker container, so I don’t have it) can compare that file (and the files for intermediate versions that are being skipped) to the list of integrations the user has installed, and present a list of “possibly incompatible” changes to the user in some sort of appropriate UI. If you don’t have the listed integration installed, then don’t bother showing it.
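
Here is a small sketch of those two halves, under stated assumptions: the PR dictionaries, the `breaking-change` label name, the JSON layout, and both function names are hypothetical, and the PR numbers in any usage are placeholders. I don't know what the real release tooling produces.

```python
import json


def build_manifest(version, merged_prs):
    """CI side: keep only PRs labelled as breaking and emit JSON text.

    merged_prs is assumed to be a list of dicts with "number",
    "integration" and "labels" keys (a made-up shape).
    """
    entries = [
        {"integration": pr["integration"], "pr": pr["number"]}
        for pr in merged_prs
        if "breaking-change" in pr["labels"]
    ]
    return json.dumps({"version": version, "breaking": entries}, indent=2)


def possibly_incompatible(manifests_json, installed_integrations):
    """Updater side: go through the manifests of every version being
    installed or skipped and keep entries for integrations the user has."""
    hits = []
    for text in manifests_json:
        manifest = json.loads(text)
        for entry in manifest["breaking"]:
            if entry["integration"] in installed_integrations:
                hits.append((manifest["version"], entry["integration"], entry["pr"]))
    return hits
```

The result is only a "you might want to look at these" list; it says nothing about whether the user's actual configuration is affected.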

This would not be perfect, but it would be an easy way to get most of the way there, in a “here’s what you might want to watch out for” kind of way.

The reason the “canary” solution @smartin mentioned would be so much work is because home assistant is not hermetically sealed. It wouldn’t be very useful if it was! It interacts with many resources like its database, a z-wave/zigbee stick, an MQTT broker, all your other add-ons, various cloud services, etc. None of those things are designed to be able to support multiple copies of the same HA at the same time. And there are so many different ways to configure home assistant that it doesn’t seem reasonable to try to simulate them all.

WTH is it so hard to "catch up" if you've skipped a lot of HA updates? (re: breaking changes) WTH to go with my prior comment in the thread…

I just got bit by this today. I upgraded Zigbee2MQTT and now my light sensors and Hue remotes don’t work, meaning I now have to use the phone app to turn lights on and off.

It’s bad enough checking this for HA — open a browser, find the blog, find the blog post with the release notes, then scroll all the way down to the end where the breaking changes are tucked away (you’d think something that could break your setup would be at the top!)

(Yes, I know there’s an anchor link — but I actually use an RSS reader to save having to try to find the blog post, and it uses a Reader view which annoyingly doesn’t support anchor links :)

But when you have other add-ons, and there isn’t even a link to the release notes, that’s a lot of spelunking.

At the bare minimum, there should be a way for add-on maintainers to display a link to the release notes for the current version, if they’ve written any — and preferably a “breaking” flag which would put a suitably noticeable warning on the screen.

Even better would be if this was some sort of RSS feed with the version number as the title, and breaking changes listed as the article body.

HA would need to display multiple items if upgrading by multiple versions (example: I am on 1.4.1. I didn’t upgrade to 1.4.2 or 2.0.0 because I was away; the current version is 2.0.1. The update screen should show the items for 1.4.2, 2.0.0, and 2.0.1 in this case).
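
The selection logic for that is small. A sketch, assuming plain dotted numeric versions and a feed given as (version, breaking-changes text) pairs; both the feed shape and the function names are my own invention:

```python
def parse_version(version):
    """'2.0.1' -> (2, 0, 1); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


def notes_to_show(feed, installed, target):
    """Feed entries newer than the installed version, up to and including
    the target version, sorted oldest first."""
    low, high = parse_version(installed), parse_version(target)
    selected = [(v, text) for v, text in feed if low < parse_version(v) <= high]
    return sorted(selected, key=lambda item: parse_version(item[0]))
```

With a feed covering 1.4.0 through 2.0.1, `notes_to_show(feed, "1.4.1", "2.0.1")` returns the entries for 1.4.2, 2.0.0, and 2.0.1, matching the example above.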

I don’t know how much this needs in the way of changes to the existing setup, though — and it depends on addon authors providing the release notes link (and the breaking changes, in the case of the RSS feed).

But simply allowing someone to hit the ‘update’ button and break their entire setup really doesn’t fit with the aim of making HA more usable for everyone.

There is a link to release notes for Z2M in the recent update.

It may not be clear but that’s what the addon developer decided to put in that message.

Why would they have to do that after the fact?

If the list of breaking changes is stored in git and expanded by every PR that is identified as breaking then it’s right there, all the time, in every git revision.

Of course, if the breakage is only found later that is more difficult to track.

I think you’re assuming that the maintainer of the Zigbee2MQTT addon is also directly involved in the development of the upstream container and software, which may not be the case.
