The title says it all.
It was a big step to inform users in the UI about what broke.
For a while now, a warning sometimes gets issued when a breaking change is foreseeable, including the version in which it will break existing setups.
If the goal is to have an HA that is always up and running, it would be perfect if an update could only be installed once the broken functionality is either resolved or actively dismissed.
That information is only tracked in the blog posts; the code has no idea what's breaking or not. I doubt this will ever get added beyond the current repair system, i.e. the code only knows it's broken after the update occurs.
I interpreted the WTH more as a kind of brainstorming: what would be really needed, really awesome.
However, sure, it isn't trivial. And I can't provide solid solutions.
But, ignoring that it might take quite some CPU time, couldn't all YAML configuration just be scanned for statements indicating a problem? Couldn't versions of add-ons etc. be checked for compatibility?
Or could the breaking changes be simulated: if the new call works it's fine, and if it doesn't but the old one does, that indicates a breaking change?
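For the YAML part, a very rough sketch of what such a scan could look like. The deprecation list here is invented, and real HA configuration uses custom tags like !include that plain YAML parsing cannot handle, so this only illustrates the idea:

```python
# Rough sketch only: scan a config directory for keys that a hypothetical
# deprecation list says will break in an upcoming release.
from pathlib import Path

import yaml  # pip install pyyaml

# Invented example data; in reality someone would still have to maintain this list.
DEPRECATED_KEYS = {
    "some_old_platform": "stops working in a future release, use the UI config flow instead",
}


def find_deprecated(config_dir: str) -> list[str]:
    findings = []
    for path in Path(config_dir).rglob("*.yaml"):
        try:
            data = yaml.safe_load(path.read_text())
        except yaml.YAMLError:
            continue  # files using HA-specific tags (!include, !secret) are skipped here
        if not isinstance(data, dict):
            continue
        for key, hint in DEPRECATED_KEYS.items():
            if key in data:
                findings.append(f"{path}: '{key}' -> {hint}")
    return findings


if __name__ == "__main__":
    for warning in find_deprecated("/config"):
        print(warning)
```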
We chose to go to the Moon. Not because it's easy, but because it's hard.
It's not a CPU issue, it's a management issue. Who's going to track all these changes, how will the data be gathered, and where will it be gathered from? Right now it takes one person a week to tabulate the breaking changes (and we often get it wrong), right before the release is created. You're essentially asking this person to double their work within an already limited time frame.
I am not asking anybody who doesn't have enough time to have even less of it.
I'm describing what I consider a big problem, or, to put it positively, what might be a solution.
If we only think about time, then everything in Home Assistant costs time.
Maybe a breaking change tracker (basically a list) would be an improvement? It has to be noted somewhere anyhow.
Why not a checklist where all possible breaking changes have to be noted, as a mandatory check to pass a PR?
And if it's already known what is breaking, couldn't that be fed into such a list?
Well, that's the thing: you are. Even if the process is automated and a required step in the PR, people still have to know it's a breaking change and add it to a list. Secondly, PRs are cherry-picked into builds, and someone manages that. One large list that is built before the build is made will not satisfy your WTH, because the breaking change isn't linked to a version. So making it a requirement for the PR doesn't help, because someone has to go through after the fact and mark which version it gets merged into.
If a PR has a breaking change, require that it also list the change in some kind of pre-update-check system.
Automation can pull those together for the build based on what PRs are included.
Before updating, HA runs the pre-update checks listed in the target build. If any fail, abort the update unless the failure is explicitly dismissed by the user.
The check isn't present in just one build; it's present in all future builds for some time period (a year?) to ensure that if a user jumps over this build they still run the check.
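As a rough illustration of that flow (everything here is hypothetical: the artifact name, its fields, and the dismissal mechanism are assumptions, not an existing HA feature):

```python
# Hypothetical pre-update check runner, sketching the proposal above.
import json
from pathlib import Path


def applicable_checks(artifact: Path, installed: set[str]) -> list[dict]:
    """Breaking-change entries from the target release that affect this installation."""
    checks = json.loads(artifact.read_text())
    return [c for c in checks if c["integration"] in installed]


def can_update(artifact: Path, installed: set[str], dismissed_prs: set[int]) -> bool:
    """True if nothing blocks the update; prints anything that does."""
    blocking = [c for c in applicable_checks(artifact, installed) if c["pr"] not in dismissed_prs]
    for check in blocking:
        print(f"Blocked by PR #{check['pr']} ({check['integration']}): {check['description']}")
    return not blocking


if __name__ == "__main__":
    installed = {"mqtt", "zwave_js"}      # example: integrations in use on this install
    dismissed: set[int] = set()           # example: PR numbers the user explicitly accepted
    if can_update(Path("pre_update_checks.json"), installed, dismissed):
        print("All pre-update checks passed; safe to update.")
```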
After the metadata is in the PR, there is no manual effort for anyone to collect breaking changes for a release; it should be 100% automated.
However, that's a lot of work to set up a new pre-update-check system. It's difficult to enforce that all PRs with breaking changes comply. It can be confusing to new users if their update fails a pre-update check.
The upside is that diligent users like me don't need to read the "Breaking Changes" section and compare it to our own memory of what we use. The system could handle it for us. It would also make me personally feel safer setting up automatic updates. But implementation is a large undertaking.
I think the problem is that a breaking change in HA can cause issues in third-party integrations, and many of them might be unknown to the HA devs.
Just because a third-party integration uses the HA code with the breaking change, it might still not use the exact code that the breaking change affects.
For core stuff, or known breaking changes, I think the MVP and a reasonably achievable level of functionality would be for the update to show a confirmation pop-up that lists the in-use integrations that have breaking changes, and asks the user to confirm before upgrading or to back out and resolve the breaking changes.
I don't think it would be reasonable for HA to have to determine whether any of the specific breaking changes will impact dashboards, automations, etc.
It wouldn't be a guarantee that the breaking changes would affect the user's instance, just that they're using things that do have breaking changes. It can be easy to miss something when scanning the release notes.
I like your approach! Besides being constructive, you show the balance between investment and outcome.
So, I understand it might be a lot of work (maybe a hell of a lot) to accomplish, and it will never be 100%, if only because of unofficial add-ons.
From my perspective as a user:
The more failsafe a system is, the better.
The more a user trusts their system, the sooner an update gets installed.
The more up-to-date the system is, the safer it is.
Sure, at some point we need to talk about how high the "price" is, in terms of one-time and recurring investment, and how much the system as a whole benefits from it.
Maybe the investment is too high.
But maybe in the end it doesn't take a week per release, but only two days (as it is semi-automated) to do the manual quality checks. And, at least for the core and official system, changes that break the system are 90% a thing of the past, as they can't get installed.
Having said that I am no dev but just an average user, this might be a really stupid idea, but I'd like to share it anyway.
What if a second instance of the same HA were running (at lowest priority), logging a fresh boot? Then installing the update, and logging again.
Then those two logs could be compared. We could identify problems by showing only the differences, while the main system would still be up and running.
That would mean it would probably take, let's say, 12 hours just for the checking process. Personally, I would be happy to accept that if at the end I know it's safe, or where I should be looking.
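The comparison step itself could be quite simple. Here is a sketch under the assumption that the two boot logs have been saved to files (the file names are placeholders; HA has no built-in canary mechanism like this):

```python
# Sketch of the log-diff idea: keep only warnings/errors, strip timestamps,
# and show whatever is new after the update.
import re


def significant_lines(path: str) -> set[str]:
    """Collect WARNING/ERROR lines with leading timestamps stripped."""
    lines = set()
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            if "WARNING" in line or "ERROR" in line:
                lines.add(re.sub(r"^[\d\-:. ]+", "", line).strip())
    return lines


before = significant_lines("home-assistant-before.log")  # placeholder file names
after = significant_lines("home-assistant-after.log")

for message in sorted(after - before):
    print("New after update:", message)
```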
It's nice to know I'm not alone on this planet with my feature request; I maintain that this is along similar lines. And it's not about testing the 2000 HACS integrations, but maybe installing the 30 most used and installing the next update to see if HA survives.
So the Alexa Media Player doesn't work again. Luckily I don't use the thing, but it's going in the same direction again: "HA breaks the player", when it's not HA that's the problem, it's the third-party add-on.
The first half of what you described is very common in the software industry for things like web apps: start a new instance (receiving no traffic), make sure it starts up okay, then cut over traffic to the new instance and shut down the old.
Unfortunately, software has to be written in a specific way to make this possible. In most cases (and definitely with HA), running multiple instances together will cause problems for both of the instances. They'll be trying to do the same things at the same time and will conflict with each other. Refactoring the code to deal with this would be a very, very large undertaking. Maybe even a ground-up rewrite.
But it's linked to a pull request, right? Compiling a list of PRs tagged with "breaking changes" seems like exactly the kind of thing that could be automated via a GitHub Actions job that's added to the release process. And maybe that's already happening?
One plausible path to implementing something like this could be:
When building each release, something in CI compiles a list of PRs in this release which were tagged with breaking changes, and stores a list of [integration tagged in the PR, PR number] pairs somewhere like a JSON file that becomes one of the GitHub release artifacts. I suspect something very close to this is already automated, but I don't know the details.
Whatever it is that shows available updates in HAOS (I use the Docker container, so I don't have it) can compare that file (and the files for intermediate versions that are being skipped) to the list of integrations the user has installed, and present a list of "possibly incompatible" changes to the user in some sort of appropriate UI. If you don't have the listed integration installed, then don't bother showing it.
This would not be perfect, but it would be an easy way to get most of the way there, in a "here's what you might want to watch out for" kind of way.
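The CI half of this could plausibly be a small script against the GitHub search API. The label names ("breaking-change", "integration: ...") and the milestone value below are assumptions about how home-assistant/core PRs are labeled, not verified details of the actual release process:

```python
# Sketch: collect [integration, PR number] pairs for merged PRs labeled as breaking
# changes in a given milestone. Label/milestone conventions are assumed, and
# pagination is omitted for brevity.
import json

import requests  # pip install requests


def breaking_changes_for_milestone(milestone: str) -> list[dict]:
    query = (
        f'repo:home-assistant/core is:pr is:merged '
        f'milestone:"{milestone}" label:"breaking-change"'
    )
    resp = requests.get(
        "https://api.github.com/search/issues",
        params={"q": query, "per_page": 100},
        headers={"Accept": "application/vnd.github+json"},
        timeout=30,
    )
    resp.raise_for_status()
    entries = []
    for pr in resp.json()["items"]:
        # Assumed convention: the affected integration comes from labels like "integration: mqtt".
        domains = [
            label["name"].split(":", 1)[1].strip()
            for label in pr["labels"]
            if label["name"].startswith("integration:")
        ]
        for domain in domains:
            entries.append({"integration": domain, "pr": pr["number"], "title": pr["title"]})
    return entries


if __name__ == "__main__":
    # Write out the release artifact the updater could later compare against installed integrations.
    print(json.dumps(breaking_changes_for_milestone("2024.6.0"), indent=2))
```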
The reason the "canary" solution @smartin mentioned would be so much work is because Home Assistant is not hermetically sealed. It wouldn't be very useful if it was! It interacts with many resources like its database, a Z-Wave/Zigbee stick, an MQTT broker, all your other add-ons, various cloud services, etc. None of those things are designed to be able to support multiple copies of the same HA at the same time. And there are so many different ways to configure Home Assistant that it doesn't seem reasonable to try to simulate them all.