Home Assistant keeps growing, and being more user-friendly will become a key factor in making room for more non-developer users.
One thing that I believe needs to be done is to remake the system log so it makes sense to people with all levels of technical background.
Today, a log entry doesn’t even give you a hint of what kind of error it is or what to do to fix it.
For someone who is not a developer, it could just as well be in some long-forgotten language.
This is a good example.
An error has occurred 1300 times, but I still have no clue what the problem is.
We were 1 million users last year and are 2 million this year. Giving users of all levels the tools to fix as much as possible on their own will create both less frustration and fewer support issues.
I completely agree that improving the system log is essential for making Home Assistant more user-friendly. It’s frustrating when errors are vague and don’t provide actionable insights, especially for non-developers. A clearer log that explains errors in simple terms and offers troubleshooting steps could significantly enhance the user experience. With the growing user base, such changes would help reduce frustration and support requests.
This is a very understandable question. The problem with these errors in the log, though, is that most of them were not expected by the developer either. The one mentioned is a bug. Had the developer anticipated it, the exception would not have happened.
Often the developer does not know why they occur until they investigate and fix it. Expected problems usually already have nicer error handling. The errors in the log are not supposed to happen at all.
Imagine you try to start your car and nothing happens. It would be nice to have an actionable message on the dashboard saying you need to replace a specific broken part. But that is not possible, right? Someone needs to take a look to see what is going on.
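To make the distinction between expected and unexpected errors concrete, here is a minimal Python sketch. Everything in it (`DeviceUnreachable`, `fetch_state`, `update`, the logger name) is made up for illustration and is not Home Assistant code. The anticipated failure gets a readable, actionable message; the bug can only be logged with a traceback, because nobody knew it could happen.

```python
import logging
from typing import Optional

_LOGGER = logging.getLogger("example_integration")  # hypothetical logger name


class DeviceUnreachable(Exception):
    """Hypothetical error that the developer anticipated."""


def fetch_state(device: dict) -> str:
    """Pretend to read a device; both failure modes are contrived."""
    if not device.get("online", True):
        raise DeviceUnreachable(device["host"])
    # Deliberate bug: the developer assumed "state" is always present.
    return device["state"].upper()


def update(device: dict) -> Optional[str]:
    """Return the device state, or None if the update failed."""
    try:
        return fetch_state(device)
    except DeviceUnreachable as err:
        # Anticipated: a clear, actionable message is possible.
        _LOGGER.warning("Device %s is unreachable; check power and network", err)
    except Exception:
        # Not anticipated: all that can be logged is the traceback.
        _LOGGER.exception("Unexpected error updating %s", device.get("host"))
    return None
```

Calling `update({"host": "192.0.2.10"})` hits the second branch: the log shows a `KeyError` traceback, which is roughly what ends up in the system log today.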
IF Answer = "YES" THEN CALL DO_YES()
ELSE IF Answer = "NO" THEN CALL DO_NO()
ELSE CALL LogError("Error_1234:This will never happen")
END IF
Our LogError function includes a GUID parameter. If the error is triggered, a record is inserted in the database [LOG] table. At that point, if an error definition for that GUID does not exist, one is created with Status “Needs details adding”.
Also, when the original code is created our “template” for adding the CALL LogError() function includes some one-time code to run, which uses the GUID (and error message) to pre-create an error definition record (along with a couple of other parameters - for example a value for the priority of fixing any issue that arose).
We don’t bother to use that for “Never expected to happen” errors - they are just a catch, they cause a fatal error (of that sub-process at least), the user will probably contact “support”, the database has logged the error - Support can see if that error got hit once, or repeatedly, and so on.
In contrast, an example of a pre-defined error might be for a legacy parameter being used in a call to a function. “We thought we’d found-and-fixed them all, but we left the old parameter in, for now, and programmed it for backward-compatibility, but we’d like to LOG if anything does actually trigger it.” And when nothing has, for X months, we can consider physically removing the now-redundant function parameter.
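For anyone who wants to play with the idea, the scheme described above could be sketched roughly like this in Python with SQLite. This is not our actual system: the table layout, the function names, the "Defined" status and the two GUID values are all assumptions made up for the example. Only the [LOG] table, the GUID parameter and the “Needs details adding” status come from the description.

```python
import sqlite3

NEEDS_DETAILS = "Needs details adding"

# Hypothetical GUIDs; in practice each is generated once and pasted into the code.
NEVER_HAPPENS = "3f2b6c1e-0000-4000-8000-000000000001"
LEGACY_PARAM = "3f2b6c1e-0000-4000-8000-000000000002"


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with an assumed two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE error_definition (
            guid TEXT PRIMARY KEY,
            message TEXT NOT NULL,
            status TEXT NOT NULL,
            priority INTEGER
        );
        CREATE TABLE log (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            guid TEXT NOT NULL,
            message TEXT NOT NULL,
            logged_at TEXT DEFAULT CURRENT_TIMESTAMP
        );
        """
    )
    return conn


def register_error(conn, guid, message, priority):
    """One-time step: pre-create a definition for an anticipated error."""
    conn.execute(
        "INSERT OR IGNORE INTO error_definition VALUES (?, ?, 'Defined', ?)",
        (guid, message, priority),
    )


def log_error(conn, guid, message):
    """Record one occurrence; add a placeholder definition if none exists."""
    conn.execute("INSERT INTO log (guid, message) VALUES (?, ?)", (guid, message))
    conn.execute(
        "INSERT OR IGNORE INTO error_definition VALUES (?, ?, ?, NULL)",
        (guid, message, NEEDS_DETAILS),
    )


def occurrences(conn, guid):
    """How many times has this error been hit?"""
    row = conn.execute("SELECT COUNT(*) FROM log WHERE guid = ?", (guid,)).fetchone()
    return row[0]


def status(conn, guid):
    """Current status of the error definition for this GUID."""
    row = conn.execute(
        "SELECT status FROM error_definition WHERE guid = ?", (guid,)
    ).fetchone()
    return row[0]


conn = open_db()
# Pre-defined error: we want to know if the legacy parameter is still used.
register_error(conn, LEGACY_PARAM, "Legacy parameter used", priority=3)
# "Never expected to happen" error: no definition, it is created on first hit.
log_error(conn, NEVER_HAPPENS, "Error_1234:This will never happen")
log_error(conn, NEVER_HAPPENS, "Error_1234:This will never happen")
```

Support can then query `occurrences(conn, NEVER_HAPPENS)` to see whether the error was hit once or repeatedly, and a zero count for `LEGACY_PARAM` over X months is the signal that the old parameter can go.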
I’m sure all this is obvious to programming folk … but I just mention it in case it is useful for creative thought. If I had my time over, one of the first things I would build is an error logging function with all the abilities that my current system has. It would be nice if the stuff I wrote decades ago did not have awful and useless error logging …
And now imagine building a database of exceptions for all native integrations, all custom integrations, all forks of integrations, …
Plus, an ID added where the exception was caught might be linked to an entirely different cause for someone else. Let’s say a template error, … And an error stating a specific but wrong cause can be more damaging than a generic one.
I do not fully agree… There are cases where log entries can be greatly improved. One I’m struggling with a lot is SNMP:
2025-05-09 23:30:15.252 WARNING (MainThread) [homeassistant.components.sensor] Updating snmp sensor took longer than the scheduled update interval 0:00:05
If a specific SNMP sensor does not respond, what is the problem with including at least the IP address of the respective device?
There are tons of similar cases, where troubleshooting is not possible (or very hard), just because simple information about target device/entity is missing…
Unfortunately that error message “Updating << sensorname >> sensor took longer than the scheduled update interval” applies to other types of sensors, for which IP address would not be the identifying information.
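One possible way to reconcile the two points above is to keep the generic wording and append whatever identifiers happen to be known. The following is only a sketch of that idea in Python: `slow_update_message`, the `entity_id` and `target` parameters, and the example values are hypothetical and are not how Home Assistant currently builds this message.

```python
from datetime import timedelta


def slow_update_message(platform, domain, interval, entity_id=None, target=None):
    """Build the warning text, appending whichever identifiers are known."""
    message = (
        f"Updating {platform} {domain} took longer than "
        f"the scheduled update interval {interval}"
    )
    details = []
    if entity_id:
        # Generic identifier that would apply to any kind of sensor.
        details.append(f"entity={entity_id}")
    if target:
        # Optional, integration-specific hint such as an IP address.
        details.append(f"target={target}")
    if details:
        message += " (" + ", ".join(details) + ")"
    return message


text = slow_update_message(
    "snmp",
    "sensor",
    timedelta(seconds=5),
    entity_id="sensor.router_uptime",  # made-up entity
    target="192.0.2.10",  # documentation-range address, not a real device
)
```

Sensors with no meaningful address simply omit `target`, so the message degrades to the current generic text instead of showing misleading information.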
I do agree that cryptic messages without any indication of which device or integration they came from certainly make it hard (and sometimes impossible) for users to guess at how to fix the problem. They also make it difficult (especially for non-developers) to provide all the relevant info on a GitHub PR.
I think the most practical step would be to review error messages as part of a QA phase … though that would mostly apply to those integrations seeking higher Quality rating.