Request to bring back YAML - arguments against The Future of YAML

nitobuendia · June 7, 2020, 9:18am

Important Note: This post is a response to The Future of YAML which intends to deprecate YAML support for device integrations. All the contents were published there, but moderators decided to split the conversation without author’s consent. If you need context on the conversation, you may want to read the original post first.

Read full and updated version on GitHub.

What’s the reason to drop YAML support from integrations

In the post The future of YAML, Home-Assistant team explains the decision of ADR0010.

This decision effectively makes support for YAML on Integrations maintenance mode only:

Any new integration that communicates with devices and/or services, must use configuration via the UI. Configuration via YAML is only allowed in very rare cases, which will be determined on a case by case basis.

Existing integrations that communicate with devices and/or services, are allowed and encouraged to implement configuration via the UI and remove YAML support.

We will no longer accept any changes to the YAML configuration for existing integrations that communicate with devices and/or services.

Some of the reasons given for this decision are covered in the sections below.

If you do not want to read the whole thing, jump to the conclusions; and only read the sections where you do not agree.

Making things easier

“Making things easier” by “enabling and empowering people with managing their Home Assistant instance via the user interface”.

It is undoubtedly true that using UI is much easier for most basic cases, non-Tech savvy people and the ones who prefer convenience; all of which are reasons mentioned in the article.

Making UI first is a great decision for Home-Assistant. However, making UI only is a problem as it breaks a few user flows.

TODO(nitobuendia): add article on “broken user flows” article.

As such, it is not true that this makes things easier for everyone; it also makes things harder for many users and use cases, some of which happen to be the ones contributing to the ecosystem of components or add-ons.

Breaking changes

Advanced users: the right to break your system

We have all been there: the system broke after an update because the interfaces changed. It is painful. The Home-Assistant team claims that the UI approach is the solution.

First and foremost, this is not just about not breaking the system, but about choice.

Today, in Lovelace you can manage your Dashboards in the UI. However, you can also switch to manual mode and configure them in YAML. Of course, by doing so you can break the UI with the wrong code. However, this is supported.

In other words, it is possible to have a UI-first method for those who do not want to break the system and want the ease; and a YAML method with other advantages, but with the disclaimer that it is at your own risk.

As such, avoiding breakage is not reason enough to be the main driver, because breakage is a risk that users voluntarily opt into when the UI is the default.

The fix is not exclusive to the UI system

Yes, the new UI config flow will solve the breaking changes. However, they are not telling you why or how the UI is able to solve the breaking changes.

The problem comes from a change in schema, new fields may be required or dropped, or existing ones changed. Suddenly, the data you have does not match the data you need.

The way the UI system handles this is by creating a migration. This is an example from the ps4 component:

async def async_migrate_entry(hass, entry):
    """Migrate old entry."""
    config_entries = hass.config_entries
    data = entry.data
    version = entry.version

    _LOGGER.debug("Migrating PS4 entry from Version %s", version)

    reason = {
        1: "Region codes have changed",
        2: "Format for Unique ID for entity registry has changed",
    }

    # Migrate Version 1 -> Version 2: New region codes.
    if version == 1:
        loc = await location.async_detect_location_info(
            hass.helpers.aiohttp_client.async_get_clientsession()
        )
        if loc:
            country = loc.country_name
            if country in COUNTRIES:
                for device in data["devices"]:
                    device[CONF_REGION] = country
                version = entry.version = 2
                config_entries.async_update_entry(entry, data=data)
                _LOGGER.info(
                    "PlayStation 4 Config Updated: \
                    Region changed to: %s",
                    country,
                )

    # Migrate Version 2 -> Version 3: Update identifier format.
    if version == 2:
        # Prevent changing entity_id. Updates entity registry.
        registry = await entity_registry.async_get_registry(hass)

        for entity_id, e_entry in registry.entities.items():
            if e_entry.config_entry_id == entry.entry_id:
                unique_id = e_entry.unique_id

                # Remove old entity entry.
                registry.async_remove(entity_id)

                # Format old unique_id.
                unique_id = format_unique_id(entry.data[CONF_TOKEN], unique_id)

                # Create new entry with old entity_id.
                new_id = split_entity_id(entity_id)[1]
                registry.async_get_or_create(
                    "media_player",
                    DOMAIN,
                    unique_id,
                    suggested_object_id=new_id,
                    config_entry=entry,
                    device_id=e_entry.device_id,
                )
                entry.version = 3
                _LOGGER.info(
                    "PlayStation 4 identifier for entity: %s \
                    has changed",
                    entity_id,
                )
                config_entries.async_update_entry(entry)
                return True

    msg = f"""{reason[version]} for the PlayStation 4 Integration.
            Please remove the PS4 Integration and re-configure
            [here](/config/integrations)."""

    hass.components.persistent_notification.async_create(
        title="PlayStation 4 Integration Configuration Requires Update",
        message=msg,
        notification_id="config_entry_migration",
    )
    return False

What this method does is read all the configuration, change the structure of the data and update the version so it can be used with the current schema.

This is great, it is a great idea and way to handle it. However, what they do not tell you is that this is not something that can only be done with the new .storage JSON files.

The data in config_entries = hass.config_entries and data = entry.data are structured data similar to a dictionary. This data could also come from the YAML files and be supported exactly the same way.

As such, while this is a great improvement, it is not an improvement that is exclusive to the new UI methodology, but something that can easily be ported to YAML configuration as well.
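A minimal sketch of that idea, assuming the YAML file has already been parsed into a plain dictionary that carries a hypothetical `version` key. The function names and the two schema changes are illustrative only; none of them are Home Assistant APIs.

```python
import copy


# Hypothetical migration steps: each takes the old data and returns the new shape.
def _v1_to_v2(data):
    # Example change: a single "host" key becomes a list of devices.
    data = dict(data)
    data["devices"] = [{"host": data.pop("host")}]
    return data


def _v2_to_v3(data):
    # Example change: every device gains a required "region" field.
    data = dict(data)
    data["devices"] = [
        {**device, "region": device.get("region", "unknown")}
        for device in data["devices"]
    ]
    return data


MIGRATIONS = {1: _v1_to_v2, 2: _v2_to_v3}
CURRENT_VERSION = 3


def migrate(config):
    """Return a copy of config upgraded step by step to CURRENT_VERSION."""
    config = copy.deepcopy(config)
    version = config.get("version", 1)
    while version < CURRENT_VERSION:
        config = MIGRATIONS[version](config)
        version += 1
        config["version"] = version
    return config


# Stand-in for what a YAML parser would return for an old configuration.
old_config = {"version": 1, "host": "192.168.1.2", "name": "Living Room"}
new_config = migrate(old_config)
```

The migration never writes back to the user's file; it only upgrades the in-memory data before it is handed to the integration, which is the same role the version check plays in the PS4 example.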

Privacy and Security

One of the claims to make the new changes is privacy and security. Home-Assistant has access to many APIs that can affect your home, life and expose your personal information.

It is true that this data should be protected as much as possible. The reasoning here is that having this data in YAML is unsafe. However, the new system stores and uses this data just as much as the old one.

These are some excerpts from core.config_entries:

Hue:

  {
      "connection_class": "local_poll",
      "data": {
          "bridge_id": "00178-redacted-FE6A3B07",
          "host": "192.168.1.2",
          "username": "zA2ksQx-redacted-hBaXnzEO80g1KKjKxrDaYpNao"
      },
      "domain": "hue",
      "entry_id": "dd73b58e701f4b0486be84a80c18d592",
      "options": {},
      "source": "user",
      "system_options": {
          "disable_new_entities": false
      },
      "title": "Living Room",
      "unique_id": "0017886a3b07",
      "version": 1
  },

Spotify:

  "data": {
      "auth_implementation": "spotify",
      "id": "fv-user-redacted--dia",
      "name": "fv-user-redacted--dia",
      "token": {
          "access_token": "BQAUdEGaP1elVLZvPRcYUG-token-redacted-GvIxuCJX4weTy2jMSoeX0bh4ntHkjt_pjBx3MLgxVWRcQFiFUaq6pgdS3e5w2J5e25V0f3S76Fr8X-Br8-GKRuznjd4kC4",
          "expires_at": 1591521655.1966686,
          "expires_in": 3600,
          "refresh_token": "AQC7-z8BykITxy06fOx_YkP67u-refresh-token-redacted-_rMj6F4CJdep-XeWNAsS9IytKkcAc18x-9N6LvX1O4o-ddxKnhkx9veLxvqN",
          "scope": "user-modify-playback-state user-read-playback-state user-read-private",
          "token_type": "Bearer"
      }
  },

The current system still stores the same sensitive data like tokens, or usernames.

The only use case where this would be true is when sensitive data like a username and password is required only once and not thereafter. In these cases, this should not be part of the YAML configuration either. It should go through an OAuth flow that retrieves the data and stores it safely (e.g. tokens).

Now, this is a good idea. Separating PII and sensitive details like usernames and passwords (why are we using those in the first place?!) or tokens and storing them securely is a priority. However, they still need to be stored, because they need to be used.

And yes, of course, this requires UI, no one said the opposite; but the basic configuration still remains in YAML and can be used anywhere.

As such, this point is about improving the onboarding OAuth process, not about whether data should be stored in JSON under .storage or YAML. The approach of having the configuration in YAML and storing sensitive data in private files still works. Even better if we start encrypting and protecting this data rather than just hiding it and hoping to get security through obscurity.
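The split between shareable configuration and private credentials can be sketched as follows. Home Assistant's YAML loader has a `!secret` tag; this sketch only imitates the idea with a plain string prefix on already-parsed data, and the names `resolve_secrets`, `public_config` and `private_secrets` are hypothetical.

```python
SECRET_PREFIX = "!secret "  # marker used only in this sketch


def resolve_secrets(config, secrets):
    """Return a copy of config with secret references replaced by real values."""
    if isinstance(config, dict):
        return {key: resolve_secrets(value, secrets) for key, value in config.items()}
    if isinstance(config, list):
        return [resolve_secrets(item, secrets) for item in config]
    if isinstance(config, str) and config.startswith(SECRET_PREFIX):
        # Look up the referenced name in the private mapping.
        return secrets[config[len(SECRET_PREFIX):]]
    return config


# Shareable part: structure only, no credentials.
public_config = {"host": "192.168.1.2", "token": "!secret ps4_token"}
# Private part: kept out of version control (placeholder value).
private_secrets = {"ps4_token": "example-token"}

runtime_config = resolve_secrets(public_config, private_secrets)
```

Only `public_config` would ever be committed or shared; the real token exists solely in the private mapping and in the resolved runtime copy.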

A big maintenance cost

One of the main reasons is that doing all that was said before and maintaining a dual system would yield high maintenance cost for the contributors.

Some contributors have decided to remove the YAML support to reduce their maintenance and support burden. The amount of energy that needs to be put in (to maintain both capabilities) can be too much and is complex. We have to understand and accept that. If we do not do that, a contributor could simply stop contributing.

It is partially true, but only partially.

Where is the data?

What we have not been told is where those complaints are, or who those contributors are. The objective is not to point fingers, but transparency:

How many contributors have complained?
How many contributors have left because of YAML support?
How does this compare to the ones that are complaining against the removal of YAML?

Checking the first ~100 responses on the blog post, these are some of the numbers that we get:

Blog post: 1
  + Positive: 7
  + Neutral: 8
  + Negative: 18
  + Replies to others: 8
= Total Unique: 42
  + Repeated: 55
= Total Responses: 97

Crunching the numbers:

43% of the unique users in the first 100 posts are against the change.
Roughly 2.5x as many users are against the change as are in favour.

Extending the numbers:

At the moment, there are 600 responses on the post. Assuming the same ratio:
- In 100 responses, 18 were unique users with a negative view of the change.
- In 600 responses, this could be 108 users.

We could say that 108 users are not that many. However, let’s not forget that this represents 43% of the ones who participated and that for every user who is actively complaining there are many who are not. For example, some studies say that only 1 in 25 annoyed users would complain. If this is true, we may have over 2,500 users who are not happy with the decision.
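The arithmetic behind these figures can be reproduced in a few lines. The counts are the ones listed above; the variable names are illustrative, and the extrapolation rests on the same assumptions stated in the text (a constant ratio across all 600 responses, and 1 in 25 unhappy users complaining).

```python
# Counts from the first ~100 responses, as listed above.
blog_post, positive, neutral, negative, replies_to_others = 1, 7, 8, 18, 8
repeated = 55

unique_users = blog_post + positive + neutral + negative + replies_to_others
total_responses = unique_users + repeated

negative_share = negative / unique_users      # share of unique users against
negative_to_positive = negative / positive    # against vs. in favour

# Extrapolation, assuming 18 negative users per ~100 responses holds for 600.
responses_now = 600
estimated_negative = negative * responses_now // 100

# Assumption quoted in the post: only 1 in 25 unhappy users complains.
estimated_unhappy = estimated_negative * 25

print(unique_users, total_responses)
print(f"{negative_share:.0%}", round(negative_to_positive, 2))
print(estimated_negative, estimated_unhappy)
```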

YAML is not the cost

YAML is just a structured language. This is currently being replaced with JSON files. Both are equivalent and interchangeable if they have the same or equivalent structure.

Example:

host: 127.0.0.1
name: Device

{
  "host": "127.0.0.1",
  "name": "Device"
}

{
  'host': '127.0.0.1',
  'name': 'Device',
}

All these representations are equivalent and can be translated to each other. As such, this is not an inherent problem to YAML itself.
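A quick way to check that equivalence, as a sketch: the JSON and Python forms are parsed with the standard library, while the YAML form assumes the third-party PyYAML package is installed and is skipped otherwise.

```python
import ast
import json

json_text = '{"host": "127.0.0.1", "name": "Device"}'
python_text = "{'host': '127.0.0.1', 'name': 'Device',}"
yaml_text = "host: 127.0.0.1\nname: Device\n"

from_json = json.loads(json_text)
from_python = ast.literal_eval(python_text)

# YAML needs a third-party parser; PyYAML is assumed here and skipped if absent.
try:
    import yaml
    from_yaml = yaml.safe_load(yaml_text)
except ImportError:
    from_yaml = None

# All parsed forms end up as the same dictionary.
same = from_json == from_python and (from_yaml is None or from_yaml == from_json)
print(same)
```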

Additionally, YAML support is still available in other areas, so the capability to make that translation is still in the system and maintained. In any case, no one would complain if YAML gets changed to JSON or any other structured language. The argument here is to have an easily configurable format that can be created, edited or copied over.

As such, it is not a problem with YAML, but with how it is being used to create devices; and this is addressed below.

A simple solution

There are a few ways this can be solved. The easiest one would be an Adapter Pattern, which basically translates the YAML configuration into the same data the UI flow produces.

This is an example of an implementation of async_setup_platform that reads the data from configuration.yaml and passes it to the PS4 component (which can only be configured in the UI):

import collections

# Minimal stand-in for a config entry: just the entry ID and its data.
Config = collections.namedtuple(
    'Config', f'{const.CONF_ENTRY_ID} {const.CONF_DATA}')


async def async_setup_platform(
        hass, config, async_add_entities, discovery_info=None):
  """Loads configuration and delegates to official integration."""
  # Load configuration.yaml
  host = config.get(const.CONF_HOST)
  name = config.get(const.CONF_NAME, const.DEFAULT_NAME)
  region = config.get(const.CONF_REGION)
  token = config.get(const.CONF_TOKEN)

  # Format it in the new config_entry JSON format.
  config_entry = Config(
      util.slugify(name),
      {
          const.CONF_TOKEN: token,
          const.CONF_DEVICES: [
              {
                  const.CONF_HOST: host,
                  const.CONF_NAME: name,
                  const.CONF_REGION: region,
              },
          ],
      }
  )

  await ps4_media_player.async_setup_entry(
      hass, config_entry, async_add_entities)

  return True

This is much simpler logic than the one that was added for the migration across versions. Additionally, this only needs to be updated if there are ever changes in schema.

TODO(nitobuendia): add further explanation on how this could be solved by Home-Assistant team.

Respecting contributors time

In the post, Home-Assistant team celebrates the work of contributors.

Those contributors do this in their free spare time, for which we all are eternally grateful. It is their work that enables Home Assistant to do what it can right now. It is what automates your home.

This is great. We all are grateful and love this work. They go even further to say that:

Unfortunately, such a move creates breaking changes and often leads to a few pretty de-motivating comments, towards the contributor and the project in general. This is harmful to everybody, as the contributors get demotivated or, even worse, don’t want to implement new features or create a breaking change.

This sounds tough. No one wants to be in that situation. However, this is Home-Assistant’s decision in the same post:

Any new integration that communicates with devices and/or services, must use configuration via the UI. Configuration via YAML is only allowed in very rare cases, which will be determined on a case by case basis.

Existing integrations that communicate with devices and/or services, are allowed and encouraged to implement configuration via the UI and remove YAML support.

We will no longer accept any changes to the YAML configuration for existing integrations that communicate with devices and/or services.

In other words, if you are a contributor who wants to dedicate your “spare time” to “enables Home Assistant to do what it can right now”, you are not allowed. This is “de-motivating” for the contributor who wants to create value; it risks that “contributors get demotivated” and is “harmful to everybody”, as it affects both the users (who want the YAML support) and the contributors who are willing to invest the time to add it.

If adding YAML support is really costly and troublesome (which we already explained that it doesn’t need to be), then let’s make it optional. Contributors are not required to implement it. However, if a contributor is willing to spend their spare time, let’s allow and encourage them to do it as it creates value for the users.

As such, if we really care about users and contributors, we should encourage contributors to create value; not discourage them from contributing.

Conclusions

UI makes it easier for many users; but removing YAML breaks some user flows.
The migration system implemented to avoid breaking changes can also work for YAML configuration.
The problem of breaking exists in other systems like Lovelace Dashboards; this duality of UI with manual advanced configuration empowers both “convenient” and “advanced” users with different benefits and costs for each.
Yes, storing sensitive data and tokens is problematic. Splitting configuration and sensitive data in a good onboarding flow is not exclusive to UI and can live with YAML support. Privacy and security should go beyond just that.
Using an Adapter Pattern can make it very easy to add YAML support without high maintenance cost. If we do not want to place the burden on one contributor, let’s make it optional so only contributors who want to add it will; instead of preventing all from implementing it.
The numbers on the YAML deprecation post show that 43% of the unique users are unhappy with the decision.

nitobuendia · June 7, 2020, 9:19am

On it. I’ve shared it before. It works.

The caveat is how to get the token; I want to accommodate for that too, but this is already 99% of what we need.

This approach is the “per component”. I’ve also thought of a generic custom_component for all configurations and use the forward mechanism mentioned before to implement other services.

eifinger · June 7, 2020, 11:42am

You made some very good points; some of them accurately describe my personal experiences as a contributor.

I am the creator of the here_travel_time integration. I would like to add a pretty simple option to supply origin and destination of the sensor as an address instead of GPS coordinates. I have already implemented the functionality in the external lib and it would just be a few lines of code to add it to homeassistant. Unfortunately I would have to add an option to the yaml config (non breaking) which is now prohibited.
I currently can’t find the time to sit down and properly think of the correct way to implement a UI config flow for a travel_time integration. This means I am not allowed to add additional functionality to an existing component.

Another example is my PR to add the here_weather integration. It hadn’t been reviewed for some months and suddenly a few hours after this blog post went live I got the review to remove the yaml config and add a config flow instead. I didn’t get a proper review whether the integration as such is okay, only that I would have to remove a working functionality.

I do understand that the reviewer will only spend time on a PR if it meets all requirements. It is just pretty demotivating that I submitted a PR and only got the review months later, after the rules had changed, and was suddenly told to remove a working feature.

I like the (yaml) config as I am a strong promoter of config as code and I am currently demotivated to spend more time on maintaining my contributions, because if I maintained them I would have to remove functionality I want to use.
EDIT: I do want to support UI configuration, it is great and a great experience for new users. But I don’t want to remove the feature for yaml configuration.

These are my personal experiences, feelings and opinions which are in no way meant to hurt, attack or criticise another person or their contributions.

nitobuendia · June 7, 2020, 11:47am

Thanks for sharing, @eifinger. It’s really important to bring contrasting points to the statement on the original post about this:

Unfortunately, such a move creates breaking changes and often leads to a few pretty de-motivating comments, towards the contributor and the project in general. This is harmful to everybody, as the contributors get demotivated or, even worse, don’t want to implement new features or create a breaking change.

You seem to be in this situation, but as a result of this decision and not vice versa.

Of course, as a custom_component that would be solved; but it is sad to see. I would love to hear what the team has to say to your situation.

Let us know if we (non-Home-Assistant team) can help in any way.

finity · June 7, 2020, 8:27pm

Excellent post!!

Very well written, reasoned and explained.

It will be interesting to see what the response will be (if any).

rak · June 8, 2020, 8:54am

Excellent post. Thanks.

nitobuendia · June 8, 2020, 11:01am

@finity @rak @eifinger

Thank you. I hope they do reply. The post itself is on GitHub and you can contribute to the messaging and perspective.

I am also working on capturing all the broken user flows and expanding on the potential fixes; although, of course, I do not work on the architecture, so the input from the Home-Assistant team would especially be needed there.

balloob · June 8, 2020, 8:46pm

You are incorrectly stating that we can write to YAML files. We can’t do so while keeping the structure of “include” tags and preserving comments. We did this once with Lovelace and got a lot of anger from the community.

About your same data in JSON and YAML. You’re greatly oversimplifying things. It’s not just OAuth2 flows.

Thank you for agreeing that it increases the maintenance cost. It’s not just that, however. Contributors who didn’t want to maintain it for their integration got harassed. By making it a project rule, people will voice their anger at Home Assistant as a whole instead of at individual contributors. If creating a space that allows contributors to not be harassed means some people will not contribute, that is a choice I will stand by.

I can’t believe that you’re looking at 100 comments to this blog post and try to draw conclusions from it and use it to predict how the whole community is going to feel. You’re ignoring all the other places that people comment on the blog post and the self-selecting bias of people that will actually comment.

Your solution is oversimplified and completely ignores how config entries, devices and areas work.

As always, the devil is in the details.

nitobuendia · June 9, 2020, 10:56am

Hi @balloob, first of all, thanks for taking the time in reading and replying.

I am not sure where I wrote that, but that’s not what I meant. If you can point it out, I am happy to edit it. The request is simply about being able to read from (not write to) editable configuration files that users write. In all cases I said YAML, but other structured, editable and documented formats are equally valid.

I will expand on the solution in an upcoming post.

I would love to understand more to help think through a solution. I will be posting an extended version of the proposal (although essentially the same) and I would love to know: (1) how should I bring it on to ADR and (2) what flaws are in there so we can work with a proposal that does work.

Happy to partner in the right way, if you truly are willing to listen.

I said that it doesn’t have to, though, if you just read from configuration, change the schema and pass it on to the same flow that the JSON files use. There might be other solutions which do not involve a significant cost of maintenance (compared to the cost of dropping it), or that it doesn’t put this cost on the contributors.

I actually touched on all the reasons that were mentioned on your post. I spent the time to understand and bring counterarguments to all of them.

This is really unfortunate. I have never seen such declarations on the board or GitHub, but it is sad that people feel that way. Are there any examples and/or stats that you could share without affecting the privacy of the people involved?

While I do not want to see harassment, I do not think that taking this decision is the right one.

For one, you can see that this decision is also generating a lot of negative comments and will continue. Some people were asking to distinguish projects with and without YAML for example and you can expect similar behaviour.

Second, people will find other reasons to harass contributors for any other decisions. As such, if you really want to avoid all bad behaviour, the slippery slope is to simply close the project so there are no communication channels. But that doesn’t make sense, right?

Closing YAML does not feel right for the same reasons either.

This is great. We are all aligned. In what ways can we achieve this that would work for the whole project? I really cannot believe that all harassment comes from whether you support YAML or not on your component.

Let’s work on solving the root problem, not on using this as an opportunistic excuse to take a product decision. (Essentially the same reason the “health” part of the Linux post was heavily criticised.)

Definitely agree.

I actually read all the 600 posts and I am about to compile a list of all the user flows that are broken. Nevertheless, not even 600 posts are enough to represent the whole community.

If you read again on my post, I was asking for the data that you used to take that decision. Lacking any data on the original post, some data is better than none. If you can provide better data, we would appreciate it and we can use that instead.

Definitely. I do not work on core or architecture of Home-Assistant. I would love to know, though, where exactly the solution is breaking to change it and adapt it. I will be posting more on this soon. However, just saying “your solution does not work” is not good enough without at least an explanation on why not and where it fails.

Let’s work that devil down, so we can have a solution that satisfies everyone.

Thank you

nitobuendia · June 9, 2020, 11:27am

In the previous post, I left open what user flows are broken. I’ve gone through all the 600+ responses from the post and tried to compile them into a few user flows that are broken.

This is based on the work and posts of many, including: speedfire, eifinger, mpex, ReneTode, balthisar, Tinkerer, Razgriz, someone, scstraus, danielo515, Julien, debackerl, CentralCommand, finity, bhaonvashon, angelnu, mf_social, mig2008pt, 123, Mariusthvdb, nickrout, tom_l, wellsy, lindsayward, kanga_who, lmamakos. (Not allowed to mention people).

I hope I was able to capture your opinion correctly. Otherwise, similar to the other post, this is public on GitHub and accepts contributions.

After this, I will be expanding on the potential fixes, and seeing what the best way forward is to get them proposed and reviewed.

What workflows are broken when YAML is phased out for Integrations

In the post The future of YAML, Home-Assistant team explains the decision of ADR0010 to drop YAML support for integrations.

One of the main points is to Make things easy for users. This is partially true as having a powerful UI makes it easy for many users. However, there are different users and use cases; and UI is better for some, but worse for others compared to YAML.

In this article, we talked about some user flows that were broken; and here we can expand on them.

Summary of Broken Workflows

The full list of reasons, whose use cases and explanations are expanded below:

All current configurations will eventually break.
No easy bulk addition or support.
No partial versioning of components.
No flexible backups and restoring.
Reduced shareability for integrations and related entities.
Increased difficulty in troubleshooting.
Reduced documentation for users and developers.

Breaking existing integrations

With UI configuration only, the developers are effectively encouraged to remove YAML support, even from existing integrations. This will effectively break every single current Home-Assistant installation which uses this. In many cases, these are hundreds of integrations built up over long periods of time.

With YAML configuration, everything keeps working as today. You can still use YAML, or happily move over to UI; progressively at your own pace, or at once.

Users affected: (everyone, technically), 595, 634

Bulk Addition or Editing

There are changes that may affect several or all entities. To name some of these:

A change of local IP subnet or IP ranges that affects all devices.
A revamp of your identity or name that may affect usernames.
A decision to name all your devices on a certain structure.
Hiding or showing all/most entities on Google Assistant.

With UI configuration only, making changes to many entities in bulk will be painful and time consuming: you have to go one by one, without a good way to keep track of the entities that were changed. Not only that, but if I want to change all the usernames where I used to have value “x”, there is no effective way for me to search and find all the configurations affected.

With YAML configuration, bulk changes can be made easily in a configuration file. Using editors or git also allows you to search and compare changes. As such, it is easy to find the places where changes are required, edit them and keep track of progress as you do it.
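As an illustration of such a bulk change, here is a small sketch that moves every address from one subnet to another in the text of a configuration file. The helper name `replace_subnet` and the sample configuration (including the `example` platform) are made up for this example.

```python
import re


def replace_subnet(text, old_prefix, new_prefix):
    """Replace an IPv4 prefix such as '192.168.1.' wherever it starts an address.

    Returns the updated text and the number of replacements made.
    """
    pattern = re.compile(r"(?<![\d.])" + re.escape(old_prefix) + r"(?=\d)")
    return pattern.subn(new_prefix, text)


# Stand-in for the contents of a YAML configuration file.
config_text = """\
media_player:
  - platform: ps4
    host: 192.168.1.20
light:
  - platform: example
    host: 192.168.1.31
"""

updated_text, changes = replace_subnet(config_text, "192.168.1.", "192.168.2.")
print(changes)
```

The returned count gives a quick way to confirm that every expected entry was touched, which is the kind of tracking that is hard to get when editing entries one by one in a UI.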

Users affected: 27, 44, 58, 114, 220, 450

(Partial) Versioning

With UI configuration only, it becomes harder to create “dated” versions of configurations. If I change the settings of one component to experiment, and I want to revert them one week later to the status one week before, there is no way for me to do it. The options are restoring a full back up (which may be reverting more changes that I want to keep), annotating screenshots, having my own means of documenting them (i.e. my own useless configuration file), or reading the .storage files and manually trying to understand the undocumented configuration files. In other words, there is no easy way to import partial past configurations (i.e. one or two components only).

With YAML configuration, all the versions are tracked if you use software like Google Drive or GitHub: you can see the history of changes and easily restore the state of any of those versions.
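A partial restore of this kind can be sketched as below, assuming both versions of the file (for instance today's and one checked out from history) have already been parsed into dictionaries. The function name `restore_section` and the sample data are hypothetical.

```python
import copy


def restore_section(current, previous, section):
    """Return the current config with one top-level section taken from previous."""
    restored = copy.deepcopy(current)
    if section in previous:
        restored[section] = copy.deepcopy(previous[section])
    else:
        # The section did not exist in the older version, so drop it.
        restored.pop(section, None)
    return restored


# Stand-ins for two parsed versions of the same configuration file.
last_week = {"hue": {"host": "192.168.1.2"}, "spotify": {"name": "old"}}
today = {"hue": {"host": "192.168.1.9"}, "spotify": {"name": "new"}}

# Revert only the hue section; keep this week's spotify changes.
result = restore_section(today, last_week, "hue")
```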

Users affected: 2, 27, 124, 185, 457, 478, 481

Flexible Backups and Partial Restores

Backing up is supposed to be well covered by the current system. However, this is only under certain conditions. This is the solution proposed in the post:

Using the Home Assistant snapshot feature, this is not an issue. However, if you do manual backups on a system that runs just Core, you need to make sure to back up the .storage folder as well (which hopefully you’re already doing). Otherwise, there is no difference.

And related to git:

This is actually not true, the .storage folder contains all Home Assistant managed configuration files in JSON format, which in those cases, can be stored and versioned in a git repository.

With UI configuration only:

The snapshots that Home-Assistant takes are full copies (you can select which options, but you copy all those). When you restore, you are restoring a full version.
The backups are heavy and it might not be ideal to keep the backups for months or weeks in case you want to restore to older changes.
Snapshots are not available in all systems (e.g. Docker installations).
If the backups are on a system like a Raspberry Pi, you are at a risk of losing them if the SD card goes corrupt and starting from scratch.
To solve all these, you require additional systems (e.g. moving them to a NAS) which are not part of the native system, and which require technical implementations much harder than using Git for your config files.
Backing up .storage allows you to store this data, but it contains sensitive data, so it would only work with private repositories.

With YAML configuration:
With YAML you still have all the previous options; however, now you are empowered to do more.

Adding configuration.yaml plus a simple (and available to everyone) tool like git or GitHub, you can:

Create and import different versions of components; in full or partial (see versioning above).
Keep years of history, so you can check and revert changes and see in detail what you changed and when.
Import your configuration to new and fresh systems without importing everything that comes with the backups.
All the sensitive data is input by you and can be stored in secrets or means that make sense without; as opposed to having to download them from .storage folder and having to either update all or nothing depending on whether you can upload sensitive data. Note that the .storage folder so you do need to make continuous backups in case schema or other areas change.

Users affected: 72, 185, 193, 272, 335, 363, 367, 481

Shareability

A lot of us learnt and started using Home-Assistant by learning from the configurations of others. This is not only from automations, but also about the key integrations and how they are used.

With UI configuration only, the integrations are no longer shareable as part of your configurations. The JSON files contain sensitive data (like tokens or passwords), which cannot be removed automatically using systems like secrets. As such, only manually edited and cherry picked files can be shared. As a result, the sharing ecosystem will progressively weaken. Even for the elements that are still shareable (like automations or template sensors), they lose a lot of context when you are not aware of the integrations implemented (e.g. that binary sensor, what is it tracking?).

With YAML configuration, one can share their configurations easily for others to learn and get inspired. Using secrets makes this secure without manual intervention.
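
To illustrate why secrets make sharing safe, here is a simplified model of the idea: the shared file only holds references such as `!secret ps4_token`, while the values live in a private mapping. This is a sketch and not the Home Assistant loader; `resolve_secrets` and the sample entries are hypothetical.

```python
SECRET_PREFIX = "!secret "


def resolve_secrets(config, secrets):
    """Replace '!secret <name>' strings with values from the private mapping."""
    if isinstance(config, dict):
        return {key: resolve_secrets(value, secrets) for key, value in config.items()}
    if isinstance(config, list):
        return [resolve_secrets(item, secrets) for item in config]
    if isinstance(config, str) and config.startswith(SECRET_PREFIX):
        # A missing secret raises KeyError instead of silently passing through.
        return secrets[config[len(SECRET_PREFIX):]]
    return config


# This part can be published as-is: it holds no credentials.
shareable = {"media_player": [{"platform": "ps4", "token": "!secret ps4_token"}]}
# This part stays private.
private = {"ps4_token": "abc123"}

resolved = resolve_secrets(shareable, private)
```

The shareable structure never contains the token itself, which is what is lost when credentials are written straight into the JSON files.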

Users affected: 2, 9, 135, 181, 185, 187, 193, 449

Effective testing and troubleshooting

The post addresses this point by stating:

YAML configuration testing is often done to see if a specific YAML configuration is still valid against (newer versions of) Home Assistant. With integrations set up via the UI, this is not a concern, since Home Assistant ensures the data structure is compatible between versions and migrates it for you.

Moving across servers or Home-Assistant versions

It is possible for users to have several versions of Home-Assistant at points of time; be it because there are migrations between servers, or tests between development and production installs. This is not just a problem of testing upgrades to the next version.

With UI configuration only, there is no effective way to fully migrate between servers. Snapshots provide part of that functionality, but it has the problems described in the backup section (e.g. not being able to only move part of the installation, or installing in different versions of Home-Assistant). The alternative is to edit across multiple files of undocumented JSON (risking breaking the system as you are not meant to edit them), or having to start from scratch.

With YAML configuration, you can easily copy/paste all or parts of the configuration and install; allowing you to start fresh installations.

Partial Breakages

Configuration entries are not the only way a component can fail. There can be bugs or conflicting modules that stop your system from working.

For example, some components temporarily failed (example nmap) and I had to disable them. Note that this is not a problem with the configuration, but with the component itself. The UI would not load at all, so the only way I was able to solve it was by connecting via Samba, commenting out the configuration and restarting via SSH. All worked after.

With UI configuration only, there is no way to isolate components temporarily. Moreover, if the UI is broken, there is no way to disable or remove components. You are stuck with a broken system. In other words, if the system is crashing due to a bug, there is no way to access the UI to disable or remove the component and JSON files do not allow an easy way to do it. There’s not a good way to recover from this mode.

With YAML configuration, you have control over which components to copy and isolate. You can also comment out components even without access to the UI, which allows you to test hypotheses, iterate and recover the system from failure.
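
As a sketch of how little is needed to isolate a component in a text file, the helper below comments out one top-level block of a YAML document. It is a plain text transformation under simple assumptions (top-level keys start at column 0, nested lines are indented); `comment_out_block` is a hypothetical helper, not a Home Assistant tool.

```python
def comment_out_block(yaml_text, key):
    """Comment out the top-level `key:` block and everything nested under it."""
    output = []
    inside = False
    for line in yaml_text.splitlines():
        is_top_level = (
            bool(line)
            and not line[0].isspace()
            and not line.startswith(("#", "-"))
        )
        if is_top_level:
            # A new top-level key starts; we are inside only if it is the target.
            inside = line.split(":", 1)[0] == key
        # Blank lines are left untouched so the file keeps its layout.
        output.append("# " + line if inside and line.strip() else line)
    return "\n".join(output)


config = (
    "sensor:\n"
    "  - platform: time_date\n"
    "device_tracker:\n"
    "  - platform: nmap_tracker\n"
    "    hosts: 192.168.1.0/24\n"
    "light:\n"
    "  - platform: group"
)

disabled = comment_out_block(config, "device_tracker")
```

The same edit can be made by hand over Samba or SSH, which is why it still works when the UI does not load.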

Troubleshooting

In the past, when I reported bugs, I had to recreate and isolate the bugs (example). To do this easily, I usually copied the affected configurations from my main installation into a docker or dev environment.

In some cases, it was a conflict between two or more components (some might be custom, but not always or all of them). For example, I recently had a conflict between a custom component (Hue Sync Box remote) and an official component (Harmony). To troubleshoot it, I moved the two components to a dev environment and I was able to identify and fix the issue.

With UI configuration only, there is no way to isolate components temporarily or move them into a dev environment quickly. Testing and troubleshooting becomes costly as you need to reproduce the partial setup on the UI, or restore an exact copy which might not be helpful. There is also no easy way to share the relevant configuration for others to recreate the environment and be able to fix it.

With YAML configuration, you have control over which components to copy and isolate. You can share the configuration for others to reproduce. You can also comment out components even without access to the UI, which allows you to test hypotheses, iterate and troubleshoot.

Users affected: 128, 183, 255, 481, 561, 568

Documentation for developers, users and the curious

One of the greatest things about Home-Assistant is the really good documentation of components; with all their parameters that allows you to learn and experiment.

With UI configuration only, all the data is opaque to the user and developers. One could argue that this is by design. However, it is limiting contributors who can learn, tweak and in the future contribute to the main components.

With YAML configuration: those who are willing to learn and contribute have plenty of documentation on the integrations page. Since YAML is an option, the configuration details need to be present.

Users affected: 68, 89, 332, 419, 454, 594, 625

danielo515 · June 9, 2020, 6:41pm

This is a fantastic summary! Well written and very accurate.

wellsy · June 10, 2020, 2:49am

After you read and understand all you have summarised @nitobuendia I find it impossible to understand the reasoning of developers in their current development path.

New and future users will not even understand exactly what it is they are losing (what has been taken from them?) unfortunately as they are being blinded by how easy it will be with a shiny new UI!

CentralCommand · June 10, 2020, 4:47am

The future of YAML

How is it expensive to maintain this? Could you elaborate? Genuine question.

async def async_setup_platform(
        hass, config, async_add_entities, discovery_info=None):
  """Loads configuration and delegates to official integration."""
  # Load configuration.yaml
  host = config.get(const.CONF_HOST)
  name = config.get(const.CONF_NAME, const.DEFAULT_NAME)
  region = config.get(const.CONF_REGION)
  token = config.get(const.CONF_TOKEN)

  # Format it in the new config_entry JSON format.
  config_entry = Config(
      util.slugify(name),
      {
          const.CONF_TOKEN: token,
          const.CONF_DEVICES: [
              {
                  const.CONF_HOST: host,
                  const.CONF_NAME: name,
                  const.CONF_REGION: region,
              },
          ],
      }
  )

  await ps4_media_player.async_setup_entry(
      hass, config_entry, async_add_entities)

  return True

It is literally getting the config and changing the structure into what async_setup_entry expects. It is definitely much less costly than the async_migrate_entry that is needed today to avoid breaking the system when the syntax changes.

Something is bugging me about this. This isn’t actually the whole story is it? If someone was going to support both YAML and UI configuration this would only be step 1 (getting a config entry created from the YAML config). In addition to this they would need to:

Define the schema for the YAML so that check config correctly tells users whether they have entered YAML correctly or not. Otherwise they would restart and encounter runtime failures.
Handle changes to that schema over time. Since YAML cannot be automatically changed the YAML schema can get complex as multiple versions are introduced. Either the complexity of that validation increases over time or the developer must introduce breaking changes to clean up the config schema
Add tests for their config schema to ensure that code works correctly
Write documentation telling users where and how to get these pieces of metadata. Since there’s no UI driving the config collection process they must write documentation telling users how to do it separately.

Unless I’m missing something, your solution above doesn’t really cover any of this, right? This is all still work developers would have to do, in addition to creating their UI config flow, in order to support both YAML and UI configuration. I think this is the part that’s potentially expensive to maintain and requires extra work on the part of the developers. I agree that the actual process of converting YAML into a config entry doesn’t seem particularly difficult, but there’s more to a YAML based method of config collection than that.
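
To make the schema items above concrete, here is a minimal sketch of YAML validation that also tolerates a renamed option. It uses only the standard library as a stand-in for a validation library such as voluptuous, and the option names and the rename (`ip` to `host`) are invented for illustration.

```python
REQUIRED = {"host": str, "token": str}
OPTIONAL = {"name": (str, "PlayStation 4")}
# Old option names that are still accepted, mapped to their replacement.
RENAMED = {"ip": "host"}


def validate(raw):
    """Return a normalised config dict or raise ValueError with a clear message."""
    config = dict(raw)
    for old, new in RENAMED.items():
        if old in config:
            if new in config:
                raise ValueError(f"use '{new}' only, not both '{old}' and '{new}'")
            config[new] = config.pop(old)
    unknown = set(config) - set(REQUIRED) - set(OPTIONAL)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    for key, expected in REQUIRED.items():
        if key not in config:
            raise ValueError(f"missing required option '{key}'")
        if not isinstance(config[key], expected):
            raise ValueError(f"'{key}' must be {expected.__name__}")
    for key, (expected, default) in OPTIONAL.items():
        config.setdefault(key, default)
        if not isinstance(config[key], expected):
            raise ValueError(f"'{key}' must be {expected.__name__}")
    return config
```

Every entry added to `RENAMED` is one more old format that has to stay validated and tested, which is the growth in complexity described above.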

Now one might argue that the schema I’m referring to is required by both methods of config collection so this isn’t extra work. But that’s not really true. I’m looking at some of these config_flow.py files and the schema referenced in them is per screen. Each screen has a schema describing what inputs should be shown and how those should be validated but those aren’t necessarily the integration schema. Some of those are and some of those are intermediary fields used to collect other data (like oauth tokens for example). At no point does the final configuration schema seem to be consolidated and laid out, probably because the author is expecting the UI to collect it so it’s not really necessary to do so. This means a separate schema would need to be created to support YAML configuration.

I’m also noticing while going through this that these config flow files are very complicated. The shortest ones appear to be a minimum of ~60 lines (Emulated Roku, Flu Near You, GDACS). PS4’s is significantly more complex at 200 lines and the longest I found was Vizio’s at a whopping 500 lines.

This makes me wonder if perhaps we’re targeting the wrong thing here. Perhaps instead of looking at how we can make YAML work more easily with config entries (since as you showed, that’s actually pretty easy) we should be thinking about how we can make it easier to build UI configuration? If these UIs required less work to configure then developers might be more willing to support YAML config as well. And if we figured out a way to make these config flows more data-driven then they may not even have to, since the same schema could be re-used for YAML entry without requiring users to physically go through this UI.

speedfire · June 10, 2020, 11:55am

I approve of every line you wrote. I stopped updating my HA because of that and was thinking about a solution similar to what you describe if I were to fork this project. I admire the huge amount of work you put in to speak out loud and summarize our concerns with the project direction. Impressed.

nitobuendia · June 10, 2020, 11:57am

You are right. Aligned with your points. On the sample component I did for PS4, I had to do the following:

Define the PLATFORM_SCHEMA (i.e. define a YAML schema which would validate) [6 code lines].
Document the structure on the integrations page (not done yet).
Adapt the YAML structure to the structure needed by the device (this is independent of the JSON structure); it is known by the creator, as they use those fields from config_entries [~20 code lines].
The tests would be simply making sure that your device has the right values gathered. Everything else is already tested by the other tests.

This obviously requires updates when schema changes, but so does the UI one and the changes should be a handful of lines. I am advocating for breaking changes if you use YAML instead of UI-managed.

I am working on a full view on the solution (from different angles and problems) and I will bring this feedback in. If you want to discuss ideas, let me know.

Thank you.

I have a partially different view.

The integration schema is definitely known to the contributor, as it is the required one for creating the Entity. For example, most of PS4 are found here (some exceptions within init function).
Yes, the config flow stages this in several steps. The important part is that, at the end, you will have all that information. One option to reduce the cost of maintenance of the documentation:

a. User goes through the UI flow that guides them.
b. At the end, show the config information. The user can decide to Save (UI-managed) or copy this information to a manual YAML configuration (e.g. UI-managed supports Edit on UI, YAML does not).
c. As such, we only need to state the schema; not document how to obtain the values.

This feels like it is integrated here. The attributes collect the data of the flow and you can show this data, store it, or do whatever you want with it. My solution would be to show the data in the UI without storing it until the user confirms. This is just a change in the way the UI flow works (ADR agreement), but no extra cost.
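
A rough sketch of steps a to c, assuming the flow simply accumulates the answers from each screen and can render them before anything is stored. `FlowSketch` and its methods are hypothetical and do not mirror Home Assistant’s config flow API; the YAML layout shown is also only an example. Values are written with `json.dumps`, relying on JSON scalars being valid YAML.

```python
import json


class FlowSketch:
    """Collects answers screen by screen; nothing is stored until the user decides."""

    def __init__(self, domain, platform):
        self.domain = domain
        self.platform = platform
        self.data = {}

    def submit_step(self, answers):
        # Each screen contributes some fields to the final configuration.
        self.data.update(answers)

    def as_yaml(self):
        # Final screen: show the collected values as a block the user can copy.
        lines = [f"{self.domain}:", f"  - platform: {self.platform}"]
        for key, value in self.data.items():
            lines.append(f"    {key}: {json.dumps(value)}")
        return "\n".join(lines)


flow = FlowSketch("media_player", "ps4")
flow.submit_step({"host": "192.168.1.10"})
flow.submit_step({"name": "Living Room PS4", "region": "Europe"})
yaml_text = flow.as_yaml()
```

With this shape, the documentation only has to list the fields, because the flow itself has already guided the user to the values.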

This could make sense too. What is important is the overall maintenance cost, and not the individual parts. If we get a word from Contributors that this is the issue, I think it is a fair point.

However, this has not been raised as the issue. YAML, and maintaining both workflows, has. You might be right, I am just wondering if we could get a Core Contributor perspective on it to know where to focus our efforts.

Knowing really what is the real cost here would make a huge difference to come up with a good solution. I did feel the original reasons were not the full picture here.

firstof9 · June 10, 2020, 2:47pm

One of the custom components uses async_setup to import the YAML config, and it seems to work quite well.

This one specifically.

CentralCommand · June 10, 2020, 3:25pm

This is a great example! So looking at this you’ll notice the lengths the author had to go to in order to support both options given the complicated config. The schema for their component is kept here which is then shared by both YAML and UI config. And they even note some of the challenges they faced doing this in the comment:

"""
Type and validation seems duplicare, but I cannot use custom validators in ShowForm
It calls convert from voluptuous-serialize that does not accept them
so I pass it twice - once the type, then the validator :( )
"""

But the net result is good! Exactly what I was talking about. This is a significantly more data driven UI. You’ll notice they’ve created their own schema for describing the config options that includes how to validate, type, method, default, conditional dependencies and which screen to show it on when in the UI. This allows them to use this common compile_schema method in both places to generate the schema used for validation whether in the UI or YAML.
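
The general idea of one field description feeding both paths can be sketched as follows. This is not the component’s actual code or its compile_schema method; the `Field` structure, the helper names and the sample fields are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Field:
    name: str
    kind: type                     # expected Python type of the value
    step: str                      # which UI screen asks for this field
    default: Any = None
    required: bool = False
    validator: Optional[Callable[[Any], bool]] = None


FIELDS = [
    Field("host", str, step="connection", required=True),
    Field("port", int, step="connection", default=8080,
          validator=lambda p: 0 < p < 65536),
    Field("name", str, step="details", default="My device"),
]


def fields_for_step(step):
    """UI path: the names of the fields shown on one screen."""
    return [f.name for f in FIELDS if f.step == step]


def validate_config(data):
    """Shared path: validate a full config, from YAML or from the finished flow."""
    result = {}
    for f in FIELDS:
        if f.name not in data:
            if f.required:
                raise ValueError(f"missing '{f.name}'")
            result[f.name] = f.default
            continue
        value = data[f.name]
        if not isinstance(value, f.kind) or (f.validator and not f.validator(value)):
            raise ValueError(f"invalid value for '{f.name}'")
        result[f.name] = value
    return result
```

Because both functions read the same `FIELDS` list, adding an option means touching one place rather than a UI schema and a YAML schema separately.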

Adding YAML support when the config is defined this way is clearly quite simple. There’s a few conditionals in __init__.py based on where the config is coming from but for the most part everything about the YAML config is shared with the UI config.

Unfortunately, despite all the work the author put in, the config_flow.py file is still 500+ lines long. I had hoped that with all the work the author put in to define common config and that config_singularity.py file the UI would require less work, but it looks like it’s still pretty complicated. An initial scan shows that a lot of it seems to be error handling, which kind of makes sense given the author’s earlier comment. Since validation extends beyond basic types and into more complicated validators, they then have to write additional code to handle that while in the UI.

Anyway I wonder if there’s something that can be pulled from here into the proposed solution to simplify the process of providing both config options to users with reduced developer effort.

firstof9 · June 10, 2020, 3:32pm

Well the component in itself is kind of complex, here is a simpler one. Much less code same YAML/UI Config ability.

nitobuendia · June 11, 2020, 12:13pm

Maybe a good path forward is to showcase what a “best in class” implementation of UI + Configuration would look like.

The PS4 example was a first example and fell short in some areas. I am happy to work on the Cast component and see how it goes.

The goals would be:

Keep SCHEMA to a minimum to avoid breaking changes and maintenance cost.
Use the UI flow to:
- Configure the entity, if the user wants that UI-configured.
- Obtain the YAML configuration, if the user prefers to use manual set up.
Allow the user to input YAML directly, even without going to the UI.
Share as much code as possible between both set ups, so there’s no additional cost.
Have a way to work with Auth token that requires user interaction.
Have a way to work with refresh tokens (and alike) that are dynamic over time.

If we are able to find a good, easy to maintain path forward; it can be presented to ADR.

If someone is trying similar things, feel free to contribute to the repository too. It would be nice to create a repository of YAML solutions (opinions exposing limitations of UI-only, potential solutions to bring back YAML, or custom components building on core components to enable YAML).