Request to bring back YAML - arguments against The Future of YAML

This is a fantastic summary! Well written and very accurate.

3 Likes

After reading and understanding everything you have summarised, @nitobuendia, I find it impossible to understand the developers' reasoning behind their current development path.

New and future users will not even understand exactly what it is they are losing (what has been taken from them?) unfortunately as they are being blinded by how easy it will be with a shiny new UI!

4 Likes

Something is bugging me about this. This isn’t actually the whole story is it? If someone was going to support both YAML and UI configuration this would only be step 1 (getting a config entry created from the YAML config). In addition to this they would need to:

  1. Define the schema for the YAML so that check config correctly tells users whether they have entered YAML correctly or not. Otherwise they would restart and encounter runtime failures.
  2. Handle changes to that schema over time. Since YAML cannot be changed automatically, the YAML schema can get complex as multiple versions are introduced. Either the complexity of that validation increases over time or the developer must introduce breaking changes to clean up the config schema.
  3. Add tests for their config schema to ensure that code works correctly
  4. Write documentation telling users where and how to get these pieces of metadata. Since there’s no UI driving the config collection process they must write documentation telling users how to do it separately.
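To make points 1 and 2 concrete, here is a minimal sketch of what validating a hand-written config involves once a key has been renamed. It uses only the standard library rather than Home Assistant's real validation helpers, and every name in it (`REQUIRED`, `OPTIONAL`, `DEPRECATED`, the `ip_address` key) is hypothetical.

```python
"""Sketch: validate YAML-style config while tolerating a renamed key."""

# Hypothetical schema: option name -> expected type.
REQUIRED = {"host": str, "token": str, "region": str}
OPTIONAL = {"name": str}
# Hypothetical rename introduced in a later version: old key -> new key.
DEPRECATED = {"ip_address": "host"}


def validate(config):
    """Return (clean_config, warnings) or raise ValueError on bad input."""
    clean, warnings = {}, []
    for key, value in config.items():
        if key in DEPRECATED:
            new_key = DEPRECATED[key]
            warnings.append(f"'{key}' is deprecated, use '{new_key}'")
            key = new_key
        if key not in REQUIRED and key not in OPTIONAL:
            raise ValueError(f"unknown option: {key}")
        if key in clean:
            raise ValueError(f"option given twice: {key}")
        expected = REQUIRED.get(key) or OPTIONAL[key]
        if not isinstance(value, expected):
            raise ValueError(f"{key} must be {expected.__name__}")
        clean[key] = value
    missing = sorted(set(REQUIRED) - set(clean))
    if missing:
        raise ValueError(f"missing required options: {', '.join(missing)}")
    return clean, warnings
```

Each rename adds an entry to the deprecation table, which is the growth in complexity described in point 2; removing the entry later is the breaking change.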

Unless I’m missing something, your solution above doesn’t really cover any of this, right? This is all still work developers would have to do, in addition to creating their UI config flow, in order to support both YAML and UI configuration. I think this is the part that’s potentially expensive to maintain and requires extra work on the part of the developers. I agree that the actual process of converting YAML into a config entry doesn’t seem particularly difficult, but there’s more to a YAML-based method of config collection than that.

Now one might argue that the schema I’m referring to is required by both methods of config collection so this isn’t extra work. But that’s not really true. I’m looking at some of these config_flow.py files and the schema referenced in them is per screen. Each screen has a schema describing what inputs should be shown and how those should be validated, but those aren’t necessarily the integration schema. Some of those are, and some of those are intermediary fields used to collect other data (like OAuth tokens, for example). At no point does the final configuration schema seem to be consolidated and laid out, probably because the author is expecting the UI to collect it, so it’s not really necessary to do so. This means a separate schema would need to be created to support YAML configuration.
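As a rough illustration of that gap, the sketch below merges per-screen schemas into the single schema a YAML path would need. The step names and fields are hypothetical and not taken from any real config_flow.py; the point is only that someone has to decide which fields are intermediary and which are derived by the flow.

```python
# Hypothetical per-screen schemas, loosely modelled on a multi-step flow.
STEP_SCHEMAS = {
    "user": {"region": str},
    "creds": {"pin_code": str},  # intermediary: only used to obtain a token
    "link": {"host": str, "name": str},
}
# Fields collected along the way that never reach the stored configuration.
INTERMEDIARY = {"pin_code"}
# Fields produced by the flow itself rather than typed in by the user.
DERIVED = {"token": str}


def consolidated_schema(step_schemas, intermediary, derived):
    """Merge the per-step schemas into the one schema YAML would need."""
    final = {}
    for fields in step_schemas.values():
        for key, expected in fields.items():
            if key not in intermediary:
                final[key] = expected
    final.update(derived)
    return final
```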

I’m also noticing while going through this that these config flow files are very complicated. The shortest ones appear to be a minimum of ~60 lines (Emulated Roku, Flu Near You, GDACS). PS4’s is significantly more complex at 200 lines and the longest I found was Vizio’s at a whopping 500 lines.

This makes me wonder if perhaps we’re targeting the wrong thing here. Perhaps instead of looking at how we can make YAML work more easily with config entries (since, as you showed, that’s actually pretty easy) we should be thinking about how we can make UI configuration easier to build. If these UIs required less work to configure then developers might be more willing to support YAML config as well. And if we figured out a way to make these config flows more data-driven then they may not even have to, since the same schema could be re-used for YAML entry without requiring users to physically go through this UI.

6 Likes

I agree with every line you wrote. I stopped updating my HA because of this, and was thinking about a solution similar to what you describe if I were to fork this project. I admire the huge amount of work you put in to speak up and summarize our concerns about the project's direction. Impressed.

2 Likes

You are right. Aligned with your points. On the sample component I did for PS4, I had to do the following:

  1. Define the PLATFORM_SCHEMA (i.e. define a YAML schema which would validate) [6 code lines].
  2. Document the structure in the integration's documentation (not done yet).
  3. Adapt the YAML structure to the structure needed by the device (this is independent of the JSON structure); it is known by the creator, as they use those fields from config_entries [~20 code lines].
  4. The tests would be simply making sure that your device has the right values gathered. Everything else is already tested by the other tests.

This obviously requires updates when schema changes, but so does the UI one and the changes should be a handful of lines. I am advocating for breaking changes if you use YAML instead of UI-managed.

I am working on a full view on the solution (from different angles and problems) and I will bring this feedback in. If you want to discuss ideas, let me know.

Thank you.

I have a partially different view.

  1. The integration schema is definitely known to the contributor, as it is the required one for creating the Entity. For example, most of PS4 are found here (some exceptions within init function).

  2. Yes, the config flow stages this in several steps. The important part is that, at the end, you will have all that information. One option to reduce the cost of maintenance of the documentation:

    a. User goes through the UI flow that guides them.
    b. At the end, show the config information. The user can decide to Save (UI-managed) or copy this information to a manual YAML configuration (e.g. UI-managed supports editing in the UI, YAML does not).
    c. As such, we only need to state the schema, not document how to obtain the values.

This feels like it is integrated here. The attributes collect the data of the flow and you can show this data, store it, or do whatever you want with it. My solution would be to show the data in the UI without storing it until the user confirms. This is just a change in how the UI flow works (ADR agreement), but no extra cost.

This could make sense too. What is important is the overall maintenance cost, and not the individual parts. If we get a word from Contributors that this is the issue, I think it is a fair point.

However, this has not been raised as the issue. YAML, and maintaining both workflows, has. You might be right, I am just wondering if we could get a Core Contributor perspective on it to know where to focus our efforts.

Knowing what the real cost is here would make a huge difference in coming up with a good solution. I did feel the original reasons were not the full picture.

1 Like

One of the custom components uses async_setup to import the YAML config, and it seems to work quite well.

This one specifically.

3 Likes

This is a great example! So looking at this you’ll notice the lengths the author had to go to in order to support both options given the complicated config. The schema for their component is kept here which is then shared by both YAML and UI config. And they even note some of the challenges they faced doing this in the comment:

"""
Type and validation seems duplicare, but I cannot use custom validators in ShowForm
It calls convert from voluptuous-serialize that does not accept them
so I pass it twice - once the type, then the validator :( )
"""

But the net result is good! Exactly what I was talking about. This is a significantly more data driven UI. You’ll notice they’ve created their own schema for describing the config options that includes how to validate, type, method, default, conditional dependencies and which screen to show it on when in the UI. This allows them to use this common compile_schema method in both places to generate the schema used for validation whether in the UI or YAML.
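The idea can be sketched roughly as follows. This is not the component's actual code: the `Field` class and its attributes are hypothetical, the `compile_schema` here is a simplified stand-in that only borrows the name, and it uses plain Python instead of voluptuous.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Field:
    """One config option; every attribute name here is hypothetical."""
    key: str
    kind: type
    step: str  # UI screen that asks for this option
    default: Any = None
    required: bool = True
    validator: Optional[Callable[[Any], bool]] = None


FIELDS = [
    Field("host", str, step="user"),
    Field("port", int, step="user", default=8080, required=False,
          validator=lambda p: 0 < p < 65536),
    Field("name", str, step="options", default="Device", required=False),
]


def compile_schema(fields, step=None):
    """Select the fields for one UI step, or all of them for YAML."""
    return [f for f in fields if step is None or f.step == step]


def validate(fields, data):
    """Apply the type check, then the custom validator, then defaults."""
    result = {}
    for f in fields:
        if f.key not in data:
            if f.required:
                raise ValueError(f"missing: {f.key}")
            result[f.key] = f.default
            continue
        value = data[f.key]
        if not isinstance(value, f.kind):
            raise ValueError(f"{f.key} must be {f.kind.__name__}")
        if f.validator is not None and not f.validator(value):
            raise ValueError(f"invalid value for {f.key}")
        result[f.key] = value
    return result
```

A UI step would call `compile_schema(FIELDS, step="user")`, while the YAML path would validate against `compile_schema(FIELDS)` in one pass. The separate type and validator passes mirror the two-pass workaround the author's comment complains about.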

Adding YAML support when the config is defined this way is clearly quite simple. There’s a few conditionals in __init__.py based on where the config is coming from but for the most part everything about the YAML config is shared with the UI config.

Unfortunately, despite all this work the author put in, the config_flow.py file is still 500+ lines long. I had hoped that with all the work the author put in to define common config and that config_singularity.py file, the UI would require less work, but it looks like it’s still pretty complicated. An initial scan shows that a lot of it seems to be error handling, which kind of makes sense given the author’s earlier comment. Since validation extends beyond basic type checks and into more complicated validators, they then have to write additional code to handle that while in the UI.

Anyway I wonder if there’s something that can be pulled from here into the proposed solution to simplify the process of providing both config options to users with reduced developer effort.

1 Like

Well the component in itself is kind of complex, here is a simpler one. Much less code same YAML/UI Config ability.

3 Likes

Maybe a good path forward is to showcase what a “best in class” implementation of UI + YAML configuration would look like.

The PS4 example was a first example and fell short in some areas. I am happy to work on the Cast component and see how it goes.

The goals would be:

  1. Minimise SCHEMA to the minimum to avoid breaking changes and maintenance cost.
  2. Use the UI flow to:
    • Configure the entity, if the user wants that UI-configured.
    • Obtain the YAML configuration, if the user prefers to use manual set up.
  3. Allow the user to input YAML directly, even without going to the UI.
  4. Share as much code as possible between both set ups, so there’s no additional cost.
  5. Have a way to work with Auth token that requires user interaction.
  6. Have a way to work with refresh tokens (and alike) that are dynamic over time.

If we are able to find a good, easy to maintain path forward; it can be presented to ADR.

If someone is trying similar things, feel free to contribute to the repository too. It would be nice to create a repository of YAML solutions (opinions exposing limitations of UI-only, potential solutions to bring back YAML, or custom components building on core components to enable YAML).

4 Likes

While I work on other experiments, I have added some documentation to the PS4 one. This expands on all the changes that were required to make it work, and also a bit of additional explanation if someone else wants to build similar components and experiments.

Once completed, I will also add it to Custom Components as a full solution. At the moment, it’s ready and works, but there’s no documentation on how to obtain the token without using the UI. I want to change that.

In the meantime, the full explanation:

Objectives

What was the objective

To showcase that adding YAML is not a high-maintenance task if the setup logic is shared between the main config_flow, async_setup_entry and the deprecated async_setup_platform.

By delegating the implementation of async_setup_platform to async_setup_entry, the extra code and maintenance are very low, which would allow YAML to be added with minimal changes.

What was not the objective

This solution does not represent an end to end solution since PS4 does not require a lot of discovery of devices, OAuth flows or refresh tokens. Further experiments should explore these other areas and find a solution.


What was required to make it work

The changes are very minimal and explained below.

Do note that as this was distributed as a custom_components, we also imported config_flow, translations, services.yaml, manifest.json, etc. This is done in order to keep support for the UI and the custom component working perfectly. However, in the case of a core component this would not have been part of the implementation.

1. Define schema

In order to validate the configuration, a schema must be defined first. While this is optional, it provides a lot of functionality for very few lines.

import voluptuous
from homeassistant.helpers import config_validation

from . import const  # The custom component's own const.py (shown further below).

PLATFORM_SCHEMA = config_validation.PLATFORM_SCHEMA.extend({
    voluptuous.Required(const.CONF_HOST): config_validation.string,
    voluptuous.Optional(const.CONF_NAME): config_validation.string,
    voluptuous.Required(const.CONF_REGION): config_validation.string,
    voluptuous.Required(const.CONF_TOKEN): config_validation.string,
})

This defines the 4 fields that are required to run a PS4 media player. With this code, a new entry can be created using the usual configuration:

media_player:
  - platform: ps4
    name: "PS4"
    token: !secret ps4_token
    host: !secret ps4_wifi_ip
    region: "Singapore"

2. Create an async_setup_platform (setup_platform)

The only task that async_setup_platform performs is adapting the data that comes from the configuration.yaml to the required input of async_setup_entry.

The schema that the device requires is formatted like this:

config_entry: {
  entry_id: string  # Unique entry id. You can use uuid or simply entity id as it's unique.
  data: {  # Contains devices and platform information.
    token: string  # PS4 token obtained from the app.
    devices: [  # List of devices.
      {
        host: string  # IP of the PS4.
        name: string  # Name of the PS4.
        region: string  # Region of the PS4.
      }
    ]
  }
}

The only condition is that entry_id and data are accessed like attributes of a class (i.e. config_entry.entry_id or config_entry.data), whereas all the data inside config_entry.data is shaped like a dictionary (i.e. config_entry.data['token'] or config_entry.data.get('devices')).

The code that was needed to transform from our configuration to config_entry is like this:

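import collections

from homeassistant import util
from homeassistant.components.ps4 import media_player as ps4_media_player

from . import const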

Config = collections.namedtuple(
    'Config', f'{const.CONF_ENTRY_ID} {const.CONF_DATA}')


async def async_setup_platform(
        hass, config, async_add_entities, discovery_info=None):
  """Loads configuration and delegates to official integration."""
  # Load configuration.yaml
  host = config.get(const.CONF_HOST)
  name = config.get(const.CONF_NAME, const.DEFAULT_NAME)
  region = config.get(const.CONF_REGION)
  token = config.get(const.CONF_TOKEN)

  # Format it in the new config_entry.yaml format
  config_entry = Config(
      util.slugify(name),
      {
          const.CONF_TOKEN: token,
          const.CONF_DEVICES: [
              {
                  const.CONF_HOST: host,
                  const.CONF_NAME: name,
                  const.CONF_REGION: region,
              },
          ],
      }
  )

  await ps4_media_player.async_setup_entry(
      hass, config_entry, async_add_entities)

  return True

The method async_setup_entry uses a different format of config_entry than what the configuration.yaml provides. As such, we used a namedtuple as a proxy for a very simple class that allows us to access fields like config_entry.data as it’s done in the PS4 entity.

This currently supports only one device, but you could just keep appending the data to devices if using the same token or create a full new Config if it comes from a different platform.
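A minimal sketch of that extension, assuming each YAML platform entry carries its own token: entries sharing a token are appended to the same devices list, and a new Config is created per distinct token. The `slugify` helper is a simplified stand-in for Home Assistant's own, and using the first device's name as the entry id is an assumption.

```python
import collections
import re

# Same shape as the proxy above: attribute access outside, dict inside.
Config = collections.namedtuple("Config", "entry_id data")


def slugify(text):
    """Simplified stand-in for Home Assistant's util.slugify (assumption)."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def build_config_entries(platform_configs, default_name="PlayStation"):
    """Group YAML platform entries by token, one Config per token."""
    devices_by_token = {}
    for conf in platform_configs:
        device = {
            "host": conf["host"],
            "name": conf.get("name", default_name),
            "region": conf["region"],
        }
        devices_by_token.setdefault(conf["token"], []).append(device)
    return [
        # Assumption: the first device's name is unique enough for an id.
        Config(slugify(devices[0]["name"]), {"token": token, "devices": devices})
        for token, devices in devices_by_token.items()
    ]
```

Each resulting Config could then be passed to async_setup_entry in turn, exactly as the single-device version does.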

3. Documentation

When manual configuration is available, the last step would be adding some documentation. In this case, we advocate to leave very minimal configuration and rely on the UI flow to generate the details if ever needed. This way, documentation does not become a burden.

Further experiments can explore how the UI can become a source of data for configuration.yaml without having to create advanced documentation.

4. Delegating all the other code

For those of you reading this with the intention of creating similar custom_components, I want to add a bit of explanation on how to achieve it. This is not part of the experiment, but of the custom_components logic and it would not be needed if the rest of the code was integrated in the core component.

You need to copy services.yaml and manifest.json to describe your services. This will allow the platform to know all the services that are required by your system.

Do not copy the whole component, you can easily delegate the implementation by doing this:

from homeassistant.components.ps4 import media_player as ps4_media_player

async def async_setup_entry(hass, config_entry, async_add_entities):
  await ps4_media_player.async_setup_entry(
      hass, config_entry, async_add_entities)

This way, any changes on the core component will not affect your component and you will be able to take advantage of them by simply maintaining the Adapter logic.

If you want to support the UI flow as well, you can simply use the same mechanism to delegate it:

from homeassistant import config_entries
from homeassistant.components.ps4 import config_flow

@config_entries.HANDLERS.register(const.DOMAIN)
class PlayStation4FlowHandler(config_flow.PlayStation4FlowHandler):

  def __init__(self):
    super().__init__()

This will implement the same UI flow that the core component has without having to do any coding. In other experiments, I will showcase a better way to improve the UI flow so it also spits out the required code for configuration.yaml, but for now this is a good sample start.

If you do this, you also want to import the translations folder so the messages show. In this experiment, I copied both manually, but it might be best to have a way to copy the core folder to your custom_components automatically. This could be showcased in another experiment, but the main focus here is to show how to minimize the core component's cost of supporting YAML.

As for the constants, in my case I decided to create my own const.py to have flexibility, but virtually all the constants are coming from core constants or ps4 constants really:

# Platform constants.
DOMAIN = 'ps4'
PLATFORMS = ['media_player']

# Attributes.
CONF_HOST = const.CONF_HOST
CONF_NAME = const.CONF_NAME
CONF_REGION = const.CONF_REGION
CONF_TOKEN = const.CONF_TOKEN

# Secondary data attributes.
CONF_DATA = const.CONF_SERVICE_DATA
CONF_DEVICES = const.CONF_DEVICES
CONF_ENTRY_ID = 'entry_id'

# Services.
SERVICE_SEND_COMMAND = 'send_command'

# Service attributes.
ATTR_ENTITY_ID = const.ATTR_ENTITY_ID
ATTR_COMMAND = const.ATTR_COMMAND

# Default values.
DEFAULT_NAME = 'PlayStation'
4 Likes

During the weekend, I tried a different experiment. I used Hue instead of Cast because Cast uses Discovery instead of the UI flows.

The objective of this experiment was to showcase how easy it would be to show the device information in the UI before writing the data. The ultimate objective is to show that the UI can be a good way to onboard the component, which would reduce the cost of maintaining documentation while still providing an option for YAML components. As always, the conclusions are at the end and you can navigate through the titles.

I am compiling all experiments on this component, and it’s open to anyone who is working on similar objectives.

I want to send this experiment (or a variant) to ADR, so all feedback is welcomed (positive or negative).


Hue Experiment

This is an experiment to showcase how UI flow can act as a self-documenting feature for YAML configuration.


Objectives

What was the objective

To showcase that the UI configuration flow can be used to generate the same data needed for the configuration.yaml and allow the user to take the last decision on the configuration.

The underlying objective is to prove that the same config_flow can be used to set up both UI and YAML configuration without adding significant cost to the component owner.

What was not the objective

To implement the code required to read YAML configuration. This is why we focused on Hue component which still supports YAML configuration.

This experiment is also not intended to be a custom_component as the original core component has all the required documentation and features, but it could be used in the future. Nonetheless, this would work if it gets installed as a custom_component.


What was required to make it work

The only change required is to add an additional step before async_create_entry that shows the output to the user for confirmation.

0. Understanding the original device creation

The method async_step_link in the hue core component is responsible for getting the bridge information and creating the entry device. This is the logic responsible for this:

async def async_step_link(self, user_input=None):
  # ...
    return self.async_create_entry(
        title=bridge.config.name,
        data={
            "host": bridge.host,
            "username": bridge.username,
            CONF_ALLOW_HUE_GROUPS: False,
        },
    )
  # ...

However, we wanted to add an additional step to ensure that we could validate and read the configuration. As such, we have replaced this logic with a few changes described below.

1. Store device data into a class attribute

Instead of directly creating the entry, we store the data required into a class attribute, which will allow us to use it later to finally create the entry.

  self._set_up_data = {
      'title': title,
      'data': {
          'host': host,
          'username': username,
          'allow_hue_groups': allow_hue_groups,
      }
  }

Note: while it was not required, for this experiment, we have stored the original values into variables as we will be using them in two different places on this implementation.

  title = bridge.config.name
  host = bridge.host
  username = bridge.username
  allow_hue_groups = False

2. Format data into YAML Configuration format

Change the format from config_entry (_set_up_data) to the one required by YAML configuration.

import yaml

yaml_data = {
    'hue': {
        'bridges': [{
            'host': host,
            'allow_hue_groups': allow_hue_groups,
        }]
    }
}
yaml_configuration = yaml.dump(yaml_data)

Note: yaml_data is only required because the data required by async_create_entry and YAML configuration are different. We advocate to have an equivalent setup so there is no need to maintain two different schemas. If this was true, this would only have required this line: yaml_configuration = yaml.dump(self._set_up_data)
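To show why the two shapes need an adapter today, here is a small sketch of the mapping in both directions using plain dictionaries. The function names are hypothetical; `username` is left out of the YAML shape to match the yaml_data above, and the `False` default for `allow_hue_groups` is an assumption mirroring the value used in the flow.

```python
def entry_to_yaml_data(entry_data):
    """Reshape config-entry data into the nested `hue:` YAML layout."""
    return {
        "hue": {
            "bridges": [{
                "host": entry_data["host"],
                "allow_hue_groups": entry_data["allow_hue_groups"],
            }]
        }
    }


def yaml_data_to_entries(yaml_data):
    """Inverse direction: one flat dict per bridge listed in YAML."""
    return [
        {
            "host": bridge["host"],
            "allow_hue_groups": bridge.get("allow_hue_groups", False),
        }
        for bridge in yaml_data["hue"]["bridges"]
    ]
```

If both sides shared one structure, neither function would be needed and the dump would operate on the stored data directly.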

3. Show data before creating the device

We need to edit the original async_step_link to make a call to a new step before calling async_create_entry. This is done by passing the previous data into a form like this:

return self.async_show_form(
    step_id="confirmation",
    description_placeholders={
        'yaml': yaml_configuration,
    }
)

Note: this logic replaces the previously described return self.async_create_entry(...).

4. Add additional step to show YAML configuration

When this step runs, it displays a form showing the YAML data. Once the user submits the form, the entry is created. The code responsible for this:

  async def async_step_confirmation(self, user_input=None):
    """Creates device entries after confirmation."""
    if not self._set_up_data:
      raise ValueError('Configuration flow failed.')

    return self.async_create_entry(
        title=self._set_up_data.get('title'),
        data=self._set_up_data.get('data'),
    )

As a new step is created, you need to also add some translations to show the message:

"confirmation": {
  "title": "Confirm Configuration",
  "description": "Your device configuration is ready. \n\nIf you prefer to set up the device manually, use this code on your `configuration.yaml`:\n\n```yaml\n{yaml}```\n\nIf you prefer to set it up via UI, simply click submit below."
}

Summary

While the boilerplate may seem long, if the same configuration structure were shared between YAML and UI, this would only require showing a YAML dump of the device details before the single self.async_create_entry() call. This should not require more than 5-10 lines of code (including English strings) and provides great functionality.

Result

This will show the following message in the UI:

Form with YAML configuration for hue component

The user can copy the code into configuration.yaml or simply press submit to complete the UI setup.


Alternative implementations

This implementation may not look friendly for UI-only users. There are a few ways to improve this.

Optional YAML configuration

One option is to only show the YAML configuration to those who use YAML configuration. This could be a user flag that allows us to know if it’s an advanced user or not. This is similar to enabling/disabling lovelace dashboards or showing the Developer Tools.

The code will be something like this:

if is_advanced_user:
  return self.async_show_form(
      step_id="confirmation",
      description_placeholders={
          'yaml': yaml_configuration,
      }
  )
else:
  return await self.async_step_confirmation()

Make configuration editable

Another option is to additionally show the configuration data to the user, so they can edit it. This is not a replacement for YAML configuration, but a way to make this confirmation box also useful for UI-only users.

This would require defining the form data via a Schema:

  data_schema = voluptuous.Schema({
      voluptuous.Required("title", default=title): str,
      voluptuous.Required("host", default=host): str,
      voluptuous.Required("username", default=username): str,
      voluptuous.Required(
          "allow_hue_groups", default=allow_hue_groups): bool,
  })

Additionally, the form call would change slightly:

return self.async_show_form(
    step_id="confirmation",
    data_schema=data_schema,
)

As the form has changed, the translation text would also be different:

  "confirmation": {
    "title": "Confirm Configuration",
    "description": "This is the data configuration that will be set up.",
    "data": {
      "title": "Title",
      "host": "host",
      "username": "username",
      "allow_hue_groups": "allow_hue_groups"
    }
  }

The form data could be gathered by the confirmation step from the user_input instead.

  async def async_step_confirmation(self, user_input=None):
    """Creates device entries after confirmation."""
    config_title = user_input.get('title')
    config_data = {
        'host': user_input.get('host'),
        'username': user_input.get('username'),
        'allow_hue_groups': user_input.get('allow_hue_groups')
    }

    return self.async_create_entry(
        title=config_title,
        data=config_data,
    )

At the end, this would show like this:

Form with all details from UI configuration for hue component

The form data could also be combined with the one from YAML configuration to offer the best of both worlds.

Overall, this alternative implementation empowers users to confirm the configuration before creating devices, and can be a good step to also provide the data needed for YAML configuration in a more user-friendly way. The extra schema seems to add to the cost (despite being exactly the one required to be stored), and as such, we have opted for a more lightweight option for the main proposal.

Centralized confirmation on core code

This code was implemented by a custom_component and, as such, it has focused on modifying a core component to add this additional step. However, a more efficient way to do this would be to implement it natively in self.async_create_entry under core/data_entry_flow.py. Any data received to create a new entry would be shown to the user in a form, which they would need to submit before the entry is created. This works best when the confirmation is optional and only shown to advanced users.

This effectively would have no cost for the components developers.
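The shape of such a central hook can be sketched with a toy class. This is not Home Assistant's real flow handler: `ConfirmingFlow`, `create_entry`, `confirm` and the `advanced_user` flag are hypothetical names, and the returned dictionaries only imitate the general form/entry distinction.

```python
class ConfirmingFlow:
    """Toy flow handler; not Home Assistant's real FlowHandler class."""

    def __init__(self, advanced_user=False):
        self.advanced_user = advanced_user
        self._pending = None

    def create_entry(self, title, data):
        """Called by integrations exactly as before."""
        if not self.advanced_user:
            return {"type": "create_entry", "title": title, "data": data}
        # Hold the entry back and ask for confirmation first.
        self._pending = {"title": title, "data": data}
        return {
            "type": "form",
            "step_id": "confirmation",
            "description_placeholders": {"config": repr(data)},
        }

    def confirm(self):
        """Called when the user submits the confirmation form."""
        if self._pending is None:
            raise ValueError("nothing to confirm")
        pending, self._pending = self._pending, None
        return {"type": "create_entry", **pending}
```

Because the interception lives in the shared base, an integration keeps calling `create_entry` once and needs no confirmation code of its own.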


Conclusions

For Component Owners

The same UI flow can be used to generate data for both manual YAML configurations and UI configurations. This removes the need to document where the configuration.yaml parameters come from, which reduces the cost for component owners of providing YAML configuration.

For UI Users

For users who prefer to use the UI, there are no changes to their current workflows. At worst, they may have one additional step where they see the configuration. However, this can also be shown only to users who enable “advanced” mode, similar to how some configuration options are shown only to some users. The config_flow allows this type of conditional routing to one step or another.

For YAML Users

Users who want to maintain components via YAML are encouraged to use the UI to set up if they are not sure how the data should be filled in. However, once the data is filled in, the users can edit it or create new ones using the same structure.

7 Likes

There would also need to be some function to reload the integration upon reconfiguration as well to get the same functionality from YAML configuration as the config flow.

Hi @firstof9, thanks a lot for the feedback :slight_smile:

One clarification: this code does not actually write to the configuration.yaml file. As Paulus described, this would have problems like potentially breaking secrets, includes, etc.

As such, it just shows you the configuration information in case you want to opt-in for manual YAML configuration. You would need to copy/paste the YAML snippet there.

In YAML mode, edits are made by editing the configuration.yaml file, not via the UI. I might be wrong, but as far as I know these YAML configurations do not show up under Integrations and therefore do not support editing there. If they do show up, then yes, when editing you would also want to show a confirmation step first, but I am not aware of how this works at the moment.

1 Like

I know, I wasn’t implying that, merely asking how a reload would be handled. Currently via the config flow you can unload and then load the entities again. In YAML mode you’d have to restart Home Assistant completely. While that's not a show stopper, I like this functionality myself as it leaves fewer gaps in my sensor data.

Makes sense. Thank you for clarifying. If I understand correctly, it would be something like “Reload Automations” but for “Reload Components or Integrations”, is that correct? i.e. Similar to this feature request.

Right, but it currently only works for config flow integrations. So additional core work would need to be done, as I understand it, for YAML configuration to be “reloadable”.

I could be wrong here but that’s my assumption :man_shrugging:

The difference is that those are in their own included files. A new engine would be required to just load partial configs from full files with multiple configs.

Actually I think they’re reloaded by domains in that case, since you can have automation & scripts spread across packages as well.

I could be wrong, but that’s how I understood it. Since they’re not really working like a component ie: no requirements, no external polling etc.

Packages are loaded differently if you follow the code. Everything loads from a single file at first and separated after.

1 Like