Unicode friendly names and entity_id transliteration

The text below is a straight copy of issue #14913 from GitHub. I’m hoping it gets some love from developers, to make life easier for non-English-speaking users and to make their experience comparable to that of English speakers.

Home Assistant release with the issue: any (until this is fixed somehow).
Last working Home Assistant release (if known): -
Operating environment (Hass.io/Docker/Windows/etc.): any
Component/platform: any; the worst offender is MQTT autodiscovery.
Description of problem:
There is a problem with how an entity_id is assigned to an entity in Home Assistant. In many cases (automations, MQTT discovery, etc.), as I understand it, the entity_id is created like this:

  1. take the friendly_name or alias
  2. replace spaces with ‘_’ and strip every character that is not a Latin letter or an underscore (anything matching [^a-zA-Z_])
  3. use the result as the new entity_id.
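
The steps above can be sketched in a few lines of Python. This is only an illustration of the procedure as I describe it, not Home Assistant's actual code; the function name `naive_entity_id` is made up for this example.

```python
import re


def naive_entity_id(domain: str, name: str) -> str:
    """Hypothetical sketch of the steps listed above (not Home Assistant code)."""
    # Step 2a: replace spaces with underscores.
    slug = name.replace(" ", "_")
    # Step 2b: strip everything that is not a Latin letter or an underscore.
    slug = re.sub(r"[^a-zA-Z_]", "", slug)
    # Step 3: the result becomes the entity_id.
    return f"{domain}.{slug}"


first = naive_entity_id("automation", "Некоторый один русский текст")
second = naive_entity_id("automation", "Другой некоторый русский текст")
# Both aliases contain four words, so only the three underscores survive
# and the two entity_ids are identical.
print(first, second, first == second)
```

Under these assumptions, any two all-Cyrillic aliases with the same number of words end up with the same entity_id.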

This method may be acceptable for languages written mostly in the Latin alphabet, but it breaks for every other script: a name made up entirely of non-Latin characters is reduced to nothing but underscores.
Consider this perfectly valid piece of automation configuration:

- alias: Некоторый один русский текст
  trigger:
  - platform: time
    at: *redacted*
  action:
  - service:   *redacted*
    entity_id:  *redacted*
  id:  *redacted*
- alias: Другой некоторый русский текст
  trigger:
  - platform: time
    at:  *redacted*
  action:
  - service:   *redacted*
    entity_id:  *redacted*
  id:  *redacted*

And here is the traceback it produces in the hass log:

2018-06-10 19:23:37 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/asyncio/tasks.py", line 180, in _step
    result = coro.send(None)
  File "/home/hass/env/lib/python3.6/site-packages/homeassistant/helpers/entity_platform.py", line 308, in _async_add_entity
    msg)
homeassistant.exceptions.HomeAssistantError: Entity id already exists: automation.___

Some possible fixes for this behavior, which affects any user who does not write in English:

  1. Add a friendly name (alias) to entity_id translation map as a configuration option.
  2. Add an entity_id field to the automation configuration so it can be hard-coded there (doesn’t work for discovery).
  3. Add a global (per hass installation) function, templateable through configuration, that takes an arbitrary string and, possibly, a domain, and returns a valid entity_id.
  4. A VERY bad option: restrict friendly names and aliases to a-zA-Z only.
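
To make option 3 more concrete, here is a minimal sketch of what such a function could look like for Russian. Everything in it is an assumption for illustration: the name `transliterated_entity_id`, the `CYRILLIC_TO_LATIN` table and the romanization it uses are made up, and nothing like this is part of Home Assistant's API as far as I know.

```python
import re

# Hypothetical romanization table for Russian, for illustration only.
CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def transliterated_entity_id(domain: str, name: str) -> str:
    """Hypothetical sketch: transliterate first, then build the entity_id."""
    lowered = name.lower()
    # Map each Cyrillic letter to its Latin replacement; keep other characters.
    latin = "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in lowered)
    # Collapse every run of remaining non-alphanumeric characters into one "_".
    slug = re.sub(r"[^a-z0-9]+", "_", latin).strip("_")
    return f"{domain}.{slug}"


print(transliterated_entity_id("automation", "Некоторый один русский текст"))
print(transliterated_entity_id("automation", "Другой некоторый русский текст"))
```

With a table like this the two aliases from the example configuration map to different, readable entity_ids. A real implementation would need a table (or a transliteration library) covering every script, which is why making the function configurable per installation seems attractive.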

For now there is a very ugly hack: add some English-only identifier to the friendly name or alias of each entity, so that the conversion process described earlier doesn’t mangle two aliases into the same entity_id. Imagine yourself as an English-speaking person needing to decode something like “СПЛНСВТ Bedroom lights” as a friendly name!