Display updates disrupt audio output

I am trying to get both the display and an audio output via a MAX98357 to work on a WT32-SC01.

As long as the display is not configured, the audio output of an MP3 stream via i2s_audio and media_player works without problems.

Dropouts and noise can be heard whenever the display is enabled and updated.
To get tolerable output I have to set update_interval to several seconds.
For example, with update_interval: 5s I hear the noise every 5 s. It gets better when I set auto_clear_enabled to False, probably because less then has to be written to the screen.
Also, the more I draw on the screen, the worse the problem becomes.

My guess is that the update of the display simply takes so long that the DMA(?) buffer for the audio output runs empty.
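
To put rough numbers on that guess, here is a minimal back-of-the-envelope sketch. Everything in it is an assumption for illustration, not something I measured: a full 480x320 redraw at 16 bits per pixel, an SPI clock of 40 MHz, and a hypothetical audio buffer of 2048 frames at 44.1 kHz. The helper names are made up. It also assumes the audio buffer is not refilled while the display update is running.

```cpp
#include <cstdint>

// Time in microseconds to push one full frame over SPI.
// Ignores command bytes and any gaps between transfers, so this is a lower bound.
constexpr std::uint64_t frame_transfer_us(std::uint32_t width, std::uint32_t height,
                                          std::uint32_t bits_per_pixel,
                                          std::uint32_t spi_hz) {
  return static_cast<std::uint64_t>(width) * height * bits_per_pixel * 1000000ULL / spi_hz;
}

// Playback time in microseconds held by a buffer of `frames` audio frames.
constexpr std::uint64_t buffer_duration_us(std::uint32_t frames,
                                           std::uint32_t sample_rate_hz) {
  return static_cast<std::uint64_t>(frames) * 1000000ULL / sample_rate_hz;
}

// True if the buffered audio outlasts one uninterrupted display update.
constexpr bool buffer_survives_update(std::uint64_t update_us, std::uint64_t buffer_us) {
  return buffer_us > update_us;
}

// Assumed values: 480x320 pixels, RGB565, 40 MHz SPI clock.
constexpr std::uint64_t kUpdateUs = frame_transfer_us(480, 320, 16, 40000000);
// Hypothetical buffer: 2048 frames at 44.1 kHz.
constexpr std::uint64_t kBufferUs = buffer_duration_us(2048, 44100);
```

With these assumed numbers a full redraw takes about 61 ms on the bus, while the hypothetical buffer holds only about 46 ms of audio, so it would run dry. That would also fit the observation that drawing less makes things better.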

Any ideas? Known problem?

Here are the (hopefully) relevant parts of the config file:

#
#
# Backlight
#
#

output:
  - platform: ledc
    pin: GPIO23
    id: gpio_23_backlight_pwm

light:
  - platform: monochromatic
    output: gpio_23_backlight_pwm
    name: "${friendly_name} Backlight"
    id: wt32_backlight
    restore_mode: RESTORE_DEFAULT_ON



#
#
# Display
#
#

spi:
  clk_pin: GPIO14
  mosi_pin: GPIO13
  miso_pin: GPIO12

[...]

# 320x480
display:
  - platform: ili9xxx
    id: my_display
    model: ST7796
    cs_pin: GPIO15
    dc_pin: GPIO21
    reset_pin: GPIO22
    rotation: 90
    update_interval: 5s
    auto_clear_enabled: False
    lambda: |-
        auto black = Color(0, 0, 0);
        // Button outlines and icons
        it.rectangle(0, 0, 150, 150);
        it.rectangle(150, 0, 150, 150);
        it.image(0, 0, id(imgplay));
        it.image(150, 0, id(imgstop));
        it.image(300, 150, id(imgminus));
        it.image(300, 0, id(imgplus));
        // Clear the 30 px status line at the bottom (x, y, width, height)
        it.filled_rectangle(0, 320 - 30, 480, 30, black);
        // Playback position / total length as m:ss, followed by the title
        int pos = (int) id(media_pos).state;
        int len = (int) id(media_len).state;
        it.printf(0, 320 - 30, id(font20), "%d:%02d/%d:%02d %s",
                  pos / 60, pos % 60, len / 60, len % 60, id(title).state.c_str());

#
#
# Touchscreen
#
#

external_components:
  - source: github://gpambrozio/esphome@FT6336U-touch
    components: [ ft63x6 ]

i2c:
  id: i2c_bus_intern
  sda: 18
  scl: 19
  scan: false

touchscreen:
  - platform: ft63x6
    id: ${id_prefix}_touch
    i2c_id: i2c_bus_intern
    on_touch:
      - script.execute: backlight_script

[...]

# Audio
i2s_audio:
  i2s_lrclk_pin: GPIO25
  i2s_bclk_pin: GPIO26

media_player:
  - platform: i2s_audio
    name: ESPHome I2S Media Player
    dac_type: external
    i2s_dout_pin: GPIO32
    mode: mono