Converting the new €5 IKEA KALLSUP speaker into an ESPHome media player

IKEA just released a pleasantly priced Bluetooth speaker, the KALLSUP.
It is not hard to convert it into a Home Assistant media player :wink:

Based on the info from this forum thread about retrofitting the IKEA VAPPEBY speaker by @formatBCE, it was rather easy to get the new speaker integrated into HA / Music Assistant.


KALLSUP speaker with original Bluetooth board and battery.

Replacement board, consisting of a XIAO ESP32-S3 and a MAX98357A DAC.


Two buttons soldered on the back of the board, lining up with the buttons at the top of the speaker case.


Everything fits quite easily. The battery is not used.


Closeup of the buttons.

The 3 W speaker sounds fine for the money.
Integration into Music Assistant was easy, and the new (still experimental) Sendspin streaming protocol also works well for synced multiroom sound.

The connections for the XIAO ESP32-S3 and a MAX98357A DAC were:

I2S LRC → GPIO07
I2S BCLK → GPIO09
I2S DIN → GPIO08
GND → GND
VIN → 5V
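
For reference, a minimal sketch of how these DAC pins map onto ESPHome's i2s_audio options. Note that the DAC's DIN is the ESP's data output, so it goes under i2s_dout_pin. The IDs are placeholders; the full configuration used for this build is shared further down in the thread.

```yaml
i2s_audio:
  - id: i2s_out              # placeholder ID
    i2s_lrclk_pin: GPIO07    # MAX98357A LRC
    i2s_bclk_pin: GPIO09     # MAX98357A BCLK

speaker:
  - platform: i2s_audio
    id: kallsup_speaker      # placeholder ID
    i2s_audio_id: i2s_out
    dac_type: external       # MAX98357A is an external I2S DAC/amplifier
    i2s_dout_pin: GPIO08     # MAX98357A DIN
```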

The buttons are connected to:
Button 1 → GPIO04
Button 2 → GPIO43
both → 3.3 V
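
Because the buttons switch to 3.3 V, the GPIO needs a pull-down and reads high when pressed. A minimal sketch of one button as an ESPHome binary sensor: the delayed_on debounce filter is an optional extra that is not part of the original build, and external_media_player is the player ID from the config shared later in this thread.

```yaml
binary_sensor:
  - platform: gpio
    name: "Button Play"
    pin:
      number: GPIO04
      mode: INPUT_PULLDOWN   # pin idles low, goes high when the button closes to 3.3 V
      inverted: false
    filters:
      - delayed_on: 10ms     # assumed debounce value, tune as needed
    on_press:
      - media_player.toggle: external_media_player
```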

The setup also works with the new ESP32-C5, with the possible benefit of 5 GHz Wi-Fi.
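
A rough, untested sketch of what would presumably change in the main YAML for a C5 board. It assumes that the esp32 component in your ESPHome version accepts the esp32c5 variant. The octal psram block and the CONFIG_ESP32S3_* sdkconfig options in the config shared below are S3-specific and would have to go, and the GPIO numbers need adapting to the board used.

```yaml
substitutions:
  task_stack_in_psram: "false"   # the config's own comment says to disable this on non-S3 models

esp32:
  variant: esp32c5               # assumption: supported by your ESPHome version
  framework:
    type: esp-idf
```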


First test with the C5.


Nice!
Could you tell me what height of button you used, please?

I had short and longer ones at hand, neither the right size,
so I shortened the longer ones until they fit. :wink:


Nice, how loud is it?

Hey man, can we have the code? Are you using ESPHome in Home Assistant? I'm working on it too. Thanks in advance.

Yup, the config is ESPHome and I'm happy to share.
I built on info shared by others :wink:

While experimenting I kept reusing specific parts of the YAML, so I discovered the advantage of packages (pulled in with !include) and substitutions.
The packages are placed in the common subfolder under esphome.
That made it easy to switch / test a config for new Sendspin development or a different ESP board. This config is for a XIAO ESP32-S3.
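
As a stripped-down illustration of the idea (file names, node name and substitution key are placeholders, not the real files that follow): the main file defines a substitution and pulls in a package, and the package refers to that substitution. ESPHome merges the package into the main config, which is why blocks such as esp32: can appear in several files.

```yaml
# main.yaml (placeholder name)
substitutions:
  pin_led: GPIO21

packages:
  - !include common/led.yaml   # placeholder package file

esphome:
  name: demo-node
```

```yaml
# common/led.yaml (placeholder name)
status_led:
  pin:
    number: ${pin_led}   # resolved from the substitutions in main.yaml
```

Swapping boards then mostly comes down to changing the substitutions and the list of included packages.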

I did not clean the code of all comments, as I think they might help (at least they do for me).

Answering the earlier question:
KALLSUP is not too loud compared with bigger speakers, but for the size and price it's quite OK. It's 3 W, so I power the DAC from 5 V instead of 3.3 V.

Main YAML Code:

## ------------ Template v2 -------  
## ------------ 08.03.2026 --------

# ----------------  Start --------------
substitutions:
  name: "audio-kallsup-01-s3x7"
  friendly_name: Audio-KALLSUP-01-S3X7
  api_key: !secret api_key  
  ota_key: !secret ota_key   
  
# ----------------  GPIOs ---------------

#  pin_i2c_sda: GPIO08  #sda/mtck/
#  pin_i2c_scl: GPIO09  #scl/mtdo

#  pin_spi_clk: GPIO12   #sclk/clk/sck/clock
#  pin_spi_cs: GPIO10  #cs/chip select
#  pin_spi_mosi: GPIO11  #mosi/sdo/data out
#  pin_spi_miso: GPIO13  #miso/sdi/data in

#  pin_spi_rst: GPIOxx   #rst/res/reset
#  pin_spi_dc: GPIO21
#  pin_spi_bl: GPIO46

  pin_i2s_lrclk: GPIO07 #leftrightclock/lrck fs/framesync /ws/word select 
  pin_i2s_dout: GPIO08 #dout/sdout/ sd/sdata/dacdat/data 
  pin_i2s_bclk: GPIO09  #bck/bcclk/bit clock sck/ serial clock (optional)
#  pin_i2s_mclk: GPIOxx  #mclck/master clock

#  pin_i2s_din: GPIO02  #din/sdin/data in /sd/serial data/ adcdat

#  pin_uart_rx: GPIO18
#  pin_uart_tx: GPIO17

  led_01: GPIO21  #onboard LED  
  
  task_stack_in_psram: "true" # important to disable this for non-S3 model. Slower, use if running out of memory with many components.     

# ---------------- Packages --------------

packages:
  - !include common/sendspin_06.yaml
  - !include common/wifi_02_s3.yaml

# ----------------  Basics --------------

esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  name_add_mac_suffix: false

esp32:
  board: seeed_xiao_esp32s3
#  flash_size: 8MB # 16MB for N16R8. Put yours.
  framework:
    type: esp-idf

psram:
  mode: octal
  speed: 80MHz
  ignore_not_found: false

# ------------------- Connection ---------------

# Enable Home Assistant API
api:
  encryption:
    key: ${api_key}
  reboot_timeout: 0s # to avoid rebooting without HA

# Allow Over-The-Air updates
ota:
  - platform: esphome
    password:  ${ota_key}

#wifi:
# >>>>> see external wifi.YAML

http_request:

network:
  enable_ipv6: true

mdns:
  disabled: false


# -------------- Web Interface -------------

web_server:
  port: 80
  version: 3

debug:
  update_interval: 10s

# Enable logging
logger:
# level: none
# level: error
# level: warn
# level: info
  level: debug
# level: verbose
# level: very_verbose
  logs:
    wifi: info
    api: info
    mdns: info

# -------------- Interface Elements ------------

button:
  - platform: restart
    id: restart_button  
    name: "Restart ${friendly_name}"
    entity_category: config
    icon: "mdi:restart"

  - platform: factory_reset
    id: factory_reset_button
    name: "Factory Reset"
    entity_category: diagnostic
    internal: true
    disabled_by_default: true    

# -------------- Platforms ------------------

#i2c:

#  sda: ${pin_i2c_sda}
#  scl: ${pin_i2c_scl} 
#  scan: true
# frequency: 200kHz #100kHz or 400kHz
# timeout: 1s

#spi:
#  - id: spi_bus0
#    clk_pin: ${pin_spi_clk}
#    mosi_pin: ${pin_spi_mosi}    
#    miso_pin: GPIOxx
#    interface: hardware

#uart:
#  rx_pin: ${pin_uart_rx}
#  baud_rate: 9600

# -------------- Bluetooth -----------------

# -------------- Audio  -----------------

# >>>>> see external sendspin.YAML

# --------------- LED / LIGHT --------------

status_led:
    pin:
        number: ${led_01}

# --------------- Globals --------------
# Globals (since-boot only; not restored)


time:
    - platform: homeassistant
      id: time_homeassistant

    - platform: sntp
      id: time_sntp
      timezone: Europe/Berlin
      servers:
          - 0.pool.ntp.org
          - 1.pool.ntp.org

# ----------------Script --------------


# --------------- Sensors --------------

binary_sensor:
  - platform: status
    name: "Status"    

# -------------buttons ---------------------

# Buttons for controlling KALLSUP, still looking for a good use for the BT button

  - platform: gpio
    pin: 
      number: 04
      mode: INPUT_PULLDOWN
      inverted: false # Button connects to 3.3 V; the pull-down keeps the pin low when released
    name: "Button Play"
    on_press:
      - media_player.toggle: external_media_player 

  - platform: gpio
    pin: 
      number: 43
      mode: INPUT_PULLDOWN
      inverted: false # Button connects to 3.3 V; the pull-down keeps the pin low when released
    name: "Button BT"
    on_press:
      - media_player.mute: external_media_player 

text_sensor:
  - platform: debug
    device:
      name: "Device Info"    

sensor:
# ---------------diagnostic sensors -------------

  - platform: uptime
    type: seconds
    name: Uptime Sensor    
  - platform: internal_temperature
    name: "Internal Temperature"    

  - platform: debug
#    free:
#      name: "Heap Free"
#    block:
#      name: "Heap Max Block"
#    loop_time:
#      name: "Loop Time"
    psram:
      name: "Free PSRAM"
#    cpu_frequency:
#      name: "CPU Frequency"

# -------------real sensors ---------------------

# ------------ Displays ---------------------

Config for sound. Only the media pipeline, as I don't need announcements.
sendspin_06.yaml


# -------------- Sendspin Player, no announcement ------------

# ----  223.03.2026
# ----  based on
# ----  https://github.com/esphome/esphome/pull/14933
# ----  component changes 26.03


esphome:
  min_version: 2026.3.0
  on_boot:
    priority: 220.0
    then:
      media_player.play_media:
        id: external_media_player
        media_url: audio-file://startup_sync_sound

network:
  enable_high_performance: true
# The optimization level is PSRAM-aware:
# - with PSRAM guaranteed:  (Psram configured with ignore_not_found: false): Aggressive settings with 512KB TCP windows and 512 WiFi RX buffers
# - without PSRAM guaranteed:  Conservative optimized settings with 65KB TCP windows and 64 WiFi buffers
# If you experience out-of-memory issues, you can disable these optimizations by setting enable_high_performance: false


esp32:
 
  framework:
    type: esp-idf
    version: recommended    
    sdkconfig_options:
    
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_ESP32S3_INSTRUCTION_CACHE_32KB: "y"

      # Moves instructions and read only data from flash into PSRAM on boot.
      # Both enabled allows instructions to execute while a flash operation is in progress without needing to be placed in IRAM.
      # Considerably speeds up mWW at the cost of using more PSRAM.
      CONFIG_SPIRAM_RODATA: "y"
      CONFIG_SPIRAM_FETCH_INSTRUCTIONS: "y"

      CONFIG_BT_ALLOCATION_FROM_SPIRAM_FIRST: "y"
      CONFIG_BT_BLE_DYNAMIC_ENV_MEMORY: "y"

      CONFIG_MBEDTLS_EXTERNAL_MEM_ALLOC: "y"
      CONFIG_MBEDTLS_SSL_PROTO_TLS1_3: "y"  # TLS1.3 support isn't enabled by default in IDF 5.1.5
      

# -------------- External Components  -----------------

external_components:

  - source: github://pr#14933
    components: [const, generic_image, media_source, sendspin]
    refresh: 1h
    
  - source:
      type: git
      url: https://github.com/esphome/esphome
      ref: ff8ce89556748509d7ee8724e12d9d43d3c8c1e8
    refresh: 1h
    components: [http_request]

# -------------- Audio  -----------------

i2s_audio:
  - id: i2saudio_output
    i2s_lrclk_pin: 
      number: ${pin_i2s_lrclk}  #lrclk - leftrightclock, WS wordselect, FS - framesync 
      allow_other_uses: false
    i2s_bclk_pin:  
      number: ${pin_i2s_bclk}  #BCLK - bit clock , SCK - serial clock
      allow_other_uses: false


# ===== Speaker & Audio Pipeline =====

speaker:
  # Hardware speaker output
  - platform: i2s_audio
    id: i2s_audio_speaker
    sample_rate: 48000   # was 44100; 48000 suits setups where most audio sources are 48 kHz
    i2s_mode: primary
    i2s_dout_pin: ${pin_i2s_dout} 
    #mute_pin: XX   
    bits_per_sample: 32bit #16bit
    i2s_audio_id: i2saudio_output
    dac_type: external
    channel: stereo
    timeout: never  #  speaker_timeout: 250ms
    buffer_duration: 250ms      
    # 100ms There is still room to reduce the buffer duration.
    # Although at a quarter second, responsiveness is quite okay.   


# ===== Sendspin Media Player =====

sendspin:
  id: sendspin_hub
  task_stack_in_psram: ${task_stack_in_psram}
#  kalman_process_error: 0.01
  
http_request:
  buffer_size_rx: 2048  # Reduces CPU load when streaming audio

media_source:

# Necessary for playing synchronized audio
  - platform: sendspin
    id: sendspin_source
    buffer_size: 500000 #   

  - platform: http_request
    id: http_source
    buffer_size: 500000 #200000

  - platform: audio_file
    id: audio_source


audio_file:
  - id: startup_sync_sound
    file: startup_sound.flac
#      file: https://github.com/mrtoy-me/esphome-tas5805m/raw/main/components/tas5805m/tas5805m_boot_louder.flac


media_player:

# Only necessary for controlling a Sendspin group. It does not play synchronized audio! 
  - platform: sendspin
    id: sendspin_group_media_player

# Necessary for playing synchronized audio and controlling the Sendspin group
  - platform: speaker_source
    id: external_media_player 
    name: "${friendly_name} Player"
    media_pipeline:
      speaker: i2s_audio_speaker
      sources:
        - sendspin_source
        - http_source    
        - audio_source

# -------------- Image  -----------------

# Only necessary for displaying cover art on a display

#generic_image:
#  - platform: sendspin
#    id: sendspin_cover_art
#    format: jpg
#    type: rgb565
#    resize: 200x200
#    image_source: ALBUM


# -------------- Sensor  -----------------

text_sensor:

  - platform: sendspin
    type: title
    id: sendspin_title    
    name: "Stream - Title"

  - platform: sendspin
    type: artist
    id: sendspin_artist        
    name: "Stream - Artist"

  - platform: sendspin
    type: album
    id: sendspin_album    
    name: "Stream - Album"

  - platform: sendspin
    type: track
    id: sendspin_number    
    name: "Stream - Track Number"

#  - platform: sendspin
#    type: year
#    name: Year
#    id: sendspin_year    
#    name: "Stream - Year"

sensor:

#  - platform: sendspin
#    type: kalman_error
#    name: Sendspin Kalman error  # Time sync filter uncertainty estimate
#    entity_category: diagnostic
##    disabled_by_default: true
    
#  - platform: sendspin
#    type: audible_syncs
#    name: Sendspin Audible syncs  # Major sync events that are audible
#    entity_category: diagnostic
#    disabled_by_default: true

  - platform: sendspin
    type: track_duration
    id: sendspin_track_duration

The last YAML is for WiFi.
wifi_02_s3.yaml

esp32:
  framework:
    type: esp-idf
    sdkconfig_options:
        CONFIG_SOC_WIFI_HE_SUPPORT: y     # ESP32
        CONFIG_ESP_WIFI_11AX_SUPPORT: y   # ESP32


# ------------------- Connection ---------------

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  domain: .fritz.box
  enable_btm: true  # 802.11v BSS Transition Management (only ESP32)
  enable_rrm: true  # 802.11k Radio Resource Management (only ESP32) 
  fast_connect: true
  enable_on_boot: true
  power_save_mode: NONE  
  on_disconnect:
      then:
          - lambda: |-
              id(_wifi_disconnects_since_boot)++;

# Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "ESPhome-Fallback"
    password: !secret wifi_password2  

# In combination with the `ap` this allows the user
# to provision wifi credentials to the device via WiFi AP.
captive_portal:


# --------------- Globals --------------
# Globals (since-boot only; not restored)

globals:
    - id: _wifi_disconnects_since_boot
      type: int
      restore_value: no
      initial_value: '0'
      
 
# --------------- Sensors --------------


text_sensor:
  - platform: wifi_info
    ip_address:
        name: "WiFi IP"
        entity_category: diagnostic
#    mac_address:
#        name: "WiFi MAC"
#        entity_category: diagnostic
#    ssid:
#        name: "WiFi SSID"
#        entity_category: diagnostic
#    bssid:
#        name: "WiFi BSSID"
#        entity_category: diagnostic      
    power_save_mode:
        name: "WiFi Powersafe mode"
        entity_category: diagnostic   
      
sensor:
  - platform: wifi_signal # Reports the WiFi signal strength/RSSI in dB
    name: "WiFi Signal dB"
    id: wifi_signal_db
    update_interval: 60s
    entity_category: "diagnostic"
  
  - platform: copy # Reports the WiFi signal strength in %
    source_id: wifi_signal_db
    name: "WiFi Signal Percent"
    filters:
      - lambda: return min(max(2 * (x + 100.0), 0.0), 100.0);
    unit_of_measurement: "Signal %"
    entity_category: "diagnostic"
    device_class: ""  

  - platform: template
    name: "Wi-Fi Disconnects (since boot)"
    id: wifi_disconnects_since_boot_sensor
    entity_category: diagnostic
    accuracy_decimals: 0
    update_interval: 30s
    lambda: |-
        return id(_wifi_disconnects_since_boot);

Thanks man, it works well! Do you think it would work to use HA's voice mode on this device by adding an I2S mic like the M58625? I tried to develop this part but unfortunately it didn't work. I think you're better at this, so if you get it working, don't hesitate to share the code with us. Do you think we need to use openWakeWord for wake word detection, or a wake word running on the ESP board? Thanks in advance, you're the best!