M5Stack LLM-8850 8GB M.2 Axera AX8850 24 TOPS AI accelerator

M5Stack LLM-8850 is a new M.2 M-Key (2242 NGFF) AI accelerator card based on the Axera AX8850 SoC, whose NPU is rated at 24 TOPS @ INT8:

The LLM-8850 M.2 AI accelerator module is a less expensive alternative that competes with the Hailo-8-based Raspberry Pi AI Kit and the Waveshare Hailo-8 M.2 AI Accelerator Module. Like those, it is designed primarily for “Edge AI” on single-board computers such as the Raspberry Pi 5 / CM5, Rockchip RK3588 SBCs (for example the NanoPC-T6, NanoPi R6C, Orange Pi 5 Plus, and Banana Pi BPI-M7), and small-form-factor x86-64 mini-PCs (like the Intel NUC series) with a spare M.2 Key-M socket.

On paper, the Axera AX8850 SoC's NPU is, however, far more capable at general AI acceleration / generative AI, and thus more flexible than the Hailo-8 M.2 cards, which in turn are more or less optimized for computer vision applications.

Wondering if we could get support for this by default in the official Home Assistant Operating System (Linux distribution)?

The M5Stack LLM-8850 card uses PCIe 2.0 ×2 lanes and has 8 GB of onboard RAM (64-bit LPDDR4x @ 4266 Mbps); its NPU is capable of up to 24 TOPS @ INT8 (based on the Axera AX8850 SoC, an octa-core Cortex-A55 @ 1.7 GHz). It also supports H.265/H.264 video encoding up to 8Kp30 and decoding up to 8Kp60, with up to 16 channels of 1080p video, so besides real-time AI object detection (similar to the Google Coral M.2 and Mini PCIe Edge TPU coprocessors), another use case would be on-the-fly video transcoding acceleration for Frigate NVR. The main downside (other than it only having 8 GB of memory) is that the LLM-8850 card runs hotter and needs a fan for active cooling: at maximum load it draws 7 W @ 3.3 V and runs at 70 °C (so it would probably get way too hot to run inside the Home Assistant Yellow enclosure).
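As a back-of-the-envelope sanity check (my own arithmetic, not vendor figures beyond the specs quoted above), the decode numbers are internally consistent: 16 channels of 1080p60 have exactly the same pixel rate as a single 8Kp60 stream, and 7 W at 3.3 V implies roughly 2.1 A drawn from the M.2 slot's 3.3 V rail. Note that I'm assuming the "16 channels" figure refers to 1080p60, which the spec excerpt doesn't state explicitly:

```python
# Back-of-the-envelope checks on the LLM-8850 spec-sheet numbers.
# Assumption: the "up to 16 channels for 1080p" decode figure means 1080p60.

PX_8K = 7680 * 4320      # pixels per 8K UHD frame
PX_1080P = 1920 * 1080   # pixels per 1080p frame

rate_8kp60 = PX_8K * 60              # one 8Kp60 stream, pixels/second
rate_16x1080p60 = 16 * PX_1080P * 60  # sixteen 1080p60 streams

print(rate_8kp60, rate_16x1080p60)    # both 1,990,656,000 px/s
print(rate_8kp60 == rate_16x1080p60)  # True: the two figures line up exactly

# Power draw: 7 W at 3.3 V works out to about 2.1 A of slot current.
current_a = 7 / 3.3
print(round(current_a, 2))  # 2.12
```

This is just arithmetic on the published specs, but it does suggest the decoder is one shared pixel-rate budget rather than 16 independent pipelines.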

M5Stack sells the Axera AX8850 M.2 module for US$99 in its own online store (and in its official AliExpress store):

PS: M5Stack is a subsidiary of Espressif (of ESP32/ESP8266 fame):

FYI, an Axera representative has started a Frigate NVR discussion here:

Fork of Frigate with initial Axera support:

The AXERA-TECH rep also mentioned these Hugging Face models:

AI-accelerator hardware such as this could be useful for many things in HA OS, such as STT (Whisper), TTS (Piper), object recognition, and local LLMs.

Ya, I’m excited to see what could happen here. Imagine Frigate with a hardware video encoder, along with a couple of AI models running in parallel (vision, chat). That would be awesome for local, low-power AI.

FYI, also see the “Radxa AICore AX-M1” (M72T), an M.2 2280 M-Key form-factor AI acceleration module based on the same Axera AX8850 SoC:

Radxa has not yet posted anything about pricing or availability.

Radxa officially lists their ROCK 5A, 5B, 5B+, and ROCK 5 ITX boards as tested.

RapidAnalysis posted a demo video of a preview unit testing the DeepSeek-R1-Qwen-7B and SmolLM2-360M-Instruct large language models:

Radxa publishes a support listing covering large language models, vision models (small and large), speech models, and text-to-image generation models:

Large Language Models:

  • DeepSeek-R1-Distill
  • Qwen2.5
  • Qwen3
  • MiniCPM4
  • SmolLM3
  • Llama3.2
  • Gemma2
  • Phi3

Vision Large Models:

  • InternVL3
  • Qwen2.5-VL
  • SmolVLM2
  • CLIP
  • YOLOWorldv2

Speech Models:

  • Whisper
  • SenseVoice
  • MeloTTS

Text-to-image Generation Model:

  • Stable Diffusion v1.5

Vision Model:

  • YOLO Family
  • Depth-Anything-V2
  • Real-ESRGAN
  • MixFormerV2
  • LivePortrait