Jetson Orin Nano 8GB + Wyoming + Speech to text

Has anyone had any luck getting speech to text working on a Jetson Orin Nano?
So far the closest I have found is this:

But for the life of me I can't get it to run without errors in Docker on my Jetson Orin Nano Super with JetPack 6 installed.

What errors?

I’ve hit the errors that the author describes in his GitHub README:

“NOTE: ARM64 dGPU and iGPU containers may take a while to start on first launch after installation or updates. I do not have ARM64 or Jetson devices so several packages such as torch and torch2trt fail to install properly because CUDA is not detected when using QEMU/buildx. If you know how to get around this please reach out to me.”

I’ve basically been “vibe debugging” with ChatGPT to try to achieve a Docker container that actually works on the Jetson Orin Nano… but haven’t quite managed to get it to work.
I have just been feeding error messages into ChatGPT and getting it to provide me with a new Dockerfile iteratively… I am way out of my element here…


OMG, after hours of going back and forth with ChatGPT… it has created a Dockerfile that runs.

################################################################################
# 🐋 WYOMING WHISPER TRT – Jetson Dockerfile
#
# 🎯 GOALS:
#   1. Provide a self-contained Whisper + TensorRT service for Jetson (JetPack 6.2)
#   2. Use NVIDIA’s official CUDA/cuDNN/PyTorch wheels (NO compiling PyTorch from source)
#   3. Fail FAST if CUDA/cuDNN aren’t working, so we don’t debug after 20min builds
#   4. Patch + build torch2trt safely (no broken TRT imports, no missing CUDA)
#   5. Keep comments & rationale so future AI or humans understand WHY choices were made
#
# 📚 LESSONS LEARNED:
#   ✅ JetPack 6.2 ships CUDA 12.6 + cuDNN 9.3 + TensorRT 10.3
#   ✅ NVIDIA’s L4T PyTorch wheels must match JetPack minor version (v61 for JP 6.1/6.2)
#   ✅ “Torch not compiled with CUDA” errors came from wrong wheels — fixed by using NVIDIA’s
#   ✅ torch2trt setup.py imports TRT too early; we patch this out before compiling
#   ✅ cuDNN 9 supersedes cuDNN 8 – no need to symlink fake libcudnn.so.8 anymore
#   ✅ JetPack base image did NOT ship cuSPARSELt at all — we now install it via NVIDIA’s apt repo
#   ✅ `torch.cuda.is_available()` will be False during docker build (no GPU driver) — so we skip it there
#   ✅ torch2trt needed the `packaging` Python module — added
#   ✅ NumPy 2.x broke ABI for PyTorch/torch2trt — now pinned to <2.0
#
# 🏁 End goal: A container that runs `wyoming-whisper-trt` with CUDA acceleration OOTB.
################################################################################

FROM nvcr.io/nvidia/l4t-jetpack:r36.4.0

################################################################################
# 1️⃣ BASE SYSTEM SETUP
################################################################################
RUN apt-get update && apt-get install -y --no-install-recommends \
    # 🔧 Basic dev/build tools
    git wget curl python3 python3-pip python3-dev python3-venv build-essential \
    # 🔊 Audio + math libs (Whisper deps)
    libopenblas-dev liblapack-dev libsndfile1 ffmpeg \
    # 🤖 TensorRT dev libs (needed for torch2trt)
    libnvinfer-dev libnvinfer-plugin-dev nvidia-cuda-toolkit \
    # 🔑 Needed for adding NVIDIA apt repo
    gnupg2 \
    && rm -rf /var/lib/apt/lists/*

# ✅ Make sure CUDA path is consistent (some scripts expect /usr/local/cuda)
RUN ln -sf /usr/local/cuda-12.6 /usr/local/cuda

# ✅ Set up CUDA env vars for all future stages
ENV CUDA_HOME=/usr/local/cuda
ENV PATH=$CUDA_HOME/bin:$PATH
ENV LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu:${LD_LIBRARY_PATH}

# ✅ Upgrade Python packaging tools early
# ⚠️ Pin numpy<2 to avoid ABI breakages with PyTorch/torch2trt/Whisper
RUN pip3 install --upgrade pip setuptools wheel "numpy<2" packaging

################################################################################
# 2️⃣ INSTALL cuSPARSELt (runtime + dev headers)
################################################################################
# 📌 Rationale: JetPack base image doesn’t ship cuSPARSELt at all. PyTorch/TensorRT require it.
RUN echo "🔑 Adding NVIDIA CUDA apt repo..." && \
    curl -fsSL https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/sbsa/3bf863cc.pub | apt-key add - && \
    echo "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/sbsa/ /" > /etc/apt/sources.list.d/cuda-sbsa.list && \
    apt-get update && \
    apt-get install -y --no-install-recommends \
    libcusparselt0 \
    libcusparselt-dev && \
    rm -rf /var/lib/apt/lists/*

# ✅ Verify cuSPARSELt actually installed
RUN test -f /usr/lib/aarch64-linux-gnu/libcusparseLt.so.0 || \
    (echo "❌ cuSPARSELt runtime missing!" && exit 1)

################################################################################
# 3️⃣ PYTORCH (CUDA ENABLED)
################################################################################
# 🎯 Goal: Use NVIDIA’s official prebuilt PyTorch wheel for JetPack (CUDA 12.6/cuDNN 9.3)
# 📌 Rationale: Avoid compiling PyTorch from source — too heavy for Jetson.
WORKDIR /tmp
RUN wget -nv https://developer.download.nvidia.com/compute/redist/jp/v61/pytorch/torch-2.5.0a0+872d972e41.nv24.08.17622132-cp310-cp310-linux_aarch64.whl && \
    pip3 install --no-cache-dir torch-2.5.0a0+872d972e41*.whl

# 🚨 FAIL FAST: verify Torch *build* has CUDA (but don’t require GPU drivers in build)
RUN python3 - <<EOF
import torch
print("🔥 Torch version:", torch.__version__)
print("🔥 Reported CUDA version:", torch.version.cuda)
print("🔥 cuDNN version:", torch.backends.cudnn.version())
assert torch.version.cuda is not None, "❌ Torch was not compiled with CUDA support!"
assert torch.backends.cudnn.version() >= 9000, f"❌ Expected cuDNN ≥9, got {torch.backends.cudnn.version()}"
EOF

################################################################################
# 4️⃣ TORCH2TRT (TensorRT acceleration for Whisper)
################################################################################
WORKDIR /usr/src
RUN git clone https://github.com/NVIDIA-AI-IOT/torch2trt.git
WORKDIR /usr/src/torch2trt

# 🩹 Patch: torch2trt setup.py imports TensorRT too early — break install if TRT not present.
RUN sed -i 's/^import tensorrt/# import tensorrt/' setup.py && \
    sed -i 's/version.parse(tensorrt.__version__)/version.parse("8")/' setup.py

# 🛡 Temporarily comment CUDAExtension block so first install doesn’t break
RUN perl -pi -e 'if (/plugins_ext_module = CUDAExtension\(/../^\s*\)/) { s/^/#/ }' setup.py

# ✅ First pass: skeleton install
RUN python3 setup.py install

# 🔨 Second pass: actually build CUDA plugins
RUN CUDA_HOME=/usr/local/cuda PATH=$CUDA_HOME/bin:$PATH python3 setup.py build_ext --inplace && \
    python3 setup.py install

################################################################################
# 5️⃣ WHISPER + WYOMING WHISPER TRT
################################################################################
WORKDIR /usr/src
RUN git clone https://github.com/openai/whisper.git && \
    git clone https://github.com/Jonah-May-OSS/wyoming-whisper-trt.git

WORKDIR /usr/src/whisper
# (install from the local clone — no need to pip-install the git+ URL a second time)
RUN pip3 install --no-cache-dir .

WORKDIR /usr/src/wyoming-whisper-trt
# 🩹 Remove torch/tensorrt from requirements — we already installed them
RUN sed -i '/tensorrt/d;/torch/d' requirements.txt && \
    pip3 install -r requirements.txt
# 🩹 Remove install_requires from setup.py to avoid dependency conflicts
RUN sed -i '/install_requires/d' setup.py && \
    pip3 install .

################################################################################
# 6️⃣ RUNTIME CONFIG
################################################################################
WORKDIR /usr/src/wyoming-whisper-trt
ENV PYTHONPATH=/usr/src/wyoming-whisper-trt:${PYTHONPATH}
EXPOSE 10300

# ✅ Runtime check: will actually assert CUDA is usable *once the container runs on Jetson*
HEALTHCHECK --interval=1m --timeout=5s --retries=3 CMD python3 -c "import torch; assert torch.cuda.is_available()" || exit 1

# 🚀 Default entrypoint: start Wyoming Whisper TRT server
CMD ["python3", "-m", "wyoming_whisper_trt", "--uri", "tcp://0.0.0.0:10300"]
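For reference, the compose file below expects the image to be tagged `wyoming-whisper-trt:jetson`, so the build step would look something like this (a sketch — it assumes the Dockerfile above is saved as `Dockerfile` in the current directory, and that you build natively on the Jetson rather than cross-building with QEMU/buildx, given the README caveat quoted earlier):

```shell
# Build natively on the Jetson so the TensorRT/CUDA libraries are present during the build
docker build -t wyoming-whisper-trt:jetson .
```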

Here is the Portainer Docker Compose file to go with it:

version: "3.8"

services:
  whisper-trt:
    image: wyoming-whisper-trt:jetson     # 👈 Uses the local image we built
    container_name: whisper-trt
    ports:
      - "10300:10300"                     # Expose Wyoming protocol port
    command: >
      python3 -m wyoming_whisper_trt
      --model base.en
      --uri tcp://0.0.0.0:10300
      --data-dir /data
      --device cuda
      --compute-type float16
    volumes:
      - whisper_data:/data                # Persistent model cache & data
    runtime: nvidia                       # 👈 Tells Docker to use Jetson GPU
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped               # Auto-restart if it crashes

volumes:
  whisper_data:

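With both files in place, a quick way to deploy the stack and confirm the container actually sees the GPU at runtime (a sketch — `whisper-trt` is the container name from the compose file above):

```shell
# Deploy the stack from the directory containing the compose file
docker compose up -d

# torch.cuda.is_available() should print True once the container is running on the Jetson;
# it is expected to be False only during the docker build itself (no GPU driver there)
docker exec whisper-trt python3 -c "import torch; print(torch.cuda.is_available())"
```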
Seems to transcribe 15s of audio in about 1 second.

Would you be able to put this on a registry? I looked through your code, and it seems pretty good to me. The only thing I see from a security standpoint is that it would be running as root, which is not best practice. I tested it on my system and it does work quite well!
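On the running-as-root point, a minimal sketch of one way to fix it — appended near the end of the Dockerfile, before the CMD. The `appuser` name and UID are arbitrary choices for illustration, not from the original Dockerfile:

```dockerfile
# Create an unprivileged user and hand over the app directory
RUN useradd -m -u 1000 appuser && \
    chown -R appuser:appuser /usr/src/wyoming-whisper-trt
USER appuser
```

Note that the `/data` volume mounted by the compose file would also need to be writable by that user (e.g. chown it in an entrypoint, or pre-create it with matching ownership), otherwise the model cache writes will fail.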

I would like this as well.