ESP32-S3 local wake word detection

From my understanding, the S3-Box has local wake word detection, so it's not constantly streaming to the server. My question is: if you get just a generic ESP32-S3, can you set it up that way yourself? I’m guessing it's the firmware downloaded to the S3-Box that really sets it up that way?