If that module doesn't work, maybe this is worth a try.
I'm using an ESPHome-adapted version of this to recognize when someone presses the doorbell intercom:
https://iotassistant.io/esp32/smart-door-bell-noise-meter-using-fft-esp32/
It's a bit more coding involved, but if you want to try it I can clean up my code a bit. This is what I can pick up when I play:
https://www.youtube.com/watch?v=2vMBLxaKYX0
Each value below is the dominant sound frequency, sampled at a 22627 Hz sample rate. The numbers are FFT bucket numbers; to get the real frequency, multiply by 22 (so bucket 84 is about 1848 Hz).
[08:48:53][E][custom:428]: peaks 9,7,7,12,9,9,8,9,13,9
[08:48:53][E][custom:428]: peaks 9,9,8,8,7,5,8,3,14,8
[08:48:53][E][custom:428]: peaks 8,7,8,13,7,10,9,9,11,12
[08:48:53][E][custom:428]: peaks 13,12,13,17,10,6,8,9,10,12
[08:49:06][E][custom:428]: peaks 100,99,99,99,99,99,99,10,10,10
[08:49:06][E][custom:428]: peaks 10,10,11,10,10,11,11,9,13,125
[08:49:06][E][custom:428]: peaks 125,125,125,113,10,112,112,112,100,100
[08:49:06][E][custom:428]: peaks 99,99,99,99,99,99,10,10,10,6
[08:49:06][E][custom:428]: peaks 9,9,83,84,84,84,84,84,84,84
[08:49:06][E][custom:428]: peaks 84,84,84,84,84,84,84,88,88,88
[08:49:08][E][custom:428]: peaks 88,88,99,99,10,10,9,112,11,10
[08:49:08][E][custom:428]: peaks 77,75,75,75,75,83,84,84,84,84
[08:49:08][E][custom:428]: peaks 83,88,84,88,88,83,84,84,84,84
[08:49:08][E][custom:428]: peaks 84,84,84,84,84,84,84,84,84,83
[08:49:08][E][custom:428]: peaks 84,99,99,99,99,99,99,10,10,10
[08:49:08][E][custom:428]: peaks 10,11,10,99,98,99,99,99,99,99
[08:49:11][E][custom:428]: peaks 100,12,10,12,11,10,9,10,9,9
[08:49:11][E][custom:428]: peaks 9,10,125,125,125,125,125,10,112,112
[08:49:11][E][custom:428]: peaks 10,112,10,99,99,99,99,99,99,99
[08:49:11][E][custom:428]: peaks 10,11,10,11,5,10,10,10,11,10
[08:49:11][E][custom:428]: peaks 4,10,134,10,10,10,3,9,11,10
[08:49:11][E][custom:428]: peaks 9,11,10,10,11,150,150,150,150,10
[08:49:14][E][custom:428]: peaks 125,125,125,125,125,11,10,112,11,112
[08:49:14][E][custom:428]: peaks 125,125,125,125,125,11,11,11,10,11
[08:49:14][E][custom:428]: peaks 11,10,134,11,10,10,10,10,9,11
[08:49:14][E][custom:428]: peaks 10,11,10,10,10,10,10,10,10,9
[08:49:14][E][custom:428]: peaks 9,10,10,11,11,10,10,9,10,10
It's not too hard to write some code that can recognize patterns in that.
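
To give an idea of what I mean, here is a rough Python sketch that works offline on the log text, not the code running on my ESP32. It pulls the bucket numbers out of the "peaks" lines, groups repeated high buckets into tones and converts them to Hz with the factor of 22 from above. The function names and the thresholds (min_bucket, tolerance, min_run) are just assumptions I picked by looking at the log, so you would have to tune them for your own sound.

```python
import re
from collections import Counter

HZ_PER_BUCKET = 22  # from the post: bucket number * 22 gives the frequency
PEAKS_RE = re.compile(r"peaks\s+([\d,]+)")


def parse_peaks(log_text):
    """Return every bucket number from the 'peaks a,b,c' log lines, in order."""
    buckets = []
    for match in PEAKS_RE.finditer(log_text):
        buckets.extend(int(v) for v in match.group(1).split(",") if v)
    return buckets


def find_tones(buckets, min_bucket=50, tolerance=1, min_run=5):
    """Group consecutive peaks into tones.

    A tone is a run of at least `min_run` peaks that are all >= `min_bucket`
    (assumed to separate real tones from the background around bucket 10)
    and within `tolerance` buckets of the first peak of the run.
    Returns a list of (approx_hz, run_length) tuples.
    """
    tones = []
    run = []
    for b in list(buckets) + [0]:  # trailing 0 flushes the last run
        if b >= min_bucket and (not run or abs(b - run[0]) <= tolerance):
            run.append(b)
            continue
        if len(run) >= min_run:
            # report the most common bucket of the run as the tone
            bucket = Counter(run).most_common(1)[0][0]
            tones.append((bucket * HZ_PER_BUCKET, len(run)))
        run = [b] if b >= min_bucket else []
    return tones


if __name__ == "__main__":
    sample = (
        "[08:49:06][E][custom:428]: peaks 9,9,83,84,84,84,84,84,84,84\n"
        "[08:49:06][E][custom:428]: peaks 84,84,84,84,84,84,84,88,88,88\n"
    )
    print(find_tones(parse_peaks(sample)))  # [(1848, 15)]
```

From there, recognizing a melody would come down to comparing the sequence of tones against the sequence you expect, allowing a bucket or two of slack.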