Voice over IP Integration - Call from Any SIP Softphone

mn_box · January 16, 2025, 12:08am

Oof, not that advanced over here. How can I do that? SSH into HA and then run the tcpdump command with what parameters?

jaminh · January 16, 2025, 1:41am

Yeah, sorry, if you are up for learning I can try to describe how I would do it. Start by connecting to HA with ssh, then find out what your network interface is called with the ifconfig command. There might be a few listed so you want the one associated with the IP address you use to connect to HA. Once you have that I would run tcpdump with something like:

tcpdump -i <iface name> 'udp port 5060'

You might want to write that out to a file by adding ‘-w <file>’. You can then read it back with ‘tcpdump -r <file> -A’.
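For anyone who wants the full command lines spelled out, here is a minimal Python sketch that assembles the capture and read-back commands described above. The interface name `eth0` and the file name `sip.pcap` are placeholders I made up for illustration, so substitute your own. The script only prints the commands; it does not run tcpdump.

```python
import shlex
from typing import List


def capture_cmd(iface: str, pcap_path: str, port: int = 5060) -> List[str]:
    """Build a tcpdump command that writes UDP traffic on `port` to a file."""
    return ["tcpdump", "-i", iface, "-w", pcap_path, f"udp port {port}"]


def read_cmd(pcap_path: str) -> List[str]:
    """Build a tcpdump command that reads a capture file and prints packets as ASCII."""
    return ["tcpdump", "-r", pcap_path, "-A"]


if __name__ == "__main__":
    # "eth0" and "sip.pcap" are hypothetical values, not taken from the thread.
    print(shlex.join(capture_cmd("eth0", "sip.pcap")))
    print(shlex.join(read_cmd("sip.pcap")))
```

Copy the printed lines into the SSH session, or pass the lists to `subprocess.run` if you would rather drive the capture from a script.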

mn_box · January 16, 2025, 9:02am

Great, always keen to learn.

See below the output from the network interface when I call the VOIP integration. Basically it rings but it is not being picked up.

There is nothing else in this log

jaminh · January 16, 2025, 12:13pm

Sorry, I think you will need to add -A to that command to get it to print out the packet contents.

thedanno · January 16, 2025, 9:51pm

When I use the app the successful log info is:

Speech-to-text (2.22s)
Engine: stt.faster_whisper
Language: en
Output: What is the outside humidity?

Natural Language Processing (0.06s)
Engine: conversation.home_assistant
Language: en
Input: What is the outside humidity?
Response type: query_answer
Prefer handling locally: true
Processed locally: true

I see the error “speech-to-text failed” when running with the HT801V2 and the log is below. I have the codec (all entries) set to OPUS and followed the tutorial for setting up the rest, though the info for setting up the GrandStream device is rather light in the tutorial.

stage: done
run:
  pipeline: 01jgm6dzfctr1t7ngkfnwymrv4
  language: en
events:
  - type: run-start
    data:
      pipeline: 01jgm6dzfctr1t7ngkfnwymrv4
      language: en
    timestamp: "2025-01-16T21:38:37.193354+00:00"
  - type: stt-start
    data:
      engine: stt.faster_whisper
      metadata:
        language: en
        format: wav
        codec: pcm
        bit_rate: 16
        sample_rate: 16000
        channel: 1
    timestamp: "2025-01-16T21:38:37.193614+00:00"
  - type: error
    data:
      code: stt-stream-failed
      message: speech-to-text failed
    timestamp: "2025-01-16T21:38:42.552141+00:00"
  - type: run-end
    data: null
    timestamp: "2025-01-16T21:38:42.552638+00:00"
stt:
  engine: stt.faster_whisper
  metadata:
    language: en
    format: wav
    codec: pcm
    bit_rate: 16
    sample_rate: 16000
    channel: 1
  done: false
  error:
    code: stt-stream-failed
    message: speech-to-text failed

mn_box · January 16, 2025, 10:01pm

Indeed, see below the output (which then repeats)

Any ideas?

jaminh · January 17, 2025, 1:49pm

That is interesting, it looks like in the From header there is no space between the description and the actual URI. I’ll have to take a closer look at the SIP spec, I think the space is required.

jaminh · January 17, 2025, 2:09pm

I stand corrected, it appears in https://www.ietf.org/rfc/rfc3261.txt section 20.10 Contact that

There may or may not be LWS between the display-name and the “<”.

Assuming LWS means linear white space. Looks like we need to update the header parsing to account for that.
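To illustrate the parsing rule, here is a rough Python sketch of reading a From/Contact style value when the whitespace between the display-name and the “<” is optional. This is not the voip-utils code; `NameAddr` and `parse_name_addr` are names I made up for the example, and it ignores escaped quotes inside the display-name.

```python
import re
from typing import NamedTuple, Optional


class NameAddr(NamedTuple):
    display_name: Optional[str]
    uri: str


# Optional display-name (quoted or bare), then optional whitespace, then <URI>.
_NAME_ADDR = re.compile(
    r'^\s*(?:"(?P<quoted>[^"]*)"|(?P<bare>[^<"]*?))\s*<(?P<uri>[^>]+)>'
)


def parse_name_addr(value: str) -> NameAddr:
    """Split a header value into display-name and URI (hypothetical helper)."""
    match = _NAME_ADDR.match(value)
    if match:
        quoted = match.group("quoted")
        name = quoted if quoted is not None else match.group("bare").strip()
        return NameAddr(name or None, match.group("uri"))
    # No angle brackets: treat everything before the first ';' as the URI.
    return NameAddr(None, value.split(";", 1)[0].strip())


if __name__ == "__main__":
    # Example addresses are invented for illustration.
    print(parse_name_addr('"Alice"<sip:alice@example.com>;tag=1'))
    print(parse_name_addr('"Alice" <sip:alice@example.com>;tag=1'))
```

Both calls give the same display-name and URI, which is the behaviour the spec text asks for: the space is allowed but not required.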

jaminh · January 17, 2025, 2:12pm

As you may have noted it appears your STT processing is taking longer than 2 seconds, which I suspect is the cause of the problem.

mn_box · January 17, 2025, 3:50pm

Great to hear you might have found the issue. I am sure other 3cx users will be happy to get this resolved. Let me know if you want more details on my set-up to further help resolve.

jaminh · January 18, 2025, 4:33pm

It was a pretty easy fix. If you want to watch for if/when it gets merged, see “Fix SIP header parsing” by jaminh, Pull Request #26 in home-assistant-libs/voip-utils on GitHub.
