Snips speaks too fast!


#1

My setup is Hass.io on a Raspberry Pi 3 Model B with the Snips and MQTT add-ons. The Snips add-on log (see below) shows that Snips does respond to my voice command “turn on the main light”, although it recognizes it as “turn on the menu” with no slot filled. So it asks “Which light mode do you need?” to obtain the missing slot.

The problem is that the sentence is played back at several times its original sampling rate (16 kHz, as seen in the log). It is simply too fast to understand. I tried ALSA’s speaker-test utility and found that a 48 kHz WAV plays perfectly. Does anyone have an idea what might cause this issue?

[01:31:27.406930] INFO :snips_hotword_hermes: Hotword detected
[01:31:27.437724] INFO :snips_dialogue::dialogue: State: Idle, incoming Message: Hotword(Detected)
[01:31:27.437930] INFO :snips_dialogue::services: publish Hotword(Wait)
[01:31:27.438021] INFO :snips_dialogue::services: publish Asr(ToggleOn)
[01:31:27.438107] INFO :snips_dialogue::services: publish AudioServer(PlayFile)
[01:31:27.438183] INFO :snips_dialogue::dialogue: Current State: WaitingQuery
[01:31:27.488473] INFO :snips_asr_hermes : Listening
[01:31:27.488602] INFO :audio_server_hermes : Playing “/usr/share/snips/dialogue/sound/start_of_input.wav” using output “default”, wav spec : WavSpec { channels: 2, sample_rate: 22050, bits_per_sample: 16, sample_format: Int }
[01:31:32.727678] INFO :snips_asr_lib::asr: Endpoint detection.
[01:31:33.050403] INFO :snips_asr_hermes : Cleanup
[01:31:33.050534] INFO :snips_asr_hermes : Idle
[01:31:33.085769] INFO :snips_dialogue::dialogue: State: WaitingQuery, incoming Message: Asr(TextCaptured)
[01:31:33.085897] INFO :snips_dialogue::services: publish Asr(ToggleOff)
[01:31:33.085951] INFO :snips_dialogue::services: publish AudioServer(PlayFile)
[01:31:33.088050] INFO :snips_dialogue::services: publish Nlu(Query)
[01:31:33.088218] INFO :snips_dialogue::dialogue: Current State: WaitingIntent
[01:31:33.176483] INFO :audio_server_hermes : Playing “/usr/share/snips/dialogue/sound/end_of_input.wav” using output “default”, wav spec : WavSpec { channels: 2, sample_rate: 22050, bits_per_sample: 16, sample_format: Int }
[01:31:33.177359] INFO :snips_analytics_hermes: Cleanup
[01:31:33.177651] INFO :snips_analytics_hermes: Idle
[01:31:33.195957] INFO :queries_hermes : Cleanup
[01:31:33.196096] INFO :queries_hermes : Idle
[01:31:33.226003] INFO :snips_dialogue::dialogue: State: WaitingIntent, incoming Message: Nlu(IntentParsed)
[01:31:33.226245] INFO :snips_dialogue::services: publish Tts(Say)
[01:31:33.226329] INFO :snips_dialogue::dialogue: Current State: WaitingEndSpeaking(WaitingAnswer(“Which light mode do you need?”, PartialIntent(IntentMessage { input: “turn on the menu”, intent: IntentClassifierResult { intent_name: “user_3nvyne7w2__ActivateLightMode”, probability: 0.745324 }, slots: Some([]) }), “LightingMode”))
[01:31:33.362275] INFO :snips_analytics_hermes: Cleanup
[01:31:33.362442] INFO :snips_analytics_hermes: Idle
[01:31:33.486921] INFO :audio_server_hermes : Playing “tts-5” using output “default”, wav spec : WavSpec { channels: 1, sample_rate: 16000, bits_per_sample: 16, sample_format: Int }
[01:31:35.377096] INFO :snips_dialogue::dialogue: State: WaitingEndSpeaking(WaitingAnswer(“Which light mode do you need?”, PartialIntent(IntentMessage { input: “turn on the menu”, intent: IntentClassifierResult { intent_name: “user_3nvyne7w2__ActivateLightMode”, probability: 0.745324 }, slots: Some([]) }), “LightingMode”)), incoming Message: Tts(SayFinished)
[01:31:35.377364] INFO :snips_dialogue::services: publish Hotword(Wait)
[01:31:35.377511] INFO :snips_dialogue::services: publish Asr(ToggleOn)
[01:31:35.377591] INFO :snips_dialogue::services: publish AudioServer(PlayFile)
[01:31:35.377663] INFO :snips_dialogue::dialogue: Current State: WaitingAnswer(“Which light mode do you need?”, PartialIntent(IntentMessage { input: “turn on the menu”, intent: IntentClassifierResult { intent_name: “user_3nvyne7w2__ActivateLightMode”, probability: 0.745324 }, slots: Some([]) }), “LightingMode”)
[01:31:35.427168] INFO :snips_asr_hermes : Listening


#2

I have extracted the TTS WAV file from the MQTT messages. It has a 16000 Hz sampling rate. It plays normally in any player on a PC but still fails in Snips. I then converted it to 22050 Hz, the same sampling rate as start_of_input.wav (the Snips wake-up sound). The converted file plays fine with “aplay” inside the Snips container! So I think the problem lies in the ALSA audio settings in Hass.io. I am still trying to figure out how to configure ALSA correctly for a USB speakerphone (Jabra 510)…
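
For anyone who wants to repeat this check, here is a minimal Python sketch (standard library only) that reads the header of a WAV file and works out how much faster it would sound if a device clocked the samples out at its own rate without resampling. The function names are hypothetical, and the 48000 Hz device rate is an assumption borrowed from the configs posted later in this thread. This is only arithmetic to compare against what you hear, not a diagnosis of what Snips actually does.

```python
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Read the WAV header fields that the audio device has to cope with."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / rate,
        }


def speedup_if_unconverted(file_rate: int, device_rate: int) -> float:
    """Speed factor heard if samples are played at device_rate with no resampling."""
    return device_rate / file_rate


def make_silent_wav(rate: int, channels: int, seconds: float) -> bytes:
    """Build a silent 16-bit WAV in memory (stand-in for the extracted TTS file)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * int(rate * seconds))
    return buf.getvalue()


if __name__ == "__main__":
    # Same spec as the "tts-5" entry in the log: mono, 16000 Hz, 16 bit.
    tts_like = make_silent_wav(rate=16000, channels=1, seconds=1.0)
    info = describe_wav(tts_like)
    print(info)
    # Assumed device rate of 48000 Hz (taken from the configs in this thread).
    print(speedup_if_unconverted(info["sample_rate"], 48000))
```

To inspect a real file, pass its bytes instead, for example `describe_wav(open("tts.wav", "rb").read())`, where `tts.wav` is whatever name you saved the extracted audio under.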


#3

I had the same issue.

Here is the fix from the developers:

pcm.!default {
  type asym
  playback.pcm {
    type plug
    slave {
      pcm "hw:1,0"
      rate 48000
      format "S16_LE"
      channels 2
    }
    rate_converter "samplerate_medium"
  }
  capture.pcm {
    type plug
    slave.pcm "hw:1,0"
  }
}

"I’ve done some tests with a Jabra 510 and I do indeed have the same problems you have. Here is an ALSA conf that should fix them. Can you try it? (The rate, format and channels should match the ones shown by cat /proc/asound/card1/stream0.)"

Also try joining https://snipslabs.slack.com. You will get more help there.
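
To make the "should match stream0" step concrete, here is a rough Python sketch that pulls the playback format, channel count and rates out of that file's text. The sample text below is illustrative only: it follows the general layout of USB audio stream files, and its values are made up to agree with the config above, not copied from a real Jabra. The function name is hypothetical, and the parser assumes a plain comma-separated Rates line; a device that reports a continuous range would need extra handling.

```python
# Illustrative stand-in for the output of: cat /proc/asound/card1/stream0
# (values are assumptions chosen to match the config in this post)
SAMPLE_STREAM0 = """\
Playback:
  Status: Stop
  Interface 1
    Altset 1
    Format: S16_LE
    Channels: 2
    Rates: 48000
Capture:
  Status: Stop
  Interface 2
    Altset 1
    Format: S16_LE
    Channels: 1
    Rates: 16000
"""


def playback_params(text: str) -> dict:
    """Collect Format / Channels / Rates from the Playback section only."""
    params = {}
    section = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in ("Playback:", "Capture:"):
            section = stripped[:-1]
            continue
        if section != "Playback" or ":" not in stripped:
            continue
        key, value = (part.strip() for part in stripped.split(":", 1))
        if key == "Format":
            params["format"] = value
        elif key == "Channels":
            params["channels"] = int(value)
        elif key == "Rates":
            # Assumes a comma-separated list such as "44100, 48000".
            params["rates"] = [int(rate) for rate in value.split(",")]
    return params


if __name__ == "__main__":
    print(playback_params(SAMPLE_STREAM0))
```

The three values it returns are the ones to copy into the `format`, `channels` and `rate` lines of the `slave` block. If the device lists several alternate settings, this sketch keeps only the last one it sees, so check the file by eye as well.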


#4

Following the example, aplay cannot play any WAV file because it cannot find /usr/lib/arm-linux-gnueabihf/alsa-lib/libasound_module_rate_▒٩.so. I don’t know why the file name contains strange characters. I do find libasound_module_rate_medium.so. I guess there may be a problem with the base image that comes with snips_addon. I removed the rate_converter line and tried again. This time aplay plays both 16000 Hz and 22050 Hz WAV files correctly!

However, with the new .asoundrc, Snips TTS still has the same issue (it plays the 16000 Hz WAV file too fast). Maybe it plays WAV files some other way… still trying to find a solution…


#5

I am using the Debian packages, not the Snips Docker image. In any case, this works for me. I had to comment out the rate converter, but with this config TTS, ALSA and direct WAVs all play properly.

pcm.!default {
  type asym
  playback.pcm {
    type plug
    slave {
      pcm "hw:1,0"
      rate 48000
      format "S16_LE"
      channels 2
    }
#    rate_converter "samplerate_medium"
  }
  capture.pcm {
    type plug
    slave.pcm "hw:1,0"
  }
}

ctl.!default {
  type hw
  card 1
}

#6

(editing post)

The .asoundrc / asound.conf in the link below solved my Jabra 410 turbo talk issue.


#7

Hello,
Sorry to reopen this very old topic, but I’ve spent several hours trying to resolve this issue with my Jabra 410 speaking too fast with Snips, and I’m going round in circles :frowning:

Despite all my searching I haven’t been able to find a solution that works for me (I even tried on 2 different devices, reinstalled from scratch, etc.).

On GitHub I’ve seen a comment from tschmidty69 saying “FWIW, I had to manually add rate 48000 under the playback section in the asound.conf to get it to work”, but I’m not able to put that rate 48000 in my asound.conf, as I always get an error message.

Would it be possible to get a copy of working files (asound.conf and .asoundrc), please?

Thanks


#8

I had the same problem with my Jabra Speaker.
Here is my /etc/asound.conf file:

pcm.!default {
  type asym
  playback.pcm {
    type plug
    slave {
      pcm "hw:1,0"
      rate 48000
      format "S16_LE"
      channels 2
    }
  }
  capture.pcm {
    type plug
    slave.pcm "hw:1,0"
  }
}

ctl.!default {
  type hw
  card 1
}

Just make sure that the address "hw:1,0" is the address of your Jabra device if you have more than one sound device on your Raspberry Pi. Hope this helps!


#9

Thanks a lot, it works now… In fact I was using the exact same config, but in my .asoundrc file, and I only just realized that asound.conf was taking precedence over .asoundrc.

Great, thanks again!


#10

You’re welcome.
Took me some googling and fiddling as well before it worked. Happy to share those findings :slight_smile:


#11

What’s the process for making this change? Just sudo nano /etc/asound.conf and save it? Do I need to restart Snips afterwards? I got this to work once, but after the Pi rebooted it stopped working.
I set up the audio again via “sam setup audio” and noticed that the asound.conf file had been changed back to the default.

Is there a way to have Snips build the right config file? How do I go about this?


#12

Hello,

If you execute sam setup audio, asound.conf is generated again.
So the only solution is not to use sam setup audio, and to edit asound.conf manually using the examples you can find on this forum or elsewhere on the net.
You can check whether asound.conf is correct with sam test speaker and sam test microphone.

Ced
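
Since the file can be regenerated behind your back, one option is to keep a copy of the hand-edited config and put it back when the live file no longer matches. Below is a minimal Python sketch of that idea. The function name and the backup path are hypothetical, writing to /etc needs root, and whether Snips must be restarted afterwards is not settled in this thread.

```python
import filecmp
import shutil
from pathlib import Path


def restore_if_changed(live: Path, backup: Path) -> bool:
    """Copy backup over live when their contents differ.

    Returns True if the live file was replaced, False if it already matched.
    """
    if live.exists() and filecmp.cmp(live, backup, shallow=False):
        return False
    shutil.copyfile(backup, live)
    return True


if __name__ == "__main__":
    # Hypothetical paths: the backup is a copy you saved after editing by hand.
    live_conf = Path("/etc/asound.conf")
    good_conf = Path("/etc/asound.conf.good")
    if restore_if_changed(live_conf, good_conf):
        print("asound.conf had changed; restored the hand-edited version")
    else:
        print("asound.conf already matches the hand-edited version")
```

After a restore, sam test speaker and sam test microphone can be used as described above to confirm the config still works.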


#13

Thanks. I’ll give it a shot again.