Sonos Announces Acquisition of Snips: Is this the end of Snips for makers?

Rhasspy and Home Assistant here.
I created automations that respond to the events Rhasspy generates. They work well for me and were good out of the box.
I see a lot of people trying Snips, and the first part works well. But then they get stuck with skills and such.
The same is probably true of every assistant except the Google/Apple/Amazon ones.
Initial setup is relatively easy, but then what?

I think I will try to write a guide for using Rhasspy with Home Assistant, if one does not already exist.

I’ve tested Rhasspy and it works pretty well.

Great job, @Romkabouter and synesthesiam!

The default ASR (using pocketsphinx in French) needs more initial configuration than Snips (obviously…) but sometimes works better (for the few test cases I implemented).

The default NLU (using OpenFST) also works quite well for simple cases.

The only thing really missing for an easy transition from a Snips integration to Rhasspy is the built-in slot types (datetime, number, temperature, percentage, etc.). Some of these can be recreated using JSGF grammars, but it is really tedious. Maybe it can be achieved directly using something like python-duckling?

Alternatively, using Snips NLU might alleviate this, but it requires some coding to integrate it with Rhasspy (using custom training commands and remote HTTP NLU).

https://duckling.wit.ai

I tried duckling and it is really fast, but for now it does not support multi-level durations (e.g. 2 hours and 35 minutes), and the logic to associate the extracted value with the intent slot needs to be handled manually.
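To make the two gaps concrete, here is a minimal sketch of handling them by hand in plain Python. It does not use duckling at all. The function names (`parse_duration`, `fill_slot`), the intent dictionary layout and the slot format are assumptions made for illustration, and it only understands digits followed by an English unit name.

```python
import re

# Seconds per unit. The supported units are an illustrative choice.
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}
_PART = re.compile(r"(\d+)\s*(second|minute|hour|day)s?\b", re.IGNORECASE)


def parse_duration(text):
    """Sum every '<number> <unit>' pair in text; return None if there is none."""
    matches = _PART.findall(text)
    if not matches:
        return None
    return sum(int(n) * _UNIT_SECONDS[unit.lower()] for n, unit in matches)


def fill_slot(intent, slot_name, text):
    """Attach the parsed duration to a (hypothetical) intent dictionary."""
    slots = dict(intent.get("slots", {}))
    seconds = parse_duration(text)
    if seconds is not None:
        slots[slot_name] = {"kind": "duration", "seconds": seconds}
    return {**intent, "slots": slots}


print(parse_duration("2 hours and 35 minutes"))  # 9300
```

Spelled-out numbers ("two hours") and other languages would need extra handling, which is exactly the tedious part.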

I built and tested snips-nlu-rs and, although it depends on a lot of Snips GitHub repositories, it works with an NLU engine trained offline (using the Snips NLU Python library).

Integrating the Snips NLU engine in Rhasspy for example should allow lots of Snips makers to migrate without changing their intent handling system.

It is a short-term solution, though, as the NLU engine may soon be discontinued by Snips/Sonos.

I’ll keep searching for a better way to handle NLU using “community libraries” :slight_smile:

“It is a short term solution though as the NLU engine may soon be discontinued by Snips/Sonos”

I don’t think we’re there yet. For now, the NLU library works pretty well and has been open source for a long time. I personally rely on it for my tiny assistant (Pytlas - An open-source python library to build your own assistant), but I could migrate to another engine since it has been abstracted into an Interpreter class.
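For anyone wondering what that kind of abstraction buys you, here is a rough sketch of the idea. This is not the actual Pytlas API: the class names, the `parse` method and the returned dictionary are all hypothetical, and the keyword matcher only stands in for a real NLU engine.

```python
from abc import ABC, abstractmethod


class Interpreter(ABC):
    """Hypothetical interface: turn raw text into an intent dictionary."""

    @abstractmethod
    def parse(self, text):
        ...


class KeywordInterpreter(Interpreter):
    """Toy backend that matches substrings; a real one would wrap an NLU engine."""

    def __init__(self, keywords):
        self.keywords = keywords  # maps intent name -> list of trigger phrases

    def parse(self, text):
        lowered = text.lower()
        for intent, phrases in self.keywords.items():
            if any(phrase in lowered for phrase in phrases):
                return {"intent": intent, "slots": {}}
        return {"intent": None, "slots": {}}


class Assistant:
    """Depends only on the Interpreter interface, never on a concrete engine."""

    def __init__(self, interpreter):
        self.interpreter = interpreter

    def handle(self, text):
        return self.interpreter.parse(text)["intent"]


assistant = Assistant(KeywordInterpreter({"lights_on": ["turn on"]}))
print(assistant.handle("Please turn on the lights"))  # lights_on
```

Swapping the NLU engine then means writing one more `Interpreter` subclass, while the intent handling code stays untouched.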

What kind of hardware are you using?
Does it support Matrix Voice Standard?

Yes. Even without the ESP32 chip, you can still attach it to a Pi and use it as a microphone.

Unfortunately, it’s the end of Snips for makers:

Is snips-tts also open source? Because for languages other than English, most free/non-cloud TTS solutions sound plain crappy.

At least for German, they just use Pico TTS and nothing of their own. You can download and install pico2wave pretty easily.
Johannes
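If you want to try it from Python, a minimal sketch could look like this. It assumes `pico2wave` is installed and on your PATH and that the `de-DE` voice is available. The helper names and the output path are just examples.

```python
import shutil
import subprocess


def pico2wave_command(text, wav_path, lang="de-DE"):
    """Build the pico2wave command line (-l = language, -w = output WAV file)."""
    return ["pico2wave", "-l", lang, "-w", wav_path, text]


def synthesize(text, wav_path, lang="de-DE"):
    """Run pico2wave; raises if the binary is missing or exits with an error."""
    if shutil.which("pico2wave") is None:
        raise FileNotFoundError("pico2wave is not installed or not on PATH")
    subprocess.run(pico2wave_command(text, wav_path, lang), check=True)


# Example call, only meaningful on a machine with pico2wave installed:
# synthesize("Guten Morgen", "/tmp/hello.wav")
```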

Ah, didn’t know that. Might be worth checking out.

When looking at alternatives like Rhasspy, how would one realize the concept of Snips satellites? I like the idea of having one base and multiple satellites across the rooms.

Does Rhasspy support this in some way, manually or out of the box?

Disclaimer: I don’t actually use Rhasspy, but I read the docs and had a little play.
I think they also support audio input over MQTT using https://pypi.org/project/hermes-audio-server/ which is a project based on one of the few components Snips actually made open source. It should be possible to do something with this.

I’m the developer of this project. It isn’t based on any Snips component, it’s just a minimal implementation of the audio server part of the Hermes protocol that Snips uses.

When I stopped using Snips half a year ago and found Rhasspy, I missed Snips satellites so I implemented this functionality in Hermes Audio Server so I could run this on a Raspberry Pi satellite device.
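To give an idea of what the audio server part of the protocol involves, here is a small sketch, not code taken from Hermes Audio Server. As far as I know, audio is published as small WAV-wrapped chunks on a per-site topic. The helper names, the `bedroom` site ID and the 16 kHz mono 16-bit format are assumptions for illustration, and the actual MQTT publish is left to whatever client you use.

```python
import io
import wave


def audio_frame_topic(site_id):
    """MQTT topic on which a satellite publishes its audio frames."""
    return f"hermes/audioServer/{site_id}/audioFrame"


def wav_frame(pcm, rate=16000, channels=1, sample_width=2):
    """Wrap raw PCM bytes in a WAV container, one chunk per MQTT message."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(pcm)
    return buffer.getvalue()


# 256 samples of 16-bit silence standing in for microphone input.
payload = wav_frame(b"\x00\x00" * 256)
topic = audio_frame_topic("bedroom")  # hypothetical site ID
# An MQTT client would now publish `payload` on `topic`.
```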

Ah, I misread then. Thanks for the clarification and the awesome piece of software.

Note that Rhasspy cannot handle multiple individually addressable satellites yet; that still has to be implemented. A setup with one Rhasspy server and one Hermes Audio Server satellite already works fine, though.

Another option is using @Romkabouter’s (another Snips refugee) Matrix-Voice-ESP32-MQTT-Audio-Streamer, which runs on a standalone MATRIX Voice device (no Raspberry Pi needed).

DeepSpeech 0.6 works great and is a big improvement.

Since it’s using TensorFlow:
What hardware requirements does it have? I know this is speech-to-text, but the various text-to-speech engines based on deep learning seem to really need capable hardware.

Hi @koan, the hermes-audio-server docs say that we should plug in a hotword detector if, for privacy reasons, we do not want to stream continuously. Is that an undocumented option of hermes-audio-server (a hotword plugin like Porcupine embedded directly into the audio server, like webrtcvad), or is it something we have to develop ourselves?

Regards

This is not supported yet in hermes-audio-server. Contributions/ideas are welcome on the project’s GitHub repository.
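As a starting point for ideas, the gating logic itself could be sketched like this. It is purely hypothetical and not part of hermes-audio-server: `gate_frames`, the `is_hotword` callback and the `window` length are made-up names, and a real detector such as Porcupine would have to be plugged in behind `is_hotword`.

```python
def gate_frames(frames, is_hotword, window=50):
    """Forward only the `window` frames that follow a hotword detection.

    Frames seen while idle are inspected locally and then dropped, so
    nothing leaves the device until the hotword has been heard.
    """
    remaining = 0
    for frame in frames:
        if remaining > 0:
            remaining -= 1
            yield frame
        elif is_hotword(frame):
            remaining = window


# Toy run: integers stand in for audio frames, and frame 3 is the "hotword".
streamed = list(gate_frames(range(10), lambda frame: frame == 3, window=2))
print(streamed)  # [4, 5]
```

A fixed window is the simplest choice. Stopping on silence (for example with webrtcvad) would probably work better in practice.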

My company (https://keenresearch.com/) develops and licenses SDKs for on-device speech recognition, currently for iOS and Android. Linux/Raspberry Pi is definitely possible as well, although we’ve been a bit reluctant to go down that path.

We license the SDK on a commercial basis, so this is probably not relevant for users who are looking for open-source/free alternatives. In case it is, please feel free to reach out (info@keenresearch.com).

Best,
Ogi