Improve recognition between "on" and "off"?

What’s the best way, if there is one, to improve the accuracy of detecting “on” vs “off” in a sentence? This is the primary use case for home automation, such as “Turn on the kitchen lights”, but Snips seems to mix the two up over 50% of the time.

Thanks,

//TB

It shouldn’t miss them. How do you set up your training examples?

It doesn’t miss them, it confuses the two. In other words, when I say “turn on the kitchen lights” it will hear “turn off the kitchen lights” and vice versa.

There are two separate intents, based on the Home Assistant HassTurnOn and HassTurnOff, with examples such as the following (one intent has “on”, and the other has “off”):

turn off [office lights](name)
turn [tv](name) off
turn off the [outside lights](name)
turn the [master bedroom lights](name) off
switch off [office lights](name)
switch [tv](name) off
switch off the [outside lights](name)
switch the [master bedroom lights](name) off

Would I get better results if I combine into one intent and make “on/off” a slot?

Thanks,
//TB

You might get better results with on/off as a slot, yes, but I cannot guarantee it, to be honest. This has been evolving back and forth.
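
For anyone who wants to try the single-intent variant, here is a minimal sketch that generates training examples with on/off as a slot, in the same annotation style as the examples above. The slot name `state` is hypothetical (it does not appear in the original intents), and as noted, there is no guarantee this improves recognition.

```python
from itertools import product

# "name" is the slot used in the examples above.
# "state" is a hypothetical slot name for the on/off value.
VERBS = ["turn", "switch"]
STATES = ["on", "off"]
NAMES = ["office lights", "tv", "outside lights", "master bedroom lights"]


def build_examples(verbs, states, names):
    """Return annotated training sentences with on/off tagged as a slot."""
    examples = []
    for verb, state, name in product(verbs, states, names):
        # "turn off the [tv]" word order
        examples.append(f"{verb} [{state}](state) the [{name}](name)")
        # "turn the [tv] off" word order
        examples.append(f"{verb} the [{name}](name) [{state}](state)")
    return examples


if __name__ == "__main__":
    for line in build_examples(VERBS, STATES, NAMES):
        print(line)
```

The output can be pasted into a single combined intent, and the skill then reads the `state` slot to decide whether to turn the device on or off.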

Thanks, any other ideas? I couldn’t seem to find detailed documentation on the various ASR configuration options that might help, such as beam_size and max_activity (not in docs, found a reference to it in the forums).

In your training examples, use words like glow, resume, and start for the Turn On intent, and words like halt, pause, stop, kill, and close for the Turn Off intent. They have a better chance of being recognized correctly than Turn On and Switch On, because on and off sound much the same most of the time.
BTW, if anyone comes up with more such words, please let me know too.

Thanks adityakush,

I’m not willing to sacrifice natural interactions (and neither is my family) by using a workaround because the implementation isn’t good enough. I’ve used Alexa and Google Assistant and neither had this level of problems, so it’s technically feasible to get more than a 50% success rate.

Are there any docs about the beam_size, max_activity, etc anywhere?

Hi docBliny,
I found this in the documentation.
But the parameters are not explained there.

Thanks atomix,

Yeah, I saw those but without some explanation it’s really hard to even try to Google what those settings do and how/if they might help.

//TB

What hardware do you use, btw? I am using an Orange Pi Lite with the onboard mic, and surprisingly it works better than my earlier setup, a Raspberry Pi 3B+ with a USB mic. With the USB mic, I had a lot of on/off mix-ups. But with the OPi Lite it works pretty well most of the time and has great range too.

I haven’t tried a mic array yet but I think that should produce even better results in getting words right.

I’m using Raspberry Pi 3b+s with ReSpeaker Mic Array v2.0s.

Hm, that’s good enough. Maybe Snips needs some more time before its speech recognition becomes as good as Alexa / Google.

Which firmware do you have on your mic array? I had to install the one-channel 6.02dB gain firmware for it to work best.
Johannes

As a workaround I did this, and it is working wonders for me now. Thanks to you for making me think about this in the first place.
I edited the file
sudo nano /usr/share/snips/assistant/custom_asr/config.json
In this file, go to this part

    "endpointingRules": {
        "rule3": {
            "maxRelativeCost": 8.0, 
            "minTrailingSilence": 1.0, 
            "minUtteranceLength": 0.0, 
            "mustContainNonsilence": true
        }, 
        "rule2": {
            "maxRelativeCost": 5.5, 
            "minTrailingSilence": 0.7, 
            "minUtteranceLength": 0.0, 
            "mustContainNonsilence": true
        }, 
        "rule1": {
            "maxRelativeCost": 10000000000.0, 
            "minTrailingSilence": 4.0, 
            "minUtteranceLength": 0.0, 
            "mustContainNonsilence": false
        }, 
        "rule5": {
            "maxRelativeCost": 10000000000.0, 
            "minTrailingSilence": 0.0, 
            "minUtteranceLength": 10.0, 
            "mustContainNonsilence": false
        }, 
        "rule4": {
            "maxRelativeCost": 2.5, 
            "minTrailingSilence": 0.5, 
            "minUtteranceLength": 0.0, 
            "mustContainNonsilence": true
        }

Edit the minTrailingSilence line in rule2 and rule4 so that it reads

"minTrailingSilence": 2.0, 

Save and reboot.
Now Snips will listen a bit longer before it stops capturing speech.
So now I can say “turn off”, stress “off” a bit, and then say “fan”.
Increasing the above time to 2s lets you say your words clearly, stressing them a bit.
Without the above modification, if I stressed “turn off” and then said “fan” after a short pause, Snips only captured “turn off”.
But now Snips can clearly hear me saying on/off along with the part that follows.
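
If you would rather script the edit than do it by hand in nano, a minimal Python sketch follows. It assumes the file is plain JSON and that "endpointingRules" sits somewhere inside it. The excerpt above does not show where it is nested, so the helper searches for it. The function names are made up for illustration.

```python
import json

# Path taken from the post above.
CONFIG_PATH = "/usr/share/snips/assistant/custom_asr/config.json"


def find_endpointing_rules(node):
    """Return the first dict stored under "endpointingRules", searching nested dicts and lists."""
    if isinstance(node, dict):
        rules = node.get("endpointingRules")
        if isinstance(rules, dict):
            return rules
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_endpointing_rules(child)
        if found is not None:
            return found
    return None


def set_trailing_silence(config, rule_names=("rule2", "rule4"), seconds=2.0):
    """Set minTrailingSilence on the given rules in place and return the config."""
    rules = find_endpointing_rules(config)
    if rules is None:
        raise KeyError("endpointingRules not found in config")
    for name in rule_names:
        rules[name]["minTrailingSilence"] = seconds
    return config


def patch_file(path=CONFIG_PATH):
    """Load the config, apply the change, and write it back."""
    with open(path) as f:
        config = json.load(f)
    set_trailing_silence(config)
    with open(path, "w") as f:
        json.dump(config, f, indent=4)
```

Calling `patch_file()` needs write access to the file (for example, running the script with sudo). Keep a backup copy of config.json first, then reboot as described above.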