How to differentiate "did not understand what was said" / "nothing was said"?


#1

Hello,

As the title says, I am trying to get some feedback when no intent is recognized, and to find out whether that happened because nothing was said or because the assistant did not understand the sentence.

I am using setOnSessionEndedListener and working on the sessionEndedMessage to try to achieve this. (I wanted to use setOnIntentNotRecognizedListener at first, but another topic showed me it did not work using the simple startSession).

So, in my SessionEndedListener, I am logging raw info:

if (sessionEndedMessage.getTermination() == SessionTermination.Timeout.INSTANCE) {
    Log.i(TAG, "Session termination: timeout");
}

if (sessionEndedMessage.getTermination() == SessionTermination.IntenNotRecognized.INSTANCE) {
    Log.i(TAG, "Session termination: Intent not recognized");
}

If I say something that makes no sense, I can see I’m in the “intent not recognized” case.

If I don’t say anything, I’m in the same situation. I was expecting the Timeout type to be here for this type of situation. It seems I am mistaken.

Is there a way to achieve this?


#2

Hi @remy_v

The dialogue component treats the “nothing was said” case as a special case of “intent not recognized” (i.e. the system was asked to recognize an intent but didn’t hear anything, so it didn’t recognize one). This explains why you see the IntenNotRecognized termination in both cases. (We’ve got a typo in that name; we will fix it. BTW, you should not compare with the instance directly but check the type of the termination.)
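As a rough illustration of checking the type rather than comparing instances, here is a minimal sketch. The SessionTermination classes below are stand-ins written only so the example compiles on its own; they are not the SDK's classes, and the helper name describe is hypothetical. The nested class names simply mirror the ones used in this thread, typo included.

```java
// Stand-in types, NOT the real SDK classes: defined here only so the
// sketch is self-contained. Names mirror the ones used in this thread.
abstract class SessionTermination {
    static final class Timeout extends SessionTermination {
        static final Timeout INSTANCE = new Timeout();
    }

    static final class IntenNotRecognized extends SessionTermination {
        static final IntenNotRecognized INSTANCE = new IntenNotRecognized();
    }
}

final class TerminationCheck {
    // Hypothetical helper: classifies a termination by its type.
    // instanceof keeps working even if the object received is not
    // the very same instance as INSTANCE.
    static String describe(SessionTermination termination) {
        if (termination instanceof SessionTermination.Timeout) {
            return "timeout";
        }
        if (termination instanceof SessionTermination.IntenNotRecognized) {
            return "intent not recognized";
        }
        return "other";
    }
}
```

In the listener from the first post, the same idea would mean replacing each == comparison with an instanceof test on sessionEndedMessage.getTermination().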

The Timeout termination type should not happen when the platform is running properly. You’ll mostly see it if, for example, your action code doesn’t send the Continue/EndSession message after receiving an intent. By default, the ASR stops listening if it doesn’t pick up anything after listening for 4 seconds, which is well below the timeout threshold of the dialogue (by design).

In your case, the way to go is to use setOnIntentNotRecognizedListener, as you mentioned. Note that you need to enable it in your start/continueSession call. The listener gives you an object whose input contains the text the user said; it will be empty if the user didn’t say anything.
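Once the input text is available in the listener, telling the two cases apart could look something like the sketch below. It only covers the decision on the input string; how that string is read from the listener's message object is not shown, since the exact accessor is not given in this thread. The class, enum, and method names are hypothetical, and treating a null or whitespace-only input the same as an empty one is an assumption.

```java
final class UtteranceCheck {
    // Hypothetical outcome labels for the two cases discussed in this thread.
    enum Outcome { NOTHING_SAID, NOT_UNDERSTOOD }

    // Assumption: a null or whitespace-only input is treated like an empty
    // one, i.e. the user did not say anything after the session started.
    static Outcome classify(String input) {
        if (input == null || input.trim().isEmpty()) {
            return Outcome.NOTHING_SAID;
        }
        // Some text was captured, but no intent matched it.
        return Outcome.NOT_UNDERSTOOD;
    }
}
```

The listener body would then call classify with the input it received and branch on the result, for example to play a different prompt in each case.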

Note that this works only on sessions that your code has already “touched”. At the moment there is no way to set a defaultIntentNotRecognized handler to react to the case where the user says the hotword but doesn’t say anything afterwards. Future versions will include a listener for the text captured by the ASR, which will allow you to implement such a thing.