Blog

MSFT CLU Deeper Dive: Yes, Azure AI Conversational Language Understanding (CLU) Is Much Better!

Brad Crain, 2 min read

I have enhanced our chatbot authoring solution so a chatbot author can choose whether a chatbot uses Microsoft's Language Understanding (LUIS) service or the Azure Conversational Language Understanding (CLU) service at runtime. In my previous post, I mentioned that my first impression - from some spot testing in the LUIS and CLU authoring applications - was that CLU is a great improvement over LUIS. I have since done more research and, wow, yes, CLU really is a great improvement over LUIS.
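At runtime, letting the author pick a recognition service amounts to a simple dispatch. Here is a minimal, hypothetical sketch of that idea - the function names and the stand-in recognizers are illustrative, not eBotSpot's actual code:

```python
def recognize(utterance, service, recognizers):
    """Send the utterance to whichever recognition service the
    bot author configured ("LUIS" or "CLU") and return its
    (top_intent, confidence_score) result."""
    if service not in recognizers:
        raise ValueError(f"Unknown recognition service: {service}")
    return recognizers[service](utterance)

# Stand-in recognizers; in a real bot these would call the
# LUIS prediction endpoint and the CLU analyze endpoint.
recognizers = {
    "LUIS": lambda text: ("RequestJoke", 0.11),
    "CLU": lambda text: ("RequestJoke", 0.89),
}

top_intent, score = recognize("Humor me please", "CLU", recognizers)
```

Keeping both services behind the same small interface is what makes the per-bot toggle cheap to support.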

In eBotSpot's chatbot authoring application I added support for CLU (it already had support for LUIS), along with a new Verify feature that lets the author enter an utterance and see how it is, or isn't, recognized by LUIS or CLU. Verify makes the API call to the appropriate LUIS or CLU endpoint, passing in the author-defined query/utterance; the results are then presented to the author as a table of all the intents recognized and their respective scores, along with the top intent and its score.
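The presentation step can be sketched roughly like this. The response shape below follows the general format of a CLU prediction (a top intent plus a list of intents with confidence scores); the function name and sample values are illustrative, not eBotSpot's actual code:

```python
def summarize_prediction(prediction):
    """Turn a CLU-style prediction payload into what the Verify
    table shows: the top intent plus every recognized intent with
    its confidence score, highest score first."""
    intents = sorted(
        ((i["category"], i["confidenceScore"]) for i in prediction["intents"]),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return prediction["topIntent"], intents

# A response fragment shaped like a CLU result (values are made up):
sample = {
    "topIntent": "RequestJoke",
    "intents": [
        {"category": "None", "confidenceScore": 0.31},
        {"category": "RequestJoke", "confidenceScore": 0.89},
    ],
}

top, table = summarize_prediction(sample)
```

The same summary works for LUIS results once they are mapped into this shape, which keeps the Verify table identical for both services.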

Once that was implemented, I used the new Verify feature to test Microsoft's claim that CLU has "…state-of-the-art language models that understand the utterance's meaning and capture word variations, synonyms, and misspellings while being multilingual." My findings bear that claim out: CLU is indeed much better than LUIS in these areas (synonyms, misspellings, word variations). These capabilities alone are a key reason why anyone currently using LUIS should upgrade to CLU.


Here are some examples of my testing results.  

Exhibit 1. All my work in the CLU application was focused on English. The built-in support for additional languages - without me, as the author, having to enter utterances in any language beyond English - worked very well, as shown by the screenshot above (this is the information shown by our chatbot authoring Verify feature).

CLU API call results showing the top intent was correctly identified as RequestJoke.

LUIS was set up with the same utterances as CLU, but LUIS did not correctly identify the intent for the same utterance, "Humor me please".

Exhibit 2. Both CLU and LUIS were trained with the same utterances. I had an intent "RequestJoke" with training utterances such as "Tell me a joke", "Got any jokes", and so on - and not very many training utterances, either. As can be seen above, when I used the phrase "Humor me please" in the Verify feature, CLU correctly identified RequestJoke as the top intent with a high confidence score of 0.8947012. LUIS, by contrast, didn't produce any score above 0.10686312, and the correct intent, RequestJoke, wasn't even in its top three results.
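The gap between those two scores matters in practice because most bots gate on confidence. A common pattern - the threshold value here is my own illustrative choice, not something from either service - is to fall back to an unknown/"None" intent when the score is too low:

```python
def accept_top_intent(top_intent, score, threshold=0.5):
    """Accept the recognizer's top intent only when its confidence
    clears the threshold; otherwise fall back to "None" so the bot
    can ask a clarifying question instead of guessing."""
    return top_intent if score >= threshold else "None"
```

Under a gate like this, CLU's 0.8947012 for RequestJoke would be accepted, while LUIS's best score of 0.10686312 would be rejected outright - so the LUIS-backed bot would have failed on "Humor me please" twice over: wrong intent, and a score too low to act on anyway.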