At the heart of modern conversational agents is the "dialog manager". Though definitions vary, at a high level, all dialog managers determine the most fitting output given one or more user inputs.

Inputs and outputs can be divided into two broad categories: audio and transcript. Most dialog manager implementations that accept audio input apply some variation of speech-to-text to convert the input audio (large parameter space) into a text transcript (smaller parameter space), which reduces the computational cost and duration of the downstream processes. Similarly, on the output side, dialog managers usually produce a text output so that different text-to-speech implementations can be used to produce the desired voice(s) based on branding and context. To ensure maximum flexibility, Voiceflow's dialog manager API expects a text transcript as input and produces both a text and an audio output based on a preconfigured voice persona.

Now that the input and output formats are specified, the dialog manager problem is simplified to a "black box" application that ingests an input transcript from the user and outputs the response transcript to the user. Many competing implementations of this black box exist; in many cases, they are quite literally black-box machine learning models. Voiceflow approaches this problem from a different perspective: allow the designer to specify an observable and explainable logical transition graph that can be triggered by various user intents.

Intent Classification → Natural Language Processing

Intent classification and entity extraction are two natural language processing techniques used to further simplify the problem space for the dialog manager.
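To make the transcript-in, transcript-out idea concrete, here is a minimal sketch of a dialog manager driven by a designer-specified transition graph. It is not Voiceflow's implementation or API: the names `INTENT_KEYWORDS`, `GRAPH`, `classify_intent`, and `DialogManager` are hypothetical, and the keyword matcher is only a stand-in for a real intent classifier.

```python
from dataclasses import dataclass

# Hypothetical keyword lists standing in for a trained intent classifier.
INTENT_KEYWORDS = {
    "order_pizza": ["pizza"],
    "cancel": ["cancel", "never mind"],
}


def classify_intent(transcript: str) -> str:
    """Map a user transcript to an intent name, or "fallback" if none match."""
    text = transcript.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return "fallback"


# Designer-specified transition graph (illustrative only):
# current state -> intent -> (next state, response transcript)
GRAPH = {
    "start": {
        "order_pizza": ("choose_size", "What size would you like?"),
    },
    "choose_size": {
        "cancel": ("start", "Okay, I cancelled your order."),
    },
}


@dataclass
class DialogManager:
    """Text transcript in, text transcript out, with an inspectable state."""

    state: str = "start"

    def respond(self, transcript: str) -> str:
        intent = classify_intent(transcript)
        transition = GRAPH.get(self.state, {}).get(intent)
        if transition is None:
            # No edge for this intent in the current state: stay where we are.
            return "Sorry, I didn't catch that."
        self.state, response = transition
        return response


dm = DialogManager()
print(dm.respond("I'd like a pizza"))
print(dm.respond("Actually, cancel that"))
```

Because every transition is an explicit entry in `GRAPH`, a designer can trace exactly why a given response was produced, which is the property the text contrasts with black-box machine learning models. Speech-to-text and text-to-speech would sit before and after `respond` and are left out of the sketch.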