Leveraging Speech Recognition with Qt Commercial
April 16, 2012 by Tuukka Turunen
In order to have easy access to speech recognition with Qt Commercial, we have embarked on a research project leveraging the Nuance VoCon Hybrid speech recognition engine. To demonstrate the potential, speech recognition is integrated into our concept Car HMI system built with Qt Commercial. Using Nuance VoCon Hybrid speech recognition together with Qt Commercial allows the user to easily control the system with both touch and voice.
In our everyday interactions, speaking is probably the most common method of exchanging information with others. Today, interaction with computers and embedded systems is mainly done via a graphical UI. In many use cases speech would complement the interaction, but it is difficult to build really great speech recognition. Poorly working speech recognition is a common source of entertainment, but it is not funny at all if it is your own system that is equipped with it.
Touch and Voice Input for Automotive and Beyond
Combining the power of touch and voice input is very valuable in an automotive HMI solution. The user can choose the best input option depending on the situation. Voice input makes it possible to perform many use cases safely with both hands on the wheel. It is not just cars that benefit from using voice to supplement the other input mechanisms of the system. Being able to leverage reliable and powerful speech recognition is a great asset for many embedded systems as well as applications.
In order to have easy access to speech recognition from Qt Commercial, we have worked with Nuance to integrate their VoCon Hybrid speech recognition engine. Hybrid means the engine can recognize speech inside the client application and can also use server-based recognition when needed. If the client side is not able to recognize a command reliably, it is possible to leverage server-based recognition with larger vocabularies and more languages. For a connected solution it is straightforward to let server-based recognition complement the client’s recognition capability, which adds great versatility. Having well-working client-side recognition is important for fast reaction times, as well as for offline use. In addition to the use cases we have already integrated into our technology demonstration, VoCon and other technologies from Nuance allow fluent text-to-speech and dictation, as well as navigation with one-shot address entry, just to name some examples.
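The client-first, server-as-fallback behaviour described above can be sketched in a few lines of C++. This is only an illustration of the idea, not the Nuance VoCon API or our bindings: the names RecognitionResult, Recognizer and recognizeHybrid, the confidence scale and the 0.8 threshold are all assumptions made for the example.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical types for illustration only; none of this is the Nuance VoCon API.
struct RecognitionResult {
    std::string text;    // recognized command
    double confidence;   // assumed scale: 0.0 (no match) to 1.0 (certain)
    bool fromServer;     // true if the server-based recognizer produced it
};

// A recognizer takes an utterance and may or may not return a result.
using Recognizer =
    std::function<std::optional<RecognitionResult>(const std::string&)>;

// Try the on-device recognizer first. Ask the server only when the local
// result is missing or below the threshold and a connection is available.
std::optional<RecognitionResult> recognizeHybrid(const Recognizer& local,
                                                 const Recognizer& server,
                                                 const std::string& utterance,
                                                 bool online,
                                                 double threshold = 0.8)
{
    assert(threshold >= 0.0 && threshold <= 1.0);

    std::optional<RecognitionResult> localResult = local(utterance);
    if (localResult && localResult->confidence >= threshold)
        return localResult;  // fast path, also works offline

    if (online) {
        std::optional<RecognitionResult> serverResult = server(utterance);
        if (serverResult)
            return serverResult;  // larger vocabulary on the server side
    }

    // Best effort: a low-confidence local result, or nothing at all.
    return localResult;
}
```

In this sketch a low-confidence local result is still handed back when the system is offline, so the caller can decide whether to ask the user for confirmation instead of acting on it directly.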
Qt Commercial Car HMI Concept with VoCon
We have created a demonstration of the Nuance speech recognition solution with Qt Commercial, leveraging our Car HMI concept. With speech recognition it is now possible to navigate around and perform direct operations with voice commands. By recognizing the names of the different views, the system is able to take the user directly to the requested view for further actions, or to perform specified use cases. Compared to navigating with touch input, it is straightforward to provide these direct links by speaking. For example, it is possible to use voice input to call a specific person or play a song.
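As a rough illustration of such direct links, the sketch below routes a recognized utterance to an action by matching its leading phrase. The VoiceCommandRouter class and the phrases used with it are hypothetical and chosen for this example; they are not taken from the demo's source code or from any Qt or Nuance API.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical dispatcher for illustration; not part of Qt or the Car HMI demo.
class VoiceCommandRouter {
public:
    using Action = std::function<void(const std::string& argument)>;

    // Register a leading phrase such as "call" or "show".
    void add(const std::string& phrase, Action action) {
        assert(action && "action must be callable");
        actions_[normalize(phrase)] = std::move(action);
    }

    // Run the action whose phrase is the longest whole-word prefix of the
    // utterance, passing the rest as argument. Returns false if none matches.
    bool dispatch(const std::string& utterance) const {
        const std::string text = normalize(utterance);
        const Action* best = nullptr;
        std::size_t bestLength = 0;
        for (const auto& [phrase, action] : actions_) {
            const bool prefix = text.compare(0, phrase.size(), phrase) == 0;
            const bool boundary = prefix &&
                (text.size() == phrase.size() || text[phrase.size()] == ' ');
            if (boundary && (!best || phrase.size() > bestLength)) {
                best = &action;
                bestLength = phrase.size();
            }
        }
        if (!best)
            return false;
        (*best)(text.size() > bestLength ? text.substr(bestLength + 1)
                                         : std::string());
        return true;
    }

private:
    // Lower-case the text so that matching ignores capitalization.
    static std::string normalize(std::string s) {
        std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
            return static_cast<char>(std::tolower(c));
        });
        return s;
    }

    std::map<std::string, Action> actions_;
};
```

With a "show" phrase registered to switch views and a "call" phrase registered to start a phone call, an utterance like "Show music" would reach the first action with the argument "music". Note that the argument arrives lower-cased, because the whole utterance is normalized before matching.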
We have created the needed bindings for the Nuance VoCon and VoCon Hybrid with the Qt Commercial technology demonstration as part of this research project. These are still at an experimental level for now, but we plan to continue the work in this area together with Nuance. We aim to provide all the needed tools to easily use Nuance speech recognition from any Qt Commercial application or embedded device. The speech recognition engine is subject to a separate agreement with Nuance, but the bindings to Qt Commercial will be available to all our licensees.
Check out a demo of our speech integration below.
Leading Provider of Voice Recognition
Luckily there are very good speech recognition solutions available from Nuance, which is truly the leading provider of speech recognition. Their solutions are used in an amazingly large number of different devices and applications to provide state-of-the-art speech recognition for automotive, healthcare and many other industries. Nuance is focused on voice-controlled systems, with a staff of 6,000 people and a strong product portfolio covered by over 4,000 patents and patent applications. Their solutions also support 50 languages, which is very important for reliable recognition. If you would like some more proof points on their solution, please visit www.nuance.com.
Qt Commercial at the Nuance Automotive Forum April 16-17
Today, we are showing a Qt Commercial based Car HMI demo with integrated Nuance speech recognition at the Nuance Automotive Forum in Stuttgart. Qt Commercial is an excellent choice for automotive manufacturers building HMI systems that meet the needs of the 21st century. Backed by strong business interests and an active community, it is a solid and future-proof choice for building what consumers will desire in the years to come.
Let us know what you think of our speech recognition integration. We appreciate feedback on our research projects!