How Do Voice Commands Work in a Smart TV Remote Control?
Initially, appliances such as televisions, set-top boxes, and air conditioners required only a small amount of control. Usually, an on/off switch, several selection buttons, and two sets of increase/decrease controls were sufficient.
As technology advanced, devices supported more features, and the number of commands and configuration options grew. Users, however, still wanted a single remote to manage every feature, so engineers began to integrate more complex user interfaces (UI): layered menus appeared on the TV screen, and more and more buttons were added to the remote control.
Nowadays, there are two important development directions for smart household appliances. First, devices are becoming "smart" enough to understand human language and carry out the corresponding operations on verbal command, so that people can communicate with and control machines directly by speech. Second, user interfaces are becoming more user-friendly, so that elderly and disabled users can operate devices without barriers. Smart devices can also connect to other devices and to the Internet for added convenience. However, adding ever more buttons to the remote control is impractical for manufacturers and makes for a poor user experience.
In this article, we discuss how voice commands can provide a better user experience, with a focus on the Bluetooth Low Energy (BLE) remote control for smart TVs.
Voice Command and Remote Control
As one news report put it, “Voice control has advanced from being the technology of the future to being one of the hottest new technologies of the day.” Voice is a very powerful and intuitive interface. A short phrase can carry enough information to describe a very complex command. Devices now have access to cloud computing and can use state-of-the-art recognition engines such as those from Microsoft, Google and Amazon. Today, cloud-based speech recognition services provide a very good user experience.
Voice can also be passed to the virtual assistants for smart devices, which have spread rapidly over the past few years. Among the software agents that perform tasks for individuals, Apple's Siri, Google Assistant, and Amazon Alexa are the most widely used.
These virtual assistants, which use natural language processing to match text or voice input to executable commands, have been integrated into phones and are rapidly spreading to smart TVs, watches and wearable devices. Virtual assistants offer a wide range of services, including providing information, playing music and videos, configuring devices, and even purchasing items for customers.
A device could listen for voice commands constantly, but background noise and the distance between the user and the microphone make it difficult to identify the message correctly. In addition, the amount of data exchanged between the device and the cloud service would be very large, and most of the requests sent to the speech recognition engine would be irrelevant. Furthermore, constantly recording environmental sounds can pose serious security and privacy issues.
As a result, a trigger is needed, which can be implemented with a button, a gesture, or a recognizable word or phrase. This solution works well when the user is close to the device, as with a smartphone. It is much more difficult to identify triggers correctly and provide a good user experience with smart TVs, set-top boxes, and other devices that are far from the user. The microphone needs to be close to the user, and the remote control is already in the user's hand. So the most natural solution is to embed a microphone in the remote control, which is then known as a voice remote control.
Voice Command and Speech Recognition in Smart TV
As science and technology continue to develop, speech recognition has made the goals described above achievable. Speech recognition is a technology that allows a machine to recognize a voice signal and convert it into the corresponding text or command.
In short, for the voice remote control manufacturer, the voice command function can be expressed as: "Capture 'sufficient' high quality voice recordings, send them to the speech recognition engine, and then process the text results to derive the user's commands."
To understand the architecture of the voice command remote, we will follow the path of the audio signal through the system. Along the way, we will also pay attention to the challenges usually encountered when trying to implement a cost- and power-efficient voice remote control, and to their possible solutions. In detail, speech recognition in a voice remote control for a smart TV consists of four parts.
The first part is the analog-to-digital conversion part. At the input, it receives the voice signal and converts it into a digital signal that the digital chip can process. At the output, the decoded digital voice signal is converted back into an analog audio signal and played through the speaker.
The second part is the speech recognition part. Its function is to analyze the input digital speech signal and recognize the command it represents. This is generally done by the DSP.
The third part is the voice prompt and playback part, which is also generally handled by the DSP. Its core is the digital compression, encoding and decoding of the voice signal. Its purpose is to prompt the user to operate the device and to respond to the recognized speech, completing the voice interaction between human and machine.
The fourth part is the system control part, which converts the speech recognition result into the corresponding control signal and turns that output into physical-layer operations to carry out the specific function.
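As a rough illustration of the last step, processing the text result to derive the user's command, the following minimal Python sketch maps recognized text to a control code. The phrase table and the `KEY_*` code names are hypothetical, chosen only for illustration; a real product would define its own vocabulary and control signals.

```python
import re

# Hypothetical control codes; a real remote or TV defines its own.
COMMANDS = {
    "volume up": "KEY_VOLUME_UP",
    "volume down": "KEY_VOLUME_DOWN",
    "mute": "KEY_MUTE",
    "power off": "KEY_POWER",
}


def normalize(text):
    """Lowercase the text, drop punctuation, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split())


def derive_command(recognized_text):
    """Return the control code of the first known phrase found, or None."""
    words = normalize(recognized_text)
    for phrase, code in COMMANDS.items():
        # Match whole words only, so "mute" does not fire on "commute".
        if re.search(rf"\b{re.escape(phrase)}\b", words):
            return code
    return None
```

For example, `derive_command("Please turn the Volume up!")` finds the phrase "volume up" and returns its code, while text with no known phrase returns `None`, so the system can ask the user to repeat the request.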
Bluetooth Low Energy (BLE) Remote Control
Those who have used wireless audio protocols are often skeptical about using a data-oriented protocol, such as Bluetooth Low Energy (BLE) technology, to transmit voice commands.
Transferring voice commands is slightly different from transmitting real-time audio or speech (such as a phone conversation). Since the user does not have to listen to their own voice or keep up a conversation, the delay requirement can be relaxed: both a fixed delay and some variation in delay are unavoidable and can be tolerated. However, the data loss requirements are very strict, because missing audio segments may prevent the speech recognition engine from extracting the user's original message. Of course, the machine can ask the user to repeat the message, but that does not make for a good user experience.
Bluetooth Low Energy (BLE) is a packet-based protocol that exchanges packets at specific points in time, called connection events, separated by a periodic connection interval. If interference occurs during a connection event, the packet is corrupted and the data is retransmitted in a later event. Each connection event uses a different frequency channel. Interfering signals often occupy a range of frequencies, corrupting multiple connection events and significantly reducing bandwidth. To solve this problem, BLE provides a mechanism called the channel map update. The master device detects the affected frequency range and carries out a channel map update procedure so that the affected channels are no longer used.
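One way to make the strict data loss requirement checkable on the receiving side is to number the audio packets. The sketch below is only an illustration under that assumption: the `(sequence number, payload)` framing and the function name are hypothetical and are not something BLE or this article prescribes.

```python
def reassemble(packets, total):
    """Rebuild an audio clip from (sequence_number, payload) pairs.

    packets: iterable of (int, bytes) in any order, possibly with duplicates.
    total:   number of packets the sender reported for this clip.
    Returns (audio_bytes, missing), where missing lists the lost sequence numbers.
    """
    by_seq = {}
    for seq, payload in packets:
        # Keep the first copy; a retransmitted duplicate is ignored.
        by_seq.setdefault(seq, payload)
    missing = [seq for seq in range(total) if seq not in by_seq]
    audio = b"".join(by_seq[seq] for seq in range(total) if seq in by_seq)
    return audio, missing
```

Because delay is tolerable, the receiver can afford to wait for retransmissions and forward the clip to the recognition engine only when `missing` is empty.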
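The effect of a channel map update can be sketched in a few lines of Python. This is a simplified model, assuming the hop-and-remap rule of Channel Selection Algorithm #1 from the Bluetooth Core Specification (37 data channels, a fixed hop increment, and remapping of unused channels onto the used ones). The function names and the interference range in the usage note are illustrative only.

```python
DATA_CHANNELS = 37  # BLE data channels are numbered 0..36


def next_channel(last_unmapped, hop_increment, channel_map):
    """Return (unmapped_channel, channel_to_use) for the next connection event.

    channel_map is a list of 37 booleans; True marks a channel as usable.
    """
    unmapped = (last_unmapped + hop_increment) % DATA_CHANNELS
    if channel_map[unmapped]:
        return unmapped, unmapped
    # The hop landed on a channel marked bad: remap onto the used channels.
    used = [ch for ch, ok in enumerate(channel_map) if ok]
    return unmapped, used[unmapped % len(used)]


def hop_sequence(hop_increment, channel_map, events, start=0):
    """List the channel used in each of the next `events` connection events."""
    sequence = []
    last = start
    for _ in range(events):
        last, channel = next_channel(last, hop_increment, channel_map)
        sequence.append(channel)
    return sequence
```

For instance, if channels 10 to 20 suffer interference, the master could send the map `[not (10 <= ch <= 20) for ch in range(37)]`; every channel returned by `hop_sequence` then falls outside that range, so no connection event is wasted on a known-bad frequency.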
Voice remote control manufacturers are sensitive to cost because they produce in high volume. Bluetooth Low Energy (BLE) technology is a low-cost, short-range, interoperable, robust wireless technology that operates in the license-free 2.4 GHz ISM RF band. It was designed from the outset as an ultra-low power (ULP) wireless technology and minimizes power consumption in many intelligent ways. Therefore, BLE technology is ideal for transferring data from miniature wireless sensors (exchanging data every half-second) or from other peripherals, such as remote controls, that use fully asynchronous communication.
Three characteristics of Bluetooth Low Energy (BLE) technology enable its ULP performance in smart TV remote controls: maximum standby time, fast connection, and low peak transmit/receive power consumption.
First, radio "turn-on" time drastically reduces battery life unless it is very short, so any necessary send or receive task needs to be completed quickly. To minimize this time, Bluetooth Low Energy technology uses only three "advertising" channels to search for other devices or to announce its presence to devices seeking to establish a connection, whereas standard Bluetooth technology uses 32 channels. As a result, BLE technology needs to turn the radio on for only 0.6 to 1.2 ms to scan for other devices, while standard Bluetooth technology takes 22.5 ms to scan its 32 channels.
Second, a Bluetooth Low Energy remote control for a smart TV can complete a connection (i.e. scanning for other devices, establishing a link, sending data, authenticating, and terminating properly) in just 3 ms. By contrast, standard Bluetooth technology requires hundreds of milliseconds to complete the same connection cycle.
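The scan-time figures quoted above can be turned into a rough ratio with a few lines of Python. The calculation uses only the numbers in the text and compares radio on-time per scan; it says nothing about total battery life, which depends on many other factors.

```python
BLE_SCAN_MS = (0.6, 1.2)  # on-time range quoted in the text for BLE
CLASSIC_SCAN_MS = 22.5    # on-time quoted in the text for standard Bluetooth


def on_time_ratio(classic_ms, ble_ms):
    """How many times longer the standard Bluetooth radio stays on per scan."""
    return classic_ms / ble_ms


best_case = on_time_ratio(CLASSIC_SCAN_MS, BLE_SCAN_MS[0])
worst_case = on_time_ratio(CLASSIC_SCAN_MS, BLE_SCAN_MS[1])
print(f"Standard Bluetooth on-time per scan: "
      f"{worst_case:.2f}x to {best_case:.2f}x that of BLE")
```

On these figures, the standard Bluetooth radio stays on roughly 19 to 38 times longer for each scan.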
Third, BLE technology limits peak power consumption in two ways: more "relaxed" RF parameters and short packets. Both technologies use Gaussian Frequency Shift Keying (GFSK) modulation, but Bluetooth Low Energy technology uses a modulation index of 0.5, whereas standard Bluetooth technology uses 0.35. An index of 0.5 is close to a Gaussian Minimum Shift Keying (GMSK) scheme, which reduces the power consumption requirements of the radio. This modulation index has two further benefits: increased range and enhanced robustness. Standard Bluetooth technology also uses longer packets. When transmitting these longer packets, the radio must remain in a relatively high power state for a longer period of time, which tends to heat the silicon. This heating changes the physical properties of the material, which in turn shifts the transmission frequency (breaking the link) unless the radio is frequently recalibrated. Recalibration consumes more power and requires a closed-loop architecture, making the radio more complex and pushing up the price of the device.
Speech recognition is gradually becoming a key human-machine interface technology in information technology. The combination of speech recognition and speech synthesis allows people to do without a keyboard and operate devices through voice commands. The application of speech recognition has become a competitive, emerging technology industry.
Voice commands are already in use on smartphones, smart TVs, and set-top boxes. When a low-cost microphone is integrated into a BLE-connected peripheral, the user's speech recognition experience is greatly enhanced. The commands captured by the remote control are transmitted through the smart device to the cloud speech recognition engine, and can then control the smart device itself, the peripherals connected to it, or other devices controlled by the voice assistant.
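To give the modulation index a concrete meaning: for FSK modulation, the index h relates the peak frequency deviation to the symbol rate as h = 2 × deviation / symbol rate. The sketch below applies that standard relation to the two indices quoted above. The 1 Msym/s symbol rate is an assumption added here for illustration; it is not stated in the text.

```python
SYMBOL_RATE_HZ = 1_000_000  # assumption: 1 Msym/s, not stated in the text


def peak_deviation_hz(modulation_index, symbol_rate_hz=SYMBOL_RATE_HZ):
    """Peak frequency deviation for FSK: deviation = h * symbol_rate / 2."""
    return modulation_index * symbol_rate_hz / 2


for name, h in (("Bluetooth Low Energy", 0.5), ("standard Bluetooth", 0.35)):
    print(f"{name}: h = {h}, deviation ~ {peak_deviation_hz(h) / 1000:.0f} kHz")
```

Under this assumption, the index of 0.5 corresponds to a deviation of about 250 kHz and the index of 0.35 to about 175 kHz, so the two symbol frequencies sit further apart with BLE.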
About DUSUN ELECTON LTD.
With more than 16 years of experience in design, development and manufacturing, Dusunremotes has created and delivered billions of high-quality, cost-effective remote control devices worldwide. Our innovative remote control solutions are built on a broad portfolio of wireless technologies, such as BLE, RF4CE, 2.4G and IR, and of interaction paradigms: air mouse, voice, touch, motion, pointing and virtual reality control. Our customers come from a wide range of industries, such as pay TV, PC/OTT, home automation, automotive, consumer electronics and IoT. We always do our best to bring customers value-added, differentiated design experiences for their essential products and brands.