EE Times Europe - Responsive Smart Speaker Designs

2022-06-19 00:21:14 By : Ms. Yin Irene

The rapid adoption of smart speakers in homes is accompanied by rising user expectations. In order not to disappoint them, developers need reliable components to create new and sophisticated products. These requirements call for innovative and comprehensive solutions. Infineon offers a comprehensive portfolio from a single source, as well as an ecosystem, to support the development of advanced smart speakers.

The market for smart speakers is growing rapidly, and new features are constantly being added to improve the user experience. The Voice Assistant Platform Forecasts from SAR Insight & Consulting predict that global shipments of smart speakers will grow to almost 200 million units per year by 2026.

The installed base is expected to double from 539 million units in 2021 to 1,051 million units in 2027, a compound annual growth rate (CAGR) of 9% (Figure 1). However, as the market grows, so do user expectations. Frustration with devices that do not understand instructions and respond accordingly dampens user adoption. As a result, smart speaker adoption is faltering, and growth rates are not reaching their full potential.

To meet the high expectations for the technology and its capabilities, OEMs need solutions that give them a competitive edge, such as easy implementation of new functions and support for the development of future use cases. At the same time, all components need to stay on the cutting edge of various semiconductor developments, but this also requires a tremendous amount of know-how. Components such as microelectromechanical system (MEMS) microphones, touch controllers, and new technologies such as radar are key to improving the user experience in the smart speaker segment — but only if developers know all the products in detail and can apply them correctly.

For example, radar solutions enable future use cases such as vital sensing, gesture sensing, and presence detection (Figure 2), but implementing the appropriate components is challenging. Radar sensors are highly complex products that require specialized knowledge and manpower. However, small OEMs and ODMs often lack the necessary know-how for touch controllers and gesture recognition.

In addition, the growing bill of materials (BOM) complicates the development process and leads to various stumbling blocks: Complex applications require a variety of components, often from different semiconductor manufacturers. With the complex mix of sensors, including digital, analog, and mixed-signal circuits, as well as power supply management and software, it is difficult to keep track of all vendors and their product portfolios. However, to develop reliable smart speakers, it is important to know which components are compatible and how well they complement each other. This can make component selection and procurement unnecessarily complicated, which in turn makes it harder for development engineers to design their new products.

To meet all these challenges, it is crucial to have a reliable partner with a comprehensive product portfolio that makes development as easy as possible. For instance, with many years of experience in sensors, connectivity, and power solutions, Infineon has the expertise needed to help developers and manufacturers meet the consumer market’s requirements for performance, reliability, and energy efficiency. In addition, Infineon has a comprehensive offering of all the key components needed to build smart speakers, as well as an ecosystem of software partners to create a one-stop-shop experience and to support and drive market trends (Figure 3).

Smart speakers may seem undistinguished at first glance, but they are highly complicated devices that consist of diverse components. They are not just speakers but are intended to serve as the hub of the smart home. To realize all the different functions, numerous sensors, controllers, and microphones are needed. Infineon offers the necessary components, based on a deep-rooted system knowledge, from a single source. However, to further simplify evaluation and development and provide an overview of the extensive product range, the company is working with its official ecosystem partner, Sugr, on a new evaluation kit (see profile, below). The kit helps engineers test and evaluate the performance of individual products. They can also find out how the different components work together and which combination is best suited for their application to achieve the best results. In addition, the kit can be easily integrated with the expertise of Infineon and its partners, significantly reducing development time and accelerating time to market. The evaluation kit contains all key components for the development of a smart speaker.

Nowadays, with perfect audio input, algorithms have little problem understanding spoken words, unless they are confronted with a strong dialect or accent. In the real world, however, audio is rarely perfect: The speaker might be in a space with ambient noise such as a TV, car noise, or other people talking in the same room. There are several techniques to improve the quality of the audio pickup, such as using a high-quality microphone and localizing the voice source so that disturbing noise can be ignored. In the past, large electret condenser microphones were used here, but tiny MEMS microphones, such as Infineon’s XENSIV™ MEMS microphones, offer distinct advantages and are a good fit for smart speakers.

With high sensitivity, low self-noise (high SNR), and low distortion, the microphones are designed for voice user interface (VUI) applications that also require a wide dynamic range and a high acoustic overload point. They offer crystal-clear audio signals, an extended pickup distance, and sensitivity to both soft and loud signals. The best-in-class mic-to-mic matching results in identical audio signals from multiple microphones, which can be used for noise cancellation or ultra-precise beamforming to identify a sound source and recognize a specific speaker among multiple speakers. Furthermore, the microphones are only a few cubic millimeters in size and can be easily integrated into smart speaker designs and even mounted directly on printed-circuit boards because they are packaged like other electronic chips and manufactured in foundries in a similar way. At the same time, their design allows for consistent performance over their lifetime, so there is no drift to confuse algorithms. In addition, the XENSIV™ MEMS microphones (Figure 4) enable improved audio input and thus excellent command recognition in a wide range of sound levels, from whispering to very loud noises, and even at longer distances to the speaker.

To capture the sound even better, multiple microphones can be used. The difference between the signals can also be utilized to precisely determine the location of the person speaking. This helps to filter out disturbing noise, but it is only possible because of the nearly identical performance characteristics of MEMS microphones with the same part number.
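How the difference between two microphone signals translates into a location can be illustrated with a minimal sketch. It assumes a far-field sound source, two matched microphones, and a plain cross-correlation search for the time delay; the function names, the 5 cm spacing, and the 48 kHz sample rate are illustrative assumptions, not part of any Infineon or Sugr software.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def estimate_delay_samples(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result means the sound reached `other` later than `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += sample * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def angle_of_arrival_deg(delay_samples, sample_rate_hz, mic_spacing_m):
    """Convert a delay into a far-field arrival angle relative to broadside."""
    path_difference_m = delay_samples / sample_rate_hz * SPEED_OF_SOUND_M_S
    # Clamp to [-1, 1] so measurement noise cannot push asin() out of range.
    ratio = max(-1.0, min(1.0, path_difference_m / mic_spacing_m))
    return math.degrees(math.asin(ratio))


# Synthetic example: the same short pulse, arriving 3 samples later at mic 2.
mic1 = [0.0] * 10 + [1.0, 2.0, 1.0] + [0.0] * 10
mic2 = [0.0] * 13 + [1.0, 2.0, 1.0] + [0.0] * 7

# Hypothetical setup: 5 cm spacing at 48 kHz limits the delay to about 7 samples.
delay = estimate_delay_samples(mic1, mic2, max_lag=7)
angle = angle_of_arrival_deg(delay, sample_rate_hz=48_000, mic_spacing_m=0.05)
print(delay, round(angle, 1))
```

The sketch only works if both microphones respond identically to the same sound, which is why the matching between microphones matters: any mismatch in sensitivity or phase shows up as an error in the estimated delay.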

There is another technique that provides a similar but more reliable result while requiring fewer components for the design and still providing more data points: using a radar sensor. For instance, the XENSIV™ 60 GHz radar sensor (Figure 5) can provide precise presence detection within a configured distance without violating privacy, as no facial or personal data is collected. Instead, the sensor detects macro-movements on a full-body scale down to micro-movements on a sub-millimeter level for presence detection. This includes even breathing and heartbeat rates for health and vital-signs monitoring services.

Radar can even identify multiple people and their locations in a room and track their movements to identify who is talking and provide more levels of contextual information to the smart speaker to better understand what is happening.

With a feature like that, a smart speaker would be able to distinguish between a person speaking on TV/radio and a real person in the room. As an example, radar sensors can detect whether someone is in the room even if the person is not speaking. This enables new features to reduce energy consumption: The device remains in off mode if no person is present (see “The green potential of smart speakers,” below).

Gone are the days of large speakers, as smart speakers can produce a surprisingly loud output from a tiny form factor. This is becoming more important, as smart speakers are now increasingly applied on the go and even outdoors.

This requires a high-quality audio chip like the Infineon MERUS™ Class D audio amplifier. The chip’s advanced design reduces power consumption and enables longer battery life compared with competing solutions in portable designs. This also benefits wired designs, as less waste heat is generated. The multi-stage, compact design reduces the number of filters and external components. As a result, the Class D amplifier product family offers cooler, smaller, and lighter amplifiers to maximize power efficiency and dynamic range while delivering premium audio performance in product form factors for great-sounding audio products. In addition, the components allow for either extended battery playback time or a smaller battery with no loss of playback time.

Smart speakers must have reliable wireless connectivity. The first and most important is Bluetooth for a short-range connection with smartphones to set up the smart speaker via an app and perhaps an additional sound system. The second is Wi-Fi for connecting to a home network and then to the internet.

The quality of this connection directly affects the user experience. Therefore, high-quality components are of utmost importance. Third, thanks to Bluetooth mesh, the smart speaker can serve as the center of the smart and connected home, controlling other IoT devices such as lighting and air conditioning. To enable all these functions, the Infineon AIROC™ Wi-Fi and Bluetooth combination comes with ultra-low power consumption in a single-chip solution. The solution also enables small-form-factor IoT designs. It provides connectivity for maximum interoperability and performance without dropouts anywhere in the home.

Because the speech-recognition software runs on servers, smart speakers need to be connected to the cloud. But as smart speakers process more and more sensitive personal data, they must be equipped with a high level of security to protect it, as any device connected to the cloud becomes a target for cyberattacks. Security becomes even more important as the speaker becomes the central hub for the smart home and as secondary speakers are added in other rooms of the house.

Infineon’s embedded OPTIGA™ Trust M security solutions provide a trust anchor for connecting IoT devices to the cloud, giving each IoT device its own identity and integrity to ensure a secure cloud connection throughout the device’s lifetime. This pre-personalized, turnkey solution provides secure, zero-touch onboarding and the high performance required for fast cloud access.

Despite sophisticated gesture and voice control, the occasional use of buttons or similar touch interfaces is still justified, such as for turning the speaker on and off and muting the microphone for privacy.

However, mechanical switches are a potential source of errors, so capacitive touch controls are a better and more elegant choice. They have also become more intuitive thanks to gesture control. In addition, human-machine interface (HMI) is an important way to stand out from the competition, using a combination of lighting, screen, touch, VUI, and gestures. The reliable, elegant, and durable CAPSENSE™ touch controller from Infineon with advanced capacitive touch sensors enables sleek, futuristic user interfaces. The controller also offers state-of-the-art noise immunity (SNR > 100:1) and is water-resistant.

Naturally, a switched-mode power supply (SMPS) is required to convert the AC mains voltage into the DC voltage needed for the electronics. Infineon offers highly efficient and power-dense SMPS solutions that can be used as high-efficiency chargers. In addition, it is advisable to provide electrostatic discharge (ESD) protection for the electronics. Infineon’s ESD protection devices are low-capacitance devices for best signal integrity and provide high-protection performance through extremely low clamping voltage.

Another useful feature is a USB output for charging devices such as smartphones or smartwatches. For example, the highly integrated EZ-PD™ USB-C controller supports all USB PD profiles, is USB-IF–certified, and has a market-proven USB PD stack that ensures specification compliance and interoperability. It is a highly integrated solution that minimizes additional BOM costs; in addition, the controller requires no firmware development.

The development of smart speakers is far from complete, and there are still many promising features that can further improve the user experience. Touch control could soon be enhanced by gesture control based on radar sensors or time-of-flight 3D imaging solutions. In addition, environmental sensing with CO2 sensors could expand indoor air-quality control. Furthermore, next-generation smart speakers will inevitably have more computing power, opening the possibility of controlling VUIs on the edge, meaning a standalone device that does not need to be connected to the cloud. This would enable the development of designs for mobile use, such as outdoors where Wi-Fi reception is sporadic or even nonexistent.

There are many more features waiting to be implemented, but for designing a smart speaker, developers need a strong partner and trusted supplier that understands the specifics of smart speaker applications, offers a rich portfolio of dedicated products optimized for smart speakers, and complements this with the system knowledge required to support design-in and accelerate time to market. Infineon has a comprehensive offering of all major components needed for smart speaker manufacturing, as well as an ecosystem of software partners to create a one-stop-shop experience. Infineon also offers other sensor and communication technologies that round out the functions of smart speakers, including CO2 sensors for monitoring indoor air quality, NFC solutions for device pairing, and UWB for data transmission and device localization. To enable the necessary IoT connectivity for smart speakers and thus prepare them as a hub for smart homes, Infineon also supports the Thread protocol and complies with the Matter standard.

Sugr, an Amazon Voice Service (AVS)-certified and Preferred Partner of Infineon, specializes in microphone array algorithms and voice IoT systems.

Sugr is based in Shenzhen, China, and provides several services, including acoustic design, algorithm/software development, and system integration. With the recent increase in the popularity of teleconferencing, Sugr’s new solution, VoiceZoom™, was created to enhance people’s conferencing experiences, regardless of the user’s location.

VoiceZoom™ has met the criteria of the Microsoft Teams and Tencent Meeting certifications.

Explore more design partners on www.infineon.com/partners

The green potential of smart speakers

When equipped with radar and acoustic sensors, smart speakers enable major energy savings.

Via presence and absence detection, the XENSIV™ BGT60TR13C radar sensor makes sure that the smart speaker is on only when needed — in other words, if a person is present or nearby. If not, the device is switched off or put in deep-sleep mode.
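The presence-based switching described above can be sketched as a small state machine. This is a hedged illustration only: the `PresenceGate` class, the 30-second hold time, and the boolean `presence_detected` input are assumptions made for the example and do not represent the radar sensor's actual driver interface.

```python
from enum import Enum


class PowerState(Enum):
    ACTIVE = "active"
    DEEP_SLEEP = "deep_sleep"


class PresenceGate:
    """Keeps the device active while presence is reported, plus a hold time."""

    def __init__(self, hold_time_s=30.0):
        # Hypothetical grace period so brief detection gaps do not
        # switch the device off while someone is still in the room.
        self.hold_time_s = hold_time_s
        self.state = PowerState.DEEP_SLEEP
        self._last_seen_s = None

    def update(self, now_s, presence_detected):
        """Feed one radar reading and return the resulting power state."""
        if presence_detected:
            self._last_seen_s = now_s
            self.state = PowerState.ACTIVE
        elif (self._last_seen_s is None
              or now_s - self._last_seen_s >= self.hold_time_s):
            self.state = PowerState.DEEP_SLEEP
        return self.state


gate = PresenceGate(hold_time_s=30.0)
for t, present in [(0, False), (1, True), (20, False), (31, False)]:
    print(t, gate.update(t, present).value)
```

The hold time is a design choice rather than a requirement: it trades a little extra on-time for fewer spurious transitions when a person sits still or steps out briefly.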

Thanks to their two modes, the built-in XENSIV™ MEMS microphones contribute to energy savings as well. In normal mode, the microphone has the best acoustic performance, allowing for clear audio input. In low-power mode, the microphone’s energy consumption is dramatically reduced, but it is still able to “listen.” Once an input is detected, the microphone switches to normal mode for optimal audio pickup. 

Smart speakers have more than four microphones and spend up to 90% of the time in low-power mode. Explore more on energy efficiency on www.infineon.com/green-energy
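The effect of spending most of the time in low-power mode can be estimated with a simple duty-cycle average. The current values in this sketch are placeholders chosen only to make the arithmetic visible; they are not datasheet figures, and the function name is hypothetical.

```python
def average_current_ua(normal_ua, low_power_ua, low_power_fraction, mic_count=1):
    """Time-weighted average supply current of a microphone array, in microamps."""
    if not 0.0 <= low_power_fraction <= 1.0:
        raise ValueError("low_power_fraction must be between 0 and 1")
    per_mic_ua = (low_power_fraction * low_power_ua
                  + (1.0 - low_power_fraction) * normal_ua)
    return mic_count * per_mic_ua


# Placeholder currents (NOT datasheet values): 1000 uA normal, 200 uA low power.
always_on_ua = average_current_ua(1000.0, 200.0, low_power_fraction=0.0, mic_count=4)
duty_cycled_ua = average_current_ua(1000.0, 200.0, low_power_fraction=0.9, mic_count=4)
saving = 1.0 - duty_cycled_ua / always_on_ua
print(f"average current drops by {saving:.0%} with these placeholder values")
```

The actual saving depends on the real mode currents of the chosen microphone and on how often the device is woken, so the calculation should be repeated with datasheet values before drawing conclusions.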

More to experience at www.infineon.com/smartspeaker

Thomas Kasper is Senior Manager, Application Marketing, at Infineon Technologies AG

Yanqin Li is System Application Engineer at Infineon Technologies AG
