The past few decades have ushered in advances in directional hearing aid technologies, which have provided incremental improvements when listening to speech in noisy situations. Signia’s RealTime Conversation Enhancement using multi-stream architecture was developed to further extend the directional benefit to dynamic communication situations with multiple conversation partners.

By Petri Korhonen, MSc, and Christopher Slugocki, PhD

Although assistive hearing technologies have demonstrated speech-in-noise (SiN) benefits, listening in noise still poses a challenge for listeners with hearing loss. Our listening world is rarely static. The sound scenes around the listener can include multiple communication partners moving and speaking from different directions. Simultaneously, noise may originate from many directions. Hearing aid technologies that track the sound environments and adapt their processing to facilitate improved communication in such acoustically complex listening situations could help resolve this issue. 

Comparing Unilateral & Bilateral Beamforming 

Directional microphones improve speech-in-noise performance. Two-microphone arrays, also known as unilateral beamformers, combine signals from two omnidirectional microphones located within the body of one hearing aid, while bilateral two-microphone arrays, also known as bilateral beamformers, combine signals from both sides of the head, effectively using a total of four microphones. Bilateral beamformers provide an even narrower region of focus than is possible with a dual-microphone directional system, further helping hearing aid wearers follow speech in acoustically challenging environments. However, this narrow focus can also become a disadvantage.
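The difference in spatial selectivity can be illustrated with a toy first-order directional pattern. The sketch below is a simplified textbook model, not the processing in any particular hearing aid; the `alpha` mixing parameter and the idea of approximating a narrower bilateral beam as the product of two single-sided patterns are illustrative assumptions only.

```python
import math

def first_order_pattern(theta_deg, alpha=0.5):
    """Polar response of a first-order differential microphone.
    alpha = 0.5 gives a cardioid: unity gain at 0 degrees, a null at 180."""
    return alpha + (1 - alpha) * math.cos(math.radians(theta_deg))

def bilateral_pattern(theta_deg, alpha=0.5):
    """Crude stand-in for a bilateral beamformer: multiplying the left-
    and right-side patterns narrows the region of focus."""
    return first_order_pattern(theta_deg, alpha) ** 2

# A talker 60 degrees off-axis is attenuated more by the narrower beam:
unilateral = first_order_pattern(60)  # ≈ 0.75
bilateral = bilateral_pattern(60)     # ≈ 0.5625
```

The narrower pattern passes less energy from off-axis directions, which is exactly why a talker outside the beam risks becoming inaudible.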

Real-life communication is not limited to one-on-one conversations with talkers facing each other. Instead, it is not uncommon to have several communication partners speaking from different directions. For a bilateral beamformer with a very narrow beam, speech from talkers outside the beam may not be audible.

Furthermore, bilateral beamformers adapt slowly to changes in the sound scene to avoid jarring changes in the sound, and are therefore best suited to situations where the background noise is stable. To avoid this potential loss of audibility, the hearing aid must be able to adapt its directional pattern rapidly to follow moving talkers.

Adding Split Processing Tech to Improve SNR

Such a need was addressed with the introduction of Signia’s RealTime Conversation Enhancement (RTCE) technology, implemented using multi-stream architecture (MSA)1, a new way of analyzing and processing group conversations. RTCE includes split processing technology that applies different gain and compression settings to signals from the front vs. the sides/back.2 Additionally, three independent focus streams are created using advanced binaural beamforming and added to the split-processing stream in the front hemifield to capture speech from several talkers in multiple directions.1

The goal is to make group conversations in noise easier and more comfortable by improving the signal-to-noise ratio (SNR), including when the listener is not directly facing the conversation partners. The algorithm first estimates the locations of the sound sources to decide whether they are nearby talkers or diffuse, more distant sounds that are not part of the conversation.

Further reading: Multi-Stream Architecture for Improved Conversation Performance

The algorithm then chooses the best directionality by steering and mixing multiple focus streams based on the results of the acoustic analysis. This yields improved listening and communication beyond that provided by split processing alone. Enabled by fast, high-bandwidth ear-to-ear (e2e) technology, the acoustic analysis, the positions of the focus streams, and the level of enhancement are updated 1,000 times per second in real time, reacting immediately to any changes in the conversation to provide optimal processing.
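As a rough illustration of the steering-and-mixing idea, the sketch below weights several fixed focus streams by an estimated per-stream speech probability and renormalizes the weights on each update cycle. Every name and the weighting rule here are assumptions for illustration; the actual RTCE analysis is proprietary and far more sophisticated.

```python
def mix_focus_streams(streams, speech_probs):
    """Mix per-direction focus streams (lists of samples) using
    normalized speech-probability weights, recomputed on each update.
    A toy illustration only; not Signia's actual algorithm."""
    total = sum(speech_probs)
    if total == 0:
        # No detected talkers: fall back to an equal mix of all streams.
        weights = [1.0 / len(streams)] * len(streams)
    else:
        weights = [p / total for p in speech_probs]
    n = len(streams[0])
    return [sum(w * s[i] for w, s in zip(weights, streams)) for i in range(n)]

# Two candidate directions; the currently active talker dominates the mix:
front = [1.0, 1.0, 1.0]
side = [0.2, 0.2, 0.2]
mixed = mix_focus_streams([front, side], speech_probs=[0.9, 0.1])
```

Because the weights are recomputed on every cycle, the mix can shift toward a new talker as soon as the analysis detects speech from a different direction.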

This study compared the performance of RTCE to a non-RTCE split processing technology when listening to two conversation partners. Because communication challenges extend to dimensions not captured by speech recognition measures alone, such as listening being effortful, tiring, and stressful,3 we measured subjective listening effort and listeners’ willingness to stay in noise in addition to the speech reception threshold (SRT).

Methods of the RTCE Tech Study

Eighteen adults (mean age = 72.8 yrs, SD ±10.4; 9 females) with symmetrical (within ±15 dB from 250 Hz to 4 kHz, except two participants with ±25 dB at one frequency) sensorineural hearing loss participated. Their four-frequency pure-tone average across the two ears was 49.2 dB HL (SD = 10.0). Thirteen listeners had >9 years of hearing aid experience, while one listener had <1 year of hearing aid experience.

We adapted the Repeat and Recall Test (RRT)4 to a scenario where two conversation partners take turns talking in noise. We assessed the impact of RTCE on speech recognition performance, listening effort, and tolerable listening time. The listening effort rating quantifies the need to allocate cognitive resources, such as attention, to understand speech, and indirectly reflects communication difficulty.

Tolerable listening time refers to the duration that the listener would be willing to stay engaged listening to the talker under the specific test condition. Tolerable time has been shown to correlate with noise tolerance5, and with hearing aid satisfaction in real-life noisy situations.6 The inclusion of judgments of listening effort and tolerable time in the evaluation paints a fuller picture of the listener’s motivation to continue communication in noisy situations.

Testing was conducted using syntactically valid but semantically meaningless RRT sentences (e.g., “Keep the ice fruit in the restaurant”).7 Each sentence contained three or four keywords used for scoring. Sentences alternated between two locations (0° and 330°) to approximate listening to two conversation partners taking turns speaking. A speech distractor signal (International Speech Test Signal, ISTS) was presented from 150°, 180°, and 210° at 75 dB SPL. Cafeteria noise was presented from 30°, 150°, 180°, and 210° at 65 dB SPL (Figure 1).

Figure 1: The test setup with two talkers speaking in background noise while taking turns.

Listeners were tested using bilateral Signia Pure Charge&Go IX hearing aids (HA) programmed with two settings: RTCE-OFF and RTCE-ON. The RTCE-OFF setting included the split processing technology, which treats sounds from the front and back using independent compression and noise reduction settings in addition to the adaptive unilateral beamformers. 

The RTCE-ON setting included RealTime Conversation Enhancement (RTCE) technology, which created multiple front-facing focus streams using advanced bilateral beams steered toward the speech directions. These focus streams were added to the split-processing front-hemifield focus stream. The HAs were coupled using occluding ear tips and fitted using Signia’s fitting formula (IXFit).

Procedure

In Phase No. 1, the SNR corresponding to 50% correct sentence recognition (the speech reception threshold, SRT-50) was estimated twice via an adaptive procedure using blocks of 20 sentences in the RTCE-ON condition. The two estimates were averaged.
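Adaptive SRT procedures typically follow a simple up-down rule: the SNR is lowered after a correct response and raised after an incorrect one, so the track converges on the 50%-correct point. The sketch below is a generic one-up/one-down staircase; the start SNR, step size, and threshold estimator are illustrative choices, not the exact rule used in this study.

```python
def staircase_srt50(respond, start_snr=0.0, step=2.0, n_trials=20):
    """Generic 1-up/1-down adaptive staircase converging on SRT-50.
    respond(snr) -> True if the sentence was repeated correctly.
    A sketch only; parameters are illustrative assumptions."""
    snr, track = start_snr, []
    for _ in range(n_trials):
        track.append(snr)
        snr += -step if respond(snr) else step  # down if correct, up if wrong
    # Estimate the threshold as the mean SNR over the second half of the track
    half = track[n_trials // 2:]
    return sum(half) / len(half)

# A deterministic "listener" who is correct whenever the SNR is >= -2 dB:
srt = staircase_srt50(lambda snr: snr >= -2)  # converges near -3 dB
```

With a real listener the responses are probabilistic, so clinical procedures average over reversals or run the track twice, as was done here.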

In Phase No. 2, each listener’s speech-in-noise performance was measured at the SNR corresponding to the SRT-50 estimated in Phase No. 1. Sentence recognition performance (% repeated correctly) was measured using three 20-sentence blocks from the RRT in a counterbalanced order. After each block of 20 sentences, participants rated how effortful they found the listening situation on a 10-point scale anchored at “minimal effort” (=1), “moderate effort” (=5), and “very effortful listening” (=10). The listeners also estimated how much time (in minutes) they would be willing to spend listening in that noise condition.

Study Results

Figure 2: Distribution of the SNRs corresponding to 50% performance (SRT-50) with RTCE-ON.

The SRT-50s measured with RTCE-ON ranged from -9.7 to 10.1 dB (mean = -1.4 dB, SD = 5.1 dB), with 12 out of 18 listeners having an SRT-50 <0 dB (Figure 2). It is noteworthy that SNRs below 0 dB can be characterized as “very noisy.”8 Thus, the observation that a majority of these participants with moderate-to-severe hearing loss required an SNR <0 dB to reach 50% correct supports the efficacy of RTCE-ON.

Speech Recognition Results

Figure 3. Speech recognition (in %) measured with RTCE-OFF and RTCE-ON on the RRT sentences. Error bars represent within-subject 95% confidence intervals (***p < 0.001).

Figure 3 shows that sentence recognition performance with RTCE-ON was 8.2% better than with RTCE-OFF (p<0.001). This suggests that listeners can follow multi-talker conversations in noise more accurately, for example, a conversation around the dinner table at a noisy restaurant.

Impact on Listening Effort

Figure 4. Subjective ratings of listening effort (on 10-point scale) measured with RTCE-OFF and RTCE-ON. Error bars represent within-subject 95% confidence intervals (*p < 0.05).

Figure 4 shows that participants rated listening effort with RTCE-ON significantly lower than with RTCE-OFF (p = 0.024). This lowered listening effort reflects a reduced need to recruit top-down processes to support understanding.9 It could speed up understanding and increase a listener’s willingness to engage in communication and social activities.3

Tolerable Listening Time

Figure 5. Tolerable listening time (in minutes) measured with RTCE-OFF and RTCE-ON. Error bars represent within-subject 95% confidence intervals (**p < 0.01).

A direct consequence of reduced listening effort may also be seen in the listeners’ willingness to spend more time in noise with RTCE-ON than with RTCE-OFF (28 mins vs. 22 mins) (Figure 5). The 6-minute increase was significant (p = 0.004) and represents a 27% increase over the RTCE-OFF condition. The longer tolerable time could postpone the desire to exit the communication situation, allowing fuller conversation engagement. It is worth stressing that, while the reported tolerable time was relatively short (28 mins), it was measured at a challenging SNR that most people would avoid in real life.10

Discussion

Signia’s new RealTime Conversation Enhancement (RTCE) technology with multi-stream architecture (MSA) improved the participants’ speech understanding, reduced their listening effort, and increased the time they were willing to participate in communication compared to a technology without the RTCE. 

It is important to note that the improvements offered by RTCE were on top of the benefits provided by split processing. When compared to single-stream-based adaptive directional microphones, split processing has been shown to improve SNR11, improve speech understanding2, increase noise tolerance12, enhance contrasts between sounds in the wearer’s soundscape as indexed by MMN13, and reduce listening effort as measured via alpha EEG activity.14

Compared to split processing, RTCE technology further broadens directional benefit to dynamic listening situations with more than one conversation partner. Furthermore, because RTCE continuously analyzes the sound scenes around the listener, it adjusts the directions of interest when the listeners change their positions. This may allow listeners to move more freely while ensuring audibility.

The new RTCE technology further extends the speech-in-noise benefit of split-processing to dynamic multi-talker communication situations. The current study suggests that RTCE makes it easier for listeners to understand and contribute to the conversation. Moreover, the results may have implications for how hearing care professionals evaluate a patient’s hearing needs, including possibly adding listening effort and noise acceptance measures to their clinical routine when dispensing hearing aids.

About the Authors: Petri Korhonen, MSc, is a senior research scientist and Christopher Slugocki, PhD, is a research scientist at the WS Audiology Office of Research in Clinical Amplification (ORCA) in Lisle, Ill.

Graphs courtesy of Signia.

Lead photo: Dreamstime

References

  1. Jensen NS, Samra B, Parsi HK, Bilert S, Taylor B. Multi-stream architecture for improved conversation performance. Hearing Review. 2023;30(10):20-23. 
  2. Jensen NS, Høydal EH, Branda E, Weber J. Augmenting speech recognition with a new split processing paradigm. Hearing Review. 2021;28(6):24-27. 
  3. Pichora-Fuller MK, Kramer SE, Eckert MA, et al. Hearing impairment and cognitive energy: The Framework for Understanding Effortful Listening (FUEL). Ear Hear. 2016;37 Suppl 1:5S-27S. 
  4. Slugocki C, Kuk F, Korhonen P. Development and clinical applications of the ORCA Repeat and Recall Test (RRT). Hearing Review. 2018;25(12):22-27. 
  5. Kuk F, Slugocki C, Korhonen P. Characteristics of the quick repeat-recall test (Q-RRT). Int J Audiol. 2023:1-9.
  6. Seper E, Kuk F, Korhonen P, Slugocki C. Tracking of noise tolerance to predict hearing aid satisfaction in loud noisy environments. J Am Acad Audiol. 2019;30(4):302-314. 
  7. Kuk F, Slugocki C, Korhonen P. Performance of older normal-hearing listeners on the tracking of noise tolerance (TNT) test. Int J Audiol. 2023:1-8.
  8. Wu YH, Stangl E, Chipara O, Hasan SS, Welhaven A, Oleson J. Characteristics of real-world signal to noise ratios and speech listening situations of older adults with mild to moderate hearing loss. Ear Hear. 2018;39(2):293-304.
  9. Edwards B. A Model of auditory-cognitive processing and relevance to clinical applicability. Ear Hear. 2016;37 Suppl 1:85S-91S.
  10. Smeds K, Wolters F, Rung M. Estimation of signal-to-noise ratios in realistic sound scenarios. J Am Acad Audiol. 2015;26(2):183-96. 
  11. Korhonen P, Slugocki C, Taylor B. Study reveals Signia split processing significantly improves the signal-to-noise ratio. AudiologyOnline. 2022;Article 28224.
  12. Kuk F, Slugocki C, Davis-Ruperto N, Korhonen P. Measuring the effect of adaptive directionality and split processing on noise acceptance at multiple input levels. Int J Audiol. 2023;62(1):21-29. 
  13. Slugocki C, Kuk F, Korhonen P, Ruperto N. Using the mismatch negativity (MMN) to evaluate split processing in hearing aids. Hearing Review. 2021;28(10):20-23. 
  14. Slugocki C, Korhonen P. Split-processing in hearing aids reduces neural signatures of speech-in-noise listening effort. Hearing Review. 2022;29(4):20, 22-23.

Original citation for this article: Korhonen P, Slugocki C. Augmenting Split Processing with a Multi-Stream Architecture Algorithm. Hearing Review. 2024;31(5):20-25.