A Better Understanding of Speech Perception in Noise using New Measures of Word Frequency, Neighborhood Density, Phonotactic Probability and Orthographic Transparency

(Presented at the Nobel Day Festivities, Örebro University, December 9 2019)

Witte E, Köbler S, Smeds K, Ekeroot J, Mäki-Torkko E

School of Health Sciences, Faculty of Medicine and Health, Örebro University,
Swedish Institute for Disability Research, Örebro University

ORCA Europe,
Department of Surgical Sciences, Section of Otorhinolaryngology, Head and Neck Surgery, Uppsala University,
School of Medical Sciences, Faculty of Medicine and Health, Örebro University

Background

When assessing someone’s hearing, speech-in-noise testing is an important complement to other audiometric measures. Since the stimuli used in speech-in-noise tests are often real words, there are various lexical word metrics that could influence the test results, independently of the hearing ability of the subject.

Purpose: To investigate how the AFC-list [1] measures of word frequency, neighborhood density, phonotactic probability and orthographic transparency affected the results of a new Swedish phonetic perception test.

Figure 1. Normalized distributions of four AFC-list word metrics based on 7889 monosyllabic Swedish words.

The Zipf-scale word frequency measure ranges from ≈ 0 for very unusual words to ≈ 7 for very common words.
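
As a rough illustration of the scale, the sketch below uses the common definition of the Zipf scale as the base-10 logarithm of a word's frequency per billion words. The function name `zipf_value` and the example counts are hypothetical, and any smoothing that the AFC-list may apply to rare or unseen words is not shown here.

```python
import math


def zipf_value(count: int, corpus_size: int) -> float:
    """Return log10 of the word's frequency per billion words.

    Assumption: plain relative frequency without smoothing. The AFC-list
    may compute its Zipf-scale values differently in detail.
    """
    if count <= 0 or corpus_size <= 0:
        raise ValueError("count and corpus_size must be positive")
    # log10(count / corpus_size * 1e9), written as a sum of logarithms
    return math.log10(count) - math.log10(corpus_size) + 9.0


# Hypothetical counts in a corpus of one million word tokens:
rare = zipf_value(1, 1_000_000)         # once per million words
common = zipf_value(10_000, 1_000_000)  # one percent of all tokens
```

On this scale, each step of one unit corresponds to a tenfold change in how often the word occurs.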

The normalized syllable-structure based phonotactic probability (SSPP) metric ranges from 0 to 1 for each phoneme of a word, and expresses the likelihood of occurrence of that phoneme given its phonological context. The min SSPP equals the lowest SSPP value of the word.

The Zipf-scale weighted phonological neighborhood density probability (PNDP) measure ranges from 0 to 1 and indicates the likelihood of occurrence of a specific word given its phonological neighbors, defined as words at an edit distance of one.
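
The neighbor definition above can be sketched in a few lines. The edit-distance-one check follows the text. The weighting, however, is an assumption: the poster does not give the PNDP formula, so the ratio below (the word's Zipf value divided by the summed Zipf values of the word and its neighbors) is only one plausible reading. The toy lexicon, its transcriptions and its Zipf values are hypothetical.

```python
def is_neighbor(a: tuple, b: tuple) -> bool:
    """True if phoneme sequences a and b are at an edit distance of exactly one."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one substituted phoneme
        return sum(x != y for x, y in zip(a, b)) == 1
    shorter, longer = (a, b) if len(a) < len(b) else (b, a)
    # Lengths differ by one: deleting one phoneme from the longer must give the shorter
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def pndp(word: tuple, lexicon: dict) -> float:
    """Assumed formula (not from the poster): Zipf(word) / (Zipf(word) + sum of neighbor Zipf values)."""
    own = lexicon[word]
    neighbors = sum(z for w, z in lexicon.items() if is_neighbor(word, w))
    return own / (own + neighbors)


# Hypothetical toy lexicon: phoneme tuples mapped to made-up Zipf-scale values
TOY_LEXICON = {
    ("b", "i:", "l"): 5.0,
    ("p", "i:", "l"): 4.0,  # substitution neighbor of the first entry
    ("b", "i:"): 3.0,       # deletion neighbor of the first entry
    ("s", "u:", "l"): 5.0,  # no neighbors in this toy lexicon
}
```

Under this assumed formula, a word without neighbors gets a value of 1, and the value decreases as the word acquires more, or more frequent, neighbors.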

The grapheme-initial letter-to-pronunciation orthographic transparency (GIL2P-OT) metric ranges from 0 to 1 for each grapheme-phoneme correspondence in a word. It indicates how easily a grapheme can be decoded into a phonological unit (i.e., when reading). The min GIL2P-OT equals the lowest GIL2P-OT value of the word.

Methods

Figure 2. An example of response alternatives in the 3AFC test as presented to the participants on a touch screen.
Figure 3. Normalized distributions of the same word metrics as in Figure 1, here based on the 84 test words used in the 3AFC test.

Sixty-six Swedish-speaking adults with normal hearing or symmetric sensorineural hearing loss were each presented with 84 separate three-alternative forced-choice (3AFC) auditory word discrimination trials. The signal-to-noise ratio (SNR) was individually adjusted to yield approximately 60 percent correct discrimination. The response alternatives in each trial were real Swedish words that differed from each other in only one phoneme. The background noises consisted of sounds from an urban outdoor environment and were carefully controlled to match the acoustic content of each set of contrasting phonemes. The data were analyzed using multi-level logistic regression modelling, with Zipf-scale value, PNDP, min SSPP and min GIL2P-OT as main effects and participant as a random effect.
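
The regression model described above could be written as follows. This is a sketch under assumptions: the poster only states that participant was a random effect, so a random intercept per participant is assumed, and all symbol names are chosen here for illustration. The index i denotes the test word and j the participant.

```latex
\operatorname{logit} P(y_{ij} = 1)
  = \beta_0
  + \beta_1 \,\mathrm{Zipf}_i
  + \beta_2 \,\mathrm{PNDP}_i
  + \beta_3 \,\mathrm{minSSPP}_i
  + \beta_4 \,\mathrm{minGIL2POT}_i
  + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```

Here y_ij = 1 denotes a correct response. In a model of this form, the odds ratio of a predictor is exp(β_k), the multiplicative change in the odds of a correct response per unit increase in that predictor.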

Figure 4. Distribution of age and pure-tone hearing thresholds among the 66 participants included in the current study. Best-ear average pure-tone thresholds (in dB HL) are presented for the frequencies 0.5, 1, 2 and 4 kHz (BPTA-4), 0.5, 1 and 2 kHz (BPTA-LF), and 3, 4 and 6 kHz (BPTA-HF).
Figure 5. Summary of the 66 3AFC test sessions. Red dots indicate the SNR (in dB) used for testing each participant. Green dots indicate the number of correct test trials (out of 84 possible). Blue dots indicate the best-ear pure-tone average hearing thresholds for the frequencies 0.5, 1, 2 and 4 kHz (in dB HL). Participants are ordered from left to right by ascending average test result.

Results

Figure 6. Estimated marginal effects representing the change in probability (P) of a correct 3AFC trial response, due to changes in Zipf-scale value (plot a), PNDP (plot b), Min SSPP (plot c) and Min GIL2P-OT (plot d) of the test word, when all other estimators are held at fixed values.

The results indicate that all lexical factors had statistically significant influences upon the probability of correct trial outcome. High Zipf-scale values, as well as high values of PNDP and GIL2P-OT, showed facilitative effects, while high SSPP, contrary to expectations from the scientific literature, showed an inhibitory effect on word discrimination. The largest effect could be attributed to the Zipf-scale value, which alone could affect the test results by up to 40 percentage points.

Table 1. Estimated odds ratios and 95 % confidence intervals (CI).
Predictor           Odds ratio   CI lower   CI upper
(Intercept)         2.71         2.29       3.22
Zipf-scale value    1.68         1.57       1.80
PNDP                1.91         1.67       2.20
Min SSPP            0.66         0.57       0.75
Min GIL2P-OT        1.94         1.70       2.23
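
To show how the odds ratios in Table 1 translate into probabilities, the following sketch multiplies the intercept odds by the odds ratio of each changed predictor and converts the result to a probability. It is arithmetic only and rests on assumptions: the poster does not state how the predictors were scaled or centered, so "one unit" and the reference point at the intercept are taken at face value, and the participant random effect is set to zero. The names `predicted_probability` and `ODDS_RATIOS` are hypothetical.

```python
INTERCEPT_ODDS = 2.71  # odds of a correct response at the reference point (Table 1)

ODDS_RATIOS = {        # point estimates from Table 1
    "zipf": 1.68,
    "pndp": 1.91,
    "min_sspp": 0.66,
    "min_gil2p_ot": 1.94,
}


def odds_to_probability(odds: float) -> float:
    """Convert odds to a probability: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)


def predicted_probability(changes: dict) -> float:
    """Probability of a correct trial after shifting predictors by the given unit changes.

    Assumption: random effect of the participant is zero and predictor
    units are those of the fitted model, which the poster does not specify.
    """
    odds = INTERCEPT_ODDS
    for name, delta in changes.items():
        odds *= ODDS_RATIOS[name] ** delta
    return odds_to_probability(odds)


baseline = predicted_probability({})                      # about 0.73
higher_zipf = predicted_probability({"zipf": 1.0})        # about 0.82
higher_sspp = predicted_probability({"min_sspp": 1.0})    # about 0.64
```

Because the model is linear on the log-odds scale, the same odds ratio produces different changes in probability depending on the starting point.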

Conclusions

Lexical word-metric values may affect the outcome of speech audiometry tests independently of the hearing ability of the listener. Therefore, it is important that such factors are controlled for when interpreting the results from speech audiometry tests, as well as when constructing new speech tests.

References

  1. Witte, E., & Köbler, S. (2019). Linguistic Materials and Metrics for the Creation of Well-Controlled Swedish Speech Perception Tests. Journal of Speech, Language, and Hearing Research, 62(7), 2280-2294. doi:10.1044/2019_JSLHR-S-18-0454
