The Development of the SiB-test

A Swedish Test of Phonetic Perception in Noise, for Adult Persons with Hearing Loss

Witte E, Köbler S, Ekeroot J, Lundin E, Möller C

School of Health Sciences, Örebro University, Sweden; Audiological Research Centre, Örebro, Sweden

Swedish Institute for Disability Research, Örebro, Sweden; Linnaeus Centre HEAD, Linköping University, Sweden


Hearing ability is most commonly measured using pure tone audiometry. Pure tones, however, differ greatly from natural speech. To attain a valid measure of speech perception, a speech test in which real speech signals are presented against an ecologically valid auditory background would be preferable. In the Swedish language, several such speech audiometry tests exist (1, 2, 3), all measuring the ability to correctly identify whole words or sentences.

The Speech Banana in an audiogram showing a common hearing loss.

Current Swedish clinical speech audiometry tests use:

  • Phonemic balance (the same proportion of speech sounds as in the language)
  • Averaging of test results over all speech sounds

This means that they have a very limited ability to capture improvements in speech perception that affect only specific sub-groups of phonemes.

The SiB-test...

  • will count results separately for different speech sounds / groups of speech sounds
  • will contrast the speech sounds most easily confused / difficult to distinguish
  • will use optimization of phoneme levels for maximum reliability

Taken together, this means that the SiB-test will detect smaller changes in hearing ability than current Swedish speech tests.

Selecting contrasting test phonemes

Contrasting every speech sound with every other speech sound in the language would make a very inefficient and time-consuming hearing test. Instead, the speech sounds that are most similar, and consequently have the highest risk of being confused, were identified using a sound similarity calculation based on principles from automatic speech recognition (4). The two phonemes most similar to each phoneme were identified and grouped. Where possible, zero phonemes (absence of an initial or final consonant) were also added to the groups.

Sound recordings were made of 751 monosyllabic Swedish words, including several examples of each possible minimal phonetic contrast in the Swedish language, with the first author as speaker.

The wave form of the word "kaj".
The wave form of the word "katt".

Each recording was segmented phonetically, and contrasting phonemes were analyzed in the frequency domain, using overlapping triangular Bark filters (5).
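As an illustration of the frequency-domain analysis described above, the following is a minimal sketch of a bank of overlapping triangular filters spaced evenly on the Bark scale. It is not the authors' implementation. The Hz-to-Bark conversion uses the Zwicker and Terhardt approximation, and the function names, the choice of 24 filters and the half-overlap between neighbouring filters are assumptions made for this example only.

```python
import math

def hz_to_bark(f_hz):
    """Zwicker & Terhardt approximation of the Bark scale (assumed here)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

def bark_filterbank(bin_freqs_hz, n_filters=24):
    """Return n_filters rows of triangular weights, one weight per frequency bin.

    Filter centres are evenly spaced in Bark, and each triangle spans
    from the centre of its lower neighbour to the centre of its upper one.
    """
    bark = [hz_to_bark(f) for f in bin_freqs_hz]
    lowest, highest = min(bark), max(bark)
    step = (highest - lowest) / (n_filters + 1)
    bank = []
    for i in range(n_filters):
        lo = lowest + i * step
        hi = lo + 2.0 * step
        # Rising slope from lo to the centre, falling slope from the centre to hi.
        row = [max(0.0, min((z - lo) / step, (hi - z) / step)) for z in bark]
        bank.append(row)
    return bank

def bark_band_levels(power_spectrum, bin_freqs_hz, n_filters=24):
    """Band levels in dB for one time window, given its power spectrum."""
    bank = bark_filterbank(bin_freqs_hz, n_filters)
    return [
        10.0 * math.log10(sum(w * p for w, p in zip(row, power_spectrum)) + 1e-12)
        for row in bank
    ]
```

Applying `bark_band_levels` to the power spectrum of each time window in a phoneme segment gives one feature vector per window, which is the kind of representation a distance measure between phonemes can operate on.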

The time and frequency content of the speech sound [ʝː] in "kaj".
The time and frequency content of the speech sound [tː] in "katt".

Time-warped acoustic distances between all contrasting phonemes were calculated using dynamic time warping (6) of the Euclidean distance between time windows.

Euclidean distance between time windows within a restricted sub-space.
Attaining the best synchronized sound distance using locally and globally constrained dynamic time warping.
Target phoneme             a  o  y  ʊ  p  ʈ  d  ɡ  ŋ  s  ɕ  l  r  v  h  ...
Most similar phoneme       æ  ø  i  ɔ  ʈ  k  ɖ  d  m  f  ʂ  n  ɳ  l  p  ...
Next most similar phoneme  œ  ɑ  ʉ  ø̞  t  t  b  ɳ  ɡ  ʂ  f  ɳ  l  ɳ  t  ...
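
The time-warped distance and the selection of the two most similar phonemes can be sketched as follows. This is a simplified illustration, not the procedure actually used for the SiB-test: the local constraint is the basic three-way step pattern, the global constraint is a band around the diagonal in the spirit of Sakoe and Chiba (6), and the normalization by the combined sequence length is an assumption. The names `dtw_distance`, `most_similar` and `band` are hypothetical, and the feature vectors in the usage example are invented toy values.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors (one per time window)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq_a, seq_b, band=None):
    """Length-normalized DTW distance between two sequences of feature vectors.

    band limits how far the warping path may stray from the diagonal
    (global constraint); None means no limit.
    """
    n, m = len(seq_a), len(seq_b)
    if band is None:
        band = max(n, m)
    band = max(band, abs(n - m))  # the band must be wide enough to reach the end cell
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - band), min(m, i + band) + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            # Local constraint: step from the left, from below, or diagonally.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)

def most_similar(target, segments, k=2, band=None):
    """Rank all other phonemes by their time-warped distance to target."""
    others = [p for p in segments if p != target]
    others.sort(key=lambda p: dtw_distance(segments[target], segments[p], band))
    return others[:k]

# Toy feature vectors, invented for illustration only.
toy_segments = {
    "s": [[0, 0], [1, 1], [2, 2]],
    "f": [[0, 0], [1, 1], [2, 3]],
    "ʂ": [[0, 1], [1, 2], [3, 3]],
    "a": [[10, 10], [10, 10]],
}
print(most_similar("s", toy_segments))
```

With real data, each phoneme would typically be represented by several recorded tokens, so some averaging over tokens would be needed before ranking; that step is left out here.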

Selecting the best suited test words

Once groups of appropriate phoneme contrasts had been identified, the most suitable real Swedish words containing these phonetic contrasts needed to be found. Previous research has shown that several non-phonetic factors influence the speed and accuracy of single word perception. As we attempt to measure phonetic perception in as much isolation as possible, we have strived to minimize the effect of the following four factors by selecting the test word groups with the smallest possible variation in these factors.

Word Frequency

Word frequency is a measure of how common a word is in the language as a whole. This measure was extracted by counting how many times each word occurred in a large collection (more than 500 million words) of internet blog texts, available from Språkbanken at the University of Gothenburg. The raw word frequency data were transformed into Zipf-scale values (7) prior to analysis. Lexical access is facilitated by increased word frequency.
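
As a rough illustration of the Zipf transformation, the sketch below uses the basic definition of the Zipf scale: the base-10 logarithm of the frequency per million words, plus 3. The function name and the example counts are assumptions for illustration. The cited paper (7) also describes a smoothed variant that handles words absent from the corpus, which is not shown here, so `count` must be positive.

```python
import math

def zipf_value(count, corpus_size_tokens):
    """Zipf-scale value: log10 of the frequency per million words, plus 3."""
    per_million = count / (corpus_size_tokens / 1_000_000)
    return math.log10(per_million) + 3.0

# In a 500-million-word corpus, a word seen 500 times occurs once per million words.
print(zipf_value(500, 500_000_000))
```

On this scale a word occurring once per million words gets the value 3, and each additional unit corresponds to a tenfold increase in frequency.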

Neighborhood density

The neighborhood density of a specific word describes how many other words in the language are very similar to it, and how common those words are. The more similar words there are, and the more common those words are, the higher the neighborhood density. Lexical access is inhibited by increasing neighborhood density (8).
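
One common way to make this concrete, shown here only as a sketch, is to treat two words as neighbors when their phoneme sequences differ by a single substitution, insertion or deletion, and to sum the frequencies of all neighbors. Whether the SiB-test uses exactly this definition and weighting is not stated above, so both are assumptions. The transcriptions are simplified and the frequency values in the toy lexicon are invented.

```python
def is_neighbor(a, b):
    """True if phoneme tuples a and b differ by one substitution, insertion or deletion."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    shorter, longer = (a, b) if len(a) < len(b) else (b, a)
    # Removing exactly one phoneme from the longer word must give the shorter one.
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))

def neighborhood_density(word, lexicon):
    """Sum of the frequency values of all neighbors of word in the lexicon."""
    return sum(freq for other, freq in lexicon.items() if is_neighbor(word, other))

# Toy lexicon: phoneme tuple -> Zipf-scale frequency (values invented).
toy_lexicon = {
    ("k", "a", "t"): 5.0,            # katt
    ("k", "a", "j"): 4.0,            # kaj
    ("h", "a", "t"): 4.5,            # hatt
    ("a", "t"): 6.0,                 # att
    ("b", "l", "ɵ", "n", "d"): 3.5,  # blund
}
print(neighborhood_density(("k", "a", "t"), toy_lexicon))
```

Here "katt" has three neighbors (kaj, hatt, att), while "blund" has none in this small lexicon.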

Phonotactic probability - Stress and syllable structure based

Phonotactic probability expresses the probability of the speech sound combination in a specific word, given all phoneme combinations occurring in the language as a whole. High phonotactic probability has a facilitating effect (9).
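
For orientation, the sketch below computes a simple positional segment probability: how likely each phoneme is at its position in the word, averaged over the word. This is a deliberately simplified stand-in and not the stress and syllable structure based measure named in the heading, whose details are not given here. The function names, the frequency weights and the toy lexicon are assumptions.

```python
from collections import defaultdict

def positional_probabilities(lexicon):
    """Probability of each phoneme at each position, weighted by word frequency."""
    segment_weight = defaultdict(float)
    position_weight = defaultdict(float)
    for phonemes, weight in lexicon.items():
        for pos, ph in enumerate(phonemes):
            segment_weight[(pos, ph)] += weight
            position_weight[pos] += weight
    return {key: w / position_weight[key[0]] for key, w in segment_weight.items()}

def phonotactic_probability(word, probs):
    """Mean positional probability of the phonemes in word."""
    return sum(probs.get((pos, ph), 0.0) for pos, ph in enumerate(word)) / len(word)

# Toy lexicon: phoneme tuple -> frequency weight (values invented).
toy_words = {("k", "a", "t"): 1.0, ("k", "a", "j"): 1.0, ("h", "a", "t"): 2.0}
toy_probs = positional_probabilities(toy_words)
print(phonotactic_probability(("k", "a", "t"), toy_probs))
```

In this toy lexicon "katt" scores higher than "kaj" because a final /t/ is more common than a final /j/.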

Spelling regularity

The spelling regularity of a specific word is determined by the degree of spelling-to-pronunciation agreement within the word. Words with high spelling regularity are easier to identify than words with low spelling regularity (10, 11).

Three of the 28 test word groups selected for the SiB-test. Each group contains a set of minimally contrasting test phonemes, and has low intra-group variation in word frequency, neighborhood density, phonotactic probability and spelling regularity.
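
The idea of choosing word groups with low intra-group variation can be sketched as below. The actual selection criterion is not specified above, so the scoring used here (each factor standardized across all candidate words, then the within-group variances summed over the four factors) is an assumption, as are all names and the toy values.

```python
from statistics import mean, pstdev, pvariance

FACTORS = ("zipf", "density", "phonotactic", "spelling")

def standardise(words):
    """Convert each factor to z-scores over all candidate words."""
    z = {w: {} for w in words}
    for f in FACTORS:
        values = [words[w][f] for w in words]
        mu, sd = mean(values), pstdev(values)
        for w in words:
            z[w][f] = (words[w][f] - mu) / sd if sd > 0 else 0.0
    return z

def group_variation(group, z):
    """Sum over the four factors of the variance within the group."""
    return sum(pvariance([z[w][f] for w in group]) for f in FACTORS)

def best_group(candidates, words):
    """Pick the candidate group with the smallest intra-group variation."""
    z = standardise(words)
    return min(candidates, key=lambda g: group_variation(g, z))

# Toy factor values, invented for illustration only.
same = {"zipf": 4.0, "density": 10.0, "phonotactic": 0.5, "spelling": 0.9}
toy_factor_values = {
    "a1": dict(same), "a2": dict(same), "a3": dict(same),
    "b1": {"zipf": 2.0, "density": 5.0, "phonotactic": 0.2, "spelling": 0.4},
    "b2": {"zipf": 6.0, "density": 20.0, "phonotactic": 0.8, "spelling": 1.0},
    "b3": {"zipf": 4.0, "density": 12.0, "phonotactic": 0.5, "spelling": 0.7},
}
print(best_group([("b1", "b2", "b3"), ("a1", "a2", "a3")], toy_factor_values))
```

Standardizing first keeps a factor with a large numeric range, such as neighborhood density, from dominating the comparison.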

Test trial composition

The auditory stimulus

1. Test word recording (of "blund")

Five ecologically valid ‘Lombard speech’ recordings of each test word by one male and one female speaker (Cf. 12).

2. Stationary speech weighted noise

Has the same long-term spectrum as ICRA nr 8 (item 4 below).


3. International Speech Test Signal (ISTS)

Unintelligible natural speech, one speaker (13).

4. ICRA nr 8

Fluctuating masking noise with the same amplitude modulation and long-term spectrum as six speakers with raised vocal effort (14).

Complete mix, presented in one test trial.

The graphic interface

The test is constructed to be run in a typical audiology clinic. The test person sits in front of a touch screen, and the test sound is delivered via sound field speakers placed some distance away from the listener.

Each auditory test stimulus consists of speech signals mixed with background noise at specific signal-to-noise ratios (SNR). To attain high reliability, the test word is presented against a stationary background noise. To attain high ecological validity, the SNR is upheld at all other times by mixing the International Speech Test Signal (ISTS) with a standardized noise signal (ICRA nr 8). Optimal audibility in noise can also be attained by applying gain specifically to the test phoneme.
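
The two level manipulations described above, mixing at a given SNR and applying gain to the test phoneme only, can be sketched as follows. This is a minimal illustration on plain lists of samples, not the test software: levels are taken as RMS over the whole signal, the function names are hypothetical, and the default clamp of 6 dB mirrors the maximum phoneme gain adjustment mentioned in the validation steps.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def db_to_gain(db):
    """Convert a level difference in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that the speech-to-noise RMS ratio equals snr_db, then add."""
    scale = rms(speech) / (rms(noise) * db_to_gain(snr_db))
    return [s + scale * n for s, n in zip(speech, noise)]

def apply_phoneme_gain(speech, start, end, gain_db, limit_db=6.0):
    """Apply gain to samples start..end-1 only, clamped to +/- limit_db."""
    gain_db = max(-limit_db, min(limit_db, gain_db))
    g = db_to_gain(gain_db)
    return [s * g if start <= i < end else s for i, s in enumerate(speech)]

# Toy signals, invented for illustration only.
toy_speech = [1.0, -1.0] * 4
toy_noise = [0.5, -0.5] * 4
toy_mix = mix_at_snr(toy_speech, toy_noise, 6.0)
toy_boosted = apply_phoneme_gain(toy_speech, 2, 4, 12.0)  # request is clamped to +6 dB
```

In practice a gain change would be faded in and out over a short ramp to avoid audible clicks at the segment boundaries; the abrupt step here is kept only for brevity.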

Image of the graphic interface

Clinical validation

The SiB-test is undergoing clinical validation in a set of experiments on people with normal hearing and people with hearing loss. The validation process is outlined in the steps below.

  • Perceptual validation of the test word recordings without background noise, on subjects with normal hearing. Perception is validated at test phoneme gains of 0, -6 and +6 dB. The limits of +/- 6 dB have been chosen as the maximum allowed phoneme gain adjustment.
  • Validation of the SiB-test on subjects with mild to severe sensorineural hearing loss.
    • Correlations to pure tone thresholds
    • Correlations to other Swedish speech-in-noise tests (2, 3)
    • Three different test protocols, suitable for different purposes:
      • Fixed SNR – tests all phonemes, lower sensitivity
      • Adaptive group SNR – tests selected phonemes, high sensitivity
      • Adaptive cluster SNR – tests all phonemes within clusters of test word groups, medium sensitivity

Some areas of application

Communicative diagnosis

The results of the SiB-test will indicate which speech sounds a person hears well and which are problematic. Hence the test could be used as a tool in audiological rehabilitation to assess patients’ communicative strengths and weaknesses.

Hearing aid / cochlear implant benefit

The SiB-test can be used to evaluate the benefit of a hearing aid fitting or a cochlear implantation. As the test is more sensitive to small changes in phonetic perception than other speech tests, such results will be more reliable than those from existing Swedish speech audiometry tests.

Evaluation of specific signal processing

Modern hearing aids use many highly intricate signal processing algorithms aimed at improving speech perception, such as speech enhancement and frequency lowering (15). Hard proof of the effectiveness of such algorithms may be very difficult to attain. The SiB-test is constructed to be an efficient tool for evaluating the benefit of such algorithms for the patient.

Effectiveness of auditory training

Analytic auditory training is a feasible intervention in audiological rehabilitation. However, verifying its effectiveness has been a longstanding problem, in large part due to the lack of instruments sensitive enough to capture the type of improvements that can be expected (16). Here too, it is hoped that the SiB-test can shed some light and serve as a way to determine the effects of analytic auditory training.


References

  1. Grunditz M, Magnusson L. Validation of a speech-in-noise test used for verification of hearing aid fitting. Hearing, Balance and Communication. 2013;11(2):64-71.
  2. Magnusson L. Reliable clinical determination of speech recognition scores using Swedish PB words in speech-weighted noise. Scand Audiol. 1995;24(4):217-23.
  3. Hällgren M, Larsby B, Arlinger S. A Swedish version of the Hearing In Noise Test (HINT) for measurement of speech recognition. International Journal of Audiology. 2006;45(4):227-37.
  4. Jurafsky D, Martin JH. Speech and language processing: an introduction to natural language processing, computational linguistics and speech recognition. Upper Saddle River, N.J.: Pearson Education International/Prentice Hall; 2009.
  5. Zwicker E, Fastl H. Psychoacoustics: facts and models. Berlin: Springer; 1999.
  6. Sakoe H, Chiba S. Dynamic Programming Algorithm Optimization for Spoken Word Recognition. IEEE Trans Acoust. 1978;26(1):43-9.
  7. van Heuven WJ, Mandera P, Keuleers E, Brysbaert M. SUBTLEX-UK: a new and improved word frequency database for British English. Q J Exp Psychol (Hove). 2014;67(6):1176-90.
  8. Luce PA, Pisoni DB. Recognizing Spoken Words: The Neighborhood Activation Model. Ear Hear. 1998;19(1):1-36.
  9. Vitevitch MS, Luce PA. Probabilistic Phonotactics and Neighborhood Activation in Spoken Word Recognition. Journal of Memory and Language. 1999;40(3):374-408.
  10. Berndt RS, Reggia JA, Mitchum CC. Empirically derived probabilities for grapheme-to-phoneme correspondences in English. Behav Res Methods Instrum Comput. 1987;19(1):1-9.
  11. Dich N. Orthographic consistency affects spoken word recognition at different grain-sizes. J Psycholinguist Res. 2014;43(2):141-8.
  12. Van Summers W, Pisoni DB, Bernacki RH, Pedlow RI, Stokes MA. Effects of noise on speech production: Acoustic and perceptual analyses. The Journal of the Acoustical Society of America. 1988;84(3):917-28.
  13. Holube I, Fredelake S, Vlaming M, Kollmeier B. Development and analysis of an International Speech Test Signal (ISTS). Int J Audiol. 2010;49(12):891-903.
  14. Dreschler WA, Verschuure H, Ludvigsen C, Westermann S. ICRA noises: artificial noise signals with speech-like spectral and temporal properties for hearing instrument assessment. International Collegium for Rehabilitative Audiology. Audiology. 2001;40(3):148-57.
  15. Dillon H. Hearing aids. 2nd ed. Sydney: Boomerang Press; New York: Thieme; 2012.
  16. Henshaw H, Ferguson MA. Efficacy of individual computer-based auditory training for people with hearing loss: a systematic review of the evidence. PLoS One. 2013;8(5):e62836.

Related staff: Erik Witte, Jonas Ekeroot, Susanne Köbler, Elin Lundin, Claes Möller