Abstract: This study examines the temporal and spectral cues used in speech recognition, with the aim of informing improvements to speech-processing strategies in cochlear implant systems. A noise-excited vocoder was used in a series of experiments to determine the relative contributions of temporal and spectral cues to phoneme recognition and lexical tone recognition. Spectral information was controlled by varying the number of channels, and temporal information by varying the lowpass cutoff frequency of the envelope extractors. Normal-hearing adults participated in the perceptual tests. The results showed that both temporal and spectral cues are important for phoneme recognition in quiet and in noise. In quiet, consonant and vowel recognition reached plateau performance at 8 and 12 channels, respectively, and at lowpass cutoff frequencies of 16 Hz and 4 Hz, respectively. In noise, vowel recognition benefited from increased spectral resolution. For Mandarin Chinese tone recognition, asymptotic performance required a lowpass cutoff frequency of 256 Hz, much higher than that required for English phoneme recognition. Tone recognition had not yet reached a plateau at 12 channels, the highest number tested in this study. To assess the relative importance of fine structure and temporal envelope in lexical tone recognition, a separate experiment used the auditory chimera technique to exchange the temporal envelopes and fine structures of different tone signals. The perceptual results showed that tone recognition relies mainly on fine structure, as melody perception does, rather than on the temporal envelope, as English speech recognition does. Therefore, increasing the number of effective channels in cochlear implant systems should help speech recognition, especially in noise. Providing more fine-structure information in cochlear implant stimulation may improve lexical tone recognition.
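The vocoder manipulation described in the abstract can be illustrated with a minimal sketch. It is not the processing used in the study: the log-spaced band edges, the 150-5500 Hz analysis range, the brick-wall FFT filters, and the names `fft_filter` and `noise_vocoder` are all assumptions made for illustration. The two parameters of interest are `n_channels` (spectral information) and `env_cutoff` (temporal information).

```python
import numpy as np


def fft_filter(x, fs, lo, hi):
    """Brick-wall filter: zero every FFT bin outside [lo, hi) Hz."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spec[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spec, n=len(x))


def noise_vocoder(x, fs, n_channels=8, env_cutoff=16.0,
                  f_lo=150.0, f_hi=5500.0, seed=0):
    """Noise-excited vocoder sketch.

    n_channels controls spectral resolution; env_cutoff (Hz) controls how
    much temporal-envelope detail survives. f_lo, f_hi and the log spacing
    of the band edges are illustrative assumptions, not the study's values.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # assumed band edges
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        # 1. Analysis band of the input signal.
        band = fft_filter(x, fs, lo, hi)
        # 2. Envelope: full-wave rectification, then lowpass at env_cutoff.
        env = np.maximum(fft_filter(np.abs(band), fs, 0.0, env_cutoff), 0.0)
        # 3. Noise carrier limited to the same band.
        carrier = fft_filter(rng.standard_normal(len(x)), fs, lo, hi)
        # 4. Modulate the carrier, re-filter to the band, and sum.
        out += fft_filter(env * carrier, fs, lo, hi)
    return out


if __name__ == "__main__":
    fs = 16000
    t = np.arange(fs) / fs
    # Synthetic test signal: a 1 kHz tone with 4 Hz amplitude modulation.
    signal = (1.0 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
    vocoded = noise_vocoder(signal, fs, n_channels=12, env_cutoff=256.0)
    print(vocoded.shape)
```

Lowering `n_channels` coarsens the spectral detail, while lowering `env_cutoff` smooths the envelope in each band, which is how the two kinds of cue can be varied independently.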
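Basic information:
CLC classification number: R764
Citation:
[1] 徐立 (Xu L). Temporal and spectral information in speech recognition [J]. 中华耳科学杂志 (Chinese Journal of Otology), 2006(04): 335-342.
Funding:
U.S. NIH (F32-DC00470, R01-DC03808, R03-DC006161); Ohio University research funds.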
2006-12-16