[1] LI Yan-ping, TANG Zhen-min, DING Hui, et al. Adaptive Frequency Transform for Speaker Identification[J]. Journal of Nanjing University of Science and Technology (Natural Science Edition), 2010, (02): 182-186.

Adaptive Frequency Transform for Speaker Identification

Journal of Nanjing University of Science and Technology (Natural Science Edition) [ISSN:1005-9830/CN:32-1397/N]

Volume:
Issue:
2010, No. 02
Pages:
182-186
Column:
Publication date:
2010-04-30

Article Info

Title:
Adaptive Frequency Transform for Speaker Identification
Author(s):
LI Yan-ping (李燕萍)1,4; TANG Zhen-min (唐振民)1; DING Hui (丁辉)1,2; ZHANG Yan (张燕)1,3
1. School of Computer Science and Technology, Nanjing University of Science and Technology (NUST), Nanjing 210094, China; 2. School of Mathematics and Information Engineering, Jiaxing University, Jiaxing 314001, China; 3. School of Information Technology, Jinling Institute of Technology, Nanjing 210006, China; 4. College of Telecommunication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
Keywords:
speaker identification; adaptive frequency transform; discriminative feature; non-uniform sub-bands
CLC number:
TN912.34
Abstract:
This paper proposes an adaptive frequency transform suited to speaker identification. Because speaker information is distributed non-uniformly across frequency bands, Fisher's F-ratio is used to quantify how much each frequency sub-band contributes to discriminating speakers. An adaptive frequency filter is then designed that raises the frequency resolution in high-contribution bands and lowers it in low-contribution bands, and a discriminative feature, DFCC (discriminative frequency cepstral coefficient), is extracted. In clean speech, experiments on different test materials show that the recognition rate with DFCC is on average 1.45% higher than with the traditional MFCC (Mel frequency cepstral coefficient), indicating that the feature is stable and does not depend on the spoken content. In noisy environments at different SNR levels, the recognition rate is on average 6.37% higher, indicating that DFCC makes full use of the speaker information contained in the speech frequency bands and is robust to noise.
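The abstract names the ingredients of the method but not the filter design itself, so the following is only a minimal sketch of the two ideas it describes: scoring each sub-band with Fisher's F-ratio (variance of the per-speaker means divided by the average within-speaker variance), and placing filter centers more densely where that score is high. The function names `f_ratio_per_band` and `adaptive_center_frequencies`, the input layout, and the cumulative-score warping rule are all assumptions made for illustration, not the authors' published procedure.

```python
import numpy as np


def f_ratio_per_band(band_energies, speaker_ids):
    """Fisher's F-ratio for each sub-band (hypothetical helper).

    band_energies: array (n_frames, n_bands), e.g. log sub-band energies.
    speaker_ids:   array (n_frames,), one speaker label per frame.
    Returns an array (n_bands,): variance of the per-speaker means divided
    by the average within-speaker variance.
    """
    band_energies = np.asarray(band_energies, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    speakers = np.unique(speaker_ids)
    # Per-speaker mean and variance of every band.
    means = np.stack([band_energies[speaker_ids == s].mean(axis=0) for s in speakers])
    within = np.stack([band_energies[speaker_ids == s].var(axis=0) for s in speakers])
    between = means.var(axis=0)
    # Small constant avoids division by zero for constant bands.
    return between / (within.mean(axis=0) + 1e-12)


def adaptive_center_frequencies(f_ratio, band_edges_hz, n_filters):
    """Place filter centers so their density follows the F-ratio.

    Assumed rule (not taken from the paper): warp the frequency axis by the
    cumulative F-ratio, space the centers evenly on the warped axis, and map
    them back to Hz by linear interpolation.

    f_ratio:       array (n_bands,), non-negative scores.
    band_edges_hz: array (n_bands + 1,), increasing band edges in Hz.
    """
    f_ratio = np.asarray(f_ratio, dtype=float)
    band_edges_hz = np.asarray(band_edges_hz, dtype=float)
    # Cumulative contribution at each band edge, normalised to [0, 1].
    cum = np.concatenate([[0.0], np.cumsum(f_ratio)])
    cum /= cum[-1]
    # Evenly spaced interior points on the warped axis.
    targets = np.linspace(0.0, 1.0, n_filters + 2)[1:-1]
    return np.interp(targets, cum, band_edges_hz)


if __name__ == "__main__":
    # Toy data: band 0 separates the two speakers, band 1 does not.
    energies = np.array([[0.0, 0.0], [2.0, 2.0], [10.0, 0.0], [12.0, 2.0]])
    labels = np.array([0, 0, 1, 1])
    scores = f_ratio_per_band(energies, labels)
    print("F-ratio per band:", scores)
    print("centers (Hz):", adaptive_center_frequencies(scores + 1.0, [0, 1000, 2000], 3))
```

In this sketch a band with a higher F-ratio occupies a larger share of the warped axis and therefore receives more filter centers, which is one way to realise "higher resolution where the contribution is larger". Cepstral coefficients would then be computed from the resulting filter-bank outputs in the same way as for MFCC, a step not shown here.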

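Memo:
Funding: Natural Science Foundation of Zhejiang Province (Y1090649); Scientific Research Project of the Zhejiang Provincial Department of Education (Y200805349); Nanjing University of Posts and Telecommunications Start-up Research Fund for Introduced Talent (NY209004). About the authors: LI Yan-ping (1983- ), female, PhD candidate; research interests: speech signal processing, speaker recognition; E-mail: njustjsjlyp@163.com. Corresponding author: TANG Zhen-min (1961- ), male, professor, doctoral supervisor; research interests: pattern recognition and intelligent information processing; E-mail: tang-zm@mail.njust.edu.cn.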
Last Update: 2010-04-30