
ePrints

Study of fusion strategies and exploiting the combination of MFCC and PNCC features for robust biometric speaker identification

Lookup NU author(s): Musab Al-Kaltakchi, Dr Wai Lok Woo, Professor Satnam Dlay, Professor Jonathon Chambers

Downloads

Full text for this publication is not currently held within this repository. Alternative links are provided below where available.


Abstract

In this paper, a new combination of features and normalization methods is investigated for robust biometric speaker identification. Mel Frequency Cepstral Coefficients (MFCC) are efficient for speaker identification in clean speech, while Power Normalized Cepstral Coefficients (PNCC) are robust in noisy environments. Combining the two feature sets therefore performs better than using either one individually. In addition, Cepstral Mean and Variance Normalization (CMVN) and Feature Warping (FW) are used to mitigate possible channel effects and handset mismatch in the voice measurements. Speaker modelling is based on a Gaussian Mixture Model (GMM) with a Universal Background Model (UBM). Coupled parameter learning between the speaker models and the UBM is utilized to improve performance. Finally, maximum, mean and weighted sum fusions of the model scores are used to enhance the Speaker Identification Accuracy (SIA). Evaluations conducted on the TIMIT database, with and without noise, confirm the performance improvement.
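
As a rough illustration of two steps named in the abstract, the sketch below shows per-utterance CMVN and the three score-level fusion rules (maximum, mean, weighted sum). It is not the authors' implementation. The function names, the dictionary layout of the scores, and the `weight` parameter are hypothetical, and it assumes the MFCC-based and PNCC-based models each return one log-likelihood score per enrolled speaker on comparable scales.

```python
from math import sqrt


def cmvn(frames):
    """Normalize each feature dimension to zero mean and unit variance
    over one utterance. `frames` is a list of equal-length feature vectors."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    # A constant dimension has zero variance; fall back to 1.0 to avoid dividing by zero.
    stds = [
        sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n) or 1.0
        for d in range(dims)
    ]
    return [[(f[d] - means[d]) / stds[d] for d in range(dims)] for f in frames]


def fuse_scores(mfcc_scores, pncc_scores, method="weighted", weight=0.5):
    """Fuse two per-speaker score dictionaries (hypothetical layout:
    speaker id -> log-likelihood). `weight` applies to the MFCC score
    and is an assumed parameter, not a value from the paper."""
    fused = {}
    for speaker, a in mfcc_scores.items():
        b = pncc_scores[speaker]
        if method == "max":
            fused[speaker] = max(a, b)
        elif method == "mean":
            fused[speaker] = (a + b) / 2.0
        elif method == "weighted":
            fused[speaker] = weight * a + (1.0 - weight) * b
        else:
            raise ValueError(f"unknown fusion method: {method}")
    return fused


def identify(fused_scores):
    """Return the speaker id with the highest fused score."""
    return max(fused_scores, key=fused_scores.get)


if __name__ == "__main__":
    mfcc = {"spk1": -10.0, "spk2": -12.0}
    pncc = {"spk1": -13.0, "spk2": -11.0}
    print(identify(fuse_scores(mfcc, pncc, method="max")))
    print(identify(fuse_scores(mfcc, pncc, method="weighted", weight=0.25)))
```

In this toy example the two feature streams disagree about the best speaker, so the outcome depends on the fusion rule and on the weight given to each stream. In practice the weight would be tuned on held-out data.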


Publication metadata

Author(s): Al-Kaltakchi MTS, Woo WL, Dlay SS, Chambers JA

Publication type: Conference Proceedings (inc. Abstract)

Publication status: Published

Conference Name: 2016 4th International Workshop on Biometrics and Forensics (IWBF)

Year of Conference: 2016

Online publication date: 11/04/2016

Acceptance date: 02/04/2016

Publisher: IEEE

URL: http://dx.doi.org/10.1109/IWBF.2016.7449685

DOI: 10.1109/IWBF.2016.7449685

