Multi-stream LSTM-HMM decoding and histogram equalization for noise robust keyword spotting
Authors:Martin Wöllmer  Erik Marchi  Stefano Squartini  Björn Schuller
Institution:1. Institute for Human-Machine Communication, Technische Universität München, Theresienstr. 90, 80333 München, Germany; 2. 3MediaLabs, A3LAB, DIBET, Dipartimento di Ingegneria Biomedica, Elettronica e Telecomunicazioni, Università Politecnica delle Marche, 60131 Ancona, Italy
Abstract:Highly spontaneous, conversational, and potentially emotional and noisy speech is known to be a challenge for today’s automatic speech recognition (ASR) systems, which highlights the need for advanced algorithms that improve speech features and models. Histogram equalization is an efficient method to reduce the mismatch between clean and noisy conditions by normalizing all moments of the probability distribution of the feature vector components. In this article, we propose to combine histogram equalization and multi-condition training for robust keyword detection in noisy speech. To better cope with conversational speaking styles, we show how contextual information can be effectively exploited in a multi-stream ASR framework that dynamically models context-sensitive phoneme estimates generated by a long short-term memory neural network. The proposed techniques are evaluated on the SEMAINE database, a corpus containing emotionally colored conversations with a cognitive system for “Sensitive Artificial Listening”.
Keywords:Long short-term memory  Neural networks  Histogram equalization  Keyword spotting  Cognitive agents
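
The abstract describes histogram equalization as normalizing the distribution of each feature vector component. The following is a minimal illustrative sketch of that general idea, not the authors' implementation: it assumes a standard normal reference distribution and a rank-based empirical CDF, neither of which is specified in the abstract, and the function names `histogram_equalize` and `equalize_features` are hypothetical.

```python
from statistics import NormalDist


def histogram_equalize(values):
    """Map one feature component onto a reference distribution.

    Each value is replaced by the reference quantile that matches its
    empirical CDF rank, so the output depends only on the ordering of
    the inputs, not on their scale or offset.
    """
    n = len(values)
    if n == 0:
        return []
    # Assumption: standard normal reference (not stated in the abstract).
    reference = NormalDist(mu=0.0, sigma=1.0)
    # Indices of the values in ascending order; ties are broken by position.
    order = sorted(range(n), key=lambda i: values[i])
    equalized = [0.0] * n
    for rank, idx in enumerate(order):
        # Mid-point CDF estimate keeps the probability strictly inside (0, 1).
        cdf = (rank + 0.5) / n
        equalized[idx] = reference.inv_cdf(cdf)
    return equalized


def equalize_features(frames):
    """Equalize each component (column) of a list of feature frames independently."""
    if not frames:
        return []
    columns = list(zip(*frames))
    equalized_columns = [histogram_equalize(list(col)) for col in columns]
    return [list(row) for row in zip(*equalized_columns)]


# Toy example: the "noisy" features are a scaled and shifted copy of the clean ones.
clean = [[0.1, 3.0], [0.5, 1.0], [-0.3, 2.0], [0.9, 5.0], [0.2, 4.0]]
noisy = [[2.0 * a + 5.0, 0.5 * b - 1.0] for a, b in clean]
out_clean = equalize_features(clean)
out_noisy = equalize_features(noisy)
```

In this toy case the distortion is a monotonically increasing transform of each component, so both feature sets map to the same equalized values. Real noise is not this well behaved, which may be one reason the abstract combines equalization with multi-condition training.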
This article is indexed in PubMed, SpringerLink, and other databases.