AppTek at IEEE ICASSP 2021

May 20, 2021

The AppTek Science Team, including Wei Zhou and Simon Berger, will present its latest paper, "Phoneme Based Neural Transducer for Large Vocabulary Speech Recognition," at the IEEE ICASSP 2021 virtual conference, held the week of June 6th-11th.

Abstract: To join the advantages of classical and end-to-end approaches for speech recognition, we present a simple, novel and competitive approach for phoneme-based neural transducer modeling. Different alignment label topologies are compared and word-end-based phoneme label augmentation is proposed to improve performance. Utilizing the local dependency of phonemes, we adopt a simplified neural network structure and a straightforward integration with the external word-level language model to preserve the consistency of seq-to-seq modeling. We also present a simple, stable and efficient training procedure using frame-wise cross-entropy loss. A phonetic context size of one is shown to be sufficient for the best performance. A simplified scheduled sampling approach is applied for further improvement and different decoding approaches are briefly compared. The overall performance of our best model is comparable to state-of-the-art (SOTA) results for the TED-LIUM Release 2 and Switchboard corpora.

Register for the conference at https://2021.ieeeicassp.org/. The presentation takes place on Tuesday, June 8th at 13:00 GMT in "Session SPE-1: Speech Recognition 1: Neural Transducer Models 1."

ABOUT APPTEK.ai

AppTek.ai is a global leader in artificial intelligence (AI) and machine learning (ML) technologies for automatic speech recognition (ASR), neural machine translation (NMT), natural language processing/understanding (NLP/U), large language models (LLMs) and text-to-speech (TTS). The AppTek platform delivers industry-leading solutions for organizations across a breadth of global markets such as media and entertainment, call centers, government, enterprise business, and more. Built by scientists and research engineers who are recognized among the best in the world, AppTek’s solutions cover a wide array of languages, dialects, channels, domains and demographics.
