At the Interspeech 2021 conference today, the science team will present work on a variety of subjects, including methods to improve language model integration in attention-based encoder-decoder ASR models.
M. Zeineldeen, A. Glushko, W. Michel, A. Zeyer, R. Schlüter, H. Ney:
"Investigating Methods to Improve Language Model Integration for Attention-based Encoder-Decoder ASR Models"
https://arxiv.org/abs/2104.05544
Other papers presented today contribute to ASR research by comparing segmental and neural transducer modeling architectures and by proposing a novel, fully acoustic-oriented subword modeling approach that combines the advantages of several methods into a single pipeline. The latter yields better word segmentation and a more balanced sequence length, both of which matter particularly for streaming ASR output, as used in live captioning scenarios.
W. Zhou, A. Zeyer, R. Schlüter, H. Ney:
"Equivalence of Segmental and Neural Transducer Modeling: A Proof of Concept"
https://arxiv.org/abs/2104.06104
W. Zhou, M. Zeineldeen, Z. Zheng, R. Schlüter, H. Ney:
"Acoustic Data-Driven Subword Modeling for End-to-End Speech Recognition"
https://arxiv.org/abs/2104.09106
AppTek.ai is a global leader in artificial intelligence (AI) and machine learning (ML) technologies for automatic speech recognition (ASR), neural machine translation (NMT), natural language processing/understanding (NLP/U), large language models (LLMs) and text-to-speech (TTS). The AppTek platform delivers industry-leading solutions for organizations across a breadth of global markets such as media and entertainment, call centers, government, enterprise business, and more. Built by scientists and research engineers who are recognized among the best in the world, AppTek’s solutions cover a wide array of languages/dialects, channels, domains and demographics.