Large Language Models (LLMs) are the latest milestone in Natural Language Processing (NLP) and the topic that has dominated headlines and captured the attention of the scientific community worldwide since the launch of ChatGPT approximately a year and a half ago. Despite their tendency to hallucinate, LLMs have demonstrated impressive capabilities in a wide array of scenarios and tasks, including translation, making our imaginations run wild about the future of multilingual communication.
Production-grade translation technology has come a long way from simple bilingual dictionaries, the systematic management of concept-oriented words and phrases in term bases, or the painstaking collection of parallel translation data (phrases or sentences) in databases known as translation memories.
Machine translation as a discipline is not even a century old, yet it has already evolved through three significant milestones. Early systems relied on rule-based approaches in which linguistic rules were manually encoded into the system by computational linguists. Such methods struggled with the complexity and variability of the world’s natural languages and took a long time to develop. They never reached an acceptable level of fluency or offered much more than literal word-for-word translations, which limited their effectiveness to very narrow domains with controlled and restricted vocabularies.
The introduction of Statistical Machine Translation (SMT) marked a significant improvement by leveraging bilingual translation corpora to generate translations based on probabilistic models. Google Translate was launched in 2006 and quickly became a household name. The generation of people that grew up with it was the first to think of translation as a commodity produced not by humans but by machines. Yet SMT struggled with languages for which no significant amounts of translation data were available, as well as with languages with rich linguistic structure. It also proved to have significant limitations regarding figurative language, context and ambiguity, which restricted its use to non-creative, information-laden texts.
The advent of Neural Machine Translation (NMT) in 2015 represented a major leap forward. Artificial neural networks made it possible to model the machine translation process in an end-to-end manner. This significantly improved translation quality, especially for languages with high linguistic complexity, and resulted in texts of unforeseen fluency, even though accuracy was sometimes compromised. However, NMT required substantial amounts of parallel data as well as computational resources for training, specifically the more expensive but more efficient Graphics Processing Units (GPUs). One type of neural network architecture – the Transformer (Vaswani et al., 2017) – proved most effective for machine translation and many other NLP tasks. A few years later, it also became the basis for larger models such as GPT – the Generative Pre-trained Transformer behind ChatGPT.
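To make the end-to-end paradigm concrete, the sketch below runs a publicly available Transformer NMT model through the Hugging Face transformers library. The model name and example sentences are assumptions chosen for illustration only; they do not represent AppTek's production systems.

```python
# A minimal, illustrative sketch of end-to-end Transformer NMT using the
# Hugging Face "transformers" library and a publicly available OPUS-MT model.
# Model name and sentences are assumptions for illustration only.
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"  # public English-to-German model
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

sentences = [
    "Machine translation has come a long way.",
    "The weather is nice today.",
]

# Tokenize a batch and let the encoder-decoder model generate translations
# end-to-end, with no hand-written rules or explicit phrase tables.
batch = tokenizer(sentences, return_tensors="pt", padding=True)
outputs = model.generate(**batch)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```

Models of this kind are compact enough to run at interactive speed even without specialized hardware, a point we return to below.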
With GPT, a new era of Large Language Models began. With parameter counts several orders of magnitude higher than previously used, and trained on datasets comprising billions of words rather than the millions typical of NMT systems, LLMs are able to generate human-like text based on a long and freely formulated context provided to them, across a wide range of languages. As a result, they bring significant advantages to the translation task.
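For illustration, the sketch below shows how such a freely formulated context can be supplied to a general-purpose LLM when requesting a translation. It assumes the OpenAI Python client; the model name, prompt wording and example context are assumptions for illustration only.

```python
# A hedged sketch of context-aware translation with a general-purpose LLM.
# Assumes the OpenAI Python client (and an OPENAI_API_KEY in the environment);
# the model name and prompt wording are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

context = "Subtitle from a crime drama; the detective addresses the suspect informally."
source = "You have the right to remain silent."

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name
    messages=[
        {"role": "system",
         "content": "You are a professional English-to-German translator."},
        {"role": "user",
         "content": f"Context: {context}\n\n"
                    f"Translate into German, matching register and tone:\n{source}"},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```

Because the context is free-form text, information about domain, register or speaker can be passed along with the sentence to be translated, something a conventional NMT system cannot easily accept.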
Notwithstanding the above benefits, the use of LLMs does not come without challenges and ethical concerns, especially in sensitive contexts. As with any MT system, LLMs can inadvertently perpetuate biases present in their training data, which is harder to curate due to its sheer size. At the same time, the reliability and accuracy of their translations remain an open question due to the LLM tendency to hallucinate, which makes their use in high-stakes applications prohibitive without human oversight.
Aside from the above, questions have been raised regarding privacy and data security when using LLMs, both of which must be guaranteed if user trust is to be maintained. An additional drawback of using LLMs to translate large amounts of text is that they are resource-hungry and thus expensive to deploy. They require large GPU machines, and even then translations are generated rather slowly. In contrast, standard Transformer NMT models can be efficiently deployed even on CPU-only machines and still deliver translations at a comfortable speed of several sentences per second.
Though LLMs have not yet replaced the specialized enterprise-grade NMT models traditionally used in the translation workflows of language service providers, there are specific tasks for which they are particularly suitable. As such, they are steadily making headway in claiming their place in machine translation pipelines. At AppTek, we leverage LLMs primarily to improve the training process of state-of-the-art Transformer NMT models and MT quality estimation models. This can be done in a variety of ways.
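One direction reported in the literature is using an LLM as a reference-free judge of translation quality (Kocmi and Federmann, 2023). The sketch below is a minimal, hedged example of that idea; it assumes the OpenAI Python client, and the model name and prompt are illustrative rather than a description of AppTek's internal setup.

```python
# A minimal sketch of LLM-based MT quality estimation in the spirit of
# Kocmi and Federmann (2023). Assumes the OpenAI Python client; the model
# name and prompt wording are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

def estimate_quality(source: str, translation: str) -> str:
    """Ask the LLM for a 0-100 adequacy score of a candidate translation."""
    prompt = (
        "Score the following translation from English to German on a scale "
        "from 0 (no meaning preserved) to 100 (perfect translation). "
        "Answer with the score only.\n\n"
        f"English: {source}\nGerman: {translation}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,
    )
    return response.choices[0].message.content.strip()

print(estimate_quality("The contract ends in May.", "Der Vertrag endet im Mai."))
```

Scores obtained this way could, for example, be used to filter or weight candidate sentence pairs before NMT training, though the exact recipes vary from system to system.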
The future of translation is undeniably intertwined with the continued advancement of LLMs. As these models become more sophisticated, their ability to represent and generate human language will only improve, leading to more accurate, context-aware translations. Moreover, ongoing research and development in areas like multilingual pre-training, cross-lingual transfer learning, and low-resource language translation will further enhance their capabilities.
As we begin to integrate LLMs into translation workflows, the advantages are already becoming apparent in specific, well-defined tasks. In doing so, it is crucial to implement LLMs with guidance from a knowledgeable scientific team that can navigate the complexities while fully leveraging their potential to enhance task performance. At AppTek, we prioritize domain adaptation in LLM deployment, an approach that not only improves business efficiency but can also be used to support initiatives like environmental sustainability.
(Matusov et al., 2020) Evgeny Matusov, Patrick Wilken, and Christian Herold. 2020. Flexible Customization of a Single Neural Machine Translation System with Multi-dimensional Metadata Inputs. In Proceedings of the 14th Conference of the Association for Machine Translation in the Americas (Volume 2: User Track), pages 204–216, Virtual. Association for Machine Translation in the Americas.
(Kocmi et al., 2023) Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondřej Bojar, Anton Dvorkovich, Christian Federmann, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Philipp Koehn, Benjamin Marie, Christof Monz, Makoto Morishita, Kenton Murray, Masaaki Nagata, Toshiaki Nakazawa, Martin Popel, et al. 2023. Findings of the 2023 Conference on Machine Translation (WMT23): LLMs Are Here but Not Quite There Yet. In Proceedings of the Eighth Conference on Machine Translation, pages 1–42, Singapore. Association for Computational Linguistics.
(Kocmi and Federmann, 2023) Tom Kocmi and Christian Federmann. 2023. Large Language Models Are State-of-the-Art Evaluators of Translation Quality. In Proceedings of the 24th Annual Conference of the European Association for Machine Translation, pages 193–203, Tampere, Finland. European Association for Machine Translation.
(Moslem et al., 2023) Yasmin Moslem, Gianfranco Romani, Mahdi Molaei, John D. Kelleher, Rejwanul Haque, and Andy Way. 2023. Domain Terminology Integration into Machine Translation: Leveraging Large Language Models. In Proceedings of the Eighth Conference on Machine Translation, pages 902–911, Singapore. Association for Computational Linguistics.
(Thulke et al., 2024) D. Thulke, Y. Gao, P. Pelser, R. Brune, R. Jalota, F. Fok, et al. 2024. ClimateGPT: Towards AI Synthesizing Interdisciplinary Research on Climate Change. arXiv preprint arXiv:2401.09646.
AppTek.ai is a global leader in artificial intelligence (AI) and machine learning (ML) technologies for automatic speech recognition (ASR), neural machine translation (NMT), natural language processing/understanding (NLP/U), large language models (LLMs) and text-to-speech (TTS) technologies. The AppTek platform delivers industry-leading solutions for organizations across a breadth of global markets such as media and entertainment, call centers, government, enterprise business, and more. Built by scientists and research engineers who are recognized among the best in the world, AppTek’s solutions cover a wide array of languages/dialects, channels, domains and demographics.