A: I grew up in Sendai, in the north of Japan, and my parents both worked with languages. My father was a professor of German studies and my mother, who was Italian, was a translator. As a result, I am bilingual in Japanese and Italian, and I also speak English and German, which I learned at school. I have always been fascinated by languages and I am very happy to be able to use that knowledge in my current job.
A: I enjoy cooking very much. Given my background, I cook both Japanese and Italian food, which are very different, and I love both equally. My favorite foods are classic Italian pizza and Japanese miso soup. When I moved to Germany, I came across Middle Eastern cuisine, which is also very different and was a completely new experience for me. I liked it very much and I am now experimenting with new spices in my cooking.
A: Ever since I was a kid, I have been interested in science, and initially I wanted to study physics. But I decided against it as I did not want to have an academic career. I much prefer working on implementation, building things that have a practical application, so I can see the fruits of my work in real life.
For my bachelor’s degree I studied computer science at Keio University in Tokyo, which had an exchange program with Aachen University. I visited Germany as an undergraduate student, and that is when I decided I would move to Aachen for postgraduate studies, which I did.
I had not thought about working in human language technologies until I needed to select the topic of my MSc thesis. That’s when I came across the Institute of Human Language Technology and Pattern Recognition, which I thought was fascinating. I consider myself very lucky to have come across it, as I’ve been enjoying working in this field very much. It was not easy to get accepted into the Institute either: the interview tasks were particularly hard from a coding point of view. For my thesis I ended up building a GPU decoder rather than a CPU one, processing in parallel rather than sequentially, which makes algorithms run a lot faster.
A: Because it opens up many possibilities for communication between people who wouldn’t otherwise be able to communicate, but now can do so with the help of speech recognition and machine translation.
At AppTek I continue working on a speech recognition decoder. I like it very much because one can directly process data with it, in contrast to a neural network, which is much harder to control. The decoder is only one component of the speech recognition pipeline, and I am continuously learning more about the bigger picture so that I can really understand this technology.
A: I knew of AppTek because of the company’s Lead Science Architect for Speech Recognition, Dr. Eugen Beck, with whom I worked during my MSc thesis and who is now my manager. I began working with the company part-time while I was still studying, as the topic of my thesis was very relevant to the work I am doing here at AppTek.
Last year I took on a full-time role and I am currently working on developing the speech recognition system. I also expect to implement in production a state-of-the-art system based on the work I did at the Institute, and I am very much looking forward to it. In addition, I am training the Italian ASR model, which is very interesting and gives me a better overview of the entire pipeline.
A: In ASR, the quality for English is very good, but this isn’t the case for other languages because their models are trained with less data. When it comes to my languages, Japanese and Italian, the quality is pretty good when the audio is clear. To improve the quality further in any language, the shortest road to success is to train the acoustic and language models with more good-quality data.
At the same time, the structure of a neural network plays a big role in both speech recognition and machine translation quality. The architecture is kept up to date with the state-of-the-art model, and all ASR languages are trained on it. This is an ongoing effort that continues until we reach the ultimate goal of the perfect transcription.
A: I like working at AppTek as the work involves exactly what I want to be focusing on professionally, and I am learning a lot in the process from great colleagues. I feel very grateful for that. It is also very motivating for me to know that I am building something that will go out in the world as a product and will help people communicate.
I also like the fact that the company has a very tight connection with the university, which shows that it is research oriented. It is also not so big that you are restricted to working on only a small part of the pipeline. Instead, you get to work on a variety of tasks and have a better overview of what you are building. It is an environment in which I would want to stay for a while.
A: I only joined AppTek last year, and it was together with another female colleague. This year we also have a new female addition. Fortunately, it doesn’t feel very male-dominated at AppTek and I feel comfortable working in the office. University friends working in other companies tell me that things seem to be improving in that respect elsewhere as well.
To women and girls interested in this career I would probably just reiterate this. Thanks to everyone who worked very hard before us to widen our paths, the working environment is now really getting better and will continue to do so. So have no fear and dive into the exciting world of STEM!
AppTek.ai is a global leader in artificial intelligence (AI) and machine learning (ML) technologies for automatic speech recognition (ASR), neural machine translation (NMT), natural language processing/understanding (NLP/U), large language models (LLMs) and text-to-speech (TTS). The AppTek platform delivers industry-leading solutions for organizations across a breadth of global markets such as media and entertainment, call centers, government, enterprise business, and more. Built by scientists and research engineers who are recognized among the best in the world, AppTek’s solutions cover a wide array of languages and dialects, channels, domains and demographics.