What happens when you combine A.I. with human intelligence? One answer is that you get Unbabel, a machine translation service on a mission to solve one of humanity’s most vexing challenges: language barriers. At Unbabel, the ultimate goal is to be able to communicate with anyone, anywhere in the world regardless of their native language. But to achieve that lofty goal we need data, we need translators and we need them to work together. João Graca is the Co-founder and CTO of Unbabel and on this episode of IT Visionaries, he discusses how his company is working to bring those two sides together to get closer to a future when we can rely on machine translation.
- A Long Way to Go: While there has been a lot of progress with machine translation, there is still a long way to go. Translation services, like Google Translate, have the ability to comprehend words and sentences, but understanding full phrases and rare words remains a struggle. Unbabel works in real time with machine translation and in-house translators to efficiently translate emails and marketing materials.
- We Need Data and Need it Now: One of the biggest challenges when it comes to machine translation is the lack of data, especially in uncommon languages. Data from popular languages are used to help decipher sentences and phrases. When we don’t constantly have an influx of data, that challenge becomes greater.
- You Just Don’t Understand: If we don’t utilize both data and translators to grow machine translation, its intelligence will remain stagnant. The biggest issue right now is companies hiring in-house translators, which is inefficient, doesn’t increase the amount of open source data, and is incredibly expensive.
For a more in-depth look at this episode, check out the article below.
What happens when you combine A.I. with human intelligence? One result is Unbabel, a machine translation service on a mission to solve one of humanity’s biggest challenges: language barriers.
While the ability for someone to speak English and simultaneously understand another person responding in Italian remains far off, real-time written communication with people across the globe has picked up steam in recent years. It’s a problem Unbabel and João Graca, the company’s co-founder and CTO, have been working on since the company launched in 2013.
“I’m trying to solve the same problems I was trying to do through my PhD, now in the real world,” Graca said.
Today, Unbabel works to remove language barriers by combining A.I. with real-time human translation, building enterprise solutions that let brands communicate effectively with customers in the customers’ native languages.
For example, if a customer service rep who speaks and reads English is sent an email in Chinese, Unbabel’s system will translate that email to English before the representative ever receives it. Then, once the representative responds and hits send, the system will automatically translate the reply to Chinese for the recipient. The result is an efficient, streamlined communication system.
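The round trip in that example can be sketched in a few lines of Python. This is only an illustration, not Unbabel’s implementation: the `SupportThread` class, the `Translator` signature and the `tag_translate` stand-in are all hypothetical names invented here, and a real system would call an actual translation engine where the stand-in simply labels the text.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical signature: (text, source_lang, target_lang) -> translated text.
Translator = Callable[[str, str, str], str]


@dataclass
class SupportThread:
    """One customer conversation, remembering who reads which language."""
    customer_lang: str
    agent_lang: str
    translate: Translator

    def inbound(self, body: str) -> str:
        # Customer email, converted before the agent ever sees it.
        return self._convert(body, self.customer_lang, self.agent_lang)

    def outbound(self, body: str) -> str:
        # Agent reply, converted after the agent hits send.
        return self._convert(body, self.agent_lang, self.customer_lang)

    def _convert(self, body: str, src: str, dst: str) -> str:
        # Nothing to do when both sides share a language.
        if src == dst:
            return body
        return self.translate(body, src, dst)


def tag_translate(text: str, src: str, dst: str) -> str:
    # Stand-in for a real engine: it only marks the direction of translation.
    return f"[{src}->{dst}] {text}"


thread = SupportThread(customer_lang="zh", agent_lang="en", translate=tag_translate)
print(thread.inbound("你好"))     # what the agent reads
print(thread.outbound("Hello"))  # what the customer receives
```

Neither side has to know the other’s language; the thread object carries that information and applies the translation in the right direction.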
Graca said that Unbabel continues to grow at an exponential rate, and that one vertical where the company has honed its focus and excelled is customer service. But growth had to come at a slower pace because machine translation is still an inexact science.
“Machine translation is a very hard problem,” Graca said. “We’ve seen a lot of progress on machine translation over the last decade, but it’s still not there. What you have is a spectrum, from full machine intelligence solutions [like Google Translate], but it’s still not reliable. You don’t trust Google Translate to translate to something and send it to your customer. And then on the other side of the spectrum, you have the more traditional translation industry where you have professional translators working more on the project based on the translations.”
Unbabel’s model fits in the middle, using a hybrid approach that combines machine translation and quality estimation with in-house human translators who correct the engines’ errors. This method allows Unbabel to correct errors in real time and deliver translations to its customers in as little as eight to 10 minutes. It also lets the translators consistently add data to the company’s generic engines.
“This is a solution that allows you to basically leverage the most automatic translation part. It gives you the reliability that you otherwise don’t have,” he said.
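A minimal sketch of how such a hybrid pipeline might route work, assuming a quality score between 0 and 1 and a fixed cutoff. None of this comes from Unbabel: the function names, the `threshold` value and the toy glossary are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Translation:
    source: str
    target: str
    quality: float     # estimated quality of the machine draft (assumed 0..1)
    post_edited: bool  # True if a human corrected the machine draft


def hybrid_translate(
    source: str,
    machine_translate: Callable[[str], str],
    estimate_quality: Callable[[str, str], float],
    human_post_edit: Callable[[str, str], str],
    threshold: float = 0.8,  # hypothetical cutoff, not a published figure
    training_pairs: Optional[List[Tuple[str, str]]] = None,
) -> Translation:
    draft = machine_translate(source)
    score = estimate_quality(source, draft)
    if score >= threshold:
        # The machine output is trusted as-is.
        return Translation(source, draft, score, post_edited=False)
    # Low-confidence drafts go to a human translator for correction.
    corrected = human_post_edit(source, draft)
    if training_pairs is not None:
        # Each correction becomes new data for retraining the engine.
        training_pairs.append((source, corrected))
    return Translation(source, corrected, score, post_edited=True)


# Toy stand-ins so the sketch runs end to end.
GLOSSARY = {"hello": "olá", "thank you": "obrigado"}

def toy_mt(text: str) -> str:
    # Known phrases are translated; unknown ones pass through unchanged.
    return GLOSSARY.get(text.lower(), text)

def toy_qe(source: str, draft: str) -> float:
    # Confident only when the phrase was in the glossary.
    return 0.95 if source.lower() in GLOSSARY else 0.2

def toy_editor(source: str, draft: str) -> str:
    # Placeholder for a human translator's correction.
    return f"[human-corrected] {draft}"


pairs: List[Tuple[str, str]] = []
print(hybrid_translate("hello", toy_mt, toy_qe, toy_editor, training_pairs=pairs))
print(hybrid_translate("what's up", toy_mt, toy_qe, toy_editor, training_pairs=pairs))
print(pairs)
```

The design point is the feedback loop: only drafts the estimator distrusts cost human time, and every human correction is kept as a new training pair.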
Graca said that before Unbabel, the vast majority of websites relied heavily on in-house translators to handle all their marketing material. While this process is useful, Graca stressed that it can be incredibly expensive and time consuming, and that it poses a problem when it comes to scalability.
“People use machine translation more for user reviews and basically to get an understanding of what the customers were doing,” he said. “So this is why it’s been hard for companies to scale and support all the languages.”
But why does machine translation still lag, even in services like Google Translate? Graca says that while we already have services that can translate words and phrases in a matter of moments, understanding evolving language and full phrases remains a barrier, one that is only magnified when there is less data to rely on.
“It’s a very hard problem,” Graca said. “We’re still not good at dealing with multiple expressions. We are still not good at dealing with very large sentences. So there are still some challenges and again, language is always evolving.”
One way machine translation has improved is through statistical translation, which Graca says helped in the advancement of deep learning models. Deep learning helped improve algorithms for their data sets, and it also helped with the context of words and phrases. But there is still a long way to go before these services are perfect.
“We don’t understand language,” he said. “We find patterns in language and we know how to translate the patterns. One of the visions of machine translation since the early days in the fifties, was that you build this inter-lingual representation and then generate from there. And to build this, you understand the syntax and semantics. This was never reached. We just translate word by word. On the other hand, we still have difficulties dealing with rare events. Words you haven’t seen very frequently [like slang terms]. It’s hard to learn from that.”
Most current systems operate on a sentence-by-sentence basis, but in order to advance the technology there needs to be improvement in the models and their understanding of language.
One of the major reasons Unbabel continues to be a leader in the space is that it relies heavily on human intelligence to train its engines. Human translators can interpret phrases that might have cultural significance in one market but don’t carry the same importance in others.
But at the end of the day, Graca always comes back to the main problem that continues to drive his work at Unbabel.
“If you don’t have data, you can’t learn and I see that problem.”
To hear the entire discussion, tune into IT Visionaries here.