Automatic Translation with High-tech Glasses

Google Translate is probably the most widely used translation tool. Its popularity has grown along with its efficiency, and the software has now been entirely redesigned with a strong focus on user experience. Google has worked hard to provide new ways of accessing the tool.

According to Wikipedia, “Google Translate supports more than 100 languages and can translate 37 languages via photo, 32 via voice in conversation mode, and 27 via real-time video in augmented reality mode.” The last modes are probably the most futuristic features of the application. They work on smartphones and allow you to translate text simply by taking a picture of it. In this way you can easily translate content from a book, a label or even a billboard, and the best thing is that you can do it quickly and practically. You will not spend hours typing the text on your phone, because the application automatically recognizes the text in the photo, and it actually does so very well. The tool is even more convenient when you have to deal with a language whose characters you don’t even know: we are not only talking about saving time, but also about doing something you wouldn’t be able to do without this technology.

Nowadays, accessing Google Translate just requires you to pull your phone out of your pocket and open the app. However, the process is likely to become even easier. Google has already achieved real-time performance and has designed the software to draw the translated text over the original, directly in the photograph. You can see some pictures below:

First Set of Photos

These photos show Google Translate at work. To be honest, it was really hard to take a screenshot before the app had finished its job: the app is so fast that it took me several attempts to capture a screenshot halfway through the process.

Second Set of Photos


The app does encounter some difficulties when trying to translate a whole page in real-time mode. However, this is definitely not something you want to do with this app. How are you supposed to read a whole page on a small smartphone screen? The best approach is probably to use the app without real-time mode: it works well and recognizes the text without any problem.


These screenshots show Google Translate on an Android smartphone. But what if we put this technology on a pair of smart glasses such as Google Glass, or on some futuristic contact lenses? The tool could start automatically and translate everything we see. More precisely, we would never see the original text in the foreign language, but only its translation. However, the automatic behavior of the tool should probably be limited, at least until the user has become familiar with the device. In fact, since we are modifying the user's perception and understanding of the real world, we have to be sure that the user is aware of these changes.

In general, when developing a new high-tech device, we should always aim to earn users' trust: gaining it is the first step toward the adoption of the new product. And if we want the user to trust the device, we have to tell the user whether what they are seeing is pure reality or augmented reality. There are several reasons for this, both technical and psychological.

From a technical point of view, automatic translations are still imperfect: sometimes they produce something unintelligible. But of course, the behavior of the application must depend on the context. Let’s consider, for example, a student whose native language is English but whose knowledge of French is also quite good. The app should avoid translating French into English by default (and vice versa!), but the user can still find the app quite useful when he encounters uncommon French words, or even easy ones he can never manage to remember. Automatically translating everything the user sees will definitely bother the student, since he will probably understand the French better than the imperfect automatic English translation. So, we should let the user choose what to translate and when. However, we must find a way to make this choice fast and smart. The best approach is probably to let the user set some generic parameters and then decide what has to be translated based on the information the user has provided. Data science will probably help us understand in which situations the user would like help from our device. It could also be valuable to display both the original and the translated text at the same time, in order to give the user an opportunity to improve his knowledge of the language.

On the other hand, let’s consider another native English speaker, trying to understand Chinese without any knowledge of the language. Since in this situation the user cannot rely on himself at all, we can probably decide to translate everything the user is looking at: assuming that the translation is not completely wrong, we are giving the user more and better information than he would have without our device.
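The two scenarios above can be sketched as a small decision rule. This is only an illustration of the idea, not how Google Translate works: the `UserProfile` fields, the `Mode` values, the `choose_mode` function and the 0.6 threshold are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    NONE = "show original only"
    SIDE_BY_SIDE = "show original and translation together"
    REPLACE = "show translation only"


@dataclass
class UserProfile:
    # Self-reported proficiency per language code: 0.0 (none) to 1.0 (native).
    proficiency: dict = field(default_factory=dict)
    # Lowercase words the user has flagged as hard to remember.
    hard_words: set = field(default_factory=set)
    # Hypothetical cut-off: at or above it, text is left alone
    # unless it contains a flagged word.
    fluent_threshold: float = 0.6


def choose_mode(profile, language, words):
    """Pick a display mode for a piece of text in the given language."""
    level = profile.proficiency.get(language, 0.0)
    if level >= 1.0:
        return Mode.NONE          # native language: never translate
    if level == 0.0:
        return Mode.REPLACE       # unknown language: translate everything
    if level >= profile.fluent_threshold:
        # Good knowledge: help only with words the user struggles with.
        if any(w.lower() in profile.hard_words for w in words):
            return Mode.SIDE_BY_SIDE
        return Mode.NONE
    # Partial knowledge: show both texts so the user can keep learning.
    return Mode.SIDE_BY_SIDE


student = UserProfile(
    proficiency={"en": 1.0, "fr": 0.8},
    hard_words={"quincaillerie"},
)
print(choose_mode(student, "fr", ["bonjour"]))              # Mode.NONE
print(choose_mode(student, "fr", ["la", "quincaillerie"]))  # Mode.SIDE_BY_SIDE
print(choose_mode(student, "zh", ["你好"]))                  # Mode.REPLACE
```

In a real device the proficiency values and the list of hard words would more likely be learned from usage data than typed in by hand, but the user-set parameters give a reasonable starting point.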

Furthermore, I personally believe that this kind of device will scare some users. Beyond problems related to privacy, I think some users will be afraid that the text they are reading is being manipulated. What if the text is wrong? What if companies or governments are trying to manipulate what we are reading? Sometimes, slightly different words can really change how a topic is interpreted and judged. This is even more important when talking about politics. Moreover, we must ensure that the device won’t start drawing graphical artifacts in dangerous situations. For instance, we don’t want to obstruct the user's view while he is driving.

All these reasons make me believe that, at least for the moment, we should give the user the opportunity to see both the original and the translated text. But how can we show the user two pieces of content at the same time? One possibility is to show them in sequence. For instance, we can provide a translation only when the user has been staring at something for a chosen amount of time. In this way the device will not be annoying and, moreover, it will have the time it needs to recognize the text and provide a decent translation. In addition, giving the user the opportunity to completely disable some options could be really important for safety reasons in particular contexts.
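The "stare for a chosen time" idea can be sketched as a small dwell timer. This is a minimal illustration under stated assumptions: the `DwellTrigger` class, its `update` method and the region identifiers are hypothetical, and a real device would get the gaze target from an eye tracker rather than from hand-written calls.

```python
class DwellTrigger:
    """Show a translation only after the user has looked at the same
    text region for at least `dwell_seconds` (a user-chosen setting)."""

    def __init__(self, dwell_seconds=1.5):
        self.dwell_seconds = dwell_seconds
        self.enabled = True      # lets the user switch the overlay off, e.g. while driving
        self._target = None      # text region currently being looked at
        self._since = None       # time at which the gaze settled on it

    def update(self, target_id, now):
        """Feed the region under the user's gaze and the current time in seconds.
        Returns True when the translation should be displayed."""
        if not self.enabled or target_id is None:
            # Overlay disabled or nothing readable in view: reset the timer.
            self._target = None
            self._since = None
            return False
        if target_id != self._target:
            # Gaze moved to a new region: restart the timer.
            self._target = target_id
            self._since = now
            return False
        return now - self._since >= self.dwell_seconds


trigger = DwellTrigger(dwell_seconds=1.0)
print(trigger.update("street sign", 0.0))  # False: gaze has just arrived
print(trigger.update("street sign", 0.5))  # False: not long enough yet
print(trigger.update("street sign", 1.2))  # True: the user is staring
print(trigger.update("menu", 1.3))         # False: new region, timer restarts
```

Setting `enabled` to `False` corresponds to the option of disabling the overlay completely in contexts where it could be unsafe.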

In conclusion, I believe Google is working hard and well on Google Translate. They have recently developed a new artificial language that the software uses to improve translations. Machine learning techniques are still in their early stages, meaning that we will see huge improvements in the next few years. We may eventually reach the goal of perfect translation, and maybe that will be a good time to enable automatic translation on our high-tech glasses, but in the meantime we should let users decide how invasive the device should be. In this way, we will be able to collect data about users' usage and preferences in order to decide how to develop future projects.