Google AI Will Describe Images in 10 More Languages With Unified Translation Model
Google has released an AI tool that describes unlabeled images for people with impaired vision in 10 new languages, in addition to the six it already supported. The linguistic expansion is made possible by a single new model capable of generating descriptions in each of the additional tongues.
Many screen readers rely on images having attached descriptions. Without those labels, it can be hard for people with difficulty seeing to understand what they are looking at. Google claims that billions of images online are unlabeled and therefore inaccessible to the hundreds of millions of blind and visually impaired people. The AI feature, produced by Google Research and Chrome Accessibility, removes that limitation by applying visual analysis and natural language processing to describe the image from scratch. Originally released in English two years ago, the feature extended its language options to French, German, Hindi, Italian, and Spanish last year. That list has more than doubled with the latest release, which can also describe images in Croatian, Czech, Dutch, Finnish, Indonesian, Norwegian, Portuguese, Russian, Swedish, and Turkish. Though limited to natural photographs, as opposed to drawings or video stills, the linguistic feat is notable for how Google linked the new languages to the underlying AI tool.
“The major innovation behind this launch is the development of a single machine learning model that generates descriptions in each of the supported languages. This enables a more equitable user experience across languages in the sense that the generated image descriptions in any two languages can often be regarded as translations that respect the image details,” Chrome accessibility software engineer Dominic Mazzoni wrote in a blog post. “We considered fairness, safety and quality when developing this feature and implemented a process to evaluate the images and captions along these dimensions before they’re eligible to be shown to users. We are excited to take this next step towards improving accessibility for more people around the world and look forward to expanding support to more languages in the future.”
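The single-model approach Mazzoni describes resembles the standard trick in multilingual generation: condition one shared model on a target-language token so that outputs in any two languages derive from the same underlying representation of the image. Google has not published its implementation, so the sketch below is purely illustrative; the language tokens, function names, and the stand-in "model" are all hypothetical.

```python
# Hypothetical sketch of a single multilingual captioning model.
# One shared decoder is steered by a target-language token, so
# captions in any two languages are effectively translations of the
# same image representation. All identifiers here are illustrative,
# not Google's actual API.

LANG_TOKENS = {
    "en": "<2en>", "fr": "<2fr>", "hi": "<2hi>",
    "fi": "<2fi>", "tr": "<2tr>",
}

def build_decoder_input(image_features, lang):
    """Prepend the target-language token to the decoder input,
    the common conditioning pattern from multilingual NMT."""
    if lang not in LANG_TOKENS:
        raise ValueError(f"unsupported language: {lang}")
    return [LANG_TOKENS[lang]] + list(image_features)

def describe(image_features, lang):
    """Stand-in for a trained encoder-decoder: a real system would
    run inference here; this stub only makes the control flow and
    the shared-model conditioning visible."""
    tokens = build_decoder_input(image_features, lang)
    return " ".join(tokens)
```

Because every language flows through the same model, adding a language means adding a conditioning token and training data, not training a separate captioner per language, which is what makes descriptions "equitable" across languages in the sense Mazzoni describes.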
The descriptions of unlabeled images work much like the Voice Access app for Android, which can spot and control on-screen icons regardless of labeling, drawing on the 'visual cortex' model that improves AI labeling. It all ties into Google's many accessibility projects. The tech giant has been eager to showcase how its products serve people with disabilities through features like Android Action Blocks, Sound Notifications that alert people who can't hear critical noises, eye-tracking controls, the Lookout app for Android that can read food labels, and the voice cues in Google Maps that guide people with limited sight. There's also the new Project Relate app to help those with speech impairments use Google Assistant.