Software that can understand images, sounds, and language is helping people with disabilities

FCC rules require TV stations to provide closed captions that convey speech, sound effects, and audience reactions such as laughter to deaf and hard of hearing viewers. YouTube isn’t subject to those rules, but thanks to Google’s machine-learning technology, it now offers similar assistance.

YouTube has used speech-to-text software to automatically caption speech in videos since 2009, and those automatic captions are used 15 million times a day. Today it rolled out algorithms that indicate applause, laughter, and music in captions. More sounds could follow, since the underlying software can also identify noises like sighs, barks, and knocks.

The company says user tests indicate that the feature significantly improves the experience of the deaf and hard of hearing (and anyone who needs to keep the volume down). “Machine learning is giving people like me that need accommodation in some situations the same independence as others,” says Liat Kaver, a product manager at YouTube who is deaf.

YouTube’s project is one of several that are creating new accessibility tools by building on progress in the power and practicality of machine learning. The computing industry has been driven to advance software that can interpret images, text, or sound primarily by the prospect of profits in areas such as ads, search, or cloud computing. But software with some ability to understand the world has many other uses.

Last year, for example, Facebook launched a feature that uses the company’s research on image recognition to create text descriptions of images posted by a person’s friends.

Researchers at IBM are using language-processing software developed under the company’s Watson project to make a tool called Content Clarifier to help people with cognitive or intellectual disabilities such as autism or dementia. It can replace figures of speech such as “raining cats and dogs” with plainer terms, and trim or break up lengthy sentences with multiple clauses and indirect language.

The University of Massachusetts, Boston, is helping to test how the system could help people with reading or cognitive disabilities. Will Scott, an IBM researcher who worked on the project, says the company is talking with an organization that helps autistic high schoolers transition to college life about testing the system as a way of helping people understand administrative and educational documents. “The computing power and algorithms and cloud services like Watson weren’t previously available to perform these kinds of things,” he says.

Ineke Schuurman, a researcher at the University of Leuven in Belgium, says inventing new kinds of accessibility tools is important to prevent some people from being left behind as society relies more and more on communication through computers and mobile devices.

She is one of the leaders of an EU project testing its own text simplification software for people with intellectual disabilities. The technology has been built into apps that integrate with Gmail and social networks such as Facebook. “People with intellectual disabilities, or any disability, want to do what their friends and sisters and brothers do—use smartphones, tablets, and social networking,” says Schuurman.

Austin Lubetkin, who has autism spectrum disorder, has worked with Florida nonprofit Artists with Autism to help others on the spectrum become more independent. He welcomes research like IBM’s but says it will be a challenge to ensure that such tools perform reliably. A machine-learning algorithm recommending a movie you don’t care for is one thing; an error that causes you to misunderstand a friend is another.

Still, Lubetkin, who is working at a startup while pursuing a college degree, is optimistic that machine learning will open up many new opportunities for people with disabilities in the next few years. He recently drew on image-recognition technology from startup Clarifai to prototype a navigation app that offers directions in the form of landmarks, inspired by his own struggles to interpret the text and diagram information from conventional apps while driving. “Honestly, AI can level the playing field,” says Lubetkin.
