Thursday 1 December 2016

Google’s All New AI Knocks Out Humans

To many people, Artificial Intelligence still sounds like something out of a science fiction movie. In reality, we already use it in our day-to-day lives, from personal assistants and digital wallets to smart homes and video games. Now, thanks to researchers at Google’s DeepMind, it can lip-read too.


Researchers from Google’s DeepMind and the University of Oxford have developed what is reported to be the most accurate lip-reading AI to date. According to the report, the AI was trained on about 5,000 hours of video from six different television programmes, including BBC Breakfast, Newsnight and Question Time. In total, over 118,000 sentences were fed to the system. After training, its performance was tested on programmes broadcast between March and September 2016.


And the results?

Here are some of the most impressive findings from the test.

  • The system deciphered many of the speakers’ phrases by recognizing their lip movements.

  • While a professional lip reader annotated 12.4 percent of the words from the data set without error, the AI recognized 46.8 percent of all words without any error.

  • The mistakes made by the lip-reading AI were mostly minor, such as missing an ‘s’ at the end of a word.

  • Most surprisingly, the AI outperformed several other automatic lip-reading systems.


In similar work, a separate group of researchers from the University of Oxford has developed LIPNET, a lip-reading system aimed at helping the hearing-impaired. This system outperformed human lip-readers in a test conducted by the University on a data set called GRID. However, GRID contains only 51 unique words, while the BBC TV shows have a vocabulary of 17,500 unique words. This effectively means that Google’s lip-reading system faced a much bigger challenge than LIPNET.
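To put those numbers in perspective, here is a minimal Python sketch that works out the two ratios implied by the figures quoted above. The constant and function names are made up for illustration; only the four numbers come from the report.

```python
# Figures quoted in this post; the names below are illustrative only.
HUMAN_WORD_ACCURACY = 12.4  # percent of words the professional annotated without error
AI_WORD_ACCURACY = 46.8     # percent of words the AI recognized without error
GRID_VOCABULARY = 51        # unique words in the GRID data set
BBC_VOCABULARY = 17_500     # unique words in the BBC TV data set


def ratio(larger: float, smaller: float) -> float:
    """Return how many times bigger `larger` is than `smaller`."""
    return larger / smaller


accuracy_ratio = ratio(AI_WORD_ACCURACY, HUMAN_WORD_ACCURACY)
vocabulary_ratio = ratio(BBC_VOCABULARY, GRID_VOCABULARY)

print(f"AI vs. professional lip reader: about {accuracy_ratio:.1f}x the error-free words")
print(f"BBC vs. GRID vocabulary: about {vocabulary_ratio:.0f}x larger")
```

In other words, the AI got roughly 3.8 times as many words right as the professional, on a vocabulary roughly 343 times larger than GRID's.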


Imagine that, whether you are in a noisy theatre or on a crowded train, you could communicate with your personal assistant without shouting. Your assistant would open the phone camera and recognize your commands from your lip movements to carry out a specific task, say, opening your Facebook mobile app. This Artificial Intelligence could also be of great help in generating subtitles and analysing surveillance CCTV footage and, importantly, it could support people with hearing impairments. Let this innovation be the key to a whole new level of speech recognition technology.
