- Date: March 25, 2016
- Source: University of East Anglia
- Summary: New lip-reading technology could help in solving crimes and provide communication assistance for people with hearing and speech impairments.
New lip-reading technology developed at the University of East Anglia (UEA) could help in solving crimes and provide communication assistance for people with hearing and speech impairments.
The visual speech recognition technology, created by Dr Helen L. Bear and Prof Richard Harvey of UEA's School of Computing Sciences, can be applied 'any place where the audio isn't good enough to determine what people are saying,' Dr Bear said.
Dr Bear, whose findings will be presented at the International Conference on Acoustics, Speech and Signal Processing (ICASSP) in Shanghai on March 25, said unique problems with determining speech arise when sound isn't available -- such as on CCTV footage -- or if the audio is inadequate and there aren't clues to give the context of a conversation. The sounds '/p/,' '/b/,' and '/m/' all look similar on the lips, but now the machine lip-reading classification technology can differentiate between the sounds for a more accurate translation.
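The confusability described here is often framed in terms of "visemes": groups of phonemes that produce the same visible mouth shape. The sketch below is purely illustrative — the grouping reflects standard linguistic categories, not the UEA classifier itself — and shows why /p/, /b/ and /m/ collapse into a single visual class:

```python
# Illustrative viseme grouping: phonemes that look identical on the lips
# fall into the same class. (Hypothetical mapping for demonstration only;
# the UEA system's actual classes are defined in the paper.)
VISEME_CLASSES = {
    "/p/": "bilabial", "/b/": "bilabial", "/m/": "bilabial",
    "/f/": "labiodental", "/v/": "labiodental",
}

def visually_confusable(phoneme_a: str, phoneme_b: str) -> bool:
    """Two phonemes are visually confusable if they share a viseme class."""
    va, vb = VISEME_CLASSES.get(phoneme_a), VISEME_CLASSES.get(phoneme_b)
    return va is not None and va == vb

print(visually_confusable("/p/", "/b/"))  # True: both bilabial
print(visually_confusable("/p/", "/f/"))  # False: different lip shapes
```

Distinguishing sounds *within* one of these visual classes is exactly the problem the UEA classification method addresses.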
Dr Bear said: 'We are still learning the science of visual speech and what it is people need to know to create a fool-proof recognition model for lip-reading, but this classification system improves upon previous lip-reading methods by using a novel training method for the classifiers.
'Potentially, a robust lip-reading system could be applied in a number of situations, from criminal investigations to entertainment. Lip-reading has been used to pinpoint words footballers have shouted in heated moments on the pitch, but it is likely to be of most practical use in situations where there are high levels of noise, such as in cars or aircraft cockpits.
'Crucially, whilst there are still improvements to be made, such a system could be adapted for use for a range of purposes -- for example, for people with hearing or speech impairments. Alternatively, a good lip-reading machine could be part of an audio-visual recognition system.'
Prof Harvey said: 'Lip-reading is one of the most challenging problems in artificial intelligence so it's great to make progress on one of the trickier aspects, which is how to train machines to recognise the appearance and shape of human lips.'
The research was part of a three-year project and was supported by the Engineering and Physical Sciences Research Council (EPSRC).
The paper, 'Decoding visemes: Improving machine lip-reading,' will be published on March 25, 2016 in the Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing 2016.
Story Source:
Materials provided by University of East Anglia. Note: Content may be edited for style and length.
Cite This Page:
University of East Anglia. 'Read my lips: New technology spells out what's said when audio fails.' ScienceDaily. ScienceDaily, 25 March 2016. <www.sciencedaily.com/releases/2016/03/160325093702.htm>.
New computer software program excels at lip reading
by University of Oxford
A new computer software program has the potential to lip-read more accurately than people and to help those with hearing loss, Oxford University researchers have found.
Watch, Attend and Spell (WAS) is a new artificial intelligence (AI) software system developed by Oxford in collaboration with the company DeepMind.
The AI system uses computer vision and machine learning methods to learn how to lip-read from a dataset made up of more than 5,000 hours of TV footage, gathered from six different programmes including Newsnight, BBC Breakfast and Question Time. The videos contained more than 118,000 sentences in total, and a vocabulary of 17,500 words.
The research team compared the ability of the machine and a human expert to work out what was being said in silent video, focusing solely on each speaker's lip movements. They found that the software was more accurate than the professional: the human lip-reader correctly read 12 per cent of words, while the WAS software recognised 50 per cent of the words in the dataset without error. The machine's mistakes were small, such as missing an 's' at the end of a word or single-letter misspellings.
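Word-level accuracy figures like these are typically obtained by aligning the system's output against a reference transcript and counting matched words. The snippet below is a generic sketch of such a score, not the exact metric used in the Oxford study:

```python
import difflib

def word_accuracy(reference: str, hypothesis: str) -> float:
    """Fraction of reference words recovered by the hypothesis,
    counted via an alignment of the two word sequences."""
    ref, hyp = reference.split(), hypothesis.split()
    matcher = difflib.SequenceMatcher(None, ref, hyp)
    # get_matching_blocks() returns the aligned runs of identical words.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref)

ref = "the cat sat on the mat"
hyp = "the cat sat on a mat"
print(word_accuracy(ref, hyp))  # 5 of 6 words match: ~0.833
```

Speech-recognition papers more commonly report word error rate (WER), which additionally penalises insertions; the alignment idea is the same.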
The software could support a number of developments, including helping the hard of hearing to navigate the world around them. Speaking on the tech's core value, Jesal Vishnuram, Action on Hearing Loss Technology Research Manager, said: 'Action on Hearing Loss welcomes the development of new technology that helps people who are deaf or have a hearing loss to have better access to television through superior real-time subtitling.
'It is great to see research being conducted in this area, with new breakthroughs welcomed by Action on Hearing Loss by improving accessibility for people with a hearing loss. AI lip-reading technology would be able to enhance the accuracy and speed of speech-to-text especially in noisy environments and we encourage further research in this area and look forward to seeing new advances being made.'
Commenting on the potential uses for WAS Joon Son Chung, lead-author of the study and a graduate student at Oxford's Department of Engineering, said: 'Lip-reading is an impressive and challenging skill, so WAS can hopefully offer support to this task - for example, suggesting hypotheses for professional lip readers to verify using their expertise. There are also a host of other applications, such as dictating instructions to a phone in a noisy environment, dubbing archival silent films, resolving multi-talker simultaneous speech and improving the performance of automated speech recognition in general.'
The research team comprised Joon Son Chung and Professor Andrew Zisserman at Oxford, where the research was carried out, together with Dr Andrew Senior and Dr Oriol Vinyals at DeepMind. Professor Zisserman commented: 'This project really benefited from being able to bring together the expertise from Oxford and DeepMind.'
Provided by University of Oxford
Citation: New computer software program excels at lip reading (2017, March 17) retrieved 21 September 2019 from https://phys.org/news/2017-03-software-excels-lip.html