Visual Speech Processing and Recognition

  • U. B. Mahadevaswamy
  • M. Shashank Rao
  • S. Vrushab
  • C. Anagha
  • V. Sangameshwar
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1141)

Abstract

Lip reading is the ability to understand what a person is communicating using only the video information. With the advent of the internet and computers, it is now possible to remove human intervention from lip reading. Such automation is feasible because of two developments in the field of computer vision: the availability of a large-scale dataset for training and the use of neural network models. The applications are numerous: from dictating messages to a device in a noisy environment to improving speech recognition in current technologies, visual speech recognition has proved to be pivotal. In this paper, the lip-reading models are based on deep neural network architectures, created for the task of speech recognition, that capture temporal data.

Keywords

CNN, Viseme, ConvLSTM, PyDrive, HOG

Copyright information

© Springer Nature Singapore Pte Ltd. 2021

Authors and Affiliations

  • U. B. Mahadevaswamy (1)
  • M. Shashank Rao (1), corresponding author
  • S. Vrushab (1)
  • C. Anagha (1)
  • V. Sangameshwar (1)

  1. Department of Electronics and Communication Engineering, Sri Jayachamarajendra College of Engineering, Mysuru, India
