- Authors
- KIM Jinyoung, LEE Joohun, SHIRAI Katsuhiko
- Publisher
- The Institute of Electronics, Information and Communication Engineers (IEICE)
- Journal
- IEICE transactions on fundamentals of electronics, communications and computer sciences (ISSN:09168508)
- Volume, issue, pages, and publication date
- vol.85, no.9, pp.2164-2168, 2002-09-01
- Number of references
- 9
- Number of citing works
- 13
This paper proposes a method for real-time automatic lip-reading based on image transforms under illumination variations. The method is efficient (smaller feature data size) and robust (better recognition under different lighting conditions). The image-transform-based approach obtains a compressed representation of the pixel values of the speaker's mouth image and is reported to show superior lip-reading performance. However, this approach inevitably produces large feature vectors, so lip-reading requires much computation time even when principal component analysis (PCA) is applied. To reduce the necessary dimension of the feature vectors, the proposed method folds the lip image in each frame based on its left-right symmetry. This folding also compensates for unbalanced illumination between the left and right lip areas. Additionally, to filter out the inter-frame time-domain spectral distortion of each pixel caused by illumination noise, the method applies high-pass filtering to the variations of pixel values between consecutive frames. In experiments on a database recorded under various lighting conditions, the proposed lip folding and/or inter-frame filtering substantially reduced the number of features required (principal components in this work) and achieved a higher recognition rate than the conventional method.
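
The pipeline described in the abstract (fold the lip image, high-pass filter each pixel across frames, then reduce the result with PCA) can be illustrated with the rough sketch below. It is not the authors' implementation: the abstract does not give the folding rule, the filter design, or the image size, so everything concrete here is an assumption. In particular, the function names `fold_lip_image`, `interframe_highpass`, and `pca_features` are hypothetical, the fold is assumed to average the left half with the mirrored right half about the middle column, and the filter is assumed to be a first-order recursive high-pass with a hypothetical coefficient `alpha`.

```python
import numpy as np


def fold_lip_image(frame: np.ndarray) -> np.ndarray:
    """Fold a grayscale mouth image about its vertical centre line.

    Assumption: the mouth is centred, so the symmetry axis is the middle
    column. The right half is mirrored and averaged with the left half,
    which halves the pixel count and evens out left/right lighting.
    """
    _, width = frame.shape
    half = width // 2
    left = frame[:, :half].astype(np.float64)
    # Mirror the right half; for odd widths the centre column is dropped.
    right = frame[:, width - half:][:, ::-1].astype(np.float64)
    return 0.5 * (left + right)


def interframe_highpass(frames: np.ndarray, alpha: float = 0.9) -> np.ndarray:
    """Filter each pixel trajectory along the time axis (axis 0).

    Assumed filter: y[t] = alpha * (y[t-1] + x[t] - x[t-1]), with y[0] = 0.
    It depends only on frame-to-frame differences, so a constant
    brightness offset is removed entirely.
    """
    frames = frames.astype(np.float64)
    out = np.zeros_like(frames)
    for t in range(1, frames.shape[0]):
        out[t] = alpha * (out[t - 1] + frames[t] - frames[t - 1])
    return out


def pca_features(frames: np.ndarray, n_components: int) -> np.ndarray:
    """Project flattened frames onto their leading principal components.

    n_components must not exceed min(number of frames, pixels per frame).
    """
    flat = frames.reshape(frames.shape[0], -1)
    centered = flat - flat.mean(axis=0)
    # Rows of vt are the principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def extract_features(frames: np.ndarray, n_components: int) -> np.ndarray:
    """Fold, filter, then reduce: one feature vector per frame."""
    folded = np.stack([fold_lip_image(f) for f in frames])
    filtered = interframe_highpass(folded)
    return pca_features(filtered, n_components)
```

With a hypothetical sequence of shape `(T, H, W)`, `extract_features(frames, k)` returns an array of shape `(T, k)`. Folding halves the number of pixels that PCA has to process, which is consistent with the abstract's claim of a smaller feature size, though the actual savings reported in the paper depend on details not given here.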