Project Description: Current computer-animated speech systems do not take into account the visual impact of prosody. This leads to unrealistic animated figures, since prosodic effects are fundamental to human communication. Prosody, for a person speaking English, is the stress, intonation, length, and rhythm of syllables and sentences. A person's use of prosody during speech visibly changes his or her mouth, face, jaw, and lips.
The project had three distinct phases. The first phase began with an analysis of the existing linguistics research on standard American English. During this phase, an experimental corpus of words and sentences exhibiting prosody was developed. This corpus was used in a motion-capture environment to record raw data on prosodic effects on the human face. In the second phase, computer-assisted data segmentation was used to remove noise, determine the timing of phonemes, and match the captured data to prosody parameters. During the third phase, software was developed implementing an algorithm that extracts jaw, mouth, and facial-muscle parameters from the motion-capture data. These parameters were used to animate a parametric facial model. The extracted parametric curves can then be used to develop a model of prosody for creating advanced facial animations.
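To illustrate the kind of parameter extraction described in the third phase, the sketch below derives a jaw-opening parameter curve from two motion-capture marker trajectories. The specific markers (nose bridge as a fixed reference, chin as the moving jaw marker), the distance-based measure, and the min-max normalization are all assumptions for illustration, not the project's actual algorithm.

```python
import numpy as np

def jaw_opening_curve(nose_bridge, chin):
    """Per-frame jaw-opening parameter from two 3-D marker trajectories.

    nose_bridge, chin: arrays of shape (n_frames, 3) holding marker positions.
    Returns a curve normalized to [0, 1] over the capture session.
    (Hypothetical extraction method; the project's real algorithm may differ.)
    """
    # Euclidean distance between the reference and jaw markers, per frame
    dist = np.linalg.norm(chin - nose_bridge, axis=1)
    lo, hi = dist.min(), dist.max()
    if hi <= lo:  # jaw never moved; return a flat curve
        return np.zeros_like(dist)
    return (dist - lo) / (hi - lo)

# Synthetic 5-frame capture: the chin drops up to 10 mm while speaking
nose = np.zeros((5, 3))
chin = np.array([[0.0, -80.0 - d, 0.0] for d in (0, 2, 10, 6, 0)])
curve = jaw_opening_curve(nose, chin)
```

A curve like this, computed per frame and per parameter (jaw, lip spread, brow raise, and so on), is what could drive the parametric facial model mentioned above.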