Interactive Poem is a new type of poem, created by you and a computer agent, collaborating in a poetic world full of inspiration, emotion and sensitivity. The concept of this interactive poem is based on conventional poetry, but goes beyond traditional limits by introducing the capability of interaction. You and a computer agent create a dialogue by exchanging short poetic phrases, and through this exchange produce a new poetic world that integrates the poetic world of the agent with your own.
Interaction: A computer agent called “MUSE”, who has been carefully designed with a face suitable for expressing the emotion of a poetic world, appears on the screen. She will utter a short poetic phrase to you. Hearing it allows you to enter the world of the poem and, at the same time, feel an impulse to respond by uttering one of the optional phrases or by creating your own poetic phrase. Exchanging poetic phrases through this interactive process allows you and MUSE to become collaborative poets who generate a new poem and a new poetic world. The interaction mechanism operates as follows.
1) When MUSE utters a phrase, the recognition process is activated. A participant then utters a phrase and it is recognized by the phrase recognition function, which uses the lexicon subset corresponding to the next set of phrases in the transition network. At the same time, emotion contained in the utterance is recognized by the emotion recognition function.
2) The reaction of the system is decided from the recognition results and the transition network. The facial expression of MUSE changes according to the results of emotion recognition, and the phrase MUSE utters is based on the results of phrase recognition and the transition network. The background scene changes as the transitions continue.
3) In the manner stated above, poetic phrases are produced one after another between MUSE and the participant.
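The three steps above can be sketched as a small state machine. This is only an illustrative sketch, not the system's actual implementation: the phrases, node names, scene names, the `FACES` table and the `step` function are all hypothetical, and the two recognizers are assumed to have already returned a phrase and an emotion label.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One state of the (hypothetical) transition network."""
    muse_phrase: str   # phrase MUSE utters on entering this node
    scene: str         # background scene shown at this node
    # Phrases the participant may utter next, mapped to the following node id.
    replies: dict = field(default_factory=dict)


# Hypothetical network; the real phrases and topology are not given in the text.
NETWORK = {
    "start": Node("The moon is rising over the sea.", "moonlit sea",
                  {"i hear the waves": "waves", "the night is cold": "cold"}),
    "waves": Node("The waves carry an old song.", "shoreline",
                  {"sing it to me": "end"}),
    "cold": Node("Then come closer to the fire.", "fireside",
                 {"i will stay": "end"}),
    "end": Node("Our poem rests here.", "dawn"),
}

# Assumed mapping from a recognized emotion to a facial expression of MUSE.
FACES = {"joy": "smiling", "anger": "frowning", "emotionless": "calm"}


def active_lexicon(node_id):
    """Step 1: the lexicon subset the phrase recognizer should listen for."""
    return sorted(NETWORK[node_id].replies)


def step(node_id, recognized_phrase, recognized_emotion):
    """Step 2: decide MUSE's reaction from the recognition results."""
    next_id = NETWORK[node_id].replies[recognized_phrase]
    nxt = NETWORK[next_id]
    face = FACES.get(recognized_emotion, "calm")
    return next_id, face, nxt.muse_phrase, nxt.scene
```

Step 3 then amounts to calling `step` repeatedly, each time restricting recognition to `active_lexicon(node_id)` for the current node, until a node with no replies is reached.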
The speech recognition unit has two different speech recognition functions: phrase recognition and emotion recognition. To recognize each phrase uttered by a participant, HMM (hidden Markov model) based speaker-independent speech recognition technology has been adopted. Each phrase to be uttered is represented in the form of a phoneme sequence and is stored in the lexicon. To simultaneously detect the emotional state of a participant, the emotion recognition function is introduced. A neural network architecture has been adopted as the basic architecture for emotion recognition. This neural network is trained by using the utterances of many speakers to express the eight emotional states of joy, happiness, anger, fear, teasing, disgust, disappointment, and emotionless. As such, speaker-independent and content-independent emotion recognition is realized.
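To illustrate how a lexicon of phoneme sequences can be searched within an active subset, the sketch below picks the stored phrase closest to an observed phoneme string. Note the substitution: plain edit distance stands in for HMM decoding, which the actual system uses, and the phoneme labels, lexicon entries and function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


# Hypothetical lexicon: each phrase is stored as a phoneme sequence.
LEXICON = {
    "i hear the waves": ["ay", "hh", "ih", "r", "dh", "ah", "w", "ey", "v", "z"],
    "the night is cold": ["dh", "ah", "n", "ay", "t", "ih", "z", "k", "ow", "l", "d"],
}


def recognize(observed, active_phrases):
    """Return the phrase in the active lexicon subset closest to the input."""
    return min(active_phrases,
               key=lambda p: edit_distance(observed, LEXICON[p]))
```

Restricting `active_phrases` to the phrases reachable from the current node keeps the search small and reduces confusions with phrases that cannot occur at that point in the dialogue.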
- Ryohei Nakatsu (Japan) received his B.S., M.S. and Ph.D. degrees in electronic engineering from Kyoto University in 1969, 1971 and 1982, respectively. After joining NTT (Nippon Telegraph & Telephone Corporation) in 1971, he mainly worked on speech recognition technology. Since 1994, he has been with ATR and is currently the president of the ATR Media Integration & Communications Research Laboratories. Recently, he has become interested in the recognition of non-verbal information such as emotions in speech. He is a member of the IEEE, the Institute of Electronics, Information and Communication Engineers Japan (IEICE-J) and the Acoustical Society of Japan. In 1996 he met Naoko Tosa, a media artist, and they started collaborating. They have developed several computer characters which are able to communicate with people based on emotions. Their works were exhibited at the National Museum of Art in Osaka, O Museum in Tokyo, and other museums and art exhibitions. Their recent work, called Interactive Poem, was awarded the L’Oreal Prize in 1997.
- Naoko Tosa (Japan) is a Director (Interactive Theater Project) in the ATR Media Integration & Communications Research Laboratories. She is also a Visiting Associate Professor at Kobe University, and a lecturer in the Dept. of Imaging Arts and Sciences, Musashino Art University. Her major research area is Art and Technology, where she is working on the creation of experimental film, video art, computer graphics animation, and interactive arts. Her recent work includes the Neuro-Baby project, an autonomous computer agent with automatic facial expression and behavior synthesis that can respond to human voice by recognizing emotions and feelings. Her work was exhibited at the Museum of Modern Art (New York), Metropolitan Art Museum, SIGGRAPH, Ars Electronica, Long Beach Museum, and other locations worldwide. Her works are also held in the collections of The Japan Foundation, American Film Association, Japan Film Culture Center, Nagoya Prefecture Modern Art Museum Japan, and other institutions in Japan. tosa.media.kyoto-u.ac.jp