Research Trends

Current Topics in Acoustical Research, Volume 1, Issue 1

Development of a female voice for a concatenative text-to-speech synthesis system
Ann K. Syrdal
Pages: 169 - 181
Number of pages: 13

ABSTRACT

This paper describes the technical issues involved in, and the procedures used for, developing a 1000-element diphone female voice and subsequently a 2500-element polyphone female voice for the AT&T Network Systems concatenative-synthesis text-to-speech (TTS) system. Telephone bandpass filtering and speech coding techniques each reduce the intelligibility of natural human female speech relatively more than that of male speech. Both of these factors were hurdles to developing a satisfactory female TTS voice for telephony applications. Despite these difficulties, a highly intelligible female voice was developed by careful selection of elements for the acoustic inventory. The approach taken was to refine the acoustic inventory by iterative intelligibility testing: identify poor acoustic elements, replace them with superior elements, and repeat the process, covering as wide a range of acoustic elements and contexts as possible. This technique was very successful: the female diphone TTS voice produced 67% more errors than the male diphone TTS voice in early 1990, but only 21% more in late 1990. After the expansion from the diphone to the polyphone TTS system in 1991, listeners significantly preferred the polyphone system to the diphone system for both the male TTS voice (60% to 40%) and the female TTS voice (68% to 32%). Intelligibility tests of the male and female polyphone TTS voices in 1992 indicated that TTS intelligibility was significantly lower (about 90% as intelligible) than that of the human voices on which the TTS systems were based. Intelligibility scores averaged across both human and TTS conditions were significantly higher for male voices than for female voices, but there was no significant interaction between sex and condition (human or TTS), indicating that the small differences in intelligibility between male and female voices were comparable for human and synthetic voices.
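The iterative refinement described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the `error_rate` callback (standing in for a listener intelligibility test), the threshold, and the stopping rule are all assumptions made for illustration. The helper `percent_more_errors` reflects one reading of the reported "percentage more errors" figures, namely the relative excess of female-voice errors over male-voice errors.

```python
from typing import Callable, Dict, List


def refine_inventory(
    inventory: Dict[str, str],
    candidates: Dict[str, List[str]],
    error_rate: Callable[[str, str], float],
    threshold: float,
    max_rounds: int = 10,
) -> Dict[str, str]:
    """Hypothetical sketch: repeatedly swap out poorly scoring elements.

    inventory  maps an element label (e.g. a diphone) to its chosen recording.
    candidates maps an element label to alternative recordings.
    error_rate stands in for an intelligibility test (assumed, not from the paper).
    """
    current = dict(inventory)
    for _ in range(max_rounds):
        # Identify elements whose measured error rate exceeds the threshold.
        poor = [u for u, rec in current.items() if error_rate(u, rec) > threshold]
        if not poor:
            break
        replaced = False
        for unit in poor:
            options = candidates.get(unit, [])
            if not options:
                continue
            # Pick the alternative recording with the lowest error rate.
            best = min(options, key=lambda rec: error_rate(unit, rec))
            if error_rate(unit, best) < error_rate(unit, current[unit]):
                current[unit] = best
                replaced = True
        if not replaced:
            # No candidate improves on any poor element, so stop early.
            break
    return current


def percent_more_errors(female_errors: float, male_errors: float) -> float:
    """Relative excess of female-voice errors over male-voice errors, in percent."""
    return 100.0 * (female_errors - male_errors) / male_errors
```

In practice each round would involve a fresh listening test rather than a fixed scoring function; the callback is only a placeholder for that step.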
