Spectral voice conversion based on unsupervised clustering of acoustic space

Masoud Geravanchizadeh

Authors	Masoud Geravanchizadeh
Conference	INTERSPEECH
Conference date	2000-10-16
Conference location	Beijing, China
Pages	614-617
Presentation type	Poster
Conference level	International

Abstract:

Voice conversion systems aim to modify a source speaker’s
speech so that it is perceived as if a target speaker had
spoken it. Applying voice conversion techniques to a
concatenative text-to-speech synthesizer allows such systems
to be personalized, so that additional voices can be produced
quickly and automatically from a single source-speaker
database. This paper presents a new algorithm that offers a
simple and effective solution to the voice conversion problem
while maintaining high speech quality. Spectral conversion is
performed by locally linear transformations, which are
computed by minimum mean square error (MMSE) estimation.
The acoustic features included in the conversion are vocal
tract parameters, represented by log area ratio (LAR)
coefficients. Listening tests show that the proposed
algorithm can convert speaker individuality while maintaining
high speech quality.
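
The idea of locally linear spectral conversion can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not specify the clustering method, the number of clusters, or how source and target frames are aligned. The sketch assumes time-aligned source and target feature matrices, uses plain k-means for the unsupervised clustering, and fits one affine map per cluster by least squares (the MMSE solution for a linear map on the training data). All function names are hypothetical, and the log area ratio formula follows one common sign convention.

```python
import numpy as np


def lar_from_reflection(k):
    """Log area ratios from reflection coefficients with |k| < 1.

    Assumption: uses the convention LAR = log((1 + k) / (1 - k));
    some texts use the opposite sign.
    """
    k = np.asarray(k, dtype=float)
    return np.log((1.0 + k) / (1.0 - k))


def assign(x, centroids):
    """Index of the nearest centroid (squared Euclidean) for each row of x."""
    d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def kmeans(x, n_clusters, n_iter=20, seed=0):
    """Plain k-means, used here as a stand-in clustering method."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        labels = assign(x, centroids)
        for c in range(n_clusters):
            members = x[labels == c]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[c] = members.mean(axis=0)
    return centroids


def fit_local_transforms(src, tgt, centroids):
    """Fit one affine map per cluster by least squares.

    src, tgt: time-aligned feature matrices of shape (frames, dim).
    Returns a list of (dim + 1, dim) matrices; the last row is the bias.
    """
    labels = assign(src, centroids)
    dim = src.shape[1]
    transforms = []
    for c in range(len(centroids)):
        xs, ys = src[labels == c], tgt[labels == c]
        if len(xs) == 0:
            # No training frames in this cluster: fall back to identity.
            w = np.vstack([np.eye(dim), np.zeros((1, dim))])
        else:
            xs1 = np.hstack([xs, np.ones((len(xs), 1))])
            w, *_ = np.linalg.lstsq(xs1, ys, rcond=None)
        transforms.append(w)
    return transforms


def convert(src, centroids, transforms):
    """Convert each source frame with the transform of its nearest cluster."""
    labels = assign(src, centroids)
    src1 = np.hstack([src, np.ones((len(src), 1))])
    out = np.empty((len(src), transforms[0].shape[1]))
    for c, w in enumerate(transforms):
        mask = labels == c
        out[mask] = src1[mask] @ w
    return out
```

In use, one would compute LAR vectors for aligned source and target frames, call `kmeans` on the source vectors, fit the transforms with `fit_local_transforms`, and then apply `convert` to new source frames. Hard cluster assignment is the simplest choice; it can produce discontinuities at cluster boundaries, which soft (weighted) assignment would smooth out.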
