Spectral voice conversion based on unsupervised clustering of acoustic space

Authors: Masoud Geravanchizadeh
Conference: INTERSPEECH
Conference date: 2000-10-16
Conference location: Beijing, China
Pages: 614-617
Presentation type: Poster
Conference level: International

Abstract:

Voice conversion systems aim to modify a source speaker’s speech so that it is perceived as if a target speaker had spoken it. Applying voice conversion techniques to a concatenative text-to-speech synthesizer allows such systems to be personalized, so that additional voices can be produced quickly and automatically from a single source-speaker database. This paper presents a new algorithm that offers a simple and effective solution to the voice conversion problem, with the goal of maintaining high speech quality. Spectral conversion is performed by locally linear transformations, which are computed by minimum mean square error (MMSE) estimation. The acoustic features included in the conversion are vocal tract parameters, represented by log area ratio coefficients. Evaluation by listening tests shows that the proposed algorithm makes it possible to convert speaker individuality while maintaining high quality.
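
The idea of locally linear conversion summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say which clustering method is used, so plain k-means with hard cluster assignment stands in for it here, and the function names (`kmeans`, `fit_local_transforms`, `convert`) are hypothetical. The sketch also assumes that source and target feature vectors (for example, log area ratio vectors) are already time-aligned frame by frame. Within each cluster, an affine map is fitted by ordinary least squares, which is the minimum mean square error solution among affine maps on the training frames.

```python
import numpy as np


def assign(x, centers):
    """Return the index of the nearest center (squared Euclidean) per frame."""
    d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def kmeans(x, k, iters=20, seed=0):
    """Plain k-means; an assumed stand-in for the unsupervised clustering step."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centers with k distinct training frames (fancy indexing copies).
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        labels = assign(x, centers)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster is empty
                centers[j] = x[labels == j].mean(axis=0)
    return centers


def _affine(x):
    """Append a constant 1 to each frame so the fitted map includes a bias."""
    return np.hstack([x, np.ones((len(x), 1))])


def fit_local_transforms(src, tgt, centers):
    """Fit one affine source-to-target map per cluster by least squares."""
    labels = assign(src, centers)
    # A global fit serves as fallback for clusters with too few frames.
    w_global, *_ = np.linalg.lstsq(_affine(src), tgt, rcond=None)
    transforms = []
    for j in range(len(centers)):
        mask = labels == j
        if mask.sum() < src.shape[1] + 1:
            transforms.append(w_global)
            continue
        w, *_ = np.linalg.lstsq(_affine(src[mask]), tgt[mask], rcond=None)
        transforms.append(w)
    return transforms


def convert(src, centers, transforms):
    """Convert each source frame with the transform of its nearest cluster."""
    labels = assign(src, centers)
    xa = _affine(src)
    out = np.empty((len(src), transforms[0].shape[1]))
    for j, w in enumerate(transforms):
        mask = labels == j
        out[mask] = xa[mask] @ w
    return out
```

In use, one would call `centers = kmeans(src_train, k)`, then `transforms = fit_local_transforms(src_train, tgt_train, centers)`, and finally `convert(src_new, centers, transforms)` on unseen source frames. Hard assignment keeps the sketch short but can produce discontinuities at cluster boundaries; whether the paper uses hard or soft weighting is not stated in the abstract.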
