Tutorial Example

Extract Mandarin Chinese Phonemes in TTS – TTS Tutorial

To build a Mandarin Chinese TTS system, we need to extract Chinese phonemes. This tutorial shows how to do that.

Pinyin and Phonemes

We usually extract Chinese phonemes from Chinese pinyin. It is easy to get the pinyin of a Chinese word or sentence.

Python Convert Chinese String to Pinyin: A Step Guide – Python Tutorial

For example, we may get a pinyin sequence as follows:

tong2 ping2 hu4 dong4 shuang1 xiang4 tong2 bu4 xu1 qiu2 kai1 fa1, xia4 zhou1 ji4 xu4 kai1 fa1

To use pinyin for text-to-speech, we can treat the individual characters [t,o,n,g…..,f,a,1] as the basic Chinese phonemes.

However, using single English characters such as t, o, n and 2 to convert Chinese text to speech may give worse results, because the phoneme set is too small.
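To make this limitation concrete, here is a minimal sketch that splits part of the pinyin sequence above into single characters and counts the distinct symbols. The function name `char_level_phonemes` is hypothetical and used only for illustration.

```python
def char_level_phonemes(pinyin_text):
    """Split a space-separated pinyin string into single-character symbols."""
    symbols = []
    for syllable in pinyin_text.split():
        symbols.extend(syllable)  # 'tong2' -> 't', 'o', 'n', 'g', '2'
    return symbols

text = "tong2 ping2 hu4 dong4 shuang1 xiang4 tong2 bu4 xu1 qiu2 kai1 fa1"
symbols = char_level_phonemes(text)
vocab = sorted(set(symbols))  # distinct character-level symbols

print(symbols[:5])
print(len(vocab), vocab)
```

With this scheme the vocabulary can never grow beyond the Latin letters plus the tone digits, no matter how much text is processed.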

To get a larger set of Chinese phonemes, we can extract initial consonants (shengmu) and simple or compound vowels (yunmu).

For example:

23 initial consonants

b p m f d t n l g k h j q x zh ch sh r z c s y w

24 simple or compound vowels

a o e i u v ai ei ui ao ou iu ie ve er an en in un vn ang eng ing ong
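As a rough illustration, a pinyin syllable can be split into its initial and final by matching against the initials listed above. This is only a sketch under stated assumptions: the helper name `split_syllable` is hypothetical, the syllable is assumed to carry its tone as a trailing digit, and whatever remains after the initial is treated as the final (so combined finals such as "uang" are returned as a single unit).

```python
# Initials as listed above. Two-letter initials are tried first so that
# 'zh' is not mistaken for 'z'.
INITIALS = "b p m f d t n l g k h j q x zh ch sh r z c s y w".split()
INITIALS_BY_LENGTH = sorted(INITIALS, key=len, reverse=True)

def split_syllable(syllable):
    """Return (initial, final, tone) for a syllable such as 'shuang1'."""
    tone = ""
    if syllable and syllable[-1].isdigit():
        syllable, tone = syllable[:-1], syllable[-1]
    for initial in INITIALS_BY_LENGTH:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):], tone
    # No initial matched: a zero-initial syllable such as 'an1'.
    return "", syllable, tone

for s in ["tong2", "shuang1", "xu1", "an1"]:
    print(s, split_syllable(s))
```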

We can also get them using Python. For example:

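from pypinyin import lazy_pinyin, Style
# Note: _utils is an underscore-prefixed (private) module of pypinyin.
from pypinyin.style._utils import get_initials, get_finals

def get_shengmu_yunmu(pinyin_word, strict=False):
    # Return "initial final" (shengmu and yunmu) for one pinyin syllable.
    initial = get_initials(pinyin_word, strict)
    final = get_finals(pinyin_word, strict)
    return '%s %s' % (initial, final)

# Convert the sentence to pinyin with the tone number at the end of each syllable.
tx = lazy_pinyin('我抱着一大堆纸箱回了家', style=Style.TONE3, strict=False, tone_sandhi=True, neutral_tone_with_five=True)
print(tx)

# Split each syllable into its initial and final.
x = [get_shengmu_yunmu(w) for w in tx]
print(x)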

Running this code, we get:

['wo3', 'bao4', 'zhe5', 'yi1', 'da4', 'dui1', 'zhi3', 'xiang1', 'hui2', 'le5', 'jia1']
['w o3', 'b ao4', 'zh e5', 'y i1', 'd a4', 'd ui1', 'zh i3', 'x iang1', 'h ui2', 'l e5', 'j ia1']
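A TTS model usually consumes a flat sequence of phoneme symbols or integer ids rather than "initial final" strings. The following is a minimal sketch of that step, using the output above as input. The names `phonemes` and `symbol_to_id`, and the choice to reserve id 0 for padding, are assumptions made for illustration only.

```python
pairs = ['w o3', 'b ao4', 'zh e5', 'y i1', 'd a4', 'd ui1',
         'zh i3', 'x iang1', 'h ui2', 'l e5', 'j ia1']

# Flatten the "initial final" strings into one phoneme sequence.
phonemes = [p for pair in pairs for p in pair.split()]

# Give each distinct phoneme an integer id, starting at 1
# (id 0 is left free for padding, which is an assumption here).
symbol_to_id = {s: i for i, s in enumerate(sorted(set(phonemes)), start=1)}
ids = [symbol_to_id[p] for p in phonemes]

print(phonemes)
print(ids)
```

In practice the symbol table would be built once from the whole training corpus, not from a single sentence.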