SSML Extensions for Chinese Voice Browsing
Helen MENG, Wai-Kit LO, Tien-Ying FUNG, Yuk-Chi LI and Zhiyong WU
Human-Computer Communications Laboratory
Department of Systems Engineering and Engineering Management
The Chinese University of Hong Kong
2nd November, 2005
Outline
• Characteristics of Chinese
• Proposed attributes for existing elements
  – “dialect-accent”
• Proposed elements
  – <phrase> and <word>
  – <tone>
• Proposed attribute values
  – for the “interpret-as” attribute in the <say-as> element
• Summary
Characteristics of Chinese
• Rich in dialects, e.g., Cantonese, Shanghainese, Mandarin
  – Write alike, speak differently
    • similar writing system; e.g., 中国 and 中國
    • significantly different pronunciations
  – Mandarin with different accents
• No explicit phrase and word boundaries
  – e.g., 我 們 現 在 在 開 電 話 會 議
    (we are) (now) (having) (a teleconference)
  – proper segmentation is critical for prosodic control, pronunciation selection for homographs and resolution of semantic ambiguity
• Monosyllabic and tonal
  – Syllable + Lexical Tone → lexical meaning of Chinese character
    • tone can change according to meaning, context, mode of speaking
“dialect-accent” Attribute
• Specify dialects and accents in a language
  – use with xml:lang [XML1.0]
  – dialect-accent = primary-subtag ["-" optional-subtag]
  – primary-subtag = 2ALPHA
    • specify dialect
    • e.g., MD for Mandarin, CT for Cantonese
  – optional-subtag = 2ALPHA
    • specify accent
    • e.g., BJ for Beijing, GD for Guangdong, HK for Hong Kong
    • follows the abbreviations of Chinese provinces, autonomous regions and special administrative regions listed in the EDU.CN Domain Policy (中國教育和科研計算機網 EDU.CN 網絡域名註冊辦法) ¹
  – examples
    • Mandarin in Beijing and Guangdong accent: MD-BJ, MD-GD
    • Cantonese in Hong Kong and Guangdong accent: CT-HK, CT-GD
¹ Defined by the China Education and Research Network Information Centre (CERNET 網絡信息中心)
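The grammar above is small enough to check mechanically. The following is a minimal Python sketch, not part of the proposal: the names `parse_dialect_accent` and `DialectAccent` are hypothetical, and it only checks the shape of the value (two letters, optionally a hyphen and two more letters), not whether the subtags are actually registered.

```python
import re
from typing import NamedTuple, Optional

# dialect-accent = primary-subtag ["-" optional-subtag], each subtag = 2ALPHA
DIALECT_ACCENT = re.compile(r"([A-Za-z]{2})(?:-([A-Za-z]{2}))?")


class DialectAccent(NamedTuple):
    dialect: str            # primary subtag, e.g. "MD" or "CT"
    accent: Optional[str]   # optional subtag, e.g. "BJ", or None if absent


def parse_dialect_accent(value: str) -> DialectAccent:
    """Split a dialect-accent value into its dialect and accent subtags."""
    match = DIALECT_ACCENT.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid dialect-accent value: {value!r}")
    return DialectAccent(match.group(1), match.group(2))


print(parse_dialect_accent("MD-BJ"))  # Mandarin with Beijing accent
print(parse_dialect_accent("CT"))     # Cantonese, no accent given
```

A real processor would additionally look the subtags up in tables of known dialects and accents; that lookup is left out here because the slides only list a few example codes.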
我 (I am) 從 (from) 香港 (Hong Kong) 來的。 </p>
xml:lang value: zh-HK

“dialect-accent” value   Dialect     Accent
CT-HK                    Cantonese   Hong Kong
CT-GD                    Cantonese   Guangdong
MD-HK                    Mandarin    Hong Kong
MD-BJ                    Mandarin    Beijing
MD-TW                    Mandarin    Taiwan

Mandarin with Beijing accent
Mandarin with Guangdong accent
Cantonese with Hong Kong accent
<phrase> and <word> elements

Enrich <p>, <s> with <phrase>, <word>
• Current SSML 1.0: <p> and <s>
• Proposed elements: <phrase> and <word>
  – Serve as cues for prosodic control (e.g., pause)
  – Assist correct pronunciation selection for homographs
• A Cantonese example
  – The character 行 has FIVE pronunciations
Proposed <phrase> Element
• Definition:
  – Marks the extent of a Chinese phrase
  – No attributes
  – Occurs within <s>
  – These elements can be nested within <phrase>

Proposed <word> Element
• Definition:
  – Marks the extent of a Chinese word
  – No attributes
  – Occurs within <s> and <phrase>
  – These elements can be nested within <phrase>
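To make the nesting concrete, here is a small Python sketch that wraps an already segmented sentence in the proposed elements. The helper name `mark_up` is hypothetical, and the grouping of the teleconference sentence into two phrases of two words each is an assumption read off the English glosses on the earlier slide, not a segmentation given by the authors.

```python
from xml.sax.saxutils import escape


def mark_up(phrases):
    """Wrap a segmented sentence in <s>, <phrase> and <word> elements.

    `phrases` is a list of phrases, each phrase being a list of words.
    """
    parts = []
    for words in phrases:
        # <word> occurs within <phrase>; text is escaped for XML safety.
        inner = "".join(f"<word>{escape(word)}</word>" for word in words)
        parts.append(f"<phrase>{inner}</phrase>")
    # <phrase> occurs within <s>.
    return "<s>" + "".join(parts) + "</s>"


# Assumed segmentation: (we are) (now) | (having) (a teleconference)
segmented = [["我們", "現在"], ["在開", "電話會議"]]
print(mark_up(segmented))
```

The segmentation itself (deciding where words and phrases begin and end) is the hard part and is outside this sketch; the markup only records the result so that a synthesizer does not have to guess.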
• Tone
  – Important in Chinese pronunciation
  – Tones can vary according to differences in meaning, context and mode of speaking
  – 相
    • in tone 2 means photo
    • in tone 3 means facial appearance / minister
• Current SSML 1.0: phoneme
  – Requires pronunciation transcription
  – Example
• Specify as fractionSpecify as fraction– e.g. 3/4e.g. 3/4
• Verbalization of fraction in Chinese:Verbalization of fraction in Chinese:– with an additional word: with an additional word: 分之分之 ((out of)out of)– A / B (A / B (AA out ofout of BB): ): BB 分之分之 AA [note that the order is reversed!][note that the order is reversed!]
– e.g. e.g. 33//44 is verbalized as is verbalized as 四 (four) 分之分之 (out of)(out of) 三 (three)
• ““format” and “detail” attributes not requiredformat” and “detail” attributes not required• ExampleExample
我吃了 3/4 個橙(I) (ate) (orange)
我吃了 <say-as interpret-as="fraction">3/4</say-as> 個橙我吃了四分之三個橙 (I ate three-fourth of the orange)
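The reversed order is easy to get wrong, so a short Python sketch may help. It is only an illustration under stated assumptions: `to_chinese` and `verbalize_fraction` are hypothetical helper names, and the number reader covers 0 to 99 only.

```python
DIGITS = "零一二三四五六七八九"


def to_chinese(n: int) -> str:
    """Read an integer from 0 to 99 as a Chinese number (not digit by digit)."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0-99")
    if n < 10:
        return DIGITS[n]
    tens, ones = divmod(n, 10)
    head = "十" if tens == 1 else DIGITS[tens] + "十"
    return head + (DIGITS[ones] if ones else "")


def verbalize_fraction(text: str) -> str:
    """Verbalize 'A/B' as 'B 分之 A': the denominator is spoken first."""
    numerator, denominator = (int(part) for part in text.split("/"))
    return to_chinese(denominator) + "分之" + to_chinese(numerator)


print(verbalize_fraction("3/4"))  # 四分之三
```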
“measure” Value
• Specify as measurement
  – e.g. 10cm, 30ml
• measurement = number + unit
  – number [VoiceXML2.0]; e.g. 10 is ten (not one zero)
  – unit: translated and pronounced in Chinese, e.g. cm is 厘米, g is 克, oz is 安士, yd is 碼
• “format” and “detail” attributes not required
• Example
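The "number + unit" rule can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the unit table contains just the four units named on the slide, the number reader covers 0 to 99, and refinements such as reading 2 as 兩 before a unit are ignored.

```python
import re

DIGITS = "零一二三四五六七八九"
# Only the units listed on the slide; a real table would be much larger.
UNITS = {"cm": "厘米", "g": "克", "oz": "安士", "yd": "碼"}
MEASURE = re.compile(r"(\d+)\s*([A-Za-z]+)")


def to_chinese(n: int) -> str:
    """Read an integer from 0 to 99 as a number, so 10 is 十 (ten), not 一零."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0-99")
    if n < 10:
        return DIGITS[n]
    tens, ones = divmod(n, 10)
    head = "十" if tens == 1 else DIGITS[tens] + "十"
    return head + (DIGITS[ones] if ones else "")


def verbalize_measure(text: str) -> str:
    """Verbalize 'number + unit' with the unit translated into Chinese."""
    match = MEASURE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a measurement: {text!r}")
    number, unit = int(match.group(1)), match.group(2)
    if unit not in UNITS:
        raise ValueError(f"unit not in this sketch's table: {unit!r}")
    return to_chinese(number) + UNITS[unit]


print(verbalize_measure("10cm"))  # 十厘米
```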
• Specify as URI or email address
• Possible ways to verbalize a URI:
  1. Read the whole string in English, including punctuation
  2. Omit http:// (ftp://, etc.), read the rest in English
  3. Read letters in English, punctuation in Chinese
• “format” attribute value: “email” or “uri”
• Example
詳情請瀏覽 http://www.w3.org
(for details) (please) (browse)
  – Possible verbalizations:
    1. H T T P colon slash slash W W W dot W three dot O R G
    2. W W W dot W three dot O R G
    3. W W W 點 W 三 點 O R G (點: dot, 三: three)
[Similarly, the protocol part may be kept as another option]
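The three verbalization options can be compared side by side with a short Python sketch. Everything here is illustrative: `verbalize_uri` and its `style` parameter are hypothetical, the Chinese table only knows 點 for "." because that is the only Chinese punctuation reading shown on the slide, and style 3 drops the protocol as in the slide's example.

```python
import re

SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")
EN_DIGITS = ["zero", "one", "two", "three", "four",
             "five", "six", "seven", "eight", "nine"]
ZH_DIGITS = "零一二三四五六七八九"
EN_PUNCT = {".": "dot", ":": "colon", "/": "slash"}
ZH_PUNCT = {".": "點"}  # other Chinese readings are not given in the source


def spell(text, punct, digits):
    """Spell text character by character; letters are read as English letters."""
    out = []
    for ch in text:
        if ch.isdigit():
            out.append(digits[int(ch)])
        elif ch.isalpha():
            out.append(ch.upper())
        else:
            out.append(punct[ch])  # KeyError for punctuation not in the table
    return " ".join(out)


def verbalize_uri(uri: str, style: int) -> str:
    if style == 1:   # whole string in English, including punctuation
        return spell(uri, EN_PUNCT, EN_DIGITS)
    rest = SCHEME.sub("", uri)  # styles 2 and 3 omit "http://", "ftp://", etc.
    if style == 2:   # rest in English
        return spell(rest, EN_PUNCT, EN_DIGITS)
    if style == 3:   # letters in English, punctuation and digits in Chinese
        return spell(rest, ZH_PUNCT, ZH_DIGITS)
    raise ValueError("style must be 1, 2 or 3")


for style in (1, 2, 3):
    print(style, verbalize_uri("http://www.w3.org", style))
```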
• Specify as percentageSpecify as percentage• Verbalization of percentage in ChineseVerbalization of percentage in Chinese
– with an additional word: with an additional word: 百分之百分之 (out of a hundred)(out of a hundred)– A%: A%: 百分之百分之 A A – e.g. e.g. 7070%% is verbalized as is verbalized as 百 分 之百 分 之 (out of a hundred)(out of a hundred) 七 十七 十
(seventy)(seventy)
• ““format” and “detail” attributes not requiredformat” and “detail” attributes not required• ExampleExample
海洋約佔全球總面積的百分之七十 (ocean covers 70% of global surface)
(ocean) (covers) (global) (surface)
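As an illustration of the percentage rule, the following Python sketch expands percentages inside running text. The input sentence with "70%" is an assumed reconstruction of what the verbalized example above was derived from; the function names are hypothetical and the number reader covers 0 to 99 only.

```python
import re

DIGITS = "零一二三四五六七八九"


def to_chinese(n: int) -> str:
    """Read an integer from 0 to 99 as a Chinese number."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0-99")
    if n < 10:
        return DIGITS[n]
    tens, ones = divmod(n, 10)
    head = "十" if tens == 1 else DIGITS[tens] + "十"
    return head + (DIGITS[ones] if ones else "")


def expand_percentages(text: str) -> str:
    """Replace every 'A%' with '百分之 A': the marker word comes first."""
    return re.sub(r"(\d+)%",
                  lambda m: "百分之" + to_chinese(int(m.group(1))),
                  text)


print(expand_percentages("海洋約佔全球總面積的70%"))
```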
“ratio” Value
• Specify as ratio
  – e.g. 1:3
• Verbalization of ratio in Chinese:
  – with an additional word: 比 (to)
  – A:B (A to B): A 比 B
  – e.g. 1:99 is verbalized as 一 (one) 比 (to) 九十九 (ninety nine)
• “format” and “detail” attributes not required
• Example
用 1:99 的稀釋漂白水
(use) (diluted) (bleach water)
用 <say-as interpret-as="ratio">1:99</say-as> 的稀釋漂白水
用一比九十九的稀釋漂白水 (use diluted bleach at a ratio of 1:99)
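The ratio example can be traced end to end with a small Python sketch that replaces the <say-as> markup by its verbalization. This is a regular-expression illustration rather than a real SSML processor: the function names are hypothetical, the number reader covers 0 to 99, and dropping the spaces around the element (to match the verbalized line on the slide) is an assumption.

```python
import re

DIGITS = "零一二三四五六七八九"
SAY_AS_RATIO = re.compile(
    r'\s*<say-as interpret-as="ratio">\s*(\d+):(\d+)\s*</say-as>\s*'
)


def to_chinese(n: int) -> str:
    """Read an integer from 0 to 99 as a Chinese number."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0-99")
    if n < 10:
        return DIGITS[n]
    tens, ones = divmod(n, 10)
    head = "十" if tens == 1 else DIGITS[tens] + "十"
    return head + (DIGITS[ones] if ones else "")


def render_ratios(ssml: str) -> str:
    """Replace each ratio <say-as> element with 'A 比 B' (same order as written)."""
    def spoken(match):
        first, second = int(match.group(1)), int(match.group(2))
        return to_chinese(first) + "比" + to_chinese(second)
    return SAY_AS_RATIO.sub(spoken, ssml)


print(render_ratios('用 <say-as interpret-as="ratio">1:99</say-as> 的稀釋漂白水'))
```

Unlike fractions, the ratio keeps the written order of its two numbers, which is why no swap appears in `spoken`.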
Summary
• “dialect-accent” attribute to enrich the xml:lang attribute
• <phrase> and <word> for text processing
• <tone> for pronunciation
• 6 values for the “interpret-as” attribute