Page 45 - MSDN Magazine, June 2019
Figure 1 System.Speech.Synthesis Method
Another way to structure input and specify how to read it is to use Speech Synthesis Markup Language (SSML), which is a cross-platform recommendation developed by the international Voice Browser Working Group (w3.org/TR/speech-synthesis). Microsoft TTS engines provide comprehensive support for SSML. This is how to use it:
string phrase = @"<speak version=""1.0"" xmlns=""http://www.w3.org/2001/10/synthesis"" xml:lang=""en-US"">";
phrase += @"<say-as interpret-as=""ordinal"">3rd</say-as>";
phrase += @"<break time=""1s""/>";
phrase += @"<say-as interpret-as=""cardinal"">3rd</say-as>";
phrase += @"</speak>";
synthesizer.SpeakSsml(phrase);
Notice that it uses a different method of the SpeechSynthesizer class: SpeakSsml instead of Speak.
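Hand-escaping quotes in a verbatim string gets error-prone as the markup grows. As a minimal sketch, you could assemble the same SSML with System.Xml.Linq and pass the result to SpeakSsml. The SsmlSketch class and the SayAs and BuildPhrase helpers are hypothetical names introduced for this illustration; they aren't part of System.Speech.

```csharp
using System;
using System.Xml.Linq;

static class SsmlSketch
{
    // SSML namespace URI, the same one used in the string version above.
    static readonly XNamespace Ssml = "http://www.w3.org/2001/10/synthesis";

    // Hypothetical helper: wraps text in a <say-as interpret-as="..."> element.
    static XElement SayAs(string interpretAs, string text)
    {
        return new XElement(Ssml + "say-as",
            new XAttribute("interpret-as", interpretAs), text);
    }

    // Builds the ordinal / one-second pause / cardinal phrase as an SSML string.
    public static string BuildPhrase()
    {
        var speak = new XElement(Ssml + "speak",
            new XAttribute("version", "1.0"),
            new XAttribute(XNamespace.Xml + "lang", "en-US"),
            SayAs("ordinal", "3rd"),
            new XElement(Ssml + "break", new XAttribute("time", "1s")),
            SayAs("cardinal", "3rd"));

        // Single-line output; the XML writer handles quoting and escaping.
        return speak.ToString(SaveOptions.DisableFormatting);
    }

    static void Main()
    {
        // In the article's program you would call synthesizer.SpeakSsml(BuildPhrase()).
        Console.WriteLine(BuildPhrase());
    }
}
```

The benefit is that any text you append is escaped for you, which matters once the messages come from an external source.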
Now you’re ready to work on the prototype. This time create a new Windows Presentation Foundation (WPF) project. Add a form and a couple of buttons for prompts in two different languages. Then add click handlers as shown in the XAML in Figure 4.
Obviously, this is just a tiny prototype. In real life, PopulateMessages will probably read from an external resource. For example, a flight attendant can generate a file with messages in multiple languages by using an application that calls a service like Bing Translator (bing.com/translator). The form will be much more sophisticated and dynamically generated based on available languages.
Figure 1 System.Speech.Synthesis Method
using System.Speech.Synthesis;
namespace KeepTalking {
  class Program {
    static void Main(string[] args) {
      var synthesizer = new SpeechSynthesizer();
      // Send the speech to the default speakers or headphones
      synthesizer.SetOutputToDefaultAudioDevice();
      // Synchronous call: returns when the phrase has been spoken
      synthesizer.Speak("All we need to do is to make sure we keep talking");
    }
  }
}
When you were typing this code, IntelliSense opened a window with all the public methods and properties of the SpeechSynthesizer class. If you missed it, use “Control-Space” or the “dot” keyboard shortcut (or look at bit.ly/2PCWpat). What’s interesting here?
First, you can set different output targets. It can be an audio file or a stream or even null. Second, you have both synchronous (as in the previous example) and asynchronous output. You can also adjust the volume and the rate of speech, pause and resume it, and receive events. You can also select voices. This feature is important here, because you’ll use it to generate output in different languages. But what voices are available? Let’s find out, using the code in Figure 2.
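Before looking at voices, here's a minimal sketch of the output-target, rate and volume settings just mentioned. It writes the phrase to a WAV file instead of playing it. The WaveFileSketch class, the SaveToWave helper, the prompt.wav file name and the specific rate and volume values are assumptions chosen for illustration.

```csharp
using System;
using System.IO;
using System.Speech.Synthesis;

static class WaveFileSketch
{
    // Hypothetical helper: renders text to a WAV file at the given path.
    public static void SaveToWave(string text, string path)
    {
        using (var synthesizer = new SpeechSynthesizer())
        {
            synthesizer.Rate = -2;    // valid range is -10 (slowest) to 10 (fastest)
            synthesizer.Volume = 80;  // valid range is 0 to 100

            synthesizer.SetOutputToWaveFile(path);
            synthesizer.Speak(text);  // synchronous, so the file is complete on return

            // Detach from the file so the synthesizer releases it.
            synthesizer.SetOutputToNull();
        }
    }

    static void Main()
    {
        SaveToWave("All we need to do is to make sure we keep talking", "prompt.wav");
        Console.WriteLine("Wrote " + new FileInfo("prompt.wav").Length + " bytes");
    }
}
```

For the asynchronous path, you would call SpeakAsync instead of Speak and subscribe to the SpeakCompleted event to find out when the prompt has finished.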
On my machine with Windows 10 Home the resulting output from Figure 2 is:
Id: TTS_MS_EN-US_DAVID_11.0 | Name: Microsoft David Desktop | Age: Adult | Gender: Male | Culture: en-US
Id: TTS_MS_EN-US_ZIRA_11.0 | Name: Microsoft Zira Desktop | Age: Adult | Gender: Female | Culture: en-US
There are only two English voices available, and what about other languages? Well, each voice takes some disk space, so they’re not installed by default. To add them, navigate to Start | Settings | Time & Language | Region & Language and click Add a language, making sure to select Speech in optional features. While Windows supports more than 100 languages, only about 50 support TTS. You can review the list of supported languages at bit.ly/2UNNvba.
After you restart your computer, the new language pack should be available. In my case, after adding Russian, I got a new voice installed:
Id: TTS_MS_RU-RU_IRINA_11.0 | Name: Microsoft Irina Desktop | Age: Adult | Gender: Female | Culture: ru-RU
Now you can return to the first program and add these two lines instead of the synthesizer.Speak call:
synthesizer.SelectVoice("Microsoft Irina Desktop");
synthesizer.Speak("Всё, что нам нужно сделать, это продолжать говорить");
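Hardcoding a voice name works only on machines where that exact voice is installed; SelectVoice throws if it can't find a match. As a hedged alternative, the sketch below asks the synthesizer for any enabled voice that matches a culture and reports failure instead of throwing. VoicePicker and TrySelectVoice are hypothetical names introduced here.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Speech.Synthesis;

static class VoicePicker
{
    // Hypothetical helper: selects the first enabled voice for the culture.
    // Returns false when no such voice is installed.
    public static bool TrySelectVoice(SpeechSynthesizer synthesizer, CultureInfo culture)
    {
        var voice = synthesizer.GetInstalledVoices(culture)
                               .FirstOrDefault(v => v.Enabled);
        if (voice == null) return false;

        synthesizer.SelectVoice(voice.VoiceInfo.Name);
        return true;
    }

    static void Main()
    {
        using (var synthesizer = new SpeechSynthesizer())
        {
            synthesizer.SetOutputToDefaultAudioDevice();
            if (TrySelectVoice(synthesizer, new CultureInfo("ru-RU")))
                synthesizer.Speak("Всё, что нам нужно сделать, это продолжать говорить");
            else
                Console.WriteLine("No ru-RU voice found; add the language pack first.");
        }
    }
}
```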
If you want to switch between languages, you can insert SelectVoice calls here and there. But a better way is to add some structure to speech. For that, let’s use the PromptBuilder class, as shown in Figure 3.
Notice that you have to call EndVoice, otherwise you’ll get a runtime error. Also, I used CultureInfo as another way to specify a language. PromptBuilder has lots of useful methods, but I want to draw your attention to AppendTextWithHint. Try this code:
var builder = new PromptBuilder();
builder.AppendTextWithHint("3rd", SayAs.NumberOrdinal);
builder.AppendBreak();
builder.AppendTextWithHint("3rd", SayAs.NumberCardinal);
synthesizer.Speak(builder);
Figure 2 Voice Info Code
using System;
using System.Speech.Synthesis;
namespace KeepTalking {
  class Program {
    static void Main(string[] args) {
      var synthesizer = new SpeechSynthesizer();
      // List every TTS voice installed on this machine
      foreach (var voice in synthesizer.GetInstalledVoices()) {
        var info = voice.VoiceInfo;
        Console.WriteLine($"Id: {info.Id} | Name: {info.Name} | " +
          $"Age: {info.Age} | Gender: {info.Gender} | Culture: {info.Culture}");
      }
      // Keep the console window open until a key is pressed
      Console.ReadKey();
    }
  }
}
Figure 3 The PromptBuilder Class
using System.Globalization;
using System.Speech.Synthesis;
namespace KeepTalking {
  class Program {
    static void Main(string[] args) {
      var synthesizer = new SpeechSynthesizer();
      synthesizer.SetOutputToDefaultAudioDevice();
      var builder = new PromptBuilder();
      // Each StartVoice must be closed with a matching EndVoice
      builder.StartVoice(new CultureInfo("en-US"));
      builder.AppendText("All we need to do is to keep talking.");
      builder.EndVoice();
      builder.StartVoice(new CultureInfo("ru-RU"));
      builder.AppendText("Всё, что нам нужно сделать, это продолжать говорить");
      builder.EndVoice();
      synthesizer.Speak(builder);
    }
  }
}
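If you're curious how the PromptBuilder listing relates to SSML, PromptBuilder can return the markup it generates through its ToXml method. The following sketch builds the same two-language prompt and prints that markup rather than speaking it. The PromptXmlSketch class and BuildXml helper are hypothetical names, and the exact markup you see may vary by framework version.

```csharp
using System;
using System.Globalization;
using System.Speech.Synthesis;

static class PromptXmlSketch
{
    // Hypothetical helper: builds the two-language prompt and returns its SSML.
    public static string BuildXml()
    {
        var builder = new PromptBuilder();

        builder.StartVoice(new CultureInfo("en-US"));
        builder.AppendText("All we need to do is to keep talking.");
        builder.EndVoice();

        builder.StartVoice(new CultureInfo("ru-RU"));
        builder.AppendText("Всё, что нам нужно сделать, это продолжать говорить");
        builder.EndVoice();

        // ToXml returns the SSML generated from the builder's contents.
        return builder.ToXml();
    }

    static void Main()
    {
        Console.WriteLine(BuildXml());
    }
}
```

Inspecting the markup this way is handy when a prompt doesn't sound the way you expect and you want to see what was actually sent to the engine.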