Posted on 16 November, 2017
We recently rolled out an update to the web simulator
which includes a new SSML audio design experience.
We now give you more options for creating natural,
quality dialog using newly supported SSML tags, including <prosody>,
<emphasis>, <audio> and others. The new tag <par> is coming soon
and lets you add mood and richness, so you can play background music
and ambient sounds while a user is having a conversation with your app.
To help you get started, we've added over 1,000 sounds to the sound library.
Listen to a brief SSML audio experiment that shows off some of the new features here.
Until now, Actions on Google had limited SSML support, but as of today there's a bit more you can do with SSML to enhance your app's voice!
At the Devoxx Belgium conference last week, in a couple of talks covering Dialogflow,
Actions on Google, and Cloud Functions, I showed some quick examples of SSML.
For example, I made an attendee do some squats on stage! (Unfortunately, the camera didn't catch that.) I looped a tick-tock sound to mimic a countdown, repeating the sound x times with x audio elements. But we can do better now by using the repeatCount attribute instead!
<audio src="gs://my-bucket-sounds/tick-tock-1s.wav" repeatCount="10" />
It's much better than repeating my audio tag 10 times!
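If you build your SSML responses in code, the difference is easy to see. Here's a minimal sketch in Python that compares the two approaches; the helper names (repeated_audio, audio_with_repeat_count) are made up for illustration and aren't part of any Actions on Google library.

```python
from xml.sax.saxutils import quoteattr

# Same tick-tock clip as in the snippet above.
TICK_TOCK = "gs://my-bucket-sounds/tick-tock-1s.wav"


def repeated_audio(src: str, count: int) -> str:
    """Old approach: emit one <audio> element per repetition."""
    return "".join(f"<audio src={quoteattr(src)} />" for _ in range(count))


def audio_with_repeat_count(src: str, count: int) -> str:
    """New approach: a single <audio> element carrying repeatCount."""
    return (
        f"<audio src={quoteattr(src)} "
        f"repeatCount={quoteattr(str(count))} />"
    )


old_ssml = f"<speak>{repeated_audio(TICK_TOCK, 10)}</speak>"
new_ssml = f"<speak>{audio_with_repeat_count(TICK_TOCK, 10)}</speak>"

print(new_ssml)
```

Both strings describe the same countdown, but the second contains a single audio element no matter how many repetitions you ask for.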
If you want to make your interactions even more lively, you could already use the Actions on Google sound library, or use a free sound library like Freesound.
But there's a promising tag that's gonna be supported soon: <par/>
If you will, par is a bit like a multi-track audio mixer. You'll be able to play different sounds in parallel, or make the voice speak in parallel. So you could very well have a background sound or music, with your app speaking at the same time.
Speaking of voice, the human voice goes up and down in pitch. With the prosody element, you can set the rate, pitch, and volume attributes. For instance, I made the voice sing some notes by shifting the pitch in semitones (but to be honest, it doesn't quite sound like a real singer yet!)
<prosody rate="slow" pitch="-7st">C</prosody>
<prosody rate="slow" pitch="-5st">D</prosody>
<prosody rate="slow" pitch="-3st">E</prosody>
<prosody rate="slow" pitch="-2st">F</prosody>
<prosody rate="slow" pitch="0st">G</prosody>
<prosody rate="slow" pitch="+2st">A</prosody>
<prosody rate="slow" pitch="+4st">B</prosody>
<prosody rate="slow" pitch="+5st">C</prosody>
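Rather than working out each offset by hand, you could compute them. The sketch below, in Python, assumes G is the reference note at the voice's normal pitch, and derives every other note of the C major scale from its distance to G in semitones. The names C_MAJOR_STEPS and scale_ssml are hypothetical, just for this illustration. Since an octave spans 12 semitones and G sits 7 above the low C, the high C comes out 5 semitones above G.

```python
from typing import List

# Distance in semitones of each note of the C major scale from the low C.
C_MAJOR_STEPS = [
    ("C", 0), ("D", 2), ("E", 4), ("F", 5),
    ("G", 7), ("A", 9), ("B", 11), ("C", 12),
]


def scale_ssml(reference_step: int = 7, rate: str = "slow") -> List[str]:
    """Build one <prosody> element per note.

    reference_step is the note sung at the unshifted pitch
    (7 means G, an assumption matching the scale above).
    """
    lines = []
    for name, step in C_MAJOR_STEPS:
        offset = step - reference_step
        # "+d" always prints the sign, so the reference note becomes "+0st".
        lines.append(
            f'<prosody rate="{rate}" pitch="{offset:+d}st">{name}</prosody>'
        )
    return lines


for line in scale_ssml():
    print(line)
```

Changing reference_step transposes the whole scale up or down, which is handy if the default range sounds too low or too high for the chosen voice.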
You can also play with different levels of emphasis: