Step 4: Pausing and Resuming the Audio Stream

by Kaan Kilic on Feb 13, 2019

In this step, we will flesh out our podcast player and allow the user to pause and resume the audio file.

On Amazon Alexa

Currently, our podcast player can only start playing audio files, nothing else. It's time to change that. We will start with Amazon since they provide us with built-in intents, which we can use as a guideline later on.

The built-in intents for playback control are:

  • AMAZON.PauseIntent
  • AMAZON.ResumeIntent
  • AMAZON.CancelIntent
  • AMAZON.LoopOffIntent
  • AMAZON.LoopOnIntent
  • AMAZON.NextIntent
  • AMAZON.PreviousIntent
  • AMAZON.RepeatIntent
  • AMAZON.ShuffleOffIntent
  • AMAZON.ShuffleOnIntent
  • AMAZON.StartOverIntent

Two of these built-in intents, AMAZON.PauseIntent and AMAZON.ResumeIntent, are required. The others are technically optional, but our users can still invoke them. As Amazon says, we should handle the situation gracefully and return a short response explaining that the command is not supported; otherwise we will run into errors.

Updating the Jovo Language Model

Learn more about the Jovo Language Model here.

Before we can work with the AudioPlayer interface on an Alexa device, we have to prepare one more thing. As mentioned, two of these intents are required, so an Alexa Skill that has the AudioPlayer interface enabled must have both intents in its language model. Instead of using the Alexa Developer Portal to add those intents, we will use the Jovo Language Model, which allows us to maintain a single file that is used to create the platform-specific language models.

We can find the language model file in the /models folder of our Jovo project:

Jovo Folder Structure

Here's its default state:

Right next to the other Alexa built-in intents inside the alexa object, which is used to add platform-specific intents and input types, we will add the AMAZON.PauseIntent and AMAZON.ResumeIntent in the same format:

The content of each platform object will be used to extend the language model defined in the platform-independent part.
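As a rough sketch of where the two entries go, the snippet below models the alexa part of a language model file as a plain JavaScript object and appends the two required intents. The nesting (interactionModel.languageModel.intents) follows the Jovo Language Model format as we understand it; the invocation name and the existing intents are placeholders, and addRequiredAudioIntents is a hypothetical helper. In the real project, this is a manual edit to the JSON file in /models.

```javascript
// Hypothetical excerpt of a language model file, shown as a JS object.
// Only the platform-specific `alexa` part is modeled here.
const model = {
  invocation: 'my test app', // placeholder invocation name
  alexa: {
    interactionModel: {
      languageModel: {
        intents: [
          { name: 'AMAZON.CancelIntent', samples: [] },
          { name: 'AMAZON.HelpIntent', samples: [] },
          { name: 'AMAZON.StopIntent', samples: [] },
        ],
      },
    },
  },
};

// Hypothetical helper: appends the intents the AudioPlayer interface
// requires, skipping any that are already present.
function addRequiredAudioIntents(languageModel) {
  const intents = languageModel.alexa.interactionModel.languageModel.intents;
  for (const name of ['AMAZON.PauseIntent', 'AMAZON.ResumeIntent']) {
    if (!intents.some((intent) => intent.name === name)) {
      intents.push({ name, samples: [] });
    }
  }
  return languageModel;
}

addRequiredAudioIntents(model);
console.log(model.alexa.interactionModel.languageModel.intents.map((i) => i.name));
```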

We can also change the invocation name to my podcast player:

Now that our preparations are done, we can use the Jovo Language Model to create the platform-specific files and deploy them to the Amazon Developer Console right after:

This was just a small sneak peek into the Jovo Language Model. We will use it again in one of the next steps, where we will go more in depth.

Updating the Logic

After we have updated our language model, we need to add the intents to the logic of the code:

Adding AMAZON.PauseIntent

Open up the app.js file inside your src/ folder and add the AMAZON.PauseIntent intent to our handler.

Every time the intent is called we want to pause the audio stream by sending out a stop directive (learn more in the official reference by Amazon):
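A minimal sketch of such a handler could look like the following. The call this.$alexaSkill.$audioPlayer.stop() reflects the Jovo v2 AudioPlayer API as we understand it; createMockJovo is a hypothetical stand-in for the request context, added only so the snippet runs outside of a real project.

```javascript
// Hypothetical stand-in for the Jovo request context. It only records
// which directive the handler asked for.
function createMockJovo() {
  const directives = [];
  return {
    directives,
    $alexaSkill: {
      $audioPlayer: {
        stop() {
          directives.push({ type: 'AudioPlayer.Stop' });
        },
      },
    },
  };
}

// Handler excerpt as it might appear inside app.setHandler({...}).
const handler = {
  'AMAZON.PauseIntent'() {
    // Sends a stop directive, which pauses the current stream.
    this.$alexaSkill.$audioPlayer.stop();
  },
};

const jovo = createMockJovo();
handler['AMAZON.PauseIntent'].call(jovo);
console.log(jovo.directives);
```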

To test it out, we run our Jovo Webhook and say "Alexa, start my podcast player" to launch the app on our Amazon Echo. Once the audio starts playing, we just say "Alexa, pause" to invoke the AMAZON.PauseIntent.

That's it. We have successfully paused the audio file.

Adding AMAZON.ResumeIntent

Alright, our user can now pause the audio, but they also have to be able to resume where they left off. That's what the second required built-in intent, AMAZON.ResumeIntent, is for.

As we learned at the beginning of the course, we can specify the point at which the audio stream should start playing using the offset variable. To let the user resume, we simply save the offset to a database at the moment they stop the audio. We can then retrieve the offset in our ResumeIntent and start the audio back at the correct spot.

For this, we have to do the following steps:

An Introduction to Databases in Jovo

Learn more about different data types here.

As the session is closed after the PauseIntent, we have to find a way to persist data across sessions. This is where the Jovo Database Integrations can be helpful.

The default database is File DB, which is meant for local development and prototyping. Jovo will save the data inside a JSON file, which we can find in our project's root folder under db/db.json. To save and load data we use the Jovo Framework's user class:
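To illustrate the idea, here is a small sketch of how data stored on the user object survives from one request to the next. It assumes the Jovo v2 convention of reading and writing this.$user.$data; the fakeDb object and createMockJovo are hypothetical stand-ins for the database file and the request context.

```javascript
// Plays the role of the database file, keyed by user ID (hypothetical).
const fakeDb = {};

// Stand-in for the Jovo context: in a real project the framework loads
// `$user.$data` before the handler runs and saves it afterwards.
function createMockJovo(userId) {
  fakeDb[userId] = fakeDb[userId] || {};
  return { $user: { $data: fakeDb[userId] } };
}

// First request: store a value on the user object.
const firstRequest = createMockJovo('user-1');
firstRequest.$user.$data.offset = 42000;

// Later request from the same user: the value is still available.
const laterRequest = createMockJovo('user-1');
console.log(laterRequest.$user.$data.offset);
```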

Later on, when you prepare to launch your application and host your app on e.g. AWS Lambda, you have to switch to a different database, e.g. DynamoDB. Find all supported databases here.

Updating the PlaybackStopped Request Handler

The most convenient way to save the offset is with the help of the AlexaSkill.PlaybackStopped request.

The request will be sent if:

  • we stop the current stream and start a new one,
  • we stop the audio stream altogether,
  • the user pauses the current stream,
  • the user makes any other type of voice request, which pauses the audio for the moment. It will resume after the interaction has ended.

That JSON request will contain the data we need:
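For orientation, the request part of such a payload has roughly the shape below, written here as a JavaScript object. The field names follow Amazon's AudioPlayer interface reference; every value is made up for illustration.

```javascript
// Illustrative AudioPlayer.PlaybackStopped request; all values are invented.
const playbackStoppedRequest = {
  type: 'AudioPlayer.PlaybackStopped',
  requestId: 'amzn1.echo-api.request.example',
  timestamp: '2019-02-13T10:00:00Z',
  token: 'token',
  offsetInMilliseconds: 18340,
  locale: 'en-US',
};

// The offset tells us how far into the audio file the stream was stopped.
const offset = playbackStoppedRequest.offsetInMilliseconds;
console.log(`Stopped ${offset / 1000} seconds into the stream`);
```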

We save the offset to our database using one of the $audioPlayer class's built-in methods:
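A sketch of that handler, assuming the Jovo v2 method getOffsetInMilliseconds() and the convention of grouping audio player requests under an AUDIOPLAYER state, might look like this. The mock context is hypothetical and only returns a fixed offset.

```javascript
// Hypothetical stand-in for the Jovo context that reports a fixed offset.
function createMockJovo(offsetInMilliseconds) {
  return {
    $user: { $data: {} },
    $alexaSkill: {
      $audioPlayer: {
        getOffsetInMilliseconds() {
          return offsetInMilliseconds;
        },
      },
    },
  };
}

const handler = {
  AUDIOPLAYER: {
    'AlexaSkill.PlaybackStopped'() {
      // Persist the offset so a later request can resume from it.
      this.$user.$data.offset = this.$alexaSkill.$audioPlayer.getOffsetInMilliseconds();
    },
  },
};

const jovo = createMockJovo(18340);
handler.AUDIOPLAYER['AlexaSkill.PlaybackStopped'].call(jovo);
console.log(jovo.$user.$data);
```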

Now we add the AMAZON.ResumeIntent to our handler and retrieve the offset from the database.

But there's one more thing missing. How do we know which audio file to stream? We have to save not only the offset, but the current audio file's URL as well.

Saving and Retrieving the Current Audio File

The first step is to save the current episode before we send out the first play directive:
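One way to sketch this: store the URL on the user object right before the play directive goes out. The chained setOffsetInMilliseconds(...).play(url, token) call and this.tell(...) mirror the Jovo v2 API as we understand it; the episode URL, the speech text, and the mock context are hypothetical.

```javascript
// Hypothetical episode URL; replace it with a real audio file.
const FIRST_EPISODE = 'https://example.com/episode-one.mp3';

// Hypothetical stand-in for the Jovo context: records directives and speech.
function createMockJovo() {
  const jovo = {
    directives: [],
    speech: null,
    $user: { $data: {} },
    tell(text) {
      jovo.speech = text;
      return jovo;
    },
  };
  const audioPlayer = {
    offset: 0,
    setOffsetInMilliseconds(offset) {
      audioPlayer.offset = offset;
      return audioPlayer;
    },
    play(url, token) {
      jovo.directives.push({ type: 'AudioPlayer.Play', url, token, offset: audioPlayer.offset });
      return audioPlayer;
    },
  };
  jovo.$alexaSkill = { $audioPlayer: audioPlayer };
  return jovo;
}

const handler = {
  LAUNCH() {
    // Remember which episode is about to play before sending the directive.
    this.$user.$data.currentEpisode = FIRST_EPISODE;
    this.$alexaSkill.$audioPlayer
      .setOffsetInMilliseconds(0)
      .play(FIRST_EPISODE, 'token');
    this.tell('Enjoy the episode!');
  },
};

const jovo = createMockJovo();
handler.LAUNCH.call(jovo);
console.log(jovo.$user.$data.currentEpisode, jovo.directives);
```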

Besides that, we have to remember that we enqueue the next audio file, which has to become the current episode once the first one has finished playing. For that, we save the audio file we enqueue as nextEpisode and replace currentEpisode with nextEpisode as soon as the audio stream finishes, which the PlaybackFinished request notifies us about:
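The swap could be sketched as follows. We assume the enqueue happens in the PlaybackNearlyFinished handler and that setExpectedPreviousToken(...).enqueue(url, token) matches the Jovo v2 API; the URLs and the mock context are hypothetical.

```javascript
// Hypothetical URL of the episode that gets enqueued.
const SECOND_EPISODE = 'https://example.com/episode-two.mp3';

// Hypothetical stand-in for the Jovo context, backed by shared user data.
function createMockJovo(data) {
  const jovo = { directives: [], $user: { $data: data } };
  const audioPlayer = {
    setExpectedPreviousToken(token) {
      audioPlayer.previousToken = token;
      return audioPlayer;
    },
    enqueue(url, token) {
      jovo.directives.push({ type: 'AudioPlayer.Play', behavior: 'ENQUEUE', url, token });
      return audioPlayer;
    },
  };
  jovo.$alexaSkill = { $audioPlayer: audioPlayer };
  return jovo;
}

const handler = {
  AUDIOPLAYER: {
    'AlexaSkill.PlaybackNearlyFinished'() {
      // Remember what we enqueue so it can become the current episode later.
      this.$user.$data.nextEpisode = SECOND_EPISODE;
      this.$alexaSkill.$audioPlayer
        .setExpectedPreviousToken('token')
        .enqueue(SECOND_EPISODE, 'token');
    },
    'AlexaSkill.PlaybackFinished'() {
      // The enqueued file starts playing now, so it becomes the current one.
      this.$user.$data.currentEpisode = this.$user.$data.nextEpisode;
    },
  },
};

// Two separate requests that share the same persisted user data.
const data = { currentEpisode: 'https://example.com/episode-one.mp3' };
handler.AUDIOPLAYER['AlexaSkill.PlaybackNearlyFinished'].call(createMockJovo(data));
handler.AUDIOPLAYER['AlexaSkill.PlaybackFinished'].call(createMockJovo(data));
console.log(data.currentEpisode);
```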

Now we can finish implementing the AMAZON.ResumeIntent. Simply retrieve the current episode from the database and send out a play directive using that and the offset:
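Put together, a resume handler might look like the sketch below. It assumes the offset and currentEpisode keys were stored on the user object as described above; the mock context and the sample values are hypothetical.

```javascript
// Hypothetical stand-in for the Jovo context with previously saved user data.
function createMockJovo(data) {
  const jovo = { directives: [], $user: { $data: data } };
  const audioPlayer = {
    offset: 0,
    setOffsetInMilliseconds(offset) {
      audioPlayer.offset = offset;
      return audioPlayer;
    },
    play(url, token) {
      jovo.directives.push({ type: 'AudioPlayer.Play', url, token, offset: audioPlayer.offset });
      return audioPlayer;
    },
  };
  jovo.$alexaSkill = { $audioPlayer: audioPlayer };
  return jovo;
}

const handler = {
  'AMAZON.ResumeIntent'() {
    // Load what the earlier requests stored for this user.
    const { currentEpisode, offset } = this.$user.$data;
    // Restart the same file at the saved position.
    this.$alexaSkill.$audioPlayer
      .setOffsetInMilliseconds(offset)
      .play(currentEpisode, 'token');
  },
};

const jovo = createMockJovo({
  currentEpisode: 'https://example.com/episode-one.mp3',
  offset: 18340,
});
handler['AMAZON.ResumeIntent'].call(jovo);
console.log(jovo.directives);
```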

To test it out, simply pause the audio stream at some point and restart it with "Alexa, resume".

Remaining Alexa built-in Intents

There are still quite a few built-in intents remaining. For now, we will simply tell the user that these are not implemented and revisit at least two of them at a later point:
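Rather than writing nine near-identical handlers by hand, one option is to generate them, as in the sketch below. The helper buildUnsupportedHandlers and the response text are hypothetical; this.tell(...) is the Jovo method for a closing speech response. Writing each intent out explicitly in the handler works just as well.

```javascript
// The built-in playback intents we do not support yet.
const UNSUPPORTED_INTENTS = [
  'AMAZON.CancelIntent',
  'AMAZON.LoopOffIntent',
  'AMAZON.LoopOnIntent',
  'AMAZON.NextIntent',
  'AMAZON.PreviousIntent',
  'AMAZON.RepeatIntent',
  'AMAZON.ShuffleOffIntent',
  'AMAZON.ShuffleOnIntent',
  'AMAZON.StartOverIntent',
];

// Hypothetical helper: builds one handler per intent that returns a short
// "not supported" response.
function buildUnsupportedHandlers(intentNames) {
  const handlers = {};
  for (const name of intentNames) {
    handlers[name] = function () {
      this.tell('Sorry, that command is not supported yet.');
    };
  }
  return handlers;
}

// Spread the generated handlers next to the hand-written ones.
const handler = {
  ...buildUnsupportedHandlers(UNSUPPORTED_INTENTS),
};

// Hypothetical stand-in for the Jovo context that records the speech output.
const jovo = {
  speech: null,
  tell(text) {
    this.speech = text;
  },
};
handler['AMAZON.NextIntent'].call(jovo);
console.log(jovo.speech);
```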

On Google Assistant

Here's the deal: Implementing a resume and pause intent the same way won't work here. First of all, pausing an audio stream is handled by Google, so those requests don't reach our app's code. Resuming an audio stream at a certain point won't work either, because we can't specify the offset as we did with Alexa.

But there's one thing we can do. Instead of starting with the very first audio file every time, we will automatically start playing the audio file the user last listened to.

To do that, we have to additionally save the current episode in the GoogleAction.Finished intent:
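A possible shape for that handler is sketched below. We assume the Jovo v2 calls $mediaResponse.play(url, title), showSuggestionChips(...) and ask(...) for Google Actions; the episode URL, the title, the chip labels, and the mock context are hypothetical.

```javascript
// Hypothetical URL of the episode that follows the one that just finished.
const SECOND_EPISODE = 'https://example.com/episode-two.mp3';

// Hypothetical stand-in for the Jovo context on Google Assistant.
function createMockJovo(data) {
  const jovo = {
    played: [],
    chips: [],
    speech: null,
    $user: { $data: data },
    ask(text) {
      jovo.speech = text;
    },
  };
  jovo.$googleAction = {
    $mediaResponse: {
      play(url, title) {
        jovo.played.push({ url, title });
      },
    },
    showSuggestionChips(chips) {
      jovo.chips = chips;
    },
  };
  return jovo;
}

const handler = {
  AUDIOPLAYER: {
    'GoogleAction.Finished'() {
      // Save the episode we are about to play so LAUNCH can pick it up later.
      this.$user.$data.currentEpisode = SECOND_EPISODE;
      this.$googleAction.$mediaResponse.play(SECOND_EPISODE, 'episode two');
      this.$googleAction.showSuggestionChips(['pause', 'start over']);
      this.ask('Enjoy');
    },
  },
};

const jovo = createMockJovo({ currentEpisode: 'https://example.com/episode-one.mp3' });
handler.AUDIOPLAYER['GoogleAction.Finished'].call(jovo);
console.log(jovo.$user.$data.currentEpisode);
```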

Now we can check in our LAUNCH intent whether the request is from a new user. If that's the case, we start with the first track; otherwise we stream the one they listened to most recently:
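The check could be sketched like this for the Google Action side. It assumes this.$user.isNew() is available as in Jovo v2; the episode URLs, the fixed title, and the mock context are hypothetical.

```javascript
// Hypothetical URL of the very first episode.
const FIRST_EPISODE = 'https://example.com/episode-one.mp3';

// Hypothetical stand-in for the Jovo context on Google Assistant.
function createMockJovo(isNewUser, data) {
  const jovo = {
    played: [],
    speech: null,
    $user: {
      $data: data,
      isNew() {
        return isNewUser;
      },
    },
    ask(text) {
      jovo.speech = text;
    },
  };
  jovo.$googleAction = {
    $mediaResponse: {
      play(url, title) {
        jovo.played.push({ url, title });
      },
    },
    showSuggestionChips() {},
  };
  return jovo;
}

const handler = {
  LAUNCH() {
    // New users start at the beginning, returning users where they left off.
    const episode = this.$user.isNew()
      ? FIRST_EPISODE
      : this.$user.$data.currentEpisode;
    this.$user.$data.currentEpisode = episode;
    // The title is still hard-coded here, whichever episode is played.
    this.$googleAction.$mediaResponse.play(episode, 'episode one');
    this.$googleAction.showSuggestionChips(['pause', 'start over']);
    this.ask('Enjoy');
  },
};

const newUser = createMockJovo(true, {});
handler.LAUNCH.call(newUser);

const returningUser = createMockJovo(false, {
  currentEpisode: 'https://example.com/episode-two.mp3',
});
handler.LAUNCH.call(returningUser);
console.log(newUser.played[0].url, returningUser.played[0].url);
```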

As you can see, for the Google Action we still need to find a way to use different title parameters for different episodes. We will look into this in a later step.


Wow, that was quite a long step, but we can now successfully pause and resume episodes. Our handler should look like this:

Next Step

In the next step, we will build a system to store and retrieve the episodes of our podcast.

Step 5: Store and Retrieve Multiple Episodes

