Channel: Telerik Blogs

Speech Recognition Made Easy with Telerik RadControls for WP8


Introduction

Hello everyone! By now you have probably heard the news from //Build/ that Telerik has released its Windows Phone 8 controls. This release includes several new components:

  • Speech Recognition – Allows your end-users to navigate and interact with your app using their voice.
  • MultiResolutionImage – Windows Phone 8 supports new display resolutions up to 1280x768, and this control makes it easy to display the appropriate image for the device's resolution.
  • DataForms – This new component allows you to automatically generate the UI for your input forms, with support for validation and custom layouts.
  • Along with many others detailed here.

In this post, we are going to explore the Speech Recognition API provided by Telerik. It is currently in CTP (Community Tech Preview), and you can grab it now from your account. But before we begin, let’s take a look at the completed application; then we will build it step-by-step together.

Note: You may also download the completed source code now if you wish.




Support in the Official Windows Phone 8 SDK

Before we get started, let’s take a look at what is included in the official Windows Phone 8 SDK. There are three speech components that you can integrate with your app:

  • Voice Commands – After your user installs your application, they have the ability to say “open” or “start”, followed by your app name. They can also use the voice commands to deep link into a page inside your application.
  • Speech Recognition (the focus of this post) – After your app launches, you are required to use a GUI (Graphical User Interface) for speech recognition that provides visual feedback to users. While a GUI is required, you can use your own or the built-in one. It also requires that you set up the voice commands from C#/VB, which adds another layer of testing to be performed. Below is a sample of what the built-in GUI provides.

[Screenshots: the built-in speech recognition GUI]

While this is great for some applications, others may prefer a less intrusive way of obtaining speech data from the user. This is where Telerik RadControls for Windows Phone 8 comes in. No GUI is required, and you can declare voice commands in XAML instead of code-behind.

  • Text-to-Speech (TTS) – The last on the list; it simply allows your app to convert text to speech and output it through the phone’s speaker. This is also known as “Speech Synthesis”.
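For comparison, here is a rough sketch of what the SDK's GUI-based recognition and text-to-speech look like from C#. It assumes the Windows.Phone.Speech.Recognition and Windows.Phone.Speech.Synthesis namespaces from the Windows Phone 8 SDK; the class name SdkSpeechSketch, the method names, and the grammar key "navigation" are made up for illustration, and the code only runs inside a Windows Phone 8 project with the speech capabilities enabled.

```csharp
using System;                       // needed to await Windows Runtime async operations
using System.Threading.Tasks;
using Windows.Phone.Speech.Recognition;
using Windows.Phone.Speech.Synthesis;

// Hypothetical helper class, not part of the SDK or of the sample project.
public static class SdkSpeechSketch
{
    // Shows the built-in recognition GUI and returns the recognized phrase,
    // or null if the user cancelled or recognition did not succeed.
    public static async Task<string> RecognizeCommandAsync()
    {
        var recognizerUI = new SpeechRecognizerUI();

        // Restrict recognition to a short list of phrases.
        recognizerUI.Recognizer.Grammars.AddGrammarFromList(
            "navigation", new[] { "next", "previous" });
        recognizerUI.Settings.ExampleText = "next, previous";

        SpeechRecognitionUIResult result = await recognizerUI.RecognizeWithUIAsync();

        return result.ResultStatus == SpeechRecognitionUIStatus.Succeeded
            ? result.RecognitionResult.Text
            : null;
    }

    // Reads the given text aloud through the phone's speaker.
    public static async Task SpeakAsync(string text)
    {
        var synthesizer = new SpeechSynthesizer();
        await synthesizer.SpeakTextAsync(text);
    }
}
```

Note that both the phrase list and the handling of the result live in code-behind here, which is the part the Telerik API moves into XAML.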

Taking a look at the Telerik Speech Recognition API (CTP) in RadControls for Windows Phone 8.

Now that we have seen what Microsoft provides out of the box, let’s examine how to implement speech recognition using our API.

First, you will need to obtain RadControls for Windows Phone 8 and then download the SpeechRecognition DLL found in your account.

Begin by creating a new VS2012 Windows Phone 8 project and adding references to:

  1. Telerik.Windows.Controls.Primitives – Only required to use our RadSlideView control which is shown in the demo.
  2. Telerik.Windows.Controls.Speech – Required to use our Speech Recognition API.


You will also need to add the ID_CAP_SPEECH_RECOGNITION and ID_CAP_MICROPHONE capabilities in the app manifest. If you need help in learning how to do this, then consult this document.
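As a rough guide, the two capabilities end up as entries like these in the Capabilities section of WMAppManifest.xml (found under the project's Properties folder). This is a fragment only; a new project's manifest already lists other capabilities, which are omitted here.

```xml
<Capabilities>
  <!-- Other capabilities generated by the project template stay as they are. -->
  <Capability Name="ID_CAP_SPEECH_RECOGNITION" />
  <Capability Name="ID_CAP_MICROPHONE" />
</Capabilities>
```

You can also tick the same two entries on the Capabilities tab of the manifest designer in Visual Studio instead of editing the XML by hand.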

We are going to use the RadSlideView in this demo, so let’s begin by adding some images to our application. If you don’t have any images readily available, then download the completed project and use mine. We are going to create a folder called “Images” in the current project and add several images to this folder.

Let’s jump into the MainPage.xaml and declare our UI (User Interface)

Our UI is simply going to be the RadSlideView control taking up the entire screen with the Speech Recognition API added in. Let’s go ahead and add the proper XML namespaces as shown below:

    xmlns:telerikSpeech="clr-namespace:Telerik.Windows.Controls;assembly=Telerik.Windows.Controls.Speech"
    xmlns:telerikPrimitives="clr-namespace:Telerik.Windows.Controls;assembly=Telerik.Windows.Controls.Primitives"
    xmlns:speech="clr-namespace:TelerikSpeechRecognitionSampleApp.SpeechHandlers"

This is going to allow us to reference our controls in XAML, and we are also going to add a new SpeechHandlers class to separate the voice actions from our UI.

Let’s go ahead now and replace the existing Grid with the following code snippet.

    <Grid x:Name="ContentPanel" Grid.Row="1" Margin="12,0,12,0">
        <telerikPrimitives:RadSlideView x:Name="xSlideView" ItemsSource="{Binding}" TransitionMode="Flip" IsLoopingEnabled="True">
            <telerikPrimitives:RadSlideView.ItemTemplate>
                <DataTemplate>
                    <Image Source="{Binding}" Stretch="Fill" Margin="6,0,6,0"/>
                </DataTemplate>
            </telerikPrimitives:RadSlideView.ItemTemplate>
            <telerikSpeech:SpeechManager.SpeechMetadata>
                <telerikSpeech:SpeechRecognitionMetadata
                    InputIdentificationHint="Available commands: 'next' or 'previous'"
                    InputIdentificationToken="MainContent">
                    <telerikSpeech:SpeechRecognitionMetadata.InputHandler>
                        <speech:PreviousNextSpeechHandler/>
                    </telerikSpeech:SpeechRecognitionMetadata.InputHandler>
                    <telerikSpeech:SpeechRecognitionMetadata.RecognizableStrings>
                        <telerikSpeech:RecognizableString Value="next"/>
                        <telerikSpeech:RecognizableString Value="previous"/>
                    </telerikSpeech:SpeechRecognitionMetadata.RecognizableStrings>
                </telerikSpeech:SpeechRecognitionMetadata>
            </telerikSpeech:SpeechManager.SpeechMetadata>
        </telerikPrimitives:RadSlideView>
    </Grid>

In this sample, we are adding a RadSlideView and setting its ItemTemplate to a DataTemplate that contains a simple Image control. We are going to supply the images for the ItemsSource shortly.

Next, we declare the SpeechManager.SpeechMetadata and provide the text the application will say to the user upon execution. In this instance it will say, “Available commands: ‘next’ or ‘previous’” as shown in the video earlier in this blog post. We need to create some sort of InputHandler to perform actions on those commands and that is accomplished with the PreviousNextSpeechHandler. The two recognizable voice commands are next and previous. At this point, we can simply close our XAML tags and write some C#.

Let’s begin with MainPage.xaml.cs

This page simply adds the images to the RadSlideView and overrides the OnNavigatedTo and OnNavigatedFrom methods. This ensures that when the page is navigated to, the SpeechManager starts listening for voice input, and when the page is navigated away from, the SpeechManager is reset and its outstanding resources are cleaned up.

The entire code for this page is located below.

    public MainPage()
    {
        InitializeComponent();
        this.Loaded += new RoutedEventHandler(SlideView_Loaded);
    }

    // Build the image paths and hand them to the RadSlideView through its DataContext.
    void SlideView_Loaded(object sender, RoutedEventArgs e)
    {
        string[] uris = new string[8];
        for (int i = 0; i < 8; i++)
        {
            uris[i] = "Images/transitionsNew-" + (i + 1) + ".png";
        }

        xSlideView.DataContext = uris;
    }

    // Start listening for voice input when the page is navigated to.
    protected override void OnNavigatedTo(NavigationEventArgs e)
    {
        base.OnNavigatedTo(e);
        Telerik.Windows.Controls.SpeechManager.StartListening();
    }

    // Reset the SpeechManager and release its resources when leaving the page.
    protected override void OnNavigatedFrom(NavigationEventArgs e)
    {
        Telerik.Windows.Controls.SpeechManager.Reset();
        base.OnNavigatedFrom(e);
    }

Finishing up by adding the SpeechHandlers

Since you would typically want all the SpeechHandler code in a separate file, both for separation from the view and for easier readability when multiple handlers are used, we will create a folder in our main project called “SpeechHandlers” and add a class called PreviousNextSpeechHandler that implements the ISpeechInputHandler interface. After doing that, our Solution Explorer should look like the following, with the other steps added:

[Screenshot: Solution Explorer]

Double-click PreviousNextSpeechHandler.cs and replace the existing class with the following one:

    // Requires: using System; using System.Windows; using Telerik.Windows.Controls;
    public class PreviousNextSpeechHandler : ISpeechInputHandler
    {
        private const string NEXT_PHOTOS = "next";
        private const string PREVIOUS_PHOTOS = "previous";

        public bool CanHandleInput(string input)
        {
            return true;
        }

        public void HandleInput(FrameworkElement target, string input)
        {
            // The handler is attached to the RadSlideView; ignore any other target
            // instead of failing with a NullReferenceException.
            RadSlideView sv = target as RadSlideView;
            if (sv == null)
            {
                return;
            }

            if (string.Compare(NEXT_PHOTOS, input, StringComparison.InvariantCultureIgnoreCase) == 0)
            {
                sv.MoveToNextItem();
                // Resume listening for the next command.
                Telerik.Windows.Controls.SpeechManager.StartListening();
            }
            else if (string.Compare(PREVIOUS_PHOTOS, input, StringComparison.InvariantCultureIgnoreCase) == 0)
            {
                sv.MoveToPreviousItem();
                Telerik.Windows.Controls.SpeechManager.StartListening();
            }
        }

        public void NotifyInputError(FrameworkElement target)
        {
            MessageBox.Show("Error");
        }
    }

As we can see from the code snippet, implementing the ISpeechInputHandler interface means providing three methods: CanHandleInput, HandleInput and NotifyInputError (Visual Studio can generate the stubs for you). CanHandleInput returns true or false depending on whether you want the handler to process the input. HandleInput receives the recognized text, ‘next’ or ‘previous’ in our example, and performs an action based on the voice command. In this instance, it moves our RadSlideView to the next or previous image. Of course, you could do anything you want here, for example zooming in or out of an image, panning, etc. NotifyInputError is triggered on any type of input error. In this case I’m just showing a simple MessageBox, but you would probably want to log the error to a web service or similar.
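Once a handler grows beyond two commands, the chain of string comparisons gets harder to maintain. One possible approach, sketched below in plain C#, is a small lookup table that maps each phrase to an action. VoiceCommandTable and its members are hypothetical names invented for this example; they are not part of the Telerik API, and the class has no dependency on it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the Telerik API): maps spoken phrases to actions.
public sealed class VoiceCommandTable
{
    // Phrases are matched case-insensitively.
    private readonly Dictionary<string, Action> commands =
        new Dictionary<string, Action>(StringComparer.OrdinalIgnoreCase);

    // Associates a phrase with the action to run when it is recognized.
    public void Register(string phrase, Action action)
    {
        if (phrase == null) throw new ArgumentNullException("phrase");
        if (action == null) throw new ArgumentNullException("action");
        commands[phrase.Trim()] = action;
    }

    // Could back CanHandleInput: true only for registered phrases.
    public bool CanHandle(string input)
    {
        return input != null && commands.ContainsKey(input.Trim());
    }

    // Could back HandleInput: runs the matching action and reports whether one was found.
    public bool Handle(string input)
    {
        Action action;
        if (input == null || !commands.TryGetValue(input.Trim(), out action))
        {
            return false;
        }

        action();
        return true;
    }
}
```

Inside a handler you would register "next" and "previous" once, with actions that call MoveToNextItem and MoveToPreviousItem on the slide view, and then have CanHandleInput and HandleInput delegate to CanHandle and Handle. Adding a new command then only takes one more Register call plus a matching RecognizableString entry in the XAML.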

Conclusion

I hope this blog post helps clear up any confusion regarding Speech Recognition in Windows Phone 8. We are very excited about our Speech Recognition API and would encourage you to leave feedback. We have also extended the Telerik Picture Gallery sample app (with source code available) to support this new API. As stated earlier, this control is in CTP and we are working hard to make sure it is easy to use. Don’t forget that the source code for this project is available for download as well.

I would also like to take this opportunity and invite you to join the Nokia premium developer program to get RadControls for Windows Phone for free. They also have a vast variety of resources available to help you get your next app into the marketplace.

Thanks for reading!

-Michael Crump (@mbcrump)


About the author

Michael Crump - XAML RockStar!

Michael Crump

Michael Crump is a Microsoft MVP, INETA Community Champion, and an author of several .NET Framework eBooks. He speaks at a variety of conferences and has written dozens of articles on .NET development. He works at Telerik with a focus on our XAML control suite. You can follow him on Twitter at @mbcrump or keep up with his various blogs by visiting his Telerik Blog or his Personal Blog.

