
Android Voice Recognition Tutorial

You may have heard of Google Now, where you give a voice command and Android fetches the result for you. It recognizes your voice and either converts it into text or takes the appropriate action. Have you ever wondered how this is done? If your answer is the voice recognition API, you are absolutely right. While playing with the Android voice recognition APIs recently, I found that they are easy to integrate into an application. Below is a small tutorial on the voice/speech recognition API. The final application is a single screen with a prompt field, a Speak button, a spinner for the number of matches, and a list of the recognized text. The application may not work on the Android Emulator, because the emulator doesn’t support voice recognition, but it works on a physical phone.

Project Information: Meta-data about the project.

Platform Version : Android API Level 15.
IDE : Eclipse Helios Service Release 2
Emulator : Android 4.1(API 16)

Prerequisite: Basic knowledge of the Android application framework and Intents.

Voice recognition is available through RecognizerIntent. Create an Intent with the action RecognizerIntent.ACTION_RECOGNIZE_SPEECH, add the extra parameters, and start the activity for a result. This launches the recognizer prompt, customized by your extras. Internally, voice recognition communicates with a server to get the results, which is why this tutorial declares the internet access permission for the application. Android Jelly Bean (API level 16) can perform voice recognition without an internet connection. Once recognition is done, the recognizer returns its result through the parameters of the onActivityResult() method.
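
The result code passed to onActivityResult() tells you whether recognition succeeded or why it failed. The following is a minimal plain-Java sketch of that mapping, not part of the original project. The class name RecognitionResult and the describe() helper are hypothetical, and the numeric values are assumed to mirror Activity.RESULT_OK, Activity.RESULT_CANCELED and the RecognizerIntent.RESULT_* constants; in a real app, reference those constants directly instead of copying the numbers.

```java
// Plain-Java sketch with no Android dependency, so it can be run anywhere.
// In an Android app, use the constants on android.app.Activity and
// android.speech.RecognizerIntent instead of the local copies below.
final class RecognitionResult {
    // Assumed to mirror Activity.RESULT_OK and Activity.RESULT_CANCELED.
    static final int RESULT_OK = -1;
    static final int RESULT_CANCELED = 0;
    // Assumed to mirror the RecognizerIntent.RESULT_* constants.
    static final int RESULT_NO_MATCH = 1;
    static final int RESULT_CLIENT_ERROR = 2;
    static final int RESULT_SERVER_ERROR = 3;
    static final int RESULT_NETWORK_ERROR = 4;
    static final int RESULT_AUDIO_ERROR = 5;

    private RecognitionResult() {
    }

    /** Hypothetical helper: returns a short message for a result code. */
    static String describe(int resultCode) {
        switch (resultCode) {
            case RESULT_OK:
                return "OK";
            case RESULT_CANCELED:
                return "Canceled";
            case RESULT_NO_MATCH:
                return "No Match";
            case RESULT_CLIENT_ERROR:
                return "Client Error";
            case RESULT_SERVER_ERROR:
                return "Server Error";
            case RESULT_NETWORK_ERROR:
                return "Network Error";
            case RESULT_AUDIO_ERROR:
                return "Audio Error";
            default:
                return "Unknown result code: " + resultCode;
        }
    }
}
```

A lookup like this keeps the error handling in one place, so the activity only needs to show the returned message in a toast.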

First, create the project via Eclipse > File > New Project > Android Application Project. In the dialog box that appears, fill in the required fields (Application Name, Project Name and Package), then press the Next button.

In the next dialog box, select BlankActivity and click the Next button.

Fill in the Activity Name and Layout file name in the following dialog box and click the Finish button.

This process sets up the basic project files. Now we are going to add the widgets to the activity_voice_recognition.xml file: an EditText for the prompt, a Speak button, a Spinner for the number of matches, a TextView label and a ListView for the results. You can modify the layout file using either the Graphical Layout editor or the XML editor. The content of the file is shown below. Notice that the speak() method is attached to the button through the android:onClick attribute. When the button is clicked, speak() is executed. We will define speak() in the main activity.

<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:orientation="vertical" >

    <EditText
        android:id="@+id/etTextHint"
        android:gravity="top"
        android:inputType="textMultiLine"
        android:lines="1"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="@string/etSearchHint"/>

    <Button
        android:id="@+id/btSpeak"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:onClick="speak"
        android:padding="@dimen/padding_medium"
        android:text="@string/btSpeak"
        tools:context=".VoiceRecognitionActivity" />

    <Spinner
        android:id="@+id/sNoOfMatches"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:entries="@array/saNoOfMatches"
        android:prompt="@string/sNoOfMatches"/>

    <TextView
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="@string/tvTextMatches"
        android:textStyle="bold" />

    <ListView
        android:id="@+id/lvTextMatches"
        android:layout_width="match_parent"
        android:layout_height="wrap_content" />

</LinearLayout>

You may have noticed that the string constants are read from resources. Now add the string constants to strings.xml. This file should look similar to the one shown below.

<resources>
    <string name="app_name">VoiceRecognitionExample</string>
    <string name="btSpeak">Speak</string>
    <string name="menu_settings">Settings</string>
    <string name="title_activity_voice_recognition">Voice Recognition</string>
    <string name="tvTextMatches">Text Matches</string>
    <string name="sNoOfMatches">No of Matches</string>
    <string name="etSearchHint">Speech hint here</string>
    <string-array name="saNoOfMatches">
        <item>1</item>
        <item>2</item>
        <item>3</item>
        <item>4</item>
        <item>5</item>
        <item>6</item>
        <item>7</item>
        <item>8</item>
        <item>9</item>
        <item>10</item>
    </string-array>
</resources>
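
The spinner entries above are strings, and the activity later converts the selected entry to an int with Integer.parseInt() before passing it as the maximum number of results. As a hedged illustration, here is a small plain-Java sketch that does the same conversion defensively. The class name MatchCount, the parse() helper and the fallback parameter are hypothetical and not part of the original project; the 1 to 10 range is taken from the saNoOfMatches array.

```java
// Plain-Java sketch: converts a spinner entry such as "3" to an int,
// falling back to a default when the text is missing or not a number.
final class MatchCount {
    static final int MIN = 1;  // smallest entry in saNoOfMatches
    static final int MAX = 10; // largest entry in saNoOfMatches

    private MatchCount() {
    }

    /**
     * Hypothetical helper: parses the selected entry and clamps it
     * to the MIN..MAX range. Returns fallback for null or invalid text.
     */
    static int parse(String selected, int fallback) {
        if (selected == null) {
            return fallback;
        }
        try {
            int value = Integer.parseInt(selected.trim());
            return Math.max(MIN, Math.min(MAX, value));
        } catch (NumberFormatException e) {
            return fallback;
        }
    }
}
```

With the entries defined in this tutorial the plain Integer.parseInt() call is enough; the clamping only matters if the array is later edited to contain other values.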

Now let’s define the Activity class. With the help of the checkVoiceRecognition() method, the activity first checks whether voice recognition is available. If it is not, the activity shows a toast message and disables the button. The speak() method, also defined here, is called when the Speak button is pressed. In this method we create an Intent with the RecognizerIntent action and pass the extra parameters. The embedded comments in the code below explain each step.

package com.rakesh.voicerecognitionexample;

import java.util.ArrayList;
import java.util.List;

import android.app.Activity;
import android.app.SearchManager;
import android.content.Intent;
import android.content.pm.PackageManager;
import android.content.pm.ResolveInfo;
import android.os.Bundle;
import android.speech.RecognizerIntent;
import android.view.View;
import android.widget.AdapterView;
import android.widget.ArrayAdapter;
import android.widget.Button;
import android.widget.EditText;
import android.widget.ListView;
import android.widget.Spinner;
import android.widget.Toast;

public class VoiceRecognitionActivity extends Activity {
 private static final int VOICE_RECOGNITION_REQUEST_CODE = 1001;

 private EditText metTextHint;
 private ListView mlvTextMatches;
 private Spinner msTextMatches;
 private Button mbtSpeak;

 @Override
 public void onCreate(Bundle savedInstanceState) {
  super.onCreate(savedInstanceState);
  setContentView(R.layout.activity_voice_recognition);
  metTextHint = (EditText) findViewById(R.id.etTextHint);
  mlvTextMatches = (ListView) findViewById(R.id.lvTextMatches);
  msTextMatches = (Spinner) findViewById(R.id.sNoOfMatches);
  mbtSpeak = (Button) findViewById(R.id.btSpeak);
  checkVoiceRecognition();
 }

 public void checkVoiceRecognition() {
  // Check if voice recognition is present
  PackageManager pm = getPackageManager();
  List<ResolveInfo> activities = pm.queryIntentActivities(new Intent(
    RecognizerIntent.ACTION_RECOGNIZE_SPEECH), 0);
  if (activities.size() == 0) {
   mbtSpeak.setEnabled(false);
   mbtSpeak.setText("Voice recognizer not present");
   Toast.makeText(this, "Voice recognizer not present",
     Toast.LENGTH_SHORT).show();
  }
 }

 public void speak(View view) {
  Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);

  // Specify the calling package to identify your application
  intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, getClass()
    .getPackage().getName());

  // Display a hint to the user about what to say.
  intent.putExtra(RecognizerIntent.EXTRA_PROMPT, metTextHint.getText()
    .toString());

  // Give a hint to the recognizer about what the user is going to say.
  // There are two language models available:
  // 1. LANGUAGE_MODEL_WEB_SEARCH : for short phrases
  // 2. LANGUAGE_MODEL_FREE_FORM  : if not sure about the words or phrases and their domain
  intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
    RecognizerIntent.LANGUAGE_MODEL_WEB_SEARCH);

  // If the number of matches is not selected, show a toast message and return
  if (msTextMatches.getSelectedItemPosition() == AdapterView.INVALID_POSITION) {
   Toast.makeText(this, "Please select No. of Matches from spinner",
     Toast.LENGTH_SHORT).show();
   return;
  }

  int noOfMatches = Integer.parseInt(msTextMatches.getSelectedItem()
    .toString());
  // Specify how many results you want to receive. The results will be
  // sorted so that the first result is the one with the highest confidence.
  intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, noOfMatches);
  // Start the voice recognizer activity for the result.
  startActivityForResult(intent, VOICE_RECOGNITION_REQUEST_CODE);
 }

 @Override
 protected void onActivityResult(int requestCode, int resultCode, Intent data) {
  if (requestCode == VOICE_RECOGNITION_REQUEST_CODE) {

   // If voice recognition is successful then it returns RESULT_OK
   if (resultCode == RESULT_OK && data != null) {

    ArrayList<String> textMatchList = data
      .getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);

    if (textMatchList != null && !textMatchList.isEmpty()) {
     // If the first match contains the word 'search',
     // then start a web search.
     if (textMatchList.get(0).contains("search")) {

      String searchQuery = textMatchList.get(0);
      // Drop the command word and the surrounding whitespace
      searchQuery = searchQuery.replace("search", "").trim();
      Intent search = new Intent(Intent.ACTION_WEB_SEARCH);
      search.putExtra(SearchManager.QUERY, searchQuery);
      startActivity(search);
     } else {
      // Populate the matches
      mlvTextMatches.setAdapter(new ArrayAdapter<String>(this,
        android.R.layout.simple_list_item_1,
        textMatchList));
     }

    }
   // Result codes for the various errors
   } else if (resultCode == RecognizerIntent.RESULT_AUDIO_ERROR) {
    showToastMessage("Audio Error");
   } else if (resultCode == RecognizerIntent.RESULT_CLIENT_ERROR) {
    showToastMessage("Client Error");
   } else if (resultCode == RecognizerIntent.RESULT_NETWORK_ERROR) {
    showToastMessage("Network Error");
   } else if (resultCode == RecognizerIntent.RESULT_NO_MATCH) {
    showToastMessage("No Match");
   } else if (resultCode == RecognizerIntent.RESULT_SERVER_ERROR) {
    showToastMessage("Server Error");
   }
  }
  super.onActivityResult(requestCode, resultCode, data);
 }
 /**
 * Helper method to show the toast message
 **/
 void showToastMessage(String message){
  Toast.makeText(this, message, Toast.LENGTH_SHORT).show();
 }
}
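
The activity treats the top match as a search command whenever it contains the word "search" anywhere, which also fires for phrases such as "research papers". As a hedged alternative, the plain-Java sketch below only accepts "search" as a leading command word and returns the remaining text as the query. This is a different rule from the one used in the tutorial code, and the class name SearchCommand and the extractQuery() helper are hypothetical.

```java
import java.util.List;
import java.util.Optional;

// Plain-Java sketch with no Android dependency. It inspects only the
// first (highest confidence) match, like the tutorial's activity does.
final class SearchCommand {
    private static final String KEYWORD = "search";

    private SearchCommand() {
    }

    /**
     * Hypothetical helper: returns the query that follows a leading
     * "search" command word, or an empty Optional when the top match
     * is not a search command or has no query text.
     */
    static Optional<String> extractQuery(List<String> matches) {
        if (matches == null || matches.isEmpty() || matches.get(0) == null) {
            return Optional.empty();
        }
        String top = matches.get(0).trim();
        // Case-insensitive check that the text starts with the keyword.
        if (!top.regionMatches(true, 0, KEYWORD, 0, KEYWORD.length())) {
            return Optional.empty();
        }
        String rest = top.substring(KEYWORD.length());
        // Require a word boundary so "searching" is not treated as a command.
        if (!rest.isEmpty() && !Character.isWhitespace(rest.charAt(0))) {
            return Optional.empty();
        }
        String query = rest.trim();
        return query.isEmpty() ? Optional.empty() : Optional.of(query);
    }
}
```

In the activity, a non-empty result could be passed as SearchManager.QUERY to the web search Intent, and an empty result could fall through to populating the list.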

Here is the Android manifest file. The INTERNET permission is declared for the application because the voice recognizer needs to send the query to the server and get the result.

<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.rakesh.voicerecognitionexample"
    android:versionCode="1"
    android:versionName="1.0" >

    <uses-sdk
        android:minSdkVersion="8"
        android:targetSdkVersion="15" />
    <!-- Permissions -->
    <uses-permission android:name="android.permission.INTERNET" />

    <application
        android:icon="@drawable/ic_launcher"
        android:label="@string/app_name"
        android:theme="@style/AppTheme" >

        <activity
            android:name=".VoiceRecognitionActivity"
            android:label="@string/title_activity_voice_recognition" >
            <intent-filter>
                <action android:name="android.intent.action.MAIN" />

                <category android:name="android.intent.category.LAUNCHER" />
            </intent-filter>
        </activity>
    </application>
</manifest>

Once you are done coding, connect the phone to your computer and hit the Run button in the Eclipse IDE. Eclipse will install and launch the application on the device.

In the next tutorial, we will learn how to use the new voice recognition API introduced in Android Jelly Bean (API level 16), with examples.

If you are interested in the source code, you can get it from GitHub.

Reference: Tutorial on Android Voice recognition from our JCG partner Rakesh Cusat at the Code4Reference blog.

22 Comments
Paolo Martinello
11 years ago

Hi, very nice tutorial.
Is there a way to programmatically use the new off line voice typing of JellyBean?
I can see on my tablet that when tapping on keyboard microphone is still possible to get speech even without net connection.

What I need to do is to tap onto a notification to start listening and then to use somehow this text to make some action like web surfing or similar.

Thanks in advance for your help!

Andrew Teeluck
11 years ago

I want to count the number of times the user says the same thing, such as a chant. Is it possible to detect the number of chants and return the count? Preferably without knowing what the words will be (so it’s compatible to any kind of chant). Is this possible?

Thanks!

Mar Cial R
11 years ago

Great tut!

rishi
11 years ago

need a quick reply
this code gives an error at onCreate bundle (btSpeak etc ) it says can not be resolved or not a field please help

張祐維
11 years ago
Reply to  rishi

maybe you can try to import other package

張祐維
11 years ago
Reply to  rishi

import android.widget.Button;
import android.widget.EditText;
import android.widget.TextView;

I import these package and it works
maybe the problem is due to different android version

建誌 林
11 years ago

if I want to open other applications on the phone ,how could i do

JJ
11 years ago

Hi, is there a way where instead of showing a list view after speaking, can the result show it directly on the text field? i need this badly so that after i get the result into the text field, i can translate it to another language. Thanks

rr
11 years ago

Grate tutorial but i have a question .How can i make that the user will be able to chose which is the correct word from the list and then that word will go to google .Cuz right now it takes only the first word from the list .ANy sugestions ?

prasanth
11 years ago

Nice tut… but i can’t get web search.. Every thing is correct.. permission also there.. my speech word is not opens the web search… Why???

tuyenkhuc
10 years ago
Reply to  prasanth

Hi,
It sames problem as me. I can not open the web browse to surf on the internet. Anyhelp here ?

malitha
9 years ago
Reply to  prasanth

go through MindMeld Expectlabs

Malai
11 years ago

Good Post,

Is that gitup source code downloadable ?

Hitesh
11 years ago

Thank you :)

Victor
10 years ago

Hi,

Thank for great TUT.
Although the TUT is clear, but the code still get some error. They’re simple error so I can correct them, but when I ran the app, it stopped right away with message “unfortunately, app has stopped”
I don’t know what is possibly the error make the app stopped! So please help me with some suggestion.

Thank in advance!

Darshan Panse
10 years ago

Sir,
I want to use a voice recognition service instead of activity.
Is that possible? Please help me with this.
I want to create a service such that whenever i open my screen lock, the service starts and i can say (for eg:) call dad and it directly calls dad without any user interface.
Thanks for your help in advance.

touta
9 years ago
Reply to  Darshan Panse

hello , i’m searching the same things please if you have any things to help please ;)

Neeraj
10 years ago

Hello Tut,

I’m planning to convert Voice/ Video Call Speech to text in real-time.
Can you suggest more about it?

Thanks!

mariam
9 years ago

hello , please can you make a code to lunch your application with voice command ?

touta
9 years ago
Reply to  mariam

you found any things?

VU Phan
9 years ago

Great Tut but is posible to detect characters from voice in offline mode?
Do you get some suggest? Thanks

Shahid Alam
7 years ago

hi i am workin on soud detetction if you have some idea please share with me……
i want to detect accident sound..
