Voice to Text App for Android Tutorial

Voice To Text services are all the rage these days.  Just look at the VUI (Voice User Interface) systems many of us have in our homes like Google Home and Alexa.  Industry leaders like Google, Amazon, Apple, and Microsoft are all betting on voice assistants by creating interfaces like Google Assistant, Siri, Alexa, OneNote dictation and Cortana. These systems wouldn’t be possible without some kind of underlying technology that is able to convert our spoken audio into discernible text. 

If you’ve ever wanted to integrate speech to text into your applications, you might have thought that it would be difficult or complicated. However, using Google’s Speech to Text API, it’s actually quite simple.  Many apps out there use the speech recognition feature, such as Google Keep. In this article, we’ll learn how to build a simple speech to text app on Android using the speech recognizer library provided by Google.  By the end of this short tutorial, you’ll have a working version of a voice to text dictation app!

Prerequisites

  • Android Studio IDE downloaded and configured on your PC or Mac.
  • An Android Smartphone or Tablet (Unfortunately, Voice Recognition does not work on an emulator).
  • Basic Android knowledge like how to create a layout and run a program.

Step 1.  Create a Simple UI

To get started, create a new project using the “Create New Project” wizard in Android Studio.  I’ll name my project “Speech Recognizer”, but you can name it whatever you like.

Start a New Project in Android Studio

I’ll also choose to create an empty activity for this example.

Android Studio Select Blank Activity

 

Android Interface with Microphone Button

For the purpose of this simple speech to text tool, let’s create a new Activity with only two elements on the page:  a TextView and a Button or ImageButton.  We’ll use the button to turn listening for speech on and off, while the TextView will display our converted speech text on the screen.  In this example I’ve also wrapped the TextView in a ScrollView, just in case there is enough text to fill the page!  The interface looks pretty simple, but will do just fine for what we need.

Currently, my activity_main.xml file looks like this:

<?xml version="1.0" encoding="utf-8"?>
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="match_parent">

    <ScrollView
        android:layout_width="match_parent"
        android:layout_height="279dp">

        <TextView
            android:id="@+id/textView"
            android:layout_width="match_parent"
            android:layout_height="wrap_content"
            android:layout_margin="10dp"
            android:layout_marginTop="20dp"
            android:textSize="25sp" />
    </ScrollView>

    <ProgressBar
        android:layout_width="0dp"
        android:layout_height="0dp"
        android:id="@+id/progressbar"
        />

    <ImageButton
        android:id="@+id/recordButton"
        android:layout_width="100dp"
        android:layout_height="100dp"
        android:layout_alignParentBottom="true"
        android:layout_centerHorizontal="true"
        android:layout_marginBottom="112dp"
        android:background="@drawable/microphone_button_off" />

</RelativeLayout>

Step 2. Implement RecognitionListener in our Main Activity

Now that you have a UI with a TextView and Button, it’s time to get into the meat of the code.  To get our speech recognizer to work, we’ll need to implement RecognitionListener.  This interface lets us use Google’s voice to text engine and add our own custom actions at different points in the voice recognition life cycle.  To do this, change the class declaration in your MainActivity.java to:

public class MainActivity extends AppCompatActivity implements
        RecognitionListener {

 

Like I said before, RecognitionListener declares all of the callback methods, which you can fill in as you like.  In our Main Activity, we will also want to add the following code (we’ll fill in the required functions later).




//We'll need this to ask the user for permission to record audio
@Override
public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) {
	super.onRequestPermissionsResult(requestCode, permissions, grantResults);
}

@Override
public void onResume() {
	super.onResume();
}

@Override
protected void onPause() {
	super.onPause();
}

@Override
protected void onStop() {
	super.onStop();
}

//Executes when the user has started to speak.
@Override
public void onBeginningOfSpeech() {
	Log.i(LOG_TAG, "onBeginningOfSpeech");
}

//Executes when more sound has been received.
@Override
public void onBufferReceived(byte[] buffer) {
	Log.i(LOG_TAG, "onBufferReceived: " + buffer.length + " bytes");
}

//Called after the user stops speaking.
@Override
public void onEndOfSpeech() {

}

//Called when any error has occurred
@Override
public void onError(int errorCode) {
	String errorMessage = getErrorText(errorCode);
	Log.d(LOG_TAG, "FAILED " + errorMessage);
}

//Called when partial recognition results are available.
@Override
public void onPartialResults(Bundle arg0) {
	Log.i(LOG_TAG, "Partial Results");
}

//Called when the endpointer is ready for the user to start speaking.
@Override
public void onReadyForSpeech(Bundle arg0) {
	Log.i(LOG_TAG, "Ready For Speech");
}

//Called when recognition results are ready.
@Override
public void onResults(Bundle results) {
	Log.i(LOG_TAG, "Results");
}

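//The sound level in the audio stream has changed.
@Override
public void onRmsChanged(float rmsdB) {
	Log.i(LOG_TAG, "RMS Changed: " + rmsdB);
	progressBar.setProgress((int) rmsdB);
}

//Reserved for future events; RecognitionListener still requires it.
@Override
public void onEvent(int eventType, Bundle params) {
}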
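The onError() callback above calls getErrorText(), which the listing does not define. Here is a minimal sketch of what such a helper could look like. The class name SpeechErrors and the message strings are my own choices for illustration, and the constants are copied locally so the sketch runs on a plain JVM; in the app you would switch on the SpeechRecognizer.ERROR_* constants directly.

```java
// Hypothetical helper: turns a speech recognizer error code into readable text.
final class SpeechErrors {
    // Local copies of the values of android.speech.SpeechRecognizer.ERROR_*,
    // kept here only so this sketch compiles without the Android SDK.
    static final int ERROR_NETWORK_TIMEOUT = 1;
    static final int ERROR_NETWORK = 2;
    static final int ERROR_AUDIO = 3;
    static final int ERROR_SERVER = 4;
    static final int ERROR_CLIENT = 5;
    static final int ERROR_SPEECH_TIMEOUT = 6;
    static final int ERROR_NO_MATCH = 7;
    static final int ERROR_RECOGNIZER_BUSY = 8;
    static final int ERROR_INSUFFICIENT_PERMISSIONS = 9;

    private SpeechErrors() {}

    // Maps each known code to a short message; unknown codes are reported as-is.
    static String getErrorText(int errorCode) {
        switch (errorCode) {
            case ERROR_NETWORK_TIMEOUT: return "Network timeout";
            case ERROR_NETWORK: return "Network error";
            case ERROR_AUDIO: return "Audio recording error";
            case ERROR_SERVER: return "Error from server";
            case ERROR_CLIENT: return "Client side error";
            case ERROR_SPEECH_TIMEOUT: return "No speech input";
            case ERROR_NO_MATCH: return "No match";
            case ERROR_RECOGNIZER_BUSY: return "Recognition service busy";
            case ERROR_INSUFFICIENT_PERMISSIONS: return "Insufficient permissions";
            default: return "Unknown error (" + errorCode + ")";
        }
    }
}
```

In the activity, the body of getErrorText() can be pasted in as a private method so that onError() compiles as written.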




Step 3. Requesting Permission

One important aspect of the speech recognizer is that it needs the microphone to get any sound (obviously). Any time an app uses a feature that can record the user or access their personal information, such as the microphone, the camera, or files on the device, the user must grant the app permission first. The same goes here: we have to prompt the user for permission to access the microphone before the Voice to Text engine will work. Ideally, we’ll first ask the system whether the user has already granted permission, and only prompt them if they have not.

 
    @Override
    public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults);
        switch (requestCode) {
            case REQUEST_RECORD_PERMISSION:
                if (grantResults.length > 0 && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
                    speech.startListening(recognizerIntent);
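                } else {
                    // Undo the button state we set when the user tapped record
                    recordButtonStatus = false;
                    recordButton.setBackground(getDrawable(R.drawable.microphone_button_off));
                    Toast.makeText(MainActivity.this, "Permission Denied!",
                            Toast.LENGTH_SHORT).show();
                }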
        }
    }

In addition to this piece of code, you will also need to add the following lines to your `AndroidManifest.xml`:

<uses-permission android:name="android.permission.RECORD_AUDIO"/>
<uses-permission android:name="android.permission.INTERNET" />

Step 4.  Triggering the listener to begin

Now that we’ve got our permissions function started, it’s time to add a trigger when the user clicks our button.  In this section, we also need to declare our variables and set up the recognizer in the `onCreate()` function:

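// Fields on MainActivity. The LOG_TAG text and the request code value are
// arbitrary choices made for this example.
private static final String LOG_TAG = "SpeechRecognizer";
private static final int REQUEST_RECORD_PERMISSION = 100;
private TextView returnedText;
private ImageButton recordButton;
private ProgressBar progressBar;
private SpeechRecognizer speech;
private Intent recognizerIntent;
private boolean recordButtonStatus;

// Inside onCreate(), after setContentView(R.layout.activity_main):
returnedText = findViewById(R.id.textView);
recordButton = findViewById(R.id.recordButton);
recordButtonStatus = false;

progressBar = findViewById(R.id.progressbar);

speech = SpeechRecognizer.createSpeechRecognizer(this);
Log.i(LOG_TAG, "isRecognitionAvailable: " + SpeechRecognizer.isRecognitionAvailable(this));
speech.setRecognitionListener(this);
recognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
// EXTRA_LANGUAGE is the extra that requests a recognition language
recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "en");
recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
recognizerIntent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 3);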

recordButton.setOnClickListener(new View.OnClickListener() {
	@Override
	public void onClick(View v) {
		if (recordButtonStatus) {
			recordButtonStatus = false;
			progressBar.setIndeterminate(false);
			progressBar.setVisibility(View.INVISIBLE);
			speech.stopListening();
			recordButton.setBackground(getDrawable(R.drawable.microphone_button_off));
		} else {
			ActivityCompat.requestPermissions(
					MainActivity.this,
					new String[]{Manifest.permission.RECORD_AUDIO},
					REQUEST_RECORD_PERMISSION);
			recordButtonStatus = true;
			recordButton.setBackground(getDrawable(R.drawable.microphone_button_on));
		}
	}
});

Great! Now, when I build the app, start it, and tap the button, I should be prompted for permission to record audio.
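The button logic is easier to reason about if you think of it as a small state machine: a tap asks for permission, the permission callback decides whether listening really starts, and a second tap stops it. Here is a rough sketch of that flow in plain Java with no Android classes. The RecordToggle class and its method names are hypothetical and are not part of the app above; they only model the behavior.

```java
// Hypothetical model of the record button's state; no Android classes involved.
final class RecordToggle {
    enum State { IDLE, WAITING_FOR_PERMISSION, LISTENING }

    private State state = State.IDLE;

    State state() {
        return state;
    }

    // Button tapped: from idle, ask for permission first; otherwise stop.
    void onButtonClicked() {
        state = (state == State.IDLE) ? State.WAITING_FOR_PERMISSION : State.IDLE;
    }

    // Permission callback: start listening only if the user said yes.
    void onPermissionResult(boolean granted) {
        if (state != State.WAITING_FOR_PERMISSION) {
            return; // nobody is waiting for this answer any more
        }
        state = granted ? State.LISTENING : State.IDLE;
    }

    // Recognizer finished (results or error): back to idle.
    void onRecognitionFinished() {
        state = State.IDLE;
    }
}
```

Mapped onto the activity, LISTENING is the only state in which the microphone button should show its "on" drawable, so a denied permission never leaves the button stuck in the "on" state.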

Step 5. Add a function to handle the listener output

Now that we know how to trigger the listener to start listening, the speech recognizer does a lot on its own. However, we will want to tell the recognizer where to put the text once we’ve finished our audio input. The correct place to do that is in the `onResults()` function. In this case, we simply want to display the transcribed text in our `TextView`.

The speech recognizer actually returns an `ArrayList` of possible results for the speech, with the most relevant result in front. So, we’ll just take the result in position 0 and display that.

	
    @Override
    public void onResults(Bundle results) {
        Log.i(LOG_TAG, "Results");
        ArrayList<String> matches = results
                .getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);

        // The list can be null or empty if nothing was recognized
        if (matches != null && !matches.isEmpty()) {
            returnedText.setText(matches.get(0));
        }

        recordButtonStatus = false;
        recordButton.setBackground(getDrawable(R.drawable.microphone_button_off));
    }
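If you want to be a bit more careful than "take position 0", the results Bundle can also carry a float array of confidence values under SpeechRecognizer.CONFIDENCE_SCORES. Here is a small sketch of a helper that uses those scores when they line up with the matches and otherwise falls back to the first entry. The RecognitionResults class and pickBest() are hypothetical names for this example, and the helper is plain Java so it can be tried outside the app.

```java
import java.util.List;

// Hypothetical helper for choosing which recognition result to display.
final class RecognitionResults {
    private RecognitionResults() {}

    // Returns the match with the highest score, the first match when scores
    // are missing or do not line up, or "" when there are no matches at all.
    static String pickBest(List<String> matches, float[] scores) {
        if (matches == null || matches.isEmpty()) {
            return "";
        }
        if (scores == null || scores.length != matches.size()) {
            return matches.get(0);
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i; // ties keep the earlier (more relevant) entry
            }
        }
        return matches.get(best);
    }
}
```

Inside onResults(), the two arguments would come from results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION) and results.getFloatArray(SpeechRecognizer.CONFIDENCE_SCORES).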

At this point, I should be able to run my simple speech recognizer and see my words printed on the screen. While this is a very simple example, there is a lot more you can do with this technology.

Conclusion

Now you have a basic voice to text app built on your smartphone or tablet!  If you’re like me, you can probably think of some ways that this might be useful.  Perhaps you can make a note taking application for students. Perhaps someone could use this technology when interviewing.  Better yet, maybe speech to text could be a feature in an app you are building to use custom voice commands, similar to how Google Keep and OneNote use Speech Recognition technology.

Android Speech Recognizer Screen

If you’d like to see a basic Note Taking app I added to the Google Play store that is built on top of the code here, please visit the Google Play Listing, and feel free to leave suggestions on further ways to improve the app in the comments!  If you liked what you learned here, you might also enjoy my article on Speeding up your Android WebView with SQLite transactions, or my article on ideas for lucrative side gigs.

Source code for what we learned here can be found here.

SQLite Bulk Transaction functions for Android and iOS

In a previous article I wrote on SQLite performance techniques, I explained that using SQLite bulk transactions, especially in a WebView application on your mobile app, can be a huge performance booster.  In this article, I’d like to expand a little more and offer some basic code solutions for your Android and iOS applications.

Let’s start with a simple example.  Let’s say, for instance, that you are creating a mobile WebView-based application where you are pulling a large amount of data from a back-end service or API and inserting the rows into your local SQLite database.  For this example, let’s say we are pulling in a list of coffee shops and we are going to insert them into a SQLite table with the following columns:

CREATE TABLE IF NOT EXISTS coffee_shops (
    id INTEGER PRIMARY KEY,
    name TEXT,
    address TEXT,
    price_range TEXT
);

For each coffee shop, we’ll give it an auto-incremented ID and insert a name, address, and a “price range” (in this case, either “$”, “$$”, “$$$”, or “$$$$”). You already know that you want to use bulk transactions, but you need to create a function to do it.

Android SQLiteStatement

Android native Java code is built to work hand in hand with SQLite.  The SQLiteDatabase and SQLiteStatement classes will both help you tremendously, and we will use them in our example.  Let’s say, for the example’s sake, that the data we have retrieved from the back-end service/API is in JSON format using an AJAX call in our WebView. 

The data might look something like:

{"coffee_shops": 
    [
        {
            "id": 0,
            "name": "Downtown Coffee",
            "address": "123 Main Street, New York, NY 10012",
            "price_range": "$$"
        }
        //...
    ]
}

We’ll also say that we’ve converted it to a string using JSON.stringify in our WebView, and are now passing the string to our native application to perform the SQLite interactions.

Once we pass that data to the native portion, we’re going to be using SQLiteDatabase and SQLiteStatement. Most notably, SQLiteDatabase has the beginTransaction(), setTransactionSuccessful(), and endTransaction() methods, which are essential for bulk transactions, while the SQLiteStatement class allows you to use a compiled prepared statement in your inserts.  If you want to look at another more in-depth solution on these classes, I also recommend checking out this great article with examples.

Circling back to the coffee shop example, here’s a function we can write in Java to take our Stringify’d JSON object and insert all the objects using a bulk transaction.

// objectID = "coffee_shops"
// Returns true if the transaction was committed, false otherwise.
public boolean bulkInsertCoffeeShops(String stringValues, String objectID) {

    // Parse the stringified JSON that was passed in from the WebView
    JSONArray values;
    try {
        values = new JSONObject(stringValues).getJSONArray(objectID);
    } catch (JSONException e) {
        Log.e(TAG, "Could not parse JSON", e);
        return false;
    }

    String sql = "INSERT OR REPLACE INTO coffee_shops VALUES (?, ?, ?, ?);";

    SQLiteDatabase coffeeDb = mContext.openOrCreateDatabase("coffeeDB",
            Context.MODE_PRIVATE, null);
    coffeeDb.beginTransaction();

    try {
        // Compile the statement once and reuse it for every row
        SQLiteStatement statement = coffeeDb.compileStatement(sql);

        for (int i = 0; i < values.length(); i++) {

            statement.clearBindings();

            try {
                JSONObject o = values.getJSONObject(i);

                statement.bindLong(1, o.getLong("id"));
                statement.bindString(2, o.getString("name"));
                statement.bindString(3, o.getString("address"));
                statement.bindString(4, o.getString("price_range"));

                statement.execute();

            } catch (JSONException e) {
                // Skip a row that is missing a field and keep going
                Log.e(TAG, "Skipping malformed row " + i, e);
            }
        }

        // Without this call, endTransaction() rolls everything back
        coffeeDb.setTransactionSuccessful();
        return true;

    } catch (Exception e) {
        Log.e(TAG, "Bulk insert failed", e);
        return false;
    } finally {
        //end transaction
        coffeeDb.endTransaction();
        coffeeDb.close();
    }
}


Using Swift for iOS

If you’ve also got an iOS app running on the same or similar WebView code, you’ll probably also need an equivalent in your iOS app to insert your coffee shop data. For Swift, it will be the same process. The code below calls the SQLite C API (the sqlite3_ functions) and uses SwiftyJSON to help process our JSON data.

func bulkInsertCoffeeShops(values: String, objectID: String) -> Bool {

    let fileUrl = //your file path to your DB

    //open our database
    var db: OpaquePointer?
    guard sqlite3_open(fileUrl.path, &db) == SQLITE_OK else {
        print("error opening the database")
        sqlite3_close_v2(db)
        return false
    }
    //Close our DB whenever we leave this function
    defer {
        if sqlite3_close_v2(db) != SQLITE_OK {
            print("error closing the database")
        }
    }
    let SQLITE_TRANSIENT = unsafeBitCast(-1, to: sqlite3_destructor_type.self)

    // convert our JSON string into an object
    guard let data = values.data(using: .utf8, allowLossyConversion: false),
          let json = try? JSON(data: data) else {
        return false
    }

    var compiledStatement: OpaquePointer?
    let query = "INSERT OR REPLACE INTO coffee_shops VALUES (?, ?, ?, ?);"

    //Start our transaction
    sqlite3_exec(db, "BEGIN IMMEDIATE TRANSACTION", nil, nil, nil)

    if sqlite3_prepare_v2(db, query, -1, &compiledStatement, nil) == SQLITE_OK {
        //Bind our variables and execute each statement
        for (_, obj) in json[objectID] {

            sqlite3_bind_int(compiledStatement, 1, Int32(obj["id"].intValue))
            sqlite3_bind_text(compiledStatement, 2,
                obj["name"].stringValue, -1, SQLITE_TRANSIENT)
            sqlite3_bind_text(compiledStatement, 3,
                obj["address"].stringValue, -1, SQLITE_TRANSIENT)
            sqlite3_bind_text(compiledStatement, 4,
                obj["price_range"].stringValue, -1, SQLITE_TRANSIENT)

            if sqlite3_step(compiledStatement) != SQLITE_DONE {
                NSLog("%@", String(cString: sqlite3_errmsg(db)))
            }

            //Reset the statement so it can be reused for the next row
            if sqlite3_reset(compiledStatement) != SQLITE_OK {
                NSLog("%@", String(cString: sqlite3_errmsg(db)))
            }
        }
    }
    if sqlite3_finalize(compiledStatement) != SQLITE_OK {
        NSLog("%@", String(cString: sqlite3_errmsg(db)))
    }

    //Finally, let's commit our transaction
    if sqlite3_exec(db, "COMMIT TRANSACTION", nil, nil, nil) != SQLITE_OK {
        NSLog("%@", String(cString: sqlite3_errmsg(db)))
        return false
    }
    return true
}

And there you have it.   We can now successfully insert our large sets of data into our SQLite databases using both Swift and Java.  If you like this article or found it useful, feel free to leave comments in the comment section.

Speeding up your WebView with SQLite Bulk Transactions.

The Problem: Slow/Sticky UI

If you’ve ever built a SQLite-based application that inserts large sets of data, you may have come across performance issues.  For example, if you are building a WebView application on iOS or Android that uses JavaScript to insert data into your database, the application may feel slow and sticky as the UI gets bogged down.

This is because the SQLite statements run synchronously: each insert waits for the previous one to complete before it starts.  A couple of simple solutions, which I’ll talk about below, can drastically speed up your application by changing the way the data is inserted.

Turn off Synchronous Pragma

The first possible solution will get your inserts to run much faster, but I don’t recommend it for mobile apps because it puts the database at risk of corruption. It involves telling SQLite to stop waiting for each write to reach the disk before moving on, which you do by running the following line before looping through your insert statements.

PRAGMA synchronous=OFF

Like I said though, this isn’t the best solution because although it is super quick to implement, you might end up with another problem: corrupted data.  If the device loses power or the operating system crashes during the insert process, data can be lost or corrupted.  While you might be able to argue that most of the time this won’t occur, it’s not much more effort to implement the second solution, which doesn’t have the same drawbacks.

Batch Insert

The other method which can greatly speed up your SQLite inserts is to swap your for-loop of individual execute statements for a batch insert.  To do this, wrap your for loop in a transaction: run a BEGIN IMMEDIATE TRANSACTION statement before the loop and a COMMIT TRANSACTION statement after it.
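To make the shape of a batch insert concrete, here is a small sketch that takes a list of already-built insert statements and wraps them in a single transaction script. It only builds the SQL text and does not talk to a database; the BatchSql class and wrapInTransaction() are hypothetical names I made up for this example.

```java
import java.util.List;

// Hypothetical helper that shows the shape of a batch insert as SQL text.
final class BatchSql {
    private BatchSql() {}

    // Wraps already-built statements in one BEGIN ... COMMIT block.
    static String wrapInTransaction(List<String> statements) {
        StringBuilder script = new StringBuilder("BEGIN IMMEDIATE TRANSACTION;\n");
        for (String statement : statements) {
            String trimmed = statement.trim();
            script.append(trimmed);
            if (!trimmed.endsWith(";")) {
                script.append(';'); // make sure every statement is terminated
            }
            script.append('\n');
        }
        return script.append("COMMIT TRANSACTION;").toString();
    }
}
```

However many statements go in, the result contains exactly one BEGIN and one COMMIT. In a real app you would still bind user-supplied values through a prepared statement rather than pasting them into the SQL text.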

All in all, this is a much better solution.  If you are using a lot of inserts in your app, you can even create a custom function in your native code (Swift or Java) to reuse the same process and change the fields and table name dynamically.
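As a rough idea of what the dynamic part of such a function might look like, here is a sketch that builds the parameterized INSERT statement for any table and column list. InsertSqlBuilder is a hypothetical name. Because table and column names cannot be bound as parameters, the sketch only accepts plain identifiers, which is an assumption you may need to relax for your own schema.

```java
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper that builds a reusable, parameterized INSERT statement.
final class InsertSqlBuilder {
    // Letters, digits and underscores only, not starting with a digit.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    private InsertSqlBuilder() {}

    // Returns SQL with one "?" placeholder per column, ready to be compiled once
    // and executed for every row inside a single transaction.
    static String build(String table, List<String> columns) {
        requireIdentifier(table);
        if (columns == null || columns.isEmpty()) {
            throw new IllegalArgumentException("at least one column is required");
        }
        for (String column : columns) {
            requireIdentifier(column);
        }
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT OR REPLACE INTO " + table
                + " (" + String.join(", ", columns) + ") VALUES (" + placeholders + ");";
    }

    // Rejects anything that is not a plain identifier, since names are not bound.
    private static void requireIdentifier(String name) {
        if (name == null || !IDENTIFIER.matcher(name).matches()) {
            throw new IllegalArgumentException("not a plain SQL identifier: " + name);
        }
    }
}
```

For the coffee shop table, build("coffee_shops", ...) with the four column names gives a statement with four placeholders, and the values themselves are still bound row by row.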

Why do batch inserts increase speed?

Batch inserts in SQLite increase speed because you are bundling multiple inserts into one transaction.  The speed limitation is not on each insert, but on each transaction.  If you don’t use a batch insert, each statement you execute is treated as a separate transaction.

So next time you find yourself with a slow UI in your WebView while doing a heavy number of inserts to your SQLite database, consider batch inserts to speed up your application. You won’t be sorry!

If you want to see a few examples of how to do this in Android and Swift, head over to my post with examples on how to do this in Java and Swift.