teaching machines

Final Project – Reader-animated Storybook (first iteration)

December 20, 2011. Filed under cs491 mobile, fall 2011, postmortems.

This app is a first draft of a framework for the Reader-animated Storybooks research project I’m working on. I departed from the design document in that some of the features (like a voice-controlled menu screen) seemed irrelevant once I got down to building this. There are enough neat things to build into the final product for the research project that extra features only usable for this iteration lost their sparkle.

I ended up with a working model that we can apply to the project, minus the important step of making the speech recognition more forgiving so that it can be used enjoyably, especially by children (whose speech may not be as recognizable to machines).
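To give a rough idea of what a more forgiving match could look like, here is a minimal sketch in plain Java. It is not what the app does today: the LenientMatcher class, the word-overlap scoring, and the threshold value are all assumptions made up for illustration. The idea is to accept a recognition result when enough of the expected words show up in it, rather than demanding an exact string match.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the app: scores how much of the expected
// page text appears in what the recognizer reported hearing.
final class LenientMatcher {

    // Lowercases, strips punctuation, and splits into words.
    static String[] words(String s) {
        String cleaned = s.toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9' ]", " ")
                .trim();
        return cleaned.isEmpty() ? new String[0] : cleaned.split("\\s+");
    }

    // Fraction (0.0 to 1.0) of expected words found anywhere in the heard text.
    static double overlap(String expected, String heard) {
        String[] expectedWords = words(expected);
        if (expectedWords.length == 0) {
            return 0.0;
        }
        Set<String> heardWords = new HashSet<>(Arrays.asList(words(heard)));
        int hits = 0;
        for (String w : expectedWords) {
            if (heardWords.contains(w)) {
                hits++;
            }
        }
        return (double) hits / expectedWords.length;
    }

    // Accepts the utterance when the overlap reaches the (assumed) threshold.
    static boolean matches(String expected, String heard, double threshold) {
        return overlap(expected, heard) >= threshold;
    }

    public static void main(String[] args) {
        // "here" is a plausible mis-recognition of "hear"; two of three words still match.
        System.out.println(matches("Hear it ring!", "here it ring", 0.6));
    }
}
```

A threshold like 0.6 is only a starting guess; the right value would have to come from trying it with real readers.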

The data model is pretty simple. I defined a Storybook object that has pointers to resources for a video file and a thumbnail image, as well as two parallel arrays: a String[] (for the text of each “page”) and an int[] (for the animation breakpoints in milliseconds). As the research project progresses, I think the model will become a little more sophisticated to allow for the text on different “pages” to be in different parts of the screen, etc. I’ll likely add a Page object that will track the text, time breakpoint, text position, etc., so that Storybook can hold one Page[] array.
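A minimal sketch of what that Page-based model might look like, in plain Java. The class and field names are placeholders rather than the project's actual code, and I'm assuming the video and thumbnail are referenced by integer resource IDs. The factory method shows how the two parallel arrays from the current model would fold into a single Page[].

```java
// Hypothetical sketch of the planned model; names and fields are illustrative.
final class Page {
    final String text;        // text shown for this page
    final int breakpointMs;   // video position (in ms) where this page's animation ends

    Page(String text, int breakpointMs) {
        this.text = text;
        this.breakpointMs = breakpointMs;
    }
}

final class Storybook {
    final int videoResId;      // assumed: resource ID of the video file
    final int thumbnailResId;  // assumed: resource ID of the thumbnail image
    final Page[] pages;

    Storybook(int videoResId, int thumbnailResId, Page[] pages) {
        this.videoResId = videoResId;
        this.thumbnailResId = thumbnailResId;
        this.pages = pages;
    }

    // Builds a Storybook from the two parallel arrays used in the current model.
    static Storybook fromParallelArrays(int videoResId, int thumbnailResId,
                                        String[] texts, int[] breakpointsMs) {
        if (texts.length != breakpointsMs.length) {
            throw new IllegalArgumentException("texts and breakpoints must be the same length");
        }
        Page[] pages = new Page[texts.length];
        for (int i = 0; i < texts.length; i++) {
            pages[i] = new Page(texts[i], breakpointsMs[i]);
        }
        return new Storybook(videoResId, thumbnailResId, pages);
    }
}
```

Keeping text and breakpoint together in one object removes the risk of the two arrays drifting out of sync, and gives later fields such as text position an obvious home.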

GridView/GridMenuActivity and FrameLayout

Though I felt it would be silly to implement a voice-controlled menu screen, I did want something a little slicker-looking than a standard ListView/ListActivity. A nice alternative is the GridView/GridMenuActivity combination. It works the same as a ListView in that you feed it a BaseAdapter with any customized view you like. I ended up defining a small square “grid item” in XML that adds a semi-transparent overlay of the adapter index value onto each Storybook object’s thumbnail. Using a FrameLayout, you can layer views on top of each other. Add some transparency, and you have a slick result with a minimal amount of markup. Views are stacked up in the order they’re written to the XML (so the TextView in the following example is drawn on top of the ImageView).

<FrameLayout
  xmlns:android="http://schemas.android.com/apk/res/android"
  android:layout_width="wrap_content"
  android:layout_height="wrap_content">

  <ImageView
    android:layout_width="100dp"
    android:layout_height="100dp"
    android:id="@+id/grid_tile_image"
    android:scaleType="center"
    android:src="@drawable/thumb_placeholder" />

  <TextView
    android:id="@+id/grid_tile_text_overlay"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:layout_gravity="center"
    android:background="#00000000"
    android:textColor="#99FFFFFF"
    android:text="3"
    android:textSize="64sp"
    android:textStyle="bold" />
</FrameLayout>

It looks nice, breaks you out of the “list” look, removes some of the psychological hierarchy implications inherent to lists, and you get nice scrolling interaction out of the box. A note in case you forgot (like I did): Android colors are in ARGB order, not RGBA.
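To make the ARGB ordering concrete, here is a small plain-Java sketch that pulls apart the two color strings used in the grid item. The ArgbColors class is a stand-in written for this post so it runs anywhere; on Android itself, android.graphics.Color offers parseColor() and alpha()/red()/green()/blue() for the same job.

```java
// Illustrative helper (not an Android class): decodes "#AARRGGBB" strings.
final class ArgbColors {

    // Parses "#AARRGGBB" into a packed 32-bit ARGB int.
    static int parseArgb(String hex) {
        if (hex.length() != 9 || hex.charAt(0) != '#') {
            throw new IllegalArgumentException("expected #AARRGGBB, got " + hex);
        }
        // Parse as long first: values with alpha >= 0x80 overflow a signed int.
        return (int) Long.parseLong(hex.substring(1), 16);
    }

    static int alpha(int color) { return (color >>> 24) & 0xFF; } // first byte
    static int red(int color)   { return (color >>> 16) & 0xFF; } // second byte
    static int green(int color) { return (color >>> 8) & 0xFF; }  // third byte
    static int blue(int color)  { return color & 0xFF; }          // last byte

    public static void main(String[] args) {
        int text = parseArgb("#99FFFFFF");       // the overlay's textColor
        int background = parseArgb("#00000000"); // the overlay's background
        System.out.println("text alpha = " + alpha(text));             // 153 of 255, about 60% opaque
        System.out.println("background alpha = " + alpha(background)); // 0, fully transparent
    }
}
```

Read as RGBA, "#99FFFFFF" would have been a fully opaque pale blue, which is exactly the kind of surprise the ordering mix-up produces.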

 

MediaPlayer/SurfaceView

If you need to play a video in an Android app, use a VideoView. Very convenient. Very simple to use. Not very flexible, but it works for a lot of common scenarios. I didn’t use a VideoView, but you should.

If you need to do something custom, use a combination of a MediaPlayer and a SurfaceView. This lets you create your own custom controls or any other zaniness you dream up. The Android dev site has some Swiss-army-knife demo code that can help get you up and running, but it’s worth discussing a stripped-down version here. This version relies on your activity implementing both SurfaceHolder.Callback and MediaPlayer.OnPreparedListener.

Here’s the deal, more or less.

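  1. You can’t play a video until the MediaPlayer has been prepared.
  2. The MediaPlayer can’t be prepared until you’ve set the data source.
  3. You can’t attach the MediaPlayer to its display until the SurfaceView’s surface has been created, so that is also where I set the data source.

// Needs java.io.IOException and android.util.Log imported.
@Override
public void onCreate(Bundle savedInstanceState) {
  super.onCreate(savedInstanceState);
  // R.layout.player is an assumed name; use whichever layout contains R.id.surface.
  setContentView(R.layout.player);

  SurfaceView surface = (SurfaceView) findViewById(R.id.surface);
  holder = surface.getHolder();
  holder.addCallback(this); // we get surfaceCreated() when the surface is ready
  holder.setType(SurfaceHolder.SURFACE_TYPE_PUSH_BUFFERS);

  player = new MediaPlayer();
  player.setOnPreparedListener(this);
}

public void surfaceCreated(SurfaceHolder holder) {
  try {
    // The surface exists now, so it is safe to attach the player to it.
    player.setDisplay(holder);
    player.setDataSource(pathToVideoFile);
    player.prepareAsync(); // returns immediately; onPrepared() fires when ready
  } catch (IOException e) {
    Log.e("PlayerActivity", "Could not open " + pathToVideoFile, e);
  }
}

public void onPrepared(MediaPlayer mp) {
  mp.start(); // MediaPlayer has start(), not play()
}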

These things all happen asynchronously, on their own time, hence the triple layer of polite callbacks. There’s more to it than that, but this much should give you a sense of what has to happen when. One more thing: don’t forget this line:

holder.setType(SurfaceHolder.SURFACE_TYPE_PUSH_BUFFERS);

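I’m not kidding. Do not forget that line. It will not work without it and you will not know why. It will take you a long time to figure out. (From Android 3.0 on, the call is deprecated and the surface type is set automatically, but older devices still need it.)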

The rest of my Player code relies on more callbacks for MediaPlayer (like OnSeekCompleteListener and OnCompletionListener), a headless speech recognizer implementation (a bunch more callbacks), and a really clean swipe gesture implementation for “flipping the page” without having to go through the speech-recognition rigamarole (you guessed it, more callbacks). There is also a button that toggles and indicates the listening/matching state, using some icons from the free version of GLYPHICONS (Glyphish is another stylish option I’ve used for other projects). Handler.postDelayed(someCoolRunnable, milliseconds) takes care of pausing the video when an animation “page” is complete. The PlayerActivity’s layout is also wrapped up in a FrameLayout with a custom font, to boot (code tips for custom fonts).
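The only arithmetic behind the postDelayed trick is working out how long to wait before pausing. Here is a sketch of that calculation in plain Java; the Breakpoints class and its method names are made up for illustration, and it assumes the breakpoint array is sorted in ascending order.

```java
// Hypothetical helper: finds how long the video should run before the next pause.
final class Breakpoints {

    // Index of the first breakpoint strictly after positionMs, or -1 if none remain.
    static int nextBreakpointIndex(int[] breakpointsMs, int positionMs) {
        for (int i = 0; i < breakpointsMs.length; i++) {
            if (breakpointsMs[i] > positionMs) {
                return i;
            }
        }
        return -1;
    }

    // Milliseconds to wait before pausing, or -1 if the last breakpoint has passed.
    static long delayUntilNextBreakpoint(int[] breakpointsMs, int positionMs) {
        int next = nextBreakpointIndex(breakpointsMs, positionMs);
        return next < 0 ? -1 : (long) breakpointsMs[next] - positionMs;
    }

    public static void main(String[] args) {
        int[] breakpoints = {4000, 9500, 15000};
        // Starting from the first breakpoint, the next pause is 5500 ms away.
        System.out.println(delayUntilNextBreakpoint(breakpoints, 4000));
    }
}
```

In the activity, the position would come from player.getCurrentPosition(), and the result would be the second argument to Handler.postDelayed, with a Runnable that calls player.pause() and swaps in the next page's text.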

With that, users can trigger “animation” (i.e., a video playing until the next breakpoint) based on reading the text on the screen aloud. You push the “talk” button, say the words “hear it ring” and you get to watch the next x milliseconds of the video, after which the text is updated. Here’s a video of a quick demo illustrating how this works.

While building this, my code got more and more noodle-like. While delicious and buttery, it was ugly and unmaintainable (especially after I added the swipe-to-flip-the-page feature). Putting together a state diagram helped clarify what I really needed my PlayerActivity to do, and which pieces of code I could safely bundle up into helper methods. The diagram will also make it easy to turn my existing Storybook class into an interface that can be implemented in different ways down the road (e.g., HTML5 canvas elements, native XML-defined Android animations, OpenGL graphics, Processing, etc.). Using a movie file is a first-stab approach that makes it easy to get the animation sequences from an artist (who can do the work in Flash or whatever they like) and into the app.
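For a flavor of how a state diagram turns into code, here is a minimal plain-Java sketch. The states and events below are my guess at a reasonable set for this kind of player, not a transcription of the actual diagram, and all of the names are hypothetical.

```java
// Hypothetical states and events; the real diagram may differ.
enum PlayerState { WAITING, LISTENING, PLAYING, FINISHED }

enum PlayerEvent {
    TALK_PRESSED, SPEECH_MATCHED, SPEECH_REJECTED,
    SWIPED, BREAKPOINT_REACHED, VIDEO_COMPLETED
}

final class PlayerStateMachine {

    // Returns the state that follows `state` when `event` arrives.
    // Events that make no sense in the current state are ignored.
    static PlayerState next(PlayerState state, PlayerEvent event) {
        switch (state) {
            case WAITING:
                if (event == PlayerEvent.TALK_PRESSED) return PlayerState.LISTENING;
                if (event == PlayerEvent.SWIPED) return PlayerState.PLAYING;
                break;
            case LISTENING:
                if (event == PlayerEvent.SPEECH_MATCHED) return PlayerState.PLAYING;
                if (event == PlayerEvent.SWIPED) return PlayerState.PLAYING;
                if (event == PlayerEvent.SPEECH_REJECTED) return PlayerState.WAITING;
                break;
            case PLAYING:
                if (event == PlayerEvent.BREAKPOINT_REACHED) return PlayerState.WAITING;
                if (event == PlayerEvent.VIDEO_COMPLETED) return PlayerState.FINISHED;
                break;
            case FINISHED:
                break;
        }
        return state;
    }

    public static void main(String[] args) {
        PlayerState s = PlayerState.WAITING;
        s = next(s, PlayerEvent.TALK_PRESSED);       // reader pushes the talk button
        s = next(s, PlayerEvent.SPEECH_MATCHED);     // the words were recognized
        s = next(s, PlayerEvent.BREAKPOINT_REACHED); // the page's animation is done
        System.out.println(s);
    }
}
```

With the transitions in one place, each callback only has to report its event, and the side effects (start the video, pause it, update the text) can hang off the state changes instead of being scattered across listeners.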