
Aug 03, 2006

Comments

1.

Increasing the "resolution" of mo-cap from ping-pong balls to "grains of makeup" is exciting indeed! It should enable such techniques to capture finer behavioral details such as the motions of fingers and facial features.

But what I really want is real-time mo-cap that is cheap and hi-res, so I can use it to drive my avatar... to use my body as a joystick if you will. By entering a "free motion" mode, I can puppeteer my avatar's face with my own or my avatar's limbs with my own. Then the range of expression of which avatars are capable will be expanded dramatically.
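To make that concrete, here's a rough sketch of what a "free motion" mode might do each frame (the joint names and functions are made up for illustration, not any real engine's API): take the live capture data and write it straight onto the avatar's bones instead of playing a pre-authored clip.

# A minimal sketch of the "free motion" idea: each frame, retarget live
# mo-cap joint rotations onto the avatar's bones instead of playing a
# canned animation clip. All names here are hypothetical.

# Hypothetical mapping from capture-rig joint names to avatar bone names.
CAPTURE_TO_AVATAR = {
    "head": "Bip01_Head",
    "l_wrist": "Bip01_L_Hand",
    "r_wrist": "Bip01_R_Hand",
}

def apply_free_motion(avatar_pose, capture_frame):
    """Overwrite the avatar's bone rotations with the performer's live data."""
    for capture_joint, rotation in capture_frame.items():
        bone = CAPTURE_TO_AVATAR.get(capture_joint)
        if bone is not None:
            avatar_pose[bone] = rotation  # rotation as (x, y, z) Euler angles
    return avatar_pose

# One fake frame of capture data: the performer raises their right hand.
frame = {"r_wrist": (0.0, 0.0, 90.0), "head": (5.0, 0.0, 0.0)}
pose = {"Bip01_Head": (0.0, 0.0, 0.0),
        "Bip01_L_Hand": (0.0, 0.0, 0.0),
        "Bip01_R_Hand": (0.0, 0.0, 0.0)}
print(apply_free_motion(pose, frame))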

2.

Bob>
But what I really want is real-time mo-cap that is cheap and hi-res, so I can use it to drive my avatar... to use my body as a joystick if you will. By entering a "free motion" mode, I can puppeteer my avatar's face with my own or my avatar's limbs with my own. Then the range of expression of which avatars are capable will be expanded dramatically.
----------

Playing with this a bit. I wonder if one could argue for a social expectation based on 'an economy of expression' that goes something like this: in the midst of a conversation, an individual should not escalate the range of their expression at the expense of the others. A case where it might begin to feel zero-sum is when one of the participants suddenly turns so darn efficient at communicating that it begins to bug me.

It's the old problem with groups where members mix TeamSpeak and text chat. The ones using text chat may have agreed to go along. But consider the situation where a group starts out all text chat and suddenly, in the middle of the crawl, everyone 'cept you turns on TeamSpeak. Resentment?


3.

Well, I really want everyone else to have a real-time mo-cap device too. ;)

Also, such an input method does not merely make nonverbal expression more efficient; it also opens up new kinds of nonverbal expression. For example, currently I can't use avatar gestures descriptively. I can't use my avatar's hands to depict the relative positions of a group of mobs, nor the local topography of an area. There's no way to pre-script all the kinds of creative iconic gestures that speakers invent on the fly. It's the equivalent of the "quickchat" systems in ToonTown or EverQuest Online Adventures, in which you can't freely compose a text message but can only choose from a list of canned messages (those games of course also offer free text modes).
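To illustrate the difference in a few lines of toy Python (the names are hypothetical, not any real game's system): a quickchat-style gesture menu can only replay items from a fixed list, while a capture-driven avatar simply follows whatever the performer's hands actually did, so gestures invented on the fly need no pre-authored clip.

# Canned system: the player can only pick from a fixed list of clips.
CANNED_GESTURES = ["wave", "point", "shrug", "clap"]

def play_canned(gesture_name):
    if gesture_name not in CANNED_GESTURES:
        raise ValueError(f"unknown gesture: {gesture_name}")
    print(f"playing pre-scripted clip: {gesture_name}")

# Capture-driven system: the avatar replays whatever the performer's hands
# did, frame by frame, so descriptive gestures need no pre-scripting.
def play_captured(hand_positions_per_frame):
    for frame, (left, right) in enumerate(hand_positions_per_frame):
        print(f"frame {frame}: left hand at {left}, right hand at {right}")

play_canned("wave")
play_captured([((0, 1, 0), (1, 1, 0)), ((0, 1, 1), (1, 1, 1))])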

4.

We're getting close to achieving real-time Contour capture at low resolution (e.g. good enough for a preview to evaluate the performance). In time I expect that we'll get to real-time Contour capture at (what is today) full resolution. At that point, the performers will actually be controlling 3D characters with all of the nuances of their expressions and the flow of their costumes in real time. To someone viewing the output of a real-time Contour system, it would be very "Holodeck-like" (if you are familiar with the virtual reality simulator in the later Star Trek series).

There were many possible approaches to the algorithms we are using, but we quite specifically architected them to be parallelizable and topologically efficient in terms of dataflow. This allows us to utilize parallelism very efficiently, particularly given the trend toward multi-core CPUs and increasing parallelism in GPUs.
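As a rough illustration of that point (this is not Mova's actual pipeline, just a generic Python sketch): if each frame, or each independent patch of the captured surface, can be reconstructed on its own, the work maps naturally onto multiple CPU cores.

from concurrent.futures import ProcessPoolExecutor

def reconstruct_patch(patch_id):
    """Stand-in for a per-patch surface reconstruction step."""
    # Dummy workload in place of the real geometry solve.
    return patch_id, sum(i * i for i in range(10_000))

if __name__ == "__main__":
    patch_ids = range(64)  # one frame split into 64 independent patches
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(reconstruct_patch, patch_ids))
    print(f"reconstructed {len(results)} patches in parallel")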

Getting EVERYONE to have a Contour capture device is a bit trickier, but one thing I'd like to eventually get to is a "prosumer" version of the system for a few thousand dollars that people could set up in garages. There may not be a practical business model to support this, but it would be amazingly cool to put that much power in the hands of thousands of people.

-- Steve Perlman, president, Mova
