On this subject, two compelling items come this way. First, Andrew Stern's coverage of Steve Perlman's SIGGRAPH 2006 demo of Contour:
...technology for digital effects production. Performances can now be captured in 3D as they are performed, eliminating much of the post-production work required in the past... It isn't just capturing dots in space anymore; it's actual live-action volumetric capture.
In Andrew's words, "Effectively each grain of makeup is like a motion-capture dot, allowing for very very hi-res, and low-cost, capture." Brilliant indeed.
Also from SIGGRAPH comes Microsoft's demo of Photosynth (see New Scientist Tech (NST): "Software meshes photos to create 3D landscape"). Fascinating.
NST describes Photosynth as
...software (that) takes individual images and performs careful analyses to find matching sections. It then "stitches" overlapping pictures together to create a three-dimensional landscape composed of many different snaps.
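To make the "matching sections" idea a bit more concrete, here is a minimal sketch of the first step the NST description implies - my own illustration under assumptions, not Photosynth's actual pipeline, and the image file names are hypothetical: detect distinctive features in two overlapping photos and match them, which is the raw material for recovering their relative 3D geometry.

import cv2

# Two overlapping snaps of the same scene (hypothetical file names).
img1 = cv2.imread("garden_view_a.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("garden_view_b.jpg", cv2.IMREAD_GRAYSCALE)

# Detect distinctive keypoints and descriptors in each photo.
orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Match descriptors between the two photos; strong matches are the "matching sections".
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

print(f"{len(matches)} candidate correspondences between the two photos")

From enough such correspondences across many photos, camera positions and a sparse point cloud of the scene can be estimated - which is roughly what "stitching" snaps into a three-dimensional landscape amounts to.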
An interesting facet is that virtual worlds seem to strive to invert this user perspective. Excepting procedural landscapes, virtual world places feel (to me at least) designed to be deceptively compact locales aggregating a large number of stunning vistas - composed of all sorts of content, graphic and critter. Yes, worlds have come a long way since EverQuest I (EQ), but to this day I am still amazed at how large the Butcherblock Mountains (a zone in EQ) seemed until I figured out their true geometry. A deception by disorientation, intended to maximize the impact of the smallest expanse of virtual terrain one could support (in infrastructure and design).
I look forward to a complicated geographic experience in a virtual world - the prosody of a rambling landscape sounds interesting, in fact. Shall we contemplate the 3D point cloud of an English garden stretched to a horizon?
------
Related: "Geography and Travel" (TN).
Increasing the "resolution" of mo-cap from ping-pong balls to "grains of makeup" is exciting indeed! It should enable such techniques to capture finer behavioral details such as the motions of fingers and facial features.
But what I really want is real-time mo-cap that is cheap and hi-res, so I can use it to drive my avatar... to use my body as a joystick if you will. By entering a "free motion" mode, I can puppeteer my avatar's face with my own or my avatar's limbs with my own. Then the range of expression of which avatars are capable will be expanded dramatically.
Posted by: Bob Moore | Aug 03, 2006 at 17:38
Bob>
But what I really want is real-time mo-cap that is cheap and hi-res, so I can use it to drive my avatar... to use my body as a joystick if you will. By entering a "free motion" mode, I can puppeteer my avatar's face with my own or my avatar's limbs with my own. Then the range of expression of which avatars are capable will be expanded dramatically.
----------
Playing with this a bit. I wonder if one could argue for a social expectation based on 'an economy of expression' that goes something like this: in the midst of a conversation, an individual should not escalate the range of their expression at the expense of the others. A case where it might begin to feel zero-sum: when one of the participants suddenly turns so darn efficient at communicating that it begins to bug me.
It's the old problem with groups whose members mix TeamSpeak and text chat. The ones using text chat may have agreed to go along. But consider the situation where a group started out all text chat and suddenly, in the middle of the crawl, everyone 'cept you turns on TeamSpeak. Resentment?
Posted by: nate combs | Aug 03, 2006 at 21:42
Well, I really want everyone else to have a real-time mo-cap device too. ;)
Also, such an input method does not merely make nonverbal expression more efficient; it also opens up new kinds of nonverbal expression. For example, currently I can't use avatar gestures descriptively. I can't use my avatar's hands to depict the relative positions of a group of mobs, nor the local topography of an area. There's no way to pre-script all the kinds of creative iconic gestures that speakers invent on the fly. It's the equivalent of the "quickchat" systems in Toontown or EverQuest Online Adventures, in which you can't freely compose a text message but can only choose from a list of canned messages (those games of course also offer free text modes).
Posted by: Bob Moore | Aug 04, 2006 at 00:31
We're getting close to achieving real-time Contour capture at low resolution (e.g. good enough for a preview to evaluate the performance). In time I expect that we'll get to real-time Contour capture at (what is today) full resolution. At that point, the performers will actually be controlling 3D characters with all of the nuances of their expressions and the flow of their costumes in real time. To someone viewing the output of a real-time Contour system, it would be very "Holodeck-like" (if you are familiar with the virtual reality simulator in the later Star Trek series).
There were many possible approaches to the algorithms we are using, but we quite specifically architected them to be parallelizable and topologically efficient in terms of dataflow. This allows us to utilize parallelism very efficiently, particularly given the trend toward multi-core CPUs and the increasing parallelism in GPUs.
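Not our actual code, but to give a feel for why a per-frame, parallelizable design scales with core count, here is a toy sketch in Python - reconstruct_frame is a hypothetical stand-in for the real per-frame volumetric work, and each frame can be processed independently of the others:

from multiprocessing import Pool

def reconstruct_frame(frame_id):
    # Hypothetical stand-in for the real per-frame work: correlating the
    # phosphorescent-makeup imagery from many cameras into a 3D surface.
    return {"frame": frame_id, "vertices": []}

if __name__ == "__main__":
    frame_ids = range(240)   # e.g. ten seconds of capture at 24 fps
    with Pool() as pool:     # one worker per CPU core by default
        meshes = pool.map(reconstruct_frame, frame_ids)
    print(f"reconstructed {len(meshes)} frames in parallel")

Because frames (and regions within frames) don't depend on one another, throughput grows roughly with the number of cores you can throw at it.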
Getting EVERYONE to have a Contour capture device is a bit trickier, but one thing I'd like to eventually get to is a "prosumer" version of the system for a few thousand dollars that people could set up in their garages. There may not be a practical business model to support this, but it would be amazingly cool to put that much power in the hands of thousands of people.
-- Steve Perlman, president, Mova
Posted by: Steve Perlman | Aug 05, 2006 at 00:16