Comments on: Readings 4: Motion Capture Overview

By: Nathan Mitchell

Nathan Mitchell — Mon, 07 Feb 2011 20:18:58 +0000

I apologize for the lateness of this post. I saw the big posting of
papers last Friday, but somehow I missed the actual assignment for
this Sunday. I must have glanced over it in my RSS, but I have no idea
why.

——————————-

The use of motion capture data for computer animation will depend on
solving problems in two key areas: producing a convincing single
motion and producing a convincing transition between motions. In terms
of motion capture these two tasks seem to be somewhat separate.

The first task is primarily concerned with the raw motion capture data
and cleaning it for use. Artifacts from the data need to be eliminated
and the motion smoothed so it looks fluid and natural. It also deals
with the concept of synthesis, where many similar motions can be used
to produce a single blend that meets some set of requirements – be
they emotional content or mechanics of limb placement.

The second task is important as a single motion is rarely sufficient
for any purpose. A character’s actions consist of many motions, some
sequential and others concurrent. While motion capture sessions could
be planned to demonstrate all the motions the character needs, rarely
can every action be planned in such detail ahead of time. Techniques
are required to merge multiple motion capture clips together and to
splice them sequentially, all while maintaining realism. This requires
maintaining constraints and physical laws such that any transition,
while not performed explicitly by the actor, appears to be something
the actor could physically have done.
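
As a rough illustration of the splicing idea above, here is a minimal sketch of a linear crossfade between two clips. It assumes each clip is a list of frames and each frame is a list of joint angles in degrees; the function name and data layout are hypothetical, not taken from any of the readings. It ignores the constraints mentioned above (foot plants, physical laws), so it is only plausible when the two clips are already similar in the overlap region.

```python
def crossfade(clip_a, clip_b, overlap):
    """Splice clip_b onto the end of clip_a, blending over `overlap` frames.

    Each clip is a list of frames; each frame is a list of joint angles
    (degrees). Plain linear interpolation of angles is an assumption that
    only holds up when the overlapping poses are already close.
    """
    if overlap < 1 or overlap > min(len(clip_a), len(clip_b)):
        raise ValueError("overlap must fit inside both clips")
    head = clip_a[:-overlap]   # frames of clip_a left untouched
    tail = clip_b[overlap:]    # frames of clip_b left untouched
    blended = []
    for i in range(overlap):
        # Weight ramps from near 0 (mostly clip_a) to near 1 (mostly clip_b).
        w = (i + 1) / (overlap + 1)
        frame_a = clip_a[len(clip_a) - overlap + i]
        frame_b = clip_b[i]
        blended.append([(1 - w) * a + w * b for a, b in zip(frame_a, frame_b)])
    return head + blended + tail
```

For example, `crossfade(walk, run, overlap=10)` would return a single clip that plays `walk`, fades into `run` over ten frames, and then continues with the rest of `run`.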

In many ways, these two tasks overlap. It is difficult to get good
blending if the starting motions are poorly conditioned for the
scene. Thus the motions need to be well constructed, with prior
knowledge of the needed transitions, for the entire action to come off
successfully. However, depending on the final purpose, the ‘weight’
of these two tasks may be different. For interactive character
controls, like in a game, very good sequences and transitions are
important as the player will often be switching between motions while
playing. If the transitions are poor, they will be obvious to a viewer
who sees them done repeatedly. On the other hand, other areas may be
less concerned with the transition quality, but more focused on the
quality of the individual motions. Medical analysis and physical
therapy probably care more about the action of rotating a single
damaged joint than about blending the motion with another.

Within this area, I am fairly curious about the work
Perlin did with noise-based animation. On one hand, the demos look
strangely appealing – probably due to the natural-seeming randomness
of the actions. No motion looks quite like any other, which is
characteristic of a biological system moving. On the other hand, I
wonder if the technique could be improved by combining it with motion
capture data. As with many other applications of noise, perhaps it
would reach its peak if used as a surface decoration rather than the
meat of the animation. This may be a viable alternative to using large
databases of motion to construct motion blends. A limited set of
simple motions could be enhanced with noise to create the effect an
animator is looking for.
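
To make the "noise as surface decoration" idea concrete, here is a minimal sketch that layers a small Perlin-style 1D gradient noise offset on top of an existing joint-angle curve. This is not Perlin's actual animation system; the function names, the amplitude and frequency defaults, and the per-frame angle representation are all assumptions for illustration.

```python
import math
import random

def make_noise_1d(seed=0, table_size=256):
    """Return a smooth 1D gradient-noise function (Perlin-style)."""
    rng = random.Random(seed)
    gradients = [rng.uniform(-1.0, 1.0) for _ in range(table_size)]

    def fade(t):
        # Quintic ease curve with zero first and second derivatives at 0 and 1.
        return t * t * t * (t * (t * 6 - 15) + 10)

    def noise(x):
        i0 = math.floor(x)
        t = x - i0
        g0 = gradients[i0 % table_size]
        g1 = gradients[(i0 + 1) % table_size]
        # Each lattice point contributes gradient * offset; blend the two.
        n0 = g0 * t
        n1 = g1 * (t - 1.0)
        return n0 + fade(t) * (n1 - n0)

    return noise

def add_noise_layer(angles, amplitude_deg=2.0, frequency=0.1, seed=0):
    """Add a small noise offset (in degrees) to a per-frame joint-angle curve."""
    noise = make_noise_1d(seed)
    return [a + amplitude_deg * noise(frame * frequency)
            for frame, a in enumerate(angles)]
```

Using a different `seed` for each joint, or for each repetition of the same clip, would give every playback a slightly different look while the captured motion still carries the main action.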

———————————————

Also, I read this paper last semester when I implemented a Cyclic Coordinate Descent (CCD) method of IK, and I don’t think it’s in the IK list. I don’t know if you have seen it, but I thought it was a useful overview of techniques. It’s a bit old, but I don’t think that matters too much.

Welman, C. “Inverse kinematics and geometric constraints for articulated figure manipulation.” Simon Fraser University, 1993.
http://lib-ir.lib.sfu.ca/bitstream/1892/7119/1/b15233406.pdf
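
For readers who have not seen CCD, here is a minimal sketch of the idea for a planar chain. It is not the implementation mentioned above or code from the Welman thesis; the function names, the 2D setup, and the relative-angle convention are assumptions chosen to keep the example short.

```python
import math

def forward_kinematics(lengths, angles):
    """Return joint positions of a planar chain rooted at the origin.

    angles[i] is the rotation of link i relative to its parent (radians).
    """
    points = [(0.0, 0.0)]
    total = 0.0
    for length, angle in zip(lengths, angles):
        total += angle
        x, y = points[-1]
        points.append((x + length * math.cos(total),
                       y + length * math.sin(total)))
    return points

def ccd_solve(lengths, angles, target, iterations=100, tolerance=1e-4):
    """Cyclic Coordinate Descent: adjust one joint at a time, tip to root."""
    angles = list(angles)
    for _ in range(iterations):
        for i in reversed(range(len(angles))):
            points = forward_kinematics(lengths, angles)
            jx, jy = points[i]
            ex, ey = points[-1]
            # Rotate joint i so the joint->effector ray lines up with
            # the joint->target ray.
            to_effector = math.atan2(ey - jy, ex - jx)
            to_target = math.atan2(target[1] - jy, target[0] - jx)
            angles[i] += to_target - to_effector
        ex, ey = forward_kinematics(lengths, angles)[-1]
        if math.hypot(target[0] - ex, target[1] - ey) < tolerance:
            break
    return angles
```

Calling `ccd_solve([1.0, 1.0, 1.0], [0.3, 0.3, 0.3], (1.5, 1.0))` returns joint angles that place the end of the chain close to the target. The sketch has no joint limits, which is one reason raw CCD results can look unnatural on a character.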

By: Danielle

Danielle — Mon, 07 Feb 2011 14:08:28 +0000

After reading these papers, the primary challenge that seems to arise is finding a balance between the natural fluidity of motion capture data and the precise control provided by algorithmically generated data. While motion capture data may provide a greater sense of realism in fields like film and medicine, where the motion can all be orchestrated ahead of time, scalability issues become huge in games, where motions must be generated on demand.

I am very curious about how specifically to handle the scalability issues presented by the data. My initial sense from reading these papers is that the scalability issue is hard and we don’t currently have a ‘killer’ way of tackling the issue. Not only that, but finding the key trick for rapidly generating natural motion from both mocap databases and algorithmic procedures could provide insight into doing rapid calculation and approximation (i.e. ‘faking’ what is natural) for other computational issues as well.

I felt that the challenges discussed in the 2000 paper generally echo those in the Eurographics paper, with the exception of the solutions for providing more control over motion capture data through motion graphs and similar techniques. This is surprising out of context, given how much work has gone into understanding how to effectively animate the human form. However, like most of the elements cited in the Catmull paper as being essential to the development of animation as a field, it is still an open question: human motion is difficult to model, and people are so used to seeing real human motion in everyday life that it is perceptually very hard to fake.

By: sandrist

sandrist — Mon, 07 Feb 2011 07:48:48 +0000

1. The central challenge in using motion data to drive animation involves figuring out a way to use an existing database of motions to derive the precise motion we desire, either through blending, concatenation, layering, or some other transformation. The fundamental difficulty is that no single existing motion primitive can give us exactly what we want, even if that primitive was obtained during a motion capture session with the specific motion goal in mind. We probably do not need all of the details present in the richness of human motion to animate a simple character; we only want the “essence” of the motion. Different applications differ in their need for control vs. naturalness, both of which are exhaustively explained in the second paper. For example, film usually places an emphasis on naturalness over control (much to the displeasure of the actual animators involved, I’m sure), hence it makes much more use of motion capture techniques than of procedural or physical simulations. Games are just the opposite: a great degree of control is required in order to make compelling interactivity possible.

2. I am very curious about the specifics involved in parameterization, defined in the second paper as the method of converting intuitive control parameters into the animation parameters defined by whichever animation technique is currently being used. It seems to me to be a very hard problem, yet very important, to be able to present the actual animator with controls that are both intuitive and as orthogonal as possible.

3. I read the draft chapter of Animation from Observation, and it seemed to me to be a very good introduction to the large fundamental issues at hand, all of which seem to still be quite relevant. One interesting note you made was on how there is quite a tension between traditional animators and motion capture technicians and users, stemming from unrealistic expectations of what motion capture can achieve and animators having a difficult time working with the data produced by motion capture. I’m curious as to what extent this tension still exists today?

By: sghosh

sghosh — Mon, 07 Feb 2011 06:51:03 +0000

The main challenge for motion capture data is that the number of data sets available to us is limited, yet we try to simulate an infinite number of moves from combinations of these few sets. However, the more data we have, the more computationally complicated it becomes to ‘decide’ which motion fits best with the current set (i.e. for operations such as blending and concatenation). This is especially true for games, as the virtual character is human controlled, and given the combination of keys available on the controller and the environment the character is in, many moves are possible. Motion (the visual side) is also not the only aspect of a game: the unpredictability of motion during real-time playback makes it difficult to produce proper lip-sync for the characters (the Realistic Crowd Sim paper I read studies this).

Motion capture itself is limited as well. If someone needed to motion capture an elephant today, I doubt existing techniques would allow it, i.e. the technology is highly specific. What we see more often is a human version of the animal (in movies like ‘Happy Feet’ and ‘Ratatouille’).

Coming to films, it is easier to ‘handle’ motion capture data because the real-time playback factor no longer applies. Moves can be choreographed and tailored to the script. But think of the end scene in ‘Titanic’, where a lot of non-real characters jump off the ship into the ocean. Even though we could spend months or years carefully planning out how each one of them moves and then motion capture every one, it makes more sense to capture just 5-6 of them, say, and mix and match those to create hundreds of CG passengers.

I was wondering how motion capture data is actually used in an animated film. What about the face of a character? Is that motion captured as well or animated, and how is that decided?

By: adrm

adrm — Mon, 07 Feb 2011 06:41:04 +0000

After doing the reading, I believe that the core challenges have to do with how we can adapt the data for the intended use. This sounds vague, so let me expand:

In film, a big problem is how to transfer the motion of the actors to the characters that will ultimately be displayed. If the characters do not have the same proportions as the actor, how do we deal with that? Even if the proportions do match, there is no way that we can hope to capture all of the subtle movements that the human body makes, so will missing these be a problem?

In games the motions can’t be planned ahead of time, and therefore they need to be generated on the fly. If we restrict what the character can do, perhaps we can construct (both manually and automatically) sets of motions that will be enough for the purposes of the game. However, this tends to make things look repetitive, or even unnatural. As stated in one of the readings, it is not practical to just include huge libraries of examples either, so how do we deal with that?

In medical analysis, the fidelity of the data is extremely important. However, every type of sensing carries with it a certain degree of uncertainty. How can we make sure that this uncertainty will not lead to wrong medical conclusions?

With all of these problems, however, I think that what most interests me is something in between the game and film examples. How can we use synthesis by example to make it possible for a novice (aka me) to create professional-grade animations? Is that goal even possible?

By: danieljc

danieljc — Mon, 07 Feb 2011 06:37:04 +0000

One of the main challenges seems to be combining motion data with interactions from other objects. Motion data seems to be harder to use when animating something that is hitting or otherwise touching something else in the scene. Another challenge is making the transitions between different sets of motion data smooth, so that the result reads as one continuous motion. Additionally, the motion data needs to be simplified enough so that it can be applied to more than just one model and so it isn’t too complicated to use.

I am curious to learn more about the different ways that the motion data is combined and how several different sources are sometimes used to make a smooth transition. Another interesting issue seems to be the splicing together of different upper-body and lower-body motions at the same time.

By: Aaron Bartholomew

Aaron Bartholomew — Mon, 07 Feb 2011 05:18:19 +0000

1) I think that capturing the richness of human motion will be the most difficult core challenge to learn methods for. Human motion is so subjective and complex that it can sometimes be tough to describe qualitatively; trying to do this quantitatively will be even more challenging. Is there a way to generalize the feeling or style of the movement such that the data can be used to generate new motion primitives that convey the same feeling or style? We need to learn methods to capture and understand these feelings/styles in order to automate genuine, unique animations (as opposed to bland, general movements). This quantitative understanding is most needed in games, for the sake of generating/blending fluid movements that reflect a style/feeling on the fly; without a thorough understanding, motions that do not fit the style could be chosen, resulting in unnatural movements. Film may not need as deep an understanding, since it doesn’t have the real-time constraint (and can be adjusted offline).

2) At this point, I’m most interested in what I mentioned above, real-time generation/blending of motions that reflect a certain style. If we are to have believable, realistic characters in games (my primary interest), then we need to have their motions communicate a dynamic state of mind.

3) In the intro to your Mocap book, you say that video processing technology is a long way off from being able to determine the movement of someone in clothing. The authors of the paper “Video-based Reconstruction of Animatable Human Characters” present methods that make this appear feasible. However, if I remember correctly, it took six hours to process one video sequence. Thus this is still a problem, but perhaps its solution is not as far off as once thought.

By: Michael Correll

Michael Correll — Mon, 07 Feb 2011 04:49:10 +0000

What surprises me is how little progress has been made in the realm of real time human motion. It seems to be very challenging to put all of the pieces in place: implementing a wide variety of motions, blending those motion primitives together, allowing improvisation, and doing all of those things in real time and in a way that looks natural. Other than motion in cut scenes (which doesn’t count, both because it falls so far towards the control side of the Van Welbergen et al. criteria, and because motion in cut scenes has been pretty good for a while now), I haven’t really been wowed by anything in video games for a while now.

The motion system in Spore is what seems most promising to me: it tackles the issues by doing everything procedurally (well, not really, but that’s the claim). Monsters with 12 legs will move reasonably well, ditto with 8-legged creatures or creatures with caterpillar legs. Of course it is very easy to play around with this system and break it, and all of the motions generated are exaggerated and unnatural (somewhat intentionally so). But I think the fact that it’s possible to set up an a posteriori real-time motion scheme is promising.

The Gleicher paper, I think, makes a convincing argument that advances will need to be revolutionary, not evolutionary. We’re pretty close to surmounting the uncanny valley when it comes to human-looking characters, but we’re still a long way off from human-moving characters, except in limited contexts. We’ve gotten a little bit better at making “cartoony” human motion; note the hand-tweaked mo-cap canned motions in games like Resident Evil 5. In their intended context the motion doesn’t look “realistic,” but it is compelling and fits the setting (“realistic” in a Walt Disney way, maybe). But much like Spore, it’s easy to break the system by choosing odd angles and other parameters to begin the canned motions. By erring on the side of realism and control, they prevented spontaneity, the flipside to Spore’s design decisions.

I wonder how these decisions about where to fall in this vast continuum of motion choices are made. We have an idea of how game designers skimp on things like graphics or storytelling or voice acting, but how do designers skimp on motion? How can people do it well when the skill set needed to implement a motion system can drastically change based on what compromises are made? How much compartmentalization and movement occurs between the keyframe+mo-cap animators working on cutscenes and animated films, and the programmers in the trenches making motion for gameplay?

By: David

David — Mon, 07 Feb 2011 03:59:50 +0000

The main challenges seem to be finding a close match for the motion you want, and altering the original data to fit your parameters. Scaling the data set doesn’t actually help either of these. You can get a closer match if you have more motion data, but you still have to be able to describe your desired motion well enough to find the right point. Regardless of the size of your sample set, a good blending algorithm is hard to do.
One thing I noticed is that procedural techniques offer much better control, but are sometimes avoided because they look unnatural. It seems that an alternative to using motion data would be to create more realistic procedural algorithms. In other words, we currently have two extremes: procedural offers high control but low realism, and motion capture offers high realism but low control. Most of the techniques are starting at the motion capture side and trying to improve the control issue, but few are starting at the procedural side and working on realism. I also get the feeling that problems on the motion capture end are usually overconstrained (several motion primitives conflict with each other and with physicality constraints) while problems on the procedural side are underconstrained (IK being the best example). Finding some sort of “realism” optimization for a procedural animation seems simpler than figuring out which parts of a motion primitive to ignore.
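
To illustrate the "finding a close match" step discussed above, here is a minimal sketch of a brute-force nearest-pose search. The database layout (a dict of clip name to list of poses), the weighted Euclidean distance, and the function names are all assumptions; a real system would also need to handle angle wraparound and compare velocities, not just single poses.

```python
import math

def pose_distance(pose_a, pose_b, weights=None):
    """Weighted Euclidean distance between two poses (lists of joint angles)."""
    if weights is None:
        weights = [1.0] * len(pose_a)
    return math.sqrt(sum(w * (a - b) ** 2
                         for w, a, b in zip(weights, pose_a, pose_b)))

def closest_frame(database, query, weights=None):
    """Return (clip_name, frame_index, distance) for the pose nearest `query`.

    Returns None if the database holds no frames.
    """
    best = None
    for name, frames in database.items():
        for index, pose in enumerate(frames):
            d = pose_distance(pose, query, weights)
            if best is None or d < best[2]:
                best = (name, index, d)
    return best
```

The linear scan makes the scaling point visible: every extra clip adds work to each query, and a closer match is only found if the query pose itself describes the desired motion well.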

By: Reid

Reid — Mon, 07 Feb 2011 02:55:04 +0000

1. For games, the biggest problem with motion capture seems to be the number of motions required to interact with the world. Since a game is essentially a non-deterministic simulation of reality, you cannot know beforehand all possible movements a character will need to make. Since motion capture relies on specific actions being recorded, the method falls short when a character needs to do something it has no example for. In this case, devising a method that can extend existing examples to new situations would help immensely.

For things like film the actions are known beforehand, but generally the quality of the animation needs to be far superior to what is considered acceptable in a game. In this case the primary problem is recording examples that accurately depict what the director would like the motion to be, a matter of control. Perfecting it is a time-consuming process reliant on the actor’s ability to carry out the director’s instructions. Automating this process without sacrificing the realism/naturalness of the motion would be desirable.

2. I’m curious how well motion capture data has been used to generate a model-based approach to animation. Would it generalize well to new actions while retaining the style directed in the capture of the examples?