for Monday, February 14
Before reading about advanced synthesis methods, I wanted people to be familiar with the original methods. Partly because I am more familiar with them (since I was part of developing them), but also because I think they make a really good set of building blocks for what comes later.
I am going to suggest that people read “our” papers, which introduced motion graphs and blending techniques (registration curves and match webs). These ideas were introduced in a series of three papers by Lucas Kovar (and me). However, these papers (as well as the footskate paper) all became thesis chapters. The chapters are actually better than the original papers, since they add detail and in some places discuss how the work relates to things that came out at the same time.
So, you should read at least chapters 4-6 of Lucas Kovar’s Thesis.
In class, we’ll talk about some of the other things that came out around the same time, and some of the alternative choices for some of the pieces. Rather than reading about the alternatives, if you’re interested, I’d recommend 2 of the earlier extensions (again, from our group):
- Michael Gleicher, Hyun Joon Shin, Lucas Kovar, and Andrew Jepsen. Snap Together Motion. I3D 2003. (project page with video and pdf).
- Heck, Rachel and Michael Gleicher. "Parametric Motion Graphs." Proceedings of Symposium on Interactive 3D Graphics and Games 2007, April 2007. (project page with video and pdf).
The first paper deals with the efforts to make motion graphs really practical for interactive control. The second puts motion graphs and parametric control together. They are not the only papers that do those things (we’ll look at more recent ones).
In the comments, leave a remark that will make me believe that you actually read the required readings (and if you read anything additional). Also, comment on any design decisions that you question – e.g. why did you do that, rather than something else? There are lots of choices made in these approaches. Some were made well; for others, we know better now. Some things that we spent lots of time and thought on turn out not to make much of a difference.
{ 14 comments }
Kovar: Ch 1-2, 4-6
Summary:
This thesis describes methods of automating the synthesis of new human motion sequences, by building models from sets of example motions.
CH3 – Enforcement of end effector constraints, building on the Footskate paper.
CH4 – Motion graphs in gory detail, which are better than the “move trees” the game industry has been using for years because it’s all automated (except for labeling motions for the purpose of creating subgraphs based on motion type)
CH5 – Motion blending using registration curves, so that motions that are similar but not synchronized in time or alignment can be combined in sensible ways.
CH6 – Parameterized motion allows us to use the samples we have to extrapolate similar poses we don’t have.
Comments:
In Chapter 4, the thesis states that “Motion is fundamentally unchanged if it is translated within floor plane or rotated about vertical axis”, however isn’t this a bit of a limitation? I’m just wondering about the possibilities if we didn’t limit ourselves to just rotations about a vertical axis, but about other axes. We could take a motion sequence of a man stretching (touching his toes then stretching upwards) into a jackknife dive, for instance. This would probably increase the amount of computation required though, to recognize that motions can be synthesized from other motions that have wildly different orientations.
In the section on Path Synthesis it says “Paths with sharp turns cannot be fit accurately if no sharp turns are present in the data”. This sounds like a limitation that should be relatively easy to solve by divorcing orientation synthesis from path synthesis, but that could result in artifacts (if we calculated path and orientation separately maybe we could synthesize a sharp turn from a motion of a person walking in a straight line, but it wouldn’t look natural because the person would not have the slight leaning exhibited when a real person turns).
In Chapter 5, the restriction of W = 2 or 3 might be a bit stringent? Instead of dismissing motions that are too fast or too slow, perhaps we could resample them until they satisfy the requirements; this would expand the possibilities for blending. Alternatively, if two motions are identical upon resampling, throw one out – this may cut down on computation time (although it seems like it’s very fast anyway from the experimental results section).
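To make the resampling idea concrete, here is a rough sketch in Python. The function name `resample_motion` and the frames-by-DOFs layout are made up for illustration; note that straight linear interpolation is only right for translational DOFs – joint rotations would need slerp.

```python
import numpy as np

def resample_motion(frames, target_len):
    """Linearly resample a motion (frames x DOFs array) to target_len frames.

    Caveat: lerp is fine for root translation; rotational DOFs stored as
    quaternions would need slerp instead of straight interpolation.
    """
    frames = np.asarray(frames, dtype=float)
    n = len(frames)
    if n == 1:
        return np.repeat(frames, target_len, axis=0)
    # Evenly spaced sample positions in the original clip's time range.
    t = np.linspace(0, n - 1, target_len)
    lo = np.floor(t).astype(int)
    hi = np.minimum(lo + 1, n - 1)
    w = (t - lo)[:, None]
    return (1 - w) * frames[lo] + w * frames[hi]
```

A motion that is twice too fast could then be stretched to the length of its blend partner before the slope test is applied.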
———————————————–
Parametric Motion Graphs Comments:
As pointed out in the video demo, the cartwheel motion only goes to the right, so in order to turn left it has to take the long way around. In Kovar’s paper, movements and the mirror images of movements were used; perhaps that would help here. The fact that the transitions are limited to the end of one movement and the beginning of another makes this system a little less flexible than Kovar’s, however it does not require global search. But is the fact that it computes all the distances of the point cloud, only to inspect a small subset of these computation results in order to find transition points, a drawback? Perhaps it’s trivial, I don’t know.
———————————————–
Snap Together Motion:
The concept of creating hubs of common poses and having other transitions connect through these poses definitely greatly simplifies the graph structure (compare Fig 9 to Fig 4.9 in Kovar’s thesis…). The graph-building approach in this paper seems to be a good compromise between the Kovar approach (possible transitions are exhaustively found) and the PMG approach (transition only takes place near the end of one movement and the beginning of another). Looking at those two papers, however, the PMG approach may not be bad at all, since natural human motion doesn’t usually have abrupt transitions anyway.
Seems like one common ground is the need to build smooth transitions between two motions. Other aspects – like where the transitions take place – depend on how you build the graph and how you pick and weigh different transition possibilities, but the actual mechanics of blending and transitions are crucial to the quality of the overall animation. Now I’m realizing the portion of our project specification that deals with splicing is definitely the most important.
(1) Kovar Thesis and
(2) Parametric Motion Graph
*********************************************************************
I read the entire Kovar thesis and I must say that it is remarkable for its succinctness and continuity of presentation.
The basic ideas of motion graphs, registration curves, and parametric motion are very clear, but the low-level details raise some questions.
(1) In general what is the average in-degree and out-degree of the motion graph ?
(2) Are higher order derivatives ( page 56 ) useful in any situation ?
(3) There is no mention of “outliers” in the thesis or the paper, so is there a big assumption that the data is smooth and noise free?
(4) The paper described “Procedural motion” and “Synthesis by Concatenation”. My understanding is that the “Parametric Approach” is “the” way to go about creating smooth motion, so are there any situations where other approaches could be better?
(5) If there are two clips coming from two different sources, then what exactly is the meaning of “Coordinate Frame Alignment”? What is the frame of reference for any two clips, and how is it stored in a file format (for example in BVH)?
(6) The second paper describes some of the limitations of the approach and proposes some practical solutions; are those workarounds automated or do they require the user’s assistance?
I read chapters 4-6 of Lucas’ Thesis. I skimmed over some of the math, especially the optimization discussions. I think I’ve got the main ideas though.
Chapter 4 discusses motion graphs, which are basically an attempt to find similarities in two motion sets with the intent to blend from one into the other at will. When this is scaled to many motion sets, it becomes possible to develop a variable-length animation by simply choosing the right motion set to blend into. Using this method, the author was able to generate animations of a character moving along a path. This chapter also introduces the concepts involved in measuring the distance between two poses. The biggest trick here is to get everything aligned properly. This is posed as an optimization problem using a rotation about the y axis and a translation over the x and z axes.
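That alignment problem has a closed-form solution in the thesis. A sketch of the 2D fit (rotation about the vertical axis plus translation in the floor plane) might look like the following; the function name and (x, z) point layout are my own, and the formulas should be read as a reconstruction of the weighted least-squares fit rather than a verbatim copy of the thesis.

```python
import math

def align_2d(P, Q, w=None):
    """Closed-form (theta, x0, z0) minimizing the weighted squared distance
    between point clouds P and Q, after rotating Q about the vertical axis
    by theta and translating it by (x0, z0) in the floor plane.
    Points are (x, z) pairs; w holds per-point weights."""
    if w is None:
        w = [1.0] * len(P)
    sw = sum(w)
    # Weighted coordinate sums (the "bar" terms in the usual formulation).
    xb = sum(wi * x for wi, (x, _) in zip(w, P))
    zb = sum(wi * z for wi, (_, z) in zip(w, P))
    xpb = sum(wi * x for wi, (x, _) in zip(w, Q))
    zpb = sum(wi * z for wi, (_, z) in zip(w, Q))
    num = (sum(wi * (x * zp - xp * z)
               for wi, (x, z), (xp, zp) in zip(w, P, Q))
           - (xb * zpb - xpb * zb) / sw)
    den = (sum(wi * (x * xp + z * zp)
               for wi, (x, z), (xp, zp) in zip(w, P, Q))
           - (xb * xpb + zb * zpb) / sw)
    theta = math.atan2(num, den)
    x0 = (xb - xpb * math.cos(theta) - zpb * math.sin(theta)) / sw
    z0 = (zb + xpb * math.sin(theta) - zpb * math.cos(theta)) / sw
    return theta, x0, z0
```

Because the rotation is restricted to the vertical axis, the fit reduces to a single `atan2`; allowing arbitrary 3D rotation (as the jackknife-dive comment above suggests) would turn this into a full Procrustes problem.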
Chapter 5 looks more carefully at the idea of blending different animations to synthesize new animations using registration curves. These consist of three parts. The first is a time warp curve which attempts to align similar poses in time, reducing artifacts in the blending stage. The second is alignment matching. This is an attempt to, well, align the coordinate spaces of two different motions. This is required to avoid nasty artifacts which are again induced at the blending stage. The final stage is constraint management. This stage attempts to smartly combine constraints from both motions so that artifacts are reduced.
Chapter 6 seems to expand on chapter 5 by introducing the parametrization for the synthesized motion. This also involves searching for motions that can be blended in order to build the parameter space. This search is built using a match web. I don’t completely understand the construction of this data structure. However, once a suitable number of matches is found, motion can be created by using the blending techniques discussed in chapter 5.
If you haven’t stumbled upon them yet, check out the videos, they do a good job of getting the main concepts across and make the papers a little easier to read.
http://www.cs.wisc.edu/graphics/Gallery/kovar.vol/RegistrationCurves/
http://www.cs.wisc.edu/graphics/Gallery/kovar.vol/ParamMotion/
I read chapters 4-6 of Kovar’s Thesis, and skimmed the paper and project page of Snap Together Motion.
The first thing that caught my attention was how similar the first chapter (4) was to the paper it originated from. Many of the images looked rather familiar. More importantly, I was interested in the pruning section. The text mentions that pruning is needed to remove the possibility of sinks and dead-ends, but it occurs to me that such stopping points may be valuable.
While graphs that can be traversed continuously are good for animations that are long, often there are points in an animation where finality is required or preferred. In particular, death scenes where the character is shown to be dying bring the expectation that no further movement will be forthcoming. Granted, these motion segments are uncommon compared to all the other motion types observed, but they are sometimes needed. Removing them automatically via pruning may restrict what is available to an animator when choosing the proper path on the graph.
The other chapters were harder to follow, though I wonder if that is due to the fact that I read the first chapter’s paper prior to this reading. However, a couple things stood out to me. First, I was struck by the similarity between the problem being solved here and the problem being solved by the massive databases papers from the inverse kinematics topic. While the size of the database and the methods are considerably different in Kovar’s thesis, in both cases an attempt is made to synthesize motion from a prior collection of examples. I believe the Wei paper was focused more on poses, but still the concepts seemed similar in my mind.
Second, in the third chapter, there was a section that talked about the system being unable to distinguish between the action of placing something on a shelf and taking it off the shelf. It then states that a human can tell the difference between the two actions by looking at the arm positions at the beginning of the sequence. Now while this may be very true if I could see the motion in action, what I noticed was that the general positioning of the arms was actually very close – if one of the motions was played in reverse.
In terms of motion matching, I wonder if it occurred to you and Kovar to do searches on the motion segments when the timing was reversed. Not only would that provide an instant doubling of available data, it may provide unseen opportunities for blending. Now the difficulty would be ensuring that the motion looked correct when played backwards, and it may be the case that it would not work for this reason, but from the single frames shown in the chapter as an example, it certainly appears to be a valid question.
* Note:
I’d like to thank Jim for finding those video links. (The links in the thesis are no longer working.) They really help get the ideas across far better than still photos.
I read parts 4-6 of Lucas Kovar’s Thesis and also the extended abstract from Parametric Motion Graphs.
Section 4 from the thesis discussed the reason for motion graphs and how they are created. The idea of first converting the skeleton motions into point clouds before the comparison seemed like an interesting choice. On a basic level it surprised me that this is what was chosen, but I can see how this would help.
I also wondered if there might be a way to include mirrored data matches linked to its unmirrored data somehow in the motion graph rather than including the data twice. Section 5 talked about “Registration Curves” which could be used to more cleanly merge several different motions. Section 6 was about Parameterized Motion, which provided ways of identifying specific types of motion in a large collection of motion data and grouping these similar motions together for blending. I wondered about the “time correspondence” on this which required the time between actions to match closely. It seems like pauses between portions of certain actions could be common even if the overall action is similar.
Readings: Chapters 4-6 of Kovar’s Dissertation
The basic idea behind chapter 4 is to be able to produce more realistic animated motion by linking short motion clips. The challenge is to be able to connect the motion clips in a seamless and believable way. And the algorithm does this automatically by producing transition motion between the sets of motion which also satisfies certain user-defined constraints. The strategy involves finding numerical similarity between the poses in two frames using a distance metric. What is interesting to note is that the simple approach of using weighted joint data is not sufficient and that markers need to be attached to joints for calculations.
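The marker-based comparison is done over a window of frames: the point clouds from neighboring frames are stacked and compared as one big cloud. A sketch of that step (hypothetical names; the optimal rigid 2D alignment discussed in the thesis is assumed to have been applied to one window beforehand):

```python
import numpy as np

def make_window(frames, i, k):
    """Take the point clouds of frames i .. i+k-1 as one comparison window."""
    return frames[i:i + k]

def window_distance(window_a, window_b):
    """Sum of squared distances between corresponding markers of two stacked
    point-cloud windows (k frames x m markers x 3 coordinates).  The optimal
    rigid 2D alignment is assumed to have been applied to window_b already."""
    a = np.asarray(window_a, dtype=float).reshape(-1, 3)
    b = np.asarray(window_b, dtype=float).reshape(-1, 3)
    return float(np.sum((a - b) ** 2))
```

Using a window rather than single poses is what lets the metric capture velocity and higher-order differences, which joint angles alone miss.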
Chapter 5 deals with the aspect of blending existing captured motion and generate ‘new’ motion. It introduces the concept of registration curves which can arbitrarily blend a number of input motions. The idea is to find logically similar motion sequences where the skeleton is in more or less similar positions. It does so by using time-warping to align in-phase motion frames, aligns coordinate frames and deal with constraints from the motion sets. This method does give good results for motion data for a single database -the biggest advantage is that it provides real time results to the user. But what if I want to generate motion blends from two different databases with similar poses (not arbitrary ones) but different bone lengths?
Chapter 6 provides a method to construct parametrized motion from a large set of motion data. The main idea is to search for logically related motion segments using a query, via a novel method in which the matches themselves are used as queries later on. This expands the search space. The use of a match web to pre-compute possible matches further enhances the search. The matches are then blended to produce the parametrization space.
Ch. 4-6 of thesis:
Chapter 4 describes motion graphs, which are animation clip ‘nodes’ connected by transitional clip ‘edges’. The advantage of such a graph is that it can be searched to construct an arbitrary sequence of animated motion that satisfies some set of constraints. It commented that searching the graph for an optimal solution is impractical computationally, but that approximate solutions are often reasonable. It then proceeded to discuss a method of branch and bound search combined with optimization heuristics to select a path. It noted the downsides to such a method were that literally any walk was valid if it got the best result (turning in place for 7 seconds to line up to the goal state) or that by being too specific it would ignore reasonable alternatives.
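A toy version of that bounded search, under the assumption that each edge carries a cost for how badly its clip fits the user's constraints (the names and the fixed walk length are simplifications of the incremental branch-and-bound in the thesis):

```python
import heapq

def cheapest_walk(graph, start, n_steps):
    """Cheapest n_steps-edge walk through a motion graph.

    graph: node -> list of (next_node, edge_cost) pairs, where edge_cost
    stands in for how badly that clip violates the user's constraints.
    Best-first expansion with a bound: partial walks that already cost more
    than the best complete walk are discarded.
    """
    best_cost, best_path = float('inf'), None
    heap = [(0.0, [start])]
    while heap:
        cost, path = heapq.heappop(heap)
        if cost >= best_cost:
            continue  # bound: cannot beat the best complete walk found so far
        if len(path) == n_steps + 1:
            best_cost, best_path = cost, path
            continue
        for nxt, c in graph.get(path[-1], []):
            heapq.heappush(heap, (cost + c, path + [nxt]))
    return best_cost, best_path
```

The "turning in place for 7 seconds" pathology mentioned above shows up here too: nothing in a pure cost minimization forbids a cheap but silly walk, which is why the objective needs care.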
Chapter 5 discussed a new data structure that encodes information about several motions, allowing them to be more effectively blended. The first part encodes data related to the timing of the two motions, by tracking which parts (poses) of each motion are similar and then using dynamic programming to produce a best-fit path between these. Allowing similar motions to match increases the chances that the blend will be smooth. Second, the orientation of each motion is encoded so that they may be aligned when blended. The given example is of blending a right turn and a left turn into a straight line. Finally, similar constraints between the motions are grouped, split, or removed in the final blend.
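That dynamic-programming fit is essentially dynamic time warping. A minimal sketch (made-up names, and without the slope limits the thesis imposes on the timewarp curve):

```python
def timewarp(dist):
    """Minimum-cost monotonic alignment between two motions, given a
    frame-to-frame distance table dist[i][j].  Classic dynamic time warping;
    the thesis additionally constrains the slope of the resulting curve."""
    n, m = len(dist), len(dist[0])
    INF = float('inf')
    cost = [[INF] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    cost[0][0] = dist[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            # Cheapest of the three monotonic predecessors.
            prev = min((cost[a][b], (a, b))
                       for a, b in ((i - 1, j), (i, j - 1), (i - 1, j - 1))
                       if a >= 0 and b >= 0)
            cost[i][j] = dist[i][j] + prev[0]
            back[i][j] = prev[1]
    # Walk the backpointers from the final correspondence to recover the path.
    path, ij = [], (n - 1, m - 1)
    while ij is not None:
        path.append(ij)
        ij = back[ij[0]][ij[1]]
    return path[::-1]
```

The returned (i, j) pairs are the frame correspondences that the blend then interpolates between.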
Chapter 6 discusses a technique for automatically identifying similar motions to a given query. In this process a search web is constructed that generates a space of all distinct motions that could be matched. When a query is given this web is searched using time constraints and a numerical similarity measure to select matches that are similar to the query. The goal of this system is to create new motions by interpolating between the results of the query. This is done using the registration curves from chapter 5 to blend groups of similar motions from the results to create a dense sampling of the interpolation space. Then the final interpolation is done using this dense sampling.
I also read Ch 4-6 of Lucas Kovar’s thesis.
Chapter 4 presented motion graphs as a way to create arbitrarily long sequences of motion from a set of example motions. Transitions are formed between frames with similar poses, calculated using a novel distance metric. Search is used to walk the graph and synthesize new motions that satisfy constraints and optimize some criteria. As you pointed out, there are some things with this work that you spent lots of time and thought on, but which turned out not to make much of a difference. I’m wondering if one of these areas would perhaps be the discussion on pruning the graph. As Nathan already pointed out, it would seem that some of these sink and dead end nodes to be pruned would actually be quite desirable for finalizing a motion.
Chapter 5 dealt with registration curves, the components of which are timewarp curves, alignment curves, and constraint matching. Registration curves provide a significant improvement over existing blending algorithms, by resulting in blends (both transitional and otherwise) that look better and are easier to control.
Chapter 6 talks about parameterized motion. A query motion is used to find a subset of motions that are logically similar. These motions create a parameter space of possible new motions that can be created by interpolating between examples.
Chapter 4:
This chapter discusses the overall ideas for generating motion graphs, which serve to automatically identify where different motions are similar so they can be synthesized and transitioned between. Smooth transition points are found by using the minima of squared distance of point clouds (representing the skeleton’s joints) between frames of motion. The transitions themselves are then generated by interpolating between the motions. This alone, however, leads to an objective function that is too specific and produces arbitrary motion, so a cost function needs to be introduced to facilitate motion along user-specified paths.
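The interpolation for a transition typically ramps with a C1 ease curve so the blend starts and ends with zero velocity. A sketch with made-up names, treating frames as flat DOF lists (rotational DOFs would really need slerp rather than straight lerp):

```python
def transition_weight(t):
    """C1-continuous ease weight: 1 at t=0, 0 at t=1, zero slope at both
    ends (the 2t^3 - 3t^2 + 1 falloff commonly used for transition blends)."""
    return 2 * t ** 3 - 3 * t ** 2 + 1

def blend_transition(clip_a, clip_b):
    """Crossfade the tail of clip_a into the head of clip_b.

    Both clips have equal length; each frame is a flat list of DOF values.
    """
    k = len(clip_a)
    out = []
    for i in range(k):
        w = transition_weight(i / (k - 1)) if k > 1 else 0.5
        out.append([w * a + (1 - w) * b for a, b in zip(clip_a[i], clip_b[i])])
    return out
```

The zero-slope endpoints are what keep the velocity continuous where the transition meets the original clips.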
Chapter 5:
Registration curves are data structures that combine timing, local coordinate frames, and constraint states of input motions to determine blending for generating new motions. Timing is needed to remove artifacts from combining unrelated portions of motion; this is done by creating a curve which allows for blending at only corresponding frames. Coordinate frame alignment is needed to make sure these motions are blended as though the character’s body maintains a consistent global position and orientation. Finally, constraint matching is used to make sure the different sets of constraints for individual motions are sustained.
Chapter 6:
Parametrized motion is the result of building a continuous space of motions defined from logically similar ones from a database. Numerically close matches (based upon skeletal poses and frame correspondence) are found and put into a match web for look up. The paths of this web are then used to sample the parameter space, and blending is then used to make continuous motions possible.
I guess this is more of a question than a comment, but how would you address the issue of motion control being limited to nodes in the graphs? It seems that for practical usage for something like a game, responsiveness may be more important than fluidity, so a motion-interrupt system would be a necessity.
I read chapters 4-6 of Kovar’s Thesis
Chapter 4 gives a detailed description of motion graphs, including how to build them by finding similar motions using a point-cloud distance metric. Transitions are created at local minima, and the user can define thresholds to trade off motion quality against transition density. Motion graphs can be used to create continuous motion by building random walks through the graph. The graph can also be searched for a motion by specifying an objective function. Path synthesis is formulated as an optimization problem comparing an existing walk path and a new walk path.
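The random-walk synthesis is simple once the graph exists; a sketch with hypothetical names (which also shows why dead ends have to be pruned first – a node with no successors would make the walk fail):

```python
import random

def random_walk(graph, start, n_steps, rng=None):
    """Generate a clip sequence by randomly walking a motion graph.

    graph: node -> non-empty list of successor nodes.  Sinks and dead ends
    must have been pruned beforehand, which is exactly why the thesis prunes
    the graph before synthesis.
    """
    rng = rng or random.Random(0)
    path = [start]
    for _ in range(n_steps):
        path.append(rng.choice(graph[path[-1]]))
    return path
```

Each node in the returned path names a clip; playing them in order (with blended transitions at the edges) yields arbitrarily long motion.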
Chapter 5 introduced registration curves. Registration curves are a data structure that can be used to automatically generate blends of an arbitrary number of input motions; they expand the range of motions that can be blended with automatic methods by combining the relationships between the timing, root trajectory, and constraint state of the input motions. Registration curves can also serve as a back end for common blending operations such as transitioning, interpolation, and continuous control.
Chapter 6 presents automated methods for extracting logically related motions from a data set and converting them into an intuitively parameterized space of motions. The paper introduces a search method that uses numerically similar matches as intermediaries to find more distant matches, and an automatic procedure for parameterizing a space of interpolations according to user-specified motion features. The algorithm samples interpolations to build an accurate approximation of the map from motion parameters to interpolation weights, and it incorporates a scalable scattered data interpolation method that ensures interpolation weights have reasonable values.
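I'm not certain of the exact scattered-data scheme the chapter uses, but a generic inverse-distance-weighting stand-in shows the shape of the problem: map a point in parameter space (say, reach height or turn angle) to normalized blend weights over the example motions. All names here are illustrative.

```python
def idw_weights(query, samples, power=2.0, eps=1e-9):
    """Inverse-distance interpolation weights over example motions.

    query: parameter values for the desired motion (list of floats).
    samples: parameter values of the example motions.
    Weights sum to 1, and a query coinciding with a sample gets (nearly)
    all of the weight.  A generic stand-in for the thesis's scalable
    scattered-data interpolation, not its actual method.
    """
    dists = [sum((q - s) ** 2 for q, s in zip(query, smp)) ** 0.5
             for smp in samples]
    raw = [1.0 / (d ** power + eps) for d in dists]
    total = sum(raw)
    return [r / total for r in raw]
```

The resulting weights would then drive a registration-curve blend of the example motions.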
I read Ch 4-6 of Kovar’s thesis.
Chapter four presented a method for automatically concatenating motion data. In its essence, this paper used a graph-based approach to weave together different frames of animation by generating transitions between different motions. On one hand, generating a transition for each viable motion pair does allow the user to see effective transition points that may have previously been overlooked; however, either allowing user specification of points of interest or intelligently selecting frames to link (see Ch6 discussion) would remove the bottleneck of computing transitions over the entire graph. Also, it would be interesting if more contextual/semantic consideration was made in the decision of which frames to transition between.
Chapter five presented a method for intelligently blending motions. This presented a really interesting potential extension to improve the work in Ch4. While I didn’t quite grasp how multiple motion blending worked for more than 2 motions, it seems like a really interesting concept. Also, the time warping and alignment constraints set forth in this section seem very similar to biological sequence alignment: make the motion sequences similar (by warping/gapping), then line them up over the warp to make their comparison semantically meaningful.
Chapter six presented a method for constructing parameterized motions of large motion data sets. This technique offers a potential solution to the bottleneck in Ch4 by the informed selection of ‘similar’ motions under a given set of constraints. Like the above two sections, the parameterized motions work appears to quantize motion to a point where different algorithmic techniques can be used to manage and manipulate motions and motion sets. Perceptual issues aside, it feels like computations such as the similarity between two distinct motions could potentially set the foundations for a mathematical notion of what is ‘good’ motion synthesis. Also, the consideration of context, though not often discussed in these sections, may be able to help to speed up the overall computation by providing further constraints for synthesis.
I read Lucas’ thesis ch. 4-6 and also the Parametric Motion Graphs paper.
I’ve been exposed to these papers/concepts before, but I suppose one question I still have is whether or not there’s been much adoption of the concepts in them. Whenever I read about motion in video games (which, granted, is not very often), it’s usually either hand animation of motion, hand tweaking of mo capped data, standard move graphs, or some combination of the above. Motion graphs in particular seem like a better solution for pretty much any criterion, and if they were more or less real time 3 years ago, I’d imagine industry adoption would make these things very fast, very robust, or very good (pick two, I suppose).
So do people use these, especially in the interactive setting? If not, why not other than “they haven’t heard about them yet?”
I read 1-5 and had enough time to only skim over 6 of Lucas’s thesis.
Chapter 4 discusses the notion of motion graphs and how to use them to generate new sequences of motion (of arbitrary length) from a bunch of “fixed” example motions. The problem is multifold and the chapter takes the reader through this journey in a wonderful manner. First is figuring out what the user input will be and how it can be used to create the motion graph. In this case, the user annotates the motion data which is then used to figure out the transition points (poses in different data sets that are reasonably similar) in the data, which can then be blended if chosen during the graph “walk”.
The novel distance metric uses point clouds from the marker data and a window (of frames) while comparing two motion frames for their similarity. The metric is designed to be invariant under rigid 2d transformations (translation along X,Z and rotation about Y) and its calculation optimizes on some of the inherent redundancy. Transitions are created between those points (i mean frames) for which the distance metric is below a certain user-specified threshold. The blending operation between them involves linear interpolation between the root positions and slerp for the marker/joint orientations (what about their positions??).
Since the blending operation can introduce violations of kinematic constraints, the constraints from the original frames are chosen and utilized in the blending process.
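On the parenthetical question about positions: joint positions follow from the skeletal hierarchy once orientations are set, so only the root position is interpolated directly. A sketch of the per-frame blend (made-up names; quaternions stored as (w, x, y, z)):

```python
import math

def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:                       # take the shorter arc
        q1, dot = [-c for c in q1], -dot
    dot = min(dot, 1.0)                 # guard acos against rounding
    theta = math.acos(dot)
    if theta < 1e-6:                    # nearly identical: fall back to lerp
        return [(1 - t) * a + t * b for a, b in zip(q0, q1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(q0, q1)]

def blend_frame(root_a, quats_a, root_b, quats_b, w):
    """Blend one frame: lerp the root position, slerp each joint quaternion.

    w is the weight on motion A; joint positions then follow from forward
    kinematics on the blended orientations.
    """
    root = [w * a + (1 - w) * b for a, b in zip(root_a, root_b)]
    quats = [slerp(qb, qa, w) for qa, qb in zip(quats_a, quats_b)]
    return root, quats
```

Running this per frame along the transition, with the weight easing from 1 to 0, gives the blended clip.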
As others mentioned earlier, graph pruning might discard useful data such as the dead nodes, so it might not always be the best choice and would be better off being a user input.
The problem of motion synthesis from the motion graph is tackled next using a branch-and-bound based algorithm, wherein a halting condition is required but should not be too strict (kicking will do; no need to restrict the search by saying this “kind” of kicking…). The idea of not wanting to “lose” high-frequency data also came up.
Overall, the chapter was quite beautifully presented and an absolute pleasure to read.
Chapter 5 discusses registration curves and how they can be utilized to synthesize new “actions”, which was not possible in the then state-of-the-art blending. Registration curves are data structures that encapsulate relationships involving the timing, local coordinate frame, and constraint states of a bunch of input motions. By time-warping the input motions and applying many of the ideas from chapter 4, a lot of artifacts in combining the (unrelated) motions can be avoided. Coordinate frame alignment is used to align (for lack of a better word) the skeletal poses and helps in the construction of an alignment curve that specifies rigid 2D transformations that align the frames with the timewarp curve. Then constraint matching is used to make sure that a decent subset of the constraints, which could now have overlaps, are taken into account to create the blended motion.
Chapter 6 deals with the construction of parametrized motions from large data sets, which then allows searching for a query motion. A match web is used for this purpose and uses time and similarity constraints for the search. A lot of ideas from the previous chapters are reused in the process. I only skimmed through the ideas in this chapter, so I definitely need a re-read to see how a lot of scale issues are tackled since it uses more of a “global” view.
I find that a lot of these ideas are evolutionary (isn’t it almost always the case?), wherein something is tried and slowly improved/built on step by step.
I read chapters 4-6 of Kovar’s thesis.
Chapter 4 dealt with creating motion graphs, and blending different motions to automatically generate motion sequences based on a specified path. The biggest problem here seemed to be figuring out the right parameters to adjust in order to get results that the user had at least some control over. Chapter 5 discussed registration curves, which matched the timing, rigid transformations, and constraints between similar motions. I found this chapter most interesting. Mainly, I was surprised that his way of dealing with constraints actually works. Chapter 6 is about parameterized motions, a weighted combination of other similar motions in a database. This idea seems simple in theory. It seemed like the most challenging part of this technique is matching similar motions, which Kovar addressed by building search webs.
{ 1 trackback }