Avatars: Semi-Autonomous Puppets
Notes On the Art of Avatar Animation


JJ Ventrella
* Inventor of Puppeteering for Second Life
* Main Designer of There's Avatar Communication
* Philosopher of Avatar Genetics


March, 2005


We are at the cusp of a new age in history in which most of us will have an alter-ego, indeed, many alter-egos, living in worlds that do not correspond to the space and time of the physical world, and yet these worlds will be just as important in our emotional lives as the physical one. Sound creepy? I think it's a little creepy - but it is also damn exciting!

Let me set the stage of my discussion by way of a question: Your avatar has fallen from a great height, and is getting up from a fumble. How long should it take for your avatar to get up and brush the dust off its pants? And should you be able to interrupt this animation, cool as it is, if you are in a hurry to get moving? These are nuances indeed, but I bring them up to muse on a larger general issue, which is: how much of my avatar's motion should be dictated by pre-scripted animations? Is my avatar some animator's work of art, or is it strictly My Puppet? I believe there is no one answer. It depends on the story, the depth of the simulation, and the anticipated goals of the user. The art of avatar-making is a hybrid art - a mixture of portraiture, character animation, AI, genetics, physics, and user-interface design. Here I will take a look at four topics surrounding the art of making Avatars. They are:

1. The Intentional vs. The Physical Avatar
2. The Avatar Camera: Third-Person View, or Astral-Projection?
3. Am I Chatting, or Am I Typing At a Keyboard?
4. Body Language

1. The "Intentional" vs. the "Physical" Avatar
The first topic I would like to mention is an issue that has to do with what YOU want to do as a controller of your avatar vs. what your avatar MUST do as a physical object in a simulated world. One way to illustrate this is with a navigation question: Should the avatar slow down when climbing up a hill and speed up when walking down a hill? If so, how much? And if not, how steep does the hill have to be before PHYSICS wins out over user-intention? If you the user want to go from here to there, and there is a bump in the way, do you want your space-time trajectory to be interrupted by the physics of hills and the constraints of anatomy?
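One way to make the hill question concrete is a toy blend between the speed the user asks for and the speed a crude physics model would allow. The sketch below is only an illustration: the function name `walk_speed`, the parameter `takeover_deg`, and the sine-based slope penalty are all assumptions of mine, not a description of any shipped system.

```python
import math

def walk_speed(intended_speed: float, slope_deg: float,
               takeover_deg: float = 40.0) -> float:
    """Blend user intention with a toy slope model (hypothetical).

    slope_deg > 0 means uphill, slope_deg < 0 means downhill.
    takeover_deg is the assumed steepness at which physics fully wins.
    """
    # Physics gets more say as the hill gets steeper: 0 on flat ground,
    # 1 at (or beyond) takeover_deg.
    physics_weight = min(abs(slope_deg) / takeover_deg, 1.0)

    # Toy physics: slower uphill, faster downhill, never negative.
    slope_factor = 1.0 - math.sin(math.radians(slope_deg))
    physics_speed = max(0.0, intended_speed * slope_factor)

    # Linear blend between what the user wants and what the hill allows.
    return (1.0 - physics_weight) * intended_speed + physics_weight * physics_speed
```

With these made-up numbers a gentle rise barely slows the avatar, while a slope at or past `takeover_deg` is governed entirely by the toy physics. Where to set that threshold is exactly the design question posed above.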

Another question: if an avatar falls and starts tumbling down a mountain, should the user have to wait until the avatar has finished being a rag-doll before taking control where physics left off? What if it takes 13 minutes to tumble down a hill?

One thing's for sure: if you did have to wait and watch your avatar falling for 13 minutes, it would not feel much like the avatar represents you and your desire to turn around and climb back up. In some game scenarios, a long and painful tumble can serve as punishment for making a bad move. But in another kind of game, say a Mario-type game, the goal is to keep moving, grabbing coins, and foiling monsters - no time for deep Newtonian physics. Mario is more of an "intentional" avatar, one who responds quite predictably and obediently to the user's navigation control. Physics is less important than user intention.
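A possible middle ground for the tumbling case is to let physics run, but cap how long the user can be locked out. The following is a minimal sketch under that assumption; the class `RagdollState` and its `max_lockout` policy are hypothetical names invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class RagdollState:
    """Tracks a rag-doll tumble and decides when the user regains control."""
    max_lockout: float      # assumed policy: longest forced wait, in seconds
    elapsed: float = 0.0    # time spent tumbling so far
    active: bool = True     # True while physics owns the avatar

    def update(self, dt: float, at_rest: bool, user_wants_control: bool) -> bool:
        """Advance by dt seconds; return True if the user is now in control."""
        if not self.active:
            return True
        self.elapsed += dt
        # Physics ends on its own when the body comes to rest; otherwise
        # the user may interrupt once the lockout period has passed.
        if at_rest or (user_wants_control and self.elapsed >= self.max_lockout):
            self.active = False
        return not self.active
```

A punishing game might set `max_lockout` high; a Mario-type game might set it to zero, so that intention always wins.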

Traveling Through Virtual Space
One example of "Intentional" navigation is flying: "Hell, why should I bother clambering over that rocky hill, I'll just fly over the damn thing!". Hell yeah - that's the stuff of dreams, and it's one way to get around the hill problem. But wait a minute - things are starting to get a little jumpy here. And then there's teleporting - which is even more directly intentional (and the least physical).

A question arises when we virtual world designers find ourselves catering to an innate desire for the freedom to go anywhere, any time, instantly: if this total freedom were allowed, wouldn't that completely dissolve the space-time fabric that gives meaning to a virtual world as a Place? A line must be drawn. This is not the World Wide Web after all, which has no geography, and in which the whole idea is that you can jump from a web page in Bangkok to one in Menlo Park in one click. No, we love our virtual worlds for their realism, for the constraints they impose, which resemble the real constraints of life. In a way they provide comfort and familiarity. Conclusion? It's all about finding that fine balance between satisfying user-intention and constraining that intention with real-world-like constraints.

2. The Avatar Camera: Third-Person View, or Astral-Projection?
The term "third person" originates from the art of narrative, as when speaking of a "third person story" - one told by a narrator who is not one of the characters. While we have adopted this term in software entertainment, it doesn't mean the same thing. In fact, what we typically craft as the navigation and camera behaviors of a third-person view could just as easily be called "Astral-Projected" first-person.

Basically, it is a detached viewpoint which follows the avatar, in a loose manner, usually from behind, especially when the avatar is walking. An important distinction is worth pointing out: "Immersive" Virtual Reality, which uses a head-mounted display and data-gloves, has no need for treating the camera as an object in the world. The camera IS the user's head. The user moves the camera around by moving his/her head. Not so in a virtual world in which a computer monitor, keyboard, and mouse are the interface devices. In the third-person genre, we NEED a camera. We need a cinematographer, one that has some degree of autonomy and decision-making power.

For this reason, it would be a mistake to build a virtual world and think you don't have to know anything about the laws of Cinema.

In the case of Avatar Craft, we expect our autonomous camera-person to stay nearby. It's not Cinema in every respect: we do not expect the camera to cut to a scene on the other side of the island to flesh out the story or to cut back in time to show a memory flashback. The cinema stays within the immediate sphere of the "First Person" (the user), and yet it is detached, semi-objective.
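The simplest version of a camera that follows loosely from behind and stays nearby can be sketched as a point that eases toward a spot behind the avatar each frame. This is a bare-bones illustration, not anyone's production camera: the function name `follow_camera`, the z-up coordinate convention, and the `distance`, `height`, and `stiffness` defaults are all assumptions.

```python
import math

def follow_camera(cam, avatar_pos, avatar_heading, dt,
                  distance=4.0, height=2.0, stiffness=3.0):
    """Move the camera a little toward a spot behind and above the avatar.

    cam, avatar_pos: (x, y, z) tuples, with z assumed to point up.
    avatar_heading:  facing direction in radians, measured in the x-y plane.
    dt:              time step in seconds.
    """
    # The spot the camera would like to occupy: behind the avatar, raised.
    target = (
        avatar_pos[0] - math.cos(avatar_heading) * distance,
        avatar_pos[1] - math.sin(avatar_heading) * distance,
        avatar_pos[2] + height,
    )
    # Exponential easing: the "looseness" of the follow. Higher stiffness
    # means a tighter leash; the result is independent of frame rate.
    t = 1.0 - math.exp(-stiffness * dt)
    return tuple(c + (g - c) * t for c, g in zip(cam, target))
```

A real cinematographer would also have to avoid walls, frame conversations, and so on. This only captures the leash.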

3. Am I Chatting, or Am I Typing At a Keyboard?
A question pertaining to the world of avatars who communicate with text is this: What exactly is the communication metaphor? Here we have a fairly believable world with believable avatars. Very cool, very dynamic. But behind every avatar is some crazy geek without a life frantically jabbing at a keyboard. Of course, if we project into the future, this odd scenario will no longer be an issue - because we will have moved beyond the absurd bottleneck of keyboards and mice, and will be using gestures, gaze, and voice. But for the time being, we're stuck with an unlikely set of clunky communication tools while our avatars evolve quite speedily toward high expressiveness and realism. A bottleneck indeed.

I'll just leave this as a question. To what degree does your avatar represent you - and this silly stuff you are doing over there on the keyboard, in the physical world?

Is Chatting Acoustic or Electric?
A chat bubble that emerges from the head of an avatar (such as in There) is a form of ectoplasm which is emitted from the brain (or mouth) of the avatar. It lives in the simulated world of avatars, and is therefore more like acoustic energy than an IM text message. The "old-style" form of avatar chat, which appears as lines of text in a window at the bottom of the screen, is inherently disconnected from the simulated world. "Avatar-Centric Communication" (There, Inc.) aimed to solve this problem. I believe we have only begun to explore the space of this new form of communication, especially with regard to the next topic:

4. Body Language
A senior Pixar animator knows a lot about Body Language. Indeed, great animators are really actors - they just act through the characters they create instead of their own bodies. Gesture, pose, and gaze are important tools of the animator. But does this craft translate to online worlds?

I would claim that it doesn't translate very much. In the case of our craft - the craft of making immersive, social, virtual worlds - our task is to enable real people to communicate with each other in a way that only a 3D virtual space can allow. There is no narrative, no story-line, other than the drama created on the fly by the users themselves. The goal is to provide a tool for expression.

Having said that, Role-Playing tends to be a part of the experience of going to online worlds, for many people, and to the degree that a user wants his/her avatar to assume a specific personality or character, some degree of automatic character animation is appropriate and expected. It just happens on the micro level, within a large and flexible vocabulary of poses, gestures, and individual nuances, which are indeed part of the toolset of the character animator. You could say in this case that the character should have all the integrity and nuance of an animated film personality, yet it exists outside of any movie-like narrative.

Avatar as Communication Tool
The avatar is an instrument of user-expression and individuality. The more it can reveal the intentions, emotions, and thoughts of a user, the more powerful it is. Does this mean that it must strictly be a Puppet? Should every motion, every twitch of the finger, be controlled by the user? Certainly not. Many levels of animation should be involuntary and automatic, such as breathing and blinking, reacting, and striking subtle poses and gestures. The details of walking are not puppeted by the user: rather, the direction, speed, and perhaps the "mood" are.
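This division of labor can be illustrated with a toy layering: the user supplies only direction, speed, and mood, and the avatar fills in everything else on its own. The sketch below is a hypothetical illustration; `UserIntent`, `avatar_frame`, and every constant in it (stride rate, blink timing, tilt range) are invented for the example.

```python
import math
from dataclasses import dataclass

@dataclass
class UserIntent:
    """The only things the user puppets directly (assumed set)."""
    heading_deg: float   # which way to walk
    speed: float         # how fast, in arbitrary units
    mood: float          # 0.0 = glum ... 1.0 = cheerful

def avatar_frame(intent: UserIntent, t: float,
                 blink_period: float = 4.0, blink_length: float = 0.15) -> dict:
    """Return one frame of pose parameters at time t (seconds)."""
    stride_hz = 0.9 * intent.speed            # made-up gait rate
    return {
        # Voluntary layer: passed straight through from the user.
        "heading_deg": intent.heading_deg,
        # Semi-voluntary layer: the user picks a mood, the avatar poses it.
        "head_tilt_deg": -15.0 + 30.0 * intent.mood,
        # Involuntary layers: the user never touches these.
        "leg_swing": math.sin(2.0 * math.pi * stride_hz * t),
        "chest_rise": 0.5 + 0.5 * math.sin(2.0 * math.pi * 0.25 * t),
        "eyes_closed": (t % blink_period) < blink_length,
    }
```

Moving a parameter from one layer to another (letting the user blink on command, say) is a small code change and a large design decision.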

Once again, where user-control of avatar motion stops and scripted (or automatic) animation begins is not a simple matter. Knowing where to draw that line is part of the hybrid Art of Avatar Animation.


---------------------------