There's no question that voice chat is going to work better in a small group. When was the last time you went to a party where everyone stood around in a circle and only one person talked? Instead people break up into small groups and hold conversations with a few other people.

As for the immersion question, in the distant past advocates of radio drama argued that television was an inferior medium because the visual element actually detracted from the overall experience. The listener's imagination would be much more evocative than anything that could be filmed. That may be true, but radio drama is all but extinct today.

My experience is that voice chat is simply too useful not to use. When TeamSpeak rolled around I stopped looking at the chat window and started looking at the scenery. In future games with more advanced technology, picking up on visual cues will probably be more important than ever, and taking your fingers off the controls to type a warning will probably be fatal.


The paper is a good summary of the issues.

It can be technically quite difficult to use voice - you need to have a microphone, a soundcard with working audio in, you need to be listening through headphones (not speakers) so you don't get feedback and so on. If you're away from home and connecting via your laptop, you would need to have the microphone, headset etc. with you as well. If there's lag, or the network is congested, voice becomes unusable while the rest of the game and text chat are still ok.

There are lots of good reasons to use voice - it's faster, more expressive and so on, but I feel it reduces the level of "immersion" in role-playing games: people come across as more like their real selves, and less like their character.

We also don't have good tools for managing voice at the moment. With text chat you can: send and receive private messages while listening to public chat at the same time; "mute" people who are being annoying; easily pick out one conversation from among over a dozen that are happening at the same time; etc.


> Some players are uncomfortable using voice with strangers,

Nic cites a post of mine that emphasizes this view, but here I'll offer a different one. I think one advantage of voice with regard to strangers, in some circumstances, is that it seems easier to maintain "communication (channel) discipline". Subjectively, I think there are two components to this: 1) quick and nuanced "hushing" of offenders is easier on voice; 2) voice channel pacing can more easily cue participants to the purpose/function of the channel.

I'm thinking of EVE fleet channels as a good example. With potentially 100+ participants they work surprisingly well. I think tempo there plays an important role in communicating urgency and function to those participating. In EVE you do have text channels with >>100 participants, and those can be quite variable in terms of discipline.


The RL military (and law enforcement and emergency responders) have wrestled through some of these issues and found effective solutions to some of the "efficiency inhibiting" ones. They are summed up in the idea of 'radio discipline' - you talk when you need to, briefly, and you listen well.

That is, in my experience in a variety of games, what works in gaming too, as far as efficiency goes. But it doesn't get at one of the other reasons why people chat "in voice" - the social aspect.

Yes, there are reasons why people do NOT want to connect with others "as their real selves" in games via voice. But there are times/people who do too.

I joined a WoW raiding guild almost 4 months ago, and I find a couple voice-related things interesting and germane.

1) A few people will not use mics; they'll listen, but they do not want to talk. They represent a minority of people. Their reasons are diverse, I'm sure, but I won't speculate.

2) People don't use voice communication much outside of raids or, sometimes, 5-man instances. There are unusual times when people say "this is too complicated to type, let's jump on Vent[rilo] to better talk about it."

3) But after raids, when there are 10+ people in the voice channel, people who aren't logging out of the game promptly often linger in the voice channel and socialize as they wind down and/or start other things. This has a very social aspect to it. A half hour after the raid is over, 2-4 people will often be found still enjoying each other's company via voice.

I find this last point most meaningful, and I think it is telling about the future. Sure, some people will fulfill some social needs in the anonymous or semi-anonymous ways which MMOs allow. But I think in the near future more and more will connect to a small number of people more deeply in games, and that interaction will be facilitated by voice in many situations.

> Instead people break up into small groups and hold conversations with a few other people.

Yes, this is true, but there are a huge number of contextual things going on besides just the voice. There is eye contact and other body language, as well as the fact that people temper their voices (usually) so that they are heard most and best by those few around them, not the whole group. Even in a loud noisy bar/party where everyone is shouting, you can still pick out what those you are speaking with are saying. Subtle things like lip reading come into play more than you probably realize.

My point is that VOIP in a game just doesn't have these elements, at least not yet. If the best you can get is a little icon over the speaker's head, that's crude, though I suppose effective.

What I want is my avatar to emote based on what I say or how I say it. How hard could it be to capture the pace of speaking (amount of silence between words) or the tone change or the volume (loud = shout, quiet = normal) and adjust my avatar to show this?
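For what it's worth, the volume and pacing parts can be roughed out in a few lines. What follows is only a sketch under my own assumptions, not how any game or voice client actually does it: the function names, the emote names and the thresholds are all made up for illustration, the audio is assumed to arrive as floats between -1.0 and 1.0, and tone (pitch) change is left out entirely because it is the harder part.

```python
import math


def frame_rms(samples, frame_size):
    """Split samples into fixed-size frames and return the RMS level of each.

    Any trailing partial frame is ignored.
    """
    levels = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        levels.append(math.sqrt(sum(s * s for s in frame) / frame_size))
    return levels


def pick_emote(samples, frame_size=160, silence_level=0.02, shout_level=0.5):
    """Map loudness and pausing to a hypothetical avatar emote name.

    The thresholds are illustrative guesses, not tuned values.
    """
    levels = frame_rms(samples, frame_size)
    voiced = [lv for lv in levels if lv >= silence_level]
    if not voiced:
        return "idle"  # nothing above the silence threshold

    # Share of frames that were silent: a crude stand-in for speaking pace.
    pause_ratio = 1 - len(voiced) / len(levels)
    # Average level of the voiced frames only: a crude stand-in for volume.
    loudness = sum(voiced) / len(voiced)

    if loudness >= shout_level:
        return "shout"
    if pause_ratio > 0.5:
        return "hesitant"
    return "talk"
```

Feeding it a loud burst such as `[0.8, -0.8] * 80` gives "shout", while a quiet burst followed by a long gap gives "hesitant". The hard part is presumably not this arithmetic but making the resulting animation look natural rather than twitchy.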


I'll be curious to see whether there is any measurable impact of the spatialized voice in SL. Alas, SL is such a different use case than WoW that these studies might not be directly comparable anyway.


Thanks for your comments - much appreciated at this time.

I was struck during our research by how many different kinds of people there are doing different kinds of things in VWs. It makes it difficult to state simply that either voice or text is superior, as the question is too situation-dependent. It's like arguing whether cellphones are better than landlines, or cars better than bikes - it depends on what you're doing. That said, there is a canonical MMORPG scenario - a small group of people who regularly play together, currently engaged in a raid - and it looks like most people in that scenario prefer voice.

re spatialized voice: We did another study of a spatial voice system designed in Australia, and found that gamers derived some interesting tactical benefits from it (paper on my website). SL is very place-oriented and spatial voice seems suited to that.

re whether WoW can be compared to SL (is the 'VW' category valid?) I am asking myself that question a lot :-) My comments above about variety of people and activity go double for SL. I am looking at SL data now, and I suspect that rather than draw a general conclusion, I will have to categorize people and activities and comment on the usefulness of communication media w.r.t. individual scenarios.


"there is a canonical MMORPG scenario - a small group of people who regularly play together, currently engaged in a raid"

That isn't really the base scenario, is it? Raiding isn't THAT central an activity when you look at the stats.


Raph's right, of course, particularly in the West, where the most popular MMORPG - Runescape - features no raiding as WoW-fans think of it at all.



I'll say at the start of this that I can only speak from my own experience in three wholly different end game raiding guilds, a PvP guild and two much more casual guilds.

In general two of the raiding guilds I've played with were very extreme about comms discipline and had clearly sectioned off a Casual Chat area from the raid or even party channels. You could talk about what you liked in the Casual area (and indeed, it was pretty much anything-goes conversation at times; you could do a case study in itself of how people blow off steam after work on those conversations!) but as soon as you got into the raid channel, if you weren't the raid leader you shut your mouth, and if you didn't, the raid told you in no uncertain terms.

The third raiding guild I was in was very much a progressing, more casual guild, and many of the fights had to be explained time & time again because of the "churn" of players we had. The VOIP was essential for communicating quite complex tactics in a much shorter time than it would have taken to type.

The PvP guild I was in used the comms very tactically ("Farm!", "Flag Tunnel!" etc.) and everyone participated. However, as it was a PvP guild specifically for that purpose, no "casual chat" channels existed and, for the most part, the channels were kept clear of "chatter".

The two casual guilds I've played in were the complete opposite. I'd actually log into the Vent server while the game was loading up to chat and just "shoot the breeze" with people.

All in all, and this I'll make very clear is just from personal observation rather than empirical evidence, I'd say the use of VOIP, the manner in which it's used, and the impact on the gameplay have more to do with the aims of the people the player is talking to, the community culture that is already established and the subjective personal situation of the player involved.



Sorry, I didn't write clearly. I wasn't trying to imply that group raiding is the number one activity in MMOs - rather, I mean that it's the situation in which VoIP seems most clearly to be useful. In our research, gamers usually spoke about group raids when they were enthusing about voice communication.

