
Mar 12, 2008



The notion of a serious experimental economist decrying VW research on the basis of a lack of physical contact is absurd on its face. Economists have been using proxy tests and experiments on computers and over networks for 20 years. Plenty of experiments need to eliminate the whole "O Noes, the guy is in a white coat, I'll treat this like a serious experiment" effect, and plenty of other experiments are written so that they do not require the invocation of the stern authority figure. As a matter of fact, I'll go one further. A robust experimental finding should persist in the absence of explicit direction to the subject that this is a Serious Science Experiment and not to be trifled with--and I know plenty of experiments that meet that test.

As to the objection about demographic data--bah. If by this point in the trajectory of the internet a researcher hasn't figured out that anonymous forms don't compel truth telling, I don't know what to say. Here are bigger problems: spoofing, ballot stuffing, selection bias. Those are problems (to some degree) with any survey of any sort, but crop up in online surveys more frequently (WARNING! Made up statistic!).


Absolutely. I would have thought being confronted by a stern guy in a stern suit being all stern would pollute the responses of participants, particularly in the case of any lines of questioning that might be overtly personal, or embarrassing for the respondent.

Unless the whole process is designed to mitigate that. Maybe the complaint might be better aimed at researchers who are using virtual worlds for their samples out of laziness, which is then reflected in their methods.


In psychology, it is well understood that study participants might not be telling the truth to the researcher.

If you ask someone whether they have had gay sex, how many sexual partners they have had, etc., they may well lie to you. But (for example) researchers looking at ways of reducing the spread of HIV/AIDS want to gather statistics on people's sexual behaviour, and so ask these kinds of questions.

The problem that participants may lie doesn't make VW research impossible; it just means that it suffers from a potential problem that researchers are already accustomed to dealing with.

I'm really suspicious about influencing people's behavior by making them know that they are participating in a Serious Scientific Experiment. There's a danger that they will act in ways that aren't how they would act in normal situations ("ecological validity"). I'd be happier with the experimental design if the participants were just acting the way they do normally when playing the game.

Having said that, I'm starting to think that VW experiments can sometimes be improved by getting some real data on who is typing at the keyboard (as opposed to their avatar).


PS. A colleague who is a research interviewer has various techniques for detecting when a study participant isn't telling the truth. These are actually very similar to the kinds of techniques the police use to detect when a suspect is lying - but you have to be more gentle about it with someone who is voluntarily participating in your experiment, as opposed to someone who has been cautioned and arrested.


I think he may have a point about the size of the stakes.


It would be helpful, in my opinion, if we were able to shed the notion of a "controlled" research environment whenever we're talking about trying to study people and the *meanings* things have for them. I'd venture to say that there are *no* conditions under which it is fair to take statements that people make at face value. (Stanley Milgram FTW.)

The point of ethnography is, in a way, *not* to rely only on interviews, isolated from raising questions of credibility, etc. (This is often forgotten by people who equate ethnography with interviewing.) The strength of ethnography lies in the multiple ways in which the researcher is able to read the claims he or she encounters *against* other people's claims, the actions people take, the researcher's gradually acquired competence to act in that context, and the changing relationship of the researcher to interlocutors over time.


(Please forgive the mini-rant -- it was supposed to be set off with /rant on, and /rant off tags, just for fair warning. Oh, for an edit button.)



That's a really good way of putting it. I agree with you.


My personal view is that, within a community of active researchers, alternate views regarding the validity and reliability of the methods employed in finding meaningful results should always be questioned and debated. Indeed, it would be a sorry state to be in as a group of academics if a methodological & epistemological debate weren't occurring in our little corner of research.

Questioning research methods is in itself positive, as it allows for pertinent questions to be asked and, hopefully, answered with sound reasoning.

For me the important line in this paper is:

More generally, this same critique applies to any
anonymous experiment conducted over the internet.

Yes. Indeed. The author has effectively called into question the fundamentals of a wide range of disciplines and scholars who are currently using online surveys, questionnaires or, indeed, any type of netnographic technique in their research. His broadside against these types of experiments, and the rebuttal to it, should therefore not be based on a narrow virtual world usage only, but on the entirety of the methodological use.

Which brings me to this point. In 20 minutes I could give you, if I was so inclined, a list of 10-15 published articles in top 3* and 4* journals which use internet surveys, questionnaires or other online techniques to gather data. All of them are possibly subject to the kind of issues outlined in that article.

If it is the case that this flaw is so fundamental as the author claims, I would ask how and why top class peer reviewed journals are accepting such methods being used... which leads to the conclusion that the flaws outlined are already well understood by published academics and indeed a number of papers in this area already exist questioning their use with rebuttals similar to what Thomas Malaby & Susan have already said being used to defend such methods.

That said. Debate is good :-)


Every research method has costs and benefits. Researchers need to match their method to their research question, so that they are taking advantage of the benefits and are taking on no unnecessary costs.

The experiment that Duffy describes was originally designed for the laboratory (the trust game), so it isn't surprising that conducting the same tests in a virtual world brings few benefits along with its methodological costs. Participants may be cheap and plentiful, and administration easy, but those are rarely convincing arguments when trying to publish papers in academic journals. (Note to PhD students: when asked why you made a method/design choice, don't say "it was easier." Instead, say "it was better, and here's why.")

The reason Steve and others are running these experiments in virtual worlds is primarily to show that it can be done, and to assess whether the results of the literally thousands of trust games from the lab generalize to virtual worlds. I think of these studies as 'proof of concept' that will also help in nailing down best practices for the experimental work that will inevitably be undertaken in virtual worlds. But if I really just wanted to do trust experiments, I would do them in the lab (or the field, as U of Chicago Prof. John List has argued).

In contrast, Nick Yee has used virtual world environments to examine how appearances affect perceptions of self and others. This is very hard to do in the real world, so there is an obvious benefit to using virtual worlds.

Nick's work also shows that going into the virtual world doesn't necessarily mean leaving the lab behind. People can conduct experiments in virtual worlds by having people come to a laboratory in the real world, and log in to a virtual world on a lab computer. Or at a minimum, people could be met by an avatar in the virtual world to provide the human contact and oversight typical in the laboratory.

The long-run future of virtual world experimentation is going to be more like Nick Yee's work than Steve Atlas's: using virtual worlds to address questions that simply can't be addressed easily in the lab, while also adding lots of controls to reduce the costs that concern Duffy.

Duffy's own interest is in macroeconomics. It is going to be hard to get 200 people in the laboratory to interact in an economy with any significant degree of complexity or endogeneity. So, it's off to the virtual world--though quite likely with a group of participants who have been vetted by real-world selection and interviewing, are being monitored in real time, and maybe are getting paid more than a handful of lindens.

Two other minor points:
**Labor rates in Second Life are very low, so a handful of lindens is probably pretty effective at attracting and motivating participants
**I don't know what Steve *IS* monitoring in his experiment, but it is possible to monitor a lot more information than participants are aware of...everything from IP addresses of computers to behaviors undertaken after the experiment is complete.

p.s. I am interviewing Nick Yee this Monday, March 17th, on my Metanomics show in Second Life, 11am Pacific Time. See http://metanomics.net for details. We will surely have a show on experimental economists in virtual worlds, now that said folk are coming out into the light. Last week's Metanomics show featured an interview with TN's own Richard Bartle. You can watch it, or read the transcript, here: http://metanomics.net/11-mar-2008/recap-richard-bartle-visits-metanomics


I'll support Robert's comments about putting the appropriate experiment into the appropriate environment. Also in some experiments it is possible to analyze answers for cheat patterns or other signs that the subjects are lying.

We will be starting to look at self reported medical data from online sources, and the simple expedient of watching the distribution of the last digit in the series can tell us if the research subject is telling the truth.
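As a rough illustration of the last-digit idea: genuinely measured values tend to have roughly uniform final digits, while invented or rounded values pile up on 0 and 5. The sketch below is only one way to do this, using a chi-square goodness-of-fit statistic; the function names and the 5% threshold are assumptions for illustration, not the method we will actually use, and a high score only flags digit preference for a closer look rather than proving anyone lied.

```python
from collections import Counter

# Critical value of the chi-square distribution with 9 degrees of freedom
# at the 5% significance level (ten digit categories minus one).
CHI2_CRIT_DF9_05 = 16.919


def last_digit_chi_square(values):
    """Chi-square statistic comparing the last digits of integer readings
    with a uniform distribution over the digits 0-9."""
    if not values:
        raise ValueError("need at least one value")
    digits = [abs(int(v)) % 10 for v in values]
    counts = Counter(digits)
    expected = len(digits) / 10
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


def shows_digit_preference(values):
    """True if the last digits deviate from uniform at the 5% level."""
    return last_digit_chi_square(values) > CHI2_CRIT_DF9_05
```

The test is only meaningful with a reasonably large series per respondent or per group; with a handful of readings the expected counts are too small for the chi-square approximation to be trusted.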

If your experiment occurs over time you can also look to see if the data shows events that are possible. People do certain things (including have sex or lose weight) in predictable patterns. If the patterns are odd, the data may be bogus.

Finally, there are always naysayers out there who just don't believe that a new method is possible or valid. Anyone making a blanket statement is likely to end up with egg on their face. But you can't argue with them; it is a waste of time.


Indeed, any methodology requires more than pursuing it on "ez mode," and there are many ways to proceed once we understand how we cannot take reported claims at face value. The example I cited, ethnography, includes a set of ways to do just that, and other methodologies have solved the problem in other ways, some of which leverage the set-apart lab space, for example. The only real error, to reiterate, is to presume that the problems go away simply by achieving some environment of perfect "control," which somehow allows us to (finally!) get at what people *really* think, in some pure sense.


Certainly, there are some unresolved methodological concerns with gathering data in online settings. However, it is important to weigh these costs against the benefits of online research: our methods allow us to test the external validity of general principles by experimenting on a different subject pool than the usual undergraduates; additionally, the combination of automated data gathering scripts and a population who are willing to participate at a fraction of the normal cost allows samples to be dramatically larger. The end result was that I collected over 1,200 data points over two months on a graduate student budget, realizing a 95% cost savings compared with more traditional laboratory methods. This allowed me to test five treatments on this subject pool and tease out more subtle factors that influence behavior that might not be detected in a smaller sample.

It is important to note that even if the demographic data may not be perfectly accurate, the substance of the experiment was about subjects' behavior in the trust game. On this issue subjects were making decisions with real (in virtual terms) stakes about which they would be truthful. For example, while Duffy did not feel compelled to be accurate about his age and gender, he indeed answered the core experiment question on trust with what he truly believed to be the "best" course of action. Other participants' selections provided results that were consistent with the trust and reciprocity effects observed in the 1995 Berg, Dickhaut, and McCabe experiment, in contrast with the subgame-perfect equilibrium expected by neoclassical economic assumptions.
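For readers unfamiliar with the trust game, a minimal sketch of its payoff structure may help. The endowment of 10 and the multiplier of 3 below are assumptions taken from the usual textbook version of the game, not necessarily the parameters of this experiment, and the function name is hypothetical.

```python
def trust_game_payoffs(sent, returned, endowment=10, multiplier=3):
    """Return (sender_payoff, receiver_payoff) for one round.

    The sender passes `sent` out of `endowment`; the amount is multiplied
    on the way to the receiver, who then hands `returned` back.
    The default parameter values are illustrative assumptions.
    """
    if not 0 <= sent <= endowment:
        raise ValueError("sent must be between 0 and the endowment")
    received = multiplier * sent
    if not 0 <= returned <= received:
        raise ValueError("returned must be between 0 and the amount received")
    sender_payoff = endowment - sent + returned
    receiver_payoff = received - returned
    return sender_payoff, receiver_payoff
```

A purely self-interested receiver returns nothing, so a sender who anticipates this sends nothing: `trust_game_payoffs(0, 0)` gives `(10, 0)`. Full trust met with an even split, `trust_game_payoffs(10, 15)`, gives `(15, 15)`, which is why positive amounts sent and returned are read as evidence of trust and reciprocity.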

What remains of questionable data integrity, however, are the 28 demographic and background questions that followed the experiment. Indeed, some participants (17%) chose not to complete the experiment and followup questions. These "partial" data points were dropped from the resulting analysis. Whether patterns in people's decisions to drop out prematurely affect the outcome is a matter that could be tested by further experiments and data analysis.

While the followup questions were indeed not for "extra payment," it is not accurate to say that there was no incentive to complete the survey aspect of the experiment, because subjects were required to complete the questions in order to receive any earnings from the prior question. Nevertheless, Duffy's concerns about the accuracy of subjects' responses are noted and are a real practical consideration in the design of experiments in any setting. In online experiments (both in virtual worlds and in web-based experiments), the absence of an authority could result in users providing inaccurate information on their demographics. One possible way to assess the accuracy of the demographic data would be to verify aspects of the data with previous data provided to Linden Lab. However, this does beg for further research into mechanisms to elicit truth-telling in anonymous online settings.

Duffy's final concern is that "there is little control over whether the same individual is logged in on multiple machines, under different identities, perhaps playing a two-person game with himself." In anticipation of this, our script prevented individual avatars from participating in the experiment more than once. We also used a delay mechanism between matched players so subjects would not know the identity of their counterpart. These two features made a two-player game with oneself practically impossible.
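The once-per-avatar rule is simple to express in code. The following is a hedged Python sketch of the idea only, not the script that ran in Second Life; the class and method names are hypothetical.

```python
class ParticipationGate:
    """Admits each avatar identifier at most once (illustrative sketch)."""

    def __init__(self):
        self._seen = set()

    def admit(self, avatar_id):
        """Return True the first time an avatar is seen, False afterwards."""
        key = avatar_id.strip().lower()
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

For example, `gate = ParticipationGate()` followed by two calls to `gate.admit("Po Potez")` returns `True` and then `False`. A gate like this works at the level of the avatar, so it blocks repeat visits by the same avatar but cannot by itself tell whether two different avatars belong to the same person.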

Underlying this concern, however, is a legitimate issue about players' use of alternate characters, known as "alts," to participate in the experiment multiple times using different avatars, a practice I refer to as experiment farming. This can sometimes be manually cleaned by noticing obviously duplicated avatars with names such as "Po Potez," "Po1 Potez," etc. My experience is that such experiment farming is most prevalent when participants are offered a large reward for participating in a relatively short experiment.
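The manual cleaning step could be partly automated. As a minimal sketch, assuming alts are named by appending digits to an existing name (which will miss anyone who picks an unrelated name), one can strip trailing digits from each word and group avatars whose names then collide. The function names here are hypothetical, and flagged groups are candidates for a manual look, not proof of farming.

```python
import re
from collections import defaultdict


def name_key(name):
    """Lowercase the name and strip trailing digits from each word."""
    words = name.lower().split()
    return " ".join(re.sub(r"\d+$", "", word) for word in words)


def suspected_alt_groups(names):
    """Group avatar names that match once trailing digits are removed.

    Only groups with more than one member are returned.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]
```

Called on `["Po Potez", "Po1 Potez", "Ann Lee", "Po2 Potez"]`, this returns a single group containing the three "Po" avatars and leaves "Ann Lee" alone.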

In closing, Duffy has identified some very real concerns to be addressed in designing effective virtual experiments. In truth, I think he is just scratching the surface of the issues that virtual experimentation needs to overcome. However, to invalidate these methods while in such a nascent state would be an overreaction. I believe the solution is to expand academic inquiry into experimentation in virtual worlds and develop better tools for collecting online data. In the meantime such confounding issues should certainly be addressed by researchers, and the field is wide open for the design of experiments to demonstrate the dimensions along which subjects behave differently in virtual worlds than in the real world.


A well-documented and properly analyzed experiment will tell you *something*, whether your participants are lying or not. At the least, comparing the results of the trust game in SL with the results in a RL lab with an authority figure hovering over people would be interesting.


I mean, until you do the actual test, there is no way to know that the test is invalid, is there?


I am pleased that my note has stirred a discussion about research methods in virtual worlds. As the note stressed, I *am* curious about pursuing this type of research but have concerns over data collection, incentives and what we can really hope to learn. I do think these concerns extend to any experiment or survey conducted over the internet, but thought to focus my critique on virtual world research, which I found most interesting.

I am not the type of experimenter who seeks to intimidate subjects by wearing a suit and looking stern – I don’t own any suits and I try to avoid being present while the experiment is conducted so as to minimize experimenter demand effects. The control of the laboratory is mainly with regard to who is showing up to participate, their background, and the accuracy and thoroughness of data collection during the experiment.

I *do* think we can learn from observing behavior in virtual worlds. For instance, Prof. Ernan Haruvy at UT Dallas has pursued an interesting strategy of recruiting subjects to participate in experiments on Second Life comparing their behavior as avatars with a control group of subjects in a standard laboratory environment. This seems to me to be a promising approach.

I am indeed interested in studying macroeconomic phenomena (e.g., inflation) in virtual worlds, but I am increasingly becoming convinced that the way to do this might be to build a game from scratch and not sample from one that is built mainly for the amusement of users. Of course designing a game that would attract many users and would also be of interest for research purposes is not so easily done, but suggestions or pointers are welcome.


Without looking at the original article (lol, yeah I know. lazy dmx!): the straight-looking researcher is as prone as any to producing erroneous research as well.

I'm pretty honest myself, as I honestly don't have a hell of a lot to 'hide', but I've seen people lie through their teeth at researchers, simply because of social approval (If I admit I'm gay, will this guy laugh at me? What if I told him I once beat a man with a rock?)

And it gets worse in focus group situations. Whereas an adult can go 'well, I guess this guy IS a researcher, and not judging', it's harder when 3/4 of the room ISN'T a researcher.

But whatya do? At the end of the day one really just has to try and build in measures to work around these and get on with it.


Oh god that was a mentally lazy post. Ignore it.

I need a coffee :(


<< So here's the question - how can virtual world participants be encouraged to take online research seriously, or should no attempt be made to change their online behaviour? >>

Tell them that more accurate data will help future game designers make better games and wrong info could lead to lamer game aspects. (even if the data will also be used for other things)

A bit more might be added to explain how information on seemingly useless questions actually affects lots of small future choices game developers make in content, quest mechanics, player communication, etc.

Gamers want ever better games catered to them and probably feel that developers have some big misconceptions about them. To them "better games" might seem a noble cause.
