
P1: JzG052184097Xc13 CB1040B/Ericsson 0 521 84087 X February 27, 2006 16:30

CHAPTER 13

Protocol Analysis and Expert Thought: Concurrent Verbalizations of Thinking during Experts’ Performance on Representative Tasks

K. Anders Ericsson

The superior skills of experts, such as accomplished musicians and chess masters, can be amazing to most spectators. For example, club-level chess players are often puzzled by the chess moves of grandmasters and world champions. Similarly, many recreational athletes find it inconceivable that most other adults – regardless of the amount or type of training – have the potential ever to reach the performance levels of international competitors. Especially puzzling to philosophers and scientists has been the question of the extent to which expertise requires innate gifts versus specialized acquired skills and abilities.

One of the most widely used and simplest methods of gathering data on exceptional performance is to interview the experts themselves. But are experts always capable of describing their thoughts, their behaviors, and their strategies in a manner that would allow less-skilled individuals to understand how the experts do what they do, and perhaps also understand how they might reach expert level through appropriate training? To date, there has been considerable controversy over the extent to which experts are capable of explaining the nature and structure of their exceptional performance. Some pioneering scientists, such as Binet (1893/1966), questioned the validity of the experts’ descriptions when they found that some experts gave reports inconsistent with those of other experts. To make matters worse, in those rare cases that allowed verification of the strategy by observing the performance, discrepancies were found between the reported strategies and the observations (Watson, 1913). Some of these discrepancies were explained, in part, by the hypothesis that some processes were not normally mediated by awareness/attention and that the mere act of engaging in self-observation (introspection) during performance changed the content of ongoing thought processes. These problems led most psychologists in the first half of the 20th century to reject all types of introspective verbal reports as valid scientific evidence, and they focused almost exclusively on observable behavior (Boring, 1950).

In response to the problems with the careful introspective analysis of images and perceptions, investigators such as John B.



2 2 4 the cambridge handbook of expertise and expert performance

Watson (1920) and Karl Duncker (1945) introduced a new type of method to elicit verbal reports. The subjects were asked to “think aloud” and give immediate verbal expression to their thoughts while they were engaged in problem solving. In the main body of this chapter I will review evidence that this type of verbal expression of thoughts has not been shown to change the underlying structure of the thought processes and thus avoids the problem of reactivity, namely, where the act of generating the reports may change the cognitive processes that mediate the observed performance. In particular, I will describe the methods of protocol analysis where verbal reports are elicited, recorded, and encoded to yield valid data on the underlying thought processes (Ericsson & Simon, 1980, 1984, 1993).

Although protocol analysis is generally accepted as providing valid verbalizations of thought processes (Simon & Kaplan, 1989), these verbal descriptions of thought sequences frequently do not contain sufficient detail about the mediating cognitive processes and the associated knowledge to satisfy many scientists. For example, these reports may not contain the detailed procedures that would allow cognitive scientists to build complete computer models that are capable of regenerating the observed performance on the studied tasks. Hence, investigators have continued to search for alternative types of verbal reports that generate more detailed descriptions. Frequently scientists require participants to explain their methods for solving tasks and to give detailed descriptions of various aspects. These alternative reporting methods elicit additional and more detailed information than is spontaneously verbalized during “think aloud.” The desire for increased amounts of reported information is central to the study of expertise, so I will briefly discuss whether it is possible to increase the amount reported without inducing reactivity and change of performance. The main sections of this chapter describe the methods for eliciting and analyzing concurrent and retrospective verbal reports and how these methods have been applied to a number of domains of expertise, such as memory experts, chess masters, and medical specialists. The chapter is concluded with a broad overview of the issues of applying protocol analysis to the study of expert performance.

Historical Development of Verbal Reports on Thought Processes

Introspection or “looking inside” to uncover the structure of thinking and its mental images has a very long history in philosophy. Drawing on the review by Ericsson and Crutcher (1991), we see that Aristotle is generally given credit for the first systematic attempt to record and analyze the structure of sequences of thoughts. He recounted an example of a series of thoughts mediating the recall of a specific piece of information from memory. Aristotle argued that thinking can be described as a sequence of thoughts, where the brief transition periods between consecutive thoughts do not contain any reportable information, and this has never been seriously challenged. However, such a simple description of thinking was not sufficiently detailed to answer the questions about the nature of thought raised by philosophers in the 17th, 18th, and 19th centuries (Ericsson & Crutcher, 1991).

Most of the introspective analysis of philosophers had been based on self-analysis of the individual philosophers’ own thought. In the 19th century Sir Francis Galton along with others introduced several important innovations that set the groundwork for empirical studies of thinking. For example, Galton (1879, see Crovitz, 1970) noticed repeatedly that when he took the same walk through a part of London and looked at a given building on his path, this event frequently triggered the same or similar thoughts in memory. Galton recreated this phenomenon by listing the names of the major buildings and sights from his walk on cards and then presented one card at a time to observe the thoughts that were triggered. From this self-experiment Galton argued that thoughts reoccur with considerable frequency when the same stimulus is encountered.

Galton (1883) is particularly famous for the innovation of interviewing many people by sending out a list of questions about mental imagery – said to be the first questionnaire. He had been intrigued by reports of photographic memory and asked questions about the acuity of specific memories, such as the clarity and brightness of respondents’ memory for specific things such as their breakfast table. He found striking individual differences in the clarity or vividness, but no clear superiority of the eminent scientists; for example, Darwin reported having weak visual images. Now, more than a hundred years later, it is still unclear what these large individual differences in reported vividness of memory images really reflect. They seem almost completely unrelated to the accuracy of memory images, and there is no reproducible evidence for individuals with photographic or eidetic memory (McKelvie, 1995; Richardson, 1988).

In one of the first published studies on memory and expertise Binet (1893/1966) reported a pioneering interview of chess players and their ability to play “blindfolded” without seeing a chess board. Based on anecdotes and his interviews Binet concluded that the ability required to maintain a chess position in memory during blindfold play did not appear to reflect a basic memory capacity to store complex visual images, but a deeper understanding of the structure of chess. More troubling, Binet found that the verbal descriptions of the visual images of the mental chess positions differed markedly among blindfold chess players. Some claimed to see the board as clearly as if it were shown perceptually with all the details and even shadows. Other chess players reported seeing no visual images during blindfold play and claimed to rely on abstract characteristics of the chess position. Unfortunately, there was no independent evidence to support or question the validity of these diverse introspective reports. Binet’s (1893/1966) classic report is a pioneering analysis of blindfold chess players’ opinions and self-observations and illustrates the problems and limits of introspection.

In a similar manner Bryan and Harter (1899) interviewed two students of telegraphy as they improved their skill and found evidence for an extended plateau for both as they reached a rate of around 12 words per minute. Both reported that this arrest in development was associated with attempts to move away from encoding the Morse code into words and to encode the code into phrases. Subsequent research (Keller, 1958) has found that this plateau is not a necessary step toward expert levels of performance and referred to it as the phantom plateau.

In parallel with the interviews and the informal collection of self-observations of expertise in everyday life, laboratory scientists attempted to refine introspective methods to examine the structure of thinking. In the beginning of the 20th century, psychologists at the University of Würzburg presented highly trained introspective observers with standardized questions and asked them to respond as fast as possible. After reporting their answers, the observers recalled as much as possible about the thoughts that they had while answering the questions. They tried to identify the most basic elements of their thoughts and images to give as detailed reports as possible. Most reported thoughts consisted of visual and auditory images, but some participants claimed to have experienced thoughts without any corresponding imagery – imageless thoughts. The principal investigator, Karl Bühler (1907), argued that the existence of imageless thoughts had far-reaching theoretical implications and was inconsistent with the basic assumption of Wilhelm Wundt (1897) that all thoughts were associated with particular neural activity in some part of the brain. Bühler’s (1907) paper led to a heated exchange between Bühler’s introspective observers, who claimed to have observed imageless thoughts, and Wundt (1907), who argued that these reports were artifacts of inappropriate reporting methods and the theoretical bias of the observers. A devastating methodological conclusion arose from this controversy: the existence of imageless thoughts could not be resolved empirically by the introspective method. This finding raised fundamental doubts about analytic introspection as a scientific method.

The resulting reaction to the crisis was to avoid the problem of having to trust the participants’ verbal reports about internal events. Instead of asking individuals to describe the structure of their thoughts, participants were given objective tests of their memory and other abilities. More generally, experimental psychologists developed standardized tests with stimuli and instructions where the same pattern of performance could be replicated under controlled conditions. Furthermore, the focus of research moved away from complex mental processes, such as experts’ thinking, and toward processes that were assumed to be unaffected by prior experience and knowledge. For example, participants were given well-defined simple tasks, such as memorization of lists of nonsense syllables, e.g., XOK, ZUT, where it is easy to measure objective performance. In addition, experimenters assumed that nonsense syllables were committed to memory without any reportable mediating thoughts, and the interest in collecting verbal reports from participants virtually disappeared until the cognitive revolution in the late 1950s.

In one of the pioneering attempts to apply this approach to the study of expertise, Djakow, Petrowski, and Rudik (1927) tested the basic abilities of world-class chess players and compared their abilities to other adults. Contrary to the assumed importance of superior basic cognitive ability and memory, the international players were only superior on a single test – a test involving memory for stimuli from their own domain of expertise, namely, chess positions. A few decades later de Groot (1946/1978) replicated chess players’ superior memory for chess positions and found that correct recall was closely related to the level of chess skill of the player.

Many investigators, including the famous behaviorist and critic of analytic introspection, John Watson, were very critical of the accuracy of verbal descriptions of skilled activities, such as where one looks during a golf swing (Watson, 1913). He realized that many types of complex cognitive processes, such as problem solving, corresponded to ongoing processes that were inherently complex and were mediated by reportable thoughts. In fact, Watson (1920) was the first investigator to publish a study where a participant was asked to “think aloud” while solving a problem. According to Watson, thinking was accompanied by covert neural activity of the speech apparatus that is frequently referred to as “inner speech.” Hence, thinking aloud did not require observations by any hypothetical introspective capacity, and thinking aloud merely gives overt expression to these subvocal verbalizations. Many other investigators proposed similar types of instructions to give concurrent verbal expression of one’s thoughts (see Ericsson & Simon, 1993, for a more extended historical review).

The emergence of computers in the 1950s and 1960s and the design of computer programs that could perform challenging cognitive tasks brought renewed interest in human cognition and higher-level cognitive processes. Investigators started studying how people solve problems and make decisions and attempted to describe and infer the thought processes that mediate performance. They proposed cognitive theories where strategies, concepts, and rules were central to human learning and problem solving (Miller, Galanter, & Pribram, 1960). Information-processing theories (Newell & Simon, 1972) sought computational models that could regenerate human performance on well-defined tasks by the application of explicit procedures. Much of the evidence for these complex mechanisms was derived from the researchers’ own self-observation, informal interviews, and systematic questioning of participants.

Some investigators raised concerns almost immediately about the validity of these data. For example, Robert Gagné and his colleagues (Gagné & Smith, 1962) demonstrated that requiring participants to verbalize reasons for each move in the Tower of Hanoi improved performance by reducing the number of moves in the solutions and improving transfer to more difficult problems as compared to a silent control condition. Although improvements are welcome to educators, the requirement to explain must have changed the sequences of thoughts from those normally generated. Other investigators criticized the validity and accuracy of the retrospective verbal reports. For instance, Verplanck (1962) argued that participants reported that they relied on rules that were inconsistent with their observed selection behavior. Nisbett and Wilson (1977) reported several examples of experiments in social psychology, where participants gave explanations that were inconsistent with their observed behavior. These findings initially led many investigators to conclude that all types of verbal reports were tainted by methodological problems similar to those that had plagued introspection and led to its demise. Herb Simon and I showed in a review (Ericsson & Simon, 1980) that the methods and instructions used to elicit the verbal reports had a great influence on both the reactivity of the verbal reporting and on the accuracy of the reported information. We developed a particular type of methodology for instructing participants so as to elicit consistently valid, non-reactive reports of their thoughts, which I will describe in the next section.

Figure 13.1. An illustration of the overt verbalizations of most thoughts passing through attention while a person thinks aloud during the performance of a task.

Protocol Analysis: A Methodology for Eliciting Valid Data on Thinking

The central assumption of protocol analysis is that it is possible to instruct subjects to verbalize their thoughts in a manner that does not alter the sequence and content of thoughts mediating the completion of a task and therefore should reflect immediately available information during thinking.

Elicitation of Non-Reactive Verbal Reports of Thinking

Based on their theoretical analysis, Ericsson and Simon (1993) argued that the closest connection between actual thoughts and verbal reports is found when people verbalize thoughts that are spontaneously attended during task completion. In Figure 13.1 we illustrate how most thoughts are given a verbal expression.

When people are asked to think aloud (see Ericsson & Simon, 1993, for complete instructions), some of their verbalizations seem to correspond to merely vocalizing “inner speech,” which would otherwise have remained inaudible. Nonverbal thoughts can also often be given verbal expression by brief labels and referents. Laboratory tasks studied by early cognitive scientists focused on how individuals applied knowledge and procedures to novel problems, such as mental multiplication. When, for example, one participant was asked to think aloud while mentally multiplying 36 by 24 on two test occasions one week apart, the following protocols were recorded:

OK, 36 times 24, um, 4 times 6 is 24, 4, carry the 2, 4 times 3 is 12, 14, 144, 0, 2 times 6 is 12, 2, carry the 1, 2 times 3 is 6, 7, 720, 720, 144 plus 720, so it would be 4, 6, 864.

36 times 24, 4, carry the – no wait, 4, carry the 2, 14, 144, 0, 36 times 2 is, 12, 6, 72, 720 plus 144, 4, uh, uh, 6, 8, uh, 864.

In these two examples, the reported thoughts are not analyzed into their perceptual or imagery components as required by Bühler’s (1907) rejected introspectionist procedures, but are merely vocalized inner speech and verbal expressions of intermediate steps, such as “carry the 1,” “36,” and “144 plus 720.” Furthermore, participants were not asked to describe or explain how they solve these problems and do not generate such descriptions or explanations. Instead, they are asked to stay focused on generating a solution to the problem and thus only give verbal expression to those thoughts that spontaneously emerge in attention during the generation of the solution.

If the act of verbalizing participants’ thought processes does not change the sequence of thoughts, then participants’ task performance should not change as a result of thinking aloud. In a comprehensive review of dozens of studies, Ericsson and Simon (1993) found no evidence that the sequences of thoughts (accuracy of performance) changed when individuals thought aloud as they completed the tasks, compared to other individuals who completed the same tasks silently. However, some studies have shown that participants who think aloud take somewhat longer to complete the tasks – presumably due to the additional time required to produce the overt verbalization of the thoughts.

The same theoretical framework can also explain why other types of verbal-reporting procedures consistently change cognitive processes, as in the findings of Gagné and Smith (1962). For example, when participants explain why they are selecting actions or carefully describe the structure and detailed content of their thoughts, they are not able to merely verbalize each thought as it emerges; they must engage in additional cognitive processes to generate the thoughts corresponding to the required explanations and descriptions. This additional cognitive activity required to generate the reports changes the sequence of generated thoughts (see Chi, Chapter 10, for a discussion of the differences between explanation and thinking aloud). Instructions to explain the reasons for one’s problem solving and to describe the content of thought are reliably associated with changes in the accuracy of observed performance (Ericsson & Simon, 1993). Subsequent reviews have shown that the more recent work on effects of verbal overshadowing is consistent with reactive consequences of enforced generation of extensive verbal descriptions of brief experiences (Ericsson, 2002). Even instructions to generate self-explanations have been found to change (actually, improve) participants’ comprehension, memory, and learning compared to merely thinking aloud during these activities (Ericsson, 1988a, 2003a; Neuman & Schwarz, 1998).

In summary, adults must already possess the necessary skills for verbalizing their thoughts concurrently, because they are able to think aloud without any systematic changes to their thought process after a brief instruction and familiarization in giving verbal reports (see Ericsson & Simon, 1993, for detailed instructions and associated warm-up tasks recommended for laboratory research).

Validity of Verbalized Information while Thinking Aloud

The main purpose of instructing participants to give verbal reports on their thinking is to gain new information beyond what is available with more traditional measures of performance. If, on the other hand, verbal reports are the only source for some specific information about thinking, how can the accuracy of that information be validated? The standard approach for evaluating methodology is to apply the method in situations where other converging evidence is available and where the method’s data can distinguish alternative models of task performance and disconfirm all but one of the reasonable alternatives.



Theories of human cognition (Anderson, 1983; Newell & Simon, 1972; Newell, 1990) proposed computational models that could reproduce the observable aspects of human performance on well-defined tasks through the application of explicit procedures. One of the principal methods applied by these scientists is an analysis of the cognitive task (see Chapter 11 by Schraagen for a discussion of the methods referred to as cognitive task analysis), and it serves a related purpose in the analysis of verbal protocols. Task analysis specifies the range of alternative procedures that people could reasonably use, in the light of their prior knowledge of facts and procedures, to generate correct answers to a task. Moreover, task analysis can be applied to the analysis of think-aloud protocols; for example, during a relatively skilled activity, namely, mental multiplication, most adults have only limited mathematical knowledge. They know the multiplication tables and only the standard “pencil and paper” procedure taught in school for solving multiplication problems. Accordingly, one can predict that they will solve a specific problem such as $36 \cdot 24$ by first calculating $4 \cdot 36 = 144$, then adding $20 \cdot 36 = 720$. More sophisticated adults may recognize that $24 \cdot 36$ can be transformed into $(30+6)(30-6)$ and that the formula $(a+b)(a-b) = a^2 - b^2$ can be used to calculate $36 \cdot 24$ as $30^2 - 6^2 = 900 - 36 = 864$.

When adults perform tasks while thinking aloud, the verbalized information must reflect information generated from the cognitive processes normally executed during the task. By analyzing this information, the verbalized sequences of thoughts can be compared to the sequence of intermediate results required to compute the answer by different strategies that are specified in a task analysis (Ericsson & Simon, 1993). The sequence of thoughts verbalized while multiplying 24 · 36 mentally (reproduced in the protocol examples above) agrees with the sequence of intermediate thoughts specified by one, and only one, of the possible strategies for calculating the answer.

However, the hypothesized sequence of intermediate products predicted from the task analysis may not perfectly correspond to the verbalizations. Inconsistencies may result from instances where, because of acquired skill, the original steps are either not generated or not attended as distinct steps. However, there is persuasive evidence for the validity of the thoughts that are verbalized, that is, that the verbalizations can reveal sequences of thoughts that match those specified by the task analysis (Ericsson & Simon, 1993). Even if a highly skilled participant’s think-aloud report in the multiplication task only consisted of “144” and “720,” the reported information would still be sufficient to reject many alternative strategies and skilled adaptations of them because these strategies do not involve the generation of both of the reported intermediate products. The most compelling evidence for the validity of the verbal reports comes from the use of task analysis to predict a priori a set of alternative sequences of concurrently verbalized thoughts that is associated with the generation of the correct answer to the presented problem.
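The comparison between a task analysis and a think-aloud report can be illustrated with a small sketch for the 36 × 24 example. It is not the coding procedure of Ericsson and Simon (1993); it only shows the logic under stated assumptions. The function names, the third strategy (`factor_out_four`), the hand-encoding of the first protocol as a list of numbers, and the rule that a report is consistent with a strategy when its numbers occur, in order, among that strategy's predicted intermediate results are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def paper_and_pencil(a: int, b: int) -> List[int]:
    """Predicted intermediate results of the school algorithm for two-digit
    a and b, working through the ones digit of b first, then the tens digit."""
    a_tens, a_ones = divmod(a, 10)
    b_tens, b_ones = divmod(b, 10)
    steps: List[int] = []
    partials: List[int] = []
    for place, digit in ((1, b_ones), (10, b_tens)):
        ones_product = digit * a_ones           # 4 x 6 = 24, later 2 x 6 = 12
        tens_product = digit * a_tens           # 4 x 3 = 12, later 2 x 3 = 6
        carry = ones_product // 10
        steps += [ones_product, tens_product]
        if carry:
            steps.append(tens_product + carry)  # 12 + 2 = 14, later 6 + 1 = 7
        partial = digit * a * place             # 144, later 720
        steps.append(partial)
        partials.append(partial)
    steps.append(sum(partials))                 # 144 + 720 = 864
    return steps


def difference_of_squares(a: int, b: int) -> List[int]:
    """(m + d)(m - d) = m^2 - d^2; assumes a + b is even so m and d are integers."""
    m, d = (a + b) // 2, abs(a - b) // 2
    return [m * m, d * d, m * m - d * d]


def factor_out_four(a: int, b: int) -> List[int]:
    """Hypothetical shortcut a * b = (a * 4) * (b / 4); assumes b is divisible by 4."""
    first = a * 4
    return [first, first * (b // 4)]


STRATEGIES: Dict[str, Callable[[int, int], List[int]]] = {
    "paper_and_pencil": paper_and_pencil,
    "difference_of_squares": difference_of_squares,
    "factor_out_four": factor_out_four,
}


def occurs_in_order(report: List[int], predicted: List[int]) -> bool:
    """True if every reported number appears in the prediction, in the same order.
    Predicted steps that were never verbalized are allowed to be skipped."""
    remaining = iter(predicted)
    return all(value in remaining for value in report)


def consistent_strategies(report: List[int], a: int, b: int) -> List[str]:
    """Names of the strategies whose predicted sequence accommodates the report."""
    return [name for name, strategy in STRATEGIES.items()
            if occurs_in_order(report, strategy(a, b))]


# Hand-encoded intermediate results from the first protocol (an assumption of this sketch).
full_report = [24, 12, 14, 144, 12, 6, 7, 720, 864]
# A skilled participant who verbalizes only the two partial products.
sparse_report = [144, 720]

print(consistent_strategies(full_report, 36, 24))
print(consistent_strategies(sparse_report, 36, 24))
```

Within this small, assumed set of strategies, both the full report and the sparse report leave only the paper-and-pencil procedure standing, because neither alternative generates both 144 and 720. A real analysis would start from a much larger set of candidate strategies derived from the task analysis.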

Furthermore, verbal reports are only one indicator of the thought processes that occur during problem solving. Other indicators include reaction times (RTs), error rates, patterns of brain activation, and sequences of eye fixations. Given that each kind of empirical indicator can be separately recorded and analyzed, it is possible to examine the convergent validity established by independent analyses of different types of data. In their review, Ericsson and Simon (1993) found that longer RTs were associated with a longer sequence of intermediate reported thoughts. In addition, analyses show a close correspondence between participants’ verbalized thoughts and the information that they looked at in their environment (see Ericsson & Simon, 1993, for a review).

Finally, the validity of verbally reported thought sequences depends on the time interval between the occurrence of a thought and its verbal report, where the highest validity is observed for concurrent, think-aloud verbalizations. For tasks with relatively short response latencies (less than 5 to 10 seconds), people are typically able to recall their sequences of thoughts accurately immediately after the completion of the task, and the validity of this type of retrospective report remains very high. However, for cognitive processes of longer duration (longer than 10 to 30 seconds), recall of past specific thought sequences becomes more difficult, and people are increasingly tempted to infer what they must have thought, thus creating inferential biases in the reported information.

Other Types of Verbal Reports with Serious Validity Problems

Protocol analysis, as proposed by Ericsson and Simon (1980, 1984, 1993), specifies the constrained conditions necessary for valid, non-reactive verbalizations of thinking while performing a well-defined task. Many of the problems with verbally reported information obtained by other methods can be explained as violations of this recommended protocol-analysis methodology.

The first problem arises when the investigators ask participants to give more information than is contained in their recalled thought sequences. For example, some investigators ask participants why they responded in a certain manner. In some cases participants may have deliberated on alternative methods, and their recalled thoughts during the solution will then provide a sufficient answer; typically, however, the participants need to go beyond any retrievable memory of their processes to give an answer. Because participants can access only the end-products of their cognitive processes during perception and memory retrieval, and they cannot report why only one of several logically possible thoughts entered their attention, they must make inferences or confabulate answers to such questions.

In support of this type of confabulation, Nisbett and Wilson (1977) found that participants’ responses to “why-questions” after responding in a task were in many circumstances as inaccurate as those given by other participants who merely observed these individuals’ performance and tried to explain it without any memory or first-hand experience of the processes involved. More generally, Ericsson and Simon (1993) recommended that one should strive to understand these reactive, albeit typically beneficial, effects of instructing students to explain their performance. A detailed analysis of the different verbalizations elicited during “think-aloud” and “explain” instructions should allow investigators to identify those induced cognitive processes that are associated with changes (improvements) in their performance.

A very interesting development that capitalizes on the reactive effects of generating explanations involves instructing students to generate self-explanations while they read text or work on problems (Chi, de Leeuw, Chiu, & LaVancher, 1994; Renkl, 1997). Instructing participants to generate self-explanations has been shown to increase performance beyond that obtained with merely having them “think aloud,” which did not differ from a control condition (Neuman, Leibowitz, & Schwarz, 2000). The systematic experimental comparison of instructions involving explanations or “thinking aloud” during problem solving has provided further insights into the differences between mechanisms underlying the generation of explanations that alter performance and those that merely give expression to thoughts while thinking aloud (Berardi-Coletta, Buyer, Dominowski, & Rellinger, 1995).

The second problem is that scientists are frequently primarily interested in the general strategies and methods participants use to solve a broad class of problems in a domain, such as mathematics or text comprehension. They often ask participants to describe their general methods after solving a long series of different tasks, which often leads to misleading summaries or after-the-fact reconstructions of what participants think they must have done. In the rare cases when participants have deliberately and consistently applied a single general strategy to solving the problems, they can answer such requests easily by recalling their thought sequence from any of the completed tasks. However, participants typically employ multiple strategies, and their strategy choices may change during the course of an experimental session. Under such circumstances participants would have great difficulty describing a single strategy that they used consistently throughout the experiment; thus, their reports of such a strategy would be poorly related to their averaged performance. Hence, reviews of general strategy descriptions show that these reports are usually not valid, even when immediate retrospective verbal reports after the performance of each trial provide accounts of thought sequences that are consistent with other indicators of performance on the same trials (see Ericsson & Simon, 1993, for a review).

Similar problems have been encountered in interviews of experts (Hoffman, 1992). When experts are asked to describe their general methods in professional activities, they sometimes have difficulties, and there is frequently poor correspondence between the behavior of computer programs (expert systems) implementing their described methods and their observed detailed behavior when presented with the same tasks and specific situations. This finding has led many scientists studying expertise (Ericsson, 1996a; Ericsson & Lehmann, 1996; Ericsson & Smith, 1991; Starkes & Ericsson, 2003) to identify a collection of specific tasks that capture the essence of a given type of expertise. These tasks can then be presented under standardized conditions to experts and less-skilled individuals, while their think-aloud verbalizations and other process measures are recorded.

In sum, to obtain the most valid and complete trace of thought processes, scientists should strive to create laboratory conditions where participants perform tasks that are representative of the studied phenomenon and where verbalizations directly reflect the participants’ spontaneous thoughts generated while completing the task. In the next section I will describe how protocol analysis has been applied to study experts’ superior performance on tasks representative of their respective domains of expertise.

Protocol Analysis and the Expert-Performance Approach

The expert-performance approach to expertise (Ericsson, 1996a; Ericsson & Smith, 1991) examines the behavior of experts to identify situations with challenging task demands, where superior performance in these tasks captures the essence of expertise in the associated domain. These naturally emerging situations can be recreated as well-defined tasks calling for immediate action. The tasks associated with these situations can then be presented to individuals at all levels of skill, ranging from novice to international-level expert, under standardized conditions in which participants are instructed to give concurrent or retrospective reports.

In this section I will describe the expert-performance approach and illustrate its application of protocol analysis to study the structure of expert performance. First, de Groot’s (1946/1978) pioneering work on the study of expert performance in chess will be described, followed by more recent extensions in the domain of chess as well as similar findings in other domains of expertise. Second, the issue of developing and validating theories of the mechanisms of individual experts will be addressed and several experimental analyses of expert performance will be described.

Capturing the Essence of Expertise and Analyzing Expert Performance

It is important to avoid the temptation to study differences in performance between experts and novices merely because there are readily available tasks to measure such differences. Researchers need to identify those naturally occurring activities that correspond to the essence of expertise in a domain (Ericsson, 2004, Chapter 38). For example, researchers need to study how chess players win tournament games rather than just probing for superior knowledge of chess and testing memory for chess games. Similarly, researchers need to study how doctors are able to treat patients with more successful outcomes rather than test their knowledge of medicine and memory for encountered patients. It is, however, difficult to compare different individuals’ levels of naturally occurring performance in a domain because different individuals’ tasks will differ in difficulty and many other aspects. For example, is the performance of medical doctors who primarily treat patients with severe and complex problems, but with a relatively low frequency of full recovery, better than the performance of doctors who primarily treat patients with milder forms of the same disease with uniform recovery? Unless all doctors encounter patients with nearly identical conditions, it will be nearly impossible to compare the quality of their performance. The problem of comparing performers’ performance on comparable tasks is a general challenge for measuring and capturing superior performance in most domains.

For example, chess players rarely, if ever, encounter the same chess positions during the middle part of chess games (Ericsson & Smith, 1991). Hence, there are no naturally occurring cases where many chess players select moves for the identical complex chess position such that the quality of their moves can be directly compared. In a path-breaking research effort, de Groot (1946/1978) addressed this problem by identifying challenging situations (chess positions) in representative games that required immediate action, namely, the selection of the next move. De Groot then presented the same game situations to chess players of different skill levels and instructed them to think aloud while they selected the next chess move. Subsequent research has shown that this method of presenting representative situations and requiring generation of appropriate actions provides the best available measure of chess skill that predicts performance in chess tournaments (Ericsson, Patel, & Kintsch, 2000; van der Maas & Wagenmakers, 2005).

the pioneering studies of chess expertise

In his pioneering research on chess expertise, de Groot (1946/1978) picked out chess positions that he had analyzed for a long time and established an informal task analysis. Based on this analysis he could evaluate the relative merits of different moves and encode the thoughts verbalized by chess players while they were selecting the best move for these positions.

The verbal protocols of both world-class and skilled club-level players showed that both types of players first familiarized themselves with the position and verbally reported salient and distinctive aspects of the position along with potential lines of attack or defense. The players then explored the consequences of longer move exchanges by planning alternatives and evaluating the resulting positions. During these searches the players would identify moves with the best prospects in order to select the single best move.

De Groot’s (1946/1978) analysis of the protocols identified two important differences in cognitive processes that explained the ability of world-class players to select superior moves compared to club players. De Groot noticed that the less-skilled players did not even verbally report thinking about the best move during their move selection, implying that they did not, in fact, think about it. Thus, their inferior initial representation of the position must not have revealed the value of lines of play starting with that move. In contrast, the world-class players reported many strong first moves even during their initial familiarization with the chess position. For example, they would notice weaknesses in the opponent’s defense that suggested various lines of attack and then examine and systematically compare the consequences of various sequences of moves. During this second detailed phase of analysis, these world-class players would often discover new moves that were superior to all the previously generated ones.

mechanisms mediating chess expertise

De Groot’s analysis revealed two different mechanisms that mediate the world-class players’ superiority in finding and selecting moves. The first difference concerns the best players’ ability to rapidly perceive the relevant structure of the presented chess position, thus allowing them to identify weaknesses and associated lines of attack that the less-accomplished players never reported noticing in their verbal protocols. These processes involve rapid perception and encoding, and thus only the end products of these encoding processes are verbalized. There has been a great deal of research attempting to study the perceptual encoding processes by recording and analyzing eye fixations during brief exposures to reveal the cognitive processes mediating perception and memory of chess positions (see Gobet & Charness, Chapter 30). However, most of this research has not studied the task of selecting the best move but has used alternative task instructions, namely, to recall as many chess pieces as possible from briefly presented positions, or to find specific chess pieces in presented positions. These changes in the tasks appear to alter the mediating cognitive processes, and the results cannot therefore be directly integrated into accounts of the representative expert performance (Ericsson & Kintsch, 2000; Ericsson & Lehmann, 1996; Ericsson et al., 2000).

The second mechanism that underlies the superior performance of highly skilled players concerns a superior ability to generate potential moves by planning. De Groot’s protocols showed that during this planning and evaluation process, the masters often discovered new moves that were better than those perceived initially during the familiarization phase. In a more recent study Charness (1981) collected think-aloud protocols on the planning process during the selection of a move for a chess position. Examples of an analysis of the protocols from a club-level and an expert-level chess player are given in Figure 13.2. Consistent with these examples, Charness (1981) found that the depth of planning increased with greater chess skill. In addition, there is evidence that an increase in the time available for planning increases the quality of the moves selected: move selection during regular chess is superior to that of speed chess, with its limited time for making the next move (Chabris & Hearst, 2003). Furthermore, highly skilled players have been shown to be superior in mentally planning out consequences of sequences of chess moves in experimental studies. In fact, chess masters, unlike less-skilled players, are able to play blindfold, without a visible board showing the current position, at a relatively high level (Chabris & Hearst, 2003; Karpov, 1995; Koltanowski, 1985). Experiments show that chess masters are able to mentally generate the chess positions associated with multiple chess games without any external memory support when the experimenter reads sequences of moves from multiple chess games (Saariluoma, 1991, 1995).

In sum, the analyses of the protocols along with experiments show that expert chess players’ ability to generate better moves cannot be completely explained by their more extensive knowledge of chess patterns. Recognition of patterns and retrieval of appropriate moves that they have stored in memory during past experiences of chess playing is not sufficient to explain the observed reasoning abilities of highly skilled players. As their skill increases, they become increasingly able to encode and manipulate internal representations of chess positions to plan the consequences of chess moves, discover potential threats, and even develop new lines of attack (Ericsson & Kintsch, 1995; Saariluoma, 1992). (For a discussion of the relation between the superior memory for presented chess positions and the memory demands integral to selecting chess moves, see Ericsson et al., 2000, and Gobet & Charness, Chapter 30.)

medicine and other domains

The expert-performance approach has been applied to a wide range of domains, where skilled and less-skilled performers solve representative problems while thinking aloud. When the review is restricted to studies in domains that show reproducibly superior performance of experts, the think-aloud protocols reveal patterns of reports that are consistent with those observed in chess. For example, when expert snooker players are instructed to make a shot for a given configuration of pool balls, they verbalize deeper plans and more far-reaching exploration of consequences of their shots than less-skilled players (Abernethy, Neal, & Koning, 1994). Similarly, the protocols that expert-level athletes give for dynamic situations in baseball (French, Nevett, Spurgeon, Graham, Rink, & McPherson, 1996) and soccer (Ward, Hodges, Williams, & Starkes, 2004) reveal a more complete and superior representation of the current game situation, which allows them to prepare for immediate future actions better than less-skilled players in the same domains. In domains involving perceptual diagnosis, such as the interpretation of electrocardiograms (ECG) (Simpson & Gilhooly, 1997) and microscopic pathology (Crowley, Naus, Stewart, & Friedman, 2003), verbal protocols reveal that the experts are able to encode essential information more accurately and are better able to integrate the information into an accurate diagnosis.

Figure 13.2. A chess position presented to chess players with the instruction to select the best next move by white (top panel). The think-aloud protocols of a good club player (chess rating = 1657) and a chess expert (chess rating = 2004) collected by Charness (1981) are shown in the bottom panel to illustrate differences in evaluation and planning for one specific move, P-c5 (white pawn is moved from c4 to c5), which is the best move for this position. Reported considerations for other potential moves have been omitted. The chess expert considers more alternative move sequences, and some of them to a greater depth, than the club player does. (From Ericsson, K. A., & Charness, N., 1994, Expert performance: Its structure and acquisition. American Psychologist, 49(8), 725–747, Figure 13.2 copyright American Psychological Association.)

Most of the research on medical diagnosis has tried to minimize the influence of perceptual factors and has relied primarily on verbal descriptions of scenarios and patients. This research on medical expertise has shown that the process of generating a diagnosis becomes more efficient as medical students complete more of their medical training. The increase in efficiency is mediated by higher levels of representation that are acquired to support clinical reasoning (Boshuizen & Schmidt, 1992; Schmidt & Boshuizen, 1993). When studies present very challenging medical problems to specialists and medical students, the experts give more accurate diagnoses (Ericsson, 2004; Norman, Trott, Brooks, & Smith, 1994). The specialists are also better able to give complete and logically supported diagnoses (Patel & Groen, 1991) that appear to reflect higher-level representations that they have acquired to support reasoning about alternative clinical diagnoses (Ericsson & Kintsch, 1995; Ericsson et al., 2000; Patel, Arocha, & Kaufmann, 1994).

There are also studies showing differences in knowledge between experts and less-accomplished individuals that mediate successful task performance in the design of experiments in psychology (Schraagen, 1993) and the detection of fraud in financial accounting (Johnson, Karim, & Berryman, 1991). The work on accounting fraud was later developed into a general theory of fraud detection (Johnson, Grazioli, Jamal, & Berryman, 2001). In this handbook there are discussions of the applications of verbal report methodology to study thinking in several different domains of expertise, such as medicine (Norman, Eva, Brooks, & Hamstra, Chapter 19), software design (Sonnentag, Niessen, & Volmer, Chapter 21), professional writing (Kellogg, Chapter 22), artistic performance (Noice & Noice, Chapter 28), chess playing (Gobet & Charness, Chapter 30), exceptional memory (Wilding & Valentine, Chapter 31), mathematical expertise (Butterworth, Chapter 32), and historical expertise (Voss & Wiley, Chapter 33).

The evidence reviewed in this section rests primarily on findings averaged across groups of experts. In the next section we will search for evidence on the validity of reported thoughts of individual experts as well as individual differences between different experts.

Individual Differences and Validity of Verbal Reports from Expert Performance

It is well established that to be successful in competitions at the international level, experts need to have engaged in at least ten years of intensive training – a finding that applies even to the most “talented” individuals (Ericsson, Krampe, & Tesch-Römer, 1993; Simon & Chase, 1973). Consequently, researchers have not been surprised that verbal reports of experts and, thus, the corresponding sequences of reported thoughts, differ between expert performers – at least at the level of detailed thoughts. In the previous section I showed how protocols uncover many higher-level characteristics of expert performers’ mediating mechanisms, such as skills supporting the expanded working memory (Ericsson & Kintsch, 1995). In this section I will discuss attempts to experimentally validate the detailed structure of the reported cognitive processes of individual expert performers.

The complexity of the knowledge and acquired skills of expert performers in most domains, such as chess and medicine, makes it virtually impossible to describe the complete structure of the expertise of an individual expert. For example, Allen Newell (personal communication) described a project in which one of his graduate students in the 1970s tried to elicit all the relevant knowledge of a stamp collector. After some forty hours of interviews, Newell and his student gave up, as there was no end in sight to the knowledge that the expert had acquired. As it may be difficult, perhaps impossible, to describe all the knowledge and skills of experts, scientists should follow the recommendations of the expert-performance approach. Namely, they should focus on the reproducible structure of the experts’ mechanisms that mediate their superior performance on representative tasks (Ericsson, 1996b). Consequently, I will focus on selected domains of expertise in which regularities in the verbal reports of different trials with representative tasks have been analyzed.

In the early applications of protocol analysis there were several studies that collected protocols from experts solving representative problems while thinking aloud. For example, Clarkson and Metzler (1960) collected protocols from a professional investor constructing portfolios of investments. Similar detailed analyses of individual experts from different domains have been briefly described in Ericsson and Simon (1993) and Hoffman (1992). These analyses were not, however, formally evaluated, and the proposed mechanisms were not demonstrated to account for reproducibly superior performance on representative tasks.

The most extensive applications of the expert-performance approach using protocol analysis to study individual experts have examined people with exceptional memory (Ericsson & Lehmann, 1996). In the introduction of this chapter I mentioned Binet’s (1894) pioneering work studying individuals with exceptional memory for numbers. Several subsequent studies interviewed people with exceptional memory, such as Luria’s (1968) Subject S and Hunt and Love’s (1972) VP (see Wilding and Valentine, 1997, for a review). However, the first study to trace the development of exceptional memory from average performance to the best memory performance in the world (in some memory tasks) was conducted in a training study by Chase and Ericsson (1981, 1982; Ericsson, Chase, & Faloon, 1980). We studied a college student (SF) whose initial immediate memory for rapidly presented digits was around 7, in correspondence with the typical average (Miller, 1956), but he eventually acquired exceptional performance for immediate memory and after 200 hours of practice was able to recall over 80 digits in the digit-span task. During this extended training period SF gave retrospective reports on his thought processes after most memory trials. As his memory performance started to increase he reported segmenting the presented lists into 3-digit groups and, whenever possible, encoding them as running times for various races, because SF was an avid cross-country runner. For example, SF would encode 358 as a very fast mile time, 3 minutes and 58 seconds, just below the 4-minute mile. The central question concerning verbal reports is whether we can trust the validity of these reports and whether the ability to generate mnemonic running-time encodings influences memory.

To address that issue Bill Chase and I designed an experiment to test the effects of mnemonic encodings and presented SF with special types of lists of constrained digits. In addition to a list of random digits we presented other lists that were constructed to contain only 3-digit groups that could not be encoded as running times, such as 364, which would correspond to three minutes and sixty-four seconds, as in the list (364 895 481 ...). As predicted, his performance decreased reliably. In another experiment we designed digit sequences where all 3-digit groups could be encoded as running times (412 637 524 ...), with a reliable increase in his associated performance. In over a dozen specially designed experiments it was possible to validate numerous aspects of SF’s acquired memory skill (Chase & Ericsson, 1981, 1982; Ericsson, 1988b). Other investigators, such as Wenger and Payne (1995), have also relied on protocol analysis and other process-tracing data to assess the mechanisms of individuals who increased their memory performance dramatically with practice on a list-learning task.
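The logic behind the constrained lists can be made concrete with a small sketch. It is only an illustration of the constraint described above, not the procedure Chase and Ericsson used to build their materials. It assumes, as a simplification, that a 3-digit group counts as a possible running time whenever it can be read as minutes followed by two digits of seconds below 60; all function names are hypothetical.

```python
def split_into_groups(digits: str, size: int = 3) -> list[str]:
    """Segment a digit string into consecutive groups of `size` digits."""
    return [digits[i:i + size] for i in range(0, len(digits), size)]


def is_running_time(group: str) -> bool:
    """Assumed rule: read a 3-digit group as M:SS; it is a possible
    running time only if the seconds part is below 60."""
    if len(group) != 3 or not group.isdigit():
        return False
    return int(group[1:]) < 60


def as_running_time(group: str) -> str:
    """Format an encodable group, e.g. '358' -> '3:58'."""
    if not is_running_time(group):
        raise ValueError(f"{group!r} cannot be read as a running time")
    return f"{group[0]}:{group[1:]}"


def encodable_fraction(digits: str) -> float:
    """Share of complete 3-digit groups that can be read as running times."""
    groups = [g for g in split_into_groups(digits) if len(g) == 3]
    if not groups:
        return 0.0
    return sum(is_running_time(g) for g in groups) / len(groups)


if __name__ == "__main__":
    # Groups taken from the two example lists in the text.
    for digits in ("364895481", "412637524"):
        print(digits, split_into_groups(digits), encodable_fraction(digits))
```

Under this assumed rule, none of the groups in the first example list (364 895 481) can be read as a time, whereas every group in the second (412 637 524) can, which is the contrast the two experiments exploit.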

More generally, this method has been extended to any individual with exceptional memory performance. During the first step, the exceptional individuals are given memory tasks where they can exhibit their exceptional performance while giving concurrent and/or retrospective verbal reports. These reports are then analyzed to identify the mediating encoding and retrieval mechanisms of each exceptional individual. The validity of these accounts is then evaluated experimentally by presenting each individual with specially designed memory tasks that would predictably reduce that individual’s memory performance in a decisive manner (Ericsson, 1985, 1988b; Wilding & Valentine, 1998). With this methodology, verbally reported mechanisms of superior performance have been validated with designed experiments in a wide range of domains, such as a waiter with superior memory for dinner orders (Ericsson & Polson, 1988a, 1988b), mental calculators (Chase & Ericsson, 1982), and other individuals with exceptional memory performance (Ericsson, 2003b; Ericsson, Delaney, Weaver, & Mahadevan, 2004).

Exceptional memory performance for numbers and other types of “arbitrary” information appears to require that the expert performers sustain attention during the presentation (Ericsson, 2003b). The difficulty of automating memory skills for encoding new stimuli makes this type of performance particularly amenable to examination with protocol analysis. More generally, when individuals change and improve their performance they appear able to verbalize their thought processes during learning (Ericsson & Simon, 1993). This has been seen to extend to the learning of experts and their ability to alter their performance through deliberate practice (Ericsson et al., 1993). There is now an emerging body of research that examines the microstructure of this type of training and how additional specific deliberate practice improves particular aspects of the target performance in music (Chaffin & Imreh, 1997; Nielsen, 1999) and in sports (Deakin & Cobley, 2003; Ericsson, 2003c; Ward et al., 2004); for a more extended discussion, see the chapter by Ericsson (Chapter 38) on deliberate practice.

Conclusion

Protocol analysis of thoughts verbalized during the experts’ superior performance on representative tasks offers an alternative to the problematic methods of directed questioning and introspection. The think-aloud model of verbalization of thoughts has been accepted as a useful foundation for dealing with the problems of introspection (see the entry on “Psychology of Introspection” in the Routledge Encyclopedia of Philosophy by Von Eckardt, 1998, and entries on “Protocol Analysis” in the Companion to Cognitive Science [Ericsson, 1998] and the International Encyclopedia of the Social and Behavioral Sciences [Ericsson, 2001]). This same theoretical framework for collecting verbal reports has produced evidence that has led many behaviorists to accept data on cognitive constructs, such as memory and rules (Austin & Delaney, 1998). Consequently, the method of protocol analysis provides a tool that allows researchers to identify information that passes through expert performers’ attention while they generate their behavior, without the need to embrace any controversial theoretical assumptions. In support of this claim, protocol analysis has emerged as a practical tool to diagnose thinking outside of traditional cognitive psychology and cognitive science. For example, designers of surveys (Sudman, Bradburn, & Schwarz, 1996), researchers on second-language learning (Green, 1998) and text comprehension (Ericsson, 1988a; Pressley & Afflerbach, 1995), and computer software developers (Henderson, Smith, Podd, & Varela-Alvarez, 1995; Hughes & Parkes, 2003) regularly collect verbal reports and rely on protocol analysis.

The complexity and diversity of the mechanisms mediating skilled and expert performance is intimidating. To meet these challenges it is essential to develop methods to allow investigators to reproduce the experts’ superior performance under controlled and experimental conditions on tasks that capture the essence of expertise in a given domain. Process tracing, in particular protocol analysis, will be required to uncover detailed information about most of the important mechanisms that are responsible for the superiority of the experts’ achievement. Only then will it be possible to discover their structure and study their development and refinement with training and deliberate practice.



References

Abernethy, B., Neal, R. J., & Koning, P. (1994). Visual-perceptual and cognitive differences between expert, intermediate, and novice snooker players. Applied Cognitive Psychology, 18, 185–211.

Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.

Austin, J., & Delaney, P. F. (1998). Protocol analysis as a tool for behavior analysis. Analysis of Verbal Behavior, 15, 41–56.

Berardi-Coletta, B., Buyer, L. S., Dominowski, R. L., & Rellinger, E. R. (1995). Metacognition and problem solving: A process-oriented approach. Journal of Experimental Psychology: Learning, Memory, & Cognition, 21, 205–223.

Binet, A. (1893/1966). Mnemonic virtuosity: A study of chess players. (Original paper appeared in 1893 and was translated by M. L. Simmel & S. B. Barron). Genetic Psychology Monographs, 74, 127–162.

Binet, A. (1894). Psychologie des grands calculateurs et joueurs d’echecs [The psychology of great calculators and chess players]. Paris: Libraire Hachette.

Boring, E. G. (1950). A history of experimental psychology. New York: Appleton-Century Crofts.

Boshuizen, H. P. A., & Schmidt, H. G. (1992). On the role of biomedical knowledge in clinical reasoning by experts, intermediates and novices. Cognitive Science, 16, 153–184.

Bryan, W. L., & Harter, N. (1899). Studies on the telegraphic language: The acquisition of a hierarchy of habits. Psychological Review, 6, 345–375.

Bühler, K. (1907). Tatsachen und Probleme zu einer Psychologie der Denkvorgaenge: I. Ueber Gedanken [Facts and problems in a psychology of thinking: I. On thoughts]. Archiv für die gesamte Psychologie, 9, 297–365.

Chabris, C. F., & Hearst, E. S. (2003). Visualization, pattern recognition, and forward search: Effects of playing speed and sight of the position on grandmaster chess errors. Cognitive Science, 27, 637–648.

Chaffin, R., & Imreh, G. (1997). “Pulling teeth and torture”: Musical memory and problem solving. Thinking and Reasoning, 3, 315–336.

Charness, N. (1981). Search in chess: Age and skill differences. Journal of Experimental Psychology: Human Perception and Performance, 7, 467–476.

Chase, W. G., & Ericsson, K. A. (1981). Skilled memory. In J. R. Anderson (Ed.), Cognitive skills and their acquisition (pp. 141–189). Hillsdale, NJ: Erlbaum.

Chase, W. G., & Ericsson, K. A. (1982). Skill and working memory. In G. H. Bower (Ed.), The psychology of learning and motivation, Vol. 16 (pp. 1–58). New York: Academic Press.

Chi, M. T. H., de Leeuw, N., Chiu, M.-H., & LaVancher, C. (1994). Eliciting self-explanations improves understanding. Cognitive Science, 18, 439–477.

Clarkson, G. P., & Metzler, A. H. (1960). Portfolio selection: A heuristic approach. Journal of Finance, 15, 465–480.

Crovitz, H. F. (1970). Galton’s walk: Methods for the analysis of thinking, intelligence, and creativity. New York: Harper & Row Publishers.

Crowley, R. S., Naus, G. J., Stewart, J., & Friedman, C. P. (2003). Development of visual diagnostic expertise in pathology: An information-processing study. Journal of the American Medical Informatics Association, 10, 39–51.

Crutcher, R. J. (1994). Telling what we know: The use of verbal report methodologies in psychological research. Psychological Science, 5, 241–244.

de Groot, A. (1978). Thought and choice in chess. The Hague: Mouton. (Original work published 1946).

Deakin, J. M., & Cobley, S. (2003). A search for deliberate practice: An examination of the practice environments in figure skating and volleyball. In J. Starkes & K. A. Ericsson (Eds.), Expert performance in sport: Recent advances in research on sport expertise (pp. 115–135). Champaign, IL: Human Kinetics.

Djakow, J. N., Petrowski, N. W., & Rudik, P. A. (1927). Psychologie des Schachspiels [The psychology of chess]. Berlin: Walter de Gruyter.

Duncker, K. (1945). On problem solving. Psychological Monographs, 58(5, Whole No. 270).

Ericsson, K. A. (1985). Memory skill. Canadian Journal of Psychology, 39, 188–231.

Ericsson, K. A. (1988a). Concurrent verbal reports on reading and text comprehension. Text, 8, 295–325.

Ericsson, K. A. (1988b). Analysis of memory performance in terms of memory skill. In R. J. Sternberg (Ed.), Advances in the psychology of human intelligence, Vol. 4 (pp. 137–179). Hillsdale, NJ: Erlbaum.

Ericsson, K. A. (Ed.) (1996a). The road to excellence: The acquisition of expert performance in the arts and sciences, sports, and games. Mahwah, NJ: Erlbaum.

Ericsson, K. A. (1996b). The acquisition of expert performance: An introduction to some of the issues. In K. A. Ericsson (Ed.), The road to excellence: The acquisition of expert performance in the arts and sciences, sports, and games (pp. 1–50). Mahwah, NJ: Erlbaum.

Ericsson, K. A. (1998). Protocol analysis. In W. Bechtel & G. Graham (Eds.), A companion to cognitive science (pp. 425–432). Cambridge, MA: Basil Blackwell.

Ericsson, K. A. (2001). Protocol analysis in psychology. In N. Smelser & P. Baltes (Eds.), International Encyclopedia of the Social and Behavioral Sciences (pp. 12256–12262). Oxford, UK: Elsevier.

Ericsson, K. A. (2002). Toward a procedure for eliciting verbal expression of nonverbal experience without reactivity: Interpreting the verbal overshadowing effect within the theoretical framework for protocol analysis. Applied Cognitive Psychology, 16, 981–987.

Ericsson, K. A. (2003a). Valid and non-reactive verbalization of thoughts during performance of tasks: Toward a solution to the central problems of introspection as a source of scientific data. Journal of Consciousness Studies, 10, 1–18.

Ericsson, K. A. (2003b). Exceptional memorizers: Made, not born. Trends in Cognitive Sciences, 7, 233–235.

Ericsson, K. A. (2003c). The development of elite performance and deliberate practice: An update from the perspective of the expert-performance approach. In J. Starkes & K. A. Ericsson (Eds.), Expert performance in sport: Recent advances in research on sport expertise (pp. 49–81). Champaign, IL: Human Kinetics.

Ericsson, K. A. (2004). Deliberate practice and the acquisition and maintenance of expert performance in medicine and related domains. Academic Medicine, 10, S1–S12.

Ericsson, K. A., Chase, W., & Faloon, S. (1980). Acquisition of a memory skill. Science, 208, 1181–1182.

Ericsson, K. A., & Crutcher, R. J. (1991). Introspection and verbal reports on cognitive processes – two approaches to the study of thought processes: A response to Howe. New Ideas in Psychology, 9, 57–71.

Ericsson, K. A., Delaney, P. F., Weaver, G., &Mahadevan, R. (2004). Uncovering the struc-ture of a memorist’s superior “basic” memorycapacity. Cognitive Psychology, 49, 191–237.

Ericsson, K. A., & Kintsch, W. (1995). Long-term working memory. Psychological Review,102 , 211–245 .

Ericsson, K. A., & Kintsch, W. (2000). Shortcom-ings of article retrieval structures with slots ofthe type that Gobet (1993) proposed and mod-eled. British Journal of Psychology, 91, 571–588.

Ericsson, K. A., Krampe, R. Th., & Tesch-Römer,C. (1993). The role of deliberate practice in theacquisition of expert performance. Psychologi-cal Review, 100(3), 363–406.

Ericsson, K. A., & Lehmann, A. C. (1996).Expert and exceptional performance: Evidenceon maximal adaptations on task constraints.Annual Review of Psychology, 47, 273–305 .

Ericsson, K. A., Patel, V. L., & Kintsch, W. (2000). How experts’ adaptations to representative task demands account for the expertise effect in memory recall: Comment on Vicente and Wang (1998). Psychological Review, 107, 578–592.

Ericsson, K. A., & Polson, P. G. (1988a). Memory for restaurant orders. In M. Chi, R. Glaser, & M. Farr (Eds.), The nature of expertise (pp. 23–70). Hillsdale, NJ: Erlbaum.

Ericsson, K. A., & Polson, P. G. (1988b). An experimental analysis of a memory skill for dinner orders. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 305–316.

Ericsson, K. A., & Simon, H. A. (1980). Verbal reports as data. Psychological Review, 87, 215–251.

Ericsson, K. A., & Simon, H. A. (1984). Protocol analysis: Verbal reports as data. Cambridge, MA: Bradford Books/MIT Press.

Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data (revised edition). Cambridge, MA: Bradford Books/MIT Press.

Ericsson, K. A., & Smith, J. (1991). Prospects and limits in the empirical study of expertise: An introduction. In K. A. Ericsson & J. Smith (Eds.), Toward a general theory of expertise: Prospects and limits (pp. 1–38). Cambridge: Cambridge University Press.

French, K. E., Nevett, M. E., Spurgeon, J. H., Graham, K. C., Rink, J. E., & McPherson, S. L. (1996). Knowledge representation and problem solution in expert and novice youth baseball players. Research Quarterly for Exercise and Sport, 67, 386–395.

Gagné, R. H., & Smith, E. C. (1962). A study of the effects of verbalization on problem solving. Journal of Experimental Psychology, 63, 12–18.

Galton, F. (1879). Psychometric experiments. Brain, 2, 148–162.

Galton, F. (1883). Inquiries into human faculty and its development. New York: Dutton.

Green, A. J. F. (1998). Using verbal protocols in language testing research: A handbook. Cambridge, UK: Cambridge University Press.

Henderson, R. D., Smith, M. C., Podd, J., & Varela-Alvarez, H. (1995). A comparison of the four prominent user-based methods for evaluating the usability of computer software. Ergonomics, 39, 2030–2044.

Hoffman, R. R. (Ed.) (1992). The psychology of expertise: Cognitive research and empirical AI. New York: Springer-Verlag.

Hughes, J., & Parkes, S. (2003). Trends in the use of verbal protocol analysis in software engineering research. Behaviour & Information Technology, 22, 127–140.

Hunt, E., & Love, T. (1972). How good can memory be? In A. W. Melton & E. Martin (Eds.), Coding processes in human memory (pp. 237–260). New York: Holt.

Johnson, P. E., Karim, J., & Berryman, R. G. (1991). Effects of framing on auditor decisions. Organizational Behavior & Human Decision Processes, 50, 75–105.

Johnson, P. E., Grazioli, S., Jamal, K., & Berryman, R. G. (2001). Detecting deception: Adversarial problem solving in a low base-rate world. Cognitive Science, 25, 355–392.

Karpov, A. (1995). Grandmaster musings. Chess Life, November, pp. 32–33.

Keller, F. S. (1958). The phantom plateau. Journal of the Experimental Analysis of Behavior, 1, 1–13.

Koltanowski, G. (1985). In the dark. Coraopolis, PA: Chess Enterprises.

Luria, A. R. (1968). The mind of a mnemonist. New York: Avon.

McKelvie, S. J. (1995). The VVIQ and beyond: Vividness and its measurement. Journal of Mental Imagery, 19, 197–252.

Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits of our capacity for processing information. Psychological Review, 63, 81–97.

Miller, G. A., Galanter, E., & Pribram, K. H. (1960). Plans and the structure of behavior. New York: Holt, Rinehart, and Winston.

Neuman, Y., & Schwarz, B. (1998). Is self-explanation while solving problems helpful? The case of analogical problem-solving. British Journal of Educational Psychology, 68, 15–24.

Neuman, Y., Leibowitz, L., & Schwarz, B. (2000). Patterns of verbal mediation during problem solving: A sequential analysis of self-explanation. Journal of Experimental Education, 68, 197–213.

Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press.

Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.

Nielsen, S. (1999). Regulation of learning strategies during practice: A case study of a single church organ student preparing a particular work for a concert performance. Psychology of Music, 27, 218–229.

Nisbett, R. E., & Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84, 231–259.

Norman, G. R., Trott, A. D., Brooks, L. R., & Smith, E. K. M. (1994). Cognitive differences in clinical reasoning related to postgraduate training. Teaching and Learning in Medicine, 6, 114–120.

Patel, V. L., Arocha, J. F., & Kaufmann, D. R. (1994). Diagnostic reasoning and medical expertise. In D. Medin (Ed.), The psychology of learning and motivation, Vol. 30 (pp. 187–251). New York: Academic Press.

Patel, V. L., & Groen, G. J. (1991). The general and specific nature of medical expertise: A critical look. In K. A. Ericsson & J. Smith (Eds.), Toward a general theory of expertise (pp. 93–125). Cambridge, MA: Cambridge University Press.

Pressley, M., & Afflerbach, P. (1995). Verbal protocols of reading: The nature of constructively responsive reading. Hillsdale, NJ: Erlbaum.

Renkl, A. (1997). Learning from worked-out examples: A study on individual differences. Cognitive Science, 21, 1–29.

Richardson, J. T. E. (1988). Vividness and unvividness: Reliability, consistency, and validity of subjective imagery ratings. Journal of Mental Imagery, 12, 115–122.



Saariluoma, P. (1991). Aspects of skilled imagery in blindfold chess. Acta Psychologica, 77, 65–89.

Saariluoma, P. (1992). Error in chess: The apperception-restructuring view. Psychological Research/Psychologische Forschung, 54, 17–26.

Saariluoma, P. (1995). Chess players’ thinking. London: Routledge.

Schmidt, H. G., & Boshuizen, H. (1993). On acquiring expertise in medicine. Educational Psychology Review, 5, 205–221.

Schraagen, J. M. (1993). How experts solve a novel problem in experimental design. Cognitive Science, 17, 285–309.

Simon, H. A., & Chase, W. G. (1973). Skill in chess. American Scientist, 61, 394–403.

Simon, H. A., & Kaplan, C. A. (1989). Foundations of cognitive science. In M. J. Posner (Ed.), Foundations of cognitive science (pp. 1–47). Cambridge, MA: MIT Press.

Simpson, S. A., & Gilhooly, K. J. (1997). Diagnostic thinking processes: Evidence from a constructive interaction study of electrocardiogram (ECG) interpretation. Applied Cognitive Psychology, 11, 543–554.

Starkes, J., & Ericsson, K. A. (Eds.) (2003). Expert performance in sport: Recent advances in research on sport expertise. Champaign, IL: Human Kinetics.

Sudman, S., Bradburn, N. M., & Schwarz, N. (Eds.) (1996). Thinking about answers: The application of cognitive processes to survey methodology. San Francisco, CA: Jossey-Bass.

van der Maas, H. L. J., & Wagenmakers, E. J. (2005). A psychometric analysis of chess expertise. American Journal of Psychology, 118, 29–60.

Verplanck, W. S. (1962). Unaware of where’s awareness: Some verbal operants-notates, moments and notants. In C. W. Eriksen (Ed.), Behavior and awareness – a symposium of research and interpretations (pp. 130–158). Durham, NC: Duke University Press.

Von Eckardt, B. (1998). Psychology of introspection. In E. Craig (Ed.), Routledge encyclopedia of philosophy (pp. 842–846). London: Routledge.

Ward, P., Hodges, N. J., Williams, A. M., & Starkes, J. L. (2004). Deliberate practice and expert performance: Defining the path to excellence. In A. M. Williams & N. J. Hodges (Eds.), Skill acquisition in sport: Research, theory and practice (pp. 231–258). London, UK: Routledge.

Watson, J. B. (1913). Psychology as the behaviorist views it. Psychological Review, 20, 158–177.

Watson, J. B. (1920). Is thinking merely the action of language mechanisms? British Journal of Psychology, 11, 87–104.

Wenger, M. J., & Payne, D. G. (1995). On the acquisition of mnemonic skill: Application of skilled memory theory. Journal of Experimental Psychology: Applied, 1, 194–215.

Wilding, J., & Valentine, E. (1997). Superior memory. Hove, UK: Psychology Press.

Wundt, W. (1897). Outlines of psychology (Translated by C. H. Judd). Leipzig: Wilhelm Engelmann.

Wundt, W. (1907). Über Ausfrageexperimente und über die Methoden zur Psychologie des Denkens [On interrogation experiments and on the methods of the psychology of thinking]. Philosophische Studien, 3, 301–360.

Author Notes

This article was prepared in part with support from the FSCW/Conradi Endowment Fund of Florida State University Foundation. The author wants to thank Robert Hoffman, Katy Nandagopal, and Roy Roring for their valuable comments on an earlier draft of this chapter.

