Scandinavian Journal of Psychology, 2006, 47, 461–470 DOI: 10.1111/j.1467-9450.2006.00530.x

© 2006 The Authors. Journal compilation © 2006 The Scandinavian Psychological Associations. Published by Blackwell Publishing Ltd., 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA. ISSN 0036-5564.

Cognition and Neurosciences

Child witnesses’ metamemory realism

CARL MARTIN ALLWOOD,¹ PÄR ANDERS GRANHAG² and ANNA-CARIN JONSSON³

¹ Department of Psychology, Lund University, Sweden
² Department of Psychology, Göteborg University, Sweden
³ School of Education and Behavioural Sciences, University College of Boras, Sweden

Allwood, C. M., Granhag, P. A. & Jonsson, A.-C. (2006). Child witnesses’ metamemory realism. Scandinavian Journal of Psychology, 47, 461–470.

This study investigated the degree of realism in the confidence judgments of 11 to 12-year-olds (41 girls and 40 boys) of their answers to questions relating to a short film clip showing a kidnapping event. Four different confidence scales were used: a numeric scale, a picture scale, a line scale and a written scale. The results demonstrated that the children showed a high level of overconfidence in their memories. However, no significant differences between the four confidence scales were found. Weak gender differences were found in that the girls were slightly, but significantly, better calibrated than the boys. In addition, although both boys and girls overestimated the total number of memory questions they had answered correctly, the boys gave higher estimates compared with the girls. In brief, the results indicate that, at least in the context investigated, 11–12 year-old children’s confidence in and estimations of their own event memory show poor realism (overconfidence and overestimation). A comparison with previous research on adults indicates that 11 to 12-year-old children show noticeably poorer realism.

Key words: Eyewitness memory, metacognition, 11 to 12-year-olds, confidence, realism.

Carl Martin Allwood, Department of Psychology, Lund University, Box 213, SE-221 00 Lund, Sweden. E-mail: [email protected]

INTRODUCTION

Children are increasingly often asked to act as witnesses in different stages of the forensic process, and in the last decade, the value of children’s testimonies in forensic situations has been much discussed. On such occasions, the young witnesses are often asked how confident they are in the accuracy of their memories of a certain event. In this context it is relevant that previous studies have shown that jurors rely heavily on witnesses’ confidence (Cutler, Penrod & Stuve, 1988; Lindsay, Wells & Rumpel, 1981; Wells, Lindsay & Ferguson, 1979).

However, eyewitnesses are often mistaken, and previous research has concluded that a mistaken eyewitness’ testimony is the single largest cause of jury convictions of innocent people (Wells & Bradfield, 1999 and research cited therein), and for this reason the realism of witnesses’ confidence judgments is of great importance. Much research has investigated the relation between the level of adult witnesses’ confidence and the accuracy of the witnesses’ statements, especially in the context of line-up identifications. Lately, the earlier quite pessimistic view of the realism of adult witnesses’ confidence judgments (e.g., Bothwell, Deffenbacher & Brigham, 1987) has changed into a more differentiated and somewhat more optimistic view (Allwood, Ask & Granhag, 2005a; Allwood, Granhag & Johansson, 2003; Read, Lindsay & Nicholls, 1998; Sporer, Penrod, Read & Cutler, 1995). This is partly due to improvements in methods for measuring the realism in witnesses’ confidence (further elaborated below). However, very little research has investigated the relation between child witnesses’ confidence and the accuracy of their event memory. This is the topic of the present study.

Research on children as witnesses has been conducted both in event memory situations and in line-up situations. This research has covered many topics, for example age differences in, and causes of, suggestibility (e.g., Ackil & Zaragoza, 1998; Ceci & Bruck, 1993; Holliday & Albon, 2004) and the prevalence of, and differences in outcome for, different types of questions asked by the interviewer (e.g., Milne & Bull, 1999; Peterson & Grant, 2001; Pipe, Lamb, Orbach & Esplin, 2004). An undisputed main finding with respect to suggestibility is that children, especially younger ones, are more suggestible than adults. With respect to the issue of differences in the prevalence of different types of questions asked, Pipe and colleagues (2004) concluded that “focused utterances are much more common in the field than open-ended questions are” (p. 453). In the same vein, Peterson and Grant (2001) concluded that “forensic interviewers typically ask forced choice questions, especially yes/no and multiple choice” (p. 118). In addition, Fisher, Geiselman and Raymond (1987) analyzed real police interviews and found that about 8 of 10 questions asked were closed.

Research on children’s performance on different types of questions shows that children’s answers to open-ended questions are usually much briefer (i.e., less complete) than adults’, but are likely to be accurate. With respect to directed questions, children’s level of accuracy decreases and is usually lower than that of adults (e.g., Milne & Bull, 1999; Pipe et al., 2004). Peterson and Grant (2001) reviewed research showing that when children (kindergartners, grades 2 and 4) were asked multiple-choice questions after having witnessed an event, and having been instructed that none of the alternatives were necessarily correct, they still had a disturbingly high tendency to choose one of the alternatives in situations where none of the alternatives were correct. Children’s tendency to avoid “don’t know” responses was also documented in a study by Roebers and Howie (2003). Here, 8- to 10-year-olds who had seen a 7 min long video and after 14 days answered questions on the film gave fewer “don’t know” responses than adult participants. A somewhat parallel finding has been reported in research on line-ups where children tend to make a selection in target-absent line-ups to a higher extent than adults (Pozzulo & Lindsay, 1998; 1999). Taken together, these results suggest that children, at least in the contexts studied, may have a greater tendency than adults to be “assertive” with respect to their episodic memories. Such an assertiveness may also spill over into children’s metacognitive judgments about the correctness of their own memories.

Research on children’s metamemory started in the 1970s (see, e.g., Flavell, 1979). In spite of the fact that there has been extensive research on children’s metacognition in the developmental and educational paradigms (for a review, see Hacker, Dunlosky & Graesser, 1998) and that progress has been made in our understanding of the extent to which children’s testimonies can be trusted (e.g., Ceci & Bruck, 1993, 1995; Lamb, Sternberg, Orbach, Hershkowitz & Esplin, 1999; Peterson & Grant, 2001; Pipe et al., 2004), little research has analyzed the realism of children’s confidence judgments (Allwood, Jonsson & Granhag, 2005b; Dirkzwager, 1996; Horgan, 1992; Newman & Wick, 1987; Roebers, 2002; Roebers, Gelhaar & Schneider, 2004; Roebers & Howie, 2003). Many of the studies on children’s confidence judgments have used other tasks than event memory (e.g., Dirkzwager, 1996; Horgan, 1992; Newman & Wick, 1987) and as far as we know only Allwood et al. (2005b), Roebers (2002), Roebers et al. (2004) and Roebers and Howie (2003) have studied the realism of children’s confidence judgments of answers to questions on their memory of events. Moreover, only some of the studies on children have any clear implications concerning the level of the realism in the children’s confidence. Although previous research included younger children (e.g., Roebers, 2002; Roebers et al., 2004), in the present study 11 to 12-year-old children acted as participants since we wanted to be sure that they would understand the concepts of probability involved in making confidence judgments (see e.g., Schlottmann & Anderson, 1994).

In the study by Roebers (2002) 8- and 10-year-old children as well as adults were shown a video, and their memory was tested six weeks later. At that time the participants also confidence rated their answers on a five-step scale. The scale steps showed “smiley” faces that varied from very sad to very happy and were labeled “very unsure” to “very sure”. When they felt uncertain “participants were encouraged to use the ‘I don’t know’ option” (p. 1054). The results (Exp. 3) showed a developmental effect for confidence judgments of incorrect (but not of correct) answers to two-alternative questions, in that 8- and 10-year-old children were more confident than adults for incorrect items.

Roebers et al. (2004) used a similar methodology and investigated children’s recall of a magic show observed in three different modalities (live show, video or slide show). The results showed that the magnitude of the difference in the confidence judgments of correct and incorrect answers increased significantly with age for 5–6-year-olds, 7–8-year-olds and 9–10-year-olds. Furthermore, no difference was found in metacognitive monitoring ability as a function of presentation modality. Roebers and Howie (2003) investigated 8- and 10-year-old children’s event memory of a short film clip and compared the results with those of adults. For unbiased questions, all three age groups were found to give significantly higher confidence judgments for correct answers than for incorrect answers. However, for misleading questions, only adults gave significantly higher confidence judgments for correct answers than for incorrect answers.

In the study by Allwood et al. (2005b), children were shown a short film clip and received feedback on their answers to questions on the film. Half of the participants were told that the feedback came from an unknown classmate and the other half were told that it came from an unknown schoolteacher at their school. Half of each participant’s correct answers to questions on the film were given confirmatory feedback and the other half were given disconfirmatory feedback, and similarly for their incorrect answers. The participants rated their confidence in each of their answers on a numeric scale from 50% (“Guessing”) to 100% (“Absolutely sure”). After positive feedback very high levels of overconfidence were found.

In the studies by Roebers (2002) and Roebers et al. (2004), the participants were instructed to use a very high response criterion. In contrast, in the present study we investigated the response criterion that children would spontaneously use. Furthermore, a problem in the studies by Roebers (2002) and Roebers et al. (2004) is that the nature of the confidence scale used does not allow an informative measurement of the degree of realism in the children’s confidence judgments. For example, it is not possible to understand the extent to which the children showed poorer realism than the adults or, more specifically, what aspects of their confidence judgments showed poorer realism. In general, calibration methodology appears to be the current methodology best suited to measure the realism in confidence judgments. Calibration methodology is better than, for example, computing the correlation between the level of the confidence judgments and the accuracy (see, e.g., Juslin, Olsson & Winman, 1996; Weingardt, Leonesio & Loftus, 1994; Wells, Olson & Charman, 2002). Juslin et al. (1996) pointed out that one important problem with the correlation measure is that the level of the correlation depends on the spread of the confidence judgments on the confidence scale. For example, even in a situation with de facto perfect realism, a small spread is associated with low correlations.
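
The dependence of the correlation on the spread of the confidence judgments can be illustrated with a small calculation. The Python sketch below is not taken from Juslin et al. (1996). It assumes a hypothetical, perfectly realistic judge for whom the probability of a correct answer equals the stated confidence, and who uses each listed confidence level equally often. Under these assumptions the covariance between confidence and correctness equals the variance of the confidence judgments, so the population correlation has a closed form. The function name and the two example sets of confidence levels are illustrative only.

```python
import math


def correlation_under_perfect_realism(levels):
    """Population Pearson correlation between confidence and correctness (0/1).

    Assumptions (hypothetical judge, not from the source):
    - each confidence level in `levels` is used equally often;
    - realism is perfect, i.e. P(correct | confidence r) = r;
    - at least two distinct levels are given, with a mean below 1.0.
    """
    k = len(levels)
    mean_conf = sum(levels) / k
    var_conf = sum((r - mean_conf) ** 2 for r in levels) / k
    # With perfect realism, overall accuracy equals mean confidence and
    # Cov(confidence, correct) = Var(confidence), so the correlation is
    # sqrt(Var(confidence) / Var(correct)).
    var_correct = mean_conf * (1.0 - mean_conf)
    return math.sqrt(var_conf / var_correct)


# Wide spread: the whole 50-100% scale is used.
wide = correlation_under_perfect_realism([0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
# Small spread: only two neighbouring levels are used.
narrow = correlation_under_perfect_realism([0.7, 0.8])

print(f"wide spread:  r = {wide:.3f}")
print(f"small spread: r = {narrow:.3f}")
```

With these assumed inputs, the correlation is roughly 0.39 when the whole scale is used and roughly 0.12 when only 70% and 80% are used, although realism is perfect in both cases.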

In many of the studies reviewed above, the children were asked to provide confidence judgments that were measured on a scale (Allwood et al., 2005b; Dirkzwager, 1996; Newman & Wick, 1987; Roebers, 2002; Roebers et al., 2004; Roebers & Howie, 2003). These scales differed, but none of the studies reported any problems with the way confidence was measured. However, it is relevant to ask how contingent the stability of results concerning the realism in children’s confidence judgments is on the specific scale used. In order to improve our understanding of this issue, we investigated the realism of children’s confidence judgments using four different scales. One of these four scales was a numeric scale that is commonly used in calibration research with adults. It is of interest to see if the realism in children’s confidence judgments when using this scale differs compared with using scales that, on the surface, may seem more suitable for children.

Gender differences

Not much research has investigated possible gender differences in the degree of realism in confidence judgments. For adults, few signs of such differences have been found (e.g., Jonsson & Allwood, 2003; Roebers, 2002; Roebers & Howie, 2003; Stankov, 1999). However, Beyer and Bowden (1997) presented results indicating that the degree of realism in confidence judgments might be associated with the conventional “masculinity/femininity” of the task domain. In contrast, Jonsson and Allwood (2003) found that both genders were overconfident in tasks involving logical/spatial ability, but found no gender differences for such tasks. However, for tasks involving word knowledge, females proved to be somewhat underconfident, while males showed fairly good realism.

For children, little research on gender differences has been carried out. In three experiments, Roebers (2002) found, similarly to the results for the adults in her study, no gender differences for 8- and 10-year-old children. Likewise, Roebers et al. (2004), Roebers and Howie (2003, Study 1) and Allwood et al. (2005b) found no gender differences in the realism of the item-specific confidence judgments of answers to event memory questions.

Frequency judgments

In addition to asking the participants to confidence rate each of their answers to 44 questions on the contents of a film clip they had just witnessed (item-specific confidence judgments), we also, at the end of the experimental session, asked them to estimate how many of all 44 questions they thought they had answered correctly. Such judgments are called frequency judgments, or aggregate-item judgments. When general knowledge questions are used, item-specific confidence judgments tend to result in overconfidence (but see Gigerenzer, Hoffrage & Kleinbölting, 1991) and frequency judgments tend to show rather good realism (Allwood & Granhag, 1996; Treadwell & Nelson, 1996, Exp. 1). That is, the estimated number of correct answers overlaps with the actual number of questions answered correctly. However, when episodic memory is tested, the item-specific confidence judgments again show overconfidence, but the results for frequency judgments vary between underestimation and realism (Allwood et al., 2003; Allwood, Knutsson & Granhag, 2006; Granhag, 1997; Granhag, Strömwall & Allwood, 2000). In other words, for adults, overestimation of performance when conducting frequency judgments appears rare.

For children, the degree of realism in frequency judgments has, to our knowledge, only been investigated by Allwood et al. (2005b). These authors gave their participants feedback on their performance, and later asked the children to give frequency judgments. The children, in contrast to the results in studies on adults’ frequency judgments, overestimated the number of correctly answered questions (by about 8%). In addition, as for the item-specific confidence judgments, no gender differences in the realism in the frequency judgments were found. It may be noted that the so-called confidence frequency effect (Gigerenzer et al., 1991), according to which the level of the confidence judgments is typically higher than that of the frequency judgments, appears to be valid both for adults and for children. Given this, and that the level of the children’s confidence judgments is sufficiently high, and that the absolute distance between the two types of judgments does not differ between children and adults, the children’s frequency judgments may be expected to show overconfidence. In the present study, we explore the issue of the realism in children’s frequency judgments further in a context where no feedback was given to the children on their performance.
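
As an illustration of how the two kinds of judgments can be compared, the following Python sketch expresses a frequency judgment as a proportion of the 44 questions and contrasts it with the actual proportion correct and with the mean item-specific confidence. The function names and the example values are hypothetical and are not data from any of the studies cited; the sketch only assumes that realism can be indexed by a simple signed difference.

```python
N_QUESTIONS = 44  # number of two-alternative questions, as in the present study


def frequency_realism(estimated_correct, actual_correct, n_questions=N_QUESTIONS):
    """Signed difference between estimated and actual proportion correct.

    Positive values indicate overestimation, negative values underestimation.
    """
    return (estimated_correct - actual_correct) / n_questions


def confidence_frequency_gap(confidences, estimated_correct, n_questions=N_QUESTIONS):
    """Mean item-specific confidence minus the frequency judgment as a proportion.

    A positive gap is the pattern described as the confidence frequency effect.
    """
    mean_confidence = sum(confidences) / len(confidences)
    return mean_confidence - estimated_correct / n_questions


# Hypothetical child: 22 answers rated 90% sure and 22 rated 80% sure,
# an estimate of 33 correct answers, and 29 answers actually correct.
confidences = [0.9] * 22 + [0.8] * 22
overestimation = frequency_realism(estimated_correct=33, actual_correct=29)
gap = confidence_frequency_gap(confidences, estimated_correct=33)

print(f"over/underestimation: {overestimation:+.3f}")
print(f"confidence minus frequency judgment: {gap:+.3f}")
```

In this made-up case the child overestimates performance by 4 of 44 questions (about 9 percentage points), while the mean confidence (85%) still lies above the frequency judgment (75%).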

Hypotheses

Drawing on previous research results, our first hypothesis was that children would show overconfidence and, in addition, a higher level of overconfidence than adults. Since previous research has indicated robustness of different types of confidence scales, our second hypothesis was that we would find no differences in the calibration measures for the four scales investigated. Our third hypothesis was that the children in our study, just as the children in the study by Allwood et al. (2005b), and as noted above, would show overconfidence in their frequency judgments, in contrast to adults in most previous studies on event memory (Allwood et al., 2003, in press; Granhag, 1997; Granhag et al., 2000). Finally, our fourth hypothesis was that we, similarly to most of the previous research studies, would find gender differences neither in the level of realism in the item-specific confidence judgments, nor in the level of realism in the frequency judgments.

METHOD

Participants

The participants were 81 children (41 girls and 40 boys, aged 11–12). They came from four school classes, grades 5–6, in a middle-class area in the south of Sweden. Each class received a contribution to a school trip of approximately US$115 for participating. None of the participants had any previous experience of this type of task. For all children, the parents’/carers’ permission was obtained before they participated in the study.

Design

The participants were randomized into four conditions that represented four different confidence judgment scales: the numeric scale (n = 20), the picture scale (n = 22), the line scale (n = 20) and the written scale (n = 19).

Materials

Videotape. A 3 min and 50 s long color videotape showing two men kidnapping a woman at a bus stop was used (see Granhag, 1997). First the camera (simulating a witness’ perceptual field) enters a bus stand, scans the nearby surroundings and finally focuses on an approaching woman who enters the bus stand. A few cars and two women pass by the scene. Suddenly, a car pulls up at the bus stand and two men get out and walk up to the woman. One of the men hands the woman an identity card. While she is checking the card, the other man firmly presses a handkerchief against her mouth and nose. The woman resists by screaming and fighting but is overpowered by the two men. During the fight, she drops her purse and one of the men bends down to pick it up as the camera approaches him. He pulls out a gun and points it at the camera. Faced with the gun the camera (still simulating a witness) backs up into its original position in the bus stand. The two men then place the woman, who is half unconscious, in the backseat of the car. Finally, they get into the car and drive away.

Questions and confidence/frequency judgments. Forty-four questions concerning the movie were used. Each question had two answer alternatives. After each of the 44 questions the participants made a confidence rating, i.e., they rated how sure they were that they had answered the question correctly on a scale ranging from 50% to 100%. It was explained that 50% meant that he/she was guessing and 100% meant that he/she was absolutely sure the answer was correct. At the end of the experiment, the participants performed a frequency judgment, i.e., they estimated how many of the 44 questions they had answered correctly.

Confidence scales

The four conditions differed depending on what kind of probability scale the participants in the condition received. The scales are shown in Figs. 1–4. The different scales were chosen in order to investigate the robustness of children’s confidence judgments using scales with different features. More specific justifications for each scale are given below, in connection with a more detailed description of each scale.

In the numeric scale condition (see Fig. 1), the participants made a numeric estimate consisting of any number from 50% (guessing) to 100% (absolutely sure), indicating how sure they were that the answer was correct. This scale is often used in calibration research with adult participants and was included in order to allow comparison with results from studies with adult participants.

In the picture scale condition, the scale consisted of six heads with mouths representing sadder (towards the left side of the paper) to happier (towards the right side of the paper) moods (see Fig. 2). Each face also had a balloon text. The text in the balloon spelled out the numeric scores with an accompanying written explanation (here translated from Swedish): 50% (I’m VERY UNSURE/GUESSING), 60% (I’m PRETTY UNSURE), 70% (I’m SOMEWHAT UNSURE), 80% (I’m SOMEWHAT SURE), 90% (I’m PRETTY SURE) and 100% (I’m VERY SURE/ABSOLUTELY SURE). The participants were instructed to put a cross over the head that best corresponded to their feelings of confidence in having chosen the correct answer. This type of scale is often used in research with younger children (e.g., Roebers, 2002) and was included as a comparison scale to the more “mature” scales such as the numeric scale above.

Fig. 1. The numeric scale.

Fig. 2. The picture scale.

In the line scale condition, the participants were instructed to indicate their confidence by drawing a vertical line over a horizontal rectangle (see Fig. 3). The left half of the rectangle was black. This indicated that the base-line probability to answer a question correctly was 50% (two answer alternatives were given for each question). The participants could draw the line anywhere from the middle where the black field ended (at this point “Absolutely unsure, 50%” was written above the horizontal line) up to the right-hand end (where “Absolutely sure, 100%” was written above the line). The use of the line scale was inspired by research presented by Nilsson (1998), who, on the basis of his results, suggested that younger children’s (6-year-olds and to some extent 10-year-olds) use of probability might be influenced by “perceptual factors such as size, shape and colour” (Nilsson, 1998, p. 97). The line scale was included in order to examine if the possibility to give estimates by delimiting different-sized areas would influence the 11 to 12-year-old participants’ confidence ratings compared to the other scales.

In the written scale condition, written explanations and numeric figures were presented in rows with “50% Absolutely unsure (Correct 50 times of 100)” at the top, followed by (each on a new line) “60% Pretty unsure (Correct 60 times of 100)”, “70% Somewhat unsure (Correct 70 times of 100)”, “80% Somewhat sure (Correct 80 times of 100)”, “90% Pretty sure (Correct 90 times of 100)” and “100% Absolutely sure (Correct 100 times of 100)” at the bottom (see Fig. 4). The participants were instructed to put a cross to the left of the probability estimate that best corresponded to their feelings of confidence in the specific answer being correct.

The inclusion of the written scale was based on results presented by Teigen (2001) and Teigen and Brun (1999). These studies showed that, for adults, results for written probability phrases tended to differ from those for numerical probabilities, and we wanted to explore whether the inclusion of written probability statements would have any effect on the level of children’s confidence judgments.

Procedure

First, the participants in all four conditions were informed that they were participating in an experiment and that it was important that they paid close attention to the video that they were about to see. The video was shown to groups of five individuals. The children were seated with intervals of 2 m and good order prevailed during the short show. After viewing the videotape, the participants completed a 10 min training session on probability assessments. The training session was included in order to (a) make sure that the participants understood the probability scale used when making the confidence judgments and (b) function as a filler-task to reduce memory of the content of the videotape. In the training session, general explanations concerning probability estimates were given (for example, it was explained that a scale value of 60% meant that in the long run, 60% of the items they had confidence rated as 60% sure should be correct). Together with the experimental leader the children solved practical examples. Next, the children received general instructions about the study and answered 44 two-alternative questions on the film. After each question, the children confidence rated their answers on a scale ranging from 50% to 100%. In this phase, the four conditions differed in the kind of probability scale the children used (see above). After all questions were answered and confidence rated, the experimenter collected the question forms. Finally, the children gave a frequency judgment, i.e., they estimated how many of the 44 questions they had answered correctly. Before leaving, the participants were thanked and asked not to talk about the contents of the experiment with anyone during the day.

Calibration measures

Three calibration measures were used to analyze the degree of realism in the participants’ confidence judgments.

Calibration reflects the relation between the level of the confidence ratings and the accuracy. The formula for computing calibration is:

$$\text{Calibration} = \frac{1}{n} \sum_{t=1}^{T} n_t (r_{tm} - c_t)^2$$

Here $n$ is the total number of questions answered, $T$ is the number of confidence classes used, $c_t$ is the proportion of correct answers for all items in the confidence class $r_t$, $n_t$ is the number of times the confidence class $r_t$ was used and $r_{tm}$ is the mean of the confidence ratings in confidence class $r_t$.

Thus, calibration is computed by first dividing participants’ confidence ratings into a number of confidence classes. Next, for each confidence class, the difference is taken between the mean confidence for the items and the proportion of correct items. Finally, the squared differences multiplied by the number of responses in the confidence class are summed over confidence classes and divided by the total number of items.

Over/underconfidence is computed in the same way, except that the differences are not squared. The measure indicates whether an individual is overconfident (positive value) or underconfident (negative value). Calibration is perfect and over/underconfidence is absent when their values are zero.

Fig. 3. The line scale.

Fig. 4. The written scale.

Resolution, loosely speaking, reflects the ability of the subject to distinguish between two sets of answers, one correct and one incorrect. The formula for computing resolution is:

$$\text{Resolution} = \frac{1}{n} \sum_{t=1}^{T} n_t (c_t - c)^2$$

Here, $c$ is the proportion of all items for which the correct alternative was selected. To achieve maximal resolution a subject within each confidence class has to assign the lowest confidence to all questions answered incorrectly and the highest confidence to all questions answered correctly (or vice versa). A higher value reflects better resolution than a lower value. These measures are further described in Lichtenstein, Fischhoff and Phillips (1982).
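The three measures described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors’ analysis code: the function names, the mapping of ratings to the six confidence classes (50–59, 60–69, 70–79, 80–89, 90–99, 100) and the example data are assumptions made for the illustration.

```python
from collections import defaultdict


def confidence_class(rating):
    """Map a confidence rating in percent (50..100) to a class index 0..5.

    Assumed binning: 50-59, 60-69, 70-79, 80-89, 90-99 and 100.
    """
    if not 50 <= rating <= 100:
        raise ValueError("rating must lie between 50 and 100")
    return min(int(rating - 50) // 10, 5)


def realism_measures(ratings, correct):
    """Return (calibration, over/underconfidence, resolution) for one participant.

    ratings: confidence ratings in percent, one per answered question
    correct: booleans of the same length, True where the answer was correct
    """
    n = len(ratings)
    if n == 0 or n != len(correct):
        raise ValueError("ratings and correct must be non-empty and equally long")

    # Group (confidence as proportion, correctness as 0/1) by confidence class.
    classes = defaultdict(list)
    for rating, ok in zip(ratings, correct):
        classes[confidence_class(rating)].append((rating / 100.0, 1.0 if ok else 0.0))

    c = sum(1.0 for ok in correct if ok) / n  # overall proportion correct
    calibration = over_under = resolution = 0.0
    for items in classes.values():
        n_t = len(items)                           # times the class was used
        r_tm = sum(r for r, _ in items) / n_t      # mean confidence in the class
        c_t = sum(k for _, k in items) / n_t       # proportion correct in the class
        calibration += n_t * (r_tm - c_t) ** 2
        over_under += n_t * (r_tm - c_t)
        resolution += n_t * (c_t - c) ** 2
    return calibration / n, over_under / n, resolution / n


# Hypothetical participant: six answers, half of them correct.
example_ratings = [100, 100, 100, 100, 60, 60]
example_correct = [True, True, False, False, True, False]
cal, ou, res = realism_measures(example_ratings, example_correct)
```

In this made-up example the participant is overconfident (positive over/underconfidence) and shows no resolution, because the proportion correct is the same in both confidence classes used.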

RESULTS

We first present the calibration curves for the four different scales (numeric, picture, line and written) and the analyses relating to the five dependent measures: calibration, over/underconfidence, resolution, accuracy and confidence. After an analysis of the variance in the participants’ confidence judgments, we present an analysis of the five dependent measures with respect to gender. After this, the results for the frequency judgments are presented in total and by gender.

Realism and the four confidence scales

Figure 5 shows the calibration curve for each of the four conditions. The x-axis shows the six different confidence classes (50–59, 60–69, 70–79, 80–89, 90–99, 100) and the y-axis shows the proportion of correct answers. The reference line (the diagonal) represents perfect calibration. As can be seen, the participants showed overconfidence in all four conditions. Table 1 shows the percentages of all items in each of the four conditions (numeric, picture, line, written) in each of the six confidence classes, 50–59, 60–69, 70–79, 80–89, 90–99 and 100.

Five univariate ANOVAs, with the four scale conditions (numeric, picture, line and written) as the between-subjects factor, were computed for the results from the 44 questions for the five measures: calibration, over/underconfidence, resolution, accuracy and confidence. The results are shown in Table 2. No significant differences between the four scales were found for any of the dependent measures.
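Because over/underconfidence is the unsquared, class-weighted sum of the differences between mean confidence and proportion correct, it reduces algebraically to mean confidence minus overall accuracy. The condition means in Table 2 should therefore agree with this identity up to rounding. The following sketch checks this; the dictionary layout, the function names and the tolerance of 0.011 (to allow for two-decimal rounding of the reported means) are assumptions made for the illustration.

```python
# Condition means copied from Table 2 of the paper.
TABLE_2 = {
    "numeric": {"confidence": 0.81, "accuracy": 0.59, "over_under": 0.22},
    "picture": {"confidence": 0.82, "accuracy": 0.58, "over_under": 0.24},
    "line": {"confidence": 0.76, "accuracy": 0.56, "over_under": 0.20},
    "written": {"confidence": 0.78, "accuracy": 0.57, "over_under": 0.22},
}


def implied_over_under(confidence, accuracy):
    """Over/underconfidence implied by mean confidence and mean accuracy."""
    return round(confidence - accuracy, 2)


def discrepancies(table, tolerance=0.011):
    """List conditions whose reported value deviates from the implied one
    by more than the (assumed) rounding tolerance."""
    return [
        name
        for name, row in table.items()
        if abs(implied_over_under(row["confidence"], row["accuracy"]) - row["over_under"])
        > tolerance
    ]


flagged = discrepancies(TABLE_2)
```

An empty `flagged` list means that every reported over/underconfidence value is consistent with the reported confidence and accuracy means within rounding.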

In order to analyze whether the four scales affected the error component of the participants’ confidence judgments, we first computed the SD of each participant’s confidence judgments. The average SDs in each condition were 16.57 for the numeric scale condition, 14.64 for the picture scale condition, 15.82 for the line scale condition and 15.29 for the written scale condition. Next, we computed a one-way ANOVA with the SDs of the participants’ confidence judgments as the dependent variable. No significant effect was found.
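The two steps of this analysis (one SD per participant, then a one-way ANOVA on those SDs) can be sketched as follows. The ratings below are invented for the illustration, the use of the sample SD (n − 1 in the denominator) is an assumption since the paper does not state which SD was used, and `one_way_anova_f` is a hypothetical helper that only returns the F ratio and its degrees of freedom.

```python
from statistics import mean, stdev


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way between-subjects ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical confidence ratings (percent): two children per scale condition.
ratings_by_condition = {
    "numeric": [[50, 100, 80, 60], [70, 90, 100, 100]],
    "picture": [[60, 70, 80, 90], [100, 100, 90, 50]],
    "line": [[50, 50, 60, 100], [80, 85, 90, 95]],
    "written": [[50, 60, 70, 100], [90, 100, 100, 70]],
}

# Step 1: one SD per child (sample SD, an assumption).
sds_by_condition = {
    condition: [stdev(child) for child in children]
    for condition, children in ratings_by_condition.items()
}

# Step 2: one-way ANOVA with the SDs as the dependent variable.
f_value, df_b, df_w = one_way_anova_f(list(sds_by_condition.values()))
```

A p value would additionally require the F distribution, which is left out here to keep the sketch free of third-party dependencies.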

Fig. 5. Calibration curves for the four scales (numeric, picture, line and written).

Table 1. Percentages of all items in each of the four conditions (numeric, picture, line, written scale) in each of the six confidence classes, 50–59, 60–69, 70–79, 80–89, 90–99 and 100

Scale      50–59   60–69   70–79   80–89   90–99   100
Numeric    17.8    10.2    18.0    8.1     13.2    32.7
Picture    10.6    9.3     12.2    16.8    18.6    32.4
Line       28.1    12.0    14.7    11.0    17.4    16.8
Written    13.4    11.4    15.2    17.2    22.0    20.8

Table 2. Means (and SDs within parentheses) for the four scales (numeric, picture, line and written) on the five measures calibration, over/underconfidence, resolution, accuracy and confidence

Measure                Numeric       Picture       Line          Written
Calibration            0.10 (0.06)   0.10 (0.05)   0.09 (0.05)   0.09 (0.05)
Over/underconfidence   0.22 (0.13)   0.24 (0.11)   0.20 (0.12)   0.22 (0.09)
Resolution             0.04 (0.02)   0.03 (0.02)   0.03 (0.02)   0.03 (0.02)
Accuracy               0.59 (0.09)   0.58 (0.08)   0.56 (0.08)   0.57 (0.08)
Confidence             0.81 (0.09)   0.82 (0.08)   0.76 (0.10)   0.78 (0.07)

In order to test our first hypothesis, we compared our results for overconfidence with the results for adult participants in a condition in the study by Allwood et al. (2003, Exp. 2, phase 1) that was conducted in a very similar way to the present study. In this condition (n = 22) the participants were university students and the mean age was 22.6, ranging from 19 to 38 years. The participants used the same numerical scale, the same film and the same memory questions (one question more) as in the present study. Furthermore, just as in the present study, the participants answered the questionnaire more or less directly after having watched the film. The overconfidence in the Allwood et al. (2003, Exp. 2, phase 1) study was 0.12 (see note 1), to be compared with the overconfidence in the numeric scale condition in the present study, 0.22. The effect size for this comparison was Cohen’s d = 0.96.
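For readers who want to see how an effect size of this kind is obtained, the sketch below computes Cohen’s d from group means, SDs and sample sizes using the pooled standard deviation. The paper does not state which variant of d was used, so the pooled-SD form is an assumption. The children’s values come from the numeric scale condition (mean 0.22, SD 0.13, n = 20) and the adult mean and n from the comparison study (0.12, n = 22); the adult SD is not reported in this article, so `ADULT_SD_PLACEHOLDER` is a purely hypothetical value.

```python
import math


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    if n_1 < 2 or n_2 < 2:
        raise ValueError("each group needs at least two observations")
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_var)


# Hypothetical placeholder: the adult SD is NOT given in this article.
ADULT_SD_PLACEHOLDER = 0.07

d_children_vs_adults = cohens_d(
    0.22, 0.13, 20,                  # children, numeric scale condition (Table 2)
    0.12, ADULT_SD_PLACEHOLDER, 22,  # adults, Allwood et al. (2003, Exp. 2, phase 1)
)
```

Because the adult SD is a placeholder, the resulting value should not be read as a reproduction of the reported d = 0.96.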

Gender differences in the item-specific confidence judgments

Five one-way ANOVAs, with the between-subjects factor gender, were computed for the same five dependent measures as above. For each ANOVA the four conditions (numeric, picture, line and written) were collapsed. The results (see Table 3) showed a significant effect for calibration, F(1, 79) = 4.51, p < 0.037 (girls M = 0.08 and boys M = 0.10), i.e., the girls were better calibrated than the boys. Also, confidence showed a significant difference, F(1, 79) = 4.65, p < 0.034, between girls (M = 77%) and boys (M = 82%).

We refrained from computing a 2 (girl, boy) × 4 (numeric, picture, line, and written) ANOVA, since gender was not equally distributed over the four scales: numeric (girls n = 12, boys n = 8), picture (girls n = 11, boys n = 11), line (girls n = 11, boys n = 9) and written (girls n = 7, boys n = 12).

Frequency judgments

The children were also asked to give a frequency judgment, i.e., they were asked to estimate how many of the 44 questions they thought they had answered correctly. For ease of comparison, the estimates given by the children, stated in terms of number of questions, have been translated into percentage of questions estimated as correctly answered.

In order to test for differences between the scales, a one-way ANOVA was computed on the frequency judgments with scale (numeric, picture, line and written) as the between-subjects factor. A close to significant effect was found (numeric M = 75%, picture M = 67%, line M = 63%, and written M = 68%), F(3, 77) = 2.60, p < 0.058. The pairwise comparison based on estimated marginal means, and with adjustment for multiple comparisons (Bonferroni), showed that the numeric scale and the line scale differed significantly, p < 0.048.

Further, a one-way ANOVA was computed on the frequency judgments with the between-subjects factor gender. The result showed a significant difference between girls (M = 65%) and boys (M = 72%), F(1, 79) = 5.07, p < 0.027. In other words, the girls believed they had fewer accurate answers than did the boys. It should be noted that the girls’ accuracy did not differ from the boys’ (see Table 3).

In order to analyze the realism in the frequency judgments, two paired samples t-tests were computed, one for girls and one for boys, comparing the level of the frequency judgments with the level of accuracy. The results showed a significant difference between the levels of frequency judgments and accuracy both for girls and boys. For girls, the mean frequency judgment was 65% and the mean accuracy was 58%, t(1, 40) = −3.01, p < 0.004. For boys, the mean frequency judgment was 72% and the mean accuracy 57%, t(1, 39) = −5.80, p < 0.001. Thus, both genders believed they had answered more questions correctly than they in fact had.
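The frequency-judgment analysis involves two simple computations: translating an estimated number of correct answers (out of 44) into a percentage, and a paired-samples t statistic comparing accuracy with the frequency judgment for the same children. The sketch below illustrates both. The data are invented, the function names are hypothetical, and the direction of the difference (accuracy minus frequency judgment, which yields negative t values when children overestimate) is an assumption inferred from the sign of the reported statistics.

```python
import math
from statistics import mean, stdev

N_QUESTIONS = 44  # number of two-alternative questions in the study


def to_percent(n_estimated_correct, n_questions=N_QUESTIONS):
    """Translate an estimated number of correct answers into a percentage."""
    return 100.0 * n_estimated_correct / n_questions


def paired_t(x, y):
    """Return (t, df) for a paired-samples t-test on the differences x - y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be equally long with at least two pairs")
    diffs = [a - b for a, b in zip(x, y)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(len(diffs))), len(diffs) - 1


# Hypothetical data for five children: accuracy (percent correct) and the
# number of questions each child believed to have answered correctly.
accuracy_percent = [57.0, 61.0, 55.0, 59.0, 52.0]
estimated_correct = [30, 33, 28, 35, 26]
frequency_percent = [to_percent(k) for k in estimated_correct]

t_value, df = paired_t(accuracy_percent, frequency_percent)
```

With these invented numbers the t value is negative, reflecting frequency judgments that exceed accuracy.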

DISCUSSION

This study investigated the realism in 11- to 12-year-old children’s confidence in their answers to questions on a short film clip they had witnessed. Our first hypothesis was confirmed, in that the children showed overconfidence. Moreover, a comparison with results from a group of adults (Allwood et al., 2003, Exp. 2, phase 1) who had seen the same film, answered the same questions in similar circumstances and used the same numerical confidence scale as the numeric scale condition in the present study, showed that the children were much more overconfident than the adults. The effect size for this comparison was d = 0.96, i.e., “large”. Comparisons with other adult groups tested in somewhat less similar circumstances supported the same conclusion (compare note 1). This result is important for the evaluation of children’s confidence by the courts and in the forensic process in general. It is, however, important that more research is conducted on this issue in order to understand its possible limitations.

The results by Allwood et al. (2005b) supported the present results in that they showed that 11- to 12-year-old children (after having received confirmatory feedback on their answers) showed very high levels of overconfidence. The study by Roebers (2002) found developmental effects showing that children showed higher confidence for incorrect items than adults. However, no such difference in confidence was found for correct items. Likewise, the study by Roebers and Howie (2003) found indications of differences between children and adults with respect to the appropriateness of their confidence judgments. However, the methodology used in the studies by Roebers and collaborators did not allow for any conclusions on the extent of the children’s overconfidence.

Table 3. Means for the two genders for the five measures calibration, over/underconfidence, resolution, accuracy and confidence

Measure                Girls    Boys
Calibration            0.08*    0.10
Over/underconfidence   0.19     0.24
Resolution             0.03     0.03
Accuracy               0.58     0.57
Confidence             0.77*    0.82

* p < 0.05.

It is not clear how the differences in overconfidence between the 11- to 12-year-old children and adults can be explained. One possibility is that the children were generally less skilled in using the confidence scale. This is likely to increase the error component in the children’s confidence judgments, which would show up in the size of the SDs for individual participants’ confidence judgments. As shown by Erev et al. (1994), greater variability in confidence judgments, everything else equal, is associated with greater overconfidence. However, when we computed the SDs for individuals’ confidence judgments in the different scale conditions, the average SDs for the confidence judgments were all found to be lower than the average SD for the confidence judgments (17.69) in the most similar adult comparison group we could find in our previous research (Allwood et al., 2003, Exp. 2, phase 1, described above). Thus, the general explanation for the 11- to 12-year-old children’s greater overconfidence, that they were generally less skilled in performing confidence judgments, did not receive empirical support.

However, in this general context, it is of interest to note that the children’s greater overconfidence, compared with adults, is in line with a general tendency for children to be assertive about their episodic memories. Previous research has, for example, shown that children, compared with adults, have a greater tendency to avoid “don’t know” options and to have a greater tendency to make false identifications in target-absent line-ups. Moreover, on a speculative note, children’s greater assertiveness about their episodic memories may at least partly explain their greater suggestibility documented in previous research. However, the possible implications of the notion that children are more assertive about their episodic memories than adults need to be further investigated in future research.

Our second hypothesis was also confirmed, in that the children’s metacognitive realism as measured by the different confidence scales did not differ. Thus, 11- to 12-year-old children’s confidence judgments appear robust, at least to the extent that they were not much affected by the differences in the features of the different scales. These results are in line with the results presented by Roebers et al. (2004), who did not find any difference in metacognitive monitoring ability as a function of the three types of presentation modality studied (live show, video and slides).

Even though we did not find any difference in the results for the scales investigated, our recommendation is, however, to use the numeric scale. Since this is the scale most frequently used in studies of adults’ confidence judgments, the results will be more comparable between studies if the same scales are used in the exploration of developmental effects. However, a caveat is needed with respect to the age of children. Our results relate to 11- to 12-year-old children and further research is needed in order to know if such a scale is suitable also for younger children. For example, there is evidence (Nilsson, 1998) that children as old as 6 years base their probability estimates on perceptual factors such as size, shape and color.

The third hypothesis, that the participants would show overconfidence in their judgments of their number of correctly answered event memory questions (frequency judgments), was confirmed. In contrast to previous research on adults’ frequency judgments (Allwood et al., 2003, 2006; Granhag, 1997; Granhag et al., 2000, 2004), but in line with the results in our previous research (Allwood et al., 2005b), the children showed a significant overestimation in their frequency judgments. In other words, the children thought they had answered more questions correctly than they actually had. Given that children and adults do not differ in the distance between the levels of their confidence and the frequency judgments, this result was expected, since our participants showed high levels in their confidence judgments and also evidenced the so-called confidence-frequency effect, i.e., that the confidence judgments were at a higher level than the frequency judgments.

Our fourth hypothesis was not confirmed. Here, again relying on previous research on adults where the major finding is that there are no gender effects (e.g., Jonsson & Allwood, 2003; Stankov, 1999), even if the results are slightly mixed, we did not expect any gender effects. However, our results showed gender differences in the measures of calibration and confidence. The girls were significantly better calibrated than the boys, although the difference in absolute terms was not large. The girls also showed lower confidence than the boys. The results are in contrast to previous research where no gender differences have been found in the metacognitive appropriateness of confidence judgments (e.g., Roebers, 2002; Roebers & Howie, 2003). A gender difference also appeared in the frequency judgments, where the boys overestimated their number of correctly answered questions to a higher degree than the girls. Here it is important to remember that boys and girls did not differ on the accuracy measure. In our previous research (Allwood et al., 2005b) on 12-year-old children receiving feedback, we found no significant differences between the genders with respect to their frequency judgments, although the results were in the same direction as in the present study. Summarizing the research on gender differences, the results are mixed, but whereas the tendency for adults appears to be that there are no gender differences, the tendency for children is less stable. However, more research is obviously needed on this issue.

Our results show that 11- to 12-year-old children demonstrate worse realism in both confidence and frequency judgments than has been the case for adults in previous studies, using the same film clip and the same questions (e.g., Allwood et al., 2003, 2006; Granhag, 1997; Granhag et al., 2000). It is of interest that the children had only somewhat lower accuracy than the adults, on average 0.59 compared to 0.61 (averaged over the results in Allwood et al., 2006; Granhag, 1997; Granhag et al., 2004, which all used the same film and questions but with adults). The children overestimated their performance to a very high degree, both with respect to the item-specific judgments and the frequency judgments. Future research should be conducted in order to better describe the developmental process from the high level of overconfidence evidenced in children in this study, and the study by Allwood et al. (2005b), to the more moderate levels of overconfidence commonly found in adults in the same context (e.g., Allwood et al., 2006).

Our results and those of Roebers (2002), Roebers et al. (2004) and Roebers and Howie (2003) are relevant to the question of children’s credibility in the forensic process. Previous research by, for example, Myers, Redlich, Goodman, Prizmich and Imwinkelried (1999) has reported mixed results with respect to children’s credibility in court and in the forensic process, depending for example on the type of study and the type of crime involved. However, the present results unfortunately show that when it comes to the children’s confidence in their answers to closed questions with answer alternatives, there may be less reason to find them as credible as adults.

This study was supported by a grant from the Bank of Sweden Tercentenary Foundation given to the first author.

NOTE

1 These results are representative of other studies using the same film clip, the same confidence scale, the same questions and adult participants. The value for overconfidence in the relevant conditions in Allwood et al. (2003), Allwood et al. (2006), Granhag (1997), Granhag et al. (2000) and Granhag et al. (2004) varies between M = 0.061 and M = 0.127.

REFERENCES

Ackil, J. K. & Zaragoza, M. S. (1998). Memorial consequences of forced confabulation: Age differences in susceptibility to false memories. Developmental Psychology, 34, 1358–1372.

Allwood, C. M., Ask, K. & Granhag, P. A. (2005a). The Cognitive Interview: Effects on the realism in witnesses’ confidence in their free recall. Psychology, Crime & Law, 11, 183–198.

Allwood, C. M. & Granhag, P. A. (1996). Considering the knowledge you have: effects on realism in confidence judgements. European Journal of Cognitive Psychology, 8, 235–256.

Allwood, C. M., Granhag, P. A. & Johansson, M. (2003). Increased realism in eyewitness confidence judgments: The effect of dyadic collaboration. Applied Cognitive Psychology, 17, 545–561.

Allwood, C. M., Jonsson, A. C. & Granhag, P. A. (2005b). The effects of source and type of feedback on child witnesses’ metamemory accuracy. Applied Cognitive Psychology, 19, 331–344.

Allwood, C. M., Knutsson, J. & Granhag, P. A. (2006). Eyewitness under influence: How feedback affect the realism in confidence. Psychology, Crime & Law, 12, 25–38.

Beyer, S. & Bowden, E. M. (1997). Gender differences in self-perception: Convergent evidence from three measures of accuracy and bias. Personality & Social Psychology Bulletin, 23, 157–172.

Bothwell, R. K., Deffenbacher, K. A. & Brigham, J. C. (1987). Correlations of eyewitness accuracy and confidence: optimality hypothesis revisited. Journal of Applied Psychology, 72, 691–695.

Ceci, S. J. & Bruck, M. (1993). Suggestibility of the child witness: A historical review and synthesis. Psychological Bulletin, 113, 403–439.

Ceci, S. J. & Bruck, M. (1995). Jeopardy in the courtroom: A scientific analysis of children’s testimony. American Psychological Association: Washington, DC.

Cutler, B., Penrod, S. D. & Stuve, T. E. (1988). Juror decision making in eyewitness identification cases. Law and Human Behavior, 12, 41–55.

Dirkzwager, A. (1996). Testing with personal probabilities: 11-year-olds can correctly estimate their personal probabilities. Educational and Psychological Measurement, 56, 957–971.

Erev, I., Wallsten, T. S. & Budescu, D. V. (1994). Simultaneous over- and underconfidence: The role of error in judgment processes. Psychological Review, 101, 519–527.

Fisher, R. P., Geiselman, R. E. & Raymond, D. S. (1987). Critical analysis of police interview techniques. Journal of Police Science and Administration, 15, 177–185.

Flavell, J. H. (1979). Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry. American Psychologist, 34, 906–911.

Granhag, P. A. (1997). Realism in eyewitness confidence as a function of type of event witnessed and repeated recall. Journal of Applied Psychology, 82, 599–613.

Granhag, P. A., Jonsson, A. C. & Allwood, C. M. (2004). The Cognitive Interview and its effect on witnesses’ confidence. Psychology, Crime & Law, 10, 37–52.

Granhag, P. A., Strömwall, L. A. & Allwood, C. M. (2000). Effects of reiteration, hindsight bias, and memory on realism in eyewitness confidence. Applied Cognitive Psychology, 14, 397–420.

Gigerenzer, G., Hoffrage, U. & Kleinbölting, H. (1991). Probabilistic mental models: A Brunswikian theory of confidence. Psychological Review, 98, 519–527.

Hacker, D. J., Dunlosky, J. & Graesser, A. C. (Eds.) (1998). Metacognition in educational theory and practice. London: Lawrence Erlbaum Associates.

Holliday, R. E. & Albon, A. J. (2004). Minimizing misinformation effects in young children with cognitive interview mnemonics. Applied Cognitive Psychology, 18, 263–281.

Horgan, D. D. (1992). Children and chess expertise: The role of calibration. Psychological Research, 54, 44–50.

Jonsson, A.-C. & Allwood, C. M. (2003). Stability and variability in the realism of confidence judgments over time, content domain, and gender. Personality and Individual Differences, 34, 559–574.

Juslin, P., Olsson, H. & Winman, A. (1996). Calibration and diagnosticity of confidence in eyewitness identification. Comments on what cannot be inferred from a low confidence-accuracy correlation. Journal of Experimental Psychology: Learning, Memory and Cognition, 22, 1304–1316.

Lamb, M. E., Sternberg, K. J., Orbach, Y., Hershkowitz, I. & Esplin, P. W. (1999). Forensic interviews of children. In Memon, A. & Bull, R. (Eds.), Handbook of the psychology of interviewing (pp. 253–277). New York: Wiley.

Lichtenstein, S., Fischhoff, B. & Phillips, L. D. (1982). Calibration of probabilities: The state of the art to 1980. In D. Kahneman, P. Slovic & A. Tversky (Eds.), Judgments under uncertainty: Heuristics and biases (pp. 306–334). New York: Cambridge University Press.

Lindsay, R. C. L., Wells, G. L. & Rumpel, C. M. (1981). Can people detect eyewitness identification accuracy within and across situations? Journal of Applied Psychology, 66, 79–89.

Milne, R. & Bull, R. (1999). Investigative interviewing – Psychology and practice. Chichester: John Wiley & Sons.

Myers, J. E. B., Redlich, A. D., Goodman, G. S., Prizmich, L. P. & Imwinkelried, E. (1999). Jurors’ perceptions of hearsay in child sexual abuse cases. Psychology, Public Policy, and Law, 5, 388–419.


Newman, R. S. & Wick, P. L. (1987). Effect of age, skill, and performance feedback on children’s judgments of confidence. Journal of Educational Psychology, 79, 115–119.

Nilsson, S.-E. (1998). Children’s decision making under conditions of certainty and risk. Department of Psychology, Göteborg University, Sweden. (doctoral dissertation).

Peterson, C. & Grant, M. (2001). Forced choice: Are forensic interviewers asking the right questions? Canadian Journal of Behavioural Science, 33, 118–127.

Pipe, M.-E., Lamb, M. E., Orbach, Y. & Esplin, P. W. (2004). Recent research on children’s testimony about experienced and witnessed events. Developmental Review, 24, 440–468.

Pozzulo, J. D. & Lindsay, R. C. L. (1998). Identification accuracy of children versus adults: A metaanalysis. Law and Human Behavior, 22, 549–570.

Pozzulo, J. D. & Lindsay, R. C. L. (1999). Elimination lineups: An improved identification procedure for child witnesses. Journal of Applied Psychology, 84, 167–176.

Read, J. D., Lindsay, D. S. & Nicholls, T. (1998). The relation between confidence and accuracy in eyewitness identification studies. In C. P. Thompson, D. J. Hermann, J. D. Read, D. Bruce, D. G. Payne & M. P. Toglia (Eds.), Eyewitness memory: Theoretical and applied perspectives (pp. 107–130). Mahwah, NJ: Erlbaum.

Roebers, C. M. (2002). Confidence judgments in children’s and adults’ event recall and suggestibility. Developmental Psychology, 38, 1052–1067.

Roebers, C. M., Gelhaar, T. & Schneider, W. (2004). “It’s magic!” The effects of presentation modality on children’s event memory, suggestibility and confidence judgments. Journal of Experimental Child Psychology, 87, 320–335.

Roebers, C. M. & Howie, P. (2003). Confidence judgments in event recall: Developmental progression in the impact of question format. Journal of Experimental Child Psychology, 85, 352–371.

Schlottmann, A. & Anderson, N. H. (1994). Children’s judgments of expected value. Developmental Psychology, 30, 56–66.

Sporer, S., Penrod, S., Read, D. & Cutler, B. (1995). Choosing, confidence, and accuracy: A meta-analysis of the confidence-accuracy relation in eyewitness identification studies. Psychological Bulletin, 118, 315–327.

Stankov, L. (1999). Mining on the “No Man’s Land” between intelligence and personality. In P. L. Ackerman, P. C. Kyllonen & R. D. Roberts (Eds.), Learning and individual differences (pp. 315–337). Washington, DC: American Psychological Association.

Teigen, K. H. (2001). When equal chances = Good chances: Verbal probabilities and the equiprobability effect. Organizational Behavior and Human Decision Processes, 85, 77–108.

Teigen, K. H. & Brun, W. (1999). The directionality of verbal probability expressions: Effects on decisions, predictions, and probabilistic reasoning. Organizational Behavior and Human Decision Processes, 80, 155–190.

Treadwell, J. R. & Nelson, T. O. (1996). Availability of information and the aggregation of confidence in prior decisions. Organizational Behavior and Human Decision Processes, 68, 13–27.

Weingardt, K. R., Leonesio, R. J. & Loftus, E. F. (1994). Viewing eyewitness research from a metacognitive perspective. In J. Metcalfe & A. P. Shimamura (Eds.), Metacognition: Knowing about knowing (pp. 155–184). Cambridge, MA: MIT Press.

Wells, G. L. & Bradfield, A. L. (1999). Distortions in eyewitnesses’ recollection: Can the postidentification-feedback effect be moderated? Psychological Science, 10, 138–144.

Wells, G. L., Lindsay, R. C. L. & Ferguson, T. J. (1979). Accuracy, confidence, and juror perception in eyewitness identification. Journal of Applied Psychology, 64, 440–448.

Wells, G. L., Olson, E. A. & Charman, S. D. (2002). The confidence of eyewitnesses in their identifications from lineups. Current Directions in Psychological Science, 11, 151–154.

Received 29 November 2004, accepted 31 August 2005