Journal of Second Language Writing 12 (2003) 267–296
* Tel.: +1-617-492-8153/699-3429. E-mail address: [email protected] (J. Chandler).
1060-3743/$ – see front matter © 2003 Elsevier Inc. All rights reserved.
doi:10.1016/S1060-3743(03)00038-9
The efficacy of various kinds of error feedback for improvement in the accuracy and
fluency of L2 student writing
Jean Chandler*
New England Conservatory of Music and Simmons College, 15 Leonard Avenue,
Cambridge, MA 02139, USA
Abstract
This research uses experimental and control group data to show that students’ correction
of grammatical and lexical error between assignments reduces such error in subsequent
writing over one semester without reducing fluency or quality. A second study further
examines how error correction should be done. Should a teacher correct errors or mark
errors for student self-correction? If the latter, should the teacher indicate location or type
of error or both? Measures include change in the accuracy of both revisions and of
subsequent writing, change in fluency, change in holistic ratings, student attitudes toward
the four different kinds of teacher response, and time required by student and teacher for
each kind of response. Findings are that both direct correction and simple underlining of
errors are significantly superior to describing the type of error, even with underlining, for
reducing long-term error. Direct correction is best for producing accurate revisions, and
students prefer it because it is the fastest and easiest way for them as well as the fastest way
for teachers over several drafts. However, students feel that they learn more from self-
correction, and simple underlining of errors takes less teacher time on the first draft. Both
are viable methods depending on other goals.
© 2003 Elsevier Inc. All rights reserved.
Keywords: Accuracy; Fluency; Second language writing; Error correction; Teacher feedback;
Student preferences; Asian college students
In 1996 Truscott wrote a review article in Language Learning contending that
all forms of error correction of L2 student writing are not only ineffective but
potentially harmful and should be abandoned. This was followed by a rejoinder by
Shortreed, 1986) investigated the effects of different types of teacher feedback
on error in student writing. For example, Lalande’s (1982) experimental group of
U.S. students of German as a second language improved in grammatical accuracy
on subsequent writing after using an error code to rewrite, whereas the control
group, which received direct correction from the teacher, actually made more
errors on the essay at the end of the semester. However, the difference between the
groups’ improvement was not statistically significant. On the other hand, in
Frantzen’s 1995 study of U.S. college students of intermediate Spanish, both the
grammar-supplementation group receiving direct correction and the nongrammar
group whose errors were marked but not corrected improved in overall grammar
usage on the post essay. Neither group showed significant improvement in written
fluency over the semester, however. All four of Robb et al.’s (1986) treatment
groups of Japanese college students learning English improved in various
measures of accuracy after receiving different types of error feedback2 — direct
correction, notation of the type of error using a code, notation in the text of the
location of error, and marginal feedback about the number of errors in the line. All
of Robb et al.’s treatment groups improved in fluency and in syntactic complexity.
But neither the Lalande (1982) nor the Robb, Ross, and Shortreed (1986) study
had control groups which received no correction, and neither found statistically
significant differences between the various teacher response types. Lizotte (2001)
1 Nevertheless, referring to Fathman and Whalley's research, Truscott concluded (1996, p. 339),
"Nothing in this study suggests a positive answer [in favor of error correction]."
2 Truscott concluded from the fact that there were no statistical differences between the four
treatment groups in Robb et al.'s study (1986, p. 331) — "grammar correction's futility . . . showed"
— even though there were increases in three different measures of accuracy by all four groups.
reported gains with Hispanic bilingual and ESL students of a low–intermediate
English proficiency. After introducing students to errors using a code, Lizotte
indicated only the location of errors for student self-correction. His students
reduced errors in their writing significantly over one semester at the same time
that they made significant gains in fluency (numbers of words written in a
specified amount of time). Like Robb et al., Lizotte did not have a control group
since he could not justify, either to himself as the teacher or to his students,
providing no error feedback.
Only Ferris and Roberts (2001) and Lee (1997) had control groups that received
no error correction. Lee (1997) studied EFL college students in Hong Kong and
found that students were significantly more able to correct errors that were
underlined than errors that were either not marked or only indicated by a check in
the margin. Ferris and Roberts (2001) studied ESL students from a U.S. university
and found that two groups that received corrective feedback (either on type of
error or on location) significantly outperformed the control group (no feedback)
on the self-editing task, but there were no significant differences between the two
experimental groups. Neither of these studies measured the effect of these
treatments on the accuracy of student writing over time.
The one study that dealt with the effects of various kinds of teacher feedback on
accuracy of both revision and subsequent writing, Ferris et al. (2000), claimed that
direct correction of error by the teacher led to more correct revisions (88%) than
indirect feedback (77%). This study has not been published, but Ferris (2002,
p. 20) discussed the findings: "However, over the course of the semester, students
who received primarily indirect feedback reduced their error frequency ratios
substantially more than the students who received mostly direct feedback." This
2000 study was, however, descriptive rather than quasi-experimental.
One topic that is not controversial is L2 students’ views toward teacher
feedback on their written errors. Studies (Chenowith, Day, Chun, & Luppescu,
1991; Radecki & Swales, 1988; Rennie, 2000) have consistently reported that
student writers want such error feedback. According to Ferris and Roberts (2001),
the most popular type of feedback was underlining with description, followed by
direct correction, with underlining alone third.
Study one: Does error correction improve accuracy in student writing?
The first study presented here tries to fill a gap in the research by examining
three questions: (a) Do students who are required to correct the grammatical and
lexical errors marked by the teacher make fewer such errors in their writing later
in the semester? (b) Do students who do not correct these errors underlined by the
teacher make fewer errors on subsequent writing? and (c) Is there any significant
difference in the improvement in grammatical and lexical accuracy of the two
groups on their writing later in the semester?
Method
Subjects
The students were all music majors, first- or second-year students at an
American conservatory. To be placed in this course, they had either scored
between 540 and 575 on the Test of English as a Foreign Language (TOEFL) or
they had completed a year-long intermediate English as a Second Language
(ESL) course at the same institution the previous year with a grade of B− or
better, after scoring at least 500 on the TOEFL. One class (the control group)
consisted of 16 undergraduates from East Asia (Korea, Japan, China, and
Taiwan), and the other (the experimental group) contained 15 similar students.
Each class had only one male student. Although students were not randomly
assigned to the classes, there was no indication of systematic differences between
them, and both classes were taught by the same teacher-researcher.
Setting
These college ESL classes had a communicative orientation.3 The goal of the
course was to improve the ability of these high intermediate/advanced students to
read and write in English. In the ESL curricular sequence there was one additional
one-semester course between this one and the traditional first year composition
course required of all students at the conservatory. The classes met for 50 min
twice a week over 14 weeks. During these 24 h of class time, selections from
various autobiographical writings were read and discussed, both to practice
reading skills and to point out features of good writing. In addition, students
spent some class time reading reviews written by published writers and by other
students, sometimes watched videos of autobiographical stories, sometimes did
pre-writing activities, and occasionally discussed common errors in student
writing for 5–10 min. The goal of homework assignments was extensive practice
in reading and writing; students could choose an autobiography to read and review
and were assigned to write their own autobiography.
This genre was certainly not a new or particularly difficult one4 for these
students, who were highly literate in their own language and had previously read
biographies, autobiographies, and other narrative writing. At the same time, most
of these Asian students reported on a teacher-made questionnaire administered at
the beginning of the year that they had not had extensive practice in expressive
writing in high school in their first language (most said that their writing
experience was limited to ‘‘reports’’), and certainly not in English. The majority
3 The fact that the setting is a class with a communicative orientation may be important since Lucy
Fazio (2001) found no positive effect on accuracy of error correction on elementary school children's
journals and concluded that it was due to their classes' saturation with focus on form.
4 Studies with native English-speaking students (e.g., Craig, 1981; Quellmalz, Capell, & Chou,
1982; Stone, 1981) also found narrative writing to be relatively easier than other genres.
had had quite a bit of training in English grammar, however. Therefore, students
were given extensive practice in reading and writing in a genre and about content
they were familiar with in order to focus on improving both their reading and
writing fluency and the grammatical and lexical accuracy of their self-expression
in writing English.
Various invention strategies from a process approach to writing, such as free
writing or peer discussion in pairs before writing, were demonstrated in class.
Multiple drafts were assigned, but no peer reading was done on this autobio-
graphical writing out of respect for student privacy and preferences. (Peer
feedback was required later in the semester before revision of book reviews
students wrote on the autobiographies they read.)
The teacher gave both content and error feedback on the first draft of the
autobiographical assignments. The teacher always gave a brief positive end
comment on the content of the writing. Although the teacher also occasionally
made more specific marginal comments on the content, praising vivid images or
word choices or asking for more detail or clarification, most of the teacher
feedback was on errors in grammar and usage because the writing generally was
otherwise quite acceptable. (See sample of student writing in Fig. 1).
Although the teacher underlined 16 errors in these 102 words, she considered
the content of the writing to be good. Similarly, Kroll (1990) found no correlation
between rhetorical competency and syntactic accuracy in essays written by
advanced ESL students, either at home or in class.
Grades on the autobiography were given only on the final product at the end of
the semester, and they were based on both quantity and overall quality, including
correctness. No deduction was made for errors on intermediate drafts; on the
contrary, the teacher emphasized that the goal was to learn from errors.
Design and measures
The ESL writing classes in which this study was conducted provided an
appropriate setting to test the research questions because the first five written
homework assignments were identical: Students were simply told to write
approximately five typed, double-spaced pages about their own life. (Although
various aspects of autobiographical writing, e.g., describing events, people, and
places, were discussed in class during the semester, any or all of these could be
used in each assignment, and students could write about their lives in any order,
not just chronologically.) Thus, over the semester, each student’s goal was to write
about 25 pages of autobiographical writing in addition to a book review.
Both classes were taught by the same teacher-researcher in the same way and
both received error feedback. The only difference was that the experimental group
was required to revise each assignment, correcting all the errors underlined by the
teacher before submitting the next assignment, whereas the control group did all
the corrections of their underlined errors toward the end of the semester after the
first drafts of all five homework assignments had been written. The rationale for
this arrangement arose from the results of a previous questionnaire, where the
teacher had ascertained that the vast majority of students wanted the teacher to
mark every error. Since the students felt so strongly about this, the teacher could
only justify the treatment of the control group by offering them the same treatment
as the experimental group later in the semester after the first draft of the fifth
assignment was completed and the data collection for the study ended. Therefore,
the control group corrected their errors in several homework assignments later in
the semester after the data for this study had been collected (see Fig. 2). For both
groups, after students had tried to correct their errors based on the teacher’s
underlining of them, the teacher provided direct correction for any remaining
errors or ones that had been corrected incorrectly.
In both cases, the fifth chapter of the autobiography was written 10 weeks after
the first chapter, and the same teacher-researcher tried to underline every
grammatical and lexical error on all student texts. Fig. 2 shows the schedule
of data collection and feedback for the two groups.
The dependent measure in this first study was a calculation of error rate on the
first and fifth writing assignments. Although the assignments were all to write five
pages, they did not in fact yield texts of exactly the same length; therefore to
control for these small differences in text length, a measure of errors per 100
words was calculated (total number of errors/total number of words × 100).
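The length-normalized accuracy measure just described is simple enough to state as a short function; the figures below are hypothetical illustrations, not data from the study.

```python
def errors_per_100_words(num_errors: int, num_words: int) -> float:
    """Length-normalized accuracy measure: (total errors / total words) * 100."""
    if num_words <= 0:
        raise ValueError("text must contain at least one word")
    return num_errors / num_words * 100

# Hypothetical five-page draft: 98 underlined errors in 1,250 words.
rate = errors_per_100_words(98, 1250)  # 7.84 errors per 100 words
```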
One of the reasons Truscott gives for the putative harmful effects of error
correction is its negative effect on fluency. Wolfe-Quintero, Inagaki, and Kim
(1998) define fluency as "rapid production of language" (p. 117). For most
previous research studies, the measure of fluency used has been number of words
written. However, in this study, since length was stipulated in the assignment, a
different measure of fluency was used, i.e., the amount of time it took to write each
assignment.5 I investigated this question by asking each student to keep a record
of the total amount of time spent on writing each assignment. The time each
student reported spending on the first and on the fifth assignments was then
calculated per 100 words, and the change over the semester was used as an
additional outcome measure.

Fig. 1. Sample of student writing from the first assignment (errors preserved
verbatim; used with permission):

I was born in the end of culture revolution. Since my mother's family is ‘the blacks’ instead of
‘the reds’, one of my grandparents three children was to send somewhere afar, of course, they let
the daughter in order to keep the sons close. My mother was sent to Gui Zhou. Right before the
due date, she was to go to shanghai to give birth where she would actually know someone. The
night before she leaves, she washed her sheets, packup, up till the woe hour, and certainly
before we both could scream, I was born around 6 O'clock in the morning.

5 Chenowith and Hayes (2001) also used words written per minute as a measure of fluency.

Then the experimental group and the control group
were compared in terms of these changes over the semester in time spent writing
the same amount and kind of text.
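Because text length was fixed by the assignment, the fluency measure normalizes time by length rather than the reverse. A minimal sketch of that computation, using invented numbers for a single student:

```python
def minutes_per_100_words(minutes: float, words: int) -> float:
    """Fluency measure used here: self-reported writing time per 100 words."""
    return minutes / words * 100

def fluency_change(first, fifth):
    """Drop in minutes per 100 words from assignment 1 to assignment 5;
    a positive value means the same amount of text was written faster."""
    return minutes_per_100_words(*first) - minutes_per_100_words(*fifth)

# Hypothetical student: 450 self-reported minutes for a 1,200-word chapter 1,
# 260 minutes for a 1,250-word chapter 5.
gain = fluency_change((450, 1200), (260, 1250))  # 37.5 - 20.8 = 16.7
```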
Procedures: marking of errors
The categories of errors marked appear in Fig. 3. Fourteen of them are taken
from Azar’s Guide for Correcting Compositions (as cited in Brock & Walters,
1992, p. 123): singular–plural, word form, word choice, verb tense, add or omit a
word, word order, incomplete sentence, spelling, punctuation, capitalization,
article, meaning not clear, and run-on sentence. I added verb voice (active versus
passive) in addition to verb tense, word division in addition to spelling, and
sentence structure in addition to run-on sentences and fragments. I also added
categories of idiom, awkward (not grammatically incorrect but quite infelicitous
stylistically), subject–verb agreement, repetition or redundancy, pronoun, and
need for new paragraph in order to cover all the errors these students made even
though most of them were not frequent.
No argument is being made here that this error categorization system is better
or worse than other possible ones. It is more exhaustive than most; for example,
Ferris and Roberts (2001) used only five categories. The Asian students in the
present studies made frequent article errors so they were counted separately and
grouped (no matter whether they were errors of insertion, deletion, or wrong
article), whereas preposition errors were recorded as either insertion or deletion
errors or wrong word. Similarly, run-ons and fragments were recorded as separate
Class   Experimental                              Control
3       Hand in chapter 1                         Hand in chapter 1
4       Get errors underlined by teacher          Get errors underlined by teacher
5       Correct errors on chapter 1
6       Get direct correction by teacher
7       Hand in chapter 2                         Hand in chapter 2
8       Get errors underlined by teacher          Get errors underlined by teacher
9       Correct errors on chapter 2
10      Get direct correction by teacher
11      Hand in chapter 3                         Hand in chapter 3
12      Get errors underlined by teacher          Get errors underlined by teacher
13      Correct errors on chapter 3
14      Get direct correction by teacher
15      Hand in chapter 4                         Hand in chapter 4
16      Get errors underlined by teacher          Get errors underlined by teacher
17      Correct errors on chapter 4
18      Get direct correction by teacher
19      Hand in chapter 5, end data collection    Hand in chapter 5, end data collection
20      Get errors underlined by teacher          Get errors underlined by teacher; correct
                                                  errors on chapters 1 and 2
21      Correct errors on chapter 5               Get direct correction of chaps 1 & 2; correct
                                                  errors on chaps 3, 4 & 5
22      Get direct correction by teacher          Get direct correction of chaps 3, 4 & 5
23      Final draft of complete autobio           Final draft of complete autobio

Fig. 2. Data collection and error correction schedule.
error categories and not as punctuation and capitalization errors. No effort was
made to weight the different kinds of errors. What categorization system is used is
not as important for purposes of this study as using the same system for pre- and
post-measures.
Thus it was important for both studies to have the same teacher-researcher
marking all errors in the same way. Another rater who is a college ESL teacher
marked 10% of the papers in order to calculate interrater agreement. The
percentage agreement on what was an error was 76%. This was calculated by
dividing the number of errors marked by only one rater (and not both) by the total
Fig. 3. Examples of error types.
number of errors (an average of each rater’s count), as Polio (1997) did. See also
Roberts (1999) for a discussion of the difficulty of getting high levels of interrater
reliability on accuracy measures.
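As a sketch, the agreement figure described above can be computed from the two raters' sets of marked tokens. This is my reading of the averaging procedure attributed to Polio (1997) — agreement is 100 minus the share of marks made by only one rater, over the mean of the two raters' counts — and the marked positions below are invented:

```python
def percent_agreement(rater_a: set, rater_b: set) -> float:
    """Percentage agreement on error identification: marks made by only one
    rater (the symmetric difference) count as disagreements; the denominator
    is the average of the two raters' total error counts."""
    disagreements = len(rater_a ^ rater_b)
    average_total = (len(rater_a) + len(rater_b)) / 2
    return (1 - disagreements / average_total) * 100

# Hypothetical word positions marked as errors in the same essay.
a = {3, 7, 12, 15, 21, 28, 30, 41, 44, 52}
b = {3, 7, 12, 15, 21, 28, 33, 41, 44, 57}
agreement = percent_agreement(a, b)  # 60.0
```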
One rater considered some things errors that the other one did not. For example,
the other rater marked many more article omissions than I did, especially with the
names of musical instruments in phrases such as "play piano," "practice bassoon,"
and "teach flute." (I did not mark these as errors because of the results of a previous
study I had done on grammaticality judgments of article usage by native English-
speaking musicians [Chandler, 1994].) He also marked more omissions of commas
as errors. On the other hand, I marked an error every time there was not a new
paragraph for a new speaker in dialogue, and the other rater did not.
Thus, having the same teacher-researcher marking all errors makes compar-
isons possible between methods and between pre and post results that would not
be as easy with more than one marker, given the difficulty of attaining high
interrater reliability in marking so many kinds of errors on spontaneous writing
production. High interrater reliability on categorization of errors naturally tends
to be even more difficult to attain than on identification of errors (Polio, 1997). For
this reason, no attempt is being made in these studies to draw conclusions about
types of errors that were corrected or improved over time.
What is important for these studies is intrarater reliability rather than interrater
reliability, and the intrarater correlation of two markings of the same paper
separated by several years’ time was .92 for categorization as well as identifica-
tion of errors. A confirmation of this high intrarater reliability was done by an
independent rater on the teacher’s marking of essays by both the control and
experimental groups on the first and last assignments.
Results
Analysis of covariance was used to test for initial differences and differences in
outcome between the experimental and control groups. Tables 1 and 2 show the
results for accuracy, i.e., number of errors for each 100 words of text. There was
no significant difference between the experimental group and the control group on
the first assignment (t = 2.05, P = .175). The mean number of errors per 100
words for the control group was similar on the first and fifth assignments; there
was no significant difference between the control group’s error rates at the two
times (t = −0.90, P = .380).
The experimental group, on the other hand, went from an average of 7.8
grammatical and lexical errors per 100 words on the first assignment to 5.1 errors
on the fifth assignment, and this improvement in error rate between the two
assignments was statistically significant (t = 4.05, P = .001). A reduction of 2.7
errors per 100 words amounted to an average reduction of 34 errors on a five-page
paper from the first to the fifth assignment. Analysis of covariance (see Tables 1
and 2) also demonstrated a statistically significant (t = 3.04, P = .005) difference
in improvement in accuracy over the 10 weeks between the experimental group
(which corrected their errors between assignments) and the control group (which
did not). Nine of the 16 students in the control group actually had a higher error
rate on the fifth assignment than they did on the first, whereas only two of 15
students in the experimental group did. Moreover, in a striking contrast with the
control group, the experimental group showed much less variance between
students in their error rate by the end of the semester (see Table 1).
As an additional confirmation of these results, I regressed the change in the
number of errors from the first assignment to the last assignment for each group
(see Table 3). For the control group, neither factor explains much, but for the
experimental group, the number of errors on the first assignment and the intercept
are both significant. The negative coefficient on errors at time 1 suggests that the
improvement for the experimental teaching method falls as errors increase; e.g.,
students who have good skills to start with benefit more than ones who are less
proficient. However, the significant positive intercept suggests an improvement
for the experimental group regardless of the starting level.
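The per-group model (change in error rate regressed on chapter-1 errors, plus an intercept) can be fit with ordinary least squares. The sketch below uses invented per-student data whose signs mirror the experimental group's reported pattern (positive intercept, negative slope), coding change so that a positive value means fewer errors by chapter 5:

```python
from statistics import mean

def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Invented per-student data: errors per 100 words on chapter 1 (x) and the
# reduction in error rate by chapter 5 (y; positive = improvement).
errors1 = [4.0, 6.0, 8.0, 10.0, 12.0]
improvement = [5.0, 4.0, 3.0, 1.5, 0.5]
intercept, slope = simple_ols(errors1, improvement)  # 7.4, -0.575
```

The negative slope reproduces the qualitative finding that students starting with more errors improved less, while the positive intercept reflects improvement regardless of starting level.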
Table 1
Accuracy: means and standard deviations on errors per 100 words for two groups and two testing
times

                          Group                                     ANCOVA outcomes
                          Experimental (n = 15)  Control (n = 16)   F       P
Chapter 1
  Mean number of errors   7.8                    6.0                −1.37   .183
  Standard deviation      3.2                    4.1
Chapter 5
  Mean number of errors   5.1                    6.9                1.45    .164
  Standard deviation      1.8                    4.6
Change
  Mean number of errors   −2.7                   0.9                3.04    .005
Table 2
Accuracy: analysis of covariance for error rate per 100 words

Source of variation    d.f.   SS       MS      F       P
Model*                 2      177.74   88.87   10.27   .0005
Treatment group        1      56.50    56.50   6.53    .0163
Errors on chapter 1    1      78.23    78.23   9.04    .0055
Residual               28     242.40   8.66
Total                  30     420.14   14.00
* The model for this analysis hypothesizes that variation in change in error rate is caused by
the group the students are in (control or experimental) and their error rate on the first assignment
(Errors 1). The P values are statistically significant for Model, Group, and Errors 1, indicating that
this model has significant explanatory value. See text for interpretation.
Table 3
Regression of the change in the number of errors from chapters 1 to 5 for two groups

Control group                 d.f.   SS       MS
Model                         1      24.93    24.93
Residual                      14     202.04   14.43
Total                         15     226.97   15.13

Dependent variable = change   Coefficient   S.E.   t       P
Errors 1                      −0.31         0.24   −1.31   .21
Intercept                     2.77          1.72   1.61    .13

Experimental group            d.f.   SS      MS
Model                         1      65.32   65.32
Residual                      13     28.35   2.18
Total                         14     93.67   6.69

Dependent variable = change   Coefficient   S.E.   t       P
Errors 1                      −0.68         0.12   −5.47   .00
Intercept                     2.58          1.04   0.03    .03

A two-sample t test indicates that the coefficient for Errors 1 in the control group is significantly
different from the coefficient for Errors 1 in the experimental group:

          N    Mean    S.D.
x         16   −0.31   0.24
y         15   −0.68   0.12
Combined  31   −0.49   0.26

Satterthwaite's d.f. = 22.81. Ha: difference > 0, t = 5.36, P > t = 0.00.
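The between-group comparison of the two Errors 1 coefficients uses a two-sample t statistic with Satterthwaite's approximate degrees of freedom. Below is a generic Welch-style sketch of that test applied to two estimates with standard errors; the table does not give enough detail to reproduce the reported t = 5.36 and d.f. = 22.81 exactly, so treat this as the standard formula rather than the study's exact computation:

```python
import math

def welch_t(est1, se1, n1, est2, se2, n2):
    """Two-sample t for two independent estimates with standard errors,
    using Satterthwaite's degrees-of-freedom approximation."""
    v1, v2 = se1 ** 2, se2 ** 2
    t = (est1 - est2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Coefficients and standard errors as printed in Table 3
# (n = 16 control, n = 15 experimental).
t, df = welch_t(-0.31, 0.24, 16, -0.68, 0.12, 15)
```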
Table 4
Fluency: means and standard deviations on minutes per 100 words for two groups and two testing
times

                           Group                                     ANCOVA outcomes
                           Experimental (n = 14)  Control (n = 12)   F      P
Chapter 1
  Mean number of minutes   36.9                   37.4               0.00   .952
  Standard deviation       24.5                   18.7
Chapter 5
  Mean number of minutes   20.8                   20.3               0.01   .922
  Standard deviation       13.1                   8.9
Change
  Mean number of minutes   16.1                   17.1               0.01   .907
  Standard deviation       2.6                    3.9
Tables 4 and 5 show that both groups increased significantly in fluency over the
10 weeks and that there was no significant difference between the two groups in
this improvement. The control group wrote the fifth assignment 17 min per 100
words faster than the first assignment (t = 3.65, P = .004), and the experimental
group improved by 16 min for the same amount of text during the same time
period (t = 2.50, P = .027). The fact that the standard deviations were high
indicates either that students spent very different amounts of time on an assign-
ment or that they calculated the time differently (e.g., one student may have
included thinking time while another may have counted only drafting time) or
both. This fact, however, does not invalidate the finding since the same students’
self-reports on the first and last assignments are being compared.
Summary and discussion
The results of this study demonstrate that the accuracy (correctness of English)
of student writing over 10 weeks improved significantly more if these high
intermediate East Asian college students were required to correct their errors than
if they were not. The fact that the control group, which did no error correction
between assignments, did not increase in accuracy while the experimental group
showed a significant increase would seem to refute the assertion that having
students correct errors is ineffective. Moreover, this increase in accuracy by the
experimental group was not accompanied by a decline in fluency over the
semester, as measured by self-reports of time students spent writing the same
amount and kind of text. On the contrary, both the experimental and the control
groups in the present study showed a significant increase in fluency over the
semester, a finding which corresponds to those reported in Robb et al.’s (1986)
research on Japanese EFL students and Lizotte’s (2001) study of Hispanic
bilingual and ESL students in a U.S. community college.
Although conventional wisdom in the field advocates that teachers respond to
content first and to form only in a later draft (Sommers, 1982; Zamel, 1985), there
Table 5
Fluency: analysis of variance for time to write: minutes per 100 words

Source of variation     d.f.   SS     MS     F
Between subjects
  Group                 1      0.00   0.00   0.00
  Errors on chapter 1   24     0.95   0.04
Within subjects
  Time                  1      0.36   0.36   16.10**
  Group × Time          1      0.00   0.00   0.01
  Errors on chapter 5   24     0.52   0.02

** P = .001.
is no indication that this is necessary, at least when students are writing in a genre
which is relatively easy for them, such as the autobiographical writing in this study
or the journal writing in Fathman and Whalley's (1990) research. In both of these
studies, comments on content and accuracy were given simultaneously, and when
that was done in Fathman and Whalley’s study, student rewrites improved in both
content and accuracy. In a recent study, Ashwell (2000) found no significant
difference in student gains in accuracy or content scores on a third draft following
three different patterns of teacher feedback on the first two drafts: (a) the con-
ventional response (giving feedback on content first and feedback on form in a later
draft), (b) the reverse pattern, or (c) one in which form and content feedback were
mixed. All of these patterns, however, were superior to giving no feedback.
What the findings of the present study suggest is that if students did not revise
their writing based on feedback about errors, having teachers mark errors was
equivalent to giving no error feedback since the students’ new writing did not
increase in correctness over one semester. (This probably explains Kepner's (1991)
findings since she gave rule reminders as error correction feedback but did not
require revisions.) If students did make error corrections, their subsequent new
writing was more accurate without a reduction in fluency.
In summary, mere practice resulted in a significant increase in fluency for both
groups; that is, at the end of the semester they were able to write the same amount
and kind of text (in the same context of a homework assignment) in much less
time, according to self-reports. However, mere practice without error correction
did not produce more correct subsequent writing, whereas when students cor-
rected their errors before writing the next assignment, their first drafts became
more accurate over the semester.
Study two: The effects of various kinds of error correction
Having answered, in the affirmative, the question of whether to have students
correct their errors, the research turned to the question of how the teacher should
give error feedback. Should teachers simply correct the errors or should they mark
the errors for student self-correction? If the latter, is it more effective for a teacher
to indicate the location or the type of error or both?
Method
Design and subjects
In this second study, these questions are approached by looking at change in the
accuracy of both revisions and subsequent new writing over the semester, change
in the fluency of student writing as measured by self-reports of the time it took to
write the same amount and kind of text, change in the quality of student writing as
measured by holistic ratings, student attitudes toward four different kinds of
teacher response, the time it took students to make the corrections after each kind
of response, and the time required of the teacher to make the various kinds of
responses.
This second study was conducted in the same ESL writing course as the one
reported above but in a different year with different students. Part of one class
session was spent explaining the different kinds of errors listed in Fig. 3. Each
student received a sheet listing the abbreviations and an example of each error.
Besides the teacher feedback described below, the other major difference between
the two studies was that in the second study students were asked to write 40 pages,
instead of 25, of autobiographical text over the semester. In response to five
identical homework assignments, they wrote about eight pages of text each time6
and revised them after receiving feedback from the teacher and before writing the
next assignment, as the experimental class had done in the first study (after the
teacher underlined all their errors). Requiring rewriting, no matter what kind of
teacher feedback, including direct correction, ensured that the student read the
teacher’s response carefully, though there was no penalty for errors on the first
draft. As described above, the grade was based on the quality (including
correctness) and quantity of the final draft after students had made as many
revisions as they chose.
The second study was done with a total of 36 students in two sections of the
same course taught in the same way by the same teacher. The first class contained
1 Hispanic and 20 Asian undergraduate students, 18 females and 3 males, and the
second class had 15 East Asian students, 13 females and 2 males.
In this partially balanced incomplete block design, each student received four
different kinds of teacher feedback, in different orders, in response to the first four
chapters of his or her autobiographical writing. Having each student receive each
kind of feedback ensures that the treatment groups are identical and yields more
students per treatment than if the 36 students had been divided into four separate
treatment groups. It is important to give the treatments
in different orders so as not to confound order and type of treatment. The
four treatments used were (a) Correction (see e.g., Fig. 4), (b) Underlining
with Description (see Fig. 5), (c) Description of type only (see Fig. 6), and
(d) Underlining (see Fig. 7).
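The counterbalancing logic described above can be sketched as a Latin-square rotation. This is a hypothetical illustration; the paper does not report the exact rotation scheme used, only that orders were varied so that order and type of treatment were not confounded.

```python
# Hypothetical sketch of the counterbalancing idea (the study does not specify
# its exact rotation scheme). Cyclically rotating the treatment list yields a
# Latin square: every feedback type appears once in every serial position.
TREATMENTS = ["Correction", "Underlining with Description",
              "Description only", "Underlining"]

def latin_square_orders(treatments):
    """One order per cyclic rotation of the treatment list (an n x n Latin square)."""
    n = len(treatments)
    return [[treatments[(start + i) % n] for i in range(n)]
            for start in range(n)]

def assign_orders(num_students, treatments):
    """Cycle students through the rotated orders so positions stay balanced."""
    orders = latin_square_orders(treatments)
    return [orders[s % len(orders)] for s in range(num_students)]
```

With 36 students and four rotations, nine students would receive each order, so each treatment is seen by every student and appears equally often in every position.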
The outcome measures were: (a) number of errors per 100 words on both the
revision and on the subsequent chapter before revision (accuracy), (b) holistic
ratings of overall writing quality of the first draft of both the first and last chapters
of each student’s autobiography, (c) time students reported spending writing each
chapter (fluency), (d) immediate student responses to each feedback type,
including the time it took to make corrections, and to a questionnaire comparing
the four types at the end of the semester, and (e) a rough comparison of time spent
by the teacher in giving each method of feedback, both initially and over two
drafts.
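Both rate measures above (errors per 100 words for accuracy, minutes per 100 words for fluency) are the same simple normalization. A minimal sketch, using illustrative numbers rather than data from the study:

```python
def per_100_words(count, word_count):
    """Normalize a raw count (errors made, or minutes spent) per 100 words of text."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 100.0 * count / word_count

# Illustrative figures only (not data from the study):
error_rate = per_100_words(18, 750)   # 18 errors in a 750-word chapter -> 2.4 per 100 words
fluency = per_100_words(90, 750)      # 90 minutes for 750 words -> 12.0 minutes per 100 words
```

Normalizing by length lets chapters of different lengths, and students who write different amounts, be compared on a common scale.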
6 Lengthening the assignment from five to eight pages should increase the effect of each treatment.
Fig. 4. Correction.
Fig. 5. Underline and Describe.
Fig. 6. Describe.