Lexical Diversity, Sophistication, and Size in Academic Writing
Melanie González, Ph.D., Salem State University
November 15, 2014
Symposium on Second Language Writing, Tempe, Arizona, USA
Jul 11, 2015
Research Problem
Vocabulary is an important aspect of second language (L2) academic writing proficiency
L2 writers in the college composition classroom face substantive writing assignments
Expected to meet the same standards as their monolingual English-speaking (L1) peers, who have the advantage of their first language lexicon
Rubric Lexical Criteria
ESL Composition Profile (Jacobs et al., 1981)
Sophisticated range, effective word choice, word form mastery, appropriate register
TOEFL Independent Writing (ETS, 2005)
Variety and range of vocabulary, occasional noticeable minor errors in word form and use of idiomatic language; Appropriate word choice and idiomaticity, minor lexical errors
IELTS Tasks 1 and 2
Uses a wide range of vocabulary with very natural and sophisticated control of lexical features; rare minor errors occur only as ‘slips’; use of uncommon lexical items
The Present Investigation
What does proficient, productive word use look like?
What makes academic writing “sophisticated”?
Is it just a problem solved by teaching and learning more words?
Previous Findings
Essays with larger vocabulary sizes use fewer high-frequency words and more uncommon words (Crossley & McNamara, 2009; Crossley et al., 2010; Laufer & Nation, 1995; Tidball & Treffers-Daller, 2008)
Lexical diversity tends to be a strong predictor of writing quality (de Haan & van Esch, 2005; Linnarud, 1986; Crossley & McCarthy, 2009; Crossley et al., 2010)
Intuition suggests that one would need a large, sophisticated vocabulary to diversify lexis; however, studies have indicated that proficient texts diversify largely through more frequent terms (Horst & Collins, 2006; Laufer, 1994)
Definition of Terms
Lexical diversity: varied use of different words in writing (Laufer & Nation, 1995)
Lexical sophistication: “advanced”, content-bearing words that do not occur frequently (Tidball & Treffers-Daller, 2008)
Vocabulary size: frequency-based number of all words in an essay’s lexicon (Laufer & Nation, 1995)
Research Questions
Are there significant differences in the lexical diversity, sophistication, and size between L1 and advanced L2 writers’ academic texts?
Is there a relationship between these three measures of productive vocabulary?
Is lexical diversity, sophistication, or size a greater predictor of writing score?
Methods
Description of the Sample:
• 104 advanced L2 academic essays
• 68 L1 academic essays (N = 172)
• Spanned 14 different L1s and 7 writing genres
• 3 raters
Instruments:
• MTLD: typical score range between 70 and 120
• Two CELEX measures (all words and content words): score range 0 to 6; 0 = rarest words, 6 = common words
• TOEFL iBT Writing Rubric: score range 0 to 5; 0 = lowest score, 5 = highest proficiency
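The slide names the MTLD (measure of textual lexical diversity) without describing how it is computed. The sketch below shows the usual procedure under stated assumptions: a type-token ratio (TTR) threshold of 0.72, a factor counted each time the running TTR falls below that threshold, and the forward and backward passes averaged. The function names and the pre-tokenized input are illustrative and are not taken from the study.

```python
def _mtld_pass(tokens, threshold=0.72):
    """One directional pass: count 'factors', i.e. stretches of text whose
    running type-token ratio (TTR) stays at or above the threshold."""
    factors = 0.0
    seen = set()   # distinct word types in the current stretch
    length = 0     # number of tokens in the current stretch
    for token in tokens:
        seen.add(token)
        length += 1
        if len(seen) / length < threshold:
            # TTR dropped below the threshold: close this factor and restart.
            factors += 1.0
            seen.clear()
            length = 0
    if length > 0:
        # Credit the unfinished stretch as a partial factor.
        ttr = len(seen) / length
        factors += (1.0 - ttr) / (1.0 - threshold)
    # Assumption: a text that never loses any diversity gets an infinite score.
    return len(tokens) / factors if factors > 0 else float("inf")


def mtld(tokens, threshold=0.72):
    """Average of the forward and backward passes over the token list."""
    forward = _mtld_pass(tokens, threshold)
    backward = _mtld_pass(tokens[::-1], threshold)
    return (forward + backward) / 2.0


# Hypothetical usage with a lowercased, whitespace-split sentence.
sample = "the cat saw the dog and the dog saw the cat".split()
sample_score = mtld(sample)
```

Because the score is the mean number of tokens needed before the TTR drops to the threshold, it is less sensitive to text length than a raw TTR, which matters for a corpus whose essays vary widely in length.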
Descriptive Results
Group             N     M
Total corpus      172   4.04
L2 writer texts   104   3.42
L1 writer texts   68    4.99

Score   Frequency   % of Corpus
2       15          8.7
3       38          22.1
4       44          25.6
5       75          43.6

Eight L2 texts scored a 5; one L1 text scored a 4.

Index            M (all texts)   L2      L1
Diversity        73.01           69.12   78.96
Sophistication   2.39            2.47    2.29
Size             3.07            3.09    3.04
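The descriptive figures can be cross-checked against each other. The short sketch below recomputes the corpus mean score from the score frequencies and from the two group means, using only numbers reported on this slide; the variable names are illustrative.

```python
# Score frequencies reported on the slide (score -> number of essays).
score_counts = {2: 15, 3: 38, 4: 44, 5: 75}

n_total = sum(score_counts.values())
mean_from_counts = sum(score * n for score, n in score_counts.items()) / n_total
percent_of_corpus = {
    score: round(100 * n / n_total, 1) for score, n in score_counts.items()
}

# Group sizes and mean writing scores reported on the slide.
groups = {"L2": (104, 3.42), "L1": (68, 4.99)}
mean_from_groups = (
    sum(n * m for n, m in groups.values()) / sum(n for n, _ in groups.values())
)

# Same check for the MTLD (diversity) means of the two groups.
mtld_means = {"L2": (104, 69.12), "L1": (68, 78.96)}
pooled_mtld = sum(n * m for n, m in mtld_means.values()) / n_total

print(n_total, round(mean_from_counts, 2), round(mean_from_groups, 2))
print(percent_of_corpus, round(pooled_mtld, 2))
```

Both routes give a corpus mean of 4.04, and the weighted MTLD mean matches the reported 73.01.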
Research Question 1
Are there significant differences in the lexical diversity, sophistication, and size between L1 and advanced L2 writers’ academic texts?
• L1 texts exhibited significantly higher levels of lexical diversity (F(3, 168) = 20.30*, η² = .27), used more sophisticated words (F(3, 168) = 56.73*, η² = .25), and exhibited larger vocabulary sizes than L2 essays
• Lexical sophistication showed the greatest differences (F(3, 168) = 56.73*, η² = .25)
• For L1 texts, only the MTLD was able to detect significant differences (F(1, 66) = 4.17*, η² = .06)
*p < .05
L1 writers tend to vary their words more, exhibit more sophisticated vocabulary, and use a greater range of terms than L2 writers
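For readers who want to relate the F statistics to the effect sizes, the sketch below recovers η² from F and its degrees of freedom. It assumes a one-way design in which η² is the between-groups share of total variance; the slides do not state how the effect sizes were computed, so this is an illustration and not the author's procedure.

```python
def eta_squared_from_f(f_value, df_effect, df_error):
    """Eta squared implied by an F statistic in a one-way ANOVA.

    Assumes eta^2 = SS_effect / (SS_effect + SS_error), which gives
    eta^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values reported on the slide.
diversity_all = eta_squared_from_f(20.30, 3, 168)  # lexical diversity, full corpus
diversity_l1 = eta_squared_from_f(4.17, 1, 66)     # MTLD, L1 texts only

print(round(diversity_all, 2), round(diversity_l1, 2))
```

Under this assumption the diversity values reproduce the reported .27 and .06. The reported sophistication pair (F = 56.73, η² = .25) does not reproduce this way, so it may rest on a different effect-size definition or different degrees of freedom.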
Research Question 2
Is there a relationship between these three measures of productive vocabulary?
• Moderate correlation between vocabulary size and lexical diversity (r = −.44**)
• For L1 texts, the correlation is slightly weaker (r = −.36*)
• Sophistication and diversity were also moderately correlated (r = −.42**)
• In L1 texts, lexical diversity shared no significant relationship with sophistication
• Lexical sophistication and size were highly correlated (r = .82**)
*p < .05; **p < .001
Essays with greater lexical diversity utilized lower-frequency and sophisticated words, but only to a moderate degree.
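One way to read "moderate" versus "high" here is through shared variance, the square of each correlation. The sketch below applies that standard conversion to the coefficients reported on this slide; the dictionary keys are illustrative labels.

```python
# Pearson correlations reported on the slide.
correlations = {
    "size vs. diversity (all texts)": -0.44,
    "size vs. diversity (L1 texts)": -0.36,
    "sophistication vs. diversity (all texts)": -0.42,
    "sophistication vs. size (all texts)": 0.82,
}

# r squared is the proportion of variance the two measures share.
shared_variance = {name: round(r * r, 2) for name, r in correlations.items()}

for name, r2 in shared_variance.items():
    print(f"{name}: r^2 = {r2:.2f}")
```

On this reading, diversity shares under a fifth of its variance with either CELEX measure, while the two CELEX measures share about two thirds with each other. The negative signs follow from the CELEX scale, where lower values mean rarer words.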
Research Question 3
Is lexical diversity, sophistication, or size a greater predictor of writing score?
• Lexical diversity was the only significant contributor to the model for both L2 and L1 essays (Exp[B] = 1.07**).
• Although all indexes differed significantly across score levels (F(6, 336) = 10.61**, η² = .16), the MTLD accounted for the most variation in ratings (F(3, 168) = 21.66**, η² = .28).
**p < .001
As lexical diversity within an essay increased, so did its likelihood of earning a score of 5.
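To make the Exp[B] value concrete, the sketch below treats 1.07 as an odds ratio per one-point increase in MTLD from a logistic model of earning a score of 5. That reading is an assumption based on the slide's wording; the model specification is not given in the slides, and the function name is illustrative.

```python
import math

ODDS_RATIO_PER_MTLD_POINT = 1.07  # Exp[B] reported on the slide


def odds_multiplier(mtld_increase):
    """Factor by which the odds of a score of 5 change for a given MTLD increase,
    assuming the odds ratio applies multiplicatively per point."""
    return ODDS_RATIO_PER_MTLD_POINT ** mtld_increase


# The underlying logistic coefficient B is the natural log of Exp[B].
coefficient_b = math.log(ODDS_RATIO_PER_MTLD_POINT)

one_point = odds_multiplier(1)
ten_points = odds_multiplier(10)

print(round(coefficient_b, 4), round(one_point, 2), round(ten_points, 2))
```

Under this assumption, a 10-point rise in MTLD roughly doubles the odds of a score of 5. The multiplier applies to odds, not to the probability itself.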
Figure 1. Lexical Diversity
Figure 2. Vocabulary Size
Figure 3. Lexical Sophistication
Discussion
Findings suggest that vocabulary size and sophistication initially help a text advance from a level 2 to 3, but it is lexical diversity that helps to push a text into the 4 to 5 score range
Hint that mid-range vocabulary words could account for some of the score differences between L2 and L1 texts
Implications
Results offer further explanation of vocabulary criteria for assessment rubrics
Indicate that vocabulary instruction needs to go beyond growing learner lexicons and teach advanced L2 writers how to diversify words in composition
Provide some validation of the MTLD; it performed well despite large variation in text length
Limitations
Text length, task topic, and writing genre present challenges to any study of lexical richness
CELEX frequency bands were created in 1995; it is possible that word frequencies have changed
Generalizability is limited by corpus characteristics
Further Research
Study assignments from first-year writing courses that contain both L2 and L1 writers
Include lexical error as a variable
Add qualitative component to raters’ scores; focus only on lexis
Include an independent measure of productive vocabulary size; use BNC/COCA
Perform frequency analysis on writings to validate mid-frequency findings
References
Crossley, S.A. & McNamara, D.S. (2009). Computational assessment of lexical differences in L1 and L2 writing. Journal of Second Language Writing, 18(2), 119-135.
Crossley, S.A., Salsbury, T., McNamara, D.S., & Jarvis, S. (2010). Predicting lexical proficiency in language learner texts using computational indices. Language Testing, 28(4), 561-580.
De Haan, P., & van Esch, K. (2005). The development of writing in English and Spanish as foreign languages. Assessing Writing, 10(2), 100-116.
Educational Testing Service. (2005). Helping your students communicate with confidence. Princeton, NJ: Educational Testing Service.
Jacobs, H., Zinkgraf, S., Wormuth, D., Hartfiel, V., & Hughey, J. (1981). Testing ESL composition: A practical approach. Boston: Newbury House. Retrieved from http://seltmedia.heinle.com/resource_uploads/downloads/1424051010_35982.pdf
Laufer, B. (1994). The lexical profile of second language writing: Does it change over time? RELC Journal, 25(2), 21-33.
Laufer, B. & Nation, I.S.P. (1995). Vocabulary size and use: Lexical richness in L2 written production. Applied Linguistics, 16(3), 307-322.
Linnarud, M. (1986). Lexis in composition: A performance analysis of Swedish learners’ written English. Malmö, Sweden: Liber Förlag Malmö.
Tidball, F. & Treffers-Daller, J. (2008). Analyzing lexical richness in French learner language: What frequency lists and teacher judgments can tell us about basic and advanced words. Journal of French Language Studies, 18(3), 299-313. doi: 10.1017/S0959269508003463