Production of comparative language tests: The European Survey on Language Competences
Neil Jones, Cambridge ESOL
SQA, 28 February 2013
Cambridge ESOL: project management, English language tests, language test coordination
Centre international d’études pédagogiques (CIEP): French language tests
Goethe-Institut: German language tests
Università per Stranieri di Perugia: Italian language tests
Universidad de Salamanca / Instituto Cervantes: Spanish language tests
Gallup Europe: sampling, testing tool, translation
National Institute for Educational Measurement (Cito): questionnaire design, analysis
Key aims: The ESLC set out to:
• provide information on the general level of foreign language knowledge of pupils
• provide strategic information to policy makers, teachers and learners
Using the following instruments:
Language tests:
• English, French, German, Italian, Spanish
• 3 skills (Reading, Listening, Writing)
• A1 to B2 levels of the CEFR
Contextual questionnaires:
• addressing 13 language policy issues
• for students, teachers, principals and countries
Validity as inference to some “real world”
[Diagram: the world of the test (test construction from test/task features, learner features, and processes and knowledge; test performance; test score; measure) is linked by inference, in steps 1 to 4, to the “real world” of language use (target situation of use).]
Inference to some “real world”: a sequence of steps
1 Test construction (context validity, theory-based validity): what to observe? How? Draws on test/task features, learner features, and processes and knowledge; yields test performance.
2 Evaluation (scoring validity): from test performance to test score. How can we score what we observe?
3 Generalization (measurement validity): from test score to measure. Are scores consistent and interpretable?
4 Extrapolation: from measure to the “real world” (target situation of use). Does the test score reflect the candidate’s actual ability?
5 Alignment: to framework levels. How does the specific learning/testing context relate to a more general proficiency framework?
Diagram labels: context-specific to context-neutral; scale construction, measurement; standard setting, interpretation.
Approach to developing the language testing framework
• Identify the language testing objectives of the ESLC.
• For each skill, identify test content and testable subskills derived from a socio-cognitive model of language proficiency and from the language functions or competences salient at levels A1 to B2 in the CEFR.
• Identify appropriate task types to test these subskills.
• Develop specifications, item writer guidelines and a collaborative test development process that are shared across languages in order to produce comparable language tests.
Common European Framework model of language use/learning
“…the actions performed by persons who as individuals and as social agents develop a range of competences, both general and in particular communicative language competences. They draw on the competences at their disposal in various contexts under various conditions and constraints to engage in language activities involving language processes to produce and/or receive texts in relation to themes in specific domains, activating those strategies which seem most appropriate for carrying out the tasks to be accomplished. The monitoring of these actions by the participants leads to the reinforcement or modification of their competences.” (Council of Europe 2001:9, emphasis in original).
CEFR’s model of language use and learning
[Diagram: within a domain of use, the language learner/user (knowledge, processes, strategies; monitoring, assessment) engages in a language activity on a topic (situation, theme…) through a task. A test is shown as a set of such tasks.]
An interactional view
[Diagram: within the target language use (TLU) domain, the language learner/user (knowledge, processes, strategies) engages in a language activity on a topic (situation, theme…) through a task.]
Learner’s engagement with tasks has interactional authenticity.
Test tasks reflect TLU tasks.
Test performance enables inference to performance in TLU.
A model for reading (after Weir 2005)
The language learner/user: knowledge, processes, strategies.
Metacognitive mechanisms / strategies:
• Goal setter: selecting the appropriate type of reading
• Careful reading. Local: understand sentence. Global: comprehend main idea(s), comprehend overall text, comprehend overall texts
• Expeditious reading. Local: scan for specifics. Global: skim for gist, search for main ideas and important detail
• Monitor: goal checking
• Remediation where necessary
Central processing core (listed from visual input upwards):
• Visual input
• Word recognition
• Lexical access
• Parsing
• Establishing propositional meaning at clause and sentence levels
• Inferencing
• Building a mental model: integrating new information, enriching the proposition
• Creating a text-level structure: construct an organised representation of the text [or texts]
Knowledge:
• Lexicon. Form: orthography, phonology, morphology. Lemma: meaning, word class
• Syntactic knowledge
• General knowledge of the world
• Topic knowledge
• Meaning representation of text(s) so far
• Text structure knowledge: genre, rhetorical tasks
Domains of language use
A1 A2 B1 B2
personal 60% 50% 40% 25%
public 30% 40% 40% 50%
educational 10% 10% 20% 20%
professional 0% 0% 0% 5%
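The domain weights above can be turned into task counts when a test is assembled. The following is a minimal sketch of one way to do that, not the ESLC's documented procedure: the function name `allocate_tasks` and the largest-remainder rounding rule are assumptions made for illustration; only the percentages come from the table.

```python
# Domain weights per CEFR level, copied from the table above (percent).
DOMAIN_WEIGHTS = {
    "A1": {"personal": 60, "public": 30, "educational": 10, "professional": 0},
    "A2": {"personal": 50, "public": 40, "educational": 10, "professional": 0},
    "B1": {"personal": 40, "public": 40, "educational": 20, "professional": 0},
    "B2": {"personal": 25, "public": 50, "educational": 20, "professional": 5},
}


def allocate_tasks(level, n_tasks):
    """Split n_tasks across domains in proportion to the level's weights.

    Hypothetical helper: uses largest-remainder rounding so that the
    counts always add up to n_tasks.
    """
    weights = DOMAIN_WEIGHTS[level]
    exact = {d: n_tasks * w / 100 for d, w in weights.items()}
    counts = {d: int(x) for d, x in exact.items()}  # round down first
    leftover = n_tasks - sum(counts.values())
    # Give the remaining tasks to the domains with the largest fractions.
    by_remainder = sorted(exact, key=lambda d: exact[d] - counts[d], reverse=True)
    for d in by_remainder[:leftover]:
        counts[d] += 1
    return counts


print(allocate_tasks("A1", 10))
print(allocate_tasks("B2", 20))
```

With 10 tasks at A1 the split follows the percentages directly (6 personal, 3 public, 1 educational); with task counts that do not divide evenly, the rounding rule decides where the extra tasks go.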
Features of approach
• Implementation of construct: subskills mapped to specific task types
• Reading and Listening: objectively marked; Writing: subjectively marked
• Four task development stages: Pilot (2008), Pretesting (2009), Field Trial (2010), Main Study (2011)
• Task adaptation across languages
• Cross-language vetting
Reading – an A1 task
You will read a notice about a cat. For the next 4 questions, answer A, B or C.
Leo is lost. He’s my little cat. He’s white with black paws. He’s small and very sweet. He has brown eyes. He wears a grey collar. He didn’t come home on Monday and it’s Thursday today. That’s a long time for a little cat!
Leo often sits on top of the houses near here between Smith’s baker’s shop and King Street. If you find him in your garden or under your car, please telephone me immediately. Please note – Leo doesn’t like it when people pick him up, and he doesn’t like milk.
Thank you for your help!
Sophie Martin, tel: 798286
Busco a mi gato Leo. Ha desaparecido. Es blanco con las patas negras. Es pequeño, tiene 7 meses y
es muy bonito. Tiene los ojos marrones. Lleva un collar gris. Le gusta sentarse en los tejados de
las casas que están entre la panadería García y la calle de la Victoria. No veo a Leo desde el
lunes y hoy es jueves. Es mucho tiempo para un gato tan pequeño. Leo no bebe leche y no come
pan.
Si lo ves cerca de tu casa o debajo de un coche, llámame.
Gracias por tu ayuda.
Sofía Alonso 626 537 548
Reading – an A1 task
1 What colour is Leo?
A white and grey
B brown and grey
C black and white
2 Sophie saw Leo
A yesterday.
B a few days ago.
C a week ago.
3 Where does Leo like to go?
A in gardens
B under cars
C on houses
4 If you find Leo
A phone Sophie.
B give him some milk.
C tell the baker.
1 ¿De qué color es Leo?
A Blanco y gris
B Marrón y gris
C Blanco y negro
2 A Leo le gusta sentarse
A en los jardines.
B debajo de los coches.
C en los tejados.
3 Leo lleva fuera de casa
A un día.
B varios días.
C una semana.
4 Si ves a Leo debes
A ir a la panadería.
B darle leche.
C llamar a Sofía.
EN - Holiday photo
You are on holiday. Send an email to an English
friend with this photo of your holiday.
Tell your friend about:
• the hotel
• the weather
• what the people are doing
Write 20–30 words.
FR - Photo de vacances
Tu es en vacances. Tu envoies un email à un ami
avec cette photo de tes vacances.
Tu utilises la photo pour parler de :
• l’hôtel
• le temps
• les activités
Tu écris 20–30 mots.
DE - Urlaubsfoto
Du hast Ferien. Schreib deiner deutschen
Freundin eine E-Mail mit diesem Urlaubsfoto.
Schreib deiner Freundin über:
• das Hotel
• das Wetter
• was die Leute machen
Schreib 20–30 Wörter.
ES - Foto de vacaciones
Estás de vacaciones. Envía un e-mail a un amigo español con esta foto de tus vacaciones.
Escribe sobre:
• el hotel
• el tiempo
• qué hace la gente
Escribe 20–30 palabras.
IT - A1 level not tested
Marking of Writing
• Responsibility of countries
• Central trickle-down training sessions held for national coordinators
• A proportion of multiple marking in each country: check on in-country rater agreement
• But (all) multiple-marked scripts also centrally marked: additional check on leniency/severity
[Diagram: in Country A, Country B and Country C, scripts receive single marking or multiple marking; multiple-marked scripts also go to central marking by central markers.]
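The two checks described for Writing, in-country agreement and leniency/severity against central marks, can be illustrated with very simple statistics. The sketch below is an assumption-laden illustration rather than the ESLC analysis: the function name, the choice of exact agreement and mean difference as indicators, and the marks themselves are all hypothetical.

```python
from statistics import mean


def agreement_and_severity(national_marks, central_marks):
    """Compare in-country marks with central marks for the same scripts.

    Returns (exact agreement rate, mean of national minus central marks).
    A positive mean difference suggests leniency in the country's marking,
    a negative one suggests severity.
    """
    if len(national_marks) != len(central_marks) or not national_marks:
        raise ValueError("need two non-empty mark lists of equal length")
    pairs = list(zip(national_marks, central_marks))
    exact = sum(1 for n, c in pairs if n == c) / len(pairs)
    bias = mean(n - c for n, c in pairs)
    return exact, bias


# Hypothetical marks for five multiple-marked scripts (illustrative only).
national = [3, 2, 4, 5, 1]
central = [3, 2, 3, 4, 1]
exact, bias = agreement_and_severity(national, central)
print(f"exact agreement: {exact:.2f}, mean difference: {bias:+.2f}")
```

An operational study would more likely model rater effects statistically; the point here is only that the centrally marked scripts give a common reference against which each country's marking can be compared.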
A. Communication
• how many of the content points are dealt with (clearly)
• how well the points are expanded
• style, register
B. Language
• coherence
• vocabulary
• cohesion
• accuracy
[Diagram: marking against exemplar scripts. With one exemplar, a script is scored 1 (lower), 2 (at the exemplar) or 3 (higher). With a lower and a higher exemplar the scale extends to 5, with 2 at the lower exemplar and 4 at the higher exemplar.]
Item response theory and item-banking
[Diagram: a measurement scale (ticks from 30 to 90) with levels A1, A2 and B1. The item bank links all levels; tests (Test 1, Test 2, Test 3) are set at the appropriate level; learners are located on the scale; standards are consistently applied.]
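The idea that items and learners sit on one scale can be made concrete with a small example. The slides do not say which IRT model the ESLC used, so the sketch below assumes the simplest one, the one-parameter (Rasch) model; the function names and the ability and difficulty values are illustrative assumptions.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct response under the Rasch (1PL) model.

    theta: learner ability, b: item difficulty, both in logits on the
    same scale. Assumed model, chosen here only for illustration.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def expected_score(theta, difficulties):
    """Expected number of correct answers on a test made of these items."""
    return sum(rasch_probability(theta, b) for b in difficulties)


# Hypothetical item difficulties for an easier and a harder test drawn
# from the same bank; because both are on one scale, scores on either
# test can be related to the same ability value.
easy_test = [-2.0, -1.5, -1.0, -0.5]
hard_test = [0.5, 1.0, 1.5, 2.0]
for theta in (-1.0, 0.0, 1.0):
    print(theta, round(expected_score(theta, easy_test), 2),
          round(expected_score(theta, hard_test), 2))
```

When ability equals difficulty the probability of success is 0.5, which is why a test is most informative for learners whose ability is close to the difficulty of its items.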
Targeted language testing
[Diagram: a routing test directs each student to one of three tests: A1–A2, A2–B1 or B1–B2.]
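Routing can be pictured as a simple threshold rule on the routing test score. The sketch below is only an illustration: the score range, the cut-offs and the function name `route` are hypothetical, and the slides do not give the actual routing rules.

```python
# Hypothetical cut-offs on a hypothetical routing-test raw score (0 to 20).
# A score at or below a cut-off is routed to the test listed beside it.
ROUTING_CUTS = [(7, "A1-A2"), (14, "A2-B1")]
TOP_TEST = "B1-B2"


def route(routing_score):
    """Return the targeted test for a given routing-test score."""
    for cut, test in ROUTING_CUTS:
        if routing_score <= cut:
            return test
    return TOP_TEST


for score in (3, 8, 18):
    print(score, route(score))
```

Because adjacent tests overlap by one level (A2 appears in two tests, as does B1), a student routed slightly too high or too low still meets tasks near their level.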
Test design
Reading tasks (English) by booklet. Each number is the position of the task in a booklet in which it appears; the assignment of positions to individual booklet columns was lost in extraction. Each booklet totals 30 minutes.

Level 1: booklets b1 to b12
Task      Code   Time  Positions
A1-R1-a   ER111  7.5   1 2 1 2
A1-R1-b   ER112  7.5   2 1 2 1
A1-R2-a   ER211  7.5   2 1 2 1
A1-R2-b   ER212  7.5   1 2 1 2
A1-R3-a   ER311  7.5   2 1 1 2
A1-R3-b   ER312  7.5   1 2 2 1
A2-R2-a   ER221  7.5   3 4 3
A2-R2-b   ER223  7.5   3 3 4
A2-R3-a   ER321  7.5   4 4 3
A2-R3-b   ER323  7.5   4 3 3
A2-R4-a   ER422  7.5   3 4 4
A2-R4-b   ER423  7.5   4 4 3
A2-R5-a   ER522  7.5   3 3 4
A2-R5-b   ER523  7.5   4 3 4

Level 2: booklets b13 to b24
Task      Code   Time  Positions
A2-R2-a   ER221  7.5   1 2 1
A2-R2-b   ER223  7.5   2 2 1
A2-R3-a   ER321  7.5   1 2 1
A2-R3-b   ER323  7.5   1 2 1
A2-R4-a   ER422  7.5   2 1 2
A2-R4-b   ER423  7.5   2 1 2
A2-R5-a   ER522  7.5   1 1 2
A2-R5-b   ER523  7.5   2 1 2
B1-R5-a   ER532  7.5   4 3 4 3
B1-R5-b   ER533  7.5   3 4 3 4
B1-R6-a   ER631  7.5   3 4 4 3
B1-R6-b   ER633  7.5   4 3 3 4
B1-R7-a   ER731  7.5   3 4 3 4
B1-R7-b   ER733  7.5   4 3 4 3

Level 3: booklets b25 to b36
Task      Code   Time  Positions
B1-R5-a   ER532  7.5   1 2 2
B1-R5-b   ER533  7.5   1 2 1
B1-R6-a   ER631  7.5   2 1 2
B1-R6-b   ER633  7.5   1 2 2
B1-R7-a   ER731  7.5   1 2 1
B1-R7-b   ER733  7.5   1 2 1
B2-R6-a   ER642  15    3 1 2
B2-R6-b   ER643  15    3 3
B2-R7-a   ER741  15    3 2 1
B2-R7-b   ER742  15    3 3
B2-R8-a   ER841  15    3 2 1
B2-R8-b   ER843  15    3 3
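One constraint visible in the design is that every booklet adds up to 30 minutes, whether it is built from four 7.5-minute tasks or from a mix of 7.5- and 15-minute tasks. The sketch below shows how such a constraint could be checked. The task codes and times are taken from the design table, but the booklet compositions are made up for illustration and do not reproduce the slide's actual task-to-booklet assignment; the function names are likewise hypothetical.

```python
# Task times in minutes, taken from the design table (a subset of tasks).
TASK_MINUTES = {
    "ER111": 7.5, "ER211": 7.5, "ER221": 7.5, "ER321": 7.5,
    "ER532": 7.5, "ER631": 7.5, "ER642": 15, "ER741": 15,
}


def booklet_minutes(task_codes):
    """Total testing time of a booklet, given its task codes in order."""
    return sum(TASK_MINUTES[code] for code in task_codes)


def is_valid_booklet(task_codes, target=30):
    """A booklet is valid if no task repeats and the time equals target."""
    no_repeats = len(set(task_codes)) == len(task_codes)
    return no_repeats and booklet_minutes(task_codes) == target


# Hypothetical booklets (NOT the actual ESLC booklet compositions).
booklets = {
    "level 1 example": ["ER111", "ER211", "ER221", "ER321"],
    "level 3 example": ["ER532", "ER631", "ER642"],
    "too short": ["ER111", "ER211"],
}
for name, tasks in booklets.items():
    print(name, booklet_minutes(tasks), is_valid_booklet(tasks))
```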
Standard setting to the CEFR
Standard reference: the CoE Manual for relating language exams to the CEFR:
http://www.coe.int/t/dg4/linguistic/manuel1_en.asp
Jones, N (2009) A comparative approach to constructing a multilingual proficiency framework: constraining the role of standard setting
http://www.coe.int/t/dg4/linguistic/Proceedings_CITO_EN.pdf
See too the CoE Manual for language test development and examining (ALTE)
http://www.coe.int/t/dg4/linguistic/ManualtLangageTest-Alte2011_EN.pdf
Standard setting to the CEFR
My conclusions:
Build on what you already know;
Performance skills are a more practical target for standard setting judgment than indirectly observable, objectively marked skills;
Comparative judgments are easier than absolute judgments, and therefore ranking may offer more than rating;
In a multilingual framework it is essential to minimize the role of subjective judgment.
Cross-language alignment
In the ESLC a study was possible for Writing.
A ranking study, cf. Sèvres (2008) for Speaking
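A ranking of scripts can be broken down into paired comparisons ("script X was judged better than script Y"), and such comparisons can be scaled statistically. The slides do not state which model was used for the ranking studies, so the sketch below assumes a Bradley-Terry model fitted with a standard iterative update, purely as an illustration; the script labels and judgment counts are hypothetical.

```python
def bradley_terry(wins, n_iter=200):
    """Estimate script strengths from paired-comparison counts.

    wins[(a, b)] is the number of times script a was judged better than
    script b. Returns a strength per script (higher means judged better),
    normalised to sum to 1. Assumed model, for illustration only.
    """
    items = sorted({x for pair in wins for x in pair})
    p = {i: 1.0 / len(items) for i in items}
    for _ in range(n_iter):
        new_p = {}
        for i in items:
            total_wins = sum(w for (a, _b), w in wins.items() if a == i)
            denom = 0.0
            for j in items:
                if j == i:
                    continue
                n_ij = wins.get((i, j), 0) + wins.get((j, i), 0)
                if n_ij:
                    denom += n_ij / (p[i] + p[j])
            new_p[i] = total_wins / denom if denom else p[i]
        norm = sum(new_p.values())
        p = {i: v / norm for i, v in new_p.items()}
    return p


# Hypothetical judgments on three scripts, possibly in different languages.
judgments = {
    ("S1", "S2"): 3, ("S2", "S1"): 1,
    ("S2", "S3"): 3, ("S3", "S2"): 1,
    ("S1", "S3"): 4, ("S3", "S1"): 1,
}
strengths = bradley_terry(judgments)
print(sorted(strengths, key=strengths.get, reverse=True))
```

Because the judges only decide which of two performances is better, scripts in different languages can be placed on one scale without each judge having to apply an absolute standard.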
Ranking approach to cross-language comparison (Speaking, CIEP 2008)
[Scatter plot: rankings (y-axis, -10 to 10) against ratings (x-axis, -10 to 10) for German, English, Spanish, French and Italian. The x-axis is marked with levels A1 to C1 from rating; the y-axis is marked with the standard set for rankings, A1 to C1.]
ESLC Writing alignment: five languages on a single scale
[Chart: scale values (-4 to 4) plotted against level (0 to 6) for English, French, German, Italian and Spanish. A second panel shows the student mean and median on a scale from -2.5 to 2.5, with the bands Pre-A1, A1, A2, B1 and B2 marked for each of the five languages.]
CEFR levels: first target language (skills averaged)
[Stacked bar chart: percentage of students (0% to 100%) at Pre-A1, A1, A2, B1 and B2, by country and tested language: UK-ENG (FR), FR (EN), BE nl (FR), PL (EN), ES (EN), PT (EN), BE fr (EN), BG (EN), BE de (FR), EL (EN), HR (EN), SI (EN), EE (EN), NL (EN), MT (EN), SE (EN).]
CEFR levels: second target language (skills averaged)
[Stacked bar chart: percentage of students (0% to 100%) at Pre-A1, A1, A2, B1 and B2, by country and tested language: SE (ES), PL (DE), UK-ENG (DE), EL (FR), PT (FR), FR (ES), HR (DE), BG (DE), SI (DE), EE (DE), BE fr (DE), ES (FR), MT (IT), NL (DE), BE de (EN), BE nl (EN).]
Asset Languages link between GCSE and CEFR
NQF level      General qualifications  Asset Languages  Asset CEFR level  Cambridge CEFR level  Cambridge ESOL exam
Level 7-8      -                       Mastery          C2                -                     -
Levels 4-6     -                       Proficiency      C1                -                     -
Level 3        AS/A/AEA                Advanced         B2                C2                    CPE
Level 2        Higher GCSE             Intermediate     B1                C1                    CAE
Level 1        Foundation GCSE         Preliminary      A2                B2                    FCE
Entry 3 Level  Entry 1-3               Breakthrough     A1                B1                    PET
Entry 2 Level  -                       -                -                 A2                    KET
Entry 1 Level  -                       -                -                 A1                    -
GCSE grades and CEFR levels
http://www.surveylang.org