International Outcomes of Learning in Mathematics Literacy and Problem Solving

U.S. Department of Education
Institute of Education Sciences
NCES 2005-003

PISA 2003 Results From the U.S. Perspective
Highlights
International Outcomes of Learning in Mathematics Literacy and Problem Solving: PISA 2003 Results From the U.S. Perspective

Highlights

U.S. Department of Education
Institute of Education Sciences
NCES 2005–003
Mariann Lemke
National Center for Education Statistics

Anindita Sen, Erin Pahlke, Lisette Partelow, David Miller
Education Statistics Services Institute

Trevor Williams, David Kastberg, Leslie Jocelyn
Westat
December 2004
U.S. Department of Education
Rod Paige, Secretary

Institute of Education Sciences
Grover J. Whitehurst, Director

National Center for Education Statistics
Robert Lerner, Commissioner
The National Center for Education Statistics (NCES) is the primary federal entity for collecting, analyzing, and reporting data related to education in the United States and other nations. It fulfills a congressional mandate to collect, collate, analyze, and report full and complete statistics on the condition of education in the United States; conduct and publish reports and specialized analyses of the meaning and significance of such statistics; assist state and local education agencies in improving their statistical systems; and review and report on education activities in foreign countries.
NCES activities are designed to address high priority education data needs; provide consistent, reliable, complete, and accurate indicators of education status and trends; and report timely, useful, and high quality data to the U.S. Department of Education, the Congress, the states, other education policymakers, practitioners, data users, and the general public.
We strive to make our products available in a variety of formats and in language that is appropriate to a variety of audiences. You, as our customer, are the best judge of our success in communicating information effectively. If you have any comments or suggestions about this or any other NCES product or report, we would like to hear from you. Please direct your comments to:
National Center for Education Statistics
Institute of Education Sciences
U.S. Department of Education
1990 K Street NW
Washington, DC 20006-5651
December 2004
The NCES World Wide Web Home Page is http://nces.ed.gov
The NCES World Wide Electronic Catalog is http://nces.ed.gov/pubsearch
Suggested Citation

Lemke, M., Sen, A., Pahlke, E., Partelow, L., Miller, D., Williams, T., Kastberg, D., and Jocelyn, L. (2004). International Outcomes of Learning in Mathematics Literacy and Problem Solving: PISA 2003 Results From the U.S. Perspective (NCES 2005–003). Washington, DC: U.S. Department of Education, National Center for Education Statistics.
For ordering information on this report, write:
U.S. Department of Education
ED Pubs
P.O. Box 1398
Jessup, MD 20794-1398
Call toll free 1-877-4ED-PUBS or order online at http://www.edpubs.org
Summary

The Program for International Student Assessment (PISA) is a system of international assessments that measures 15-year-olds’ capabilities in reading literacy, mathematics literacy, and science literacy every 3 years. PISA was first implemented in 2000 and is carried out by the Organization for Economic Cooperation and Development (OECD), an intergovernmental organization of industrialized countries. Each PISA data-collection effort assesses one subject area in depth, even as all three are assessed in each cycle, so that participating countries have an ongoing source of achievement data in every subject area (figure 1). In addition to the major subject areas of reading literacy, mathematics literacy, and science literacy, PISA also measures general or cross-curricular competencies such as learning strategies. In this second cycle, PISA 2003, mathematics literacy was the subject area assessed in depth, along with the new cross-curricular area of problem solving. Major findings for 2003 in mathematics literacy and problem solving are provided here, as well as brief discussions of student performance in reading literacy and science literacy and changes in performance between 2000 and 2003.
U.S. Performance in Mathematics Literacy and Problem Solving

In 2003, U.S. performance in mathematics literacy and problem solving was lower than the average performance for most OECD countries (tables 2 and 3). The United States also performed below the OECD average on each mathematics literacy subscale representing a specific content area (space and shape, change and relationships, quantity, and uncertainty). This is somewhat different from the PISA 2000 results, when reading literacy was the major subject area, which showed the United States performing at the OECD average (Lemke et al. 2001).
Along with scale scores, PISA 2003 also uses six proficiency levels (levels 1 through 6, with level 6 being the highest level of proficiency) to describe student performance in mathematics literacy (exhibit 5) and three proficiency levels (levels 1 through 3, with level 3 being the highest level of proficiency) to describe student performance in problem solving (exhibit 9). In mathematics literacy, the United States had greater percentages of students below level 1 and at levels 1 and 2 than the OECD average percentages (figure 5, table B-6). The United States also had a lower percentage of students at levels 4, 5, and 6 than the OECD average percentages. Results for each of the four mathematics content areas followed a similar pattern. In problem solving, the United States also had greater percentages of students below level 1 and at level 1 than the OECD average percentages, and a lower percentage of students at levels 2 and 3 than the OECD average percentages (figure 8, table B-15).
This is also somewhat different from the PISA 2000 reading literacy results, which showed that while the percentages of U.S. students performing at level 1 and below were not measurably different from the OECD averages, the United States had a greater percentage of students performing at the highest level (level 5) compared to the OECD average (Lemke et al. 2001). In mathematics literacy and problem solving in 2003, even the highest U.S. achievers (those in the top 10 percent in the United States) were outperformed on average by their OECD counterparts (figures 4 and 7, tables B-4 and B-13).
There were no measurable changes in the U.S. scores from 2000 to 2003 on either the space and shape subscale or the change and relationships subscale, the only content areas for which trend data from 2000 to 2003 are available (table B-11). In both 2000 and 2003, about two-thirds of the other participating OECD countries outperformed the United States in these content areas.
U.S. Performance in Reading Literacy and Science Literacy

The U.S. average score in reading literacy was not measurably different from the OECD average in 2000 or 2003 (figure 9, table B-16), nor was there any measurable change in the U.S. reading literacy score from 2000 to 2003.

The U.S. score was below the OECD average science literacy score in 2003 (figure 9, table B-17). There was no measurable change in the U.S. science literacy score from 2000 to 2003.
Differences in Performance by Selected Student Characteristics

Sex
Males outperformed females in mathematics literacy in the United States and in two-thirds of the other countries (figure 10, table B-18). Within the United States, a greater percentage of male than female students performed at level 6 (the highest level) in mathematics literacy, but females were not found in larger percentages at the lower levels (below level 1 and levels 1 through 5; table B-19). In other words, the difference in overall scores between males and females in the United States was due at least in part to a higher percentage of males among the highest performers, not to a higher percentage of females among the lowest performers.
In the majority of the PISA 2003 countries (32 out of 39 countries), including the United States, there were no measurable differences in problem-solving scores by sex (figure 10, table B-21). However, females outscored their male peers in problem solving in six of the seven remaining participating countries, as well as at the OECD average. Males outscored females in problem solving in Macao-China.
Socioeconomic Background
In 2003, a few countries showed stronger relationships between socioeconomic background (as measured by parental occupational status) and student performance than the United States, while more showed weaker relationships. The relationship between socioeconomic background and student performance in mathematics literacy was stronger in 5 countries (Belgium, Czech Republic, Germany, Hungary, and Poland) than in the United States, while 11 countries had weaker relationships (table B-25). Three of the same five countries (Belgium, Germany, and Hungary) had stronger relationships between socioeconomic background and problem-solving performance than the United States, while 12 had weaker relationships.
Race/Ethnicity
In the United States in PISA 2003, Blacks and Hispanics scored lower on average than Whites, Asians, and students of more than one race in mathematics literacy and problem solving (figure 11, table B-26). Hispanic students, in turn, outscored Black students. In both mathematics literacy and problem solving, the average scores for Blacks and Hispanics were below the OECD average scores, while scores for Whites were above the OECD average scores.
For further results from PISA 2003, see the Organization for Economic Cooperation and Development (OECD) publication Learning for Tomorrow’s World — First Results From PISA 2003, available at http://www.pisa.oecd.org (OECD 2004). A technical report for PISA 2003—which describes in detail all the procedures used in the design, data collection, quality control, and analysis for the study, as well as the PISA 2003 data itself—will also be made available at that site.
Acknowledgments

The authors of this report cannot take full credit for its production. Many people contributed to making this report possible, and the authors wish to thank all those who have assisted with various aspects of the report, including data analysis, reviews, and design.

Members of the Trends in International Mathematics and Science Study-Program for International Student Assessment (TIMSS-PISA) expert panel provided valuable input on issues related to communication and dissemination. Members are listed in appendix C.

We also wish to thank the technical reviewers, Marilyn Seastrom, Devon Carlson, and Zeyu Xu, and other reviewers (Mary Lindquist, Jeremy Kilpatrick, Joan Ferrini-Mundy, and Dana Kelly) for their comments. Thanks are also owed to Brian Henigin, Michael Stock, and Tracey Summerall of Westat for their design work.

Members of the PISA international management team, including the Australian Council for Educational Research (ACER) and the Organization for Economic Cooperation and Development (OECD), graciously assisted with data analysis and documentation questions. In particular, we would like to gratefully acknowledge the assistance of Claudia Tamassia, Andreas Schleicher, and Claire Shewbridge at the OECD.

Finally, we wish to especially thank the students, schools, and principals who participated in PISA 2003. Their time and effort provide us with data to look beyond our borders and gain valuable insight into our own educational practices.
Table B-2. Percentage distribution and average combined mathematics literacy scores of U.S. 15-year-old students, by type of mathematics class: 2003
Table B-6. Percentage distribution of 15-year-old students scoring at each proficiency level on the combined mathematics literacy scale, by country: 2003
Table B-7. Percentage distribution of 15-year-old students scoring at each proficiency level on the mathematics literacy quantity subscale, by country: 2003
Table B-8. Percentage distribution of 15-year-old students scoring at each proficiency level on the mathematics literacy space and shape subscale, by country: 2003
Table B-9. Percentage distribution of 15-year-old students scoring at each proficiency level on the mathematics literacy change and relationships subscale, by country: 2003
Table B-10. Percentage distribution of 15-year-old students scoring at each proficiency level on the mathematics literacy uncertainty subscale, by country: 2003
Table B-15. Percentage distribution of 15-year-old students scoring at each proficiency level on the problem-solving scale, by country: 2003
Figure 8. Percentage distribution of 15-year-old students in the OECD countries and the United States on the problem-solving scale, by proficiency level: 2003
Figure 9. Average reading literacy and science literacy scores of 15-year-old students in the OECD countries and the United States: 2003
Figure 10. Differences in average scores of 15-year-old students on the combined mathematics literacy scale and in problem solving, by sex and country: 2003
Figure 11. Average scores of U.S. 15-year-old students on the combined mathematics literacy scale and in problem solving, by race/ethnicity: 2003
Introduction

PISA in Brief

The Program for International Student Assessment (PISA) is a system of international assessments that measures 15-year-olds’ capabilities in reading literacy, mathematics literacy, and science literacy every 3 years. PISA was first implemented in 2000 (figure 1).
PISA is sponsored by the Organization for Economic Cooperation and Development (OECD), an intergovernmental organization of 30 industrialized nations. In 2003, 41 countries participated in PISA, including 30 OECD countries and 11 non-OECD countries (table 1). Of those 41 countries, comparisons for 39 countries (29 OECD countries and 10 non-OECD countries) are provided in this report. Data for one country, Brazil, were not available at the time of report production, and data for one other, the United Kingdom, are not discussed due to low response rates.
Figure 1. Program for International Student Assessment (PISA) cycle

2000 (2009…): READING, mathematics, science
2003 (2012…): MATHEMATICS, reading, science
2006 (2015…): SCIENCE, mathematics, reading

NOTE: The subject in all capital letters in each assessment cycle is the major domain for that cycle.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table 1. Participation in the Program for International Student Assessment (PISA), by country: 2000 and 2003

[country-by-country participation table not reproduced here]

1 Due to low response rates, PISA 2000 data for the Netherlands are not discussed in this report. For information on the results for the Netherlands, see OECD (2001).
2 Due to low response rates, PISA 2003 data for the United Kingdom are not discussed in this report.
3 Although Brazil participated in PISA 2003, its data were not available in time for production of this report.
NOTE: A "•" indicates that the country participated in PISA in the specific year. Because PISA is principally an OECD study, non-OECD countries are displayed separately from the OECD countries.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2000 and 2003.
In order to provide a critical external perspective on the achievement of U.S. students through comparisons to other nations, the United States participates at the international level in PISA, the Progress in International Reading Literacy Study (PIRLS), and the Trends in International Mathematics and Science Study (TIMSS).1
TIMSS and PIRLS seek to measure students’ mastery of specific knowledge, skills, and concepts, and are designed to reflect curriculum frameworks in the United States and other participating countries.
PISA provides a unique and complementary perspective to these studies by not focusing explicitly on curricular outcomes, but on the application of knowledge in reading, mathematics, and science to problems with a real-life context (OECD 1999). The framework for each assessment area is based upon content, processes, and situations or contexts. For example, for mathematics literacy, the content is made up of major mathematical ideas, such as space and shape and uncertainty. The processes describe what strategies students use to solve mathematics problems, such as making connections or performing simple calculations. The situations or contexts refer to the kinds of places in which students might encounter mathematical problems, such as personal or educational. Assessment items are then developed based on these descriptions.
PISA uses the terminology of “literacy” in each subject area to denote its broad focus on application of knowledge and skills; that is, PISA seeks to ask if 15-year-olds are mathematically literate, or to what extent they can apply mathematical knowledge and skills to a range of different situations they may encounter in their lives. Literacy itself refers to a continuum of skills: it is not a condition that one has or does not have (i.e., literacy or illiteracy), but rather each person’s skills place that person at a particular point on the literacy continuum.
Each PISA data-collection effort assesses one subject area in depth, even as all three are assessed in each cycle, so that participating countries have an ongoing source of achievement data in every subject area. In addition to reading literacy, mathematics literacy, and science literacy, PISA also measures general or cross-curricular competencies such as learning strategies. In this second cycle, PISA 2003, mathematics literacy was the subject area assessed in depth, along with the new cross-curricular area of problem solving. In 2006, PISA will focus on science literacy. Results from PISA 2000, which focused on reading literacy, are described in Lemke et al. (2001) and Organization for Economic Cooperation and Development (OECD) (2001). In addition, a series of thematic reports exploring topics related to reading literacy in greater depth is available through http://www.pisa.oecd.org (see also the PISA resources and publications section of this report for information about PISA publications).
This report focuses on the performance of U.S. students in the two major areas assessed in 2003, mathematics literacy and problem solving. Achievement in the minor domains of reading literacy and science literacy in 2003 is also presented, and differences in achievement by selected student characteristics are covered in the final section.
The Unique Contribution of PISA

The United States has conducted surveys of student achievement at a variety of grade levels and in a variety of subject areas through the National Assessment of Educational Progress (NAEP) for many years. NAEP provides a regular benchmark for states and the nation and a means to monitor progress in achievement over time.
1 The United States has also participated in international comparative assessments of civics knowledge and skills (CivEd 1999) and adult literacy (International Adult Literacy Survey [IALS 1994] and Adult Literacy and Lifeskills Survey [ALL 2003]).
The target age of 15 allows countries to compare outcomes of learning as students near the end of compulsory schooling. PISA’s goal is to answer the question “what knowledge and skills do students have at age 15?” taking into account schooling and other factors that may influence their performance. In this way, PISA’s achievement scores represent a “yield” of learning at age 15, rather than a direct measure of attained curriculum knowledge at a particular grade level, since 15-year-olds in the United States and elsewhere come from several grade levels and are enrolled in a variety of classes (figures 2 and 3, tables B-1 and B-2).
How PISA 2003 Was Conducted

PISA 2003 was sponsored by the OECD and carried out at the international level through a contract with the PISA Consortium, led by the Australian Council for Educational Research (ACER).2 The National Center for Education Statistics (NCES) of the Institute of Education Sciences at the U.S. Department of Education was responsible for the implementation of PISA in the United States. Data collection in the United States was carried out through a contract with Westat. A review panel (see appendix C for a list of members) provides input on the development and dissemination of PISA (and TIMSS) in the United States.
PISA 2003 was a 2-hour paper-and-pencil assessment of 15-year-olds collected from nationally representative samples in participating countries. Like other large-scale assessments, PISA was not designed to provide individual student scores, but rather national and subnational estimates of performance. Every student in PISA 2003 was assessed in mathematics literacy; reading, problem solving, and science questions were spread among students (for more information on PISA 2003’s design, see the technical notes in appendix A).
2 The PISA Project Consortium consists of the Australian Council for Educational Research (ACER), the Netherlands National Institute for Educational Measurement (CITO), Educational Testing Service (ETS, USA), National Institute for Educational Policy Research (NIER, Japan), and Westat (USA).
3 The sample frame data for the United States for public schools were from the Common Core of Data (CCD), and the data for private schools were from the Private School Survey (PSS). Any school containing at least one 7th- through 12th-grade class as of the school year 2000–01 was included on the school sampling frame.
PISA 2003 was administered between March and May 2003. The U.S. sample included both public and private schools, randomly selected and weighted to be representative of the nation.3 In the United States, to improve response rates (a response rate of approximately 50 percent was projected for the end of the data collection period) and to better accommodate school schedules, a second testing window was opened from September through November 2003. In total, 262 schools and 5,456 students participated in PISA 2003 in the United States. After testing in the second window was complete, an overall weighted school response rate of 65 percent (before the use of replacement schools) and a weighted student response rate of 83 percent were achieved (see technical notes in appendix A for additional details on sampling, administration, response rates, and other issues).
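The weighted response rates cited above can be sketched in a few lines. This is an illustrative calculation only, not NCES's actual procedure (which involves school substitution and eligibility adjustments described in appendix A), and the weights and participation flags below are invented:

```python
# Illustrative sketch of a weighted response rate: the sum of base
# sampling weights for participating units divided by the sum of base
# sampling weights for all eligible sampled units.

def weighted_response_rate(weights, responded):
    """weights: base sampling weights for eligible units;
    responded: parallel list of booleans (True = participated)."""
    total = sum(weights)
    achieved = sum(w for w, r in zip(weights, responded) if r)
    return achieved / total

# Hypothetical example: four sampled schools with unequal weights,
# one of which refused to participate.
rate = weighted_response_rate([120.0, 80.0, 100.0, 100.0],
                              [True, False, True, True])
print(round(rate, 2))  # 0.8
```

Because the rate is weighted, a refusal by a heavily weighted school lowers the rate more than a refusal by a lightly weighted one, which is why weighted and unweighted response rates can differ.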
For further results from PISA 2003, see the Organization for Economic Cooperation and Development (OECD) publication Learning for Tomorrow’s World — First Results From PISA 2003, available at http://www.pisa.oecd.org (OECD 2004). A technical report for PISA 2003—which describes in detail all the procedures used in the design, data collection, quality control, and analysis for the study, as well as the PISA 2003 data itself—is also available at that site.
This report provides results for the United States in relation to the other countries participating in PISA 2003, distinguishing OECD countries and non-OECD countries. All differences described in this report have been tested for statistical significance at the .05 level. Additional information on statistical procedures used in this report is provided in the technical notes.
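A hedged sketch of the kind of comparison behind significance testing at the .05 level: a two-sided z-test on the difference between two independent country means, using their standard errors. The report's actual procedures (described in the technical notes) additionally account for PISA's complex sample design and measurement error through replicate weights and plausible values; the means and standard errors below are invented for illustration:

```python
# Two-sided z-test on the difference of two independent means.
# Simplified sketch; real PISA comparisons use replicate weights and
# plausible values to obtain the standard errors.
from math import erf, sqrt

def significant_difference(mean1, se1, mean2, se2, alpha=0.05):
    """True if the difference between two means is statistically
    significant at the given alpha level."""
    z = (mean1 - mean2) / sqrt(se1**2 + se2**2)
    # Standard normal CDF via the error function, then two-sided p-value.
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return p < alpha

# Hypothetical country means and standard errors:
print(significant_difference(503, 3.3, 483, 2.9))  # True
print(significant_difference(500, 3.0, 498, 3.0))  # False
```

The second call shows why a small score gap between two countries can be reported as "not measurably different": with standard errors of about 3 points each, a 2-point gap is well within sampling variability.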
Figure 2. Percentage distribution of U.S. 15-year-old students, by grade: 2003

Grades 7 and 8: 2 percent
Grade 9: 30 percent
Grade 10: 61 percent
Grade 11 and above: 7 percent

NOTE: Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.

Figure 3. Percentage distribution of U.S. 15-year-old students, by type of mathematics class: 2003

Pre-algebra or general math: 9 percent
Algebra I: 29 percent
Geometry: 31 percent
Algebra II: 21 percent
Precalculus or calculus: 3 percent
Other: 8 percent

NOTE: Type of class refers to the mathematics class in which the student was enrolled at the time of assessment. Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
U.S. Performance in Mathematics Literacy

PISA’s major focus in 2003 was mathematics literacy. Mathematics literacy is defined as:

...an individual’s capacity to identify and understand the role that mathematics plays in the world, to make well-founded judgements and to use and engage with mathematics in ways that meet the needs of that individual’s life as a constructive, concerned, and reflective citizen. (OECD 2003, p. 24)

PISA’s emphasis is on the ability to apply a range of knowledge and skills to a variety of problems with real-life contexts. In the PISA 2003 mathematics literacy assessment, students completed exercises designed to assess their capabilities in using a range of mathematical competencies, grouped and described as “competency clusters.” These clusters—reproduction, connections, and reflection—describe sets of skills students may use to solve problems. The reproduction cluster involves the reproduction of practiced material and performing routine operations. The connections cluster calls for integration and connection of material, and the modest extension of practiced material. The reflection cluster relates to students’ abilities in advanced reasoning, argumentation, abstraction, generalization, and modeling applied to new contexts.

The problems themselves were designed to come from the variety of situations (personal, educational/occupational, public, or scientific) that students encounter, and to have a real-life context. The mathematical content of the problems was drawn from four overarching ideas: space and shape, change and relationships, quantity, and uncertainty.
These overarching ideas represent a way to organize mathematical content broadly and encompass many traditional curricular areas such as algebra or geometry (see also Steen 1990).
• Space and shape includes recognizing shapes and patterns; describing, encoding, and decoding visual information; understanding dynamic changes to shapes; understanding similarities and differences and relative positions; and understanding the relationship between visual representations and real shapes and images.

• Change and relationships covers the representation of change, including mathematical functions such as linear, exponential, or logistic, as well as the data analysis needed to specify relationships or translate between representations.

• Quantity focuses on quantitative reasoning (including number sense, estimating, mental arithmetic, understanding the meaning of operations, having a feel for the magnitude of numbers, and computations) and understanding of numerical patterns, counts, and measures.

• Uncertainty includes the two related topics of data and chance, or statistics and probability, including data analysis and graphic and numeric representations of data.
A comparative analysis of the NAEP, PISA, and TIMSS mathematics assessments sponsored by NCES found that the 2003 PISA mathematics literacy assessment used far fewer multiple-choice items than NAEP or TIMSS. PISA also had a much stronger content focus on the “data” area (which often deals with using charts and graphs), which fits with PISA’s emphasis on using materials with a real-world context (see technical notes for more information on the results of the assessment comparisons).4
4 See Neidorf, T.S., Binkley, M., Gattis, K., and Nohara, D. (forthcoming) and the technical notes in appendix A for more information. Other comparative analyses focus on assessments of science and reading in PISA, NAEP, TIMSS, and PIRLS. See Neidorf, T.S., Binkley, M., and Stephens, M. (forthcoming); Binkley, M., and Kelly, D. (2003); Binkley, M., Afflerbach, P., and Kelly, D. (forthcoming); and Nohara, D. (2001).
Sample mathematics literacy items for each of these areas and student responses are shown here. For more information about the mathematics literacy domain, refer to The PISA 2003 Assessment Framework: Mathematics, Reading, Science, and Problem Solving Knowledge and Skills (OECD 2003). Additional mathematics literacy sample items can be found at http://nces.ed.gov/surveys/pisa, in the PISA 2003 framework document referenced above, in Measuring Student Knowledge and Skills: The PISA 2000 Assessment of Reading, Mathematical and Scientific Literacy (OECD 2000), and in Sample Tasks from the PISA 2000 Assessment: Reading, Mathematical and Scientific Literacy (OECD 2002).
Exhibit 1. Space and shape sample item: 2003

SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Exhibit 2. Change and relationships sample item: 2003

The Best Car

SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Exhibit 3. Quantity sample item: 2003

Exchange Rate

SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Exhibit 4. Uncertainty sample item: 2003
Test Scores
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Combined mathematics literacy scores are reported on a scale with a mean of 500 and a standard deviation of 100.⁵ Fifteen-year-old students in the United States had an average score of 483 on the combined mathematics literacy scale, lower than the OECD average score of 500 (tables 2 and B-3). U.S. students were less mathematically literate than their peers in 20 of the other 28 OECD countries and 3 of the 10 non-OECD countries. Eleven countries (5 OECD countries and 6 non-OECD countries) scored lower than the United States in mathematics literacy.

U.S. students also scored below the OECD average on each of the four content area subscales (space and shape, change and relationships, quantity, and uncertainty). Twenty-four countries (20 OECD and 4 non-OECD countries) outperformed the United States on the space and shape subscale, 21 countries (18 OECD and 3 non-OECD countries) on the change and relationships subscale, 26 countries (23 OECD and 3 non-OECD countries) on the quantity subscale, and 19 countries (16 OECD and 3 non-OECD countries) on the uncertainty subscale.
⁵Because the average was set for the combined mathematics literacy scale, average scores for the mathematics literacy subscales differ slightly from 500. PISA 2000 mathematics literacy scores were re-scaled using the greater detail in PISA 2003 data in order to provide a more complete measure of achievement than that available in 2000. See technical notes in appendix A for more information on scaling. PISA's intent for each subject area is to draw baseline information for describing changes and trends in achievement from the cycle in which that subject area is the major domain. The use of minor domains allows PISA to provide indicative information about changes in performance over time; however, changes in a subject area are best measured from the cycle in which it is the major domain. Thus, changes in reading literacy achievement are based upon PISA 2000 data, when reading literacy was the major domain, and changes in mathematics literacy scores, in turn, are based upon this 2003 cycle. Science literacy scores from 2000 and 2003 may be re-scaled based upon the much greater detail for science literacy that will be available in 2006.
Table 2. Average combined mathematics literacy scores and subscale scores of 15-year-old students, by country: 2003
Combined mathematics literacy | Space and shape subscale | Change and relationships subscale
OECD average 500 | OECD average 496 | OECD average 499

OECD countries:
Finland 544 | Japan 553 | Netherlands 551
Korea 542 | Korea 552 | Korea 548
Netherlands 538 | Switzerland 540 | Finland 543
Japan 534 | Finland 539 | Canada 537
Canada 532 | Belgium 530 | Japan 536
Belgium 529 | Czech Republic 527 | Belgium 535
Switzerland 527 | Netherlands 526 | New Zealand 526
Australia 524 | New Zealand 525 | Australia 525
New Zealand 523 | Australia 521 | Switzerland 523
Czech Republic 516 | Canada 518 | France 520
Iceland 515 | Austria 515 | Czech Republic 515
Denmark 514 | Denmark 512 | Iceland 509
France 511 | France 508 | Denmark 509
Sweden 509 | Slovak Republic 505 | Germany 507
Austria 506 | Iceland 504 | Ireland 506
Germany 503 | Germany 500 | Sweden 505
Ireland 503 | Sweden 498 | Austria 500
Slovak Republic 498 | Poland 490 | Hungary 495
Norway 495 | Luxembourg 488 | Slovak Republic 494
Luxembourg 493 | Norway 483 | Norway 488
Poland 490 | Hungary 479 | Luxembourg 487
Hungary 490 | Spain 476 | United States 486
Spain 485 | Ireland 476 | Poland 484
United States 483 | United States 472 | Spain 481
Portugal 466 | Italy 470 | Portugal 468
Italy 466 | Portugal 450 | Italy 452
Greece 445 | Greece 437 | Greece 436
Turkey 423 | Turkey 417 | Turkey 423
Mexico 385 | Mexico 382 | Mexico 364

Non-OECD countries:
Hong Kong-China 550 | Hong Kong-China 558 | Hong Kong-China 540
Liechtenstein 536 | Liechtenstein 538 | Liechtenstein 540
Macao-China 527 | Macao-China 528 | Macao-China 519
Latvia 483 | Latvia 486 | Latvia 487
Russian Federation 468 | Russian Federation 474 | Russian Federation 477
Serbia and Montenegro 437 | Serbia and Montenegro 432 | Serbia and Montenegro 419
Uruguay 422 | Thailand 424 | Uruguay 417
Thailand 417 | Uruguay 412 | Thailand 405
Indonesia 360 | Indonesia 361 | Tunisia 337
Tunisia 359 | Tunisia 359 | Indonesia 334
See notes at end of table.
Table 2. Average combined mathematics literacy scores and subscale scores of 15-year-old students, by country: 2003—Continued
Quantity subscale | Uncertainty subscale
OECD average 501 | OECD average 502

OECD countries:
Finland 549 | Netherlands 549
Korea 537 | Finland 545
Switzerland 533 | Canada 542
Belgium 530 | Korea 538
Netherlands 528 | New Zealand 532
Canada 528 | Australia 531
Czech Republic 528 | Japan 528
Japan 527 | Iceland 528
Australia 517 | Belgium 526
Denmark 516 | Ireland 517
Germany 514 | Switzerland 517
Sweden 514 | Denmark 516
Iceland 513 | Norway 513
Austria 513 | Sweden 511
Slovak Republic 513 | France 506
New Zealand 511 | Czech Republic 500
France 507 | Austria 494
Ireland 502 | Poland 494
Luxembourg 501 | Germany 493
Hungary 496 | Luxembourg 492
Norway 494 | United States 491
Spain 492 | Hungary 489
Poland 492 | Spain 489
United States 476 | Slovak Republic 476
Italy 475 | Portugal 471
Portugal 465 | Italy 463
Greece 446 | Greece 458
Turkey 413 | Turkey 443
Mexico 394 | Mexico 390

Non-OECD countries:
Hong Kong-China 545 | Hong Kong-China 558
Liechtenstein 534 | Macao-China 532
Macao-China 533 | Liechtenstein 523
Latvia 482 | Latvia 474
Russian Federation 472 | Russian Federation 436
Serbia and Montenegro 456 | Serbia and Montenegro 428
Uruguay 430 | Thailand 423
Thailand 415 | Uruguay 419
Tunisia 364 | Indonesia 385
Indonesia 357 | Tunisia 363

Average is significantly higher than the U.S. average
Average is not significantly different than the U.S. average
Average is significantly lower than the U.S. average
NOTE: Statistical comparisons between the U.S. average and the Organization for Economic Cooperation and Development (OECD) average take into account the contribution of the U.S. average toward the OECD average. The OECD average is the average of the national averages of the OECD member countries with data available. Because the Program for International Student Assessment (PISA) is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. Due to low response rates, data for the United Kingdom are not discussed in this report.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Along with scale scores, PISA 2003 also uses six proficiency levels (levels 1 through 6, with level 6 being the highest level of proficiency) to describe student performance in mathematics literacy (exhibit 5). An additional level (below level 1) encompasses students whose skills cannot be described using these proficiency levels. The proficiency levels describe what students at each level can do and allow comparisons of the percentages of students in each country who perform at different levels of mathematics literacy (see technical notes in appendix A for more information about how levels were set).

The U.S. average score of 483 on the combined mathematics literacy scale was just above the bottom cut point for level 3; the OECD average score of 500 was near the midpoint of level 3 (table 2, exhibit 5). The cutoff score of 607 for U.S. high performers (those in the top 10 percent in the United States) placed it just into level 5; the OECD score for high performers was near the midpoint of level 5. The cutoff U.S. score of 356 for low performers (those in the bottom 10 percent) was below level 1, while the OECD cutoff score of 369 for the bottom 10 percent was a level 1 score (figure 4, exhibit 5).

On average, the highest U.S. achievers (those in the top 10 percent of U.S. students) were outperformed by their OECD counterparts (figure 4, table B-4). To be in the top 10 percent in the United States, students had to score 607 or higher, while on average across the OECD countries, students would have had to score 628 or higher to be in the top 10 percent. Scores for the top 10 percent of students within countries ranged from 466 or better in Indonesia and Tunisia to 672 or better in Hong Kong-China. Low performers in the United States (those in the bottom 10 percent) had a cutoff score of 356 or lower, which was lower than the cutoff score of 369 or lower for the OECD average. There was approximately a 251 point score difference, or about two and a half standard deviations, between the cutoff scores for the top 10 percent and the bottom 10 percent of 15-year-old students for mathematics literacy in the United States, compared to about a 259 point difference using the OECD average scores.

The standard deviation (which measures the spread of scores around the average) for the United States (95), in fact, was lower than the OECD average standard deviation of 100 (table B-5). Sixteen countries (10 OECD and 6 non-OECD countries) showed less variation in performance than the United States, while three countries (Belgium, Germany, and Uruguay) had larger standard deviations.
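The gap arithmetic above can be sketched in a few lines. The inputs are the reported U.S. percentile cutoffs and the PISA scale's standard deviation of 100; this is an illustrative sketch, not NCES code:

```python
# Reported U.S. cutoffs on the combined mathematics literacy scale (PISA 2003).
us_p90 = 607    # score needed to be in the top 10 percent in the United States
us_p10 = 356    # cutoff for the bottom 10 percent in the United States
scale_sd = 100  # standard deviation of the PISA score scale

gap = us_p90 - us_p10        # 251 points
gap_in_sds = gap / scale_sd  # about two and a half standard deviations

print(gap, round(gap_in_sds, 1))  # 251 2.5
```

The same calculation with the OECD average cutoffs (628 and 369) yields the roughly 259 point difference cited in the text.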
Figure 4. Distribution of combined mathematics literacy scores of 15-year-old students, by country: 2003

[Figure: for the OECD average and each participating country (OECD and non-OECD shown separately), the mean score with its 95 percent confidence interval (+/- 2 standard errors) and the 10th, 25th, 75th, and 90th percentiles of performance.]

NOTE: The Organization for Economic Cooperation and Development (OECD) average is the average of the national averages of the OECD member countries with data available. Because the Program for International Student Assessment (PISA) is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. Due to low response rates, data for the United Kingdom are not discussed in this report.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Exhibit 5. Description of proficiency levels for combined mathematics literacy: 2003

Proficiency level and task descriptions:

Level 1: At Level 1 students can answer questions involving familiar contexts where all relevant information is present and the questions are clearly defined. They are able to identify information and to carry out routine procedures according to direct instructions in explicit situations. They can perform actions that are obvious and follow immediately from the given stimuli.

Level 2: At Level 2 students can interpret and recognize situations in contexts that require no more than direct inference. They can extract relevant information from a single source and make use of a single representational mode. Students at this level can employ basic algorithms, formulae, procedures, or conventions. They are capable of direct reasoning and making literal interpretations of the results.

Level 3: At Level 3 students can execute clearly described procedures, including those that require sequential decisions. They can select and apply simple problem solving strategies. Students at this level can interpret and use representations based on different information sources and reason directly from them. They can develop short communications reporting their interpretations, results, and reasoning.

Level 4: At Level 4 students can work effectively with explicit models for complex concrete situations that may involve constraints or call for making assumptions. They can select and integrate different representations, including symbolic, linking them directly to aspects of real-world situations. Students at this level can utilize well-developed skills and reason flexibly, with some insight, in these contexts. They can construct and communicate explanations and arguments based on their interpretations, arguments, and actions.

Level 5: At Level 5 students can develop and work with models for complex situations, identifying constraints and specifying assumptions. They can select, compare, and evaluate appropriate problem solving strategies for dealing with complex problems related to these models. Students at this level can work strategically using broad, well-developed thinking and reasoning skills, appropriate linked representations, symbolic and formal characterizations, and insight pertaining to these situations. They can reflect on their actions and formulate and communicate their interpretations and reasoning.

Level 6: At Level 6 students can conceptualize, generalize, and utilize information based on their investigations and modeling of complex problem situations. They can link different information sources and representations and flexibly translate among them. Students at this level are capable of advanced mathematical thinking and reasoning. These students can apply this insight and understanding, along with a mastery of symbolic and formal mathematical operations and relationships, to develop new approaches and strategies for attacking novel situations. Students at this level can formulate and precisely communicate their actions and reflections regarding their findings, interpretations, arguments, and the appropriateness of these to the original situations.

NOTE: In order to reach a particular level, a student must have been able to correctly answer a majority of items at that level. Students were classified into mathematics literacy levels according to their scores. Exact cut point scores are as follows: below level 1 (a score less than or equal to 357.77); level 1 (a score greater than 357.77 and less than or equal to 420.07); level 2 (a score greater than 420.07 and less than or equal to 482.38); level 3 (a score greater than 482.38 and less than or equal to 544.68); level 4 (a score greater than 544.68 and less than or equal to 606.99); level 5 (a score greater than 606.99 and less than or equal to 669.3); level 6 (a score greater than 669.3).
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
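The cut points in the exhibit note define a simple mapping from scores to proficiency levels. A hedged sketch of that mapping (the function name is mine, not NCES's):

```python
# Upper bounds of 'below level 1' and levels 1-5, from the exhibit note.
# Scores above the last cut point fall in level 6. Illustrative only.
CUT_POINTS = [357.77, 420.07, 482.38, 544.68, 606.99, 669.3]

def math_literacy_level(score):
    """Return the proficiency level label for a combined mathematics literacy score."""
    for level, upper_bound in enumerate(CUT_POINTS):
        if score <= upper_bound:  # bands are 'greater than ... less than or equal to'
            return "below level 1" if level == 0 else f"level {level}"
    return "level 6"

# The U.S. average of 483 sits just above the level 3 cut point of 482.38:
print(math_literacy_level(483))  # level 3
```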
The United States had greater percentages of students below level 1 and at levels 1 and 2 than the OECD average percentages (figure 5, table B-6). The United States also had a lower percentage of students at levels 4, 5, and 6 than the OECD average percentages. This is somewhat different from the 2000 results, when reading literacy was the major domain. PISA 2000 results showed that while the percentages of U.S. students performing at level 1 and below were not measurably different from the OECD averages, the United States had a greater percentage of students performing at the highest level (level 5) compared to the OECD average (Lemke et al. 2001).

In mathematics literacy in 2003, half (19) of the other 38 countries had a higher percentage of students at level 6 than the United States, including 16 OECD countries and 3 non-OECD countries (Hong Kong-China, Liechtenstein, and Macao-China) (figure 6, table B-6). In contrast, nine countries had a higher percentage of students below level 1 than the United States (four of these nine—Greece, Italy, Mexico, and Turkey—were OECD countries). These same nine countries, as well as the Russian Federation and Portugal, had more students at level 1 than the United States.

The United States had a lower percentage of students at level 6 than the OECD average for each of the four content area subscales (space and shape, change and relationships, quantity, and uncertainty) and a smaller percentage than the OECD average at level 4 and level 5 on three of the four subscales (exceptions were uncertainty at level 5 and change and relationships at level 4) (tables B-7 through B-10).

The United States also had a higher percentage of students at level 1 than the OECD average on each of the four subscales and more at level 2 for all subscales except uncertainty. On the quantity and uncertainty subscales, the United States also had greater percentages of students than the OECD average percentages below level 1.
Figure 5. Percentage distribution of 15-year-old students in the OECD countries and the United States on the combined mathematics literacy scale, by proficiency level: 2003

NOTE: In order to reach a particular proficiency level, a student must have been able to correctly answer a majority of items at that level. Students were classified into mathematics literacy levels according to their scores. Exact cut point scores are as follows: below level 1 (a score less than or equal to 357.77); level 1 (a score greater than 357.77 and less than or equal to 420.07); level 2 (a score greater than 420.07 and less than or equal to 482.38); level 3 (a score greater than 482.38 and less than or equal to 544.68); level 4 (a score greater than 544.68 and less than or equal to 606.99); level 5 (a score greater than 606.99 and less than or equal to 669.3); level 6 (a score greater than 669.3). The Organization for Economic Cooperation and Development (OECD) average is the average of the national averages of the OECD member countries with data available. Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Figure 6. Percentage distribution of 15-year-old students on the combined mathematics literacy scale, by proficiency level and country: 2003
NOTE: In order to reach a particular proficiency level, a student must have been able to correctly answer a majority of items at that level. Students were classified into mathematics literacy levels according to their scores. Exact cut point scores are as follows: below level 1 (a score less than or equal to 357.77); level 1 (a score greater than 357.77 and less than or equal to 420.07); level 2 (a score greater than 420.07 and less than or equal to 482.38); level 3 (a score greater than 482.38 and less than or equal to 544.68); level 4 (a score greater than 544.68 and less than or equal to 606.99); level 5 (a score greater than 606.99 and less than or equal to 669.3); level 6 (a score greater than 669.3). The Organization for Economic Cooperation and Development (OECD) average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. Due to low response rates, data for the United Kingdom are not discussed in this report. Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Changes in Mathematics Literacy Performance From 2000 to 2003

Because mathematics literacy was a minor domain in 2000, items from only two content areas (space and shape and change and relationships) were administered in that assessment cycle. As a result, it is not possible to describe changes since 2000 for the combined mathematics literacy scale or for the other two content areas (quantity or uncertainty). Rather, changes can only be discussed for the two content areas represented in both 2000 and 2003 (space and shape and change and relationships). Data from 2000 were re-scaled using 2003 mathematics literacy data in order to make these comparisons.⁶ Comparisons were available only for OECD countries common to both the 2000 and 2003 cycles (28 countries), but results for the United Kingdom and the Netherlands are not discussed here due to low response rates for the United Kingdom in 2003 and the Netherlands in 2000. In total, results for 26 OECD countries were available for comparisons and are discussed here.

There were no measurable changes in the U.S. scores from 2000 to 2003 on either the space and shape subscale or the change and relationships subscale (table B-11). In both 2000 and 2003, about two-thirds of the other countries outperformed the United States on these scales. Eighteen of the other 25 OECD countries outscored the United States on the space and shape scale in 2003 (compared to 19 in 2000); 17 OECD countries outscored the United States on the change and relationships scale in 2003 (compared to 14 in 2000).

Five countries improved their scores on the space and shape subscale. Four of the five countries with improved scores on the space and shape subscale also showed improvements on the change and relationships scale (Belgium, Czech Republic, Luxembourg, and Poland; Italy improved its score on the space and shape scale but not on the change and relationships scale).

Of the five countries that showed increases on the space and shape subscale, Belgium and the Czech Republic already outperformed the United States in 2000 and also improved their scores in 2003. Italy, despite its improvement in score, was not measurably different from the United States in either year. Poland, which was not measurably different from the United States in 2000, outscored the United States in 2003, and Luxembourg, which scored below the United States in 2000, also outscored the United States in 2003.

Two countries (Mexico and Iceland) showed decreased scores from 2000 to 2003 on the space and shape scale. Despite these decreases in performance, there was no change in the relative position of either country compared to the United States: that is, Iceland outperformed the United States in both 2000 and 2003 on the space and shape subscale, and Mexico performed worse than the United States in both years.

Of the other 25 OECD countries, 11 improved their scores from 2000 to 2003 on the change and relationships subscale, while no country showed a decrease. Of the 11 countries that improved from 2000 to 2003, several already outperformed the United States in 2000: Belgium, Canada, Denmark, Finland, and Korea all scored higher than the United States in 2000 on the change and relationships subscale. Several other countries were not measurably different from the United States in 2000 but outperformed the United States in 2003 (Czech Republic, Germany, Hungary). Three countries (Luxembourg, Poland, and Spain) had lower scores than the United States in 2000 on the change and relationships subscale but were not measurably different from the United States in 2003. Portugal, despite its improvement in score, still scored lower than the United States in both 2000 and 2003.

⁶For more information on scaling, see the technical notes in appendix A.
U.S. Performance in Problem Solving

As noted, one of PISA's major goals is to assess skills that cut across traditional curricular areas. In 2003, PISA assessed students' abilities in problem solving.⁷

Problem solving is defined as:

...an individual's capacity to use cognitive processes to confront and resolve real, cross-disciplinary situations where the solution is not immediately obvious, and where the literacy domains or curricular areas that might be applicable are not within a single domain of mathematics, science, or reading. (OECD 2003, p. 156)

Students completed exercises that assessed their capabilities in using reasoning processes not only to draw conclusions but also to make decisions, to troubleshoot (i.e., to understand the reasons for the malfunctioning of a system or device), or to analyze the procedures and structures of a complex system (such as a simple kind of programming language). Problem-solving items required students to apply various reasoning processes, such as inductive and deductive reasoning, reasoning about causes and effects, or combinatorial reasoning (i.e., systematically comparing all the possible variations that can occur in a well-described situation). Students were also assessed on their skills in working toward a solution and communicating the solution to others through appropriate representations. Sample problem-solving items and student responses are shown here.
⁷PISA 2003's problem-solving assessment focused explicitly on problem-solving skills, using a variety of contexts, disciplines, and problem types. The items used to measure problem solving in PISA 2003 were different from other items, such as those measuring mathematics literacy. Problem solving can also be embedded within measures of content areas such as mathematics or science, however. TIMSS 2003, for example, incorporated an explicit aspect of problem solving and inquiry into the description of desired outcomes for mathematics and science. A review of mathematics and science items in PISA and TIMSS showed that 38 percent of eighth-grade TIMSS 2003 mathematics items and 48 percent of PISA 2003 mathematics literacy items measured some aspect of problem solving; additionally, 26 percent of eighth-grade TIMSS 2003 science items and 49 percent of PISA science literacy items measured problem-solving skills (Dossey, O'Sullivan, and McCrone forthcoming).
For more information about the problem-solving framework, please refer to The PISA 2003 Assessment Framework: Mathematics, Reading, Science, and Problem Solving Knowledge and Skills (OECD 2003). Additional released problem-solving items can be found at http://nces.ed.gov/surveys/pisa.
¹Design by Numbers was developed by the Aesthetics and Computation Group at the MIT Media Laboratory. Copyright 1999, Massachusetts Institute of Technology. The program can be downloaded from http://dbn.media.mit.edu.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Problem-solving scores are reported on a scale with a mean of 500 and a standard deviation of 100. Fifteen-year-old students in the United States had an average score of 477 on the problem-solving scale, lower than the OECD average score of 500 (table 3, table B-12). U.S. students scored lower in problem solving than their peers in 25 of the other 38 countries (22 OECD and 3 non-OECD countries). Eight countries (3 OECD—Greece, Mexico, and Turkey—and 5 non-OECD countries) scored lower than the United States in problem solving. Three OECD country scores (and two non-OECD country scores) were not measurably different from the U.S. average score in problem solving.

On average, U.S. high achievers in problem solving (those scoring in the top 10 percent in the United States) were outperformed by their OECD counterparts (figure 7, table B-13). To be in the top 10 percent of students in the United States, students needed a score of at least 604; by comparison, they needed a score of 446 or better in Tunisia but 675 or better in Japan. Low performers in the United States (those in the bottom 10 percent) scored 347 or lower, which was lower than the cutoff score of 368 or lower for the OECD average. There was approximately a 256 point score difference, or about two and a half standard deviations, between the cutoff scores for the top 10 percent (604) and the bottom 10 percent (347) of 15-year-old students for problem solving in the United States.
Table 3. Average scores of 15-year-old students on the problem-solving scale, by country: 2003
Non-OECD countries:
Hong Kong-China 548
Macao-China 532
Liechtenstein 529
Latvia 483
Russian Federation 479
Thailand 425
Serbia and Montenegro 420
Uruguay 411
Indonesia 361
Tunisia 345

Average is significantly higher than the U.S. average
Average is not significantly different than the U.S. average
Average is significantly lower than the U.S. average
NOTE: Statistical comparisons between the U.S. average and the Organization for Economic Cooperation and Development (OECD) average take into account the contribution of the U.S. average toward the OECD average. The OECD average is the average of the national averages of the OECD member countries with data available. Because the Program for International Student Assessment (PISA) is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. Due to low response rates, data for the United Kingdom are not discussed in this report.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Figure 7. Distribution of problem-solving scores of 15-year-old students, by country: 2003

[Figure: for the OECD average and each participating country (OECD and non-OECD shown separately), the mean score with its 95 percent confidence interval (+/- 2 standard errors) and the 10th, 25th, 75th, and 90th percentiles of performance.]

NOTE: The Organization for Economic Cooperation and Development (OECD) average is the average of the national averages of the OECD member countries with data available. Because the Program for International Student Assessment (PISA) is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. Due to low response rates, data for the United Kingdom are not discussed in this report.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Along with scale scores, PISA 2003 also uses three proficiency levels (levels 1 through 3, with level 3 being the highest level of proficiency) to describe student performance in problem solving. An additional level (below level 1) encompasses students whose skills cannot be described using these proficiency levels (exhibit 9). The proficiency levels describe what students at each level can do and allow comparisons of the percentages of students in each country who performed at different levels in problem solving (see appendix A for more information about how levels were set).

Of the 38 other participating countries, 22 countries (including 16 OECD countries) had less variation (as measured by standard deviation) in performance in problem solving than the United States, while 3 countries (Belgium, Japan, and Uruguay) showed greater variation in performance (table B-14). The U.S. variation in performance was not measurably different from the OECD average variation.
Exhibit 9. Description of proficiency levels for problem solving: 2003
Proficiency level Task descriptionsLevel 1 At Level 1 students can solve problems where they have to deal with a single data
source containing discrete, well-defined information. They understand the nature of aproblem and consistently locate and retrieve information related to the major fea-tures of the problem. Level 1 students may be able to transform the information in theproblem to present the problem differently (e.g., take information from a table to cre-ate a drawing or graph). Also, students may be able to apply information to check alimited number of well-defined conditions within the problem. However, Level 1 stu-dents are generally incapable of dealing with multi-faceted problems involving morethan one data source or requiring the student to reason with the information provided.
Level 2   At Level 2 students use reasoning and analytic processes and solve problems requiring decision-making skills. Level 2 students apply various types of reasoning (inductive and deductive reasoning, reasoning about causes and effects, or combinatorial reasoning, that is, systematically comparing all possible variations in well-described situations) to analyze situations and to solve problems that require students to make a decision among well-defined alternatives. To analyze a system or make decisions, Level 2 students combine and synthesize information from a variety of sources. Students may need to combine various forms of representations (e.g., a formalized language, numerical information, and graphical information), handle unfamiliar representations (e.g., statements in a proto-programming language or flow diagrams related to a mechanical or structural arrangement of components), or draw inferences based on two or more sources of information.
Level 3   At Level 3 students not only analyze a system and make decisions, but also represent the underlying relationships in a problem and relate these to the solution. Level 3 students approach problems systematically, construct their own representations, and verify that their solution satisfies all requirements of the problem. These students communicate their solutions to others using written statements and other representations.
NOTE: In order to reach a particular proficiency level, a student must have been able to correctly answer a majority of items at that level. Students were classified into problem-solving levels according to their scores. Exact cut point scores are as follows: below level 1 (a score less than or equal to 404.06); level 1 (a score greater than 404.06 and less than or equal to 498.08); level 2 (a score greater than 498.08 and less than or equal to 592.10); level 3 (a score greater than 592.10).
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
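The cut points listed in the note can be written as a small classification rule (a sketch for illustration; the function name is ours, and PISA itself assigns levels during scaling rather than with code like this):

```python
def problem_solving_level(score: float) -> str:
    """Classify a PISA 2003 problem-solving score into a proficiency
    level using the cut points given in exhibit 9."""
    if score <= 404.06:
        return "below level 1"
    elif score <= 498.08:
        return "level 1"
    elif score <= 592.10:
        return "level 2"
    else:
        return "level 3"
```

Applied to the figures discussed later in this section, the U.S. average of 477 falls at level 1 and the U.S. high-performer cutoff of 604 at level 3.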
The U.S. average score of 477 on the problem-solving scale placed it at level 1, while the OECD average score was at level 2 (table B-12, exhibit 9). The cutoff score of 604 for U.S. high performers (those in the top 10 percent in the United States) equated to a level 3 score, while the U.S. cutoff score of 347 for low performers (those in the bottom 10 percent) was below level 1 (table B-13, exhibit 9).
Twenty-four percent of U.S. students scored below level 1, 34 percent at level 1, 30 percent at level 2, and 12 percent at level 3 (figure 8, table B-15). The United States had greater percentages of students below level 1 and at level 1 than the OECD average percentages. The United States also had a lower percentage of students at levels 2 and 3 than the OECD average percentages. Four countries (Finland, Hong Kong-China, Japan, and Korea) had 30 percent or more of their students performing at level 3 in problem solving, compared with 12 percent for the United States and 18 percent for the OECD average.
Figure 8. Percentage distribution of 15-year-old students in the OECD countries and the United States on the problem-solving scale, by proficiency level: 2003

Country          Below level 1   Level 1   Level 2   Level 3
United States    24              34        30        12
OECD average     17              30        34        18

NOTE: In order to reach a particular proficiency level, a student must have been able to correctly answer a majority of items at that level. Students were classified into problem-solving levels according to their scores. Exact cut point scores are as follows: below level 1 (a score less than or equal to 404.06); level 1 (a score greater than 404.06 and less than or equal to 498.08); level 2 (a score greater than 498.08 and less than or equal to 592.10); level 3 (a score greater than 592.10). The Organization for Economic Cooperation and Development (OECD) average is the average of the national averages of the OECD member countries with data available. Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
U.S. Performance in Reading Literacy and Science Literacy

Of the 41 countries that participated in PISA 2003, 32 also participated in PISA 2000. Changes in reading literacy and science literacy are reported for 29 of these 32 countries.8

In 2003, the average U.S. score in reading literacy was 495, not measurably different from the OECD average of 494 (figure 9, table B-16). Eleven countries (including 9 OECD countries) among the other 38 countries outperformed the United States in reading literacy in 2003.

There was no measurable change in either the U.S. reading literacy score from 2000 to 2003 or the U.S. position compared to the OECD average, although scores in 12 other countries did change (table B-16).9 Four countries saw their average reading literacy scores increase (two non-OECD countries, Latvia and Liechtenstein, and two OECD countries, Luxembourg and Poland). The United States outperformed all four of these countries in 2000; in 2003, scores for Latvia and Poland were not measurably different from the U.S. scores in reading literacy, while Liechtenstein outscored the United States in 2003. Despite an increase in Luxembourg's average reading literacy score, the United States outperformed it in 2000 and 2003.

8 Due to low response rates, data for the Netherlands were not discussed for PISA 2000; data for PISA 2003 for the United Kingdom are also not discussed due to low response rates; data for Brazil were not available at the time of production for this report.
9 Large standard errors for the United States in 2000 may account at least in part for the fact that U.S. reading literacy and science literacy scores were not measurably different from 2000 to 2003 and that the scores were not different from the OECD averages in 2000.
Figure 9. Average reading literacy and science literacy scores of 15-year-old students in the OECD countries and the United States: 2003

                 Reading literacy   Science literacy
OECD average     494                500
United States    495                491*

* Average is significantly different from OECD average.
NOTE: The Organization for Economic Cooperation and Development (OECD) average is the average of the national averages of the OECD member countries with data available.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Eight countries' scores (including seven OECD countries) were lower in 2003 than in 2000 in reading literacy. Decreases in two of these eight countries' scores resulted in a change relative to the United States. Japan, which outperformed the United States in reading literacy in 2000, was not measurably different in 2003, while Spain, which did not perform measurably differently in 2000, performed worse than the United States in 2003.
In 2003, the U.S. average score in science literacy was 491, lower than the OECD average of 500 (figure 9, table B-17). Eighteen countries (including 15 OECD countries) outscored the United States in science in 2003.
There was no measurable difference between the U.S. average science literacy score of 499 in 2000 and 491 in 2003, although the relative position of the United States compared to the OECD average did change (the U.S. science literacy score in 2000 was not measurably different from the OECD average, while in 2003 the U.S. score was below the OECD average). Seventeen countries showed changes in their scores from 2000 to 2003: 5 countries (all OECD countries) had lower scores in 2003 than in 2000, and 12 countries (including 9 OECD countries) had higher scores (table B-17). The OECD average score in science literacy was 500 in 2000 and 2003.
Of the 12 countries whose science literacy scores improved between 2000 and 2003, 8 also improved their performance relative to the United States. Belgium, the Czech Republic, France, Germany, Liechtenstein, and Switzerland did not perform differently from the United States in 2000 but outscored the United States in 2003. Latvia and the Russian Federation scored below the U.S. average in 2000 but were not measurably different in 2003. Of the five countries whose science literacy scores decreased between 2000 and 2003, two (Canada and Korea) continued to outperform the United States, one (Norway) was not measurably different in either year, one (Mexico) performed below the U.S. average in both years, and one (Austria) went from outscoring the United States to not being measurably different from the United States.
Differences in Performance by Selected Student Characteristics

This section provides information about how students with various characteristics (males and females, students of different races and from different socioeconomic backgrounds) performed on PISA 2003. Because PISA 2003's emphasis was on mathematics literacy and problem solving, the focus in this section is on performance in these areas.10 This report does not address possible changes in performance for these groups from 2000 to 2003.

When considering these results, it is important to bear in mind that there need not be a cause-and-effect relationship between being a member of a group and achievement in PISA 2003. Student performance can be affected by a complex mix of educational and other factors that are not examined here.

Sex

Fifteen-year-old females in the United States scored 480 on the combined mathematics literacy scale, which was lower than the average male score of 486 (figure 10, table B-18). Males also outperformed females in 25 other countries (20 OECD countries and 5 non-OECD countries), a pattern evident in the OECD average scores of 494 for females and 506 for males. Iceland was the only country in which females scored higher in mathematics literacy than males.

Within the United States, greater percentages of male students performed at level 6 (the highest level) than female students in mathematics literacy, but larger percentages of females were not seen at lower levels (below level 1 and levels 1 through 5, table B-19). In other words, differences in the overall scores between males and females in the United States were due at least in part to the fact that a greater percentage of males were found among the highest performers, not to a greater percentage of females found among the lowest performers.

On average across the OECD countries, males outperformed females on each of the four mathematics literacy subscales (table B-20). In the United States, differences between males and females were evident only on the space and shape subscale.

In the majority of the PISA 2003 countries (32 out of 39 countries), including the United States, there were no measurable differences in problem-solving scores by sex (figure 10, table B-21). However, females outscored their male peers in problem solving in six of the remaining seven participating countries (including four OECD countries), as well as at the OECD average. Males outscored females in problem solving in Macao-China.

As in 2000, females in the United States and nearly every other participating country outscored males in reading literacy in 2003 (table B-22). Only Liechtenstein showed no statistical difference between males and females in 2003, although there was a difference in favor of females in 2000.

There was no measurable difference between the performance of U.S. males and females in science literacy in PISA 2000 or PISA 2003, and scores for neither group changed between 2000 and 2003. Thirteen countries showed differences between males and females in 2003 (12 OECD countries and the Russian Federation). Eleven of the 13 countries showed differences in favor of males, but in Finland and Iceland females outperformed males.

10 Information on performance in reading literacy and science literacy by sex and race/ethnicity is provided, however.
Figure 10. Differences in average scores of 15-year-old students on the combined mathematics literacy scale and in problem solving, by sex and country: 2003

[Figure shows, for the OECD average and for each OECD and non-OECD country, the female-male average score difference on the combined mathematics literacy scale and on the problem-solving scale, and flags whether each difference is statistically significant.]

NOTE: Each bar represents the average score difference between males and females on combined mathematics literacy and problem solving. The Organization for Economic Cooperation and Development (OECD) average is the average of the national averages of the OECD member countries with data available. Because the Program for International Student Assessment (PISA) is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. Due to low response rates, data for the United Kingdom are not discussed in this report.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Socioeconomic Status

The measure of student socioeconomic status (SES) used in PISA 2003 is based on the occupational status of the student's father or mother (whichever was higher) as reported by the student. Parental occupation was coded based on the International Standard Classification of Occupations (ISCO) (International Labor Organization 1990). Occupational codes were in turn mapped onto an internationally comparable index of occupational status, the International Socioeconomic Index (ISEI), developed by Ganzeboom, De Graaf, and Treiman (1992). Using the index, students were assigned numbers ranging from about 16 to 90 based on their parents' occupations, so that they were arrayed on a continuum from low to high socioeconomic status, rather than placed into discrete categories. Typical occupations among parents of 15-year-olds with between 16 and 35 points on the ISEI scale include small-scale farmer, metalworker, mechanic, taxi or truck driver, and waiter/waitress. Between 35 and 53 index points, the most common occupations are bookkeeping, sales, small business management, and nursing. As the required skills increase, so does the status of the occupation. Between 54 and 70 points, typical occupations are marketing management, teaching, civil engineering, and accountant. Finally, between 71 and 90 points, the top international quarter of the index, occupations include medicine, university teaching, and law (OECD 2001).
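The occupational bands quoted from OECD (2001) can be collapsed into a simple lookup (a sketch for illustration only; the function name is ours, and the exact band boundaries follow the ranges in the text above):

```python
def isei_band(value: float) -> str:
    """Map an ISEI index value (roughly 16 to 90) to the typical
    occupations quoted for that range in OECD (2001)."""
    if value <= 35:
        return "e.g., small-scale farmer, metalworker, mechanic, taxi or truck driver, waiter/waitress"
    if value <= 53:
        return "e.g., bookkeeping, sales, small business management, nursing"
    if value <= 70:
        return "e.g., marketing management, teaching, civil engineering, accountant"
    return "e.g., medicine, university teaching, law"
```

For instance, an index value of 33 (the U.S. low-ISEI average reported below) falls in the lowest band, while the U.S. average of 55 falls in the third.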
The average ISEI index score for the United States in 2003 was 55, higher than that of all but two countries (Norway and Iceland) (table B-23). Low ISEI students in the United States were also comparatively better off in terms of socioeconomic status than most of their OECD peers. U.S. students with low ISEI (those in the bottom 25 percent in the United States) had an average index value of 33, which was higher than the index values for low ISEI students in 35 of the other 38 PISA 2003 countries (including 25 OECD countries). Two countries (Japan and Norway) reported higher average index values for low ISEI students compared to the United States.
Within the United States, students with low ISEI values were outperformed in mathematics literacy by their peers with higher ISEI values (table B-24). Moreover, U.S. students with low ISEI values were outperformed by their peers with low ISEI values in 22 of the 39 PISA 2003 countries (including 18 OECD countries) for mathematics literacy. Students with the highest ISEI background in the United States (those in the top quarter) were outperformed by high ISEI students from 20 other countries (including 19 OECD countries) in mathematics literacy.
The overall linkage of ISEI to mathematics literacy and problem solving can be examined by the specific change in score on the combined mathematics literacy scale in response to a one standard deviation change in the ISEI index score for each country. A greater increase in the average achievement score in a country implies a stronger relationship between socioeconomic status and performance in that country.
For example, in the United States, a one standard deviation change in the ISEI index was associated with an average difference of 30 points on the combined mathematics literacy scale and 31 points on the problem-solving scale (table B-25). In Macao-China, socioeconomic background differences in achievement were at a minimum: one standard deviation's difference on the ISEI index was associated with a 10 point difference on the combined mathematics literacy scale and a 12 point difference on the problem-solving scale. By contrast, among students in Hungary, a one standard deviation change in ISEI score was associated with about a 41 point difference in both mathematics literacy and problem-solving achievement scores. Twelve countries (including six OECD countries) had a weaker relationship between ISEI and problem-solving performance than the United States, while three countries (Belgium, Germany, and Hungary) had a stronger one. Belgium, Germany, and Hungary also had stronger relationships between ISEI and mathematical literacy than
the United States, as did the Czech Republic and Poland. Eleven countries (including 6 OECD countries) had weaker relationships.
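The "score difference per one standard deviation of ISEI" statistic can be illustrated with an ordinary least-squares slope on made-up data (a minimal sketch; PISA's own estimates use plausible values and survey weights, which are not reproduced here, and the sample data below are illustrative only):

```python
import statistics

def points_per_sd(isei: list[float], scores: list[float]) -> float:
    """Slope of achievement score on ISEI, rescaled to points of
    score per one standard deviation of the ISEI index."""
    n = len(isei)
    mx, my = statistics.fmean(isei), statistics.fmean(scores)
    # sample covariance and variance of ISEI
    cov = sum((x - mx) * (y - my) for x, y in zip(isei, scores)) / (n - 1)
    slope = cov / statistics.variance(isei)   # points per ISEI point
    return slope * statistics.stdev(isei)     # points per SD of ISEI

# Illustrative data: ISEI values and literacy scores for seven students
isei = [20, 30, 40, 50, 60, 70, 80]
scores = [440, 455, 470, 485, 500, 515, 530]
```

With these fabricated data the statistic comes out near 32 points per standard deviation, comparable in magnitude to the U.S. figures of 30 and 31 points cited above.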
Race/Ethnicity

Racial and ethnic groups vary between countries, so it is not possible to compare their performance across countries on international assessments. Thus, this section refers only to 2003 findings for the United States. Throughout this section, "White" refers to White, non-Hispanic students; "Black" to Black, non-Hispanic students; "Asian" to Asian, non-Hispanic students; and "Hispanic" to Hispanic students of any race. Results for two groups (American Indian or Alaska Native and Hawaiian or Other Pacific Islander) are not shown separately because small sample sizes did not allow for accurate estimates.
In both mathematics literacy and problem solving, Blacks and Hispanics scored lower, on average, than Whites, Asians, and students of more than one race (figure 11, table B-26). Hispanic students, in turn, outscored Black students. This pattern of performance on PISA 2003 by race/ethnicity is similar to that found in PISA 2000 and on the National Assessment of Educational Progress (NAEP) (Braswell, Daane, and Grigg 2003; Lemke et al. 2001).
In both mathematics literacy and problem solving, the average scores for Blacks and Hispanics were below the respective OECD average scores, while scores for Whites were above the OECD average scores. Students who were White, Asian, and of more than one race scored at level 3 in mathematics literacy, compared to level 2 for Hispanic students and level 1 for Black students (figure 11, exhibit 5). In problem solving, average scores for Whites and Asians placed them in level 2, while Black, Hispanic, and students of more than one race scored at level 1 (figure 11, exhibit 9).
Figure 11. Average scores of U.S. 15-year-old students on the combined mathematics literacy scale and in problem solving, by race/ethnicity: 2003

                       Mathematics literacy   Problem solving
OECD average           500                    500
White                  512*                   506*
Asian                  506                    505
More than one race     502                    498
Hispanic               443*                   436*
Black                  417*                   413*

* Average is significantly different from OECD average.
NOTE: Reporting standards not met for American Indian/Alaska Native and Native Hawaiian/Other Pacific Islander. Black includes African American and Hispanic includes Latino. Racial categories exclude Hispanic origin.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
For Further Information

This report provides selected findings from PISA 2003 from a U.S. perspective. Readers may be interested in exploring other aspects of PISA's results. Additional findings are presented in the OECD report on PISA 2003, and further results will be published in a series of OECD thematic reports on PISA 2003. Data with which researchers can conduct their own analyses are also available at http://www.pisa.oecd.org.
References
Adams, R. (Ed.). (2002). PISA 2000 Technical Report. Paris: Organization for Economic Cooperation and Development.

Adams, R. (Ed.). (forthcoming). PISA 2003 Technical Report. Paris: Organization for Economic Cooperation and Development.

Binkley, M., Afflerbach, P., and Kelly, D. (forthcoming). A Content Comparison of the NAEP and PISA Reading Assessments (NCES 2005–110). U.S. Department of Education. Washington, DC: National Center for Education Statistics.

Binkley, M., and Kelly, D. (2003). A Content Comparison of the NAEP and PIRLS Fourth-Grade Reading Assessments (NCES 2003–10). U.S. Department of Education. Washington, DC: National Center for Education Statistics Working Paper.

Braswell, J.S., Daane, M.C., and Grigg, W.S. (2003). The Nation's Report Card: Mathematics Highlights 2003 (NCES 2004–451). U.S. Department of Education, National Center for Education Statistics. Washington, DC: U.S. Government Printing Office.

Dossey, J., O'Sullivan, C., and McCrone, S. (forthcoming). Problem Solving in International Comparative Assessments (NCES 2005–107). U.S. Department of Education. Washington, DC: National Center for Education Statistics.

Ferraro, D., Czuprynski, J., and Williams, T. (forthcoming). U.S. 2003 PISA Nonresponse Bias Analysis (NCES 2005–102). U.S. Department of Education. Washington, DC: National Center for Education Statistics.

Ganzeboom, H.B.G., De Graaf, P., and Treiman, D.J. (with De Leeuw, J.). (1992). A Standard International Socio-Economic Index of Occupational Status. Social Science Research, 21(1): 1-56.

Ganzeboom, H.B.G., and Treiman, D.J. (1996). Internationally Comparable Measures of Occupational Status for the 1988 International Standard Classification of Occupations. Social Science Research, 25(3): 201-239.

International Labor Organization. (1990). ISCO-88: International Standard Classification of Occupations. Geneva: Author.

Lemke, M., Calsyn, C., Lippman, L., Jocelyn, L., Kastberg, D., Liu, Y.Y., Roey, S., Williams, T., Kruger, T., and Bairu, G. (2001). Outcomes of Learning: Results From the 2000 Program for International Student Assessment of 15-Year-Olds in Reading, Mathematics, and Science Literacy (NCES 2002–115). U.S. Department of Education, National Center for Education Statistics. Washington, DC: U.S. Government Printing Office.

Neidorf, T.S., Binkley, M., Gattis, K., and Nohara, D. (forthcoming). A Content Comparison of the National Assessment of Educational Progress (NAEP), Trends in International Mathematics and Science Study (TIMSS), and Program for International Student Assessment (PISA) 2003 Mathematics Assessments (NCES 2005–112). U.S. Department of Education. Washington, DC: National Center for Education Statistics.

Neidorf, T.S., Binkley, M., and Stephens, M. (forthcoming). A Content Comparison of the National Assessment of Educational Progress (NAEP) 2000 and Trends in International Mathematics and Science Study (TIMSS) 2003 Science Assessments (NCES 2005–106). U.S. Department of Education. Washington, DC: National Center for Education Statistics.

Nohara, D. (2001). A Comparison of the National Assessment of Educational Progress (NAEP), the Third International Mathematics and Science Study-Repeat (TIMSS-R), and the Programme for International Student Assessment (PISA) (NCES 2001–07). U.S. Department of Education. Washington, DC: National Center for Education Statistics Working Paper.

Organization for Economic Cooperation and Development (OECD). (1999). Measuring Student Knowledge and Skills: A New Framework for Assessment. Paris: Author.

Organization for Economic Cooperation and Development (OECD). (2000). Measuring Student Knowledge and Skills: The PISA 2000 Assessment of Reading, Mathematical and Scientific Literacy. Paris: Author.

Organization for Economic Cooperation and Development (OECD). (2001). Knowledge and Skills for Life: First Results From the OECD Programme for International Student Assessment. Paris: Author.

Organization for Economic Cooperation and Development (OECD). (2002). Sample Tasks From the PISA 2000 Assessment: Reading, Mathematical and Scientific Literacy. Paris: Author.

Organization for Economic Cooperation and Development (OECD). (2003). The PISA 2003 Assessment Framework: Mathematics, Reading, Science and Problem Solving Knowledge and Skills. Paris: Author.

Organization for Economic Cooperation and Development (OECD). (2004). Learning for Tomorrow's World: First Results From PISA 2003. Paris: Author.

Organization for Economic Cooperation and Development (OECD). (2004). Problem Solving for Tomorrow's World: First Measures of Cross-Curricular Skills From PISA 2003. Paris: Author.

Raudenbush, S.W., and Bryk, A.S. (2002). Hierarchical Linear Models: Applications and Data Analysis Methods (2nd ed.). Newbury Park, CA: Sage Publications.

Steen, L.A. (Ed.). (1990). On the Shoulders of Giants: New Approaches to Numeracy. Washington, DC: National Academy Press.

Wilson, M., and Xie, Y. (2004). Report on the Imperial Versus Metric Study. BEAR Research Report. Berkeley, CA: University of California. Available at http://bear.berkeley.edu/pub.html.
Appendix A: Technical Notes
The Program for International Student Assessment (PISA) is a system of international assessments that measures 15-year-olds' capabilities in reading literacy, mathematics literacy, and science literacy every three years. PISA was first implemented in 2000 and is carried out by the Organization for Economic Cooperation and Development (OECD). In addition to the major subject areas, PISA also measures general or cross-curricular competencies such as learning strategies. In this second cycle, PISA 2003, mathematics literacy was the major focus, along with the new cross-curricular cognitive domain of problem solving. This appendix describes features of the PISA 2003 survey methodology, including sample design, test design, scoring, data reliability, and analysis variables. For further details about the assessment and any of the topics discussed here, see the OECD's PISA 2003 Technical Report (Adams forthcoming) and the PISA 2000 Technical Report (Adams 2002).
Sampling, Data Collection, and Response Rate Requirements

To provide valid estimates of student achievement and characteristics, the sample of PISA students had to be selected in a way that represented the full population of 15-year-old students in each country. The international desired population in each country consisted of 15-year-olds attending both publicly and privately controlled educational institutions in grades 7 and higher. A minimum of 4,500 students from a minimum of 150 schools was required. Within schools, a sample of 35 students was to be selected in an equal probability sample unless fewer than 35 students aged 15 were available (in which case all students were selected). International standards required that students be sampled based on an age definition of 15 years and 3 months to 16 years and 2 months at the beginning of the testing period. The testing period was required not to exceed 42 days between March 1, 2003, and August 31, 2003. Each country collected its own data, following international guidelines and specifications.
A minimum response rate target of 85 percent was required for initially selected educational institutions. In instances in which the initial response rate of educational institutions was between 65 and 85 percent, an acceptable school response rate could still be achieved through the use of replacement schools. Replacement schools were to be selected at the time of sample selection.
Three school response rate zones (acceptable, intermediate, and not acceptable) were defined (figure A-1). "Acceptable" meant that the country's data would be included in all international comparisons. "Not acceptable" meant that the country's data would be a candidate for not being reported in international comparisons unless considerable evidence was presented that nonresponse bias was minor. "Intermediate" meant that a decision on whether or not to include the country's data in comparisons would be made while taking into account a variety of factors, such as student response rates, quality control, and closeness of the response rates to the acceptable level. For the purposes of calculating response rates, schools with less than 50 percent of students responding were considered nonresponding, and their students were excluded from the student response rates. If the student response rates within such schools were at least 25 percent, these schools and students were included in the PISA 2003 database. Schools with student response rates below 25 percent were not used in any type of analysis, nor are the data for these students or schools available in the PISA 2003 database. Note that schools with student response rates above 25 percent were included in the nonresponse bias analyses described in this report.
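The school-level cutoffs in this paragraph amount to two thresholds, sketched here as a small helper (the names are illustrative, not PISA Consortium code; rates are proportions in [0, 1]):

```python
def school_status(student_response_rate: float) -> dict:
    """Apply the PISA 2003 school-level rules described above."""
    return {
        # schools under 50% student response count as nonresponding
        # when school response rates are calculated
        "counts_as_responding": student_response_rate >= 0.50,
        # schools at or above 25% are kept in the PISA 2003 database
        "in_database": student_response_rate >= 0.25,
    }
```

So a school with a 30 percent student response rate would lower its country's school response rate but still appear in the database, while a school at 20 percent would be dropped entirely.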
PISA 2003 also required a minimum participation rate of 80 percent of sampled students from original and replacement schools within each country. A student was considered to be a participant if he or she participated in the first testing session or a follow-up or makeup testing session.

Exclusion guidelines allowed for 0.5 percent at the school level for approved reasons (for example, remote regions or very small schools), and 2 percent for special education schools. Overall estimated student exclusions were to be under 5 percent. PISA's intent was to be as inclusive as possible. No accommodations were offered in the United States for PISA. A special one-hour booklet with lower difficulty items, which was scaled with the regular PISA booklets, was used in six countries for schools that would otherwise have been excluded. Special booklets were used in Austria, Belgium, the Czech Republic, Hungary, the Netherlands, and the Slovak Republic. Within schools, exclusion decisions were made by staff members who were knowledgeable about students with Individualized Education Programs (IEPs) or students who were limited English proficient, using the following international guidelines on possible student exclusions:

• Functionally disabled students. These were students who were permanently physically disabled in such a way that they could not perform in the testing situation. Functionally disabled students who could respond were to be included in the testing. Any sampled
Figure A-1. School response rate requirements for PISA 2003

[Figure plots the school response rate before replacement (55 to 100 percent) against the rate after replacement (55 to 100 percent) and marks the regions defined as acceptable, intermediate, and not acceptable.]

NOTE: A minimum response rate target of 85 percent was required for initially selected educational institutions. In instances in which the initial response rate of educational institutions was between 65 and 85 percent, an acceptable school response rate could still be achieved through the use of replacement schools.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
PISA 2003 Results From the U.S. Perspective
11The Northeast region consists of Connecticut, Delaware, District of Columbia, Maine, Maryland, Massachusetts, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, and Vermont. The Central region consists of Illinois, Indiana, Iowa, Kansas, Michigan, Minnesota, Missouri, Nebraska, North Dakota, Ohio, Wisconsin, and South Dakota. The West region consists of Alaska, Arizona, California, Colorado, Hawaii, Idaho, Montana, Nevada, New Mexico, Oklahoma, Oregon, Texas, Utah, Washington, and Wyoming. The Southeast region consists of Alabama, Arkansas, Florida, Georgia, Kentucky, Louisiana, Mississippi, North Carolina, South Carolina, Tennessee, Virginia, and West Virginia.
student who was temporarily disabled such that s/he could not participate in the assessment was considered absent from the assessment.

• Students with mental or emotional disabilities. These were students who were considered in the professional opinion of the school principal or by other qualified staff members to be intellectually disabled or who had been psychologically tested as such. This included students who were emotionally or mentally unable to follow even the general instructions of the test. Students were not to be excluded solely because of poor academic performance or normal disciplinary problems.

• Students with limited proficiency in the test language. These were students who had received less than one year of instruction in the language of the test. Generally, these were students who were unable to read or speak the language of the test (English in the United States) and would be unable to overcome the language barrier in the test situation.
Quality monitors from the PISA Consortium visited schools in every country to ensure that testing procedures were carried out in a consistent manner across countries.
Sampling, Data Collection, and Response Rates in the United States

The 2003 PISA school sample was drawn for the United States in November 2002. The sample design for this school sample was developed to retain some of the properties of the 2000 PISA U.S. school sample, and to follow international requirements as given in the PISA sampling manual. Unlike the 2000 PISA sample, which had a three-stage design, the U.S. sample for 2003 was a two-stage sampling process, with the first stage a sample of schools and the second stage a sample of students within schools. For PISA in 2000, the U.S. school sample had the selection of a sample of geographic Primary Sampling Units (PSUs) as the first stage of selection. The sample was not clustered at the geographic level for PISA 2003. This change was made in an effort to reduce the design effects observed in the 2000 data and to spread the respondent burden across school districts as much as possible.

The sample design for PISA was a stratified systematic sample, with sampling probabilities proportional to measures of size. The PISA sample had no explicit stratification and no oversampling of subgroups. The frame was implicitly stratified (i.e., sorted for sampling) by five categorical stratification variables: grade span of the school (five levels), type of school (public or private), region of the country11 (Northeast, Central, West, Southeast), type of location relative to populous areas (eight levels), and minority status (above or below 15 percent). The last sort key within the implicit stratification was by estimated enrollment of 15-year-olds based on grade enrollments.
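The stratified systematic selection with probability proportional to size described above can be sketched in a few lines. This is an illustration only: the frame, measure-of-size values, and seed are hypothetical, and the operational selection used the full set of sort keys on the real frame.

```python
import random

def systematic_pps_sample(frame, n_schools, seed=12345):
    """Systematic selection with probability proportional to size (PPS).

    `frame` is a list of (school_id, measure_of_size) pairs, assumed to be
    sorted already by the implicit stratification keys (grade span, sector,
    region, locale, minority status, then estimated 15-year-old enrollment).
    """
    total_mos = sum(mos for _, mos in frame)
    interval = total_mos / n_schools              # skip interval on the size scale
    start = random.Random(seed).uniform(0, interval)
    targets = [start + k * interval for k in range(n_schools)]

    selected, cumulative, t = [], 0.0, 0
    for school_id, mos in frame:
        cumulative += mos                         # running total of size
        while t < n_schools and targets[t] < cumulative:
            selected.append(school_id)            # target falls in this school's span
            t += 1
    return selected

# Hypothetical frame of 10 schools with varying 15-year-old enrollment
frame = [(f"school_{i}", mos) for i, mos in
         enumerate([40, 120, 80, 200, 60, 150, 90, 30, 110, 70])]
sample = systematic_pps_sample(frame, n_schools=3)
```

Selecting schools proportional to size and then a fixed number of students within each school is what makes the overall design approximately self-weighting.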
At the same time that the PISA sample was selected, replacement schools were identified following the PISA guidelines by assigning the two schools neighboring the sampled school on the frame as replacements. There were several constraints on the assignment of substitutes. One sampled school was not allowed to substitute for another, and a given school could not be assigned to substitute for more than one sampled school. Furthermore, substitutes were required to be in the same implicit stratum as the sampled school. If the sampled school was the first or last school in the stratum, then the second school following or preceding the sampled school was identified as the substitute. One was designated a first replacement and the other a second replacement. If an original school refused to participate, the first replacement was then contacted. If that school also refused to participate, the second school was then contacted.

The U.S. PISA school sample consisted of 420 schools. This number was increased from the international minimum requirement of 150 to offset school nonresponse, reduce design effects, and include additional students in a metric-imperial experiment (described below).

The schools were selected with probability proportionate to the school's estimated enrollment of 15-year-olds from the 2003 NAEP school frame with 2000–01 school data. The data for public schools were from the Common Core of Data (CCD), and the data for private schools were from the Private School Survey (PSS). Any school containing at least one 7th- through 12th-grade class as of the school year 2000–01 was included on the school sampling frame. Participating schools provided lists of 15-year-old students, and a sample of 35 students was selected within each school in an equal probability sample. The overall sample design for the United States was intended to approximate a self-weighting sample of students as much as possible, with each 15-year-old student having an equal probability of being selected.

In the United States, for a variety of reasons reported by school administrators (such as increased testing requirements at the national, state, and local levels, concerns about timing of the PISA assessment, and loss of learning time), many schools in the original sample declined to participate. As it was clear that the United States would not meet the minimum response rate standards, in order to improve response rates and better accommodate school schedules, a second testing window was opened from September to November 2003 with the agreement of the PISA Consortium. For the fall data collection, the school sample included only original schools from the sample that had refused to participate in the spring but indicated a willingness to participate in a fall assessment. Substitute schools were not included in the fall sample because their participation would have had little effect on raising the final response rate. In order to achieve a comparable sample of students in spring and fall, the age definition for students tested in the fall was adjusted such that all students tested were the same age.

Of the 420 sampled schools, 382 were eligible (some did not have any 15-year-olds enrolled) and 179 agreed to participate in the spring of 2003. An additional 70 original schools participated in the fall assessment for a total of 249 participating original schools. The school response rate (including spring and fall assessments) before replacement was 65 percent (weighted and unweighted), placing the United States in the "intermediate" response rate category. The weighted school response rate before replacement is given by the formula:

$$\text{weighted school response rate before replacement} = \frac{\sum_{i \in Y} W_i E_i}{\sum_{i \in (Y \cup N)} W_i E_i}$$

where Y denotes the set of responding original sample schools with age-eligible students, N denotes the set of eligible nonresponding original sample schools, W_i denotes the base weight for school i, W_i = 1/P_i, where P_i denotes the school selection probability for school i, and E_i denotes the enrollment size of age-eligible students, as indicated on the sampling frame.

In addition to the 249 participating original schools, 13 replacement schools also participated in the spring for a total of 262 participating schools.
A total of 7,598 students were sampled for the assessment. Of these students, 261 were deemed ineligible because of their enrolled grades, birthdays, or other reasons, and were removed from the sample. Of the eligible 7,337 sampled students, an additional 534 students were excluded using the criteria described above, for a weighted exclusion rate of 7 percent.

Of the 6,803 remaining sampled students, a total of 5,456 students participated in the assessment in the United States, but 114 of these came from schools with less than 50 percent student participation. Such schools were classified as school nonrespondents, and their students (114 participating students and 187 nonparticipating students) were therefore excluded for the purposes of calculating student response rates. Thus, although data for 5,456 students are included in the database, student response rates were calculated by subtracting the 114 students from the 5,456 for a total of 5,342 participating students. The denominator for the student response rate is 6,502, which consists of 7,598 sampled students minus the following students: 261 ineligible, 534 excluded, 114 responding students from nonresponding schools, and 187 nonresponding students from nonresponding schools. An overall weighted student response rate of 83 percent was achieved (82 percent unweighted).
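The bookkeeping above can be verified with a few lines of arithmetic (all counts taken directly from the text):

```python
# Counts reported for the U.S. PISA 2003 student sample
sampled = 7598
ineligible = 261                      # wrong grade, birthday, or other reasons
excluded = 534                        # exclusion criteria described earlier
in_nonresponding_schools = 114 + 187  # students in schools with <50% participation

eligible = sampled - ineligible                            # 7,337
after_exclusions = eligible - excluded                     # 6,803
denominator = after_exclusions - in_nonresponding_schools  # 6,502
participants = 5456 - 114                                  # 5,342

print(denominator, participants, round(100 * participants / denominator))  # 6502 5342 82
```

The unweighted rate reproduces the 82 percent reported above; the 83 percent figure is the weighted rate, which cannot be recomputed from these counts alone.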
Two separate bias analyses were conducted in the United States to address potential problems in the data due to school nonresponse and possible achievement differences between students in spring and fall testing windows.

The analysis of school nonresponse was conducted in two parts, examining first the original sample of schools (spring and fall participants) and then the final sample of schools (including replacements), treating as nonrespondents those schools from whom a final response was not received (Ferraro, Czuprynski, and Williams forthcoming). Schools with 25 to 49 percent student response rates were treated as respondents in the nonresponse bias analysis, since their data are included in the PISA database. Schools with student response rates less than 25 percent were treated as nonrespondents in the analysis and were not included in the PISA database.
In order to compare PISA respondents and nonrespondents, it was necessary to match the sample of schools back to the sample frame to detect as many characteristics as possible that might provide information about the presence of nonresponse bias. Comparing frame characteristics for respondents and nonrespondents is not always a good measure of nonresponse bias if the characteristics are unrelated or weakly related to more substantive items in the survey; however, this was the only approach available given that no comparable school or student level achievement data were available. Frame characteristics were taken from the 2000–01 Common Core of Data (CCD) for public schools and from the 2000–01 Private School Survey (PSS) for private schools. For categorical variables, response rates by characteristics were calculated. The hypothesis of independence between the characteristics and response status was tested using a Rao-Scott modified chi-square statistic. For continuous variables, summary means were calculated. The 95 percent confidence interval for the difference between the mean for respondents and the overall mean was tested to see whether or not it included zero. In addition to these tests, logistic regression models were set up to identify whether any of the frame characteristics were significant in predicting response status. All analyses were performed using WesVar and replicate weights to properly account for the complex sample design. The JK2 method was used to create the weights. The school base weights used in these analyses did not include a nonresponse adjustment factor. The base weight for each original school was the reciprocal of its selection probability. The base weight for each replacement school was equal to the base weight of the original school it replaced.

Characteristics available for public and private schools included: public/private affiliation, community type, region, number of age-eligible students enrolled, total number of students, and percentage of various racial/ethnic groups (percentage Asian or Pacific Islander; Black, non-Hispanic; Hispanic; American Indian or Alaska Native; White, non-Hispanic). Percentage of students eligible for free or reduced-price lunch was also available for public schools only (however, this variable was missing for 50 of the 359 public schools). For the original sample of schools, two of these variables showed a relationship to response status in tests of independence and in the multivariate logistic regression model: region (specifically, schools in the Central region were less likely to respond) and percentage of Asian or Pacific Islander students (responding schools had fewer of these students than the original sample schools). Using the same analytic procedure for the final sample (including replacement schools), tests of independence again showed that responding schools were more likely to be in the West. Responding schools were also more likely to have fewer Asian or Pacific Islander students and more Black, non-Hispanic students. However, the only variable found to be significant in the logistic regression model predicting response was the percentage of Asian or Pacific Islander students (again, responding schools were likely to have fewer Asian or Pacific Islander students).

While the implications of these analyses for the direction of any resulting bias in achievement are not entirely clear, an attempt was made to minimize any bias by incorporating the variables in question into the adjustment for school nonresponse that was a component of the sampling weights.

One other country, the United Kingdom, also fell below the acceptable range for school response rates, although response rate problems were largely limited to England (Scotland and Northern Ireland also participated). In that case, however, the PISA Consortium was unable to make adjustments for any potential bias, and data for the United Kingdom are therefore annotated and are not included in the main text or figures. Data for one additional participating PISA 2003 country, Brazil, were not available in time for production of this report.

The other U.S. bias analysis aimed to address the question of whether there was a "session" effect between students tested in the spring and fall, in order to provide evidence for the acceptability of combining data from both sessions for the United States. Despite PISA's focus on an age sample, concern remained that students tested at the beginning of the school year might perform worse than their peers tested at the end of the previous school year.

The approach taken was to investigate session effects in a multilevel model, since these were school-level effects: all students within a school were in either the spring or the fall session. Two similar two-level models were estimated. In each, student achievement in PISA was modeled as a function of various school characteristics (in particular those on the sample frame known to be related to willingness to participate in the original testing window, including public/private status, number of age-eligible students, region, and location) and time of testing (spring/fall) and, in one model, the student characteristic grade level. In the simpler of the student-level models, no predictors of achievement were included. In the second model, student grade level was included as a predictor of achievement to allow for the possibility that the school means predicted in the school-level model were affected by differences in the spring/fall distribution of students across grades. That is, the school-level model was predicting mean school achievement adjusted for grade-level differences. The two models proposed were estimated with HLM (Raudenbush and Bryk 2002). Neither model showed evidence of a statistically significant session effect. On this basis, and on the basis of the adjustments made to the sampling weights based on the nonresponse bias analysis, the PISA Consortium concluded that the data for the United States were adequate to generalize to the U.S. 15-year-old population and should be included in the international report and database.
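The kind of cell-based nonresponse adjustment folded into the sampling weights can be sketched as follows. This is a simplified illustration with hypothetical schools and adjustment cells, not the operational Westat procedure:

```python
from collections import defaultdict

# (adjustment cell, base weight W_i, age-eligible enrollment E_i, responded?)
schools = [
    ("public-NE", 20.0, 180, True),
    ("public-NE", 20.0, 150, False),
    ("public-NE", 22.0, 210, True),
    ("private-W", 35.0,  60, True),
    ("private-W", 35.0,  75, False),
]

eligible, responding = defaultdict(float), defaultdict(float)
for cell, w, e, ok in schools:
    eligible[cell] += w * e        # size-weighted total over all eligible schools
    if ok:
        responding[cell] += w * e  # ... and over the respondents only

# Within each cell, respondents' weights are inflated by this factor so they
# also represent the eligible schools that did not respond.
factor = {cell: eligible[cell] / responding[cell] for cell in eligible}

adjusted = {i: w * factor[cell]
            for i, (cell, w, e, ok) in enumerate(schools) if ok}
```

Incorporating the frame variables flagged by the bias analysis (such as percentage of Asian or Pacific Islander students) into the definition of the adjustment cells is what allows the adjustment to dampen the identified bias.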
Table A-1 provides summary information on the samples of all countries. A more detailed presentation can be found in the OECD's PISA 2003 Technical Report (Adams forthcoming).
Test Development

The development of the PISA 2003 assessment instruments was an interactive process among the PISA Consortium, various expert committees, and OECD members. The assessment was developed by international experts and PISA Consortium test developers, and items were reviewed by representatives of each country for possible bias and relevance to PISA's goals. The intention was to reflect the national, cultural, and linguistic variety among OECD countries. The assessments included material selected from among items submitted by participating countries as well as items that were developed by the Consortium's test developers.

The final assessment consisted of 85 mathematics items, 35 science items, 19 problem solving items, and 32 reading items allocated to 13 test booklets. In the United States, an additional 4 test booklets were included in PISA 2003 in order to investigate the possible effects of the use of metric units on U.S. student performance, for a total of 17 booklets (see description that follows). Each booklet was made up of 4 test clusters. There were 7 mathematics clusters (M1–M7), 2 science clusters (S1–S2), 2 problem solving clusters (P1–P2), and 2 reading clusters (R1–R2). The clusters were allocated in a rotated design to the 13 booklets. Each cluster contained approximately 12 test items, equivalent to 30 minutes of test material. Each student took one booklet, with about 2 hours' worth of testing material. Approximately one-third of the mathematics literacy items were multiple choice and complex multiple choice, one-third were closed or short response types in which students wrote an answer that was simply either correct or incorrect, and about one-third were open constructed responses for which students wrote answers that were marked by trained scorers based upon an international scoring guide. In PISA 2003, every student answered mathematics items. Problem solving, science, and reading items were spread throughout the other booklets. For more information on assessment design, see the OECD's PISA 2003 Technical Report (Adams forthcoming).
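A rotated cluster design of the sort described can be illustrated with a simple cyclic assignment. This is not the actual PISA 2003 booklet map (which, among other constraints, placed mathematics clusters in every booklet); it only shows how 13 clusters can be balanced across 13 booklets of 4 clusters each:

```python
# 13 clusters: 7 mathematics, 2 science, 2 reading, 2 problem solving
clusters = [f"M{i}" for i in range(1, 8)] + ["S1", "S2", "R1", "R2", "P1", "P2"]

# Booklet b takes 4 consecutive clusters, wrapping around the cluster list.
booklets = [[clusters[(b + pos) % 13] for pos in range(4)] for b in range(13)]
```

Under this rotation each cluster appears in exactly 4 of the 13 booklets, once in each of the 4 booklet positions, which balances both exposure and position effects.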
Table A-1. Coverage of target population, student and school samples, and participation rates in the Program for International Student Assessment (PISA), by country: 2003

United Kingdom5 768,180 90.9 94.6 5.4
See notes at end of table.

Table A-1. Coverage of target population, student and school samples, and participation rates in the Program for International Student Assessment (PISA), by country: 2003—Continued

United Kingdom5 64.3 77.4 77.9 361 9,535

1Fifteen-year-olds in primary school in Greece were originally excluded from the assessment. Changes in the target population definition to 15-year-olds in grades 7 and above required Greece to adjust its data to reflect the fact that 15-year-olds in primary school would no longer be considered part of the target population.
2Indonesia excluded 4 provinces and close to 5 percent of its eligible population for security reasons. There were 4,137,103 15-year-olds in the total population, but the 4 provinces were already excluded. Therefore, the 144,792 noted as being excluded in these provinces was added to this number to get 4,281,895 15-year-olds. The number of enrolled 15-year-olds was noted as 2,968,756, so 144,792 was also added to this.
3Serbia and Montenegro excluded Kosovo; however, there were no estimates for the number of 15-year-olds, so this does not appear as an exclusion.
4Tunisia noted late in the process that one French school needed to be excluded because of French (rather than Arabic) language. The school had 33 eligible students.
5Due to low response rates, data for the United Kingdom are not discussed in this report.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.

In order to examine similarities and differences between national and international assessments, NCES has sponsored a number of comparative studies of assessment frameworks and items. In October 2003 a study of the NAEP, TIMSS, and PISA 2003 mathematics assessments was undertaken. The aim of the study was to provide information that would be useful in interpreting and comparing the results from the three assessments, based on an in-depth look at the content of the respective frameworks and items. The results showed that PISA used far fewer multiple choice items and had a much stronger content focus on the "data" area (which often deals with using charts and graphs), which fits with PISA's emphasis on using materials with a real-world context. For more results from the study, see A Content Comparison of the NAEP, TIMSS, and PISA 2003 Mathematics Assessments (Nohara forthcoming). An earlier study compared NAEP 2000, PISA 2000, and TIMSS 1999 mathematics and science items. That study found that PISA items required multistep reasoning more often than TIMSS or NAEP and that PISA mathematics and science literacy items more often involved the interpretation of charts and graphs or other "real life" material (Nohara 2001).

In addition to the cognitive assessment, students also received a 30-minute questionnaire designed to provide information about their backgrounds, attitudes, and experiences in school. Principals in schools where PISA was administered also received a 20–30 minute questionnaire about their schools. Results from the school survey are not discussed in this report but are available at http://www.pisa.oecd.org.

Translation and the Metric-Imperial Study

Source versions of all instruments (assessment booklets, questionnaires, and manuals) were prepared in English and French and translated into the primary language or languages of instruction in each nation. PISA recommended that countries prepare and consolidate independent translations from both source versions, and provided precise translation guidelines that included a description of the features each item was measuring and statistical analysis from the field trial. In cases where one source language was used, independent translations were required and discrepancies reconciled. In addition, it was sometimes necessary to adapt the instrument for cultural purposes, even in nations such as the United States that use English as the primary language of instruction. For example, words such as "lift" might be adapted to "elevator" for the United States. The PISA Consortium verified the national translation and adaptation of all instrumentation. Copies of printed materials were sent to the PISA Consortium for a final optical check prior to data collection.

As noted, in the United States, an additional 4 test booklets were included in PISA 2003 that used adapted versions of 27 mathematics items. These items in their original format used metric units of measurement, such as meters, liters, etc. To investigate the possible effects of the use of metric units on U.S. student performance, the items were adapted to use "imperial" forms with familiar units such as feet, gallons, and degrees Fahrenheit. Differential item analysis showed that U.S. students were not disadvantaged by the use of metric units in PISA 2003. The few discrepancies that were observed are possibly due to (1) differences in the nature of the two systems (e.g., decimal vs. duodecimal, or no equivalent wording of the units), and (2) difficulties in the modification process (e.g., no comparable scoring guides for some incorrect approaches to an item). For more information, see Wilson and Xie (2004).

Test Administration and Quality Assurance

PISA 2003 emphasized the use of standardized procedures in all countries. Each country collected its own data, based on comprehensive manuals and trainings provided by the PISA Consortium to explain the survey's implementation, including precise instructions for the work of school coordinators and scripts for test administrators for use in testing sessions. Test administration in the United States was carried out by professional staff trained according to the international guidelines. School staff were asked only to assist with listings of students, identifying space for testing in the school, and specifying any parental consent procedures needed for sampled students. Use of calculators was at the discretion of participating countries; in the United States, this choice was left to schools based on school, district, or state policy. Students were asked at the end of their test booklets if they had used a calculator and, if so, what type. Approximately 12 percent of U.S. students did not respond. Of the responding students, 91 percent reported using a calculator. Students who reported using a calculator had a mean score of 498 on the combined mathematics literacy scale, compared to 461 for those who reported not using a calculator.
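In the spirit of the differential item analysis mentioned for the metric-imperial booklets, a basic check can be sketched with a Mantel-Haenszel statistic. The counts below are hypothetical, and the operational analysis was more elaborate:

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio across score-matched strata.

    Each stratum is (ref_right, ref_wrong, focal_right, focal_wrong), where
    "ref" students saw the metric version of an item and "focal" students
    saw the imperial version, matched on total score.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical 2x2 tables for one item, one stratum per matched-score band
strata = [(30, 20, 28, 22), (45, 15, 44, 16), (50, 5, 51, 4)]
or_mh = mantel_haenszel_or(strata)
# Values near 1.0 indicate no advantage for either unit system on this item.
```

An odds ratio far from 1.0, replicated across items, would have signaled the kind of unit-system disadvantage the study was looking for.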
Members of the PISA Consortium visited all national centers to review data collection procedures, and members of the PISA Consortium also visited a randomly selected subsample of approximately 10 percent of the educational institutions to ensure that procedures were being carried out in accordance with international guidelines. For a detailed description of the quality assurance procedures, see the OECD's PISA 2003 Technical Report (Adams forthcoming).
Scoring

At least one-third of the PISA assessment was devoted to items requiring constructed responses. The process of scoring these items was an important step in ensuring the quality and comparability of the PISA data. Detailed guidelines were developed for the scoring guides themselves, training materials to recruit scorers, and workshop materials used for the training of national scorers. Prior to the national training, the PISA Consortium organized training sessions to present the material and train the scoring coordinators from the participating countries, who then trained the national scorers.

For each test item, the scoring guide described the intent of the question and how to code the students' responses to each item. This description included the credit labels (full credit, partial credit, or no credit) attached to the possible categories of response. Also included was a system of double-digit coding for some mathematics and science items, where the first digit represented the score and the second digit represented different strategies or approaches that students used to solve the problem. The second digit was used to generate national profiles of student strategies and misconceptions. In addition, the scoring guides included real examples of students' responses, accompanied by a rationale for their classification, for purposes of clarity and illustration.
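A double-digit code can be read mechanically, for example as below (the mapping of first digits to credit levels shown here is illustrative):

```python
CREDIT = {"0": "no credit", "1": "partial credit", "2": "full credit"}

def split_code(code):
    """Split a double-digit code into (credit label, strategy digit)."""
    return CREDIT[code[0]], int(code[1])

credit, strategy = split_code("21")
print(credit, strategy)  # full credit 1
```

Tabulating the second digit across students is what yields the national profiles of strategies and misconceptions mentioned above.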
To examine the consistency of this marking process in more detail within each country and to estimate the magnitude of the variance components associated with the use of markers, the PISA Consortium conducted an interscorer reliability study on a subsample of assessment booklets. Homogeneity analysis was applied to the national sets of multiple scoring and compared with the results of the field trial. A full description of this process and the results can be found in the PISA 2003 Technical Report published by the OECD (Adams forthcoming).
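The PISA study used a homogeneity analysis; as a simpler stand-in, interscorer consistency on a subsample can be summarized with percent agreement and Cohen's kappa (scores below are hypothetical):

```python
from collections import Counter

# Two scorers' codes (0/1/2 credit) for the same ten constructed responses
scorer_a = [2, 1, 0, 2, 2, 1, 0, 2, 1, 2]
scorer_b = [2, 1, 1, 2, 2, 1, 0, 2, 0, 2]
n = len(scorer_a)

# Observed agreement: share of responses both scorers coded identically
agreement = sum(a == b for a, b in zip(scorer_a, scorer_b)) / n

# Cohen's kappa corrects observed agreement for agreement expected by chance
count_a, count_b = Counter(scorer_a), Counter(scorer_b)
p_chance = sum(count_a[k] * count_b[k] for k in count_a.keys() | count_b.keys()) / n**2
kappa = (agreement - p_chance) / (1 - p_chance)
```

Kappa is lower than raw agreement because two scorers who both assign mostly full credit will agree often by chance alone.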
Data Entry and Cleaning

Responsibility for data entry was taken by the national project manager from each nation. The data collected for PISA 2003 were entered into data files with a common international format, as specified in the PISA 2003 Data Entry Manual. Data entry was facilitated by the use of common software available to all participating nations (KeyQuest). The software facilitated the checking and correction of data by providing various data consistency checks. The data were then sent to the Australian Council for Educational Research (ACER) for cleaning. ACER's role in this instance was to check that the international data structure was followed, check the identification system within and between files, correct single case problems manually, and apply standard cleaning procedures to questionnaire files. Results of the data cleaning process were documented and shared with the national project managers and included specific questions when required. The national project manager then provided ACER with revisions to coding or solutions for anomalies. ACER then compiled background univariate statistics and preliminary classical and Rasch item analyses. Detailed information on the entire data entry and cleaning process can be found in the forthcoming PISA 2003 technical report.

Weighting

Students included in the final PISA sample for a given country were not all equally representative of the full student population, even though random samplings of schools and students were used to select the sample. The use of sampling weights is necessary for the computation of statistically sound, nationally representative estimates. Survey weights help adjust for intentional over- or under-sampling of certain sectors of the population, school or student nonresponse, or errors in estimating the size of a school at the time of sampling. Survey weighting for PISA 2003 was carried out by Westat, as part of the PISA Consortium.

The internationally defined weighting specifications for PISA required that each assessed student's sampling weight be the product of the inverse of the school's probability of selection, an adjustment for school-level nonresponse, the inverse of the student's probability of selection, and an adjustment for student-level nonresponse. All PISA analyses were conducted using these adjusted sampling weights. The base weight for each replacement school was equal to the base weight of the original school it replaced.

Scaling and Plausible Values

PISA used Item Response Theory (IRT) methods to produce scale scores that summarized the achievement results. PISA 2003 utilized a mixed coefficients multinomial logit IRT model. This model is similar in principle to the more familiar two-parameter IRT model. With this method, the performance of a sample of students in a subject area or subarea can be summarized on a simple scale or a series of scales, even when different students are administered different items. Because of the reporting requirements for PISA and the large number of background variables associated with the assessment, PISA used these IRT procedures to produce accurate results for groups of students while limiting the testing burden on individual students. Furthermore, these procedures provided data that could be readily used in secondary analyses. IRT scaling provides estimates of item parameters (e.g., difficulty, discrimination) that define the relationship between the item and the underlying variable measured by the test. Parameters of the IRT model are estimated for each test question, with an overall scale being established as well as scales for each predefined content area specified in the assessment framework. For example, PISA 2003 had five scales describing mathematics (a combined score and subscale scores in four domains) and one each for reading, problem solving, and science.

The reading literacy and science literacy reporting scales used for PISA 2000 and PISA 2003 are directly comparable. The value of 500, for example, has the same meaning as it did in PISA 2000: that is, the mean score in 2000 of the sampled students in the 27 OECD countries that participated in PISA 2000.
This is not the case, however, for mathematics literacy. Mathematics literacy, as the major domain, was the subject of major development work for PISA 2003, and the PISA 2003 mathematics literacy assessment was much more comprehensive than the PISA 2000 mathematics assessment: the PISA 2000 assessment covered just two (space and shape, and change and relationships) of the four areas covered in PISA 2003. Because of this broadening of the assessment, it was deemed inappropriate to report the PISA 2003 mathematics literacy scores on the same scale as the PISA 2000 mathematics scores.
The PISA 2000 and PISA 2003 assessments of mathematics, reading, and science literacy are linked assessments. That is, the sets of items used to assess mathematics, reading, and science literacy in PISA 2000 and the sets of items used to assess them in PISA 2003 share a subset of common items. For mathematics, 20 items were used in both assessments; for reading, 28 items; and for science, 25 items. These common items are referred to as link items.
To establish common reporting metrics for PISA 2000 and PISA 2003, the difficulty of the link items (items used in both 2000 and 2003) was compared. Items were calibrated using 2003 data only, and the 2000 items were then re-calibrated using the 2003 parameters. Adjustments were then made to the ability estimates to account for booklet effects seen in 2000. The comparison of the item difficulties on the two occasions was used to determine a score transformation that allows the reporting of the data from the two assessments on a common scale. The change in the difficulty of each of the individual link items is used in determining the transformation, and as a
consequence the sample of link items that has been chosen will influence the choice of transformation. This means that if an alternative set of link items had been chosen, the resulting transformation would be slightly different. The consequence is an uncertainty in the transformation due to the sampling of the link items, just as there is an uncertainty in values such as country means due to the use of a sample of students. The section on statistical testing below describes how this uncertainty has been accounted for in making comparisons over time.
Plausible Values
During the scaling phase, plausible values were used to characterize scale scores for students participating in the assessment. To keep student burden to a minimum, PISA administered few assessment items to each student, too few to produce accurate content-related scale scores for each student. To account for this, PISA generated five possible scale scores for each student, representing selections from the distribution of scale scores of students with similar backgrounds who answered the assessment items the same way. The plausible values methodology is one way to ensure that the estimates of the average performance of student populations, and the estimates of variability in those estimates, are more accurate than those determined through traditional procedures, which estimate a single score for each student. During the construction of plausible values, careful quality control steps ensured that the subpopulation estimates based on these plausible values were accurate.
It is important to recognize that plausible values are not test scores for individuals and should not be treated as such. Plausible values are randomly drawn from the distribution of scores that could be reasonably assigned to each individual. As such, the plausible values contain random
error variance components and are not optimal as scores for individuals. The PISA student file contains many plausible values, five for each of the PISA 2003 cognitive scales (combined mathematics literacy scale, four mathematics literacy subscales, reading literacy, science literacy, and problem solving). If an analysis is to be undertaken with one of these cognitive scales, then ideally the analysis should be undertaken five times, once with each of the five relevant plausible value variables. The results of these five analyses are averaged, and significance tests that adjust for variation between the five sets of results are then computed.
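The five-fold analysis just described can be sketched as follows. The combination of sampling and between-plausible-value variance shown here follows the standard multiple-imputation rule, and the numbers are hypothetical.

```python
# Sketch: average a statistic over the five plausible values (PVs) and
# combine its sampling variance with the variance between the five
# results, per the usual multiple-imputation rule. Values hypothetical.

def combine_plausible_values(estimates, sampling_vars):
    m = len(estimates)                                    # typically 5 in PISA
    est = sum(estimates) / m                              # average of the analyses
    u = sum(sampling_vars) / m                            # average sampling variance
    b = sum((e - est) ** 2 for e in estimates) / (m - 1)  # between-PV variance
    return est, u + (1 + 1 / m) * b                       # imputation-adjusted variance

est, var = combine_plausible_values([500, 502, 498, 501, 499],
                                    [4.0, 4.0, 4.0, 4.0, 4.0])
# est = 500.0; var = 4.0 + 1.2 * 2.5 = 7.0
```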
PISA uses the plausible value methodology to represent what the true performance of an individual might have been, had it been observed, using a small number of random draws from an empirically derived distribution of score values based on the student's observed responses to assessment items and on background variables. Each random draw from the distribution is considered a representative value from the distribution of potential scale scores for all students in the sample who have similar characteristics and identical patterns of item responses. The draws from the distribution differ from one another to quantify the degree of precision (the width of the spread) in the underlying distribution of possible scale scores that could have caused the observed performance. The PISA plausible values function like point estimates of scale scores for many purposes, but they are unlike true point estimates in several respects. They differ from one another for any particular student, and the amount of difference quantifies the spread in the underlying distribution of possible scale scores for that student. Because of the plausible values approach, secondary researchers can use the PISA data to carry out a wide range of analyses.
Levels

While the basic form of measurement in PISA describes student literacy in each country in terms of a range of scale scores, PISA also treats proficiency in mathematics literacy in terms of six described levels, and proficiency in problem solving in terms of three described levels. In both cases, increasing levels represent tasks of increasing complexity. As a result, the findings are reported in terms of percentages of the population proficient at handling tasks of different levels of difficulty.
Each of the five mathematics literacy scales (the combined score and the four subscale scores) is divided into six levels based on the type of knowledge and skills students need to demonstrate at a particular level. A seventh level (below level 1) is made up of students whose abilities could not be accurately described based on their responses. Exact cut point scores are as follows: below level 1 (a score less than or equal to 357.77); level 1 (a score greater than 357.77 and less than or equal to 420.07); level 2 (a score greater than 420.07 and less than or equal to 482.38); level 3 (a score greater than 482.38 and less than or equal to 544.68); level 4 (a score greater than 544.68 and less than or equal to 606.99); level 5 (a score greater than 606.99 and less than or equal to 669.30); level 6 (a score greater than 669.30). The tasks that represent each level of performance for the specific mathematics processes on the combined mathematics literacy scale are described in exhibit 5. Exhibit A-1 describes the kinds of tasks that represent each level of performance on the mathematics subscales.
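A score can be mapped onto these levels mechanically; a minimal sketch using the cut points quoted above, returning 0 for "below level 1":

```python
# Classify a combined mathematics literacy score into proficiency
# levels using the cut points quoted above. A score at or below 357.77
# is "below level 1" (returned as 0).

MATH_CUT_POINTS = [357.77, 420.07, 482.38, 544.68, 606.99, 669.30]

def math_level(score):
    level = 0
    for cut in MATH_CUT_POINTS:
        if score > cut:
            level += 1
        else:
            break
    return level
```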
The problem-solving scale is divided into three levels based on the type of knowledge and skills students must demonstrate at a particular level. A fourth level (below level 1) is made up of students whose abilities could not be accurately described based on their responses. In order to reach a particular
Exhibit A-1. Description of proficiency levels for mathematics literacy subscales: 2003
Task descriptions by proficiency level

Space and shape

Levels 1 and 2: Students at Level 1 or 2 can work with a single mathematical representation where the mathematical content is direct and clearly presented, use mathematical thinking in familiar contexts, identify geometric patterns, and apply basic geometric concepts.

Level 3: Students at Level 3 can begin to use visual and spatial reasoning, begin linking different representations, use elementary problem solving (devising simple strategies), apply simple algorithms, and interpret textual descriptions of unfamiliar geometric situations.

Level 4: Students at Level 4 can use more advanced and flexible reasoning, link and integrate different representations, use multi-step processes, use well-developed spatial visualization and interpretation, and use reasoning about numeric relationships in geometric problems.

Level 5: Students at Level 5 can make or work with assumptions; use insight, interpretation, and linking of different representations; and carry out multiple and sequential processes. They can also use well-developed spatial reasoning.

Level 6: Students at Level 6 can manipulate complex and multiple representations, link different information, use significant insight and reflection, make generalizations, communicate the solution and explanation of a problem in unstructured form, and interpret complex textual descriptions and relate these to other problems.

Quantity

Levels 1 and 2: Students at Level 1 or 2 can interpret simple tables, carry out basic arithmetic calculations, work with simple quantitative models, interpret a simple quantitative model (e.g., a proportional relationship), and apply the model using basic arithmetic calculations.

Level 3: Students at Level 3 can use simple problem-solving strategies, interpret tables to locate information, carry out well-described calculations, interpret a text description of a sequential calculation process, correctly implement the process, and use basic problem-solving procedures.

Level 4: Students at Level 4 can work effectively with simple models of complex situations; use reasoning skills, insight, and interpretation with different representations; use a variety of calculation skills to solve problems; and accurately apply a given numeric algorithm.

Level 5: Students at Level 5 can work effectively with increasingly complex situations and models and have well-developed reasoning skills. They can also use insight and interpretation of different representations and carry out multiple sequential processes.

Level 6: Students at Level 6 can conceptualize and work with complex mathematical processes and relationships, use advanced thinking and reasoning skills to link multiple contexts, and use sequential calculation processes.
See notes at end of exhibit.
Exhibit A-1. Description of proficiency levels for mathematics literacy subscales: 2003—Continued
Task descriptions by proficiency level

Change and relationships

Levels 1 and 2: Students at Level 1 or 2 can work with simple algorithms, formulas, and procedures; link text with single representations; begin to interpret and use elementary reasoning; and interpret text to produce a simple mathematical model in an applied context.

Level 3: Students at Level 3 can work with related representations (text, graph, table, and simple algebra), including some interpretation and reasoning; interpret unfamiliar graphical representations of real-world situations; and link and connect multiple related representations.

Level 4: Students at Level 4 can understand and work with multiple representations, including explicitly mathematical models of real-world situations; carry out a sequence of calculations involving percentage or proportion; and show insight into three-dimensional geometric problems.

Level 5: Students at Level 5 can make quite advanced use of algebraic and other formal mathematical expressions and models and can link formal mathematical representations to complex real-world situations. They can also solve complex and multi-step problems.

Level 6: Students at Level 6 can use significant insight, well-developed reasoning skills, and explicit technical knowledge to solve problems and to begin to generalize mathematical solutions to complex real-world problems, and can interpret complex mathematical information in the context of a problem.

Uncertainty

Levels 1 and 2: Students at Level 1 or 2 can understand and use basic probabilistic ideas in familiar experimental contexts, locate statistical information presented in familiar graphical form, and understand basic probability concepts in the context of a simple and familiar experiment.

Level 3: Students at Level 3 can interpret information and data, link different information sources, use basic reasoning with simple probability concepts, interpret tabular information, use insight into aspects of data presentation, and link data to suitable chart types.

Level 4: Students at Level 4 can use basic statistical and probability concepts combined with logical reasoning in less familiar contexts, use argumentation based on interpretation of data, interpret text (including in an unfamiliar scientific context), and translate a text description into a mathematics problem.

Level 5: Students at Level 5 can apply statistical knowledge in situations that are somewhat structured and where the mathematical representation is partially apparent, and can use reasoning and insight to interpret given information.

Level 6: Students at Level 6 can use high-level thinking and reasoning skills in statistical or probabilistic contexts to create mathematical representations of real-world situations; use insight, reflection, and argumentation to communicate arguments and explanations; and interpret and reflect.
NOTE: In order to reach a particular proficiency level, a student must have been able to correctly answer a majority of items at that level. Students were classified into mathematics literacy levels according to their scores. Exact cut point scores are as follows: below level 1 (a score less than or equal to 357.77); level 1 (a score greater than 357.77 and less than or equal to 420.07); level 2 (a score greater than 420.07 and less than or equal to 482.38); level 3 (a score greater than 482.38 and less than or equal to 544.68); level 4 (a score greater than 544.68 and less than or equal to 606.99); level 5 (a score greater than 606.99 and less than or equal to 669.30); level 6 (a score greater than 669.30).
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Nonsampling Errors
Nonsampling error is a term used to describe variations in the estimates that may be caused by population coverage limitations, nonresponse bias, and measurement error, as well as data collection, processing, and reporting procedures. For example, the sampling frame was limited to regular public and private schools in the 50 states and the District of Columbia. The sources of nonsampling errors are typically problems like unit and item nonresponse, differences in respondents' interpretations of the meaning of the questions, response differences related to the particular time the survey was conducted, and mistakes in data preparation. Some of these issues (particularly unit nonresponse) are discussed above in the section on U.S. sampling and data collection.
Missing Data
There are four kinds of missing data. "Nonresponse" data occur when a respondent was expected to answer an item but no response was given. Responses that are "missing or invalid" occur in multiple-choice items where an invalid response is given; this code is not used for open-ended questions. An item is "not applicable" when it is not possible for the respondent to answer the question. Finally, items that are "not reached" are consecutive missing values starting from the end of each test session. All four kinds of missing data are coded differently in the PISA 2003 database.
Missing background data are not included in the analyses for this report and are not imputed. In general, item response rates for variables discussed in this report were over the NCES standard of 85 percent required to report without notation (table A-2). The one case in which more than 15 percent of the student responses were missing (student reports of parent occupation in New Zealand, with an item response rate of 84 percent) is flagged in the supporting statistical data tables in appendix B.
proficiency level, a student must have been able to correctly answer a majority of items at that level. Students were classified into problem-solving levels according to their scores. Exact cut point scores are as follows: below level 1 (a score less than or equal to 404.06); level 1 (a score greater than 404.06 and less than or equal to 498.08); level 2 (a score greater than 498.08 and less than or equal to 592.10); level 3 (a score greater than 592.10).
All students within a level are expected to answer at least half of the items from that level correctly. Students at the bottom of a level have a 62 percent chance of success on the easiest items from that level and a 42 percent chance of success on the hardest items from that level (the overall response probability was 62 percent). Students at the top of a level are able to provide the correct answers to about 70 percent of all items from that level, have a 62 percent chance of success on the hardest items from that level, and have a 78 percent chance of success on the easiest items from that level. Students just below the top of a level would score less than 50 percent on an assessment of the next higher level. Students at a particular level demonstrate not only the knowledge and skills associated with that level but also the proficiencies defined by lower levels. Thus, all students proficient at level 3 are also proficient at levels 1 and 2. Patterns of responses for students below level 1 suggest that they are unable to answer at least half of the items in level 1 correctly.
Data Limitations

As with any study, there are limitations to PISA 2003 that researchers should take into consideration. Estimates produced using data from PISA 2003 are subject to two types of error: nonsampling and sampling errors. Nonsampling errors can be due to errors made in the collection and processing of data. Sampling errors can occur because the data were collected from a sample rather than a complete census of the population.
In general, it is difficult to identify and estimate either the amount of nonsampling error or the bias caused by this error. In PISA 2003, efforts were made to prevent such errors from occurring and to compensate for them when possible. For example, the design phase entailed a field test that evaluated items as well as the implementation procedures for the survey. It should also be recognized that most background information was obtained from students' self-reports, which are subject to respondent bias. One potential source of respondent bias in this survey was social desirability bias, for example, if students overstated how good they were at mathematics.
Sampling Errors
Sampling errors occur when the discrepancy between a population characteristic and the sample estimate arises because not all members of the reference population are sampled for the survey. The size of the sample relative to the population and the variability of the population characteristics both influence the magnitude of sampling error. The particular sample of 15-year-old students from the 2002–03 school year was just one of many possible samples that could have been selected. Therefore, estimates produced from the PISA 2003 sample may differ from estimates that would have been produced had another sample of 15-year-old students been drawn. This type of variability is called sampling error because it arises from using a sample of 15-year-old students in 2002–03, rather than all 15-year-old students in that year.
The standard error is a measure of the variability due to sampling when estimating a statistic. The approach used for calculating sampling variances in PISA was Balanced Repeated Replication (BRR), or balanced half-samples, using Fay's method. Standard errors can be used as a measure of the precision expected from a particular sample.
Standard errors for all of the estimates are included in appendix B to this report. These standard errors can be used to produce confidence intervals. There is a 95 percent chance that the true average lies within the range of 1.96 times the standard error above or below the estimated score. For example, it was estimated that 15.5 percent of U.S. students scored at level 1 on the combined mathematics literacy scale, and this statistic had a standard error of 0.81. Therefore, it can be stated with 95 percent confidence that the actual percentage of U.S. students at level 1 for the total population in 2003 was between 13.9 and 17.1 percent (1.96 x 0.81 = 1.59; confidence interval = 15.5 +/- 1.59).
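The interval arithmetic in that example can be reproduced directly:

```python
# Reproduce the 95 percent confidence interval from the example above:
# estimate 15.5 percent with standard error 0.81.

def confidence_interval(estimate, standard_error, z=1.96):
    margin = z * standard_error
    return estimate - margin, estimate + margin

low, high = confidence_interval(15.5, 0.81)
# margin = 1.96 * 0.81 = 1.5876, so the interval is about 13.9 to 17.1
```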
Descriptions of Background Variables

Full PISA 2003 student and school questionnaires are available at http://nces.ed.gov/surveys/pisa or http://www.pisa.oecd.org.
Socioeconomic Status
The measure of student socioeconomic status used in PISA 2003 is based on the occupational status of the student's father and/or mother (whichever is higher) as reported by the student. Parental occupation was coded to 4 digits based on the International Standard Classification of Occupations (ISCO). Occupational codes were in turn mapped onto an internationally comparable index of occupational status, the International Socioeconomic Index (ISEI), developed by Ganzeboom, De Graaf, and Treiman (1992). Using the index, students were assigned numbers ranging from about 16 to 90 based on their parents' occupations, so that they were arrayed on a continuum from low to high socioeconomic status, rather than placed into discrete categories. The range of ISEI scores given for the 1988 ISCO occupations listed in Ganzeboom and Treiman (1996) goes from 16, the lowest (agricultural laborer), to 90, the highest (judge). Typical occupations among
United Kingdom¹ 100 94 †

† Not applicable.
¹ Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: Cases where more than 15 percent of the student responses are missing are flagged in the supporting statistical data tables in appendix B. For more information about the variables, see the Description of Variables section in appendix B. The overall percentage refers to the sample estimate for the overall 15-year-old student population. The International Socioeconomic Index (ISEI) is an internationally comparable index of occupational status, with a range of approximately 16 to 90, developed by Ganzeboom, De Graaf, and Treiman (1992).
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
PISA schools, and adding an additional measure of uncertainty of school and student identification through random swapping of data elements within the student and school files.
Statistical Procedures

Tests of Significance
Comparisons made in the text of this report have been tested for statistical significance. For example, in the commonly made comparison of country averages against the average of the United States, tests of statistical significance were used to establish whether or not the observed differences from the U.S. average were statistically significant.
The estimation of the standard errors required to undertake the tests of significance is complicated by the complex sample and assessment designs, both of which generate error variance. Together they mandate a set of statistically complex procedures to estimate the correct standard errors. As a consequence, the estimated standard errors contain a sampling variance component estimated by Balanced Repeated Replication (BRR), specifically the Fay method of BRR; and, where the assessments are concerned, there is an additional imputation variance component arising from the assessment design. Details on the BRR procedures used can be found in the WesVar 4.0 User's Guide (Westat 2000).
In almost all instances, the tests for significance used were standard t tests. These fell into two categories according to the nature of the comparison being made: comparisons of independent samples and comparisons of non-independent samples. In PISA, country samples are independent. To determine whether the average scores for two countries are different, we test the null hypothesis:
H0: µ̂(country1) – µ̂(country2) = 0
parents of 15-year-olds with between 16 and 35 points on the ISEI scale include small-scale farmer, metalworker, mechanic, taxi or truck driver, and waiter/waitress. Between 35 and 53 index points, the most common occupations are bookkeeping, sales, small business management, and nursing. As the required skills increase, so does the status of the occupation. Between 54 and 70 points, typical occupations are marketing management, teaching, civil engineering, and accountant. Finally, between 71 and 90 points, the top international quarter of the index, occupations include medicine, university teaching, and law (OECD 2001).
Race/Ethnicity
In the United States, students' race/ethnicity was obtained through student responses to a two-part question. Students were asked first whether they were Hispanic or Latino, and then asked whether they were members of the following racial groups: American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or other Pacific Islander, or White. Multiple responses to the race classification question were allowed. Results are shown separately for Asians, Blacks, Hispanics, Whites, and students who selected more than one race. Students identifying themselves as Hispanic and also as one or more races were included in the Hispanic group, rather than in a racial group.
Confidentiality and Disclosure Limitations

The PISA 2003 data are hierarchical and include school data and student data from the participating schools. Confidentiality analyses for the United States were designed to provide reasonable assurance that public use data files issued by the PISA Consortium would not allow identification of individual U.S. schools or students when compared against public data collections. Disclosure limitation included the identification and masking of potential disclosure-risk
To test this hypothesis, the two observed values and their respective standard errors are needed to perform a t test. The standard error on the estimate of the difference between two independent statistics θ̂i and θ̂j is:

σ(θ̂i – θ̂j) = √(σ²(θ̂i) + σ²(θ̂j))
Thus, in simple comparisons of independent averages, such as the average score of country 1 with that of country 2, the following formula was used to compute the t statistic:
t = (est1 – est2) / √(se1² + se2²)
where est1 and est2 are the estimates being compared (e.g., the averages of country 1 and country 2) and se1 and se2 are the corresponding standard errors of these averages.
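As a sketch, with hypothetical country averages and standard errors:

```python
# t statistic for two independent country averages, per the formula
# above. The averages and standard errors here are hypothetical.
import math

def t_independent(est1, se1, est2, se2):
    return (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)

t = t_independent(500.0, 3.0, 490.0, 4.0)
# t = 10 / 5 = 2.0, which exceeds 1.96, so this hypothetical difference
# would be significant at the 95 percent level
```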
This test may also be used for comparisons within a particular country if the categorical variable used to define the groups being compared was used as an explicit stratification variable; however, no explicit stratification was used in the United States sample.
The second type of comparison used in this report occurred when comparing differences between non-subset, non-independent groups. When this occurs, the correlation and related covariance between the groups must be taken into account, such as when comparing a country mean with an OECD mean that includes that particular country, or when comparing the average scores of males and females within the United States.
How are scores like µ̂(boys) and µ̂(girls) correlated? Suppose that in the school sample, a coeducational school attended by low achievers is replaced by a coeducational school attended by high achievers. The country mean will increase slightly, as will the males' and the females' means. If such a school replacement process is continued, µ̂(boys) and µ̂(girls) will likely increase in a similar pattern. Indeed, a coeducational school attended by high-achieving males is usually also attended by high-achieving females. Therefore, the covariance between µ̂(boys) and µ̂(girls) will be positive.
What does the covariance between the two variables, µ̂(boys) and µ̂(girls), tell us? A positive covariance means that if µ̂(boys) increases, then µ̂(girls) will also increase. A covariance equal or close to 0 means that µ̂(boys) can increase or decrease with µ̂(girls) remaining unchanged. Finally, a negative covariance means that if µ̂(boys) increases, then µ̂(girls) will decrease, and vice versa.
Next, to determine whether, for example, the females' performance differs from the males' performance, a null hypothesis has to be tested, as for all statistical analyses. In this particular example, it consists of computing the difference between the males' performance mean and the females' performance mean (or the inverse). The null hypothesis will be:
H0: µ̂(boys) – µ̂(girls) = 0
The variance of the observed difference is needed to test this null hypothesis. The variance of a difference is equal to the sum of the variances of the two initial variables minus two times the covariance between the two initial variables. A sampling distribution has the same characteristics as any distribution, except that the units consist of sample estimates and not observations. Therefore, the sampling variance of a difference is equal to the sum of the two initial sampling variances minus two times the covariance between the two sampling distributions on the estimates:
σ²(µ̂X – µ̂Y) = σ²(µ̂X) + σ²(µ̂Y) – 2cov(µ̂X, µ̂Y)
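A sketch of this computation, with hypothetical variances and covariance:

```python
# Standard error of a difference between two non-independent estimates,
# per the formula above. A positive covariance (as with males' and
# females' means) shrinks the standard error. Values are hypothetical.
import math

def se_of_difference(var_x, var_y, cov_xy):
    return math.sqrt(var_x + var_y - 2.0 * cov_xy)

se = se_of_difference(9.0, 4.0, 2.0)  # sqrt(9 + 4 - 4) = 3.0
```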
The estimation of the covariance between, for instance, µ̂(boys) and µ̂(girls) would require the selection of several samples and then the
In PISA, in each of the three subject matter areas, a common transformation was estimated from the link items, and this transformation was applied to all participating countries. It follows that any uncertainty that was introduced through the linking is common to all students and all countries. Thus, for example, suppose the unknown linking error (between PISA 2000 and PISA 2003) in reading literacy resulted in an over-estimation of student scores by two points on the PISA 2000 scale. It follows that every student's score will be over-estimated by two score points. This over-estimation will have effects on certain, but not all, summary statistics computed from the PISA 2003 data. For example, consider the following:
• each country's mean will be over-estimated by an amount equal to the link error (in our example, two score points);

• the mean performance of any subgroup will be over-estimated by an amount equal to the link error (in our example, two score points);

• the standard deviation of student scores will not be affected, because the over-estimation of each student by a common error does not change the standard deviation;

• the difference between the mean scores of two countries in PISA 2003 will not be influenced, because the over-estimation of each student by a common error will have distorted each country's mean by the same amount;

• the difference between the mean scores of two groups (e.g., males and females) in PISA 2003 will not be influenced, because the over-estimation of each student by a common error will have distorted each group's mean by the same amount;
analysis of the variation of µ̂(boys) in conjunction with µ̂(girls). Such a procedure is, of course, unrealistic. Therefore, as for any computation of a standard error in PISA, replication methods using the supplied replicate weights are used to estimate the standard error on a difference. Use of the replicate weights implicitly incorporates the covariance between the two estimates into the estimate of the standard error on the difference.
To test such comparisons, the following formula was used to compute the t statistic:
t = (estgrp1 – estgrp2) / se(estgrp1 – estgrp2)
where estgrp1 and estgrp2 are the estimates being compared for the two non-independent groups, and se(estgrp1 – estgrp2) is the standard error of the difference, calculated using Balanced Repeated Replication (BRR) to account for any covariance between the estimates for the two non-independent groups.
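A sketch of the replicate-weight computation: the difference is re-estimated once per replicate, and the squared deviations from the full-sample difference are pooled. The Fay factor of 0.5 follows common PISA practice but should be treated as an assumption here, and the replicate differences are hypothetical.

```python
# Fay's-method BRR standard error for a difference between two
# non-independent group estimates. Each replicate difference comes from
# recomputing both group estimates with one replicate weight, which
# implicitly captures the covariance. Fay factor 0.5 is an assumption.

def brr_se(full_sample_diff, replicate_diffs, fay_factor=0.5):
    g = len(replicate_diffs)
    variance = sum((d - full_sample_diff) ** 2 for d in replicate_diffs)
    variance /= g * (1.0 - fay_factor) ** 2
    return variance ** 0.5

se = brr_se(10.0, [9.0, 11.0, 10.0, 10.0])  # four hypothetical replicates
```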
A third type of comparison (the addition of a standard error term to the standard t test shown above for simple comparisons of independent averages) was also used when analyzing change in performance over time. The uncertainty that results from the link item sampling (described in the scaling section above) is referred to as linking error, and this error must be taken into account when making certain comparisons between PISA 2000 and PISA 2003 results. Just as with the error introduced through the process of sampling students, the exact magnitude of this linking error cannot be determined. We can, however, estimate the likely range of magnitudes for this error and take it into account when interpreting PISA results. As with sampling errors, the likely range of magnitude is represented as a standard error. The standard error of linking is 3.74 for reading and 3.02 for science; for mathematics it is 6.01 on the space and shape scale and 4.84 on the change and relationships scale.
PISA 2003 Results From the U.S. Perspective
• the difference between the performance of a group of students (e.g., a country) between PISA 2000 and PISA 2003 will be influenced, because each student's score in PISA 2003 will be influenced by the error; and
• a change in the difference in performance between two groups from PISA 2000 to PISA 2003 will not be influenced. This is because neither of the components of this comparison (the differences in scores in 2000 and 2003, respectively) is influenced by a common error that is added to all student scores in PISA 2003.
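The points in the list above can be illustrated numerically. A minimal sketch, with made-up scores, showing that adding a common linking error to every score shifts each group mean by the same amount but leaves standard deviations and between-group differences unchanged:

```python
from statistics import mean, pstdev

# Hypothetical scores for two groups (illustrative values only).
boys = [480.0, 510.0, 530.0]
girls = [495.0, 505.0, 525.0]
link_error = 2.0  # the hypothetical common over-estimation

boys_shifted = [s + link_error for s in boys]
girls_shifted = [s + link_error for s in girls]

# Each group mean moves by exactly the link error...
assert abs(mean(boys_shifted) - mean(boys) - link_error) < 1e-9
# ...but the spread and the between-group difference do not change.
assert abs(pstdev(boys_shifted) - pstdev(boys)) < 1e-9
assert abs((mean(boys_shifted) - mean(girls_shifted))
           - (mean(boys) - mean(girls))) < 1e-9
```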
In general terms, the linking error need only be considered when comparisons are being made between PISA 2000 and PISA 2003 results, and then usually only when group means are being compared. Because the linking error need only be used in a limited range of situations, we have chosen not to report the linking error in the tables included in this report. The general formula is given by:
t = (est1 − est2) / √[(se1)² + (se2)² + (se_linking)²]
The most obvious example of a situation where there is a need to use the linking error is in the comparison of a country's mean performance between PISA 2000 and PISA 2003. For example, consider a comparison of Italy's performance in reading between 2000 and 2003. The mean performance of Italy in 2000 was 487 with a standard error of 2.9, while in 2003 the mean was 476 with a standard error of 3.0. The standardized difference in the Italian mean is 1.97, which is computed as follows:

1.97 = (487 − 476) / √(2.9² + 3.0² + 3.7²)

and is statistically significant.
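Plugging the Italian reading figures into the formula can be sketched directly (using the unrounded reading linking error of 3.74 reported above; the printed equation uses the rounded value 3.7, which is why the result shifts slightly):

```python
import math

mean_2000, se_2000 = 487.0, 2.9
mean_2003, se_2003 = 476.0, 3.0
se_linking = 3.74  # standard error of linking for reading

t = (mean_2000 - mean_2003) / math.sqrt(
    se_2000**2 + se_2003**2 + se_linking**2
)
# t comes out near 1.96-1.97 depending on how the linking error is rounded.
```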
In the U.S. report on PISA 2000, a Bonferroni adjustment was used in all multiple comparisons of countries. This was not the case in 2003, which may result in some differences in how 2000 results are reported in 2003. It may also result in some differences between the PISA 2003 U.S. and OECD reports (the latter uses a Bonferroni adjustment for multiple comparisons of country averages). The Bonferroni adjustment for multiple comparisons was discontinued in order to avoid the possibility that comparisons of achievement between countries could be interpreted differently depending on the number of countries compared.
International Outcomes of Learning in Mathematics Literacy and Problem Solving
Appendix B: Reference Tables
Table B-1. Percentage distribution of 15-year-old students, by grade and country: 2003
† Not applicable.
# Rounds to zero.
‡ Reporting standards not met.
1 Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: Detail may not sum to totals because of rounding. s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-2. Percentage distribution and average combined mathematics literacy scores of U.S. 15-year-old students, by type of mathematics class: 2003

Type of class                         Percent   s.e.   Average   s.e.
Pre-algebra or general mathematics      8.7     0.80    419.5    4.97
Algebra I                              28.6     1.01    442.1    3.28
Geometry                               31.1     1.18    498.4    3.47
Algebra II                             20.7     1.01    537.2    3.67
Precalculus or calculus                 3.1     0.39    595.6    7.53
Other                                   7.7     0.68    482.8    8.57

NOTE: Type of class refers to the mathematics class in which the student was enrolled at the time of the assessment. Detail may not sum to totals because of rounding. s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-3. Average mathematics literacy scores and subscale scores of 15-year-old students, by country: 2003
                      Combined                         Mathematics subscales
                      mathematics
Country               literacy          Space and shape   Change and relationships   Quantity        Uncertainty
                      Average   s.e.    Average   s.e.    Average   s.e.             Average   s.e.  Average   s.e.
OECD average          500.0     0.63    496.3     0.65    498.8     0.70             500.7     0.63  502.0     0.61
United Kingdom1       508.3     2.43    496.0     2.50    512.9     2.54             498.5     2.52  520.1     2.41

1 Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-4. Combined mathematics literacy scores of 15-year-old students, by percentiles and country: 2003

Country 5th percentile 10th percentile 25th percentile 50th percentile

United Kingdom1 572.6 3.18 628.7 3.55 659.3 4.79
1 Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-5. Standard deviations of 15-year-old students' combined mathematics literacy scores, by country: 2003

United Kingdom1 92.3 1.35
1 Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-6. Percentage distribution of 15-year-old students scoring at each proficiency level on the combined mathematics literacy scale, by country: 2003

United Kingdom1 5.2 0.54 12.5 0.67 21.2 1.20 25.6 0.88
See notes at end of table.
Table B-6. Percentage distribution of 15-year-old students scoring at each proficiency level on the combined mathematics literacy scale, by country: 2003—Continued

United Kingdom1 20.6 0.73 11.0 0.73 3.9 0.43
† Not applicable.
# Rounds to zero.
1 Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: In order to reach a particular proficiency level, a student must have been able to correctly answer a majority of items at that level. Students were classified into mathematics literacy levels according to their scores. Exact cut point scores are as follows: below level 1 (a score less than or equal to 357.77); level 1 (a score greater than 357.77 and less than or equal to 420.07); level 2 (a score greater than 420.07 and less than or equal to 482.38); level 3 (a score greater than 482.38 and less than or equal to 544.68); level 4 (a score greater than 544.68 and less than or equal to 606.99); level 5 (a score greater than 606.99 and less than or equal to 669.3); level 6 (a score greater than 669.3). The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error. Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
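The cut point scores in the note above define a simple lookup from score to proficiency level. A sketch (the function name is ours, not the report's) that classifies a combined mathematics literacy score:

```python
# Upper bound of each level, from the table note; level 6 is open-ended.
CUT_POINTS = [
    (357.77, "below level 1"),
    (420.07, "level 1"),
    (482.38, "level 2"),
    (544.68, "level 3"),
    (606.99, "level 4"),
    (669.30, "level 5"),
]

def math_literacy_level(score):
    """Return the proficiency level for a combined mathematics literacy score."""
    for upper_bound, level in CUT_POINTS:
        if score <= upper_bound:
            return level
    return "level 6"
```

For example, the U.S. average of 482.9 (table B-26) falls just above the level 2 upper bound of 482.38, so it classifies as level 3.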
Table B-7. Percentage distribution of 15-year-old students scoring at each proficiency level on the mathematics literacy quantity subscale, by country: 2003

United Kingdom1 8.3 0.64 13.7 0.67 20.7 0.96 24.2 0.68
See notes at end of table.
Table B-7. Percentage distribution of 15-year-old students scoring at each proficiency level on the mathematics literacy quantity subscale, by country: 2003—Continued

United Kingdom1 19.2 0.73 10.1 0.80 3.8 0.47
1 Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: In order to reach a particular proficiency level, a student must have been able to correctly answer a majority of items at that level. Students were classified into mathematics literacy levels according to their scores. Exact cut point scores are as follows: below level 1 (a score less than or equal to 357.77); level 1 (a score greater than 357.77 and less than or equal to 420.07); level 2 (a score greater than 420.07 and less than or equal to 482.38); level 3 (a score greater than 482.38 and less than or equal to 544.68); level 4 (a score greater than 544.68 and less than or equal to 606.99); level 5 (a score greater than 606.99 and less than or equal to 669.3); level 6 (a score greater than 669.3). The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error. Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-8. Percentage distribution of 15-year-old students scoring at each proficiency level on the mathematics literacy space and shape subscale, by country: 2003

United Kingdom1 8.6 0.64 14.1 0.98 21.4 0.78 24.3 0.82
See notes at end of table.
Table B-8. Percentage distribution of 15-year-old students scoring at each proficiency level on the mathematics literacy space and shape subscale, by country: 2003—Continued

United Kingdom1 17.9 0.61 9.7 0.65 3.9 0.40
† Not applicable.
# Rounds to zero.
1 Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: In order to reach a particular proficiency level, a student must have been able to correctly answer a majority of items at that level. Students were classified into mathematics literacy levels according to their scores. Exact cut point scores are as follows: below level 1 (a score less than or equal to 357.77); level 1 (a score greater than 357.77 and less than or equal to 420.07); level 2 (a score greater than 420.07 and less than or equal to 482.38); level 3 (a score greater than 482.38 and less than or equal to 544.68); level 4 (a score greater than 544.68 and less than or equal to 606.99); level 5 (a score greater than 606.99 and less than or equal to 669.3); level 6 (a score greater than 669.3). The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error. Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-9. Percentage distribution of 15-year-old students scoring at each proficiency level on the mathematics literacy change and relationships subscale, by country: 2003

United Kingdom1 5.7 0.57 11.2 0.85 20.3 0.89 24.6 0.85
See notes at end of table.
Table B-9. Percentage distribution of 15-year-old students scoring at each proficiency level on the mathematics literacy change and relationships subscale, by country: 2003—Continued

United Kingdom1 21.4 0.73 11.7 0.72 4.9 0.46
† Not applicable.
# Rounds to zero.
1 Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: In order to reach a particular proficiency level, a student must have been able to correctly answer a majority of items at that level. Students were classified into mathematics literacy levels according to their scores. Exact cut point scores are as follows: below level 1 (a score less than or equal to 357.77); level 1 (a score greater than 357.77 and less than or equal to 420.07); level 2 (a score greater than 420.07 and less than or equal to 482.38); level 3 (a score greater than 482.38 and less than or equal to 544.68); level 4 (a score greater than 544.68 and less than or equal to 606.99); level 5 (a score greater than 606.99 and less than or equal to 669.3); level 6 (a score greater than 669.3). The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error. Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-10. Percentage distribution of 15-year-old students scoring at each proficiency level on the mathematics literacy uncertainty subscale, by country: 2003

United Kingdom1 3.8 0.42 10.1 0.64 20.4 0.69 25.7 0.76
See notes at end of table.
Table B-10. Percentage distribution of 15-year-old students scoring at each proficiency level on the mathematics literacy uncertainty subscale, by country: 2003—Continued

United Kingdom1 22.3 0.65 12.8 0.71 4.8 0.51
† Not applicable.
# Rounds to zero.
1 Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: In order to reach a particular proficiency level, a student must have been able to correctly answer a majority of items at that level. Students were classified into mathematics literacy levels according to their scores. Exact cut point scores are as follows: below level 1 (a score less than or equal to 357.77); level 1 (a score greater than 357.77 and less than or equal to 420.07); level 2 (a score greater than 420.07 and less than or equal to 482.38); level 3 (a score greater than 482.38 and less than or equal to 544.68); level 4 (a score greater than 544.68 and less than or equal to 606.99); level 5 (a score greater than 606.99 and less than or equal to 669.3); level 6 (a score greater than 669.3). The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error. Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-11. Average selected mathematics literacy subscale scores of 15-year-old students, by country: 2000 and 2003
                      Space and shape                    Change and relationships
Country               2000            2003               2000            2003
                      Average  s.e.   Average  s.e.      Average  s.e.   Average  s.e.
OECD average          —        —      498.8    0.70      —        —      496.3    0.65
United Kingdom1       504.9    2.58   496.0    2.50      519.2    2.21   512.9    2.54

— Not available.
1 Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2000 and 2003.
Table B-12. Average problem-solving scores of 15-year-old students, by country: 2003

United Kingdom1 510.2 2.38
1 Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-13. Problem-solving scores of 15-year-old students, by percentiles and country: 2003
Country 5th percentile 10th percentile 25th percentile 50th percentile
United Kingdom1 576.5 3.07 628.9 3.69 659.2 3.98
1 Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-14. Standard deviations of 15-year-old students' problem-solving scores, by country: 2003

Country               Standard deviation   s.e.
OECD average          100.0                0.44
United Kingdom1       93.2                 1.21

1 Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-15. Percentage distribution of 15-year-old students scoring at each proficiency level on the problem-solving scale, by country: 2003

United Kingdom1 13.7 0.75 30.3 1.09 36.6 1.00 19.5 0.97
1 Due to low response rates, data for the United Kingdom are not discussed in this report.
NOTE: In order to reach a particular proficiency level, a student must have been able to correctly answer a majority of items at that level. Students were classified into problem-solving levels according to their scores. Exact cut point scores are as follows: below level 1 (a score less than or equal to 404.06); level 1 (a score greater than 404.06 and less than or equal to 498.08); level 2 (a score greater than 498.08 and less than or equal to 592.10); level 3 (a score greater than 592.10). The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error. Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-16. Average reading literacy scores of 15-year-old students, by country: 2000 and 2003
Country 2000 2003
                      Average   s.e.   Average   s.e.
OECD average          500.0     0.64   494.2     0.65
United Kingdom1       523.4     2.56   507.0     2.46

— Not available.
1 Due to low response rates, 2003 data for the United Kingdom are not discussed in this report.
NOTE: The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2000 and 2003.
Table B-17. Average science literacy scores of 15-year-old students, by country: 2000 and 2003
Country 2000 2003
                      Average   s.e.   Average   s.e.
OECD average          500.0     0.65   499.6     0.60
United Kingdom1       532.0     2.69   518.4     2.52

— Not available.
1 Due to low response rates, 2003 data for the United Kingdom are not discussed in this report.
NOTE: The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2000 and 2003.
Table B-18. Average combined mathematics literacy scores of 15-year-old students, by sex and country: 2003
                      Male             Female           Male-female score
Country               Average   s.e.   Average   s.e.   point difference   s.e.
OECD average          505.5     0.76   494.4     0.76   11.1               0.81
United Kingdom1       511.8     2.90   505.1     3.88   6.7                4.90

1 Due to low response rates, 2003 data for the United Kingdom are not discussed in this report.
NOTE: The male-female score point difference is calculated by subtracting the average scores of females from the average scores of males. The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error. Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-19. Percentage distribution of 15-year-old students scoring at each proficiency level on the combined mathematics literacy scale, by sex and country: 2003

United Kingdom1 5.3 0.65 5.2 0.72 12.1 0.96 12.9 0.89
See notes at end of table.
Table B-19. Percentage distribution of 15-year-old students scoring at each proficiency level on the combined mathematics literacy scale, by sex and country: 2003—Continued

United Kingdom1 19.5 1.54 22.7 1.38 25.8 1.52 25.4 1.04
See notes at end of table.
Table B-19. Percentage distribution of 15-year-old students scoring at each proficiency level on the combined mathematics literacy scale, by sex and country: 2003—Continued

United Kingdom1 21.4 0.87 19.9 1.14 11.6 1.15 10.5 1.14
See notes at end of table.
Table B-19. Percentage distribution of 15-year-old students scoring at each proficiency level on the combined mathematics literacy scale, by sex and country: 2003—Continued

                      Level 6
Country               Male             Female
                      Percent   s.e.   Percent   s.e.
OECD average          5.1       0.14   2.9       0.10
United Kingdom1       4.4       0.59   3.4       0.68

† Not applicable.
# Rounds to zero.
1 Due to low response rates, 2003 data for the United Kingdom are not discussed in this report.
NOTE: In order to reach a particular level, a student must have been able to correctly answer a majority of items at that level. Students were classified into mathematics literacy levels according to their scores. Exact cut point scores are as follows: below level 1 (a score less than or equal to 357.77); level 1 (a score greater than 357.77 and less than or equal to 420.07); level 2 (a score greater than 420.07 and less than or equal to 482.38); level 3 (a score greater than 482.38 and less than or equal to 544.68); level 4 (a score greater than 544.68 and less than or equal to 606.99); level 5 (a score greater than 606.99 and less than or equal to 669.3); level 6 (a score greater than 669.3). The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error. Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-20. Average mathematics literacy subscale scores of 15-year-old students, by sex and country: 2003
                      Space and shape
                      Male             Female           Male-female score
Country               Average   s.e.   Average   s.e.   point difference   s.e.
OECD average          504.6     0.81   487.9     0.79   16.7               0.90
United Kingdom1       523.1     2.90   517.5     3.84   5.6                4.87

† Not applicable.
# Rounds to zero.
1 Due to low response rates, 2003 data for the United Kingdom are not discussed in this report.
NOTE: The male-female score point difference is calculated by subtracting the average scores of females from the average scores of males. The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error. Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-21. Average problem-solving scores of 15-year-old students, by sex and country: 2003

                      Male             Female           Male-female score
Country               Average   s.e.   Average   s.e.   point difference   s.e.
OECD average          499.2     0.76   500.9     0.77   -1.7               0.82
United Kingdom1       505.7     2.97   514.1     3.50   -8.4               4.51

† Not applicable.
# Rounds to zero.
1 Due to low response rates, 2003 data for the United Kingdom are not discussed in this report.
NOTE: The male-female score point difference is calculated by subtracting the average scores of females from the average scores of males. The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error. Detail may not sum to totals because of rounding.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-22. Average combined reading literacy and science literacy scores of 15-year-old students, by sex and country: 2000 and 2003
                      Reading literacy in 2000
                      Male             Female           Male-female score
Country               Average   s.e.   Average   s.e.   point difference   s.e.
OECD average          484.8     0.82   516.5     0.75   -31.7              0.94
United Kingdom1       520.2     3.14   516.8     3.98   3.4                5.16

— Not available.
1 Due to low response rates, 2003 data for the United Kingdom are not discussed in this report.
NOTE: The male-female score point difference is calculated by subtracting the average scores of females from the average scores of males. The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2000 and 2003.
Table B-23. Mean International Socioeconomic Index (ISEI) score of 15-year-old students, by quarters of the ISEI index and country: 2003

                      Mean             Bottom quarter   Second quarter   Third quarter    Top quarter
Country               Index            Index            Index            Index            Index
                      score   s.e.     score   s.e.     score   s.e.     score   s.e.     score   s.e.
OECD average          48.8    0.08     28.2    0.04     42.3    0.08     53.2    0.09     71.2    0.13
OECD countries
United Kingdom2       49.6    0.39     28.5    0.14     43.0    0.14     53.2    0.09     71.6    0.19

1 The item response rate for ISEI for New Zealand is below 85 percent. Missing data have not been explicitly accounted for. See also table A-2.
2 Due to low response rates, 2003 data for the United Kingdom are not discussed in this report.
NOTE: The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. The International Socioeconomic Index (ISEI) is an internationally comparable index of occupational status, with a range of approximately 16 to 90, developed by Ganzeboom, De Graaf, and Treiman (1992). s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Table B-24. Average combined mathematics literacy scores of 15-year-old students, by quarters of the International Socioeconomic Index (ISEI) and country: 2003

                      Mathematics literacy
                      Bottom quarter   Second quarter   Third quarter    Top quarter
Country               Average   s.e.   Average   s.e.   Average   s.e.   Average   s.e.
OECD average          455.5     0.92   493.2     0.75   516.1     0.73   547.7     0.84
United Kingdom2       469.1     2.92   500.0     3.06   519.1     3.53   555.1     3.43

1 The item response rate for ISEI for New Zealand is below 85 percent. Missing data have not been explicitly accounted for. See also table A-2.
2 Due to low response rates, 2003 data for the United Kingdom are not discussed in this report.
NOTE: The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. The International Socioeconomic Index (ISEI) is an internationally comparable index of occupational status developed by Ganzeboom, De Graaf, and Treiman (1992). s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
International Outcomes of Learning in Mathematics Literacy and Problem Solving
Table B-25. Change in the combined mathematics literacy and problem-solving scores per one standard deviation change in the International Socioeconomic Index (ISEI) score, by country: 2003

                  Mathematics literacy     Problem solving
Country           Change    s.e.           Change    s.e.
United Kingdom2     31.8    1.46             29.7    1.63

—Not available.
1The item response rate for ISEI for New Zealand is below 85 percent. Missing data have not been explicitly accounted for. See also table A-2.
2Due to low response rates, 2003 data for the United Kingdom are not discussed in this report.
NOTE: The OECD average is the average of the national averages of the OECD member countries with data available. Because PISA is principally an OECD study, the results for non-OECD countries are displayed separately from those of the OECD countries and are not included in the OECD average. The International Socioeconomic Index (ISEI) is an internationally comparable index of occupational status, with a range of approximately 16 to 90, developed by Ganzeboom, De Graaf, and Treiman (1992). The overall linkage of ISEI to mathematics literacy and problem solving is examined using the specific change in score on the combined mathematics literacy scale or problem solving in response to a one standard deviation change in the ISEI index score for each country. A greater increase in achievement score in a country implies a stronger relationship between socioeconomic status and performance in that country. s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
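The quantity in this table is, in essence, a regression slope rescaled to a one-standard-deviation change in ISEI. A minimal sketch of that computation, again on simulated data (not actual PISA data, and without PISA's weighting and plausible-value machinery):

```python
import numpy as np

# Simulated (hypothetical) student data, as before.
rng = np.random.default_rng(0)
isei = rng.normal(45, 16, 1000)
score = 480 + 2.0 * (isei - 45) + rng.normal(0, 80, 1000)

# Ordinary least-squares slope of score on ISEI (points per ISEI unit),
# then multiplied by the standard deviation of ISEI to express the
# expected score change per one-SD change in ISEI.
slope = np.polyfit(isei, score, 1)[0]
change_per_sd = slope * isei.std(ddof=1)
```

With a true slope of 2 points per ISEI unit and an ISEI standard deviation near 16, `change_per_sd` comes out on the order of 30 points, comparable in magnitude to the values reported in the table.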
Table B-26. Average combined mathematics literacy, problem-solving, reading literacy, and science literacy scores of U.S. 15-year-old students, by race/ethnicity: 2003

                     Mathematics literacy   Problem solving    Reading literacy    Science literacy
Race/ethnicity       Average    s.e.        Average    s.e.    Average    s.e.     Average    s.e.
Total                  482.9    2.95          477.4    3.13      495.2    3.22       491.3    3.08
White                  511.6    2.51          505.7    2.54      524.8    2.57       521.6    2.60
Black                  417.3    5.08          413.2    5.69      429.9    5.62       422.7    4.69
Hispanic               442.7    5.13          435.6    5.54      452.6    5.86       448.1    5.63
Asian                  506.3    9.79          505.3    9.94      513.1    9.22       508.9   10.59
More than one race     502.2    6.36          497.5    7.05      515.2    7.35       517.0    7.21
OECD average           500.0    0.63          500.0    0.65      494.2    0.65       499.6    0.60

NOTE: Reporting standards not met for American Indian/Alaska Native and Native Hawaiian/Other Pacific Islander; thus, they are included in the total, but not reported separately. Black includes African American and Hispanic includes Latino. Racial categories exclude Hispanic origin. s.e. means standard error.
SOURCE: Organization for Economic Cooperation and Development (OECD), Program for International Student Assessment (PISA), 2003.
Appendix C: TIMSS-PISA 2003 Expert Panelists
Ramesh Gangolli
Professor, Department of Mathematics
University of Washington
Seattle, WA

Patricia Harvey
Superintendent
Saint Paul Public Schools
Saint Paul, MN

Director of Education Indicator Programs
Council of Chief State School Officers
Washington, DC

Betsy Brand
Co-Director
American Youth Policy Forum
Washington, DC

Nancy R. Bunt, Ed.D.
Program Director
Math & Science Collaborative
Allegheny Intermediate Unit
Homestead, PA

Rodger Bybee
Executive Director
Biological Sciences Curriculum Study
Colorado Springs, CO

Joan Ferrini-Mundy
Professor, Department of Mathematics
Michigan State University
East Lansing, MI
Appendix D: PISA Online Resources and Publications
International Publications
The following publications are intended to serve as examples of some of the numerous reports that have been produced in relation to PISA by the OECD and other international organizations. All of the publications listed here are available at http://www.pisa.oecd.org.

Summary and Achievement Reports
Organization for Economic Cooperation and Development (OECD). (2001). Knowledge and Skills for Life: First Results from the OECD Programme for International Student Assessment. Paris: Author.

Thematic Reports
Artelt, C., Baumert, J., Julius-McElvany, N., and Peschar, J. (2003). Learners for Life: Student Approaches to Learning. Results from PISA 2000. Paris: OECD.

Kirsch, I., de Jong, J., Lafontaine, D., McQueen, J., Mendelovits, J., and Monseur, C. (2002). Reading for Change: Performance and Engagement Across Countries. Results from PISA 2000. Paris: OECD.

Willms, J.D. (2003). Student Engagement in School: A Sense of Belonging and Participation. Results from PISA 2000. Paris: OECD.

Technical Reports and Frameworks
Adams, R. (Ed.). (2003). PISA 2000 Technical Report. Paris: Organization for Economic Cooperation and Development (OECD).

Organization for Economic Cooperation and Development (OECD). (2000). Measuring Student Knowledge and Skills: The PISA 2000 Assessment of Reading, Mathematical and Scientific Literacy. Paris: Author.

Organization for Economic Cooperation and Development (OECD). (1999). Measuring Student Knowledge and Skills: A New Framework for Assessment. Paris: Author.

Organization for Economic Cooperation and Development (OECD). (2002). Programme for International Student Assessment (PISA): Manual for the PISA 2000 Database. Paris: Author.
Online Resources
The PISA NCES website (http://nces.ed.gov/surveys/pisa) provides background information on the PISA surveys, copies of NCES publications that relate to PISA, and sample PISA items from previous assessments.

NCES Publications
The following publications are intended to serve as examples of some of the numerous reports that have been produced in relation to PISA by NCES. All of the publications listed here are available at http://nces.ed.gov/surveys/pisa.

Summary Reports
Lemke, M., Calsyn, C., Lippman, L., Jocelyn, L., Kastberg, D., Liu, Y., Roey, S., Williams, T., Kruger, T., and Bairu, G. (2001). Highlights from the 2000 Program for International Student Assessment (NCES 2002–116). U.S. Department of Education, National Center for Education Statistics. Washington, DC: U.S. Government Printing Office.

Lemke, M., Calsyn, C., Lippman, L., Jocelyn, L., Kastberg, D., Liu, Y.Y., Roey, S., Williams, T., Kruger, T., and Bairu, G. (2001). Outcomes of Learning: Results from the 2000 Program for International Student Assessment of 15-Year-Olds in Reading, Mathematics, and Science Literacy (NCES 2002–115). U.S. Department of Education, National Center for Education Statistics. Washington, DC: U.S. Government Printing Office.

Thematic Reports
Lemke, M., Sen, A., Pahlke, E., Williams, T., Kastberg, D., and Jocelyn, L. (forthcoming). Characteristics of U.S. 15-Year-Old Low Achievers in an International Context: Findings from PISA 2000 (NCES 2002–005). U.S. Department of Education, National Center for Education Statistics. Washington, DC: U.S. Government Printing Office.

Data Products
U.S. Department of Education, National Center for Education Statistics. (2004). Program for International Student Assessment (PISA) 2000 Data File (NCES 2004–006). Washington, DC: Author.