-
EVALUATION REVIEW / AUGUST 2001
Phelps / WORLD'S BEST IN MATHEMATICS
This article describes the education quality control systems (for
mathematics) used by those countries that performed best on the Third
International Mathematics and Science Study (TIMSS). Enforced quality
control measures are defined as decision points where adherence to the
curriculum and instruction system can be reinforced. Most decision
points involve stakes for the student, teacher, or school. They involve
potential consequences for failure to adhere to the system and to
follow the program at a reasonable pace. Generally, countries with more
decision points perform better on the TIMSS. When the number of
decision points and TIMSS test scores are adjusted for country wealth,
the relationship between the degree of (enforced) quality control and
student achievement appears to be positive and exponential. Conclusion:
The more (enforced) quality control measures employed in an education
system, the greater is students' academic achievement.
BENCHMARKING TO THE WORLD'S BEST IN MATHEMATICS
Quality Control in Curriculum and Instruction Among the Top
Performers in the TIMSS
RICHARD P. PHELPS
Westat
We have made considerable progress because we resisted the temptation
to put our faith in any single gimmick or formula for school
improvement. School systems are complex, and looking for a simple
solution is, well, simple-minded.
Rod Paige, as Superintendent of the Houston Independent School
District (currently, he is U.S. Secretary of Education)
Integrated systems that work well together are the essence of
civilization.
Irving Wladawsky-Berger, general manager, Internet Division, IBM
AUTHOR'S NOTE: The author would like to acknowledge the contribution
of others to this work. Lois Peak of the U.S. Department of
Education's Planning and Evaluation Service conceived the data
collection, helped design the data collection instrument, and
recruited the country expert respondents. She is not responsible,
however, for the analysis and, therefore, should
EVALUATION REVIEW, Vol. 25 No. 4, August 2001 391-439
© 2001 Sage Publications
-
The United States has participated in five international assessments
of student achievement in mathematics and science since the 1960s.
Each time, the comparison of U.S. student performance to their
international counterparts has provoked widespread interest from
researchers, policy makers, and the public at large. The occasions
have prompted wholesale critiques and defenses of the U.S. education
system in the popular press. The scholarly press, in the meantime, has
been filled with studies of U.S. relative achievement in the context
of various background factors, such as the average educational
attainment level or socioeconomic status of the test-takers' parents
or the level of public education funding.
Most attention has focused on the validity of country-average test
score comparisons in the light of differences in the mechanics of test
administrations and sample selection across countries, with critics
claiming that the differences nullify valid comparisons. Defenders of
the country-average test score comparisons have argued that the
differences in the test administration mechanics do not invalidate
comparisons because they are not large enough or they should average
out over time. They argue that comparative U.S. mathematics
performance at the 8th-grade level has been relatively consistent over
five assessments and three decades.
The background analyses probing the deepest have searched for
explanations of relative achievement in the curriculum of each
country. The Second International Mathematics and Science Study
(SIMSS) in the early 1980s spawned The Underachieving Curriculum, a
critique of the prevailing U.S. mathematics curriculum written by some
of the U.S. researchers directly involved in building and analyzing
the SIMSS database (McKnight et al. 1987). Some of the same
researchers were involved in building and analyzing the database for
the Third International Mathematics and Science Study (TIMSS),
administered in the 1994-1995 school year. Their main curriculum
analysis studies, A Splintered Vision: An Investigation of U.S.
Science and Mathematics Education, Many Visions, Many Aims: A
Cross-National Investigation of Curricular Intentions in School
Mathematics, and Characterizing Pedagogical Flow: An Investigation of
Mathematics and Science Teaching, echoed the critical refrain of
Underachieving Curriculum (Schmidt et al. 1996a, 1996b, 1997). The
U.S. mathematics curriculum, by comparison with its international
counterparts, lacked focus and depth. One of the most widely quoted
phrases from one of the study's authors characterized the U.S. math
curriculum as "a mile wide and an inch deep."
not be held liable for any of its errors. Ellen Pechman and Rolf
Blank reviewed early drafts of the questionnaires used in this study
and provided helpful comments on them. The author retains all
responsibility for any errors in this article.
-
Other studies have looked deeply at instructional practices across
countries. Over the past two decades, Harold Stevenson and James
Stigler (1992) have observed and compared classroom culture and
instructional practices in the United States and East Asian countries
and have discovered some highly enlightening contrasts. Coincident
with the TIMSS, James Stigler videotaped many hours of secondary-level
mathematics classroom instruction in samples of German, Japanese, and
U.S. schools. The contrasts in instructional style, demeanor, and
content are striking (Office of Educational Research and Improvement
1997a).
Still other studies have looked more explicitly at the benefits,
methods, and feasibility of benchmarking curricular and instructional
practices across countries. To this effort, some researchers have
focused on content standards (Beatty 1997; Resnick, Nolan, and Resnick
1995; Nolan 1997; Louis and Versloot 1996) and others on performance
standards (Britton and Raizen 1996; Eckstein and Noah 1993; Gandal
1997; Stevenson and Lee 1997). Still other researchers have argued for
more comprehensive comparisons of education systems across countries
and the impact of many systemic influences on curriculum and
instruction (Bishop 1997; Mullis 1997a), or they have advocated
efforts toward benchmarking entire systems of curriculum and
instruction (Cross and Stempel 1995; Shanker 1996; U.S. Department of
Education 1995).
This report aims to supplement the aforementioned curriculum and
instruction studies with a look behind the scenes at the formation and
implementation of both. It takes one giant step back in the process to
better understand the superstructure of other countries' curriculum
and instruction systems and the glue that holds that superstructure
together, to better understand how other countries see to it that the
curriculum they intend is attained. Essentially, it focuses on how
top-performing countries control quality in their curriculum and
instruction systems.
This article exploits information gathered in a study sponsored by the
U.S. Department of Education's Office of Educational Research and
Improvement and National Center for Education Statistics, and a
variety of other sources, in an attempt to capitalize on the occasion
and the wealth of information provided by the TIMSS, to better
understand U.S. mathematics and science education in its international
context.
THE TIMSS
There are perhaps no singular events that elicit more public judgment
of the quality of U.S. elementary and secondary education than the
periodic release of results from international student assessments.
The TIMSS, administered in 1994-1995, was the largest such assessment
ever, with more than 40 countries participating at one or more of
three grade levels: the rough equivalents of our 4th, 8th, and 12th
grades. Results for the grade level at which the most countries
participated, 8th grade, were released first.
When the mathematics performance of U.S. 8th graders was compared to
their international counterparts in the summer of 1996, it seemed to
reaffirm in the minds of many U.S. observers the legacy of pessimism
from earlier international assessments. Among the 40 countries with
student scores meeting minimal statistical requirements for
comparison, U.S. 8th graders scored lower than 8th graders in 20 other
countries and higher than those in only 7, when measured by a multiple
comparison procedure involving all participating countries. U.S.
students' scores were on a par with those of students in 13 remaining
countries (Beaton 1996, 23).
The performance of U.S. 4th graders, made public the following summer,
seemed much better. A multiple comparison procedure showed U.S. 4th
graders scoring below their counterparts in 7 countries, above those
in 12, and on a par with those in 6 other countries (Mullis 1997b, 25).
In between the relatively strong U.S. 4th-grade performance and the
relatively weak U.S. 8th-grade performance were three grade levels and
a steep decline in U.S. relative performance. Among all the 25
countries that participated at both the 4th- and 8th-grade levels and
met minimal statistical requirements for comparison, the synthetic
gain in mathematics achievement between the 4th and 8th grades
appeared to be the smallest in the United States (Mullis 1997b, 43).
One could speculate that the longer students stayed in U.S. schools,
the less they learned, by comparison with average academic progress in
other education systems.
The release of the 12th-grade results in 1998 only seemed to confirm
the most pessimistic predictions. The unfortunate trend in relative
U.S. student performance continued downward through the upper
secondary years (Mullis et al. 1998).
EXPLANATIONS FOR THE U.S. TEST PERFORMANCE
Ultimately, however, test score comparisons alone do not tell the
whole story. There can, after all, be many explanations for any
country's disappointing test performance. An explanation might lie in
the mechanics of the test administration, perhaps, if one country's
students were younger in age for their grade level, or the test was
given earlier in the academic year. Likewise, an explanation might lie
in the social background from which each student emerges if one
country has relatively higher proportions of nonnative speakers of the
primary language or households in poverty, for example.
Likewise, some explanation might lie in the structure and procedures
of each country's education system. The aforementioned reports from
the U.S. TIMSS Committee argued that the U.S. mathematics curriculum
lacks the focus and depth often found in other countries. One could
argue that the videotape studies of James Stigler showed the same to
be true in the conduct of classroom mathematics instruction.
Ina Mullis, of the International TIMSS Center at Boston College,
observed that the top-performing countries at the 8th-grade level were
more likely to have high-stakes examination systems than were other
countries (Mullis 1997a). John Bishop, the Cornell labor economist,
has found statistically significant effects from the existence of
high-stakes examination systems on student test performance using data
sets of the 1991 International Assessment of Educational Progress
(IAEP) across countries or across Canadian provinces, of the
Scholastic Assessment Test across U.S. states, and now across
countries with the TIMSS. His discovery of significant effects is all
the more remarkable because the high-stakes tests in some of the
countries, states, or provinces are upper secondary-level exit
examinations, given to students when they are 17, 18, or 19 years old,
whereas the tests providing his measures of achievement in the case of
the IAEP and the TIMSS were administered to 13-year-olds. He calls the
alleged effect of high-stakes upper secondary exit exams on the
behavior of students and teachers at the lower secondary level a
"backwash effect" (Bishop 1997).
In another study, "Impacts of School Organization and Signaling on
Incentives to Learn in France, England, Scotland, the Netherlands, and
the United States," Bishop (1993) expanded his analysis to include
signals of student performance and expectations other than those
derived from examinations, such as the publication of exam results,
retention in grade, selection of students for different curricular
tracks (e.g., academic, vocational, general), amount of homework
required, looping of teachers over several grade levels with the same
students so that the person responsible for teaching particular
students was identifiable, and so on.
BENCHMARKING
Coincident with the student performance comparisons of the past
decade, several groups have studied the curricula of other countries
and compared them with curricula typically found in the United States.
Most commonly, these studies have focused on the content of mandated,
large-scale examinations as the most concise representations of a
curriculum. Under Chairman Lynne Cheney in 1991, the National
Endowment for the Humanities translated and published side-by-side
comparisons of secondary-level history examinations from France,
Germany, Japan, England and Wales, and Belgium (National Endowment
1991). The New Standards Project (1994) and the National Center on
Education and the Economy (1994) translated and compared several
countries' mathematics examinations. The National Center for Improving
Science Education translated and compared several countries' science
examinations (Britton and Raizen 1996). The American Federation of
Teachers has done the same in several subject areas (e.g., American
Federation of Teachers 1995a, 1995b). The National Center for
Education Statistics sponsored work by the Pelavin Research Institute
(1996) comparing national assessments in Canada, England and Wales,
France, and the United States.
The Council for Basic Education has gone a step further in its Schools
Around the World Project, enlisting the cooperation of classrooms in
eight countries to participate in an exercise that will compare
several kinds of student work, including homework and term papers,
rather than just examinations (Council for Basic Education 1996).
The American Federation of Teachers has proposed institutionalizing
efforts such as these while providing an ongoing reference source for
U.S. schools in a U.S. national benchmarking institute. The institute
would assist U.S. states and local school districts to conduct
systematic exercises in benchmarking elements of their curriculum and
instruction to those in other countries, states, and districts
(American Federation of Teachers 1995a, 1995b).
All these groups have searched for appropriate benchmarks for help in
designing U.S. curriculum and instruction to appropriate levels of
depth and difficulty. All these groups realize, however, that
benchmarking simply to a result does little, in and of itself, to help
achieve the result. To use benchmarking to achieve a desired result,
one must benchmark to a behavior that one believes will produce the
result.
RESEARCH FOCUS AND SURVEY
This study attempts to understand the superstructure of the education
systems that support curriculum and instruction leading to high
performance. What is the glue that holds that superstructure together?
Given the intended curriculum in each country, how is the intended
curriculum implemented and attained? How do top-performing countries
control quality in their curriculum and instruction systems?
In 1997, a detailed, 15-plus page questionnaire on this topic was
assembled, and knowledgeable experts in their respective countries'
education systems were asked to fill it out and return it. The
questionnaire, with abbreviated versions of each country's responses,
is available from the author upon request.
The long, but accurate, title of the survey was "Exploratory Survey on
the Relationships Among Content Standards, Textbooks, Student
Performance Standards, and Examinations in Secondary School
Mathematics." The title emphasizes the interest in the connections
between the main elements of any country's curriculum and instruction
system. The intent of the survey was to learn how and to what degree
these elements were integrated in top-performing countries.
Given limited resources, the survey focused on mathematics alone. The
reader should realize that conclusions drawn from studying curriculum
and instruction in one subject area are not necessarily wholly
applicable to others.
The questionnaire consisted of two parts. Part 1 contained questions
pertaining to content standards, textbooks, student performance
standards, and international benchmarking activities. Part 2 focused
on the application of student performance standards at decision
points. Experts filled in a separate Part 2 for every decision point
their country used. A decision point was defined as an occasion when a
student performance standard is actually applied: a judgment is made
(for example, that a student achieves or does not achieve a standard)
and an appropriate consequence results. Most often, decision points
consist of high-stakes tests or selective admissions to certain
schools or curricular tracks.
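To make the definition concrete, the following is a minimal sketch of how decision points might be recorded and tallied per country. It is an illustration only, not the instrument or coding scheme used in the survey. The class name, field names, and the "stakes_for" values are assumptions introduced here; the four Singapore occasions follow the examinations described in the article's discussion of vertical coherence, and the resulting tally is not the article's own count.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionPoint:
    """One occasion on which a student performance standard is applied."""
    country: str
    occasion: str    # e.g., an exit examination or a selective admission
    stakes_for: str  # assumed field: "student", "teacher", or "school"


# Occasions paraphrased from the article's description of Singapore's
# examinations; the "stakes_for" values are illustrative assumptions.
decision_points = [
    DecisionPoint("Singapore", "exit from primary school", "student"),
    DecisionPoint("Singapore", "exit from lower secondary school", "student"),
    DecisionPoint("Singapore", "exit from upper secondary school", "student"),
    DecisionPoint("Singapore", "selection to preferred curricular tracks", "student"),
]


def count_decision_points(points):
    """Return the number of recorded decision points for each country."""
    return Counter(point.country for point in points)


counts = count_decision_points(decision_points)
print(counts["Singapore"])
```

Recording each decision point as a separate entry mirrors the survey design, in which experts filled in a separate Part 2 for every decision point their country used.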
SELECTION OF FOCUS GROUP OF COUNTRIES
Countries from which the United States could learn something on the
topic of education system integration and quality control were
selected. The size of the group was limited to nine.1 The first
criterion for selection was a superior performance on the TIMSS
8th-grade mathematics test.
Rather than just pick the nine countries ranked highest by average
8th-grade TIMSS mathematics score, however, other criteria were
imposed on the selection. It was deemed important, for example, to
make sure that some countries with some basic education system
characteristics similar to our own, such as large size and a federal
structure, were picked. Singapore's students scored higher than any
other country's on 8th-grade mathematics, but even though we might be
able to learn a lot from Singapore's education system, the United
States cannot become very much like Singapore. Singapore is of
relatively small size and has a highly centralized education system,
both natural advantages for creating cohesion. So, although Singapore
was included in the focus group, there was felt to be no need to
include more countries like Singapore but some need to include
countries more like the United States (i.e., large, diverse, with a
federal system).
So, moving down the list of top-performing countries, selection was
biased in favor of countries that could both diversify the focus group
and ensure that some countries more like us were included. Thus,
Australia was selected over Austria and Hungary, for example, because
of its large size and federal structure and because other countries
like Austria and Hungary had already been selected.
Table 1 lists the 13 countries with the highest average 8th-grade
TIMSS mathematics scores and other criteria by which countries were
selected for inclusion in the focus group.
TABLE 1: Focus Group of Countries Ranked in Order of Average
Eighth-Grade Third International Mathematics and Science Study
Mathematics Score, by Reason for Selection

Column headings: Country; Education System That Is Not Centrally
Controlled; Diverse or Large Population; Given the Character of
Countries Above Already Picked, the Addition of This Country Offers
Diversity in Its Geographic Location or Governance Structure

Singapore (a): Yes
Korea (a): Yes
Japan (a): Yes Yes
Hong Kong:
Belgium, Flemish (a): Yes Yes
Czech Republic (a): Yes
Slovak Republic:
Switzerland (a): Yes Yes Yes
The Netherlands (a): Yes Yes Yes
Slovenia:
Bulgaria:
Austria:
France (a): Yes Yes Yes
Hungary:
Russia: (changing) Yes Yes
Australia (a): Yes Yes Yes Yes
Ireland:

a. Country selected for our focus group.
-
PROGRESS OF THE SURVEY
In time, some very detailed, thoughtful responses were returned; other
brief, but still very thoughtful, responses were returned; and two
countries, Australia and the Netherlands, did not respond. Survey
results were then supplemented with information from other sources.
Responses were received from experts in Singapore, Korea, France,
Japan, Switzerland, the Czech Republic, and the Flemish Community of
Belgium. Other sources were consulted to learn about the Netherlands
because it provides such an interesting contrast to Flemish Belgium
and shares so many important governance characteristics with the
United States. Not enough information was gathered to provide a
representative picture of Australia, unfortunately, and it had to be
dropped from the group.
All countries that returned questionnaires provided fairly complete
and thoughtful responses to Part 1, which posed questions on
standards, textbooks, and benchmarking, with the exception of Section
C on student performance standards. Part 2, which posed questions
regarding the application of student performance standards at decision
points, received a fairly poor response. One cannot be certain of the
reason, but some respondents may not have well understood what was
meant by "student performance standard." Fortunately, some country
experts provided equivalent information in their other responses to
Part 1. Information provided in the questionnaires was verified by
country experts in the United States or from written sources.
For the remaining countries, and to fill in any missing information
from the responding countries, other sources of information were
sought. These other sources are listed by country in the appendix.
In the end, the exploratory survey provided results that traced the
outline of the curriculum and instruction picture, but, ultimately, no
information from the survey alone was used to draw any conclusions in
this analysis.
ANALYSIS: HOW COUNTRIES CONTROL QUALITY
IN CURRICULUM AND INSTRUCTION
COHERENCE
The analysis adopts the common and useful framework of vertical and
horizontal coherence, widely used by education policy analysts in
recent years as a rough device for measuring the degree to which
curriculum and instruction systems are integrated. A completely
coherent system would be one with a seamless integration among the
various system elements: content standards (the intended curriculum)
represented completely and precisely in textbooks, student performance
standards, and examinations, and evaluations of performance
representing completely and precisely the mastery of the content.
A system with complete vertical coherence is one in which the
intentions of educators at the top of the system (e.g., in the country
or state education ministry) are represented completely and precisely
in the classroom. A system with complete horizontal coherence is one
in which the content standards are represented completely and in
precisely the same way in every classroom throughout the country or
state.
No country- or state-level education system can have complete,
absolute coherence in curriculum and instruction, of course. Only a
system consisting of a single classroom with a single teacher who also
serves as education minister could offer that. But some education
systems make a greater effort than others to maintain coherence, and
some are more successful than others in that effort.
Of course, maintaining coherence may be easier in some contexts than
in others. Education systems that are small and highly centralized
(e.g., Singapore) probably pose the least amount of difficulty.
Education systems that are large and highly fragmented among levels of
government and types of governance (e.g., United States) probably pose
the greatest amount of difficulty. Some might argue, however, that the
system of governance in education itself should be considered as a
characteristic that can be altered, along with others, if need be, to
improve system coherence.
VERTICAL COHERENCE
Vertical coherence implies a process whereby there is a match between
the intended curriculum and the attained curriculum: what students
learn. Between the initial writing of content standards and the final
mastery by students of subject matter, there may be many interim
steps, several layers of government, several organizations involved, a
long time lag, and other potential barriers to complete coherence. How
does an education system maintain coherence in the face of natural
entropy?
Singapore provides a good example of a country with a high degree of
vertical coherence. The Ministry of Education (1993) writes content
standards, curriculum guides, and some textbooks. Some content is
prescribed by the University of Cambridge syndicate, of which
Singapore is a member. The ministry trains the teachers in a single,
in-house training institute. The ministry has jurisdiction over all
schools, both government and government-dependent private schools. The
ministry sends out subject specialist inspectors to monitor classroom
instruction. Whenever there are curriculum changes, teachers attend
workshops on these changes run by the ministry. Teachers participate
in writing and scoring national examinations. There are lots of
examinations: on exiting primary school (and getting places in a
secondary school of choice), on exiting lower secondary school, on
exiting upper secondary school, and for selection to preferred
curricular tracks at various points.
In other words, in Singapore, the Ministry of Education (1993)
controls most aspects of the process itself, closely monitors
classroom instruction, and ties teachers to the examination program by
involving them in writing and scoring the examinations.
Korea's system has more variety and diversity in some ways. There are
more curricular tracks, particularly for vocational education.
Regional governments have some say in how the system is run. Still,
the curriculum and instruction process is highly centralized, course
content is prescribed by the ministry, and the ministry administers
standardized, high-stakes examinations.
Another avenue, outside a single, centralized authority, for
maintaining a high degree of vertical coherence is within subject
areas rather than over the system as a whole. For example, in some
countries, mathematics departments in universities train mathematics
teachers, grant teacher certifications, write content and student
performance standards, write texts, inspect classes, and write and
score examinations with teachers' help. Elements of this kind of
vertical coherence exist in the Netherlands and Switzerland.
HORIZONTAL COHERENCE
Horizontal coherence implies a process whereby the curriculum and
instruction in one part of a country or state matches that in another
part of the country or state. How does a country maintain horizontal
coherence? It can mandate a common core curriculum; use common, unique
textbooks; train teachers in a single institution or in multiple
institutions with one prescribed, standardized program; centralize the
approval of curriculum plans, timetables, and inspections; inspect
school classrooms with subject area experts to see if curriculum and
timetables are followed; establish networks of subject-area
professionals and involve them in writing standards, doing
inspections, and writing and scoring examinations; and advertise
standards to the public so they can hold their local schools
accountable.
The Netherlands provides a good example of an education system that
maintains a high degree of horizontal coherence. There are few
limitations on forming a school; most any religious or nonreligious
organization can do it. Any one school may have no necessary
connection with any schools at lower or higher levels of education nor
any administrative connection with the central government. Moreover,
there are no systemwide content standards or core curriculum. The
Netherlands maintains horizontal coherence primarily through frequent
administrations of nationally standardized high-stakes examinations.
Flemish Belgium maintains horizontal coherence without standardized
tests but with common texts and curriculum guides and widespread
public relations efforts that educate the public about what to expect
from their local schools.
Table 2 lists various methods that each of the countries in the focus
group uses to maintain vertical or horizontal coherence. "Yes" means
that a country uses the method, "no" means that it does not, and a
blank cell represents a lack of sufficient information to make a
judgment.
Few of the quality control methods listed in Table 2 are prevalent in
the United States.
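The three-way coding used in Table 2 (yes, no, or blank for insufficient information) can be sketched in code as follows. This is a hedged illustration of one way to keep "no" distinct from "unknown" when tallying practices, not the procedure used in the article. The function names and the placeholder systems ("System A" and so on) are hypothetical and do not correspond to any entries in Table 2; footnoted entries are not handled.

```python
from typing import Dict, Optional


def parse_cell(cell: str) -> Optional[bool]:
    """Map a table cell to True (yes), False (no), or None (blank)."""
    text = cell.strip().lower()
    if text == "yes":
        return True
    if text == "no":
        return False
    if text == "":
        return None  # insufficient information to make a judgment
    raise ValueError(f"unexpected cell value: {cell!r}")


def summarize(row: Dict[str, str]) -> Dict[str, int]:
    """Count yes, no, and unknown cells for one practice (one table row)."""
    values = [parse_cell(cell) for cell in row.values()]
    return {
        "yes": sum(value is True for value in values),
        "no": sum(value is False for value in values),
        "unknown": sum(value is None for value in values),
    }


# Hypothetical row for illustration only; not data from Table 2.
example_row = {"System A": "Yes", "System B": "No", "System C": ""}
summary = summarize(example_row)
print(summary)
```

Keeping blanks as a separate category avoids treating missing information as evidence that a practice is absent.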
TWO GROUPS OF COUNTRIES
The focus group of countries divides into two natural groups, as
characterized by their governance and their methods for maintaining
coherence.
Group 1: Highly centralized systems with highly prescribed content and
performance standards: Singapore, Korea, Czech Republic, France, Japan.
It is perhaps easy to understand how these countries manage quality
control and maintain coherence in curriculum and instruction. Many of
the factors involved are controlled centrally. For example, the
already-described Singaporean and Korean systems are highly
centralized.
France also has highly centralized standard-setting procedures, and
all teachers are employees of the central government. There is some
variety to examination writing from regional centers and some variety
of textbooks. Still, examinations are mostly similar, they are high
stakes, and they are numerous and prominent. The Conseil National de
Programmes operates much like an Inspector General's office, with
inspectors drawn from among the ranks of their office, of secondary
school teachers, of university professors, and of Ministry of
Education (1993) officials.
The Czech Republic and other formerly communist Eastern European
countries are in the process of moving away from this model. There are
TABLE 2: Education System Practices That Produce Vertical or
Horizontal Coherence in Curriculum and Instruction,by Country
andPractice
Quality Control Practice: Belgium (Flemish), Czech Republic, France, Korea, Japan, The Netherlands, Singapore, Switzerland
Practices that produce both vertical and horizontal coherence
  Content standards are fixed and are expected to be followed as a core curriculum: Yes Yes Yes Yes Yes Yes Yes[a]
  Teachers are required to teach core curriculum: Yes Yes No Yes Yes Yes No
  Common or unique textbooks are required to adhere closely to the content standards: Yes Yes Yes Yes Yes Yes[a]
  Centralized approval of curriculum plans, course timetables, and inspections: Yes Yes Yes Yes Yes Yes Yes
  Selective admission to curricular tracks based on standards: Yes Yes Yes Yes Yes Yes Yes Yes
  Inspections are done in classrooms, in some cases by curricular experts, and are standards based: Yes Yes Yes Yes Yes Yes
  Train teachers in a single institution or in multiple institutions with standardized, prescribed programs: Yes Yes
  High-stakes exit examinations from lower secondary level are standardized: Yes Yes Yes Yes Yes Yes
  High-stakes exit examinations from upper secondary level are standardized: Yes Yes Yes Yes Yes Yes Yes
Practices that produce vertical coherence
  Some teachers have the same group of students for more than 1 year: Yes Yes Yes Yes Yes
  Curricular tracking by school: Yes Yes Yes Yes Yes Yes Yes
  All students in a school (which may have a curricular focus and be selective) follow the same course of study: Yes Yes Yes Yes
  Establish networks of subject-area professionals and involve them in writing standards, doing inspections, and writing and scoring examinations: Yes Yes Yes Yes Yes
  Employers are directly involved in some aspects of the process: Yes Yes Yes Yes Yes
Practices that produce horizontal coherence
  Schoolwide curriculum plans with target goals are used to standardize and integrate curriculum and instruction: Yes Yes Yes
  Students do not begin homework during class time as instruction time is used to keep a set pace (> 50% of classrooms respond Yes): Yes Yes Yes Yes Yes
  Involve educators from around the country in developing the standards: Yes Yes Yes Yes Yes Yes
  Involve educators from around the country in writing and revising the textbooks: Yes Yes Yes
  Advertise common standards to public so they hold local schools accountable: Yes Yes
  Selective admission criteria to curricular tracks are standardized: Yes Yes Yes Yes Yes Yes Yes
a. Yes for lower secondary, no for upper secondary.
discussions of lowering the required proportion of the core curriculum from 80% to 100% of what is taught to 50% of the curriculum or less, allowing more local control over the curriculum and reducing emphasis on math and science to make room in the curriculum for more social studies and humanities courses. It will be interesting to see if the high performance in math and science holds up in these countries after these changes are made.
Group 2: Decentralized systems with unprescribed aspects to the process of content or performance standard setting: Switzerland, Flemish Belgium, the Netherlands.
Of the focus group countries, Switzerland is closest in its governance structure to the United States but is different in other ways. For example, each Swiss teacher is supervised by an inspector; there are several curricular tracks and all have high-stakes exit examinations (some cantons also have exit exams at three levels: primary, lower secondary, and upper secondary); some of these tracks are also very selective in their entry; the national government does have some say over certification requirements at the upper secondary level; there are several national organizations, such as the Cantonal Directors of Education Pedagogical Commission, whose aim is to coordinate common standards, textbooks, and manuals across the country; teacher salaries are very high, and the occupation has much respect; and university experts supervise the examination process.
Contrasting Flemish Belgium and the Netherlands. Flemish Belgium is unique in our focus group of countries in that it does not have high-stakes exit examinations. To maintain coherence, it must control quality at the front end of the process. By contrast, the character of the education system in the Netherlands requires that quality control be maintained at the back end of the curriculum and instruction process.
In Flemish Belgium, the expert respondent claimed a 100% match between the content of textbooks and teaching materials and the content standards. The textbooks are written by the same people who develop the curriculum guides. The curriculum objectives are made public by the media and through public relations campaigns of the education ministry, complete with leaflets and brochures printed on a large scale and disseminated widely. With this, parents and the public can better judge their schools' performance because they can know what they are supposed to be teaching. Curriculum-based inspections are pervasive and are used to see if teachers are teaching the correct material and doing it on time, although it has been proposed that inspections be done only at the school level rather than at the classroom level.
That Flemish Belgium does not have high-stakes exit examinations does not, in itself, mean that students never risk rejection. Flemish Belgium maintains separate upper secondary level curricular tracks, some of which are highly selective. Getting into the track of one's choice may require a better school record than those of other students who wish entry into the same track. Moreover, teachers can still fail students, even without high-stakes standardized tests, and indeed, some educators in Flemish Belgium perceive a problem of too many grade repeaters at the upper secondary level.
In the neighboring Netherlands, one could describe the structure of the quality control system as the converse of Flemish Belgium's. The Netherlands maintains a very open system of school choice and a great variety of schools. There are Catholic schools, Protestant schools, Islamic schools, and Green schools; virtually any group can start a school and receive full public funding. These schools use a wide variety of textbooks and curriculum materials. Schools can choose their own curriculum, and the implementation of curricula is unsupervised by the government. Indeed, the national constitution prevents the establishment of an official curriculum.
The national government does offer guidance on a voluntary basis, maintaining local and regional advisory guidance centers, a national Curriculum Development Institute, a semiautonomous test development organization, tight subject-area networks of teachers who help to develop and score examinations, and university departments that have taken over some quality control functions within each respective subject area.
Also, the Netherlands administers high-stakes standardized examinations, prominently and frequently. The government allows much public input as to the content of the examinations, and topics that are culturally sensitive (e.g., evolution) might not be included. But once the content domain of the examinations is set, schools are required to administer them, and students are required to pass them.
As one spokesperson has written (Encyclopedia of Comparative 1988, 504),
The strongly differentiated Dutch system requires a radical decision about every pupil at the end of every school phase, a decision which, to a large extent, determines the pupil's future profession, income, and social standing.
Promotion from grade to grade in primary school is decided by norm-referenced tests. Those in the bottom quartile are not promoted while the others are.
Like the education system in Flemish Belgium, the Netherlands also creates a high number of failing students, which worries some educators.
COMPARING QUALITY CONTROL IN HIGH-ACHIEVING COUNTRIES TO THAT IN THE UNITED STATES
There are some characteristics of the curriculum and instruction quality control systems common to all or most of the countries in our focus group that contrast markedly with systems common in the United States.2
1. CLASSROOM- AND CURRICULUM-BASED INSPECTIONS
In the United States, school inspections are infrequent and are done on a schoolwide basis, with the school as a whole attaining or not attaining accreditation based on schoolwide measures of inputs or performance. In some of our group of high-achieving countries, classroom-level and/or curriculum-based inspections also exist.
It is more common in our focus group of high-achieving countries to find the systemwide responsibility for curriculum and instruction quality control assumed by subject-area experts. In mathematics, this usually means mathematics professors at universities or mathematicians in the education ministries. This stands in contrast to the typical situation in the United States where there are few mathematics experts in state education agencies or local school districts, and they are likely education school rather than mathematics department graduates. Most university mathematics departments in the United States have no connection or involvement in mathematics teaching at the primary and secondary levels.
2. CONTENT STANDARDS THAT ARE FIXED AND EXPECTED TO BE FOLLOWED AS A CORE CURRICULUM
These curriculum-based inspections in our focus group of high-achieving countries can be rather standardized because, everywhere but the Netherlands, teachers are expected to follow a common curriculum according to a common timetable. The inspectors, then, can judge the teacher against a common curricular standard. In the United States, curricula and texts are so diverse and timetables so anomalous that it would be difficult to conduct a classroom-level, curriculum-based inspection. How would the teacher's performance be measured? There is no clear standard.
What happens to teachers in these high-achieving countries who deviate from the standard program? One of our respondents asserted, "They do not deviate." The common curriculum typically occupies 80% to 100% of the
instructional time. Our respondents in Singapore, France, and the Czech Republic pointed out that teachers were free to depart from the common curriculum if their class was ahead of schedule; they wished to provide practical, everyday examples of abstract content; or they wanted to use examples from magazines or videos to motivate interest. But in all countries, students would still be held accountable for mastering the core curriculum.
3. MORE HIGH-STAKES SELECTION POINTS
Most of our high-achieving countries have few, several, or many high-stakes selection points. Most administer one, two, three, or several high-stakes entrance or exit examinations. Most are also selective in their admissions to certain programs or curricular tracks, with low-achieving students at one level of education denied their first choice of curricular track at the next level of education. Flemish Belgium is unique in lacking the examinations, but it still maintains selective admissions to certain programs and curricular tracks, with selection based on academic performance.
This stands in contrast to the United States, where most states with high-stakes examinations have only low-level minimum competency literacy tests for high school graduation. Curricular tracking is also uncommon. Only in the small proportion of school districts with magnet programs or career academies with selective admissions do such stakes apply in the United States.
4. EXAMINATIONS THAT ARE CURRICULUM-BASED AND HIGH STAKES
U.S. states with low-level minimum competency literacy tests for high school graduation may be said to have high-stakes curriculum-based tests, but they are genuinely of high stakes only for a small proportion of students at risk of failing them, and they are typically based on curriculum from the primary or lower secondary level. Take away minimum competency tests and few U.S. states have high-stakes tests. A study by the U.S. General Accounting Office in 1993 concluded that only one quarter of tests administered districtwide in the United States had high stakes for students. The large majority of them were statewide minimum competency tests. Surely, that proportion is higher now but still not as high as in most European countries.
High-achieving countries tend to have high-stakes examinations of some variety, at varying levels of difficulty or in different curricular tracks. Singapore offers the British-inspired "O level" ("O" is for ordinary) and "A level" ("A" is for advanced) examinations. France requires passage of exit
examinations in several academic tracks of differing curricular emphases (e.g., language and humanities, natural science, physical science and mathematics, economics, technology), as well as some vocational and professional tracks.
5. SECONDARY SCHOOLS ORGANIZED BY CURRICULAR FOCUS
Organizing secondary schools by curricular focus can aid quality control because it helps to focus the efforts of those authorities responsible for monitoring curriculum. A French inspector, expert in the math/physics/chemistry curriculum series, can attend classes in that subset of schools that offer this curriculum series. Curriculum experts at the national ministry, likewise, can specialize in that particular mathematics curriculum and focus on those particular schools.
6. OTHER PRACTICES THAT REINFORCE COHERENCE
Other practices that reinforce coherence and are common in our group of high-performing countries but not in the United States include the following: high school-level standards for promotion to the next grade, as evidenced by a relatively high rate of redoublement, or retention in grade; ability grouping; passage of subject-area standardized tests required of teachers; looping (i.e., teachers in lower grades may keep the same group of students for multiple years and thus are held more accountable and have an incentive to make certain all students make reasonable progress); and employers' use of grades or test scores in their hiring decisions, reinforcing the importance of studying.
DECISION POINTS
A country may profess many methods of quality control, but if there are no consequences for a failure to adhere to them, they may well be ignored.
Thus, another way to contrast different countries' quality control systems for curriculum and instruction is to identify the type and number of decision points, or quality control measures, where adherence to the curriculum and instruction system can be reinforced. Most decision points involve stakes for the student, teacher, or school. They involve potential consequences for failure to adhere to the system and to follow the program at a reasonable pace. Students may be denied promotion if they do not study. Teachers may be
denied employment if they do not pass exams demonstrating subject-area expertise. Schools may suffer sanctions if it is shown that their students are not keeping up with their studies or studying the correct materials.
DECISION POINTS OF TOP-PERFORMING COUNTRIES
Table 3 contrasts the decision points used in the focus group of countries to those used in the United States. "Yes" is written if a country used a certain decision point to monitor or maintain coherence to a curriculum and instruction system, "No" is written if it could be determined that a country did not use that decision point, and blank cells indicate no information was found for that country during the study.3 Most decision points involve selection; some students or teachers are or are not selected if they do or do not maintain adherence to the program.
Table 3 consists only of systemwide decision points, those universally maintained. Nonsystemwide or local decision points are those that are enforced only at the local, school, or classroom level, such as retention in grade.
Counting the number of "Yes" cells that indicate the existence of a decision point, one can see that each of the focus group countries maintains 10 or more decision points, while the United States maintains 6. The category "Some" was counted as one half. The mean number of decision points among the top-performing countries is 13.88, more than double the United States' 6.
Comparing the average number of systemwide decision points of the top-performing countries (13.88) to the United States' 6, one finds the U.S. total to be more than 2 standard deviations (s = 3.14) below the top performers' average.
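The comparison above can be retraced from the "Total Yes" row of Table 3. The following Python sketch is an illustration only, not part of the original analysis. It assumes that the reported s is the sample standard deviation (n - 1 denominator); the variable names are hypothetical.

```python
from statistics import mean, stdev

# "Total Yes" row of Table 3 for the eight top performers, in column order.
top_totals = [11, 10, 13, 12, 13, 16, 19, 17]
us_total = 6  # United States column of the same row

top_mean = mean(top_totals)   # arithmetic mean of the eight totals
top_sd = stdev(top_totals)    # sample standard deviation (n - 1 denominator)

# Distance of the U.S. total from the group mean, in standard deviations.
# A negative value means the U.S. total lies below the mean.
us_gap_in_sd = (us_total - top_mean) / top_sd

print(f"mean = {top_mean:.2f}, s = {top_sd:.2f}, U.S. gap = {us_gap_in_sd:.2f} s")
```

Under that assumption, the sketch reproduces the reported mean and standard deviation to two decimals, and the U.S. total falls more than 2 standard deviations below the mean.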
Table 4 contrasts the prevalence of the local decision point of retention in grade among the focus group of countries and the United States. The average rates of retention in grade for the focus group of countries were 0.86 students per school for Grade 4 and 2.54 students per school for Grade 8 (rates are listed for each country and each grade level in the note under Table 4). For the United States, the rate of retention was higher for Grade 4 (1.01) and lower for Grade 8 (1.65) (TIMSS, unpublished computations). More than half the total average number of students retained for the eight countries comes from France.
The U.S. rate of retention in grade was not significantly different than the top-performing countries' rates. (Some readers may be tempted to assume from looking at Table 4 that low retention rates are the norm for East Asia; rates range from 0 to 0.6 in Japan, Korea, and Singapore. To provide some
TABLE 3: Systemwide Decision Points (activities with stakes and consequences for student, teacher, or school), by Country: 1994-1995
Country columns: Belgium (Flemish), Czech Republic, France, Korea, Japan, The Netherlands, Singapore, Switzerland, United States
Level of education exit exam
  Primary level: No No No No No Yes[8] Yes[2,3,6,7] Yes[8,21] No
  Lower secondary: No No Yes[3,4] No Yes[21] Yes[3] Yes[3,7,a] Yes[8,21] No
  Upper secondary: No Yes[1,2,21] Yes[1,4] Yes[1] Yes[1,a] Yes[1,3,5] Yes[1,3,6,7] Yes[1,8,21] Some
Level of education entrance exam
  Lower secondary: No Yes[9] No No Yes[21] Yes[6] Yes[11,21] No
  Upper secondary: No Yes[2,3,7,9,21] Yes[8] Yes[3,10] Yes[4,8,10,a] Yes[21] Yes[6] Yes[8,21] No
  Higher education: Yes[7] Yes[2,5,7,9,21] Yes[3,4] Yes[3,21] Yes[3,4,a] Yes[21] Yes[3,6] Yes[8,21] Yes
Other types of standardized exams
  Assessments: Yes[12] No Yes[2,7,11] Yes[3,21] Yes[12] Yes[10] Yes[15] Yes[8] Yes
  End-of-course: No No No Yes[21] Yes[10] Yes[15] Yes[21] No
  Others: Yes[21] Yes[21] Yes[21] Yes[21] Yes[21] Yes
Selection of schools or students for certain curricular tracks
  Lower secondary: Yes[21,a] Yes[9,11] Yes[10] No No Yes[10,15] Yes[6,12,a] Yes[10,21] No
  Upper secondary: Yes[7,21] Yes[10] Yes[3,10,15] Yes[3] Yes[3,a] Yes[10,15] Yes[12,a] Yes[10,15,21,a] No
  Higher education: Yes[7] Yes[9,11] Yes[10] Yes[3] Yes[3,a] Yes[21] Yes[6,12,a] Yes[10,15,21,a] Yes
Ability grouping common within schools
  Primary level[21]: No No No Yes Some
  Lower secondary[21]: Yes Yes Yes No No Yes Yes Yes Some
  Upper secondary[21]: Yes Yes Yes Yes Yes Yes Yes Yes
Large nonpublic sector makes more school selection possible (> 25%)
  Primary level: Yes[21] No[21] No[21] No[21] No[21] Yes[1,21] Yes[1,21] No[21] No
  Secondary level: Yes[1] No[9] Yes[1,21] Yes[1,21] Yes[1,21] Yes[1,21] Yes[1,21] Yes[1,21] No
School system and classroom practices
  Classroom instruction is inspected: Yes[a] Yes[9] Yes[14] Yes[3] Yes[a] Yes[a] Yes No
  Examination required in subject area for teachers[1]: Yes Yes Yes Yes[1,3] Yes Yes Yes Yes No[1]
Total Yes[b]: 11 10 13 12 13 16 19 17 6
NOTE: 1. Beaton (1996); 2. Bishop (1997); 3. Postlethwaite (1996); 4. Stevenson and Lee (1997); 5. Peak (1997); 6. Yeoh (1996); 7. Postlethwaite (1988); 8. Phelps (1996); 9. Organisation for Economic Co-operation and Development, Czech Republic; 10. Schmidt; 11. Kreeft (1990); 12. Phelps (2000); 13. National Center on Education and the Economy (1994); 14. Resnick, Nolan, and Resnick (1995); 15. Bishop (1993); 16. Organisation for Economic Co-operation and Development, France; 18. U.S. Department of Education (1992); 19. Organisation for Economic Co-operation and Development, Belgium; 20. Resnick, Nolan, and Resnick (1995); 21. Robitaille (1997); 22. Third International Mathematics and Science Study, unpublished computations; 23. Organisation for Economic Co-operation and Development, Spain; 24. Asia-Pacific Economic Cooperation (1998); 25. Organisation for Economic Co-operation and Development, Investing in Education: Analysis of the 1999 World Education Indicators (2000); 26. Organisation for Economic Co-operation and Development, Greece.
a. Source is response to this study's survey.
b. Scoring: yes = 1, some = 0.5, no = 0.
TABLE 4: Local Decision Points (activities with stakes and consequences for student, teacher, or school), by Country: 1994-1995
Decision point | Belgium (Flemish) | Czech Republic | France | Korea | Japan | The Netherlands | Singapore | Switzerland | United States
Retention in grade is common
Primary level[22] (> 3%) | Yes[19] | No | Yes[3,15,22] | No | No | Yes | No | Yes[3,22] | No
Secondary level[22] (> 5%) | No[19,22] | No | Yes[3,15,22] | No | No | Yes | No | No | No
Total Yes[a] | 1 | | 2 | | | 2 | | 1 |
NOTE: On citations and superscripts, blank cell means no information found or not applicable; cell (and row) with no superscript means no information source declares the information, but a lack of information to the contrary from several sources implies it, or the information is common knowledge; superscript for row title means all cells have information from the same source document, unless otherwise indicated in the cell. Mean rate for 3rd and 4th grades (Czech Republic, 1.00; Japan, 0.0; Korea, 0.14; the Netherlands, 3.16; Singapore, 0.02) = 0.86 students per grade per school; U.S. rate = 1.01 students; mean rate for 7th and 8th grades (Belgium, 2.95; Czech Republic, 1.19; France, 10.33; Japan, 0.0; Korea, 0.06; the Netherlands, 3.29; Singapore, 0.6; Switzerland, 1.93) = 2.54 students per grade per school; U.S. rate = 1.65 students; overall mean rate = 1.70 students, U.S. rate = 1.33 students. 3. Postlethwaite (1996); 10. Schmidt; 15. Bishop (1993); 22. Third International Mathematics and Science Study, unpublished computations.
a. Scoring: yes = 1, no = 0.
perspective, however, Hong Kong retains 1.58 students per school per year in 4th grade and 2.71 students per school per year in 8th grade.)
Thus far, we have seen that the United States is different. It seems to maintain less quality control over its curriculum and instruction system than do the top performers in the TIMSS. For all we know, however, the United States may be different from most other countries, regardless of whether they are top performers. If the bottom performers in the TIMSS also use more quality control measures than the United States, we will have learned nothing about the relationship between quality control and student achievement.
To check this possibility, information adequate to fill in tables like the two immediately above was gathered for the bottom performers in the TIMSS.
DECISION POINTS OF THE BOTTOM PERFORMERS IN THE TIMSS
Again, in Table 5, we contrast a focus group of sorts, the dozen countries scoring worst on the TIMSS. In this case, we get quite different results. The total number of quality control measures ranges from two to seven. The countries with the most quality control measures in this list, Iran and Latvia, still use three fewer than the country in the top performers' focus group with the fewest measures. The United States, with six quality control measures, fits right into this group of bottom performers, tied with Germany and the Philippines.
Comparing the average number of systemwide decision points of the bottom-performing countries (4.42) to the United States' 6, one finds the U.S. total to be a little less than 1 standard deviation (s = 1.88) above the bottom performers' average. The average number of decision points of the bottom-performing group is statistically significantly different from that of the top-performing group, as determined by a t test (t = 7.69, p < .0001) between the two means of 13.88 and 4.42 (s = 3.14).
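The group comparison above can be sketched in Python as follows. The text does not say which form of the t test was used; the unequal-variance (Welch) form shown here is an assumption, and it gives a value close to, though not identical with, the reported t = 7.69. The totals are copied from the "Total Yes" rows of Tables 3 and 5; the function and variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance

# "Total Yes" rows of Table 3 (top performers) and Table 5 (bottom performers),
# excluding the United States column in both cases.
top = [11, 10, 13, 12, 13, 16, 19, 17]
bottom = [4, 2, 6, 2, 2, 7, 7, 5, 6, 3, 5, 4]


def welch_t(a, b):
    """Two-sample t statistic that does not assume equal group variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


bottom_mean = mean(bottom)
bottom_sd = stdev(bottom)      # sample standard deviation (n - 1 denominator)
t_stat = welch_t(top, bottom)

print(f"bottom mean = {bottom_mean:.2f}, s = {bottom_sd:.2f}, t = {t_stat:.2f}")
```

A pooled-variance t test would weight the two group variances by their degrees of freedom instead, and would give a somewhat larger statistic on these data.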
(Some readers may be tempted to assume from looking at Table 5 that Mediterranean countries tend to use few quality control measures; Cyprus, Greece, Portugal, and Spain represent four of the five countries with the fewest measures used. To provide some perspective, however, Italy, which did not participate in the TIMSS, is a Mediterranean country that requires passage of high-stakes examinations at three different levels of education and selection to curricular tracks at both secondary levels. Italy offers a rigorous system with a relatively high number of decision points; thus, the Mediterranean climate does not necessitate a lack of rigor.)
The average rates of retention in grade for the focus group of countries (see Table 6) were 3.89 students per school for Grade 4 and 6.34 students
TABLE 5: Systemwide Decision Points (activities with stakes and consequences for student, teacher, or school), by Country: 1994-1995
Country columns: Colombia, Cyprus, Germany, Greece, Iceland, Iran, Latvia, Lithuania, Philippines, Portugal, Romania, Spain, United States
Level-of-education exit exam
  Primary level: No[3] No[3] No No[26] No[3] No[3] No[3] No[3] No[24] No[3] No[3] No No
  Lower secondary: No[3] No[21] No No[26] No[3,21] Yes[3,21] Yes[3,21] Yes[3,21] Some[24] Yes[3] No[3] Yes[3] No
  Upper secondary: No[2] No[2,3] Yes No[3] No[3] Yes[2,3] Yes[3,21] Yes[3,21] Some[2,21] Yes[3] No[3] No[3] Some
Level-of-education entrance exam
  Lower secondary: No[3] No[3] No No[26] No[3] No[3] No[3] No[3] No[24] No[3] No[3] No[3] No
  Upper secondary: Some[21] No[3] No No[26] No[3] Yes[3] Yes[3] No[3] No[24] No[3] Yes[3] No[3] No
  Higher education: Some[21] No[3] Yes Yes No[3] Yes[3] No[3] No[3] Some[21,24] Yes[3] Yes[3] Yes[3] Yes
Other types of standardized exams
  Assessments: No No[3] No No No No[3] Yes[21] No[3,21] Yes[21] No No No Yes
  End-of-course: No No[3] No No No No[3] No[3] No[3] Yes[21] No No No No
  Others: No No No No No No No No No No No No Yes
Selection of schools or students for certain curricular tracks
  Lower secondary: No[21] No[3] Yes No[3] No No[3] No[3] No[3] No[24] No No No No
  Upper secondary: No[21] No[3] No[3] No Yes[3,21] Yes[3] No[21] No[25] Yes No
  Higher education: Yes[3] No[3] Yes Yes[3] No Yes[3] Yes[3] Yes[3] Yes[24] Some
Ability grouping common within schools
  Primary level[21]: No No No No No No No No No No Some
  Lower secondary[21]: No No No Some No Some No Yes Some
  Upper secondary[21]: Yes Yes No Some Yes Yes Yes Yes
Large nonpublic sector makes more school selection possible (> 25%)
  Primary level[21]: No No No No No No No No No No No
  Secondary level[21]: Yes[3] No[3,22] No No[3,22] No No No[3,22] No[3,22] Yes No No[23] No
School system and classroom practices
  Classroom instruction inspected[3]: No[3,21] Yes Yes No No No Yes Some No
  Teacher exam in subject area required[1]: No No Yes No Yes No Yes No[1,3] Yes No Yes Yes Some
Total Yes[a]: 4 2 6 2 2 7 7 5 6 3 5 4 6
NOTE: On citations and superscripts, blank cell means no information found or not applicable; cell (and row) with no superscript means no information source declares the information, but a lack of information to the contrary from several sources implies it, or the information is common knowledge; superscript for row title means all cells have information from the same source document, unless otherwise indicated in the cell. 1. Beaton (1996); 2. Bishop (1997); 3. Postlethwaite (1996); 4. Stevenson and Lee (1997); 5. Peak (1997); 6. Yeoh (1996); 7. Postlethwaite (1988); 8. Phelps (1996); 9. Organisation for Economic Co-operation and Development, Czech Republic; 10. Schmidt; 11. Kreeft (1990); 12. Phelps (2000); 13. National Center on Education and the Economy; 14. Resnick, Nolan, and Resnick (1995); 15. Bishop (1993); 16. Organisation for Economic Co-operation and Development, France; 18. U.S. Department of Education (1992); 19. Organisation for Economic Co-operation and Development, Belgium; 20. Resnick, Nolan, and Resnick (1995); 21. Robitaille (1997); 22. Third International Mathematics and Science Study, unpublished computations; 23. Organisation for Economic Co-operation and Development, Spain; 24. Asia-Pacific Economic Cooperation (1998); 25. Organisation for Economic Co-operation and Development, Investing in Education: Analysis of the 1999 World Education Indicators (2000); 26. Organisation for Economic Co-operation and Development, Greece.
a. Scoring: yes = 1, some = 0.5, no = 0.
TABLE 6: Local Decision Points (activities with stakes and consequences for student, teacher, or school), by Country: 1994-1995
Country columns: Colombia, Cyprus, Germany, Greece, Iceland, Iran, Latvia, Lithuania, Philippines, Portugal, Romania, Spain, United States
Retention in grade is common
  Primary (> 3%)[22]: Yes[3,22] No[3,22] No[5] No No Yes Yes Yes No
  Lower secondary (> 5%)[22]: Yes[3,22] Yes[22] Yes Yes No Yes Yes No Yes No Yes No
Total Yes[a]: 2 1 1 1 2 2 2 1
NOTE: On citations and superscripts, blank cell means no information found or not applicable; cell (and row) with no superscript means no information source declares the information, but a lack of information to the contrary from several sources implies it, or the information is common knowledge; superscript for row title means all cells have information from the same source document, unless otherwise indicated in the cell. Not all focus group countries filled in this information in the school background questionnaires (i.e., the Philippines did not); mean rate for 3rd and 4th grades (Cyprus, 1.0; Greece, 1.0; Iceland, 0.54; Iran, 4.92; Latvia, 4.54; Portugal, 10.41) = 3.89 students per grade per school; U.S. rate = 1.01 students; mean rate for 7th and 8th grades (Colombia, 8.46; Cyprus, 4.56; Germany, 6.06; Greece, 8.81; Iceland, 1.55; Iran, 11.66; Latvia, 3.69; Lithuania, 2.82; Portugal, 8.39; Romania, 2.94; Spain, 10.78) = 6.34 students; U.S. rate = 1.65 students; overall mean rate = 5.12 students per grade per school; U.S. rate = 1.33 students. 3. Postlethwaite (1996); 5. Peak (1997); 22. Third International Mathematics and Science Study, unpublished computations.
a. Scoring: yes = 1, some = 0.5, no = 0.
per school for Grade 8 (rates are listed for each country and each grade level in the note under Table 6). For the United States, the rates of retention were lower for Grade 4 (1.01) and for Grade 8 (1.65) (TIMSS, unpublished computations).
Comparing the average rate of retention in the 7th and 8th grades among the bottom-performing countries (6.34) to the United States' 1.65, one finds the U.S. rate to be between 1 and 2 standard deviations (s = 3.46) below the bottom performers' average. The average rate of retention of the bottom-performing group is significantly different from that of the top-performing group, however, as determined by a two-tailed t test (t = 2.39, p < .05) between the two means of 2.54 and 6.34 (s = 3.38).
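To show how the retention comparison depends on the form of the test, the sketch below computes both a pooled-variance and an unequal-variance (Welch) t statistic from the Grade 7-8 rates listed in the notes to Tables 4 and 6. It is an illustration only: the text does not name the variant it used, and the function and variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Grade 7-8 retention rates (students per grade per school), from the table notes.
top = [2.95, 1.19, 10.33, 0.0, 0.06, 3.29, 0.6, 1.93]
bottom = [8.46, 4.56, 6.06, 8.81, 1.55, 11.66, 3.69, 2.82, 8.39, 2.94, 10.78]
us_rate = 1.65


def pooled_t(a, b):
    """Student's t statistic, assuming both groups share one variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / sqrt(pooled_var * (1 / na + 1 / nb))


def welch_t(a, b):
    """Welch's t statistic, allowing the two group variances to differ."""
    return (mean(b) - mean(a)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# U.S. distance from the bottom-performer mean, in standard deviations.
us_gap_in_sd = (us_rate - mean(bottom)) / stdev(bottom)

print(f"pooled t = {pooled_t(top, bottom):.2f}, Welch t = {welch_t(top, bottom):.2f}")
print(f"U.S. gap = {us_gap_in_sd:.2f} s")
```

On these rates the two forms differ only in the second decimal, so the conclusion of a significant difference at the .05 level does not hinge on the choice.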
DECISION POINTS: SUMMARY
Table 7 displays a concise summary of the decision point discussion. The United States uses fewer quality control measures (i.e., decision points) systemwide than top-performing countries do, but slightly more than bottom-performing countries use, on average. The United States, on average, has a low rate of retention in grade (1.65 students per class per year for 7th and 8th grades and 1.33 for both primary and secondary school), the single example of a local quality control measure used in this analysis. Top-performing countries have a somewhat higher rate of retention in grade, whereas bottom-performing countries have a much higher average rate of retention in grade (6.34 students per class per year in Grades 7 and 8, and 5.12 for both primary and secondary school).
Figure 1 contrasts the top- and bottom-performing groups of countries (here, the United States fits neatly into the bottom group) on the relationship between their number of systemwide decision points and average percentage of correct answers on the 7th and 8th grade level TIMSS tests. The scatterplot
TABLE 7: Summary of Decision Point Information
Measure | Top Performers (mean) | Bottom Performers (mean) | United States
Systemwide measures
Number of decision points | 13.88 | 4.42 | 6
Local measure
Number of decision points | 0.75 | 1.00 | 0
Rate of retention in grade (percentage) (Grades 7 and 8) | 2.54 | 6.34 | 1.65
implies a positive relationship between more quality control measures enforced (i.e., decision points) and higher test scores (the Pearson product-moment correlation is 0.78).
[Figure 1 scatterplot. Horizontal axis: Number of Quality Control Measures Used (0 to 20). Vertical axis: Average Percent Correct (grades 7 & 8) (0 to 80). Series: Top-Performing Countries; Bottom-Performing Countries.]
Figure 1: Average Third International Mathematics and Science Study Score and Number of Quality Control Measures Used, by Country
Country | Decision Points | Seventh and Eighth Grade Average Percentage Correct
Singapore | 19 | 76
Switzerland | 17 | 57
The Netherlands | 16 | 58
Japan | 13 | 70
France | 13 | 56
Korea | 12 | 70
Czech Republic | 11 | 62
Belgium | 10 | 60
Latvia | 7 | 48
Iran | 7 | 35
Germany | 6 | 52
United States | 6 | 51
Lithuania | 5 | 43
Romania | 5 | 46
Spain | 4 | 46.5
Colombia | 4 | 27.5
Portugal | 3 | 40
Iceland | 2 | 47
Greece | 2 | 44.5
Cyprus | 2 | 45
r = .776712.
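The correlation reported for Figure 1 can be recomputed directly from the 20 country pairs listed with it. The following Python sketch is offered as an illustration; it applies the standard Pearson product-moment formula, and the names `figure1` and `pearson_r` are hypothetical.

```python
from math import sqrt

# (decision points, average percentage correct), as listed with Figure 1.
figure1 = {
    "Singapore": (19, 76), "Switzerland": (17, 57), "The Netherlands": (16, 58),
    "Japan": (13, 70), "France": (13, 56), "Korea": (12, 70),
    "Czech Republic": (11, 62), "Belgium": (10, 60), "Latvia": (7, 48),
    "Iran": (7, 35), "Germany": (6, 52), "United States": (6, 51),
    "Lithuania": (5, 43), "Romania": (5, 46), "Spain": (4, 46.5),
    "Colombia": (4, 27.5), "Portugal": (3, 40), "Iceland": (2, 47),
    "Greece": (2, 44.5), "Cyprus": (2, 45),
}


def pearson_r(pairs):
    """Pearson product-moment correlation for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)  # co-deviation
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)            # x deviation
    syy = sum((y - mean_y) ** 2 for _, y in pairs)            # y deviation
    return sxy / sqrt(sxx * syy)


r = pearson_r(list(figure1.values()))
print(f"r = {r:.4f}")
```

The coefficient measures only the strength of the linear association across these 20 countries; it does not by itself indicate the direction of causation.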
A skeptic might speculate that wealthier countries have a considerable advantage in promoting student achievement, such that country wealth might be the key driver of achievement, not quality control measures, or anything else. Indeed, there does appear to be some correlation (r = .54) between countries' 8th-grade TIMSS mathematics scores and their GDP per capita. More to the point, however, if the implementation of quality control procedures requires more resources, and quality control procedures improve student achievement, then is it not really wealth that is improving student achievement? The Pearson product-moment correlation coefficient between the number of quality control measures (i.e., decision points) used and GDP per capita is 0.47 for the group of countries included here.
To adjust for country wealth, then, both of the factors plotted in
Figure 1 were divided by GDP per capita. The derived factors are
measures of test scores and quality control procedures per unit of
wealth (i.e., average percent correct [TIMSS 8th-grade math] per GDP
per capita, and number of quality control measures used per GDP per
capita). With the factor of wealth removed, do we still find a positive
correlation between student achievement and quality control? Indeed,
we do; see Figure 2.
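
The adjustment described above amounts to dividing each country's two measures by its GDP per capita and then correlating the resulting ratios. A minimal sketch follows, assuming GDP per capita is expressed in thousands of dollars. The function name `adjust_for_wealth` and the four sample records are hypothetical placeholders for illustration, not data from the study.

```python
from math import sqrt


def adjust_for_wealth(decision_points, pct_correct, gdp_per_capita):
    """Return (decision points, percent correct), each per unit of GDP per capita."""
    if gdp_per_capita <= 0:
        raise ValueError("GDP per capita must be positive")
    return decision_points / gdp_per_capita, pct_correct / gdp_per_capita


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical records, NOT values from the study:
# (label, decision points, percent correct, GDP per capita in $ thousands).
SAMPLE = [
    ("Country A", 16, 60.0, 20.0),
    ("Country B", 8, 50.0, 10.0),
    ("Country C", 4, 45.0, 15.0),
    ("Country D", 2, 44.0, 4.0),
]

adjusted = [adjust_for_wealth(dp, pc, gdp) for _, dp, pc, gdp in SAMPLE]
adj_points = [pair[0] for pair in adjusted]
adj_scores = [pair[1] for pair in adjusted]
print(f"wealth-adjusted r = {pearson_r(adj_points, adj_scores):.3f}")
```

With real GDP figures substituted for the placeholders, the same two steps would yield the wealth-adjusted measures that the text compares.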
Figure 2 suggests an exponential relationship between quality control
measures and student achievement. It would appear that, up to a certain
point, quality control implementation makes some difference in student
achievement, even when the resources available for quality control
implementation are taken into account. But, after that point, if an
extra effort is made to implement quality control procedures in spite
of limited resources, student achievement can really take off.
[Figure 2 omitted. Horizontal axis: Number of Quality Control Measures
Used (per GDP/capita). Vertical axis: Average Percent Correct (grades
7 & 8) (per GDP/capita).]
Figure 2: Average TIMSS Score and Number of Quality Control Measures
Used (each adjusted for GDP/capita), by Country
NOTE: TIMSS = Third International Mathematics and Science Study.
Judging from all the information considered thus far related to the
prevalence of decision points (a.k.a. quality control measures), it
would appear that:
- Top-performing countries use more systemwide quality control
  measures. The U.S. number lies between the averages of the top and
  bottom performers but is closer to the bottom.
- The bottom performers use more of the local quality control measure,
  retention in grade, perhaps as a substitute for the systemwide
  measures they lack.
- The United States is low on all summary statistics: closer to the
  bottom performers on systemwide measures and lower than both top-
  and bottom-performing countries on local measures.
Opponents of local quality control measures, such as retention in
grade, could perhaps increase their chances of abolishing such
measures if they advocated for more systemwide measures of quality
control, such as high-stakes tests. It would appear that the presence
of an integrated system of systemwide quality control measures might
reduce the need for local control measures.
COMPARISONS TO THE U.S. SYSTEM
Although one can observe a good deal of similarity in curriculum among
U.S. classrooms, there is little uniformity. U.S. textbooks in
1994-1995, for example, share a large degree of similarity in
appearance and content but are not deliberately alike and not alike
enough to represent a common curriculum or to form a common item pool
for high-stakes testing at more than a minimal level of competency.
Some even argue that they are dumbed down to a lowest common
denominator to be salable to the largest possible population of
classrooms. Moreover, there is no assurance in most of the United
States, even with common textbooks, that two teachers in different
classrooms are interpreting the content the same way, at the same
pace, or even at the same grade level.
One might argue that the United States benefits from a great diversity
in curriculum and instruction. One defense of the U.S. system might be
that if different students learn different content, then the country
as a whole benefits because no matter what the topic, we are more
likely to have citizens who possess the knowledge than are other
countries where all citizens learn the same content. Another defense
is that each teacher gets to tailor curriculum and instruction to his
or her own particular strengths and to his or her students' particular
needs.
Critical responses to the first defense could include the following:
The U.S. curriculum actually appears to be burdened with a great deal
of repetition and superficiality (see McKnight et al. 1987; Schmidt et
al. 1996a). Another response is that there is a great deal of variety
in curriculum and instruction in high-performing countries, too, but
it is organized more rationally. Separate schools exist with a
curricular focus, and students who wish to share that focus attempt to
enter those schools.
The second defense of the U.S. system, about tailoring classroom
curriculum and instruction to the personal characteristics of the
teacher and the students, is heard often. Most of our expert
respondents from top-performing countries thought that it was
important that teachers have some flexibility to tailor curriculum and
instruction to their classes. To do that, the required core curriculum
typically takes up only 80% of classroom time. A buffer of 20% of the
school year is conserved to allow slower moving classes to catch up
with their faster moving colleagues by the end of each school session.
The faster moving classes use the buffer time for enrichment
exercises, such as exercises in the practical applications of
mathematics concepts in real life, with examples provided from daily
life or the popular press. So these top-performing countries typically
do not demand 100% uniformity, only 80%. One could argue that in the
United States, the equivalent figure is 0%.[4]
The most common drawback of curricular heterogeneity in the United
States falls on the children of families who move. These children can
discover that in their new school district, they are behind schedule,
ahead of schedule, not prepared, overly prepared, and so on. Commonly,
in the absence of common system standards, they enter a completely
different curriculum, and they waste time. Children in families that
move often can suffer academically. In France, with its uniform
curriculum nationwide, there simply is no such problem.
This examination of quality control over curriculum and instruction in
top-performing countries suggests another drawback. Without common,
enforceable standards, there may be no good way to affect performance
systemwide other than through high-stakes standardized tests (as in
the Netherlands). Without either common standards or high-stakes
standardized tests, there may be no effective way at all to monitor
performance systemwide. Some U.S. teachers may be doing a wonderful
job in their totally customized classes, but some may be doing an
awful job. How is one to tell which?
In the United States, one must hope that teachers will face down the
natural incentives of their students, parents, schools, and themselves
to avoid accountability and will instead hold themselves and their
students to high standards of performance. One must also hope that
teachers will know how.
The tight networks of subject-area professionals in top-performing
countries provide classroom-level inspections. Some teachers might
feel threatened by these inspections, but they might also benefit from
advice the inspectors have to offer. With a common core curriculum,
inspectors can offer advice from a deep pool of knowledge about what
works, because all teachers are teaching the same material. With no
common core curriculum in the United States and every class taught in
a unique, customized manner, any classroom-level curriculum-expert
inspectors, were there to be any in the United States, would have less
to say, and it would be less specific.
In one nationwide survey of U.S. teachers, 99% responded that they
thought subject matter knowledge should be considered in their
performance evaluation, whereas only 65% said it was (Nolan 1997, iii,
8, 27). Even then, where performance evaluations are conducted by
school principals, odds are that the principal is not expert in most
teachers' subject matter.
LESSONS FOR THE UNITED STATES
Top-performing countries tend to use a lot of quality control
measures, such as high-stakes examinations, selection for curricular
tracks, ability grouping, and other devices considered anathema by
many U.S. education professors. The progressives in the United States
who oppose testing, tracking, and ability grouping may wish to ignore
most of the top-performing countries and embrace Flemish Belgium for
solace.
How much will they find? If they are honest with themselves, not much.
First, the Flemish community of Belgium uses ability grouping and
selection for curricular tracks; it is only high-stakes tests that
they do not use or, rather, did not use until the late 1990s, when
they started development of an upper secondary school exit exam.
Second, Flemish Belgium is just one country, alone among the
top-performing countries in its absence of high-stakes examinations.
Most countries eschewing high-stakes tests scored poorly on the TIMSS.
Third, Flemish Belgium does not compare well to the larger U.S.
states; it is just too small. Some of its key quality control
features, such as the constant and close interaction of teachers and
the highly visible public dissemination of information on standards,
are probably easier to implement in smaller jurisdictions.
Nonetheless, progressives may wish to look to some U.S. adaptations of
the Flemish Belgium sort. For a state model, they could look to
Connecticut, which attempts to maximize the amount and the public
visibility of information on school and student performance without
using high-stakes examinations. Unlike the Flemish Belgium of 1995,
Connecticut does administer many standardized tests; but although some
of those tests have stakes for the students (e.g., the 4th-, 6th-, and
8th-grade Mastery Tests), Connecticut has no high-stakes exit
examination.
Moreover, Connecticut maintains some other quality control features
similar to those found among the TIMSS top performers:
- Connecticut is one of the few U.S. states to have long retained a
  detailed state curriculum, in place long before the current
  standards movements, that was taken seriously by local school
  districts.
- Connecticut employs master teachers to review and critique new
  teachers in the classroom. New teachers are reviewed often, through
  direct classroom observation and videotape. Critical evaluations
  from master teachers can cost new teachers their jobs.
- The state publishes a statewide report card that compares districts
  on a number of indicators of progress and success (or lack thereof).
For a model quality control measure, progressives may wish to look to
the use of school and district report cards in the United States.
Statistical correlations between improvement over time on state
National Assessment of Educational Progress (NAEP) scores and the
existence of school and district report cards in the state are as
strong as the correlations between the existence of state high-stakes
tests and improvement on state NAEP scores. This suggests that public
glory and embarrassment may be as effective a quality control
inducement as the genuine consequences of high-stakes testing.
I suspect, however, that many U.S. progressives would not accede even
to the use of school and district report cards or high-stakes master
teacher evaluations; such behavior runs counter to the beliefs of more
radical constructivists and egalitarians, who would regard both as
invalid and unfair.
What are the lessons of this study for those progressives who want no
decision points and no quality control measures? Be prepared to accept
last place in the Fourth International Mathematics and Science Study,
below Cyprus and Greece (Iceland is currently busy building a rather
comprehensive examination system, from scratch). If their goals for
the U.S. education system tend toward what they regard to be noble
public goods, such as the impartation of beliefs in egalitarianism and
their version of moral and civic consciousness, they may genuinely not
care that U.S. academic achievement dives toward the bottom.
What is the lesson of this study for the traditionalists? Probably, it
confirms what they have suspected all along. How much of a thorough,
integrated quality control system do we see in the United States? Are
we at least heading in the direction of building such a system? For at
least half of the measures, yes.
Many U.S. states are now in the process of implementing systemwide
quality control measures (i.e., decision points). More than a few
states have implemented or are implementing high-stakes examinations
at several levels. Some states have implemented or are implementing
examinations at the same level with more than one level of difficulty,
for a regular diploma and an honors diploma (e.g., New York). Some
states have implemented or are implementing curricular choices in
those exit exams (e.g., passage of 5 subject-area exams among a choice
of 10), and those choices may eventually lead to the adoption of
curricular tracking. Ability grouping is already common in most of the
United States, although many education professors claim that the
research shows it to be a bad thing.
Subject-area mastery for teachers, with education-school exit exams
based on subject-area knowledge as well as pedagogical concepts, is
fast becoming a standard requirement in the United States.
The remaining aspects of fully integrated quality control systems may
still elude U.S. school systems for some time to come. We may never
see classroom- and curriculum-based teacher instruction inspections to
the degree that they exist in other countries. Such systems would need
to be built from scratch. Some states have been experimenting with
programs that promote the best teachers to master teachers, who no
longer teach a full class load themselves but visit other teachers'
classrooms and give them advice. But few states are as far along in
using this technique as Connecticut, which uses it only with new
teachers.
More likely, based on current trends, teachers will be judged on their
students' gains in scores on curriculum-based tests. In the
examination systems most fair to teachers (e.g., Tennessee), student
test scores are adjusted for background factors, such as demographic
profiles, and the students are tested frequently, so that the pressure
is distributed across teachers in all grades, not just a few testing
grades.
Given the choice, teachers would probably prefer classroom-based
inspections. Indeed, when former president Al Shanker was urging his
American Federation of Teachers to enthusiastically support high
standards and high-stakes standardized tests, he often cited European
countries as a model. There, he found high standards, high-stakes
tests (for students), and high levels of professionalism in classroom
instruction and school administration, alongside teacher corps that
were completely unionized, highly paid, and high in social status (the
latter point quite a contrast to most of the United States).
Separating classrooms and schools along the lines of different
curricular tracks may be difficult to implement in the United States
and may encounter much opposition. It might seem antidemocratic to
some. If the charter school movement really takes hold, however, the
adoption of curricular tracking within and by schools will only be a
matter of time. If parents and students are given a choice, most will
probably choose some clear curricular or occupational direction over
the current bland generality. Even in the public school systems,
career academies and magnet schools already offer curricular tracking,
and many of these programs are very selective.
CONCLUSION
All other factors being equal, quality control must be more difficult
in the absence of common standards. This study of top-performing
countries suggests that the most successful quality control efforts
manage rather thoroughly the entire chain of elements that make up the
curriculum and instruction system.
An interesting study managed by David Cohen at Michigan State
University tells the story of a Michigan State effort to change
curriculum and instruction in mathematics through a standardized
program. Very careful and thorough, the program seemed to consider
every essential aspect. The story follows activities at the state
level, public relations level, and local district level. Everything
seemed to work, all the pieces seemed to be in place, and a high
degree of coherence and ownership seemed to be maintained. The final
piece of the study consisted of observation evaluations of classroom
instruction by teachers participating in the program. The teachers
were generally strong supporters of the program, but the evaluations
showed that most were neither following the common curriculum nor
adhering to the common standards; each teacher was following his or
her own path. However, each teacher thought he or she was sticking
with the program. Left on their own to interpret the curriculum their
own way, without any outside monitoring, verification, or support,
they each went their own way (Cohen 1993; Grant 1993).
Work conducted by the National Center for Education Statistics and
James Stigler, involving videotapes of 8th-grade classroom instruction
in Japan, Germany, and the United States, seconded the conclusion.
U.S. teachers think they are implementing curricular reforms, but
generally, they are not (U.S. Department of Education 1996, 44-47).
Richard Elmore (1996) reviewed two attempts at large-scale U.S. school
reform and, combining his reviews with his readings of the failures of
other U.S. curricular reform projects, concluded that schools and
their incentive structures are organized in such a way as to thwart
reform in curriculum and instruction.
David F. Labaree (1999, 19) offered several compelling reasons for the
chronic failure of curriculum reform:
- Loose coupling of school systems: . . . Administrators have little
  power to make teachers toe the line instructionally [because they]
  can fire teachers only with the greatest difficulty, and pay levels
  are based on years of service and graduate credits, not job
  performance.
- Adaptability of the school system: . . . Teachers adopt the language
  and the feel of a reform effort without altering the basic way they
  do things [and] the differentiation of subjects frees schools to
  adopt new programs and courses by the simple process of addition.
  . . . They can always tack on another segment in the already
  fragmented curriculum [without changing any of the rest].
- Weak link between teaching and learning: . . . Students, after all,
  are willful actors who learn only what they choose to learn. . . .
  The law says they have to attend school until they are 16 years old;
  the job market pressures them to stay in school even longer than
  that. . . . But these forces guarantee only attendance, not
  engagement in the learning process.
Note that these three problems either do not exist or are far less
potent in highly integrated systems with many enforced quality
controls, where teachers are evaluated based on actual performance;
reforms to a required, core curriculum cannot just be tacked on as an
elective; and students have to listen and study if they want to
graduate.
It could be, then, that U.S. reforms in the past have faded before
they reached the student due to poor quality control in curriculum and
instruction systems that were not fully integrated.
APPENDIX
Sources of country-specific information, by country
BELGIUM (FLEMISH COMMUNITY)
Brusselmans-Dehairs, C. 1995, June. Methods of educational monitoring
    in the Flemish Community of Belgium. Ghent, Belgium: University of
    Ghent.
Dunon, Rita. 1991. Belgium, Special survey on standards and
    assessments. National Center for Education Statistics, U.S.
    Department of Education.
Eurostat. 1995. Belgium Structures of the Education Systems in the
    European Union. EURYDICE/CEDEFOP.
Georis, P., and M. Vilain. 1995. Belgium. In Handbook of world
    education, 81-85.
Monseur, C., and C. Brusselmans-Dehairs. 1997. Belgium. In National
    contexts for mathematics and science education: An encyclopedia of
    the education systems participating in TIMSS, edited by D. F.
    Robitaille. Vancouver, Canada: Pacific Educational Press.
Organisation for Economic Co-operation and Development. 1993. Belgium:
    Reviews of national policies for education. Paris: Organisation
    for Economic Co-operation and Development.
Philipparat, A. 1996. Belgium. In International encyclopedia of
    national systems of education, 2d ed., edited by T. N.
    Postlethwaite. Oxford, UK: Pergamon.
Standaert, R. 1994, September. The question of national standards in
    secondary education in the light of Belgium as a federal state.
    Brussels, Belgium: Department of Educational Development.
Vanbergen, P. 1988. Belgium. In The encyclopedia of comparative
    education and national systems of education, edited by T. N.
    Postlethwaite. Oxford, UK: Pergamon.
COLOMBIA
Diaz, C. J., E. Solarte, and J. Arce. 1997. Colombia. In National
    contexts for mathematics and science education: An encyclopedia of
    the education systems participating in TIMSS, edited by D. F.
    Robitaille. Vancouver, Canada: Pacific Educational Press.
Mora, J. 1996. Colombia. In International encyclopedia of national
    systems of education, 2d ed., edited by T. N. Postlethwaite.
    Oxford, UK: Pergamon.
CYPRUS
Papanastasiou, C. 1996. Cyprus. In International encyclopedia of
    national systems of education, 2d ed., edited by T. N.
    Postlethwaite. Oxford, UK: Pergamon.
Papanastasiou, C. 1996. Cyprus. In National contexts for mathematics
    and science education: An encyclopedia of the education systems
    participating in TIMSS, edited by D. F. Robitaille. Vancouver,
    Canada: Pacific Educational Press.
CZECH REPUBLIC
Bishop, J. H. 1997. Do curriculum-based external exit exam systems
    enhance student achievement? Working paper no. 97-28, Center for
    Advanced Resource Studies, School of Industrial and Labor
    Relations, Cornell University, Ithaca, NY.
Kotásek, J., and J. Švecová. 1996. Czech Republic. In International
    encyclopedia of national systems of education, 2d ed., edited by
    T. N. Postlethwaite. Oxford, UK: Pergamon.
Organisation for Economic Co-operation and Development. 1996. Czech
    Republic, reviews of national policies for education. Paris:
    Organisation for Economic Co-operation and Development.
Pelavin Research Institute. 1996. The educational systems of eight
    countries. Council for Basic Education.
Petráček, S. 1988. Czechoslovakia. In The encyclopedia of comparative
    education and national systems of education, edited by T. N.
    Postlethwaite. Oxford, UK: Pergamon.
Švecová, J., and J. Strakova. 1997. Czech Republic. In National
    contexts for mathematics and science education: An encyclopedia of
    the education systems participating in TIMSS, edited by D. F.
    Robitaille. Vancouver, Canada: Pacific Educational Press.
FRANCE
American Federation of Teachers. 1995. Defining world class standards.
    Washington, DC: American Federation of Teachers.
Bishop, J. H. 1993. Impacts of school organization and signaling on
    incentives to learn in France, the Netherlands, England, Scotland,
    and the United States. Working paper no. 93-21, Center for
    Advanced Resource Studies, School of Industrial and Labor
    Relations, Cornell University, Ithaca, NY.
Britton, E. D., and S. A. Raizen. 1996. Examining the examinations: An
    international comparison of science and mathematics examinations
    for college-bound students. Boston: Kluwer Academic.
Eicher, J. C. 1988. France. In The encyclopedia of comparative
    education and national systems of education, edited by T. N.
    Postlethwaite. Oxford, UK: Pergamon.
Medrich, E. A., S. Kagehiro, and J. Houser. 1994. Vocational
education in