
An evaluative process for assessing human reproductive and developmental toxicity of agents


Pergamon

Reproductive Toxicology, Vol. 9, No. 1, pp. 61-95, 1995
Copyright © 1995 Elsevier Science Ltd
Printed in the USA. All rights reserved
0890-6238/95 $9.50 + .00

0890-6238(94)00057-3

Special Article

AN EVALUATIVE PROCESS FOR ASSESSING HUMAN REPRODUCTIVE AND DEVELOPMENTAL TOXICITY OF AGENTS

JOHN A. MOORE,* GEORGE P. DASTON,† ELAINE FAUSTMAN,§§ MARI S. GOLUB,‡ WILLIAM L. HART,§ CLAUDE HUGHES JR.,‖ CAROLE A. KIMMEL,¶ JAMES C. LAMB IV,** BERNARD A. SCHWETZ,†† and ANTHONY R. SCIALLI‡‡

*Institute for Evaluating Health Risks, Washington, DC; †Procter and Gamble Company, Miami Valley Laboratories, Cincinnati, Ohio; ‡California Environmental Protection Agency, Office of Environmental Health Hazard Assessment, Sacramento, California; §Eastman Kodak Co., Rochester, New York; ‖Duke University Medical Center, Durham, North Carolina; ¶U.S. Environmental Protection Agency, Washington, DC; **Jellinek, Schwartz and Connolly, Inc., Arlington, Virginia; ††National Center for Toxicological Research, Jefferson, Arkansas; ‡‡Georgetown University Medical Center, Washington, DC; §§University of Washington, Seattle

The opinions expressed in this article are those of the authors and do not necessarily reflect policy positions of the organizations at which the individual scientists are employed.

Address correspondence to John A. Moore, Institute for Evaluating Health Risks, 1101 Vermont Avenue, NW, Suite 608, Washington, DC 20005-3521.

CONTENTS

CHAPTER I. INTRODUCTION
Introduction
I.1 General
    Use of data and judgment
    Weight of evidence
    Threshold assumption
I.2 Communication
    Primary audience
    Narrative statement
    Certainty
I.3 Data Sources and Acceptability
    Use all relevant data
    Good laboratory practices
I.4 Data Variability
    Limitations of current data
    Data needs
    Characterizing data as sufficient or insufficient
I.5 The Expert Committee

CHAPTER II. THE EVALUATIVE PROCESS
II.1 General Description
II.2 Details of the Evaluative Process
II.2.1 Exposure Data
II.2.2 General Toxicologic and Biologic Parameters
II.2.2.1 Chemistry
II.2.2.2 Basic Toxicity
    Acute studies
    Repeated-dose studies
    Genetic toxicity
    Other end points
II.2.2.3 Pharmacokinetics
II.2.3 Developmental and Reproductive Toxicity
II.2.3.1 Human Data
    Utility
    Types of epidemiologic studies
    Bias
    Confounding
    Timing of exposure
    Dose-effect outcome
II.2.3.2 Experimental Animal Toxicity
    Utility and limitations
    Adverse effect
    No adverse effect
II.2.4 Integration of Toxicity and Exposure Information
II.2.4.1 Interpretation of Toxicity Data
II.2.4.2 Default Assumptions
    Absorption
    Cross-species extrapolation
    Additivity
II.2.4.3 Quantitative Evaluation
    Identification of the NOAEL and LOAEL

    Calculation of the Benchmark Dose(s) (BMD)
    Calculation of the Margin of Exposure (MOE)
    Uncertainty Factors (UFs)
    Calculation of the Unlikely Effect Level (UEL)
    Definitions
II.2.5 Critical Data Needs
II.2.6 Summary
II.2.6.1 Background
II.2.6.2 Human Exposure
II.2.6.3 Toxicology
II.2.6.4 Quantitative Evaluation
II.2.6.5 Certainty of Judgment and Data Needs
II.2.7 References

CHAPTER III. END POINT DESCRIPTORS
III.1. Developmental Toxicity
III.1.1 Manifestations
III.1.1.1 Definitions
    Developmental toxicity
    Structural abnormalities
    Altered growth
    Functional developmental toxicity
III.1.1.2 Other Considerations
III.1.2 Human Data
III.1.2.1 Measures of Potential Adverse Effects
III.1.3 Experimental Animal and In Vitro Studies
III.1.3.1 Types of Studies
    Laboratory animal toxicity studies
    Short-term tests
III.1.3.2 Interpretation
III.2. Male Reproductive Toxicity
III.2.1 Manifestations
III.2.2 Human Data
III.2.2.1 Measures of Potential Adverse Effects
    Endocrine parameters
    Sexual behavior and interest
    Semen evaluations
    Biochemical markers
III.2.2.2 Interpretation
III.2.3 Experimental Animal and In Vitro Studies
III.2.3.1 Potential Measures of Determining Adverse Reproductive Effects
    Single-generation test systems
    Multigeneration test systems
    Continuous-breeding test systems
    Male dominant lethal test
    Subchronic toxicity test
    Chronic toxicity test
III.2.3.2 Interpretation
    Fertility indices
    Organ weights
    Organ morphology
    Sexual behavior
    Sperm evaluation
    Endocrine evaluations
    Biochemical markers of reproductive exposure and effect
    In vitro methods
III.3. Female Reproductive Toxicity
III.3.1 Manifestations
III.3.2 Human Data
III.3.2.1 Measures of Potential Adverse Effects
    Standardized fertility ratio
    Standardized birth ratio
    Infertility rate
    Time to pregnancy
    Age at puberty
    Age at menopause
    Menstrual cycle parameters
    Incidence of early pregnancy loss
    Incidence of ectopic pregnancy
    Endocrine parameters
    Sexual behavior and interest
    Breast milk
III.3.2.2 Interpretation
III.3.3 Experimental Animal and In Vitro Studies
III.3.3.1 Types of Studies
    Single-generation test systems
    Multigeneration test systems
    Continuous-breeding test systems
    Cyclicity
    Structural reproductive organ alterations
    Biochemical reproductive organ changes
    Timing of puberty or reproductive senescence
    Reproductive endocrine parameters
    Culture methods
    Organ perfusion
    Breast milk
III.3.3.2 Interpretation
    Indices
    Cytology abnormalities
    Weight and morphology changes
    Biochemical changes
    Alterations in age at puberty or reproductive senescence
    Endocrine parameters
    In vitro and perfusion systems
    Breast milk
REFERENCES


I. INTRODUCTION

Agents that may cause reproductive or developmental toxicity are of great concern to the general public. Despite this concern, both the regulatory and public health arenas have made somewhat haphazard use of the existing data when interpreting these health effects. Appropriate information is often unavailable to lay citizens and, even when it is available, may be interpreted very differently by regulators, public health officials, physicians, or others. In December 1989, the Institute for Evaluating Health Risks (IEHR) convened an ad hoc group of scientists to discuss the evaluation of chemicals for their potential reproductive or developmental toxicity. The group agreed that there was a clear need for an evaluation process and found, somewhat to their surprise, a strong consensus on many elements that should be incorporated into such a process. Working with the results of this meeting, IEHR succeeded in securing a balanced source of funding,¹ and through an iterative committee effort developed this written document, The Evaluative Process for Determining Human Reproductive and Developmental Toxicity of Agents. The committee benefited greatly from two particular activities: broad public review and comment on a draft in the spring of 1992, and the experiences of an expert committee that used a revised draft to evaluate a selected number of chemicals that provided a broad representation of data types and toxicologic effects. The evaluations of lithium, boric acid, ethylene glycol, and diethylhexyl phthalate and its major metabolites are published separately in appropriate journals.

The Evaluative Process represents scientific consensus among individuals from regulatory, industrial, and academic sectors and calls for the systematic application of knowledge and judgment in a practical, open, and informative manner. Several principles and objectives that are embodied in the Evaluative Process are described below.

¹Funds for this project were provided by the following organizations: American Industrial Health Council, Ashland Chemical Co., BP America, Bechtel, Bristol-Myers Squibb Co., Chevron, Coca Cola Co., the U.S. Department of Agriculture, Dow Chemical Co., Eastman Kodak Co., U.S. Environmental Protection Agency, Exxon Corp., FMC Corp., Ford Motor Co., General Electric Foundation, Hoechst Celanese Corp., Merck Company Foundation, Mobil Research and Development Corp., Monsanto Co., Occidental Chemical Corp., Olin Corp., OxyChem, Pacific Gas and Electric Co., Procter and Gamble Co., Pfizer Inc., Rhone-Poulenc Inc., Rohm & Haas Co., SC Johnson Wax, Syntex, Texaco Foundation, and United Technologies.

I.1 GENERAL

Use of Data and Judgment

The Evaluative Process uses both scientific data and scientific judgment. According to its principles, the data for a toxicant should adequately demonstrate adverse-effect and dose-response relationships for general toxicologic responses as well as for reproductive and developmental effects. Furthermore, there is a significant need for data that characterize human exposure. The essence of the evaluative process is that the interpretation of these data should reflect the expert judgment of a broad range of scientists from government, academia, and the private sector. Overall, the process should engender a desire to interpret the data, rather than to acquiesce to the passive use of a repetitive series of default assumptions. Requiring that an evaluation of a chemical include a statement of “what is known and the certainty with which it is known” should lead to the identification of critical data needs. The intent is that identifying critical gaps in the data will stimulate investigations to yield useful information that will enhance the certainty of judgment and better serve the public.

Weight of Evidence

With a weight-of-evidence approach that considers both toxicity and human exposure information, evaluators can determine whether human or experimental animal data can reasonably be used to predict reproductive or developmental effects in humans under particular exposure conditions. The approach must distinguish those chemicals for which there is firm evidence about human risk potential, based on relevant data, from those for which the potential for human effects is uncertain or even remote. It will, thus, aid public policy officials in setting priorities and developing programs to protect the public from undue exposure to known toxic agents or from undue costs of inappropriate regulation.

Using a weight-of-evidence approach to communicate a judgment about human risk diminishes reliance on the simplistic assumption that “an effect observed in animals predicts an effect in humans.” Because the Evaluative Process requires a judgment about human risk potential based on weight-of-evidence, its approach and its results will be more useful to our primary audience. This approach differs from several programs that assess carcinogenic potential, including (a) the International Agency for Research on Cancer (IARC) Monographs, which invoke “sufficiency of evidence” determinations for experimental data; (b) the Science Advisory Panel for the California Proposition 65 listing process, which follows a similar procedure in its review of carcinogenicity data; and (c) the Annual List of Carcinogens produced by the National Toxicology Program (NTP), which primarily lists the results of experimental animal studies. Although IARC and NTP clearly state that their deliberations do not represent a complete assessment of human risk potential, their monographs and lists continue to be misused for this purpose.

Threshold Assumption

It is assumed that there is a threshold for the chemical induction of reproductive and developmental effects, as for other types of toxicity. For this reason, human risk is a result of some defined exposure and must be determined both in terms of an individual’s characteristics at the time of exposure and in terms of such factors as route, duration, chemical form, and concentration (dose). Thus, the creation of a list of chemicals that cause reproductive and developmental toxicity as a means of tabulating the results of evaluations is rejected in favor of a clear narrative about each chemical.

I.2 COMMUNICATION

Primary Audience

The Evaluative Process should provide scientific judgments that will be of primary use to environmental, occupational, and public health officials, and useful to management officials in the public and private sectors. The information summarized and critically assessed through the Evaluative Process may also serve as a valued reference to physicians involved in medical counseling. Because the Evaluative Process must be fully consonant with the application of current scientific knowledge, its acceptance by the targeted users will depend on review and endorsement by medical and scientific experts.

Narrative Statement

Communicating the results of a weight-of-evidence evaluation is best accomplished through a narrative document. A narrative permits expression of the degree of certainty associated with a judgment about the scientific evidence. The document must use terms that are meaningful to a policy official or decision maker with a modest level of science education, define these terms carefully, and use them consistently throughout. The narrative must use explicit candor in explaining the basis of the judgment, the breadth of expert support, the degree to which the judgment reflects the actual information, and the assumptions made in the absence of information.

Certainty

Documents produced under the Evaluative Process will clearly enunciate the level of confidence in the evaluative judgment. Any need to invoke a series of default assumptions will signify progressively greater degrees of uncertainty. Certainty that a judgment is correct based on the interpretation of essential data should be distinguished from “certainty based on defaults,” where default assumptions force evaluators to designate an agent as having toxic potential. The conservative default assumptions, based on prudent public health concerns, have a rightful place in the options available to risk assessors and managers. Such assumptions will be used as part of the Evaluative Process only where absolutely necessary, and always openly.

Finally, because the Evaluative Process adopts an open, candid, narrative form of communication, it minimizes the dissemination of inappropriate simplistic statements that are commonly misused and are needlessly alarming to the public.

I.3 DATA SOURCES AND ACCEPTABILITY

Use All Relevant Data

In reaching a determination about an agent’s potential toxicity to humans, the public’s need is met best by a consideration of all relevant data. Unfortunately, however, publication in the open scientific literature does not a priori qualify data as acceptable for evaluation. Many published articles present data in insufficient detail to allow them to be of use in risk evaluations. Furthermore, scholarly peer review, often touted as a valued prerequisite for publication, has proven over the years to vary widely in effectiveness.

In this Evaluative Process, decisions to use either published or unpublished data will depend on the quality and completeness of the data set. All data used by Expert Committees evaluating particular chemicals must be accessible for evaluation by other interested parties. To enable evaluators to use data that are considered confidential for legal or proprietary reasons, the Expert Committee can set up mechanisms on a case-by-case basis to allow these data to be an integral part of the Evaluative Process.


Good Laboratory Practices

Whether data are judged acceptable from the perspective of sound scientific design and interpretation will depend heavily on the actual review of specific studies. Good Laboratory Practices have been promulgated by the Organization for Economic Cooperation and Development (OECD) (1), the Food and Drug Administration (FDA) (2), and the U.S. EPA (3). These Good Laboratory Practices can serve as a useful guide in assessing the quality and completeness of reported data. Comparing the test design and completeness of data reporting to those outlined in test guidelines and procedures may be of particular value.

I.4 DATA VARIABILITY

Limitations of Current Data

Developmental toxicity studies typically assess whether structural abnormalities are associated with administration of an agent to a pregnant female during major organogenesis in the developing embryo. Very few studies, however, permit reasoned judgments about the potential of agents to affect postnatal function and development.

In the area of reproduction, one can assess general effects through analysis of two-generation studies. Specific parameters of male reproduction can also be assessed through histologic examination of testis and epididymis in a subchronic or chronic toxicity study. Sperm parameters, such as number, morphology, motility, and ability to penetrate ova, can also be evaluated from other types of studies. There is, however, almost a complete lack of data that specifically assess female reproductive function. Few acceptable noninvasive laboratory procedures for assessing female reproductive toxicity exist, and those that are available are rarely well validated.

Data Needs

If assessments of reproductive toxicity are to be meaningful, future research must give much more emphasis to the development and validation of test procedures; this is particularly critical for the assessment of female reproduction. Procedures that evaluate postnatal development need to be refined and used more consistently. In clinical settings, investigators have developed sophisticated procedures for health and safety assessments in humans; to date, counterparts to these tests have not been developed or validated for use in laboratory animals. Thus, where there is no test in animals, some chemicals that affect reproductive processes may be identified only after their effects have occurred in humans. An approach that emphasizes proper testing as a prerequisite for human exposure is preferable, because it would prevent disease. The current approach only identifies disease.

Characterizing Data as Sufficient or Insufficient

The Evaluative Process uses three generic criteria for judging data insufficient.

1. There are no data.

2. The studies are of limited utility as a result of deficiencies in their design and execution, or because there is insufficient detail in the available data to allow an independent analysis.

3. The available studies are acceptable, but the data are insufficient to reach a definitive conclusion; the study may, however, offer useful supplemental information.

Data sets that are insufficient for evaluating reproductive or developmental toxicity do not arise solely from studies that are unreliable and, therefore, unworthy of consideration. Information from in vitro or nontraditional in vivo studies, for example, frequently provides enough experimental evidence to corroborate other evidence for an adverse effect. Alone, however, these studies may not provide enough evidence to be considered as sufficient to identify an adverse effect.

A judgment that data are insufficient to establish an adverse effect does not mean that they are sufficient to establish lack of an adverse effect. Such a presumption would be erroneous. Sufficiency is a designation with stringent criteria; these are defined and discussed in later sections of the Evaluative Process.

I.5 THE EXPERT COMMITTEE

Any evaluation of a chemical should use committees of experts to provide the breadth of expertise that is rarely found in any one individual and to ensure that the views held by any one person are subjected to the scrutiny and acceptance of scientific peers. The positive experience of the International Agency for Research on Cancer, which uses groups of experts to develop the IARC Monographs on the Evaluation of Carcinogenic Risks to Humans, is similarly applicable to a process that evaluates reproductive and developmental toxicity.

The scientists selected for a particular Expert Committee should include experts in the chemicals and in the toxicologic effects to be evaluated, as well as in human exposure to the chemicals of interest. The desirability of having continuity in the Evaluative Process should also influence the selection of experts. A rotating core of scientific members who serve for a fixed period of time on a series of working groups will enhance consistency of reviews. The members of the rotating core are selected for their expert knowledge in a relevant scientific area and for their experience in public and environmental health practice.

Each Expert Committee member is required to participate as an independent scientist, and not as an emissary of government, industry, or any other organization. Any potential conflict of interest must be ascertained and, when necessary, individuals must state that a possible conflict exists when a particular chemical is being discussed and must not take part in any formal decision on its evaluation.

The Evaluative Process for Assessing Reproductive and Developmental Toxicity of Agents was developed with the expectation that its main use would be for the evaluation of industrial chemicals, pesticides, and drugs. The basic principles of the process can also be used, however, in the assessment of infectious or physical agents or as a model for the evaluation of other forms of toxicity. More generally, the Evaluative Process can offer an opportunity for the wider scientific community to become familiar with the process of risk assessment. The hope is that a broader appreciation of the need for scientific data and knowledge to enhance the certainty of judgments formulated during a risk assessment will stimulate additional research that will advance the quality of risk assessment procedures.

II. THE EVALUATIVE PROCESS

II.1 GENERAL DESCRIPTION

The Evaluative Process describes a systematic, sequenced procedure for reviewing data on reproductive and developmental toxicity, on the chemical’s general toxicologic and biologic parameters, and on the conditions of use that result in exposure. The goal is to determine whether an agent has the potential to cause reproductive or developmental toxicity in humans. Expert judgment is applied in a series of steps that are reviewed in the following sections of this chapter: II.2.1. Exposure Data; II.2.2. General Toxicologic and Biologic Parameters; II.2.3. Developmental and Reproductive Toxicity; II.2.4. Integration of Toxicity and Exposure Information; II.2.5. Critical Data Needs; II.2.6. Summary; and II.2.7. References.

These steps reflect the systematic thought sequences used by most experienced risk assessors. Sections II.2.1-II.2.6 of the Evaluative Process develop judgments for each of three general developmental or reproductive toxicity effects: developmental toxicity, female reproductive toxicity, and male reproductive toxicity. Brief summaries of the sections that describe each step appear below, followed by more detailed presentations in the rest of the chapter.

Section II.2.1, Exposure Data, discusses the pattern and degree of human exposure to the agent. It considers Consumer, Environmental, and Occupational exposures, and develops numerical estimates of exposure from what is known about these uses and exposures.

Section II.2.2, General Toxicologic and Biologic Parameters, reviews and summarizes the chemical data and basic toxicity information available on the agent of interest, and also reviews data associated with absorption, distribution, metabolism, and excretion. These latter data are summarized later in Section II.2.4, Integration of Toxicity and Exposure Information.

Section II.2.3, Developmental and Reproductive Toxicity, reviews data on developmental and reproductive toxicity from human studies and experimental animal studies. To ensure adequate assessments of both types of data, members of an expert review committee review each type of data independently and prepare synopses of individual studies that are then integrated with other data in the next step in the evaluation.

In the Integration of Toxicity and Exposure Information, described in Section II.2.4, the existing data on developmental and reproductive toxicity obtained from experimental animal and human studies are evaluated together for evidence of complementarity or inconsistency. These evaluations are then assessed in terms of the known data on basic toxicity and pharmacokinetics. The result is an integrated judgment about the relevance of all the data for predicting potential risk for humans. If the review committee members judge that the toxicity data are relevant to humans, they then undertake a quantitative evaluation, drawing upon information presented in the Exposure section.

The next step in the evaluation, described in Section II.2.5, is the identification of Critical Data Needs. When the data reviewed are deficient, the ensuing judgments usually involve a large degree of uncertainty. This step in the evaluation will identify deficiencies in the existing data only if research to fill those data gaps will materially enhance the certainty of future judgments about an agent’s risk potential.


The Summary, described in Section II.2.6, reviews the scientific judgments and conclusions formed in the steps above, and conveys the degree of confidence in the judgment. So that it will be clearly understandable to its intended audiences, which include public officials as well as public health and environmental health professionals, the Summary is written in a narrative style. The narrative is central to the accurate interpretation of the scientific judgments and conclusions about the agent of interest. Agents that are potential reproductive or developmental toxicants present a risk to human health only under certain conditions. Cryptic designations, such as “positive” or “negative,” cannot effectively communicate this critical fact. Nor can essential facts about such parameters as frequency, duration, and route of exposure, susceptible populations, age, and reproductive status be conveyed without some sense of context. For these reasons, a narrative form of summary is crucial.

The last step, described in Section II.2.7, is a presentation of references for papers and studies of the agent of interest. The first part provides a list of the references that were cited in the Evaluation, while the second part lists all references considered for the evaluation.

Chapter III of this document, Endpoint Descriptors, provides a terse description of the core data, and their interpretation, commonly used to evaluate developmental and reproductive toxicity.

Several documents were found to be of particular value in developing the Evaluative Process. The U.S. Environmental Protection Agency (U.S. EPA) has published Revised Guidelines for Developmental Toxicity Risk Assessment (4); Guidelines for Reproductive Toxicity Risk Assessment were proposed in 1988 (5,6) and are currently being completed. These guidelines lay out the general principles for the interpretation and use of data for risk assessment.

II.2. DETAILS OF THE EVALUATIVE PROCESS

The sections below detail the steps of the Evalu-

ative Process. Figure 1 illustrates the structure of

an Evaluation (Lithium) in outline form.

II.2.1 Exposure Data

In this step, human exposure data are evaluated

to achieve three goals:

1. To ascertain whether there are patterns of use that result, or probably result, in human exposure.

2. To describe the parameters associated with each pattern of use. These include route, dose, frequency, age, and number of people potentially exposed.

3. To estimate the range of exposure and, thus, obtain quantitative estimates of the exposures associated with patterns of use.

Although human exposure data are essential for

accurate evaluation of an agent’s risk potential, data

of sufficient quality and quantity are frequently un-

available. Thus, there is uncertainty in the exposure

component of the evaluative process, even as there

is in the hazard identification step of risk assessment.

In instances where toxicity data indicate potential

for an adverse effect, the need to estimate the nature

of human exposure becomes imperative. In these

instances, gaps in the exposure data trigger the need

to employ one or more default assumptions about

human exposure. The greater the number of default

assumptions employed, the greater the uncertainty

about the accuracy of the expert judgment.

A chemical may have a variety of uses in our

society. For each use, the concentration, route, and

frequency of exposure may be quite different. The

physical form of the chemical and the presence of

other agents may also vary with use. These factors

can dramatically influence both the probability that

exposure will lead to absorption into the body and

the rate at which absorption occurs. Some uses may

lead to indirect exposures, which may result from

either deliberate or incidental environmental re-

leases of the chemical. Pesticide residues in food are

an example of exposure that arises from a deliberate

environmental release. Incidental or deliberate re-

leases of pesticides, through normal use or accident,

may lead to exposure through drinking water or the

air we breathe. Some exposures are deliberate: ex-

amples include consuming a chemical as a drug,

bathing in a pool that contains chemicals to control

algae, pH, and clearness, or using chemicals to mask

odors. Although the frequency and intensity of exposure to an agent are typically greatest in occupational

settings, sometimes consumer use of certain prod-

ucts may lead to episodes of exposure intensity that

approach or exceed occupational exposures. Exam-

ples include some pesticide uses in the home, furni-

ture refinishing, cosmetics, nonprescription drugs,

and home remodeling.

PREFACE
INTRODUCTION
1. EXPOSURE DATA
1.1 Consumer Exposure
1.2 Environmental Exposure
1.3 Occupational Exposure
1.4 Exposure Estimates
2. GENERAL TOXICOLOGIC AND BIOLOGIC PARAMETERS
2.1 Chemistry
2.2 Basic Toxicity
2.3 Pharmacokinetics
3. DEVELOPMENTAL AND REPRODUCTIVE TOXICITY DATA
3.1 Human Data
3.1.1 Developmental Toxicity
3.1.1.1 Register Studies
3.1.1.2 Prospective Studies
3.1.1.3 Retrospective Studies
3.1.1.4 Clinical Case Reports
3.1.2 Reproductive Toxicity
3.2 Experimental Animal Toxicity
3.2.1 Developmental Toxicity
3.2.1.1 Studies in Mice
3.2.1.2 Studies in Rats
3.2.1.3 Studies in Rabbits, Monkeys & Pigs
3.2.2 Reproductive Toxicity
3.2.2.1 Female Reproductive Toxicity
3.2.2.2 Male Reproductive Toxicity
4. INTEGRATION OF TOXICITY & EXPOSURE INFORMATION
4.1 Interpretation of Toxicity Data
4.1.1 General Toxicity & Pharmacokinetics Conclusions
4.1.2 Developmental Toxicity
4.1.2.1 Conclusions
4.1.3 Reproductive Toxicity
4.1.3.1 Female Reproductive Toxicity
4.1.3.2 Male Reproductive Toxicity
4.1.3.3 Conclusions
4.2 Default Assumptions
4.3 Quantitative Evaluation
4.3.1 Developmental Toxicity
4.3.2 Reproductive Toxicity
5. CRITICAL DATA NEEDS
5.1 Developmental Toxicity
5.2 Female Reproductive Toxicity
5.3 Male Reproductive Toxicity
6. SUMMARY
6.1 Background
6.2 Human Exposure
6.3 Toxicology
6.3.1 Developmental Toxicity
6.3.2 Reproductive Toxicity
6.3.2.1 Female Reproductive Toxicity
6.3.2.2 Male Reproductive Toxicity
6.4 Quantitative Evaluation
6.4.1 Developmental Toxicity
6.4.2 Reproductive Toxicity
6.5 Certainty of Judgments and Data Needs
6.5.1 Developmental Toxicity
6.5.2 Reproductive Toxicity
7. REFERENCES

Fig. 1. Example table of contents from an evaluation (lithium).

For many patterns of chemical use, reliable data that quantify exposure either do not exist or are not publicly available. In such instances, exposure estimates are developed through the use of default assumptions. When risk assessors estimate exposure to pesticide residues in foods, for example, they commonly assume that the consumer absorbs 100% of the residue, that the residue is present in all of the food crop under consideration, and that the residue concentration is at the legal maximum. Each assumption usually represents a conservative decision

about the value that is to be used when no specific

information is available. In addition to these chemi-

cal-specific default assumptions, a wide range of

other exposure-related default values are used in

estimating exposures. These include the amount of

water consumed each day, volume of air inhaled,

amount of soil a child ingests, average surface area of

skin, and the frequency of consumption and portion

sizes of certain foods. The quality of the studies

from which these values are derived varies greatly.

For consistency, these default values do not vary

on a case-by-case basis unless there is a compelling

reason to change them. Generic revisions of these

values take place as more definitive data become

available, but these revisions are infrequent.
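
The arithmetic behind such a default-based estimate can be sketched as follows. This is an illustration only, not a procedure prescribed by the Evaluative Process or by any agency: the function name, its parameters, and every numeric input are hypothetical. The defaults of 1.0 stand for the conservative assumptions described above (residue present in all of the crop, complete absorption).

```python
def screening_dietary_exposure(residue_mg_per_kg_food,
                               food_intake_kg_per_day,
                               body_weight_kg,
                               fraction_of_crop_treated=1.0,
                               fraction_absorbed=1.0):
    """Estimate a daily dose in mg per kg body weight per day.

    The defaults of 1.0 encode the conservative assumptions: every unit
    of the crop carries the residue and all of it is absorbed.
    """
    if body_weight_kg <= 0:
        raise ValueError("body_weight_kg must be positive")
    for name, value in (("fraction_of_crop_treated", fraction_of_crop_treated),
                        ("fraction_absorbed", fraction_absorbed)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    intake_mg_per_day = (residue_mg_per_kg_food * food_intake_kg_per_day
                         * fraction_of_crop_treated * fraction_absorbed)
    return intake_mg_per_day / body_weight_kg


# Hypothetical inputs: residue at a legal maximum of 2.0 mg/kg food,
# 0.15 kg of the food eaten per day, 60 kg adult.
default_dose = screening_dietary_exposure(2.0, 0.15, 60.0)

# Same scenario after hypothetical measured data replace two defaults.
refined_dose = screening_dietary_exposure(
    2.0, 0.15, 60.0, fraction_of_crop_treated=0.25, fraction_absorbed=0.5)

print(f"default assumptions: {default_dose:.6f} mg/kg/day")
print(f"with measured data:  {refined_dose:.6f} mg/kg/day")
```

Because each default sits at its upper bound, replacing any of them with actual data can only lower the estimate, which is why the number of defaults in use is a rough guide to how conservative an estimate is likely to be.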

It is beyond the scope and ability of this Evaluative Process to propose the review of each exposure

parameter on a chemical- and use-specific basis. The

document will employ exposure paradigms and val-

ues that are in regular use in governmental agencies

or that are recommended by scientific organizations.

The specific paradigm employed and values selected

will be referenced in each instance. Consistent with

our stated preference to supplant default assump-

tions with actual data, the process will make reason-

able efforts to ascertain the availability and quality

of such data and will explicitly state where it makes

use of default assumptions or actual data.

The process will use the following general references: the Exposure Factors Handbook (7), the revision to the EPA Exposure Assessment Guidelines

(8), and Quantitative Risk Assessment for Environ-

mental and Occupational Health (9). In analyzing

exposures in certain occupational settings, the docu-

ment will use the Pesticide Assessment Guidelines,

Subdivision U, “Applicator Exposure” (10). For

estimating dietary exposure to pesticides, it will use

the Dietary Risk Evaluation System (11).

Each evaluation will provide basic information

for each agent, including Chemical Abstracts Ser-

vice Registry Number, Chemical Abstracts Primary

Name, IUPAC Systematic Name, major synonyms

and common names, basic physical and chemical

properties, and technical products that contain the

agent. The evaluations will describe patterns of use,

with specific emphasis on the potential for direct or

indirect exposure. Data on real or estimated levels

of exposure will be of the highest interest, as will

information on the population distribution, inten-

sity, routes, and durations of exposure. Other data

of value include industrial hygiene measurements at

the point of manufacture, materials balance (inputs,

products, and waste) at the point of production, ship-

ping patterns and methods of transportation, indus-

trial hygiene measurements at point of use if rele-

vant, and ambient air monitoring data. Where

multiple patterns of use or routes of exposure occur,

an effort will be made to determine whether certain

patterns account for greater magnitudes of expo-

sure. The evaluations will also consider data on envi-

ronmental fate and transport to ascertain whether

these might play a significant role in the estimation

of exposure. They will also seek Toxic Release In-

ventory data as well as such data as residues in food

and potable water. For drugs, the evaluations will

gather basic dosimetry information, including a pro-

file of the user population.

II.2.2 General Toxicologic and Biologic Parameters

II.2.2.1 Chemistry

Generic chemical class data are often relevant

to assessing potential toxicity and should be a part

of the evaluation. This type of information includes

structure-activity relationships and physical chemi-

cal properties, such as melting point, boiling point,

solubility, and octanol-water partition coefficient.

An agent’s fate and transport, that is, which break-

down products can be found in different environ-

mental media under various conditions, can be espe-

cially relevant to estimations of exposure. Other

useful chemical data are the soil sorption constant,

the bioconcentration factor, and the bioaccumula-

tion factor.

II.2.2.2 Basic toxicity

Reproductive or developmental toxicity end-

points must be interpreted in the context of nonre-

productive toxicity that may also occur in the same

animals. Nonreproductive toxic effects reported

from other studies can be of particular value because

excessive nonreproductive toxicity could signifi-

cantly confound the interpretation of a reproductive

or developmental toxicity study. Observations from

studies of nonreproductive toxicity may either

strengthen or weaken the conclusions that might be

drawn from a reproductive or developmental study.

Relevant toxicity data typically originate from

acute (single dose) or repeated dose studies of up

to 90 d duration. Protocols for these studies have

been developed by federal regulatory agencies, such

as the Food and Drug Administration (FDA) and the

Environmental Protection Agency (EPA). Interna-

tional entities such as the Organization for Economic

Cooperation and Development (OECD) also pro-

mulgate broadly accepted test guidelines.

Acute studies. Acute studies, where the primary

endpoint is lethality, are most often conducted in

rats or mice by the oral or inhalation routes, or in

rabbits by the oral or dermal routes of exposure.

Although acute lethality data are not predictive

of either reproductive or developmental toxicity, the

data are useful indicators of divergences in species

or route sensitivity. Where there are significant spe-

cies differences in acute toxicity, for example, one

would also expect differences between species in

the doses that possibly cause reproductive or devel-

opmental toxicity. Dose levels for reproductive and

developmental toxicity studies are selected on the

basis of general toxicity parameters, such as mortal-

ity, body weight, organ weight, and gross necropsy

findings. Because the No Observed Adverse Effect

Level (NOAEL) (see section II.2.4.3) and Lowest Observed Adverse Effect Level (LOAEL) (see section II.2.4.3) are defined within the range of doses

tested, species differences in the dose range for test-

ing will be manifested as species differences in

NOAELs and LOAELs for most chemicals.
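
The dependence of the NOAEL and LOAEL on the doses actually tested can be made concrete with a minimal sketch. The function, the data layout, and the dose values below are hypothetical, and the sketch takes the adverse/not-adverse call for each dose group as given; in practice that call rests on statistical and biological judgment.

```python
def identify_noael_loael(dose_groups):
    """Return (noael, loael) from (dose, adverse_effect_observed) pairs.

    LOAEL: lowest tested dose at which an adverse effect was observed.
    NOAEL: highest tested dose below the LOAEL (or the highest tested
    dose overall when no adverse effect was seen at any dose).
    Either value is None when the tested range does not define it.
    """
    ordered = sorted(dose_groups)
    doses_with_effect = [dose for dose, adverse in ordered if adverse]
    loael = doses_with_effect[0] if doses_with_effect else None
    doses_below_loael = [dose for dose, _ in ordered
                         if loael is None or dose < loael]
    noael = doses_below_loael[-1] if doses_below_loael else None
    return noael, loael


# Hypothetical studies (doses in mg/kg/day) in two species.
rat_study = [(10, False), (30, False), (100, True), (300, True)]
mouse_study = [(10, False), (30, True), (100, True)]

print("rat   (NOAEL, LOAEL):", identify_noael_loael(rat_study))
print("mouse (NOAEL, LOAEL):", identify_noael_loael(mouse_study))
```

Both values can only ever equal one of the tested doses, so a study whose lowest dose already produces an effect yields a LOAEL but no NOAEL.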

Acute toxicity data may also suggest the extent

of absorption through different routes of exposure.

If, for example, systemic toxicity or death can occur

as a result of significant absorption of the chemical

through dermal exposure, it must be assumed that

dermal exposure might also cause reproductive or

developmental toxicity.

Repeated-dose studies. In subchronic studies,

animals are exposed for periods that typically range

over 14, 28, 60, or 90 d. These exposures may occur

by oral, inhalation, or dermal routes of exposure.

Most commonly, these studies use rats or mice, but

data are sometimes available from studies in rabbits

(especially the dermal route), dogs, or subhuman

primates.

Repeated-dose studies identify the organs that

are principally affected by toxicity. Data from these

studies are also used to define the slopes of dose-

response curves and NOAELs and LOAELs for the

toxic end points, and to identify sex and species

differences in toxicity at sublethal levels of expo-

sure. Measurements and observations during these

studies include body weight, clinical signs, feed and

water consumption, and clinical pathology parame-

ters (hematology and clinical chemistry measures of

organ function). At the termination of the study, all

animals undergo gross necropsy examination, and

selected organs (usually the liver, kidney, brain, go-

nads, ovaries, uterus, spleen, and thymus) are

weighed. Portions of all organs are preserved for

histopathologic examination.

Repeated-dose studies can easily be modified

to provide valuable information on reproductive or-

gans and, to a limited extent, function. For example,

accessory sex organs and the epididymides can be

weighed in males. Sperm can be examined for con-

centration, motility, and sperm abnormalities. Sper-

matid head counts are a useful measure of sperm

production. Proper fixation, embedding, and staining of the testis (beyond routine formalin-fixed, paraffin-embedded sections; see section III.2.3.2) can

permit detection of disrupted spermatogenesis.

Monitoring of the vaginal cyclicity of females can

be a useful complement to histologic or endocrine

data.

Changes in the weight or morphology of the

reproductive organs must be interpreted in the con-

text of other systemic or general toxicity. Such ef-

fects must be considered in the overall evaluation

of reproductive toxicity, especially if there is no

evidence of other systemic toxicity. The predictive

value of these observations for changes in reproductive function has been reviewed (12,13).

The dose at which toxicity to adult females is

observed in a reproductive study should be com-

pared with the corresponding doses at which toxicity

was observed in other toxicity studies. These com-

parisons should determine whether the pregnant or

lactating female may be more sensitive to an agent

than are nonpregnant females. The sensitivity of the

paternal animal, if the exposure takes place before

mating, should also be compared.

Genetic toxicity. Genetic damage is one possi-

ble mechanism, but not the only mechanism, by

which reproductive or developmental toxicity oc-

curs. Mutagens that cause reproductive or develop-

mental toxicity may, in fact, act through mecha-

nisms other than mutagenesis. For this reason,

results of genetic toxicity screens may not be useful

predictors of either reproductive or developmental

toxicity. Nonetheless, such in vivo procedures as

the dominant lethal test may provide relevant infor-

mation about reproductive or developmental toxic-

ity, aside from their assessment of mutagenicity.

Other end points. A review of human epidemio-

logic evidence on other nonreproductive and nonde-

velopmental diseases, such as cancer, acute toxic-

ity, neurotoxicity, or immunotoxicity, may provide

meaningful perspectives on target organ effects and

dose relationships. Such human data may also per-

mit meaningful comparisons between effects in hu-

mans and effects seen in experimental systems.

II.2.2.3 Pharmacokinetics

Data on the pharmacokinetics of a particular

agent, both in the species tested and in humans, can

be a great aid in making an extrapolation of toxic

dose levels between species. Information on absorp-

tion, half-life, steady-state, and peak plasma concen-

trations of the parent compound and metabolites,

placental metabolism and transfer, number of meta-

bolic pathways, and comparative metabolism may

be useful in predicting the risk of reproductive or

developmental toxicity in humans. Such data may

also be helpful in defining the dose-response curve,

developing a more accurate comparison of species

sensitivities (14,15), determining dosimetry at target

sites, and comparing pharmacokinetic profiles for

various dose regimens or routes of exposure. Al-

though there have been substantial advances in our

understanding of pharmacokinetics, there is still

considerable uncertainty about when and how hu-

man pregnancy changes chemical metabolism.

Pharmacokinetic studies in developmental toxi-

cology are most useful if they are conducted in ani-

mals during the stages in which developmental in-

sults occur. The correlation of pharmacokinetic

parameters and developmental toxicity data may en-

hance both our understanding of the effects ob-

served and their predictive value (16).

Because human pharmacokinetic data are often

lacking, absorption data for studies conducted in

laboratory animals, by any relevant route of expo-

sure, may assist those who must interpret animal

toxicity data for risk assessments. Results of a dermal developmental toxicity study that shows no adverse developmental effects, but that also shows

no evidence of dermal absorption, are potentially

misleading. Such a study would be insufficient for

risk assessment, especially if it were interpreted as

a “negative” study, i.e., one that showed no adverse

effect. In studies that have detected developmental

toxicity, regardless of the route of exposure, skin

absorption data can be used to establish the internal

dose in the dams for risk extrapolations. For specific

guidelines on both the development and the applica-

tion of pharmacokinetic data, risk assessors can con-

sult the conclusions of the Workshop on the Accept-

ability and Interpretation of Dermal Developmental

Toxicity Studies (17).

Effective management of human risk is most

likely to be accomplished through management of

human exposure. Animal toxicity studies typically

define a response as a function of exposure. Common descriptions of exposure (mg/kg, ppm × h, mg/m³ × h, etc.) are sometimes a poor surrogate for

the toxicologically important target-organ dose of

the active metabolites. This is particularly true for

the inhalation and dermal routes of exposure. Thus,

the extrapolation of potential human risk from ani-

mal toxicity data without further knowledge of inter-

nal dosage will involve uncertainty. This uncertainty

can be reduced by generating data that:

• Predict differences or similarities in toxicity by route- and species-specific information on absorption, distribution, metabolism, and elimination.

• Describe blood and tissue levels of the active toxicant at the target site (if possible) and relate these to the corresponding levels of toxicity.

• Identify the active toxicant.

• Identify target organs or cells.

The greater the depth of understanding of toxic-

ity and toxicant disposition in animals and humans,

the less will be the uncertainty of extrapolation

across species and routes. If, for example, there is

a known biomarker of effect for a particular toxicant,

the biomarker could be used to quantify human ex-

posure and, thus, to define a relatively direct means

of minimizing human risk. Target organ or cell do-

simetry would also be helpful, but such data are

usually available only for drugs. In practice, dosime-

try data are available only for blood measurements

(peak level and area under the curve [AUC], for both

parent material and metabolites), and these may or

may not correlate well with target organ toxicity.

Where sufficient data are available, physiologically

based pharmacokinetic models may provide the best

means for reducing uncertainty in extrapolation of

dosimetry.
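
The two blood dosimetry measures mentioned above, peak level and AUC, can be computed from a sampled concentration-time profile as sketched below. The linear trapezoidal rule is assumed here as one common way to approximate the AUC; the function name, sampling times, and concentrations are hypothetical, and no extrapolation beyond the last sample is attempted.

```python
def peak_and_auc(times_h, concentrations):
    """Return (peak concentration, AUC) for a sampled blood profile.

    AUC is approximated with the linear trapezoidal rule over the
    sampled interval only. Units follow the inputs, for example
    mg/L for the peak and mg*h/L for the AUC.
    """
    if len(times_h) != len(concentrations) or len(times_h) < 2:
        raise ValueError("need at least two matched time/concentration points")
    if any(later <= earlier for earlier, later in zip(times_h, times_h[1:])):
        raise ValueError("sampling times must be strictly increasing")
    peak = max(concentrations)
    auc = sum((t2 - t1) * (c1 + c2) / 2.0
              for t1, t2, c1, c2 in zip(times_h, times_h[1:],
                                        concentrations, concentrations[1:]))
    return peak, auc


# Hypothetical profile of a parent compound after a single oral dose.
sample_times_h = [0.0, 1.0, 2.0, 4.0, 8.0]
plasma_mg_per_l = [0.0, 4.0, 3.0, 2.0, 0.0]

peak, auc = peak_and_auc(sample_times_h, plasma_mg_per_l)
print(f"peak = {peak} mg/L, AUC(0-8 h) = {auc} mg*h/L")
```

Two dose regimens can share the same AUC while differing widely in peak level, which is one reason both measures are reported for parent material and metabolites.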

Interpretation of data from studies in which ma-

ternal animals are exposed during lactation should

take into account possible interactions of the agent

with maternal behavior, pup suckling behavior, and

milk composition. The analysis should further con-

sider possible direct exposure of pups via nursing,

dosed feed or water, and the dam’s skin, hair, or

feces (4).

II.2.3 Developmental and Reproductive Toxicity

Data for assessing reproductive or develop-

mental toxicity are derived either from observations

of humans or from experimental studies. It is beyond

the scope of this document to enumerate the kinds

of data that can permit a complete assessment of

reproductive and developmental toxicity that covers

all situations. The definition of a sufficient data set

changes as scientific knowledge accumulates on spe-

cific agents and as the understanding of the pre-

dictive capabilities of animal models and other pro-

cedures improves. Chapter III of this document

describes the types of studies that commonly pro-

vide such information and offers guidance in their

interpretation.

The reproductive and developmental toxicity

data component of the Evaluative Process deter-

mines one or the other of two judgments: (a) the

collective toxicity data are sufficient (or insufficient)

to ascribe an adverse effect under specified condi-

tions, or (b) the data are sufficient (or insufficient)

to conclude that there is no adverse effect under

specified conditions. To ensure a degree of system-

atic rigor, the process evaluates the experimental

animal data and the human toxicity data indepen-

dently. Each assessment uses a standardized format

to summarize the conditions of the test (species,

doses, dose route, and duration) in which the effect

(decreased sperm count, increased length of estrous

cycle, altered sexual differentiation of offspring,

etc.) was or was not observed.

The independent consideration of animal data

and human data is only an initial and incomplete

step. Only when these independent assessments are

combined and integrated with analyses from the

chemical and biologic data section does the assessment achieve significance for health evaluation purposes. Section II.2.4, Integration of Toxicity and

Exposure Information, describes this process.

II.2.3.1 Human data

Utility. Human data are typically found in case

reports or in the results of epidemiologic investiga-

tions. Case reports may describe one or a series of

clinical observations of disease similarities or appar-

ently common exposures to an agent. Although indi-

vidual case reports, by themselves, are never con-

clusive, case series reports that identify a cluster of

adverse effects have proven valuable in highlighting

priority areas of research.

Epidemiology has been defined as the study of

the distribution of a disease or a physiologic condi-

tion in human populations and of the factors that

influence this distribution. It lends itself to a two-

stage sequence of reasoning: (a) the determination

of a statistical association between a characteristic

and a disease and (b) the derivation of biologic inferences from such a pattern of statistical associations

(18). Epidemiologic studies that provide statistically

significant evidence of an association between ex-

posure to an agent and an adverse effect in individ-

uals that is consistent with proper biologic infer-

ences, may assume a dominant role in human risk

assessment.

Types of epidemiologic studies. Good epidemi-

ologic studies rely on careful measures of exposure,

disease, and confounding factors. They typically try

to establish associations between an exposure and a

disease by comparing the exposures and the disease

rates between exposed and unexposed groups. One

type of epidemiologic study, the ecologic study,

compares disease rates in different populations to

generate hypotheses about the causes of disease in

the population, based on differences in environment

or exposure between the two groups under compari-

son. Because the two populations may differ in many

factors other than the exposure of interest, an appar-

ent correlation between the exposure and disease

may, in fact, be the result of some other factor than

the one under study.

The most persuasive epidemiologic studies es-

tablish associations between a characteristic and a

disease within groups of individuals. A study of the

relationship between respiratory disease and air pol-

lution, for example, might combine individual mea-

surements of respiratory disease with aggregate

measurements of criteria pollutants in the ambient

air where the subjects work or live. Measuring indi-

vidual exposures reduces the likelihood of misclassi-

fying exposed individuals as unexposed and vice

versa. Such misclassification can dilute the study’s

ability to detect a true effect and bias the results

towards no effect.

Epidemiologic study designs fall into several

categories: the best-known are cross-sectional stud-

ies, cohort studies, and case-control studies. A

cross-sectional study simply surveys a group of peo-

ple for risk factors (exposures) and disease. Because

a cross-sectional study does not establish when ex-

posure happened in relation to the development of

disease, it cannot establish cause and effect.

In a cohort study, the individuals studied (the

cohort) are selected based on exposure to an agent.

That is, the investigator selects a group of individu-

als, based on their exposure status, and studies them

over time for the development of disease. The ideal

cohort study is prospective; that is, it identifies a

disease-free population and follows the group over

time. Many occupational studies, however, are his-

torical prospective studies that identify a group of

people who were disease free at some point in the

past and then follow their disease or mortality his-

tory up to the present.

In a case-control study, investigators select the

subjects on the basis of disease: disease cases and

comparable controls without the disease. Once the

subjects are selected, the investigators try to deter-

mine and compare the subjects’ exposure histories.

The case-control study, an extremely important de-

sign in epidemiology, is often the only feasible way

to study rare disease risks in humans. Case-control

studies are always retrospective; that is, they look

back at the past to learn about the exposures that

may have led to the disease.
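
Because a case-control study fixes the numbers of cases and controls by design, disease risk cannot be computed directly from it; the usual measure of association is the odds ratio, which compares the odds of past exposure among cases with the odds among controls. A minimal sketch follows, with hypothetical counts and a hypothetical function name.

```python
def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls):
    """Exposure odds ratio from a 2 x 2 case-control table."""
    if unexposed_cases == 0 or exposed_controls == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return ((exposed_cases * unexposed_controls)
            / (unexposed_cases * exposed_controls))


# Hypothetical study: 100 cases and 100 controls.
# 30 of the cases and 15 of the controls report past exposure.
study_or = odds_ratio(exposed_cases=30, unexposed_cases=70,
                      exposed_controls=15, unexposed_controls=85)
print(f"odds ratio = {study_or:.2f}")
```

For a rare outcome the odds ratio approximates the relative risk, which is part of what makes the design practical for rare disease risks.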

The criteria for judging epidemiologic studies include: (a) Strength of association: is the relative risk significantly greater than 1.0? (b) Consistency and specificity of observations: does the observed association appear in other studies, conducted with other study designs? (c) Evidence of dose response: are disease levels higher with higher exposure? (d) Appropriate temporal relationships: does the exposure precede the disease? (e) Biologic plausibility: are the results of the study consistent with what is known about the pathology and natural history of the disease and the agent?
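
Criterion (a) can be illustrated with a short calculation for a cohort study. The sketch assumes a 95% confidence interval built on the logarithm of the relative risk (a common large-sample approximation); the function name and all counts are hypothetical.

```python
import math


def relative_risk_with_ci(exposed_cases, exposed_total,
                          unexposed_cases, unexposed_total, z=1.96):
    """Return (relative risk, lower limit, upper limit).

    The interval uses the large-sample standard error of ln(RR);
    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if min(exposed_cases, unexposed_cases) <= 0:
        raise ValueError("both groups need at least one case")
    rr = ((exposed_cases / exposed_total)
          / (unexposed_cases / unexposed_total))
    se_log_rr = math.sqrt(1 / exposed_cases - 1 / exposed_total
                          + 1 / unexposed_cases - 1 / unexposed_total)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical small cohort: 20/200 exposed and 10/200 unexposed affected.
small = relative_risk_with_ci(20, 200, 10, 200)
# Hypothetical cohort ten times larger with the same proportions.
large = relative_risk_with_ci(200, 2000, 100, 2000)

for label, (rr, lower, upper) in (("small", small), ("large", large)):
    print(f"{label}: RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Both hypothetical cohorts give a relative risk of 2.0, but only in the larger one does the interval exclude 1.0, so strength of association has to be read together with study size.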

Most well-conducted epidemiologic studies also

address such issues as statistical power, confound-

ing, validity of exposure measures, validity of out-

come measures, and ability to generalize to other

populations at risk. Whether or not these issues need

to be addressed depends on the design of the study,

what was measured, and the findings. Careful mea-

surements of exposure and effects are necessary

to avoid misclassification of exposure and disease

status of the subjects. These careful measurements

are usually obtained through the use of validated

questionnaires, medical records, pathologic verifi-

cation of diagnosis, biomarkers, and industrial hy-

giene or other environmental measurement tech-

niques. Because random misclassifications of

exposure reduce a study’s ability to find true effects,

it is vital to the evaluative process that available

epidemiologic studies be carefully reviewed by

epidemiologists who are familiar with state-of-the-art reproductive epidemiology and exposure

assessment.

Bias. Bias in epidemiologic studies is a nonran-

dom mislabeling of observations that gives rise to

incorrect effect estimates. Common types of study

bias include selection bias, misclassification, and

confounding. Selection bias may result from meth-

ods used to identify and recruit subjects. Random

misclassification of individuals by disease outcome

or exposure levels leads to an underestimate of the

true effect. Differential (nonrandom) misclassifica-

tion leads to higher or lower estimates of risk, de-

pending on the direction, frequency, and magnitude

of the errors. In general, reports of epidemiologic

studies should discuss any potential for bias.
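
The effect of random (nondifferential) exposure misclassification can be shown with expected counts. In this sketch the exposure label is wrong with the same probabilities in diseased and nondiseased subjects; the cohort sizes, risks, sensitivity, and specificity are all hypothetical, as is the function name.

```python
def observed_relative_risk(n_exposed, n_unexposed,
                           risk_exposed, risk_unexposed,
                           sensitivity, specificity):
    """Expected relative risk when exposure is recorded imperfectly.

    sensitivity: probability a truly exposed subject is labelled exposed.
    specificity: probability a truly unexposed subject is labelled unexposed.
    Errors are assumed independent of disease status (nondifferential).
    """
    labelled_exposed = (sensitivity * n_exposed
                        + (1 - specificity) * n_unexposed)
    labelled_exposed_cases = (sensitivity * n_exposed * risk_exposed
                              + (1 - specificity) * n_unexposed * risk_unexposed)
    labelled_unexposed = ((1 - sensitivity) * n_exposed
                          + specificity * n_unexposed)
    labelled_unexposed_cases = ((1 - sensitivity) * n_exposed * risk_exposed
                                + specificity * n_unexposed * risk_unexposed)
    return ((labelled_exposed_cases / labelled_exposed)
            / (labelled_unexposed_cases / labelled_unexposed))


# Hypothetical cohort: 1000 exposed (risk 0.10), 1000 unexposed (risk 0.05).
true_rr = observed_relative_risk(1000, 1000, 0.10, 0.05,
                                 sensitivity=1.0, specificity=1.0)
biased_rr = observed_relative_risk(1000, 1000, 0.10, 0.05,
                                   sensitivity=0.8, specificity=0.9)
print(f"true RR = {true_rr:.2f}, observed RR = {biased_rr:.2f}")
```

With perfect classification the calculation returns the true relative risk; with imperfect but nondifferential classification the expected estimate moves toward 1.0, which is the underestimation described above.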

Confounding. A confounder is a factor associ-

ated with both the exposure and the outcome of

interest. Because unmeasured confounders can lead

to an inaccurate measure of a risk, it is important

to measure confounders accurately and control for

them in the analysis.
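
One standard way to control for a measured confounder is to stratify on it and pool the stratum-specific results. The sketch below uses the Mantel-Haenszel summary risk ratio for that purpose, which is an illustrative choice rather than a method named in this document; the two strata of a binary confounder and all counts are hypothetical.

```python
def crude_risk_ratio(strata):
    """Risk ratio after collapsing all strata into one table."""
    exposed_cases = sum(a for a, _, _, _ in strata)
    exposed_total = sum(n1 for _, n1, _, _ in strata)
    unexposed_cases = sum(c for _, _, c, _ in strata)
    unexposed_total = sum(n0 for _, _, _, n0 in strata)
    return ((exposed_cases / exposed_total)
            / (unexposed_cases / unexposed_total))


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel summary risk ratio across strata.

    Each stratum is (exposed_cases, exposed_total,
                     unexposed_cases, unexposed_total).
    """
    numerator = sum(a * n0 / (n1 + n0) for a, n1, _, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for _, n1, c, n0 in strata)
    return numerator / denominator


# Hypothetical data. The confounder is common among the exposed and
# raises disease risk on its own; within each stratum the exposure
# has no effect (risk 0.20 in the first stratum, 0.05 in the second).
strata = [
    (80, 400, 20, 100),   # confounder present
    (5, 100, 20, 400),    # confounder absent
]

crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"crude RR = {crude:.3f}, adjusted RR = {adjusted:.3f}")
```

In this constructed example the crude comparison suggests a doubling of risk, while the stratified estimate shows none; the apparent effect is produced entirely by the uneven distribution of the confounder.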

Timing of exposure. To ensure biologic plausi-

bility, an epidemiologic study should be able to show

that exposures occurred before the adverse effects

and at an appropriate time in the reproductive pro-

cess. Depending on the timing of exposure, one

might observe different adverse outcomes. Expo-

sure at an early gestational age, for example, may

result in a specific malformation, while exposure at

a later stage might result in premature delivery. It

is well known, for example, that a human fetus ex-

posed to thalidomide during the first trimester is at

risk for phocomelia, while the same exposure during

the last trimester is without such effect. The study

should consider exposure assessments for those pe-

riods when the adult male and female and the fetus

are considered most susceptible to the adverse effect

under investigation.

Dose-effect outcome. Although evidence of a dose response is an important criterion for judging the adequacy of a study, in rare instances an apparent inverse dose response may occur in studies of developmental toxicity. Low exposures, for example, may cause a certain incidence of malformations, while higher exposure levels that cause a high incidence of fetal loss may result in fewer observed malformations.

II.2.3.2 Experimental animal toxicity

Utility and limitations. The study of chemical exposure in animals has proven to be a reasonably efficient and effective means for ascertaining a chemical's toxic potential in humans. As a result, investigators have developed a standard series of animal test procedures that domestic and international regulatory bodies require for such chemicals as drugs, pesticides and, to a lesser degree, other industrial and commercial chemicals. Despite the proven utility of animal and other laboratory data, a number of factors can limit their use. Although there are close similarities in such biologic processes as fetal and embryonic development, sperm production, and estrous cycle, distinct differences and variations also exist between mammalian species. Such differences can confound the certainty of predicting that an effect seen in a laboratory species could occur in humans. It is not uncommon, for example, to observe an adverse effect in one animal species, while a second species either shows no effect or shows effects only at markedly different doses. There are practical constraints on the number of animals that can be studied; this places statistical limits on the certainty of some test results. Poor study design or laboratory practices can also compromise the data. Thus, because experimental animal toxicity data can be fallible, it is imperative that the Evaluative Process include a review and interpretation of animal data by scientists with appropriate training and experience. The logic that underpins their interpretation of data should be stated clearly in the evaluation so that other experts can understand the basis for the evaluative judgment.

Adverse effect. In general, three criteria must be met in order to conclude that the animal data are sufficient to indicate an adverse effect in animals under the specified conditions of the experiment:

At least one well-conducted study must show re-

productive or developmental toxicity in a mamma-

lian species. Instances where available study data

are not sufficient usually reflect improper design

or execution of a study, inadequate doses or dura-

tion of exposure, poor survival, or too few animals

to achieve statistical power.

Studies may be considered adequately conducted

but insufficient because the endpoints are not

clearly related to an adverse effect. Such data

should still be cited. For example, only one study

may be relevant to reproductive toxicity. This

study may have noted a decrease in production

of progesterone by cultured granulosa cells. Al-

though the study is adequate in every technical

respect, the data themselves are insufficient for

rendering an assessment of animal hazard because

the physiologic impact in vivo cannot be pre-

dicted. In such an instance, the evaluators should


consider more definitive test data as a critical data

requirement in Section 5.

The data must be interpreted as having biologic significance. In most instances, biologically significant data will also meet conventional statistical

criteria. Although the Evaluative Process strongly

endorses the application of appropriate and rigor-

ous statistical methods, it must be clear that, when

the study meets conventional statistical criteria,

it must also yield data that reflect an effect that is

both biologically plausible and considered ad-

verse.

In the occasional instance where there is statisti-

cal, but not biologic significance, the evaluation

must clearly articulate the basis for concluding

that the evidence is insufficient. For example,

pair-wise comparison to controls yields a statisti-

cally significant difference, but gives no indication

that the effect is dose related. The evaluation

should also discuss the degree of uncertainty that

is associated with such data.

Dose response. Evidence of a dose-response rela-

tionship is usually an important criterion in the

assessment of a toxic exposure. However, tradi-

tional dose-response relationships may not al-

ways be observed for some end points. With in-

creasing dose, for example, a pregnancy might

end in a fetal loss rather than in a live birth with

malformations.
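The masking effect described under the dose-response criterion can be made concrete with a small calculation. The sketch below uses invented litter counts (not data from any study) and a hypothetical helper, `summarize_group`; it only illustrates why the count of malformed live fetuses can fall at a high dose while the proportion of affected implants keeps rising.

```python
# Hypothetical counts per dose group, invented for illustration only:
# (dose in mg/kg/day, implants, dead or resorbed, malformed live fetuses)
GROUPS = [
    (0, 100, 5, 1),
    (10, 100, 10, 15),
    (100, 100, 80, 8),
]


def summarize_group(implants, dead, malformed):
    """Return (malformed count, proportion of implants dead or malformed)."""
    affected = (dead + malformed) / implants
    return malformed, affected


malformed_counts = []
affected_fractions = []
for dose, implants, dead, malformed in GROUPS:
    count, affected = summarize_group(implants, dead, malformed)
    malformed_counts.append(count)
    affected_fractions.append(affected)
    print(f"dose {dose}: malformed={count}, affected={affected:.2f}")
```

In this invented example the malformation count peaks at the middle dose, whereas the combined measure (fetal loss plus malformations) increases with dose. Combining end points in this way is one option among several and is an assumption of the sketch, not a prescription of the Evaluative Process.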

No adverse effect. Typically, the demonstration

of no adverse effect requires a larger set of evidence

than the demonstration of an adverse effect. For an

evaluation to be able to conclude that a chemical

does not carry a risk of developmental toxicity, the

available studies must have been conducted in at

least two mammalian species, with no adverse ef-

fects identified. A minimum data set for a conclusion

of no reproductive toxicity would normally consist

of at least one two-generation reproductive toxicity

study that gives no evidence of reproductive tox-

icity.

Additional studies are often warranted, espe-

cially when there is prior knowledge of the general

toxicity of a given agent or chemical class, or knowl-

edge of the pharmacologic activity of the agent. The

following represent some examples where the mini-

mum data set described above should not be relied

upon to demonstrate no adverse effect.

• The presence of an indicator for postnatal functional evaluation in the data base (e.g., adult neurotoxicity or neuropathology, hormonal activity, etc.) renders the developmental toxicity data base inadequate without an additional developmental neurotoxicity study.

A standard reproductive study in rats showing no

effect on male fertility should not be relied upon

to conclude no male reproductive toxicity in all

species. Such a conclusion is not reliable because

rodent fertility is generally not affected unless

sperm counts are reduced by approximately 90%,

while in humans, even a modest decrease in sperm

concentration may significantly reduce human

fertility.

Preexisting knowledge of the unique appropriate-

ness of an unconventional species (dog or subhu-

man primate) that is metabolically similar to hu-

mans may warrant the conduct of reproductive or

developmental toxicity studies in that species to

permit a more relevant assessment of toxicity.

As they are performed today, in vitro studies will

not by themselves provide sufficient evidence of

no adverse effect.

Studies in two species are often available in which

pregnant females were exposed to an agent during

pregnancy and killed just before parturition. This

permits full evaluation of adverse effects on

mother and fetus. Such “Segment II” studies de-

signed to determine an agent’s potential to cause

structural abnormalities, growth deficits, or death

are available in two species for a number of chemi-

cals, drugs, and pesticides, in particular. When

such studies demonstrate no adverse effects, the

evaluation should state that the data are sufficient

to conclude there is no evidence that the agent

may cause developmental toxicity manifested at

birth. They are not sufficient to show that other

developmental toxicity does not occur that is man-

ifested as impaired organ system function during

infancy or as an adult.

Similarly, the absence of adverse effects in a two-

generation reproductive study would not preclude

the possibility of significant reproductive toxicity

that is not manifested as a fertility problem.

II.2.4 Integration of Toxicity and Exposure Information

The integrated evaluation is conducted in two

stages. First, the evaluators examine the data for

relevance to potential human toxicity. Then, if they

determine the data to be relevant, they conduct a

quantitative assessment. At every stage, the evalua-

tion explicitly describes the default assumptions

used, if any.

II.2.4.1 Interpretation of toxicity data

This segment of the Evaluative Process consid-

ers all relevant information in the course of reaching

a judgment about whether or not a chemical has the


potential to cause developmental or reproductive

toxicity in humans. In most cases, animal data are

considered relevant indicators of human risk, unless

there is modifying information that suggests they are

not. The most common reason for concluding that

no hazard exists for humans is the availability of

sufficient experimental data that do not reveal ad-

verse effects in animal studies. Some experimental

data may demonstrate toxicity that is of limited rele-

vance to humans because of such reasons as species

differences in metabolism or sensitivity, lack of hu-

man exposure, or human evidence of no effect. Ani-

mal data, in which no adverse effects were observed,

do not always preclude human effects, nor do ad-

verse effects in animals inevitably predict human

toxicity.

The Evaluative Process requires an integrated

consideration of a variety of data. The integration

step involves combining the summary statements

that were formulated during the review of animal

and human reproductive and developmental toxicity

data and considering them in the context of systemic

toxicity parameters and pharmacokinetic data. A

weight-of-evidence approach is then used to formu-

late judgments about human hazard potential. In this

process, the evaluating committee develops three

separate statements to address developmental toxic-

ity, female reproduction, and male reproduction. In

each instance, the basis for the judgment is articu-

lated and takes particular note of such critical factors

as replication of effect across species, exposure

routes, dose-response parameters, relationship of

effective dose to doses causing other forms of toxic-

ity, and comparative metabolic data.

To achieve a degree of consistency in the inter-

pretation of experimental animal data, this docu-

ment uses three terms:

Assumed relevant indicates there is no modifying

supplemental information; in this case, human ex-

posure by any route is presumed to be modeled

by treatment of the most sensitive species by any

route.

Relevant identifies a data set in an experimental

animal species for which pharmacokinetic and/or

mechanism information is adequate to demon-

strate a particular similarity to humans.

Irrelevant means that pharmacokinetic or mechanistic features of the experimental animal model

are known in detail and are demonstrably inconsis-

tent with human exposure or response.
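The three terms above amount to a small decision rule. The following is a minimal sketch, assuming the available modifying information can be reduced to a single three-valued input; the names `Relevance` and `classify_relevance` are hypothetical and are not part of the Evaluative Process itself, which relies on expert judgment rather than a mechanical rule.

```python
from enum import Enum
from typing import Optional


class Relevance(Enum):
    ASSUMED_RELEVANT = "assumed relevant"
    RELEVANT = "relevant"
    IRRELEVANT = "irrelevant"


def classify_relevance(consistent_with_humans: Optional[bool]) -> Relevance:
    """Map modifying information to one of the three terms.

    None  -> no adequate pharmacokinetic or mechanistic information
    True  -> information demonstrates a particular similarity to humans
    False -> information is demonstrably inconsistent with humans
    """
    if consistent_with_humans is None:
        return Relevance.ASSUMED_RELEVANT
    if consistent_with_humans:
        return Relevance.RELEVANT
    return Relevance.IRRELEVANT


print(classify_relevance(None).value)
```

The default branch reflects the point made in the text: without modifying information, the data are treated as assumed relevant.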

For most agents, there is no detailed under-

standing of absorption, distribution, biotransforma-

tion, and excretion in experimental animals or hu-

mans. In these cases, studies of the most sensitive


experimental animal species would be assumed rele-

vant, and would, thus, drive the judgment of poten-

tial risk to humans.

Where possible, an evaluation should use meta-

bolic, pharmacokinetic, and mechanism-of-action

information to determine the relevance of experi-

mental data to humans. Should the available data for

a particular species demonstrate a pharmacokinetic

response similar to humans, the data from that spe-

cies will be considered relevant. But if, for example,

an agent given to an experimental animal requires

biotransformation to produce toxicity, and if hu-

mans are known to be incapable of that biotransfor-

mation pathway, then toxicity data from that experi-

mental animal species would be considered

irrelevant to humans.

Toxicity always depends upon exposure condi-

tions, such as route of administration, timing and

duration of administration, and dose. The conserva-

tive default assumption is that, without data to the

contrary, treatment of an experimental animal by

any route is assumed relevant to human exposure by

any route. This default assumption can be dropped

when adequate modifying information is available.

If, for example, an experimental animal study uses

oral dosing, and humans are known not to absorb

the agent by the oral route, then the experimental

data are irrelevant for human oral exposure (but not

necessarily for other routes of human exposure).

II.2.4.2 Default assumptions

Certainty of judgment of toxicity is in large part

proportional to the quality and amount of chemical-

specific data. In many instances, certain desired data

are not available; in such circumstances, it has been

traditional to adopt certain (default) assumptions

and proceed with the assessment. Default assump-

tions should incorporate all available information to

reduce the level of uncertainty as much as possible.

It may be necessary to choose from a range of rea-

sonably plausible default values, such as the volume

of inhaled air for the sedentary individual, for a

worker who performs physically demanding tasks,

and for an active jogger or marathon runner. In such

instances, the common practice is to choose assump-

tions that would estimate higher doses in individuals

who constitute the exposed population. In cases in

which there is little or no information available, the

assumption selected may deliberately represent a

“worst-case” value. In every case, defaults should

be used sparingly and openly, with full disclosure

of the degree of certainty. The general default as-

sumptions proposed for use in this evaluative pro-

cess are summarized below.


Absorption. Rates of absorption and elimination

are comparable among species.

If experimental animal absorption has been de-

termined but human absorption is unknown, human

absorption will be assumed to be the same as that

in the species with the highest degree of absorption.

When quantitative absorption data for a route

of exposure indicate differences between humans

and the relevant test species, the NOAEL may need

to be adjusted proportionately.
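A proportional adjustment of this kind might be sketched as follows. The function name, the direction of the ratio (matching the absorbed dose in humans to the absorbed dose at the animal NOAEL), and the example numbers are all assumptions made for illustration; the text itself says only that the NOAEL may need to be adjusted proportionately.

```python
def adjust_noael_for_absorption(noael, animal_fraction, human_fraction):
    """Scale an external-dose NOAEL by the ratio of absorbed fractions.

    Assumption: the absorbed dose at the animal NOAEL (noael * animal_fraction)
    is converted to the human external dose that yields the same absorbed dose.
    """
    for fraction in (animal_fraction, human_fraction):
        if not 0.0 < fraction <= 1.0:
            raise ValueError("absorbed fractions must be in the interval (0, 1]")
    return noael * animal_fraction / human_fraction


# Hypothetical values: animal NOAEL of 50 mg/kg/day, 40% absorbed in the
# test species, 80% absorbed in humans.
adjusted = adjust_noael_for_absorption(50.0, 0.4, 0.8)
print(round(adjusted, 6))
```

With these invented inputs the adjusted value is half the animal NOAEL, because humans are assumed to absorb twice the fraction absorbed by the test species.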

Cross-species extrapolation. When assessing

manifestations of toxicity, evaluators may base their

conclusions about relevance on the mechanism that

produces a toxicologic effect; however, a basic as-

sumption of the Evaluative Process is that any mani-

festation of reproductive or developmental toxicity

is relevant to humans unless the mechanism by

which it occurs is impossible in humans. For exam-

ple, if a toxic effect occurs in animals through an

inhibition of folic acid synthesis, this effect would

not be considered relevant for humans because hu-

mans do not synthesize folic acid. It is unusual,

however, to have such detailed knowledge about

mechanisms of toxicity in experimental animal

studies.

It should be noted that the particular type of effect produced in an experimental animal study does not generally have a bearing on determinations of relevance. If an agent causes tail defects in the offspring of treated mice, for example, this is not considered irrelevant to humans simply because humans do not have tails. Instead, the assumption is that the mouse study demonstrated that the chemical interfered with vertebral development and, therefore, has relevance for vertebral or other features of human development. Zymbal gland effects in rodents offer another example of an outcome that has no direct human tissue corollary, but such effects are assumed relevant for humans unless it is possible to demonstrate a mechanistic difference.

In the absence of data, intoxication and detoxication pathways in animals and humans are assumed to be qualitatively and quantitatively similar.

Adjustments of NOAELs from inhalation-exposure studies to a human equivalent concentration (HEC) (see section II.2.4.3), based on adjustments for minute volume, respiratory rate, and other factors, are appropriate for reproductive and developmental toxicity. Toxicity data are scaled directly from experimental animals to humans on the basis of minute volume per kg body weight for inhaled materials and body weight or surface area for other routes of exposure. The first priority is to use the internal dose at the target site if available.

Additivity. Exposure by multiple routes is assumed to be additive. The default assumption is that simultaneous exposure to multiple toxicants having the same site or mode of action results in additive effects. Thus, for example, estimates of the developmental toxicity of chlorinated dibenzodioxins and dibenzofurans should consider the use of toxic equivalency factors, provided the quantitative value assigned to each congener is relevant to the toxic effect under consideration.

II.2.4.3 Quantitative evaluation

Once an assessment has determined that the data indicate human risk potential, the next step is to perform a quantitative evaluation. Here, dose-response data from both human and animal reproductive and developmental toxicity studies are analyzed to select LOAELs and NOAELs and to calculate the benchmark dose or doses (BMD).

The assessment should use quantitative human dose-response data if the data span a sufficient range of exposure. Because data on human dose-response relationships are rarely available, the dose-response evaluation is usually based on the assessment of data from tests performed in laboratory animals.

Where there are experimental data from more than one species, the assumption is that humans are at least as sensitive as the most sensitive animal species. If the data indicate, however, that some particular species is a more relevant surrogate for humans, either because of physiologic similarity at the site of interest or because of the pharmacokinetic parameters associated with the chemical under review, such information will preempt this general assumption.

Evaluators must assume that a single exposure at a critical time in development or in the reproductive cycle may produce an adverse effect; that is, repeated exposure is not necessary for reproductive or developmental toxicity. The concept that there are “windows of vulnerability” for developmental toxicity is generally known and accepted. For example, a single exposure to TCDD on gestation day 11 will produce cleft palates. This concept also applies to reproductive toxicity. In females, for example, certain stages of the estrous cycle may be more susceptible to exposure to an agent; in males, certain stages of spermatogenesis may be prone to toxic effects.

The fact that toxicity may be cumulative with repeated exposure is another important consideration. In most cases, the data available for reproductive and developmental toxicity risk assessment are from studies that used repeated exposures. The NOAELs and LOAELs for reproductive and developmental effects are, however, usually based on a daily dose (e.g., mg/kg body weight/day) that is not adjusted for duration of exposure.

Identification of the NOAEL and LOAEL. The

dose-response evaluation defines the range of doses

of an agent that are effective in producing reproduc-

tive and developmental toxicity, the route of expo-

sure, the timing and duration of exposure, the spe-

cies specificity of effects, and any pharmacokinetic

or other considerations that might influence the com-

parison with human exposure scenarios. Much of

the focus is on identification of the critical effects

(the adverse effects observed at the lowest dose

level that shows an adverse effect) and the LOAEL

and NOAEL associated with the effect.

Although calculations of the chronic RfD often average over the total duration of exposure (19), the NOAEL for reproductive and developmental toxicity should not be adjusted automatically. In the case of chronic inhalation exposure, for example, if exposure to 500 ppm occurred for 6 h each day, the adjusted NOAEL for a chronic RfD would be calculated by multiplying by 6/24 to account for continuous exposure, yielding a value of 125 ppm. If a reproductive or developmental toxicity study used the same exposure, the default value would be 500 ppm. If, however, the human exposure scenario is continuous and pharmacokinetic data indicate accumulation with continuous exposure, it would be necessary to make appropriate adjustments. Pharmacokinetic information that relates blood levels of chemicals to the toxic response is critical in defining such dose-response relationships. However, information on peak blood levels or blood levels with time (area under the curve, AUC) is seldom available. Examples of agents for which such information is available, and for which adjustments in calculating the RfD could be made, include salicylates (importance of peak blood concentrations) (16) and chloroform (importance of continued levels with time, AUC). It is likely that a combination of the two is important, and their relative importance may depend on timing of exposure, as shown for 2-ethoxyethanol (20).

As more pharmacokinetic information becomes available, it is important to minimize the use of default assumptions and to encourage the use of actual pharmacokinetic data.

Calculation of the benchmark dose(s) (BMD). Because the literature describes a number of limitations in the use of the NOAEL (see U.S. EPA, 1991 [4] for a summary), the Evaluative Process will consider other methods for expressing quantitative dose-response evaluations. In particular, the BMD approach originally proposed by Crump, 1984 (21) will be used to model data in the observed range. This approach has recently been endorsed for use in quantitative risk assessment for developmental toxicity and other noncancer health effects (22). The BMD can be useful in interpreting dose-response relationships because it takes into account all the dose-response data and is not limited to the doses used in the experiment. In contrast, determination of the NOAEL or LOAEL is limited to the doses used in the experiment.

The BMD is a model-derived estimate of a particular level of response above background and corresponds to a response that is near the lower limit of the experimentally detected effects. To obtain the BMD, one begins by modeling the data in the observed range, resulting in a curve representing the probability of response for the experimental dose. The BMD is the lower confidence limit on the dose resulting in a particular level of response, e.g., 10% or ED10. Figure 2 illustrates the relationship between the dose-response model, the ED10, and the BMD10. Commercial software is available to model the dose-response and calculate the BMD (23,24).

Using the BMD approach, one can calculate a value for each effect of an agent for which sufficient data are available. In many cases, the data may be adequate to estimate the ED,, or ED,,,. A level between the ED,,, and the ED,, usually corresponds to the lowest level of observed risk that can be estimated for binomial end points without extrapolating to lower levels. The Allen et al. papers provide a broader discussion of these issues (23,24).

Calculation of the margin of exposure (MOE). The margin of exposure (MOE) is the simple ratio of the dose judged to be without effect to the anticipated levels of human exposure. The higher the ratio number, the greater the numerical distance between the human exposure estimate and the highest dose that is without adverse effect. The MOE calculation does not incorporate any quantitative value to account for any uncertainty of the data and the judgments derived from them.

Fig. 2. Illustrating the benchmark dose. [Figure not reproduced; x-axis: dose; y-axis: excess proportion of abnormal responses.]

Uncertainty factors (UFs). Factors derived from human and animal data are applied to the NOAEL to account for various uncertainties. The total size of the uncertainty factor varies, taking into account assumed or known interspecies differences, variability within species, quality and quantity of the data, consistency, slope of the dose-response curve, background incidence of the effects, and pharmacokinetic data. The relevance of the species, type of effect, dose, route, and timing and duration of exposure are additional factors that may influence the size of the uncertainty factor. A discussion of uncertainty factors is provided in the article of Lewis et al. and that of Renwick (25,26).

UFs for reproductive and developmental toxicity applied to the NOAEL often include a 10-fold factor for interspecies variation and a 10-fold factor for intraspecies variation. In general, an uncertainty factor is not applied to account for the duration of exposure, that is, whether the study includes 1-d, 10-d, or longer exposure periods. Thus, unless it has been modified by some other factor of uncertainty, the UF applied to the NOAEL is generally 100.

Additional factors may be applied to account for other uncertainties or additional information that may exist in the database. For example, in circumstances in which only a LOAEL is available, it may be necessary to use an additional uncertainty factor of up to 10, depending on the sensitivity of the toxicologic effects evaluated, the adequacy of the tested dose levels, or general confidence in the LOAEL. If a BMD has been calculated, it may be used to help interpret how closely the LOAEL approximates a level that would not be distinguishable from controls (equivalent to the NOAEL).

The experience gained from assessing several chemicals using a revised draft version of the Evaluative Process showed that committees did not routinely apply factors of 100 to the NOAEL. In one instance, interspecies and intraspecies factors were both reduced by half a log; in another instance, interspecies uncertainty was reduced by half. In both instances, knowledge of pharmacokinetics was useful in reducing uncertainty in the data for predicting human risk.

Calculation of the unlikely effect level (UEL).

The UEL is derived by applying uncertainty factors

to the NOAEL (or the LOAEL if a NOAEL is not

available). To calculate the UEL, one divides the

uncertainty factor selected into the NOAEL or

LOAEL for the critical effect in the most appropriate

or sensitive mammalian species. This approach is

identical to that used to derive the RfDDT (4) or an

RfD for reproductive toxicity based on less than the

life-time exposure. UEL calculations are appro-

priate for reproductive as well as developmental tox-

icity. The Evaluative Process uses the UEL both to

avoid the connotation that it is the value derived by

the EPA and because it may be calculated differently

by other regulatory agencies (e.g., the acceptable

daily intake [ADI] derived for food additives by the

FDA). The UEL is derived in this way so that it can

be used by a variety of organizations or individuals

for a variety of purposes; these purposes may in-

clude the derivation of additional figures through

the use of different uncertainty or modifying factors. Other approaches for more quantitative dose-response evaluations can be used when sufficient data are available. For example, when sufficient data are available, one may use the BMD approach described above (23,24,27). When more extensive data are available (on, for example, pharmacokinetics, mechanisms, biomarkers of exposure, and effect), one might use other quantitative modeling approaches to estimate low levels of risk. Such data sets are, however, rarely available.

Definitions. The following quantitative variables have been selected for use in the Evaluative

Process. We have included quantitative assessments

for those who do not have the means to complete

the calculations. However, individual organizations

and agencies are free to use the information con-

tained in a developmental and reproductive toxicol-

ogy review of an agent to construct additional quan-

titative assessments using their own preferred terms.

The no observed adverse effect level (NOAEL)

is the highest dose at which there is no biologically

significant increase in the frequency of an adverse

reproductive and developmental effect when com-

pared with an appropriate control group. Biologic

significance is based on expert judgment and consid-

eration of statistical analyses.

The lowest observed adverse effect level

(LOAEL) is the lowest dose at which there is a

biologically significant increase in the frequency of

adverse developmental effects when compared with

the appropriate control group. Biologic significance

is based on expert judgment and consideration of

statistical analyses.
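The two definitions above can be illustrated with a short sketch. It assumes that expert judgment has already flagged each dose group as showing, or not showing, a biologically significant increase; the function name `find_noael_loael` and the example doses are hypothetical. The sketch also assumes the common convention that the NOAEL is taken from doses below the LOAEL.

```python
from typing import List, Optional, Tuple


def find_noael_loael(
    groups: List[Tuple[float, bool]],
) -> Tuple[Optional[float], Optional[float]]:
    """Return (NOAEL, LOAEL) from (dose, biologically_significant) pairs.

    The significance flag is an input: it stands for the expert judgment
    and statistical analysis described in the definitions, not a computed test.
    """
    adverse_doses = [dose for dose, significant in groups if significant]
    loael = min(adverse_doses) if adverse_doses else None
    # Assumption: only doses below the LOAEL are candidates for the NOAEL.
    candidates = [
        dose
        for dose, significant in groups
        if not significant and (loael is None or dose < loael)
    ]
    noael = max(candidates) if candidates else None
    return noael, loael


# Hypothetical study with four dose groups (mg/kg body weight/day).
study = [(10.0, False), (30.0, False), (100.0, True), (300.0, True)]
noael, loael = find_noael_loael(study)
print(noael, loael)
```

When every tested dose is flagged as adverse, the function returns no NOAEL, which corresponds to the case in which only a LOAEL is available.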

UFs are values that are applied to a no-effect

level to account for variability in response among

individuals within and across species. By tradition,

the values employed are usually factors of 10 for

each area of variability (uncertainty), although each

factor may be reduced or enlarged according to the

quality and amount of data. A factor of 10 is also

commonly applied when the data identify only a

LOAEL instead of a NOAEL.

UEL is an estimate of the daily exposure of a

human population that is assumed to be without

appreciable risk of reproductive and developmental

effects.
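As an arithmetic illustration, the UEL can be sketched as the NOAEL (or LOAEL) divided by the product of the uncertainty factors. The function name, parameter names, and the example NOAEL of 50 mg/kg/day are hypothetical; the default factors of 10 follow the traditional values described above, and the optional third factor stands for the additional factor of up to 10 used when only a LOAEL is available.

```python
def unlikely_effect_level(noael_or_loael, interspecies_uf=10.0,
                          intraspecies_uf=10.0, loael_uf=1.0):
    """Divide the NOAEL or LOAEL by the product of the uncertainty factors."""
    total_uf = interspecies_uf * intraspecies_uf * loael_uf
    if total_uf <= 0:
        raise ValueError("uncertainty factors must be positive")
    return noael_or_loael / total_uf


# Default case: total uncertainty factor of 100.
uel_default = unlikely_effect_level(50.0)

# Both factors reduced by half a log (10 ** 0.5, about 3.16 each).
half_log = 10 ** 0.5
uel_reduced = unlikely_effect_level(50.0, half_log, half_log)

# Only a LOAEL available: an additional factor of 10 is applied here.
uel_from_loael = unlikely_effect_level(50.0, loael_uf=10.0)

print(uel_default, round(uel_reduced, 3), uel_from_loael)
```

Reducing both factors by half a log brings the total factor from 100 down to 10, which is why the second value is ten times the first.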

MOE expresses the magnitude of difference be-

tween a level of human exposure and the highest

level at which there is no significant increase in the

frequency of an adverse effect (NOAEL). It is de-

fined as the ratio of the NOAEL for a specific toxic

effect to the estimated human exposure.
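The ratio in this definition is simple enough to state directly. The sketch below uses a hypothetical function name and invented values; both quantities must be expressed in the same units.

```python
def margin_of_exposure(noael, estimated_human_exposure):
    """Return the ratio of the NOAEL to the estimated human exposure."""
    if estimated_human_exposure <= 0:
        raise ValueError("estimated human exposure must be positive")
    return noael / estimated_human_exposure


# Hypothetical values, both in mg/kg body weight/day.
moe = margin_of_exposure(50.0, 0.02)
print(round(moe))
```

No uncertainty factor enters this calculation, so the result has to be interpreted alongside a statement of how certain the underlying data are.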

Benchmark dose (BMD) is an estimate, derived

from a model, of a particular level of response above

background. The BMD is the lower confidence limit

for the dose for a specified level of effect.
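To show how a dose for a specified level of effect follows from a fitted model, the sketch below assumes a one-parameter extra-risk model, extra risk = 1 - exp(-slope × dose). This model, the slope values, and the function name are illustrative assumptions; they are not the model or software cited in this article. Under this model, a one-sided upper confidence bound on the slope yields a lower confidence limit on the dose, which plays the role of the BMD.

```python
import math


def dose_for_extra_risk(slope, extra_risk=0.10):
    """Solve 1 - exp(-slope * dose) = extra_risk for dose."""
    if slope <= 0 or not 0.0 < extra_risk < 1.0:
        raise ValueError("slope must be positive and extra_risk in (0, 1)")
    return -math.log(1.0 - extra_risk) / slope


# Hypothetical fit: best-estimate slope and its upper confidence bound
# (per mg/kg/day). Neither value comes from real data.
slope_estimate = 0.002
slope_upper_bound = 0.003

ed10 = dose_for_extra_risk(slope_estimate)      # central estimate of the dose
bmd10 = dose_for_extra_risk(slope_upper_bound)  # lower confidence limit on dose

print(round(ed10, 2), round(bmd10, 2))
```

Fitting the model and deriving the confidence bound are the substantive steps in practice and are deliberately left out of this sketch.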

The human equivalent concentration (HEC) is

used to describe the dose of an agent to which

humans are exposed through inhalation. The HEC

is the estimated concentration that is equivalent to

that used in an experimental animal species. The

HEC is estimated using adjustment factors that ac-

count for such species-dosimetric differences as

ventilatory parameters and lung surface areas, as

well as factors related to the gas, aerosol, or particu-

late nature of the agent.

II.2.5 Critical Data Needs

A primary objective of this evaluative process

is to use data to formulate and express judgments

about developmental and reproductive risk potential

for humans. Flawed or nonexistent data compromise

the certainty of scientific judgment. Yet there is no

hard-and-fast definition of what constitutes an ade-

quate database for a particular chemical. Although

guidance or regulations promulgated by agencies of

government serve definite needs, they are at the

same time somewhat rigid. For the evaluative pro-

cess that is proposed here, it seems best to determine

the adequacy of the database in a case-specific man-

ner. It is, for example, far preferable to ascertain

methanol toxicity and dosimetry estimates using a

species in which the folate metabolism pathway par-

allels that of humans, than to try to assess toxicity

and dosimetry in two species selected at random,

whose metabolism is either unknown or is not similar

to that of humans.

During the review of existing information, eval-

uators may identify certain data as insufficient for

judging human risks, either because they do not exist

or because the data are compromised in some key

way for risk assessment. In another chemical evalua-

tion, data may have been judged sufficient to deter-

mine the human risk potential, but in the judgment

of the evaluators there may be major degrees of

uncertainty due to reliance on default assumptions

or inherent uncertainty in some of the data that were

central to the evaluation. In each instance, the evalu-

ators will cite specific data needs if they judge that

such data will materially improve the certainty of

an existing judgment as to human risk.

II.2.6 Summary

The selection of the term “Summary,” instead

of “risk characterization,” to describe the conclud-

ing step in the Evaluative Process was a deliberate

decision. Some people have proposed that risk char-

acterization should be reserved for the summary of

a site-specific risk assessment. We find merit in this

recommendation, and, because a lack of detailed

exposure information will be the norm in this type

of evaluative process, we use the term “Summary.”

The Summary communicates scientific judgment

on chemical risk to the public and to policy makers,

as well as to public and environmental health

practitioners. The degree of certainty of the judg-

ment can, and must, be expressed in terms that are

meaningful to those with a general science educa-

tion. The key to achieving this goal is explicit candor

in explaining the basis of the judgment, its breadth

of support, and especially, the degree to which the

judgment reflects actual information, confident ex-

tensions from closely related data, or the invoking of

assumptions when there is no information available.

The summary, typically two to four pages in

length, will be written from statements developed

in three sections of the Evaluative Process: human

exposure, integrated evaluation, and critical data

needs. The summary will review the following ele-

ments.

II.2.6.1 Background

This section will provide a brief, readable re-

view of the general chemical, toxicologic, and bio-

logic characteristics of the chemical.

II.2.6.2 Human exposure

This section gives a clear statement of the condi-

tions of use or ambient concentrations that may pro-

duce different levels, routes, or frequencies of hu-

man exposure. It will describe how different patterns

of chemical use will produce differences in the mag-

nitude of exposure.

II.2.6.3 Toxicology

Summaries of developmental toxicity and of

male and female reproductive toxicity will appear

in this section. The discussion will also contain state-

ments about the sufficiency and relevance of the

data.

II.2.6.4 Quantitative evaluation

This section will list the quantitative values de-

rived in the Evaluative Process and state the degree

to which the values are derived from actual data or

reflect the use of default assumptions.

II.2.6.5 Certainty of judgment and data needs

The use of default assumptions, while often nec-

essary, represents a tangible expression of uncer-

tainty. To clarify this point, this section will discuss

the magnitude of an assumption’s impact on the

judgments made in this evaluation. Where the im-

pact is large and the uncertainty great, the evaluators

may sometimes defer a judgment. Where a default

assumption has a major effect on the evaluative judg-

ment, the evaluative summary should clearly define

the kind of data needed to supplant the default and

identify this as a critical data need.

Only some aspects of the assessment may in-

volve uncertainty of judgment. For example, al-

though there may be great certainty that the data

qualitatively predict human health risk potential,

the nature and degree of exposure may be poorly

understood. In this case, the evaluative summary

will clearly state that there is reasonable certainty

of human risk potential and why the quantitative

uncertainty (i.e., missing or inadequate exposure data)

leads to the use of a conservative default assumption

that likely overestimates the degree

of exposure.

II.2.7 References

In any evaluation of this nature, a meticulous

bibliography is imperative. All articles reviewed will

appear in a reference list at the end of the document.

Articles used in the evaluation will be cited in the

appropriate text and appear in an alphabetical list

in the introduction. When a specific chemical is re-

viewed, a separate alphabetical listing of references

reviewed but not used in the evaluation will also

appear in the introduction.

III. END POINT DESCRIPTORS

III.1. DEVELOPMENTAL TOXICITY

The manifestations of developmental toxicity

that are evaluated by the IEHR Evaluative Process

parallel, in large part, those delineated by the EPA

revised Developmental Toxicity Risk Assessment

Guidelines (EPA, 1991). In many instances, sections

or definitions were taken verbatim from the EPA

Guidelines, reflecting our belief that consistency in

choice of definitions and of terms aids in the commu-

nication and understanding of scientific data.

III.1.1 Manifestations

III.1.1.1 Definitions

Developmental toxicity. Adverse effects on the

developing organism that may result from exposure

before conception in either parent, exposure during

prenatal development, or exposure during postnatal

development from birth to sexual maturation. Ad-

verse developmental effects may be detected at any

point in the life span of the organism. Major manifes-

tations of developmental toxicity include death of

the developing organism, structural abnormality, altered

growth, and functional deficiency.

Structural abnormalities. Structural alterations

in development include both malformations and

variations. A malformation is usually defined as a

permanent structural change that may adversely af-

fect survival, development, or function. The term

teratogenicity refers only to malformations. The

term variation indicates a divergence from the usual

range of structural constitution that may not ad-

versely affect survival or health. Because there is

a continuum of responses from the normal to the

extremely deviant, distinguishing between varia-

tions and malformations can be difficult.

Altered growth. In the exposed offspring, al-

tered growth can result in an alteration in the size

or weight of an organ or in body weight or size.

Changes in one end point may or may not be accompanied

by other signs of altered growth; for example,

changes in body weight may or may not accompany

changes in crown-rump length or skeletal ossifica-

tion. Altered growth may occur at any stage of devel-

opment, and may be reversible or may cause a per-

manent change.

Functional developmental toxicity. Functional

developmental toxicity is the study of alterations or

delays in the physiologic or biochemical competence

of an organism or organ system after exposure to

an agent during pre- or postnatal development. In

any given test species, delayed development can

be assessed in relation to established landmarks for

physical, behavioral, and sexual maturation.

III.1.1.2 Other considerations

Carcinogenicity is another possible adverse de-

velopmental outcome. From the data collected thus

far, it appears that agents capable of causing cancer

in adults may also cause transplacental or neonatal

carcinogenesis (28). Further, prenatal exposure to

some agents may result in cancers in adulthood.

In humans, for example, carcinogenic effects have

occurred after prenatal exposures to diethylstilbes-

trol (29); in experimental animals, a number of

agents have been shown to cause cancer after prena-

tal exposures. Currently, there is no way to predict

whether the adult or the offspring will be more or

less sensitive to the carcinogenic effects of an agent.

Nor is it routine to test for carcinogenesis

after developmental exposure. However, additional

guidelines are available for evaluating chemicals for

their potential to cause carcinogenesis in developing

animals (e.g., OSTP Cancer Principles, Guidelines

for Carcinogen Risk Assessment [30]). In cases that

may involve mutagenesis, one can consult the

Guidelines for Mutagenicity Risk Assessment (31),

which specifically address the risks of heritable mu-

tation.

III.1.2 Human Data

III.1.2.1 Measures of potential adverse effects

In principle, human data are preferred for risk

assessment. For many agents, however, human data

are not available. The EPA Guidelines for Develop-

mental Toxicity (4) describe methods for generating

and evaluating human data, and discuss the weight

that human data should be given in risk assessments.

The recent EPA Guidelines on Developmental

Toxicity identify the most useful end points for risk

assessment purposes from epidemiologic studies as:

reproductive histories of certain pregnancy out-

comes (e.g., embryo/fetal loss, birth weight, sex ra-

tio, congenital malformations, postnatal function,

and neonatal growth and survival), and measures of

fertility/infertility including indirect evaluations of

very early embryonic loss. Postnatal outcomes for

examination could include physical growth and de-

velopment, organ or system function, or perfor-

mance on various standardized neurobehavioral

tests for infants and children (32). Factors requiring

control in the design or analysis (such as effect

modifiers and confounders) may vary, depending on

the specific outcomes selected for study.

The developmental outcomes available for epi-

demiologic examination are limited by a number of

factors, including: (a) the relative magnitude of the

exposure (because differing spectra of outcomes

may occur at different exposure levels), (b) the size

and demographic characteristics of the population,

and (c) the ability to detect the developmental out-

come in humans.

Epidemiologic studies are strengthened by in-

cluding information on the study’s ability to detect

an adverse developmental outcome, potential bias

in data collection, and control of potential effect

modifiers. EPA has defined effect modifiers as factors

that, at different levels, produce different exposure-response

relationships (4). For example, maternal age would be

an effect modifier if the risk associated with a given

exposure increased with

the mother’s age. A confounder is a variable that

is a risk factor for the disease under study and is

associated with the exposure under study, but is not

a consequence of the exposure. A confounder may

distort both the magnitude and direction of the mea-

sure of association between the exposure of interest

and the outcome. For example, socioeconomic sta-

tus might be a confounder in a study of the associa-

tion of smoking and fertility, as socioeconomic sta-

tus may be associated with both.
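The distortion that a confounder can introduce may be illustrated with a small numerical sketch. The counts below are invented for illustration only and do not come from any study cited here; the stratum labels are hypothetical. The Mantel-Haenszel summary risk ratio is used simply as one common way of adjusting for a stratifying variable, not as a method prescribed by this process.

```python
# Hypothetical cohort counts, stratified by a suspected confounder.
# Each stratum: (exposed cases, exposed total, unexposed cases, unexposed total)
strata = {
    "stratum_A": (80, 400, 20, 100),
    "stratum_B": (10, 100, 40, 400),
}

def crude_risk_ratio(strata):
    """Risk ratio after pooling all strata (the confounder is ignored)."""
    exposed_cases = sum(s[0] for s in strata.values())
    exposed_total = sum(s[1] for s in strata.values())
    unexposed_cases = sum(s[2] for s in strata.values())
    unexposed_total = sum(s[3] for s in strata.values())
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)

def mantel_haenszel_risk_ratio(strata):
    """Summary risk ratio that weights each stratum separately."""
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata.values():
        total = n1 + n0
        numerator += a * n0 / total
        denominator += c * n1 / total
    return numerator / denominator

crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"crude RR = {crude:.2f}, stratum-adjusted RR = {adjusted:.2f}")
```

In this invented example the exposure has no effect within either stratum, yet the pooled comparison suggests an excess risk, because most exposed subjects belong to the stratum with the higher baseline risk.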

III.1.3 Experimental Animal and In Vitro Studies

III.1.3.1 Types of studies

Laboratory animal toxicity studies. The most

common protocols for assessing developmental tox-

icity in laboratory animals involve administering a

test substance to pregnant animals (usually mice,

rats, or rabbits) during major organogenesis. Mater-

nal responses are monitored throughout pregnancy,

and the dam and uterine contents are examined just

before or at term (33-37). Other study protocols

involve exposure throughout pregnancy, or only

during the late prenatal and early postnatal periods.

Developmental toxicity may also be evaluated in

studies that treat either one or both parents before

conception, or that chronically expose several suc-

cessive generations to the toxicant of interest

(4,33-40).

Appropriate study designs include a number of

important factors. For example, considerations of

species, strain, age, weight, and health status gener-

ally form the basis for the selection of test animals.

Assignment of animals to dose groups by stratified

randomization (by body weight) reduces bias and

provides a basis for performing valid statistical tests.

At a minimum, an appropriate protocol specifies a

high-dose, a low-dose, and an intermediate-dose

group, as well as a concurrent control group. A con-

current control group treated with the vehicle used

for administering the toxic agent is a critical compo-

nent of a well-designed study (4). The high dose is

selected to produce some minimum maternal or

adult toxicity (a level that produces statistically sig-

nificant or specific organ toxicity, but causes no

more than 10% mortality). The low dose is generally

a NOAEL for adults and offspring; if the low dose

produces a biologically or statistically significant in-

crease in toxic effects, it is considered a LOAEL.
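Stratified randomization by body weight can be sketched as follows. This is a minimal illustration under stated assumptions: animals are ranked by weight, divided into consecutive blocks whose size equals the number of dose groups, and the group labels are shuffled within each block. The function name, animal identifiers, weights, and seed are all hypothetical, and other blocking schemes would serve equally well.

```python
import random

def stratified_randomization(weights, groups, seed=0):
    """Assign animals to dose groups so that each weight block
    contributes one animal to every group.

    weights: dict mapping animal id -> body weight (g)
    groups:  list of group labels (control plus dose groups)
    """
    rng = random.Random(seed)
    ranked = sorted(weights, key=weights.get)  # lightest to heaviest
    assignment = {}
    for start in range(0, len(ranked), len(groups)):
        block = ranked[start:start + len(groups)]
        labels = list(groups)
        rng.shuffle(labels)  # random order of groups within this block
        for animal, label in zip(block, labels):
            assignment[animal] = label
    return assignment

# Hypothetical example: 40 animals, one control and three dose groups.
weights = {f"A{i:02d}": 200 + i for i in range(40)}
groups = ["control", "low", "intermediate", "high"]
assignment = stratified_randomization(weights, groups)
```

Because every block of similar-weight animals is spread across all groups, the group mean weights stay close to one another while the assignment within a block remains random.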

Because the developing organism may be more

sensitive than the adult, agents that produce devel-

opmental toxicity at doses that are not toxic to the

maternal animal are of greatest concern. However,

when adverse developmental effects occur only at

doses that cause minimum maternal toxicity, they

still represent developmental toxicity and should not

be discounted as secondary to maternal toxicity. If

an agent causes severe maternal toxicity, however,

it may be difficult to interpret developmental effects

at these maternally toxic levels.

Ideally, a chemical database would include ani-

mal bioassays that test the effects of toxicant expo-

sure through the same routes by which humans are

exposed. However, most animal studies use the oral

route of exposure, and often it is necessary to extrap-

olate their results to account for other routes of

exposure.

A number of end points are possible indicators

of maternal toxicity (Table 1, taken from the EPA

1991 Developmental Toxicity Guidelines [4]).
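Two of the derived measures in Table 1 are simple arithmetic and can be sketched as below. The function names and example values are hypothetical, and expressing relative organ weight as a percentage is an assumption made here for illustration; Table 1 states only that organ weights may be reported relative to body or brain weight.

```python
def corrected_maternal_weight_change(gestation_weight_change_g, gravid_uterus_weight_g):
    """Body weight change throughout gestation minus gravid uterine
    weight (or litter weight) at sacrifice, as described in Table 1."""
    return gestation_weight_change_g - gravid_uterus_weight_g

def relative_organ_weight(organ_weight_g, reference_weight_g):
    """Organ weight as a percentage of a reference weight
    (body weight or brain weight). The percentage scale is an assumption."""
    if reference_weight_g <= 0:
        raise ValueError("reference weight must be positive")
    return 100.0 * organ_weight_g / reference_weight_g

# Hypothetical dam: gains 150 g over gestation, gravid uterus weighs 85 g.
corrected_gain = corrected_maternal_weight_change(150.0, 85.0)
# Hypothetical liver weight of 14 g in a 350 g dam.
liver_percent = relative_organ_weight(14.0, 350.0)
print(corrected_gain, liver_percent)
```

The corrected value separates change in the dam herself from growth of the conceptus, which is why it is listed separately from body weight change throughout gestation.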

Developmental effects induced by exogenous

agents may include death, structural abnormalities,

altered growth, and functional deficits. These may

result from exposure during various developmental

periods between conception and sexual maturation.

The types and pattern of effects may vary, de-

pending on the timing of exposure, the develop-

mental processes occurring at the time of exposure,

and the developmental periods between conception

and sexual maturation. For some end points, it may

not be possible to see traditional dose-response rela-

tionships. As exposure increases, for example, the

higher levels may be lethal to the offspring, causing

an observed decrease in malformations or other ad-

verse effects with increasing dose (41,42).
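The way lethality can mask a dose-response relationship for malformations may be seen in a small numerical sketch. All figures below are invented for illustration and the dose units are hypothetical. The combined "affected" measure (nonlive plus malformed) corresponds to the end point of that name commonly reported in developmental toxicity studies.

```python
# Hypothetical study groups:
# (dose in mg/kg/day, implants, dead or resorbed, live fetuses with malformations)
study_groups = [
    (0, 200, 10, 2),
    (10, 200, 20, 20),
    (30, 200, 80, 30),
    (100, 200, 180, 8),
]

# Number of malformed live fetuses observed in each group.
malformed_counts = [malformed for _, _, _, malformed in study_groups]

# Percent of implants that are either nonlive or malformed.
affected_percent = [
    100.0 * (dead + malformed) / implants
    for _, implants, dead, malformed in study_groups
]

def is_monotonic_increasing(values):
    """True if each value is at least as large as the one before it."""
    return all(later >= earlier for earlier, later in zip(values, values[1:]))

print(malformed_counts, is_monotonic_increasing(malformed_counts))
print(affected_percent, is_monotonic_increasing(affected_percent))
```

In this invented data set the count of malformed fetuses falls at the highest dose because few offspring survive to be examined, whereas the combined affected percentage continues to rise with dose.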

Although many studies have evaluated prenatal

developmental effects over the years, today there

is increased recognition of the importance of func-

tional alterations that can be detected in postnatal

evaluations. Although regulatory requirements for

postnatal evaluations are limited, research indicates

that, for such agents as metals, chlorinated hydro-

carbons, anticonvulsants, and opioids, postnatal

testing of laboratory animals can detect functional

effects that are relevant to the human health effects

of these agents (43). In many of these cases, func-

tional defects are the primary effects observed at

dose levels below those that cause other types of

overt toxicity. Developmental toxicity studies have

looked for functional deficits in the nervous, urinary,

cardiovascular, respiratory, immune, endocrine, re-

productive, and digestive systems (44-46).

Table 2 lists a series of end points for develop-

mental toxicity evaluation. Those measured at term

are detailed further in the EPA testing guidelines for

standard developmental toxicity studies (4,47,48).

The postnatal end points measured are somewhat

dependent on the organ system under study. For

neurobehavioral evaluation, the EPA has published

Developmental Neurotoxicity Testing Guidelines (4)

that outline a protocol and general testing approaches

for certain categories of function. For reproductive

system function, an evaluation can use

the end points measured in the two-generation study

(47,48). For other organ systems, there are no standard

testing protocols, and the end points measured

depend on the organ system under study. When evaluators

encounter such data on a chemical that is

under review, however, they should consider and

evaluate the information.

Table 1. End points used to assess maternal toxicity

Mortality
Mating index = (no. with seminal plugs or sperm / no. mated) × 100
Fertility index = (no. with implants / no. of matings) × 100
Gestation length (useful when animals are allowed to deliver pups)
Body weight
  • Day 0
  • During gestation
  • Day of necropsy
Body weight change
  • Throughout gestation
  • During treatment (including increments of time within treatment period)
  • Posttreatment to sacrifice
  • Corrected maternal (body weight change throughout gestation minus gravid uterine weight or litter weight at sacrifice)
Organ weights (in cases of suspected target organ toxicity and especially when supported by adverse histopathology findings)
  • Absolute
  • Relative to body weight
  • Relative to brain weight
Food and water consumption (where relevant)
Clinical evaluations
  • Types, incidence, degree, and duration of clinical signs
  • Enzyme markers
  • Clinical chemistries
Gross necropsy and histopathology

Source: U.S. EPA Guidelines for Developmental Toxicity Risk Assessment (4).

Table 2. End points used to determine developmental toxicity

End points typically measured at terminal phase of pregnancy
  Implantation sites
  Corpora lutea
  Preimplantation loss
  Resorptions and fetal deaths
  Live offspring with malformations and variations
  Affected (nonlive and malformed)
  Fetal weight

End points that can be measured postnatally
  Stillbirths
  Offspring viability (birth, within the first week, weaning, etc.)
  Offspring growth (birth, postnatally)
  Physical landmarks of development (e.g., vaginal opening, balano-preputial separation)
  Neurobehavioral development and functionᵃ
    Reflex development
    Locomotor development
    Motor activity
    Sensory function
    Social/reproductive behavior
    Cognitive function
    Neuropathology and brain weights
  Reproductive system development and functionᵃ
    Ovarian cyclicity
    Sperm measures (e.g., morphology, motility, number)
    Fertility
    Pregnancy outcome
  Other organ system function (e.g., renal, cardiovascular)ᵃ

Adapted from U.S. EPA Guidelines for Developmental Toxicity Risk Assessment (4).
ᵃ Actual end points measured depend on the function or organ system being studied.

Short-term tests. (a) In vivo mammalian tests:

the most widely used in vivo short-term test is the

test developed by Chernoff and Kavlock (49). The

approach is based on the hypothesis that a prenatal

injury that results in altered development will be

manifested postnatally as reduced viability and/or

impaired growth. As originally proposed, the proto-

col consisted of administering the test substance to

mice during major organogenesis at a single dose

level that would elicit some degree of maternal toxic-

ity. The pups are counted and weighed shortly after

birth, and again after 3 to 4 d. End points considered

in the evaluation include general maternal toxicity

(including survival and weight gain), litter size, pup

viability and weight, and gross malformations in the

offspring.

Other in vivo mammalian testing protocols in-

corporate more extensive dosage considerations and

evaluation of end points of both reproductive and

developmental toxicity (50,51).

(b) In vitro developmental toxicity screening

tests: any procedure that uses a test subject other

than a pregnant mammal falls under the general

heading of an “in vitro developmental toxicity

screen.” Examples include isolated whole mamma-

lian embryos in culture, tissue or organ culture, cell

culture, and developing nonmammalian organisms.

These procedures have long been used to assess

events associated with normal and abnormal devel-

opment, but only recently have they been consid-

ered as potential screening tests for developmental

toxicity (52-54). Many of these tests are now being

evaluated for their ability to predict the develop-

mental toxicity of various agents in intact mammals.

Validation requires certain considerations in study

design, including defined end points for toxicity, an

understanding of the procedure’s ability to respond

to chemicals that become toxic only after they are

metabolized, and the accuracy of the test’s response

to chemicals that are, and are not, developmental

toxicants (53,55-57).
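The accuracy of a screen's response to known toxicants and nontoxicants is commonly summarized as sensitivity, specificity, and overall concordance. The sketch below assumes a reference list of chemicals whose in vivo classification is already known; the function name and the results are hypothetical and are not taken from any validation exercise cited here.

```python
def validation_metrics(results):
    """Summarize screen performance against a reference classification.

    results: list of (known_toxicant, screen_positive) boolean pairs,
             one pair per reference chemical.
    """
    true_pos = sum(1 for known, positive in results if known and positive)
    false_neg = sum(1 for known, positive in results if known and not positive)
    true_neg = sum(1 for known, positive in results if not known and not positive)
    false_pos = sum(1 for known, positive in results if not known and positive)
    return {
        # Fraction of known developmental toxicants the screen detects.
        "sensitivity": true_pos / (true_pos + false_neg),
        # Fraction of known nontoxicants the screen correctly clears.
        "specificity": true_neg / (true_neg + false_pos),
        # Fraction of all reference chemicals classified correctly.
        "concordance": (true_pos + true_neg) / len(results),
    }

# Hypothetical validation set: 10 known toxicants, 10 known nontoxicants.
reference_results = (
    [(True, True)] * 8 + [(True, False)] * 2
    + [(False, False)] * 9 + [(False, True)] * 1
)
metrics = validation_metrics(reference_results)
print(metrics)
```

Reporting sensitivity and specificity separately matters because a screen that misses chemicals requiring metabolic activation would show reduced sensitivity even when its overall concordance appears acceptable.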

Although in vitro test systems can provide sig-

nificant information, by themselves, they are insuf-

ficient for risk assessment (4). In part, this is because

the ability to apply the data to effects in whole ani-

mals is limited. But it is also because few of the

assays have been appropriately validated, a fact

noted in several reviews of available in vitro systems

(56-58) and during the National Toxicology Pro-

gram Workshop on In Vitro Teratology (59). In vitro

test data can, however, be very useful in describing

the relative toxicity (potency) of members of chemi-

cal families. Because closely related chemicals are

likely to act through a common mechanism, a single

in vitro screen that is sensitive to this mechanism

may predict the relative potencies of all members

of the family. For example, an in vitro mouse limb

bud cell screen has been successfully used to rank

the relative teratogenic potential of a large series of

synthetic retinoids (60).

III.1.3.2 Interpretation

The minimum sufficient evidence that an agent

causes developmental toxicity in animals is the dem-

onstration of an adverse developmental effect in a

single, well-conducted study using an appropriate

species. A judgment that an agent does not pose a

potential hazard requires a minimum of two well-

conducted standard studies showing no effect, as

well as studies showing no other types of develop-

mental toxicity. These studies must involve at least

two species, evaluate a variety of potential pre- and

postnatal manifestations of developmental toxicity,

and find no developmental effects at doses that were

minimally toxic to adults.

Examples of insufficient evidence for develop-

mental toxicity in animals are: studies that generated

biologic data that are not statistically or biologically

significant; improperly conducted studies (studies

with too few animals per dose group, with a less-

than-standard exposure period, with inappropriate

dose selection or exposure information, or with

other uncontrolled factors); data from a single spe-

cies with no reported adverse developmental effects;

short-term tests; nontraditional in vivo studies; or

databases limited to information on structure-activ-

ity relationships, pharmacokinetics, or metabolic

precursors.
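The sufficiency criteria above can be written as a simple decision rule. The sketch below is one possible reading, not a formal algorithm of the Evaluative Process: the field names are hypothetical, the flag `well_conducted` is assumed to cover both study conduct and the use of an appropriate species, and short-term tests or nontraditional studies are assumed to have been excluded before the rule is applied.

```python
from dataclasses import dataclass

@dataclass
class Study:
    species: str
    well_conducted: bool                 # assumed to include "appropriate species"
    adverse_developmental_effect: bool
    prenatal_endpoints: bool = True
    postnatal_endpoints: bool = False
    minimal_adult_toxicity_at_high_dose: bool = True

def classify_animal_evidence(studies):
    """Apply the minimum-evidence criteria to a list of standard studies."""
    sound = [s for s in studies if s.well_conducted]
    # One well-conducted positive study is sufficient evidence of toxicity.
    if any(s.adverse_developmental_effect for s in sound):
        return "sufficient evidence of developmental toxicity"
    # A negative judgment needs at least two studies in two species that
    # reached minimally toxic adult doses and covered pre- and postnatal effects.
    negatives = [s for s in sound if s.minimal_adult_toxicity_at_high_dose]
    if (len(negatives) >= 2
            and len({s.species for s in negatives}) >= 2
            and any(s.prenatal_endpoints for s in negatives)
            and any(s.postnatal_endpoints for s in negatives)):
        return "sufficient evidence of no potential hazard"
    return "insufficient evidence"

positive_case = classify_animal_evidence([Study("rat", True, True)])
negative_case = classify_animal_evidence([
    Study("rat", True, False, postnatal_endpoints=True),
    Study("rabbit", True, False),
])
single_negative = classify_animal_evidence([Study("rat", True, False)])
print(positive_case, negative_case, single_negative, sep="\n")
```

The asymmetry in the rule is deliberate: a single sound positive study is enough to establish toxicity, whereas a negative conclusion requires a broader database.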

III.2. MALE REPRODUCTIVE TOXICITY

III.2.1 Manifestations

Expressions of male reproductive toxicity may

involve alterations in the male reproductive organs

or in the related endocrine systems. Such alterations

may include changes in sexual behavior (mating be-

havior, libido, erection, intromission, ejaculation);

onset of puberty (delayed physical and behavioral

development); fertility (achieving conception within

a defined period of time); pregnancy outcome (pro-

duction of normal quality and number of offspring);

reproductive organ structure and morphology; re-

productive endocrine parameters (including peptide

and steroid hormone control); or other functions that

compromise the integrity of the male reproductive

system.

Male reproductive toxicity may be evaluated in

animals and humans. The sections below identify

test systems for detecting adverse effects on male

reproduction, describe how male reproductive toxi-

cants may affect the measured end points, and out-

line the characteristics of a database that would be

sufficient for estimating human risk. A strategic ap-

proach to the study of male reproductive toxicology,

including descriptions of laboratory techniques and

their interpretation, has recently been published

(61).

III.2.2 Human Data

III.2.2.1 Measures of potential adverse effects

The heterogeneity in human populations, cou-

pled with the presence of innumerable confounding

variables, requires that studies of reproductive tox-

icity in humans use sound epidemiologic methods.

Below are considerations that are particularly rele-

vant to male reproductive toxicology in humans.

Endocrine parameters. If the results of single

measurements of hormones in blood are well outside

the normal range, they may indicate an adverse ef-

fect. In most instances, the pulsatile nature of hor-

mone secretion requires serial sampling at short time

intervals to characterize abnormalities of gonadotro-

pin or sex hormone secretion. The response of hor-

mones to their releasing factors may also be a useful

measure of endocrine competence.

Sexual behavior and interest. Studies that eval-

uate sexual interest typically use questionnaires.

The information collected might include data on

such factors as the number and types of sexual expe-

riences in a given period of time.

Semen evaluations. Semen studies routinely

evaluate sperm count, morphology, and motility,

because abnormalities in these parameters are asso-

ciated with reduced fertility in men. A consistent

length of abstinence before sampling, which can af-

fect the results of a semen evaluation, is difficult to

ensure in clinical evaluations, however. And, al-

though normal morphology and sperm count are

highly variable in the human population, significant

differences from normal should be considered evi-

dence of toxicity. Additional tests on ejaculated se-

men that supplement the evaluation of male repro-

ductive toxicity include the sperm penetration assay

using zona-free hamster oocytes, measures of acro-

some reaction, and the sperm chromatin structure

assay.

Biochemical markers. Some male reproductive

organs produce biochemical markers that may corre-

late with organ function. The seminal vesicles, for

example, secrete fructose into the seminal fluid. Cur-

rently, however, we can regard alterations in bio-

chemical markers as supplemental information only,

because they do not necessarily signify male repro-

ductive toxicity.

III.2.2.2 Interpretation

In characterizing reproductive toxicity, it is im-

portant to consider any evidence of change in any

of the parameters discussed above. In doing so, eval-

uators should be alert to the possibility of bias and

confounding in any human study of reproductive

parameters. Furthermore, case reports that describe

alterations in parameters, such as decreased sperm

counts, can only signal the need for additional con-

trolled studies, since, by themselves, they cannot

characterize an agent as a reproductive toxicant.

III.2.3 Experimental Animal and In Vitro Studies

III.2.3.1 Potential measures for determining adverse reproductive effects

Single-generation test systems. Single-genera-

tion studies evaluate reproductive effects only on

the exposed adult animals. Most use laboratory rats

or, occasionally, other rodents. Usually each breed-

ing pair in a single-generation study produces a sin-

gle litter, which is examined for effects on growth

and development and then discarded.

These studies evaluate multiple end points, in-

cluding fertility, altered litter parameters, and fetal

parameters (pup weight, structural or functional

deficits, or decreased weight gain or survival). Sex-

ual behavior is often evaluated by daily checks for

evidence of mating. Single-generation studies rou-

tinely generate certain indices of reproductive com-

petence (Table 3), and should also assess testes and

accessory sex organs in the adults for changes in

weight and morphology. At the end of the test, the

rodents are often evaluated for sperm count, mor-

phology, and motility. Single-generation tests may

also be useful for studying endocrine and other target

organ effects. The quality of these latter assessments can vary

substantially, as discussed below under Subchronic

toxicity tests.
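Several of the indices of reproductive competence listed in Table 3 are ratios of simple counts or litter means, and can be computed as sketched below. The pairing of index names with formulas follows our reading of Table 3; the dictionary keys and the example counts are hypothetical and serve only to make the arithmetic concrete.

```python
def percent(numerator, denominator):
    """Ratio expressed as a percentage, guarding against empty groups."""
    if denominator == 0:
        raise ValueError("denominator must be nonzero")
    return 100.0 * numerator / denominator

def reproductive_indices(c):
    """Compute selected Table 3 indices from a dict of study summary values."""
    return {
        "female_fertility_index": percent(
            c["females_presumed_pregnant"], c["females_cohabited"]),
        "female_fecundity_index": percent(
            c["females_confirmed_pregnant"], c["females_with_plug_or_sperm"]),
        "parturition_index": percent(
            c["parturitions"], c["females_confirmed_pregnant"]),
        "gestation_index": percent(
            c["females_with_pups_born_alive"], c["females_confirmed_pregnant"]),
        # The litter-based indices are ratios of mean pups per litter.
        "viability_index": c["mean_pups_alive_day4"] / c["mean_pups_born_alive"],
        "lactation_index": c["mean_pups_alive_day21"] / c["mean_pups_alive_day4"],
    }

# Hypothetical summary values for one dose group.
group_summary = {
    "females_cohabited": 25,
    "females_presumed_pregnant": 23,
    "females_with_plug_or_sperm": 24,
    "females_confirmed_pregnant": 22,
    "parturitions": 22,
    "females_with_pups_born_alive": 21,
    "mean_pups_born_alive": 12.0,
    "mean_pups_alive_day4": 11.4,
    "mean_pups_alive_day21": 10.8,
}
indices = reproductive_indices(group_summary)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```

Computing the indices from a single record of counts makes it easier to compare dose groups against the concurrent control on a common basis.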

Although single-generation studies may provide

evidence of effects on the male and female test sub-

jects as a couple, because most of these studies treat

both the male and female, it may not be possible to

determine which sex is affected by the agent. It may,

thus, be necessary to review or develop further data

to determine whether one or both sexes are affected

by the agent.

Table 3. Indices of fertility and reproductive function evaluated

Female mating index = (no. estrous cycles with copulation / no. cycles required for conception) × 100
Female fertility index = (no. females presumed pregnant / no. females cohabited) × 100
Female fecundity index = (no. confirmed pregnant / no. with copulatory plug or sperm) × 100
Male mating index = (no. males with pregnant females / no. males) × 100
Parturition index = (no. parturitions / no. females confirmed pregnant) × 100
Gestation index = (no. females with pups born alive / no. confirmed pregnant) × 100
Live litter size = (no. live offspring / no. females with copulatory plug or sperm) × 100
Live birth index = mean pups per litter alive / mean pups per litter
Viability index = mean pups per litter alive day 4 / mean pups per litter born alive
Lactation index = mean pups per litter alive day 21 / mean pups per litter alive day 4
Weaning index (if litter size artificially reduced) = mean pups per litter alive day 21 / mean pups per litter kept at day 4
Preweaning index = (mean pups born per litter − mean pups per litter weaned) / mean pups born per litter

Typically, single-generation studies expose

males 5 to 8 weeks old to the test agent for 8 to 10

weeks before mating. Treatment must take place

either through all relevant periods of spermatogene-

sis (the time for spermatogonia to mature to sperma-

tozoa capable of fertilization in control males) or

must allow sufficient time to elapse between the

initial treatment and one of the mating trials to permit

maturation of early spermatid stages. If a study does

not meet these conditions, it may test for the wrong

stage of spermatogenesis and, as a result, may not

detect treatment-related effects.

In screening studies, exposures that take place

less than 8 to 10 weeks (for example, 1 week) before

mating may be acceptable if the study includes good-

quality histopathology of the testis at the end of the

study and if chemical exposure continued through-

out the mating period. Early-stage spermatogenesis

lesions in the testis may manifest as seminiferous

tubule changes. Identification of these changes

allows both the collection of fertility assessment data

in a much shorter time and assessment of the testis

if the fertility study does not detect late-stage effects.

Screening studies usually treat the animals daily

through feed or drinking water, and typically use

three dose levels and appropriate controls, with 20

or more fertile males per dose level. The high-dose

level should elicit minimal nonreproductive toxicity;

the low-dose level should be selected to establish a

NOAEL for reproductive toxicity.

Multigeneration test systems. Multigeneration

studies extend the single-generation studies by con-

tinuing exposures and observations in the offspring

of the first generation. Thus, in addition to treating

the adult animals, these studies expose the offspring

to the test agent during their prenatal, postnatal,

and pubertal development. Once the offspring are

mature, their treatment continues as in the single-

generation study.

Continuous-breeding test systems. The contin-

uous-breeding protocol was designed to increase the

sensitivity of fertility measurements, which are

generally acknowledged to be relatively insensitive

indicators of male reproductive toxicity (40). Con-

tinuous-breeding assays differ from single- and

multigeneration protocols by their ability to discrim-

inate subfertility in individual animals: each male

has the opportunity to mate numerous times during

the test. Because the breeding pairs are kept in the

same cage throughout the study, the continuous-

breeding design also allows measurements of the

time between pregnancies.

Beginning 1 week after the initiation of treat-

ment, the male and female stay together throughout

the study, and are separated only after enough time

has passed for the control pairs to produce four or

five litters. This protocol differs from that of multigeneration studies, in which the investigator controls the timing of the second matings. In a continuous-breeding study, effects of exposure on the F1 generation are typically assessed in the last litters delivered after the males and females in the parental generation are separated. A continuous-breeding study may also assess the F2 generation by raising the pups from the first litter and then resuming cohabitation of the parental F0 males and females afterward, or by saving the last litters, breeding them, and examining the offspring (62). The use of the F1 generation for breeding makes the continuous-breeding

study results essentially equivalent to those of other

multigeneration designs; in addition, it provides a

more sensitive indicator of effects on fertility.

Male dominant lethal test. The dominant lethal

test is intended to detect mutagenic effects that hap-

pen during spermatogenesis and that are lethal to

the embryo or fetus (for a review see Green et al.

[63]). While aimed specifically at detecting mutage-

nicity, data from dominant lethal assays can be rele-

vant to other mechanisms of male reproductive tox-

icity. In conventional studies, sexually mature male

rats or mice are treated with the test agent for about

a week. To assess the sensitivity of the various

stages of spermatogenesis to the agent, these assays

place each male with one or two untreated females

each week for a period of 8 to 10 weeks. The females

are generally killed before parturition so that the

implantation sites can be counted. The end points

are pre- and postimplantation loss.

The male dominant lethal test contains many of

the same design parameters and end points as the

single-generation study. Some of the same fertility

indices that are calculated in multigeneration and

continuous-breeding studies can also be calculated

in male dominant lethal studies (e.g., mating index,

live litter size). Because there is usually no postnatal

assessment, it is not possible to calculate some of

these indices.

Subchronic toxicity test. Subchronic toxicity

studies are conducted to assess general toxicity in

rodent and nonrodent species. In rodents, exposure

generally begins at 6 to 8 weeks of age and continues

for 90 d. Clinical signs of toxicity, growth, clinical

chemistry, and target-organ weights and morphol-

ogy are routinely assessed. End points relevant to

male reproductive toxicity are generally limited to

testicular weight and morphology of formalin-fixed,

paraffin-embedded testes and accessory sex organs.

Epididymal weight also may be measured.

It is possible to substantially improve the male

reproductive toxicity end points in these tests. For

example, the use of Bouin’s fixative and PAS stain-

ing will improve testicular histopathology in paraf-

fin-embedded specimens. Better still are plastic

embedding techniques that preserve testicular mor-

phology more effectively (64).

Subchronic study designs can also incorporate

other end points of testicular or epididymal toxicity

as multigeneration or continuous-breeding studies

do. For example, it is relatively easy to weigh acces-

sory sex organs, such as the seminal vesicle or pros-

tate, at necropsy.

Measures of sperm production, such as epididymal sperm count or spermatid count from testicular

homogenates, are collected at the end of rodent stud-

ies. Rabbit sperm can be collected in an artificial

vagina at various times throughout a treatment pe-

riod and at the end of a study. Sperm head and tail

morphology and sperm motility can be evaluated

from the same samples.

Endocrine function can also be evaluated by

measuring plasma or tissue hormone levels. In gen-

eral, the steroid hormones produced by the testis

and the pituitary gonadotropins are measured most

frequently.

Biochemical markers of exposure or effects can

be assessed in fluids or tissues (65). Some reports

suggest that markers in accessory sex organ secre-

tions (e.g., fructose, prostatein) may indicate toxic-

ity. To date, however, no study has validated the

ability to monitor and interpret most of these

markers.

Chronic toxicity test. One- to 2-year continuous

exposures in rodents provide the opportunity to

evaluate long-term effects of toxicant exposure on

male reproductive function. Interim sacrifices are

critical for obtaining useful data that can be com-

pared to the findings at the end of the exposure

period. Chronic tests can use the same end points

as subchronic tests.

III.2.3.2 Interpretation

Fertility indices. Well-conducted multigenera-

tion and continuous-breeding studies can provide

data that demonstrate changes in the key parameters

of male fertility and reproduction. Statistically sig-

nificant, dose-related changes in the indices listed in

Table 3 provide sufficient evidence of reproductive

toxicity, but, by themselves, do not identify the af-

fected sex. Because most multigeneration or contin-

uous-breeding studies place test males with females

treated at the same dose level, they are unable to

identify the affected sex. Although such studies may

be the most typical way to evaluate the reproductive

toxicity of an agent, most provide insufficient evi-

dence that the agent is a male reproductive toxicant

in animals. There is, therefore, a need for additional

data that, in fact, may come from the same study.

For example, evidence of gonadal toxicity measured

by testicular weight or altered morphology can pro-

vide sufficient evidence that an agent is a male repro-

ductive toxicant or may add weight to evidence that

it is not a male reproductive toxicant. Another way

to provide sufficient evidence of male reproductive

toxicity would be to mate the treated animals of one sex to controls of the other sex.
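As an illustration only, the ratio-type indices tabulated in Table 3 can be computed with a few lines of Python. The function names and all of the numbers below are hypothetical and are not taken from any study; the formulas follow numerator and denominator pairs given in the table (a count ratio multiplied by 100, a ratio of mean pups per litter at two time points, and the preweaning loss).

```python
def percent_index(numerator: int, denominator: int) -> float:
    """Count-based index, e.g. no. parturitions / no. females confirmed pregnant x 100."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


def survival_ratio(mean_alive_later: float, mean_alive_earlier: float) -> float:
    """Ratio of mean pups per litter alive at a later time point to an earlier one,
    e.g. mean pups per litter alive day 4 / mean pups per litter born alive."""
    if mean_alive_earlier <= 0:
        raise ValueError("earlier mean must be positive")
    return mean_alive_later / mean_alive_earlier


def preweaning_loss(mean_born: float, mean_weaned: float) -> float:
    """(mean pups born per litter - mean pups per litter weaned) / mean pups born per litter."""
    if mean_born <= 0:
        raise ValueError("mean pups born per litter must be positive")
    return (mean_born - mean_weaned) / mean_born


# Hypothetical dose group: 20 females confirmed pregnant, 18 parturitions,
# 12.0 pups per litter born alive, 11.4 alive on day 4, 9.0 weaned.
parturition_pct = percent_index(18, 20)
day4_survival = survival_ratio(11.4, 12.0)
loss_fraction = preweaning_loss(12.0, 9.0)
print(parturition_pct, round(day4_survival, 3), loss_fraction)
```

Such values describe a single dose group; as the text notes, they do not by themselves identify which sex is affected.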

Organ weights. A statistically significant, dose-

related decrease in absolute or relative testicular

weight is generally sufficient evidence that an agent

is a reproductive toxicant in animals. Most testicular

toxicants cause decreases in testicular weight, but

if they cause edema, the testicular weight increases.

Although decreases in testicular weight may be con-

sidered sufficient evidence of toxicity by them-

selves, increases need to be explained by other end

points, such as morphology. Any changes must also

be considered in light of any systemic toxicity elic-

ited by the test chemical. Severe systemic toxicity

brings into question not only the organ weight data,

but also the relevance of any other reproductive

effects.
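The distinction between absolute and relative organ weight can be made concrete with a short sketch. It assumes that relative weight is expressed as organ weight as a percentage of body weight; the function names, dose labels, and weights are hypothetical and serve only to show the calculation.

```python
from statistics import mean


def relative_weight_pct(organ_g: float, body_g: float) -> float:
    """Organ weight as a percentage of body weight (an assumed convention)."""
    if body_g <= 0:
        raise ValueError("body weight must be positive")
    return 100.0 * organ_g / body_g


def summarize_by_dose(records: dict) -> dict:
    """records maps a dose level to a list of (organ_g, body_g) pairs, one per animal."""
    summary = {}
    for dose, animals in records.items():
        summary[dose] = {
            "mean_absolute_g": mean(organ for organ, _ in animals),
            "mean_relative_pct": mean(
                relative_weight_pct(organ, body) for organ, body in animals
            ),
        }
    return summary


# Hypothetical testicular and body weights (grams) for a control and a treated group.
records = {
    0: [(3.2, 400.0), (3.0, 400.0)],
    100: [(2.4, 300.0), (2.4, 320.0)],
}
for dose, stats in summarize_by_dose(records).items():
    print(dose, round(stats["mean_absolute_g"], 2), round(stats["mean_relative_pct"], 3))
```

In this invented example the absolute weight falls in the treated group while the relative weight changes little because body weight is also reduced, which is the kind of pattern that calls for considering systemic toxicity.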

Weight changes in male accessory sex organs

can indicate significant functional effects. Both the

seminal vesicles and prostate, for example, contain

a large proportion of luminal fluid that may decrease

rapidly when androgenic hormone levels decline.

Epididymal weight is largely affected by the number

of sperm present in the epididymis. Statistically sig-

nificant dose-related decreases in the weight of the

epididymis would be sufficient evidence of male ef-

fects. Decreases in the weight of the seminal vesicles

or ventral prostate may be sufficient evidence of

male reproductive toxicity, but are more useful if

supplemented by data on endocrine effects. Changes

in pituitary weights alone would typically be insuffi-

cient evidence of male reproductive toxicity, both

because pituitary weights are inaccurate and be-

cause changes in pituitary function are best mea-

sured by other parameters, such as hormone levels.

Furthermore, only a small portion of the gland is

involved with reproductive function.

Organ morphology. Changes in testicular mor-

phology are best observed when the tissues are pre-

served by optimal methods. The best evaluations

can be done on testes fixed by perfusion techniques

and embedded in a plastic, such as glycol methacry-

late. More conventional, but still quite acceptable,

morphologic investigations can be performed on tes-

tes fixed by immersion in Bouin’s fixative, embed-

ded in paraffin, and stained with PAS. Formalin fix-

ation and paraffin embedding of testes is inferior

and generally inadequate for the study of testicular

pathology. These inferior techniques will detect only

the most severe effects. Testicular toxicity may be

observed in formalin-fixed tissues, however, even

when there are no significant changes in testicular

weight. In formalin-fixed and paraffin-embedded tis-

sues, only the most severe changes in the seminifer-

ous epithelium of the testis could be considered suf-

ficient evidence of male effects. The sensitivity of

these evaluations can be substantially improved by

more careful fixation, embedding, and observation

techniques. Low-quality morphologic techniques,

such as formalin fixation and paraffin embedding,

are never sufficient to show that a compound is not

a testicular toxicant.

Morphologic changes in accessory sex organs

are less common, but clear treatment-related effects

may also provide sufficient evidence of male effects.

Sexual behavior. Fertility studies do not incorporate measures of sexual behavior, but they indirectly measure end points that may be altered by

effects on sexual behavior. These measurements in-

clude collecting vaginal smears to check for the pres-

ence of sperm or checking vaginal plugs as evidence

of mating. An azoospermic male, however, may have

normal sexual behavior but will not have a “sperm-

positive” mating. Thus, even though a decrease in

sperm-positive matings may be sufficient evidence

of reproductive toxicity, it would not be sufficient

evidence of abnormal sexual behavior. If a study

does measure sexual behavior, mounting frequency,

intromission, ejaculation number, and latency can

be measured. More detailed studies of sexual behav-

ior (66) would be helpful, but are rarely done.

Sperm evaluation. In mice and rats, sperm

motility and count are relatively sensitive and reli-

able indicators of male reproductive toxicity (12).

Statistically significant, dose-related decreases in

these parameters would constitute sufficient evi-

dence of male reproductive toxicity, even if fertility

is not adversely affected. Sperm morphology

changes, if statistically significant and dose related,

would be sufficient evidence of reproductive toxic-

ity; experience to date has shown, however, that

sperm morphology changes in rodents are fairly in-

sensitive indicators of reproductive toxicity (12)

even though they may be good indicators of repro-

ductive dysfunction in humans.

Sperm evaluations in rats and mice are nearly

always limited to the terminal sacrifice of the test

animals because it is extremely difficult to collect

semen samples from such small animals. Because

investigators can collect whole semen samples from

rabbits and domestic animals, however, it is possible

to assess and follow progressive changes in semen

in these animals over a period of time. The potential

advantages to conducting sperm assessments in rab-

bits include the ability to assess the same parameters

(morphology, motility, and sperm count) at succes-

sive points in time. Studies have shown that large

decreases in semen parameters must occur before

there are noticeable changes in fertility. Statistically

significant, dose-related decreases in semen quality,

however, may constitute sufficient evidence that a

compound causes reproductive effects in the test

species.

Endocrine evaluations. If adequately designed

studies detect changes in levels of gonadal steroid or

gonadotropic pituitary hormones, these endocrine

parameters do provide sufficient evidence of repro-

ductive toxicity. Typically, adequate studies that

show toxicity will have multiple samples obtained

in a well-defined context that includes sex, age, re-

productive state, day of cycle, and so on. Endocrine

changes that indicate toxicity will include both multi-

ple values outside the normal physiologic ranges

and physiologically plausible changes in direction in

hormone levels.

Biochemical markers of reproductive exposure

and effect. Various markers of exposure and effect

have been investigated in male reproductive toxicol-

ogy, including prostatein, androgens, and prolactin

(65). Sertoli cell enzymes or biochemical secretory

products, measured in vitro and in vivo as markers

of cell function, are other examples of useful end

points for studying target organ or cell responses.

Currently, however, they cannot be considered evi-

dence of male reproductive toxicity.

In vitro methods. There are methods for cultur-

ing various cells from the male reproductive system,

such as pituitary cells, Sertoli cells, and germ

cell-Sertoli cell cocultures. But, although these in-

vestigations help elucidate mechanisms of action,

by themselves they cannot generate sufficient evi-

dence of reproductive toxicity.

III.3 FEMALE REPRODUCTIVE TOXICITY

III.3.1 Manifestations

Female reproductive toxicity includes adverse

effects on reproductive organs and related endocrine

systems. End points that reflect such toxicity include

sexual behavior (receptivity to the male at appro-

priate times in the cycle), age at onset of puberty,

fertility (the ability to produce offspring in normal

number), gestation length, parturition, lactation, and

age at reproductive senescence.

III.3.2 Human Data

III.3.2.1 Measures of Potential Adverse Effects

The heterogeneity of human populations, cou-

pled with the presence of numerous confounding

variables, requires that studies in humans use sound

epidemiologic methods. A discussion of considera-

tions that are of particular relevance to female repro-

ductive toxicology follows.

Standardized fertility ratio. This measurement

compares the number of pregnancies in an exposed

population with the number expected on the basis

of statistics in a reference population. Limitations

include the inability to diagnose early pregnancies

accurately in the population. Because age, marital

status, contraceptive practices, parity, and other

factors can influence the number of pregnancies that

take place in a population, the studies under review

must account for these potentially confounding vari-

ables (67,68).
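A minimal sketch of the observed-to-expected comparison follows. It assumes, purely for illustration, that the expected number of pregnancies is obtained by applying stratum-specific reference rates (here, by age group) to the woman-years of observation in the exposed population; the strata, rates, counts, and function names are all hypothetical.

```python
def expected_pregnancies(woman_years: dict, reference_rates: dict) -> float:
    """Sum over strata of exposed woman-years times the reference pregnancy rate."""
    return sum(woman_years[s] * reference_rates[s] for s in woman_years)


def standardized_fertility_ratio(
    observed: int, woman_years: dict, reference_rates: dict
) -> float:
    """Observed pregnancies divided by the number expected from the reference population."""
    expected = expected_pregnancies(woman_years, reference_rates)
    if expected <= 0:
        raise ValueError("expected number of pregnancies must be positive")
    return observed / expected


# Hypothetical exposed cohort stratified by age group.
woman_years = {"20-29": 100.0, "30-39": 200.0}
reference_rates = {"20-29": 0.15, "30-39": 0.10}  # pregnancies per woman-year
ratio = standardized_fertility_ratio(28, woman_years, reference_rates)
print(round(ratio, 2))  # a ratio below 1 means fewer pregnancies than expected
```

Additional strata (for example, parity or contraceptive practice) would enter the calculation in the same way, provided that reference rates are available for each stratum.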

Standardized birth ratio. Because it is easier to

count births than pregnancies, this measure can be

used in place of the standardized fertility ratio. Dif-

ferences in the incidence of spontaneous and in-

duced pregnancy loss between the populations being

compared will introduce inaccuracies. Stratification

for important variables is required.

Infertility rate. Because infertility is usually defined as absence of pregnancy after 12 months of

regular, unprotected coitus, it is difficult to measure

in a population. A case-control design will permit

the evaluation of exposure information in couples

who seek medical care for infertility and in a suitable

control group. This approach will not be practical if

the exposures of interest are unusual in both groups.

Time to pregnancy. The use of 12 months of

unprotected coitus as a requirement for the diagnosis

of infertility causes couples to be regarded as equally

fertile whether they conceive on the first day they

attempt pregnancy or on the 364th day. For couples

who are planning a pregnancy, a delay in conception

is important; having these couples measure the num-

ber of cycles necessary before conception provides

a measure of the delay in conception. The disadvan-

tage of this method is that it excludes couples with

unplanned pregnancies. Because these couples may

represent the most fertile individuals in a population,

their exclusion may be important (69).
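The contrast between a dichotomous infertility classification and a time-to-pregnancy measure can be sketched as follows. The example assumes that the 12-month criterion corresponds to 12 cycles, which is a simplification, and the two cohorts are invented; the names are hypothetical.

```python
from statistics import median

CUTOFF_CYCLES = 12  # assumption: the 12-month criterion treated as 12 cycles


def classified_infertile(cycles_to_conception) -> bool:
    """True if the couple did not conceive (None) or needed more than the cutoff."""
    return cycles_to_conception is None or cycles_to_conception > CUTOFF_CYCLES


def infertility_rate(cohort: list) -> float:
    """Fraction of couples classified as infertile under the cutoff rule."""
    return sum(classified_infertile(c) for c in cohort) / len(cohort)


def median_cycles(cohort: list) -> float:
    """Median number of cycles among couples who conceived.
    Couples who never conceived are ignored here, which a real analysis
    would need to handle as censored observations."""
    return median(c for c in cohort if c is not None)


# Hypothetical cohorts: cycles needed before conception for each couple.
reference = [1, 2, 3, 4]
exposed = [6, 9, 11, 12]

print(infertility_rate(reference), infertility_rate(exposed))
print(median_cycles(reference), median_cycles(exposed))
```

Both invented cohorts have the same infertility rate under the cutoff rule, yet the number of cycles to conception differs markedly, which is the delay that the dichotomous measure cannot show.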

Age at puberty. Puberty in humans is a process

in which several stages occur in sequence. Matura-

tion of adrenal, hypothalamic, pituitary, and ovarian

function are the physiologic hallmarks of puberty.

Clinical signs include breast tissue development and

enlargement, the appearance of pubic and axillary

hair, and the onset of menstrual bleeding. The nor-

mal distributions of ages at which girls enter the

phases of puberty can be evaluated in exposed and

control populations; it is important to recognize,

however, that there are differences among human

subpopulations (such as racial groups) in the normal

distribution of these ages.

Age at menopause. To be credible, epidemio-

logic studies must use sound strategies for gathering

data on menopause. Although it is possible to com-

pare age at menopause between exposed and unex-

posed women, comparisons that rely, for example,

on the ability of elderly women to recall both their

exposure status and their age at menopause may be

subject to bias. Another strategy for collecting such

information is to interview exposed and nonexposed

women in their 40s and 50s to identify how many in

each age group have reached menopause (e.g., 70).

Menstrual cycle parameters. Although any pop-

ulation of women will show a normal distribution of

variation in the length of the menstrual cycle and

the number of days of menstrual flow (71), variability

is also common between women and within the same

woman. But, although studies may be able to docu-

ment marked alterations in menstrual cycle parame-

ters, such as amenorrhea or extreme cycle-length

alterations, validating more subtle alterations will

be difficult (72).

Incidence of early pregnancy loss. Because a

large proportion of fertilized ova are lost at or near

implantation, they do not result in clinically recog-

nized pregnancies. Thus, despite the fact that this

parameter overlaps with developmental toxicity,

women with recurrent early pregnancy loss appear

clinically to be infertile. Very sensitive tests for hu-

man chorionic gonadotropin (HCG) make it possible

to detect early pregnancies (73). The transient ap-

pearance of HCG in the urine is taken as evidence

of early pregnancy loss. Because early pregnancy

loss in the absence of a toxicant is common (74),

careful evaluation of a control population is espe-

cially important.

Incidence of ectopic pregnancy. Alteration in

genital tract function may result in abnormal implan-

tation of the conceptus outside the endometrial cav-

ity. Ectopic pregnancy is a life-threatening compli-

cation that is typically related to other factors, such

as previous pelvic infection.

Endocrine parameters. Single measurements of

hormones in blood may be helpful if the results are

well outside the normal range. For example, a fol-

licle-stimulating hormone (FSH) concentration

greater than 70 ng/mL may be taken as a persuasive

sign of ovarian failure. In most instances, the pulsa-

tile nature of hormone secretion requires investiga-

tors to take serial samples at short time intervals in

order to characterize abnormalities of gonadotropin

or sex hormone secretion. The response of hor-

mones to their releasing factors may also be a useful

measure of endocrine competence.

Sexual behavior and interest. The evaluation of

sexual interest in exposed women has been con-

ducted through questionnaire studies. Data col-

lected have included, for example, the number and

types of sexual experiences in a given period of time.

Sexual interest in women may be cycle dependent

(75), although this is controversial. Sexual interest

also appears to depend on social factors (76). This

underscores the importance of selecting appropriate

controls in studies of this type.

Breast milk. Although it is possible to determine

the composition and volume of breast milk, it is

important to control for the normal variation that

happens during each day and each feeding. The

length of time a woman has breast fed since the birth

also affects the composition and volume of the milk.

These variations are best controlled by evaluating

the composition of milk that is collected at the same

part of the feeding, at the same time of day, and at

the same length of time after a normal term birth.

III.3.2.2 Interpretation

In characterizing reproductive toxicity, it is im-

portant to consider any evidence of change in any

of the parameters discussed above. In doing so, eval-

uators should be alert to the possibility of bias and

confounding in any human study of reproductive

parameters. Furthermore, case reports that describe

alterations in parameters, such as amenorrhea or

extreme cycle-length alterations, can only signal the

need for additional controlled studies, since, by

themselves, they cannot characterize an agent as a

reproductive toxicant.

III.3.3 Experimental Animal and In Vitro Studies

A description of how to assess female reproduc-

tive toxicity, including a description of experimental

methods and their interpretation, has recently been

published (77).

III.3.3.1 Types of studies

Single-generation test systems. Except for the

fact that females are treated throughout pregnancy

and lactation, single-generation tests are generally

similar for both males and females (for a description,

see section 2.2.1.2.). The end points evaluated include those listed in Table 3. Appropriate doses of

the test agent are important: typically, the highest

dose is one that produces a minimum level of nonre-

productive toxicity, the lowest dose is clinically rele-

vant or is designed to produce a NOAEL for repro-

ductive toxicity, and the intermediate dose is

somewhere between these two.

Multigeneration test systems. The protocols for

these tests are identical to those for males.

Continuous-breeding test systems. As is the

case for male reproductive tests, continuous-breed-

ing protocols for females were designed to increase

the sensitivity of fertility measurements. These

tests, which keep a breeding pair together continu-

ously, allow the measurement of time between preg-

nancies and the discrimination of subfertility. To

define whether the reproductive effect is mediated

by male or female toxicity, or by a couple effect,

supplemental tests mate each member of the pair to

an untreated control.

Cyclicity. Abnormal findings for estrous ani-

mals include persistent estrus, prolonged diestrus,

or anestrus (78). To characterize the estrous cycle

in appropriate experimental animals, studies can use

vaginal cytology and cyclic signs in menstruating

animals. These parameters can give information on

whether cycling has been abolished or whether seg-

ments of the cycle are altered in length. Because

estrous cycle length has a normal variation, it is also

possible to evaluate changes in the distribution of

cycle lengths. The interpretation of these data is,

however, open to question (see section 3.2.2. be-

low). Vaginal cytology data can also be incorporated

into such protocols as the continuous-breeding test

(13).

Structural reproductive organ alterations. The

weight of the reproductive organs can be evaluated

at surgery or necropsy and should be assessed both

as an absolute value and relative to body weight.

Because organ weights show marked variation at

different times of the cycle, any study that proposes

to use reproductive organ weight as a measure must

control carefully for cyclic variation. Changes in

uterine weight in immature or castrate females are

a common bioassay for “estrogenicity.” Although

such changes in uterine weight are not clear signs

of toxicity, they may raise issues of concern. Histo-

pathologic evaluation of reproductive organs may

be useful; here again, cyclic and maturational varia-

tions can affect histopathologic end points. Evalua-

tion of the ovary often includes counts of follicles

or subpopulations of follicles (79,80).

Biochemical reproductive organ changes. Secretion products of the uterus can be obtained with

uterine lavage (81). Changes in uterine secretions

may be useful in characterizing alterations associ-

ated with treatment; because these changes may be

cycle dependent, however, they may be difficult to

interpret. To date, the characterization of normal

changes in uterine secretory products is still incom-

plete.

Timing of puberty or reproductive senescence.

In animals with estrous cycles, the onset of puberty

is marked by vaginal opening. Reproductive senes-

cence may manifest as persistent vaginal estrus fol-

lowed by anestrus.

Reproductive endocrine parameters. In estrous

and menstrual animals, the reproductive cycle is

characterized by the production of sex steroids from

the ovary in response to pituitary gonadotropins.

These gonadotropins are under hypothalamic con-

trol. Although it is possible to measure the relevant

hormones, evaluators must keep in mind that the

hormones are produced in a pulsatile fashion, with

cyclic variation in the amplitude and frequency of

the pulses. For this reason, single static measures

are unlikely to be informative unless a result is well

outside the normal ranges (e.g., castrate levels of

gonadotropins). Other strategies for evaluating en-

docrine parameters include serial measurements of

hormones in blood at short intervals, and response

of an endocrine measure to a stimulus. In the serial

measurement strategy, frequent sampling permits

the construction of a profile of the hormone change

over time, which may disclose the pulse pattern.

This method is difficult in animals with small blood

volumes where frequent sampling may produce its

own effects. The second method, response of an

endocrine measure to a stimulus, involves sampling

an animal at a fixed time after administration of a

releasing factor. One can, for example, measure lu-

teinizing hormone (LH) after injecting gonadotro-

pin-releasing hormone (GnRH), or measure proges-

terone after injecting chorionic gonadotropin (82).

The disadvantage of this method is the possibility

that the injection of the releasing agent may cause

an atypical physiologic situation, so that one cannot

extrapolate the effect it “unmasks” to unmanipulated animals.

Culture methods. Tissue culture methods have

been used to study ovary slices in vitro, and cell

culture methods have been used for studying granu-

losa cells and myometrial cells. In culturing ovary

slices or granulosa cells, investigators often use the

release of sex steroids into the medium as an out-

come parameter. Under certain culture conditions,

granulosa cells will luteinize, producing a range of

steroid and nonsteroid products; of these, progester-

one is measured most commonly. Some studies,

however, have measured a number of other prod-

ucts, including nonsteroidal substances (83,84).

Some cell culture studies have made use of the con-

tractile properties of myometrial cells in evaluating

the potential of agents to alter uterine activity. In

all of these test systems, the artificial nature of the

in vitro setting may limit the predictive value of the

results.

Organ perfusion. Ovaries perfused in vitro are

useful systems for studying the mechanical aspects

of ovulation. The preparations allow observations

on the effects of agents in preventing rupture of the

follicle and expulsion of the oocyte. The perfusion

system is artificial, however, and the relocation of

the ovary from peritoneal cavity to the perfusion

chamber may alter the mechanical features of the

system. For this reason, data from perfusion studies

are not, in themselves, sufficient for drawing conclu-

sions about an agent’s reproductive toxicity.

Breast milk. Although an agent’s effects on lac-

tation can generally be identified by changes in the

lactation index (Table 3), it is also possible to use

histopathologic changes in breast tissue or changes

in the volume or composition of milk as indicators

of an agent’s effects (85). In reviewing data on lacta-

tion, evaluators must keep in mind that the number

and maturity of the offspring will influence both the

volume and the composition of the milk (86). Pro-

tein, fat, and electrolyte composition vary with time

of day and even within a feeding. Therefore, studies

that evaluate milk volume and composition must

incorporate very careful controls. Finally, although

it is also possible to measure levels of many xenobi-

otics in milk, the mere presence of an agent in milk

does not automatically indicate toxicity.

III.3.3.2 Interpretation

Indices. The end points listed in Table 3 are

considered to have a direct bearing on female repro-

ductive toxicity. Therefore, any statistically signifi-

cant, dose-dependent decrement in any one of these

parameters is sufficient to characterize the agent as

a female reproductive toxicant in animals. In contin-

uous-breeding protocols, one can also consider the

additional parameter of number of litters and time

between pregnancies. A progression of toxic effects

over the course of a continuous-breeding study

strengthens the conclusions of toxicity. When a con-

tinuous-breeding study shows an adverse effect, it

is desirable that the study also mate each member

of a breeding pair to an untreated control to identify

which member is affected by the agent. If a study

has not taken this step, it cannot be said with cer-

tainty that the observed effect is the result of female

reproductive toxicity; it may be equally likely that

a male effect or a couple effect is involved.
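
The crossover-mating logic described above can be sketched as a small decision function. This is only an illustration: the function name, the boolean inputs, and the reading of a "couple effect" as one that appears in neither crossover pairing are assumptions made here, not definitions taken from the text.

```python
def attribute_effect(treated_female_x_control_male: bool,
                     treated_male_x_control_female: bool) -> str:
    """Attribute an adverse effect seen in a continuous-breeding study.

    Each argument is True when that crossover pairing (one treated
    animal mated to an untreated control) reproduces the adverse effect.
    Hypothetical helper; the labels are illustrative only.
    """
    if treated_female_x_control_male and treated_male_x_control_female:
        return "both sexes affected"
    if treated_female_x_control_male:
        return "female effect"
    if treated_male_x_control_female:
        return "male effect"
    # Assumption: the effect was seen only when both partners were treated.
    return "couple effect"


print(attribute_effect(True, False))
```

Without the crossover step, neither argument is known, which is why the text cautions against attributing the effect to the female.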

Because most standard animal reproduction

studies do not observe mating, they do not contain

evaluations of an agent’s effect on sexual behavior.

If a study does report observations of mating, the

failure of female rodents to assume a lordotic posi-

tion and to accept mounting is evidence of abnormal

sexual behavior. Additional signs include running

from or fighting with the male (87,88).

Most animal reproductive studies use chronic

dosing regimens in order to encompass all of the

biologic events that are important in reproduction.

When a study reports using a short dosing interval,

a positive result may still serve as evidence of repro-

ductive toxicity; however, it will not be possible to

interpret a negative result.

Cytology abnormalities. Persistent estrus, pro-

longed diestrus, and anestrus are abnormal findings.

Any one of these is sufficient to characterize female

reproductive toxicity. Alterations in the distribution

of estrous or menstrual cycle length alone have not

been shown to be reliable predictors of reproductive

toxicity. By themselves, these alterations would be

insufficient to identify an agent as a reproductive

toxicant.

Weight and morphology changes. A statistically

significant decrement in ovarian or uterine

weight in a study properly controlled for cyclic variation

is worthy of consideration and should signal the

need for additional studies. Similarly, an increase

in uterine weight in an acyclic or castrate animal,

or in a study that controlled for cyclic variation,

should raise concern about possible estrogenicity of

the test agent and should suggest that additional

studies are needed. Neither of these parameters, as

an isolated end point, is sufficient to characterize

an agent as a reproductive toxicant. A decrease in

the number of ovarian follicles or a change in follicle

subtype, however, is evidence of reproductive

toxicity.

Biochemical changes. Evidence of change in

reproductive tract secretions or their biochemical

constituents is of interest, and may be interpreted

as evidence of toxicity if it supports or confirms

other data that indicate toxicity. Such changes

alone, however, are insufficient to characterize an

agent as a reproductive toxicant.

Alterations in age at puberty or reproductive

senescence. A change in the age at puberty or repro-

ductive senescence is sufficient to characterize re-

productive toxicity, although it is desirable to have

supporting data that explain the mechanism of tox-

icity.

Endocrine parameters. If changes in levels of

gonadal steroid or gonadotropic pituitary hormones

are detected in adequately designed studies, these

endocrine parameters do provide sufficient evidence

of reproductive toxicity. Typically, adequate studies

that detect toxicity will have multiple samples that

are obtained in a well-defined context that includes

sex, age, reproductive state, day of cycle, and other

relevant data. Results from these studies should in-

clude multiple values outside the normal physiologic

ranges, changes in hormone levels in physiologically

plausible directions, or failure of key hormonal

events (such as LH surge, preovulatory estradiol

rise, or maintenance of luteal phase progesterone

production).

In vitro and perfusion systems. Any change ob-

served in an in vitro or organ perfusion system

should be considered supplemental. Isolated find-

ings of studies that use these systems are insufficient

to characterize an agent as a reproductive toxicant.

Breast milk. Changes in breast histopathology

or in breast milk amount or composition should sig-

nal the need for additional studies, and in particular,

the need for studies that evaluate the effect of such

changes on the nourishment and health of the off-

spring. The mere presence of xenobiotics in milk is

not, by itself, evidence of toxicity; however, if a

test agent is concentrated in milk, this should prompt

recognition of the need for studies on the nursling.
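
The interpretation rules in this section can be summarized as a lookup table. The sketch below is a paraphrase under stated assumptions: the key names and the three categories are hypothetical labels chosen here, and the table does not encode the qualifying conditions the text attaches to each end point (statistical significance, dose dependence, control for cyclic variation, adequate study design).

```python
from enum import Enum


class Weight(Enum):
    SUFFICIENT = "sufficient by itself"
    INSUFFICIENT_ALONE = "supportive only; insufficient by itself"
    FOLLOW_UP = "insufficient by itself; signals need for additional studies"


# Hypothetical keys; the categories paraphrase the text of this section.
END_POINT_WEIGHT = {
    "reproductive_index_decrement": Weight.SUFFICIENT,
    "abnormal_vaginal_cytology": Weight.SUFFICIENT,  # persistent estrus, prolonged diestrus, anestrus
    "cycle_length_alteration": Weight.INSUFFICIENT_ALONE,
    "ovarian_or_uterine_weight_decrement": Weight.FOLLOW_UP,
    "uterine_weight_increase": Weight.FOLLOW_UP,
    "follicle_number_or_subtype_change": Weight.SUFFICIENT,
    "biochemical_change": Weight.INSUFFICIENT_ALONE,
    "altered_age_at_puberty_or_senescence": Weight.SUFFICIENT,
    "endocrine_change_in_adequate_study": Weight.SUFFICIENT,
    "in_vitro_or_perfusion_finding": Weight.INSUFFICIENT_ALONE,
    "breast_histopathology_or_milk_change": Weight.FOLLOW_UP,
    "agent_present_in_milk": Weight.INSUFFICIENT_ALONE,
    "agent_concentrated_in_milk": Weight.FOLLOW_UP,
}


def is_characterized_as_toxicant(findings):
    """True if at least one observed finding is sufficient by itself."""
    return any(END_POINT_WEIGHT[f] is Weight.SUFFICIENT for f in findings)


def follow_up_triggers(findings):
    """Sorted findings that call for additional studies."""
    return sorted(f for f in findings if END_POINT_WEIGHT[f] is Weight.FOLLOW_UP)


observed = {"biochemical_change", "uterine_weight_increase"}
print(is_characterized_as_toxicant(observed), follow_up_triggers(observed))
```

A table like this only flags which findings can stand alone; the judgment about whether a given study meets the conditions for a finding remains with the evaluator.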

REFERENCES

1. OECD. OECD guidelines for testing chemicals. Organization for Economic Co-Operation and Development; 1987.

2. Food and Drug Administration. Good laboratory practice regulations for nonclinical laboratory studies. Food and Drug Administration; 1988.

3. Environmental Protection Agency. Environmental Protection Agency Federal Insecticide, Fungicide and Rodenticide Act (FIFRA): Good laboratory practice standards. U.S. EPA; 1990.

4. Environmental Protection Agency. Guidelines for developmental toxicity risk assessment. Fed Reg. 1991;56(234):63797-826.

5. Environmental Protection Agency. Proposed guidelines for assessing male reproductive risk. Fed Reg. 1988;53:24850-69.

6. Environmental Protection Agency. Proposed guidelines for assessing female reproductive risk. Fed Reg. 1988;53:24834-47.

7. Environmental Protection Agency. Exposure factors handbook. In: Office of Health and Environmental Assessment, Office of Research and Development. Washington, DC; 1990.

8. Environmental Protection Agency. Guidelines for exposure assessment. Fed Reg. 1992;57:22888-938.

9. Hallenbeck WH, Cunningham KM, eds. Quantitative risk assessment for environmental and occupational health. Chelsea, MI: Lewis Publishers, Inc.; 1986.

10. Environmental Protection Agency. Pesticide assessment guidelines for applicator exposure monitoring-Subdivision U. Washington, DC: Office of Pesticide Programs; 1987.

11. Environmental Protection Agency. Dietary Risk Evaluation System (DRES). Washington, DC: Office of Pesticide Programs; 1987.

12. Morrissey RE, Lamb JC 4th, Schwetz BA, Teague JL, Morris RW. Association of sperm, vaginal cytology, and reproductive organ weight data with results of continuous breeding reproduction studies in Swiss (CD-1) mice. Fundam Appl Toxicol. 1988;11:359-71.

13. Morrissey RE, Lamb JC IV, Morris RW, Chapin RE, Gulati DK, Heindel JJ. Results and evaluations of 48 continuous breeding reproduction studies conducted in mice. Fundam Appl Toxicol. 1988;13:747-77.

14. Wilson JG, Scott WJ, Ritter EJ, Fradkin R. Comparative distribution and embryotoxicity of hydroxyurea in pregnant rats and rhesus monkeys. Teratology. 1975;11:169-78.

15. Wilson JG, Ritter EJ, Scott WJ, Fradkin R. Comparative distribution and embryotoxicity of acetylsalicylic acid in pregnant rats and rhesus monkeys. Toxicol Appl Pharmacol. 1977;41:67-78.

16. Kimmel CA, Young JF. Correlating pharmacokinetics and teratogenic end points. Fundam Appl Toxicol. 1983;3:250-5.

17. Kimmel CA, Francis EZ. Proceedings of the workshop on the acceptability and interpretation of dermal developmental toxicity studies. Fundam Appl Toxicol. 1990;14:386-98.

18. Lilienfeld AM, Lilienfeld DE, eds. Foundations of epidemiology, 2nd ed. New York: Oxford University Press; 1980.

19. Barnes DG, Dourson M. Reference dose (RfD): Description and use in health risk assessments. Regul Toxicol Pharmacol. 1988;8:471-86.

20. Terry KK, Elswick BA, Stedman DB, Welsch F. Developmental phase alters dosimetry-teratogenicity relationship for 2-methoxyethanol in CD-1 mice. Teratology. 1994;49:218-27.

21. Crump KS. A new method for determining allowable daily intakes. Fundam Appl Toxicol. 1984;4:854-71.

22. Barnes DG, Daston GP, Evans JS, Jarabek AM, Kavlock RJ, Kimmel CA, Park C, Spitzer HL. Benchmark dose workshop: Criteria for use of a benchmark dose to estimate a reference dose. Regul Toxicol Pharmacol. (in press).

23. Allen BC, Kavlock RJ, Kimmel CA, Faustman EM. Dose-response assessment for developmental toxicity: II. Comparison of generic benchmark dose estimates with NOAELs. Fundam Appl Toxicol. 1994;23:487-95.

24. Allen BC, Kavlock RJ, Kimmel CA, Faustman EM. Dose-response assessment for developmental toxicity: III. Statistical models. Fundam Appl Toxicol. 1994;23:496-509.

25. Lewis SC, Lynch JR, Nikiforov AI. A new approach to deriving community exposure guidelines from “no-observed-adverse-effect-levels.” Regul Toxicol Pharmacol. 1990;11:314-30.

26. Renwick AG. Safety factors and establishment of acceptable daily intakes. Food Addit Contam. 1991;8:135-50.

27. Faustman EM, Allen BC, Kavlock RJ, Kimmel CA. Dose-response assessment for developmental toxicity: I. Characterization of data base and determination of NOAELs. Fundam Appl Toxicol. 1994;23:478-86.

28. Anderson LM, Donovan PJ, Rice JM. Risk assessment for transplacental carcinogens. In: Li AP, ed. New approaches in toxicity testing and their application in human risk assessment. New York: Raven Press; 1985:179-202.

29. Herbst AL, Ulfelder H, Poskanzer DC. Adenocarcinoma of the vagina: Association of maternal stilbestrol therapy with appearance in young women. N Engl J Med. 1971;284:878.

30. Environmental Protection Agency. Guidelines for carcinogen risk assessment. Fed Reg. 1986;51:33992-4003.

31. Environmental Protection Agency. Guidelines for mutagenicity risk assessment. Fed Reg. 1986;51:34006-12.

32. Scott DT. Detection of neurobehavioral dysfunction in infancy: Current methods problems and prospects. In: Bracken MB, ed. Perinatal epidemiology. New York: Oxford University Press; 1984:464-90.

33. Food and Drug Administration. Guidelines for reproduction and teratology of drugs. Bureau of Drugs; 1966.

34. Food and Drug Administration. Advisory committee on Protocols for Safety Evaluations. Panel on reproduction report on reproduction studies in the safety evaluation of food additives and pesticide residues. Toxicol Appl Pharmacol. 1970;16:264-96.

35. OECD. Guideline for testing of chemicals-Teratogenicity. Organization for Economic Cooperation and Development; 1981.

36. Environmental Protection Agency. Pesticide assessment guidelines, subdivision F. Hazard evaluation: Human and domestic animals; 1982.

37. Environmental Protection Agency. Toxic Substances Control Act test guidelines; Final rules. Fed Reg. 1985;50:39426-34.

38. Environmental Protection Agency. Triethylene glycol monomethyl, monoethyl, and monobutyl ethers; Proposed test rule. Fed Reg. 1986;51:17883-94.

39. Environmental Protection Agency. Diethylene glycol butyl ether and diethylene glycol butyl ether acetate; Final test rule. Fed Reg. 1988;53:5932-53.

40. Lamb JC IV. Reproductive toxicity testing: Evaluating and developing new testing systems. J Am Coll Toxicol. 1985;4:163-71.

41. Wilson JG. Environment and birth defects. New York: Academic Press; 1973:30-2.

42. Selevan SG, Lemasters GK. The dose-response fallacy in human reproductive studies of toxic exposures. J Occup Med. 1987;29:451-4.

43. Kimmel CA, Price CJ. Developmental toxicity studies. In: Arnold DL, Grice HC, Krewski DR, eds. Handbook of in vivo toxicity testing. San Diego, CA: Academic Press; 1990:271-301.

44. Riley EP, Vorhees CV, eds. Handbook of behavioral teratology. New York: Plenum Press; 1986.

45. Kavlock RJ, Grabowski CT, eds. Abnormal functional development of the heart, lungs, and kidneys: Approaches to functional teratology. Proceedings of a conference, Asheville, NC, May 11-13, 1983. Prog Clin Biol Res. 1983;140:1-387.

46. Fujii T, Adams PM. Functional teratogenesis: Functional effects on the offspring after parental drug exposure. Tokyo, Japan: Tokyo University Press; 1987.

47. Environmental Protection Agency. Assessment of risks to human reproduction and to development of the human conceptus from exposure to environmental substances; 1982.

48. Environmental Protection Agency. Hazard evaluation division standard evaluation procedure. Teratology studies. Office of Pesticide Programs; 1985.

49. Chernoff N, Kavlock RJ. An in vivo teratology screen utilizing pregnant mice. J Toxicol Environ Health. 1982;10:541-50.

50. Kavlock RJ, Short RD Jr, Chernoff N. Evaluations of an in vivo teratology screen. Teratogenesis Carcinog Mutagen. 1987;7:7-16.

51. Wickramaratne GA. The Chernoff-Kavlock assay: Its validation and application in rats. Teratogenesis Carcinog Mutagen. 1987;7:73-83.

52. Wilson JG. Survey of in vitro systems: Their potential use in teratogenicity screening. In: Wilson JG, Fraser FC, eds. Handbook of teratology. vol. 4. New York: Plenum Press; 1978:135.

53. Kimmel GL, Smith K, Kochhar DM, Pratt RM. Proceedings of the consensus workshop on in vitro teratogenesis testing. Teratogenesis Carcinog Mutagen. 1982;2:221-374.

54. Brown NA, Fabro S. The in vitro approach to teratogenicity testing. In: Snell K, ed. Developmental toxicology. London: Croom-Helm; 1982:31-57.

55. Kimmel GL. In vitro tests in screening teratogens: Considerations to aid the validation process. In: Marois M, ed. Prevention of physical and mental congenital defects, part C. New York: Alan R. Liss, Inc.; 1985:259-63.

56. Whitby KE. Teratological research using in vitro systems. III. Embryonic organs in culture. Environ Health Perspect. 1987;72:221-3.

57. Brown NA. Teratogenicity testing in vitro: Status of validation studies. Arch Toxicol Suppl. 1987;11:105-14.

58. Faustman EM. Short-term tests for teratogens. Mutat Res. 1988;205:355.

59. Schwetz BA, Morrissey RE, Welsch F, Kavlock RJ. Proceedings of a conference on in vitro teratology. Environ Health Perspect. 1991;94:265-8.

60. Kistler A. Limb bud cell cultures for estimating the teratogenic potential of compounds. Arch Toxicol. 1987;60:403-14.

61. Chapin RE, Heindel JJ. Male reproductive toxicology. New York: Academic Press, Inc.; 1993:389 (Tyson CA, Witschi H, eds. Methods in Toxicology; vol 3A).

62. Francis EZ, Kimmel GL. Proceedings of the workshop on one- versus two-generation reproductive effects studies. J Am Coll Toxicol. 1988;7:911-25.

63. Green S, Auletta A, Fabricant J, et al. Current status of bioassays in genetic toxicology-The dominant lethal assay: A report of the U.S. Environmental Protection Agency Gene-Tox Program. Mutat Res. 1985;154:49-67.

64. Hess RA, Moore BJ. Histological methods for evaluation of the testis. In: Chapin RE, Heindel JJ, eds. Male reproductive toxicology. New York: Academic Press, Inc.; 1993:52-85 (Tyson CA, Witschi H, eds. Methods in Toxicology; vol 3A).

65. Committee on Biologic Markers. Biologic markers in reproductive toxicology. Washington, DC: National Academy of Sciences; 1989.

66. Zenick H, Clegg ED. Assessment of male reproductive toxicity: A risk assessment approach. In: Hayes AW, ed. Principles and methods of toxicology. 2nd ed. New York: Raven Press; 1989:279-309.

67. Levine RJ, Symons MJ, Balogh SA, Arndt DM, Kaswandik NR, Gentile JW. A method for monitoring the fertility of workers: I. Method and pilot studies. J Occup Med. 1980;22:781-91.

68. Starr TB, Dalcorso RD, Levine RJ. Fertility of workers: A comparison of logistic regression and indirect standardization. Am J Epidemiol. 1986;123:490-8.

69. Baird DD, Wilcox AJ, Weinberg CR. Using time to pregnancy to study environmental exposures. Am J Epidemiol. 1986;124:470-80.

70. Jick H, Porter J. Relation between smoking and age of natural menopause. Report from the Boston Collaborative Drug Surveillance Program, Boston University Medical Center. Lancet. 1977;1:1354-5.

71. Treloar AE, Boynton RE, Behn BG, Brown BW. Variation in the human menstrual cycle through reproductive life. Int J Fertil. 1970;12:77-126.

72. Shortridge LA. Assessment of menstrual variability in working populations. Reprod Toxicol. 1988;2:171-6.

73. Wilcox AJ, Weinberg CR, Wehmann RE, Armstrong EG, Canfield RE, Nisula BC. Measuring early pregnancy loss: Laboratory and field methods. Fertil Steril. 1985;44:366-74.

74. Wilcox AJ, Weinberg CR, O’Connor JF, et al. Incidence of early loss of pregnancy. N Engl J Med. 1988;319:189-94.

75. Harvey SM. Female sexual behavior: Fluctuations during the menstrual cycle. J Psychosom Res. 1987;31:101-10.

76. Strauss B, Appelt H. Psychological concomitants of the menstrual cycle: A prospective longitudinal approach. J Psychosom Obstet Gynecol. 1983;2:215.

77. Heindel JJ, Chapin RE. Female reproductive toxicology. New York: Academic Press, Inc.; 1993:404 (Tyson CA, Witschi H, eds. Methods in Toxicology; vol 3B).

78. May PC, Finch CE. Aging and responses to toxins in female reproductive functions. Reprod Toxicol. 1988;1:223-8.

79. Pederson T, Peters H. Proposal for a classification of oocytes and follicles in the mouse ovary. J Reprod Fertil. 1968;17:555.

80. Smith BJ, Plowchalk DR, Sipes IG, Mattison DR. Comparison of random and serial sections in assessment of ovarian toxicity. Reprod Toxicol. 1991;5:379-83.

81. Teng CT, Walker MP, Bhattacharyya SN, Klapper DG, DiAugustine RP, McLachlan JA. Purification and properties of an estrogen-stimulated mouse uterine glycoprotein (approx. 70 kDa). Biochem J. 1986;240:413-22.

82. Hughes CL. Effects of phytoestrogens on GnRH-induced luteinizing hormone secretion in ovariectomized rats. Reprod Toxicol. 1988;1:179-81.

83. Haney AF, Hughes SF, Hughes CL. Screening of potential reproductive toxicants by use of porcine granulosa cell cultures. Toxicology. 1984;30:227-41.

84. Teaff NL, Savoy-Moore RT, Subramanian MG, Ataya KM. Vinblastine reduces progesterone and prostaglandin E production by rat granulosa cells in vitro. Reprod Toxicol. 1990;4:209-13.

85. Wilson JT. Determinants and consequences of drug excretion in breast milk. Drug Metab Rev. 1983;14:619-52.

86. Butte NF, Garza C, Johnson AJ, O’Brien-Smith E, Nichols BL. Longitudinal changes in milk composition of mothers delivering preterm and term infants. Early Hum Dev. 1984;9:153-62.

87. Uphouse LL. Effects of chlordecone on neuroendocrine function of female rats. Neurotoxicology. 1985;6:191-210.

88. Uphouse LL, Williams J. Sexual behavior of intact female rats after treatment with o,p’-DDT or p,p’-DDT. Reprod Toxicol. 1989;3:33-41.