Journal of Applied Measurement, 3(2), 190-204
Copyright 2002

Development of a Functional Movement Scale for Infants

Suzann K. Campbell
University of Illinois at Chicago

Benjamin D. Wright
J. Michael Linacre
University of Chicago

The increasing survival rate of infants with a complicated birth and perinatal history has generated the need for a test of functional motor performance with the capability of identifying children under four months of age with delayed development which could be addressed with physical therapy. This paper describes a Rasch analysis of the psychometric qualities of the Test of Infant Motor Performance (TIMP) for the purpose of reducing the length of the test while maintaining its precision as a measurement device. Following analysis of fit statistics, item-to-total correlations, redundancy of item difficulty measures, and consideration of clinically-relevant features of test items from analysis of 1732 tests, the TIMP was reduced from 59 to 42 items forming a functional motor scale for prematurely born infants. The resulting person separation index was 4.85 and the item separation index was 23.79.

Requests for reprints should be sent to Suzann K. Campbell, Department of Physical Therapy, University of Illinois at Chicago, 1919 W. Taylor Street M/C 898, Chicago, IL 60612-7251, email: skc@uic.edu.

An increasing number of infants with perinatal complications now survive as a result of advanced technology and caregiving interventions provided in special care nurseries. The developmental morbidity associated with this trend, however, appears to be unchanging (Fanaroff, Wright, Stevenson, Shankaran, Donovan, Ehrenkrans, Younes, Korones, Stoll, Tyson, Bauer, Oh, Lemons, Papile, and Verter, 1995). For example, infants with brain insults (Pinto-Martin, Whitaker, Feldman, Van Rossem, and Paneth, 2000) or chronic lung disease (Majnemer, Riley, Shevell, Birnbaum, Greenstone, and Coates, 2000), and those born extremely early tend to have delayed motor development (McCormick, McCarton, Tonascia, and Brooks-Gunn, 1993). Improving the accuracy with which infants at risk for disability are identified remains a challenge, in part because infants can recover from some neurodevelopmental abnormalities during the first year of life (Wildin, Smith, Anderson, Swank, Denson, and Landry, 1997). The purpose of this paper is to describe the use of Rasch analysis in the development of a test of functional motor performance intended to identify infants whose movement capabilities demonstrate delayed development. The specific goal of this analysis was to reduce the length of the test.

The Test of Infant Motor Performance

The Test of Infant Motor Performance (TIMP) is a 25-35 minute assessment of the posture and selective control of movement needed by infants under four months of age for functional performance in daily life. The TIMP was developed to 1) identify infants with delayed motor development, 2) discriminate among infants with varying degrees of risk for poor motor outcome, and 3) measure change resulting from intervention. The TIMP has gone through three research versions to date.

Version 1 of the TIMP was developed by Girolami for use in assessing the effects on posture and movement of physical therapy provided to premature infants at risk for movement dysfunction in a special care nursery (Girolami and Campbell, 1994). The test included fifteen dichotomous items that were scored on the basis of observing infants' spontaneous movements. An example of these Observed Items is one used to assess the infant's ability to hold the head centered along the midline of the body, i.e., keep the head from falling to the side while backlying. Twenty-eight additional Elicited Items were scored on Likert-type scales reflecting the infant's responses to being placed in various positions or stimulated with interesting sights and sounds, such as a rattle or the examiner's voice. Results of a controlled clinical trial demonstrated that the TIMP was a valid tool for capturing the effects of intervention (Girolami and Campbell, 1994).

Based on the sensitivity of TIMP scores for detection of intervention effects, the decision was made to expand the test in order to increase the age range to include performance of premature infants as young as 32 weeks postconceptional age, i.e., 2 months prior to expected date of birth, as well as infants up to 3-4 months of age (chronologic or corrected for premature delivery). The resulting Version 2 of the TIMP had 27 dichotomous Observed Items and 30 Elicited Items scored with rating scales. The rating scales for some TIMP items allow one to assess whether or how much of an activity an infant can do, while others are used to assess changes in how an infant solves a movement problem, i.e., changing patterns of movement synergies used in response to stimulation.

With funding from the Foundation for Physical Therapy, TIMP Version 2 was studied in a cross-sectional sample of 137 infants from three race/ethnicity groups in the Chicago metropolitan area: non-Latino/a white, black (African or African-American), and Latino/a (Mexican or Puerto Rican) (Campbell, Osten, Kolobe, and Fisher, 1993; Campbell, Kolobe, Osten, Lenke, and Girolami, 1995). Data from 174 tests on these subjects were analyzed with BIGSTEPS V.2.2. Clarity of the measure as reflected in the item separation index was 7.38 (root mean square error = 0.19) with a separation reliability of .98. The items separated the infants into about 6 levels of ability (person separation index of 6.02 with root mean square error = 0.21), which was thought to be excellent in view of the 5-6-month range of the test. TIMP performance correlated with age at r = [...] (Campbell, Kolobe, Osten, Lenke, and Girolami, 1995).

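The separation index and the separation reliability quoted above are two views of the same quantity. In Rasch measurement they are conventionally linked by R = G^2 / (1 + G^2), where G is the separation index. The following is a minimal sketch of that relation, not part of the original analysis; the function name is hypothetical and is not a BIGSTEPS routine.

```python
def reliability_from_separation(separation):
    """Convert a Rasch separation index G into a separation reliability.

    Uses the conventional relation R = G**2 / (1 + G**2). The function
    name is illustrative only; it is not taken from BIGSTEPS.
    """
    if separation < 0:
        raise ValueError("separation index must be non-negative")
    return separation ** 2 / (1.0 + separation ** 2)


# Item separation index reported for TIMP Version 2.
print(round(reliability_from_separation(7.38), 2))  # 0.98
```

Under this relation, the item separation index of 7.38 reproduces the reported separation reliability of .98.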
After study of Rasch analysis results on TIMP Version 2 and a review of new literature on possible predictors of abnormal outcome, the test developers honed their theoretical understanding of what the TIMP should accomplish and made several new changes to the test. As a result, Version 3 of the TIMP was developed, which included 28 Observed Items and 31 Elicited Items, 6 of which were the same items but used to test different sides of the body so that asymmetry of movement could be assessed.

Various aspects of the TIMP's validity were explored in research by graduate students or with funding from the National Center for Medical Rehabilitation Research of the National Institutes of Health. V.3 of the TIMP was shown to 1) relate to movement demands on infants provided by caretakers in naturalistic interactions such as bathing, dressing, [...] by maturation and to differentiate among groups of infants with differing degrees of risk for poor motor outcome based on medical conditions (Campbell and Hedeker, 2001), and 3) predict 12-month outcome on the Alberta Infant Motor Scale with high sensitivity and specificity (Campbell, Kolobe, Wright, and Linacre, in press). Prediction to 5-year outcome is currently under study (Kolobe and Bulanda, personal communication).

Despite strong support for the proposed uses of the TIMP resulting from our research, a shorter tool was desirable for practical clinical use. The purpose of this paper is to present evidence from Rasch analysis of TIMP V.3 data and to describe how Rasch item fit statistics and other considerations were used to shorten the test to create Version 4.

Methods

Subjects

The subjects in this study were a sample of convenience born during the years 1996-98 and recruited from the special care nurseries of three hospitals or from the community within the Chicago metropolitan area. Subject recruitment methods were approved by the Institutional Review Board for the protection of the rights of human subjects at the University of Illinois at Chicago (#H-99-1158) and at each field testing site. Subjects were 159 infants with a range of medical complications who participated either in a study of test-retest reliability over the space of three days (n=56) or in both the test-retest and a longitudinal study (n=103) of performance on the TIMP. Fifty-five percent of the infants were male, 42% female (3% had missing data on sex). The distribution of race/ethnicity was 38% white, non-Latino/a; 31% black (African or African-American); 26% Latino/a; and the rest Asian (1%), mixed (1%), or missing (3%).

Procedures

With the informed consent of a parent to test an eligible subject and medical clearance from the infant's physician, the infants in the longitudinal study were scheduled for testing with the TIMP every week until approximately 4 months corrected age. The number of weekly tests each infant received ranged from 2 to 23. Test numbers varied because of 1) age and health status of the infant at recruitment, 2) illness, and 3) family scheduling conflicts. In the test-retest study, infants were tested twice within the space of 3 days.

At initial testing, infants were required to be off mechanical ventilation and cleared for testing by their physician, but could be receiving oxygen by nasal cannula. Thus infants began testing at different ages based on health. The infant was tested in its current environment: isolette with vital signs monitors in place, open crib, home, or occasionally during an outpatient clinic visit. Testers were not told the age or medical history of the infant before testing (unless information was needed to guarantee safe handling of an infant during assessment). Testing occurred about one hour prior to expected feeding time for preterm infants or about midway between feedings for older infants.

Twelve testers participated in this study. Each tester had experienced a period of training in use of the TIMP which consisted of a 4-hour workshop on development and validation of the test and how to score it, independent reading of research on the test, practice in testing at least 10 infants of varying ages, and rating of item performances on 14 videotapes of infants of different ages with or without a variety of medical complications. Rater consistency was evaluated with the Facets computer program for Rasch psychometric analysis (Linacre, 1988); raters needed to have fewer than 5% misfitting ratings, i.e., unexpected ratings given the infant's level of ability on an item and the item difficulty, in order to qualify for being a tester in this study.

Data Analysis

Scores on the TIMP were subjected to Rasch analysis using the BIGSTEPS computer program Ver. 2.65 in order to transform the raw ordinal scores into interval-level logit measures (Wright and Linacre, 199[...]; Wright and Masters, 1982). According to the Rasch model, the probability of passing an item is based only on the ability of the subject and the difficulty of the item and its various scale levels. Analysis yields both population-independent estimates of item parameters and individual ability estimates for the latent trait being measured (Hambleton, 2000), in this case functional motor performance in early infancy.
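
The dependence of a response on only ability and difficulty can be made concrete with a short sketch. The code below is an illustration under stated assumptions, not the BIGSTEPS implementation: it uses the standard dichotomous Rasch formula and the standard partial credit formulation for rated items, and the function names and example values are hypothetical.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of passing a dichotomous item (both arguments in logits)."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))


def partial_credit_probabilities(ability, step_difficulties):
    """Category probabilities for one rated item.

    step_difficulties[j] is the difficulty (in logits) of moving from
    category j to category j + 1, so an item with m steps has m + 1 levels.
    """
    # Cumulative sums of (ability - step) give the log-numerator per category.
    log_numerators = [0.0]
    for step in step_difficulties:
        log_numerators.append(log_numerators[-1] + (ability - step))
    # Subtract the maximum before exponentiating for numerical stability.
    largest = max(log_numerators)
    weights = [math.exp(value - largest) for value in log_numerators]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical values: an infant 1 logit above an item of difficulty 0.
print(round(rasch_probability(1.0, 0.0), 3))  # 0.731
# Hypothetical 5-level item (4 steps); the probabilities sum to 1.
print(partial_credit_probabilities(0.5, [-1.5, -0.5, 0.5, 1.5]))
```

With a single step, the partial credit probabilities reduce to the dichotomous case, which is why one framework can cover both the dichotomous and the rated items.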
The Rasch analytical rating scale model was the Andrich model, but with each Elicited Item conceptualized as having its own rating scale structure as in Masters' partial credit model. The partial credit model was selected because the rating scale for each Elicited Item is unique and the number of levels varies from 5 to 7. Given their dichotomous nature, Observed Items were assumed to form a group as a whole. BIGSTEPS begins with a central estimate for each person measure, item calibration, and rating scale step calibration. An iterative version of the normal approximation algorithm is used to reach a rough convergence to the observed data pattern. The unconditional maximum likelihood method, using proportional curve fitting, is then iterated to obtain more exact estimates, standard errors and fit statistics. The scale mean for item difficulty was set to 50 with one logit (log-odds unit) equal to 10 points.

Item fit to the Rasch model was investigated using infit (an information-weighted fit statistic which is sensitive to unexpected behavior affecting responses to items near the person's ability measure) and outfit (an outlier-sensitive fit statistic sensitive to unexpected behavior by persons on items far from the person's ability level) measures. Fit statistics are reported as mean square residuals which have approximate chi-square distributions. In BIGSTEPS, values of standardized fit statistics are obtained from the squared residuals by means of an asymptotic normal distribution (Windmeijer, 1990). Mean-square fit statistics of 1.2 or greater were used as the cutoff for identifying misfitting items for further evaluation. Values below one were not addressed in the analysis.

Items with unsatisfactory fit to the Rasch model were reviewed for the possibility of removal from the test or revision to improve item fit. Before consideration of removing any misfitting item from the test, several factors other than Rasch results were also considered. First, Elicited Items were compared with data on frequency of occurrence as a naturalistic demand for movement during caregiver-infant interactions so that no item believed to be of functional significance in daily life would be removed without considering whether the item could instead be revised to improve item response characteristics. Second, how well therapists and parents liked these items was considered; item deletion was supported by finding that the items were hard to administer, seemed particularly demanding for fragile infants, or otherwise were difficult for parents to understand based on therapists' impressions of parent and child responses. A final consideration was whether removal of an item would be likely to decrease test precision because its item difficulty was not duplicated by other items at the same level of scale performance. A finding of redundancy in difficulty with another item was also grounds for consideration of deleting an item. A goal of the analysis was to reduce the number of items, if possible, in order to minimize infant stress and fatigue and to increase the practicality of using the test by reducing the time required to administer it.
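
To illustrate how the two fit statistics differ, the sketch below computes infit and outfit mean squares for a single dichotomous item using their usual textbook definitions: outfit is the plain average of squared standardized residuals, and infit is the sum of squared residuals divided by the sum of model variances. It is a simplified illustration, not the BIGSTEPS code; rated (polytomous) items would need expected scores and variances from the rating scale model, and all names and data here are hypothetical.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of passing a dichotomous item (both arguments in logits)."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))


def item_fit(responses, abilities, difficulty):
    """Return (infit, outfit) mean squares for one dichotomous item."""
    squared_residuals = []
    variances = []
    for observed, ability in zip(responses, abilities):
        expected = rasch_probability(ability, difficulty)
        squared_residuals.append((observed - expected) ** 2)
        variances.append(expected * (1.0 - expected))
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(variances)
    # Infit: information-weighted, so well-targeted responses dominate.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit


def is_misfitting(mean_square, cutoff=1.2):
    """Apply the 1.2-or-greater cutoff described in the text."""
    return mean_square >= cutoff


# Hypothetical data: four infants exactly at the item's difficulty.
print(item_fit([1, 0, 1, 0], [0.0, 0.0, 0.0, 0.0], 0.0))  # (1.0, 1.0)
# Adding one very able infant who unexpectedly fails inflates outfit
# far more than infit.
print(item_fit([1, 0, 1, 0, 0], [0.0, 0.0, 0.0, 0.0, 4.0], 0.0))
```

A single surprising response from a person far from the item's difficulty can push outfit well past the cutoff while infit moves much less.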

Results

Testing in the longitudinal study and the test-retest reliability study resulted in data from 1723 tests crossing the entire range of age from 32 weeks postconceptional age through 4 months post-term. Only the first test of a test-retest pair was used in the data analysis. Tests from 1719 non-extreme scoring individuals were assessed. The mean Rasch measure was 55.43 with a SD of 10.49 and mean SE of 1.98 with a SD of .45. Overall infit mean square for persons was .98 and outfit mean square was 1.10. The person separation index was 4.57 with a reliability of .95.

A wide range of item difficulty was achieved with an item separation index of 21.37 and reliability of 1.00. The mean item measure (as set for the analysis) was 50.00 with a SD of 11.92, model error of .48 with a SD of .23. The third column from the left in Figure 1 shows that the average item difficulties ranged from 20 to 83 (D represents dichotomously scored Observed Items and X's represent scaled Elicited Items). Calibrations for the easiest step on Elicited Items extend the range of the scale down to 10 (Fig. 1, second column) while calibrations for the hardest step extend the scale up to 85 (Fig. 1, fourth column). Point biserial correlations for items ranged from -.10 to .78 with 29 of 59 (49%) greater than .50. Infit mean square for items was 1.01 with an outfit of 1.21.

Table 1
Difficulty Calibrations, Infit and Outfit Mean Square Values, and Point Biserial Correlations for Items Misfitting the Rasch Model*

ITEM                                            DIFFICULTY  INFIT MNSQ  OUTFIT MNSQ  PT. BISERIAL
O2   Head turn L                                        55        1.13         1.23           .30
O3   Head turn R                                        55        1.15         1.24           .29
O4   Hands together                                     58        1.19         1.27           .24
O5   R Hand to mouth                                    46        1.45         2.23          -.10
O6   L Hand to mouth                                    50        1.44         1.89          -.03
O7   Mouths R hand                                      72        1.19         1.38           .19
O8   Mouths L hand                                      73        1.17         1.36           .20
O9   Individual R finger movements                      37        1.23         1.71           .05
O10  Individual L finger movements                      39        1.26         1.78           .02
O11  R wrist movements                                  37        1.11         1.29           .18
O12  L wrist movements                                  37        1.11         1.30           .18
O15  Pelvic lift in supine                              43        1.23         1.55           .12
O16  Hip/knee flexion                                   20        1.04         1.38           .10
O17  R ankle movements                                  29        1.06         1.28           .17
O18  L ankle movements                                  30        1.06         1.31           .17
O23  Arm movements with forearm off surface             53        1.33         1.47           .10
O24  Arm movements with upper arm off surface           45        1.26         1.46           .12
E3   Straighten spine with head supported               47        1.55         2.26           .40
E11  R neck rotation                                    47        1.06         1.35           .59
E13  Defensive head movements                           28        1.46         7.90           .10
E14  Defensive arm movements                            61        1.31         1.35           .56
E28  Arm release in prone                               57        1.58         1.53           .49
E29  Standing                                           62        1.21         1.17           .58

* Bold lettering indicates value exceeding the targeted range for item fit statistic of 1.2.

Figure 1. TIMP BIGSTEPS-Version 3 Item Scaling Properties. Map of infants and items on the measure scale (10 to 100), with columns for INFANTS, ITEMS-LOW, ITEMS-MEAN, and ITEMS-HIGH. Each '#' in the infant column is 10 infants; each '.' is 1 to 9 infants.

Information on misfitting items is presented in Table 1. The identified items had difficulty values ranging from 20 to 73 and point biserial correlations ranging from -.10 to .59. Review of the infit mean square data for each item revealed 12 items with infit greater than 1.2: 5 Elicited Items (E3, E13, E14, E28, and E29) and 7 Observed Items (O5, O6, O9, O10, O15, O23, and O24). Of the 12, 11 items also had unacceptable outfit statistics. In addition, 11 items (O2, O3, O4, O7, O8, O11, O12, O16, O17, O18, and E11) had outfit values greater than 1.2.

Based on these results, the following Observed Items were deleted in forming V.4 of the TIMP: O2 through O8, O11, O12, O15, O23, and O24. With the exception of O7 and O8, other items with similar average difficulty levels as the items deleted were already available in the scale. With respect to O7 and O8 with difficulty measures of 72 and 73, respectively, although other items of similar average difficulty are lacking in this ability region, the rightmost column of Figure 1 shows that several Elicited Items have their highest step level in this region. Thus precision at the higher ability levels should be maintained without O7 and O8. Finally, point biserial correlations for all removed Observed Items were low (< .31).
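
The item counts in this screening step can be checked directly against the Table 1 values. The sketch below is a bookkeeping illustration only: the infit and outfit values are transcribed from Table 1, Observed Items are labeled with an O prefix and Elicited Items with an E prefix, and the variable names are hypothetical.

```python
# (item, infit MNSQ, outfit MNSQ) transcribed from Table 1.
TABLE_1 = [
    ("O2", 1.13, 1.23), ("O3", 1.15, 1.24), ("O4", 1.19, 1.27),
    ("O5", 1.45, 2.23), ("O6", 1.44, 1.89), ("O7", 1.19, 1.38),
    ("O8", 1.17, 1.36), ("O9", 1.23, 1.71), ("O10", 1.26, 1.78),
    ("O11", 1.11, 1.29), ("O12", 1.11, 1.30), ("O15", 1.23, 1.55),
    ("O16", 1.04, 1.38), ("O17", 1.06, 1.28), ("O18", 1.06, 1.31),
    ("O23", 1.33, 1.47), ("O24", 1.26, 1.46), ("E3", 1.55, 2.26),
    ("E11", 1.06, 1.35), ("E13", 1.46, 7.90), ("E14", 1.31, 1.35),
    ("E28", 1.58, 1.53), ("E29", 1.21, 1.17),
]
CUTOFF = 1.2

# Items whose infit exceeds the cutoff.
infit_misfits = [item for item, infit, _ in TABLE_1 if infit > CUTOFF]
# Of those, items whose outfit also exceeds the cutoff.
both_misfits = [item for item, infit, outfit in TABLE_1
                if infit > CUTOFF and outfit > CUTOFF]
# Items with acceptable infit but outfit above the cutoff.
outfit_only = [item for item, infit, outfit in TABLE_1
               if infit <= CUTOFF and outfit > CUTOFF]

print(len(infit_misfits), len(both_misfits), len(outfit_only))  # 12 11 11
```

The one infit-misfitting item with acceptable outfit is E29 (outfit 1.17), which accounts for the difference between 12 and 11.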

Items O20, O21, and O22 were also deleted, although their fit statistics were satisfactory, because during testing they engender a position change from supine to prone that was an undesirable disturbance during spontaneous movement observation, and because they were redundant in difficulty with reliably obtained Elicited Items E24, E26, and E27 which assess the same movements under conditions of stimulation.

Of the misfitting Observed Items, only O9, O10, O16, O17, and O18 were retained. O9 and O10 are used to assess the ability to perform individual finger movements. Their misfit values were just above 1.2 (1.23 and 1.26, respectively). They were relatively easy movements for infants to display (difficulty 37 and 39, respectively) in an area of the scale with few items of similar average difficulty, so deleting them might reduce the precision of the scale for the lowest functioning infants. In addition, the delicate finger movements were items that fascinated parents when pointed out by testers. Observed Item O16 is used to measure kicking and was the easiest item in the test with no other items of similar difficulty. O16's infit was satisfactory and kicking is a fundamental movement skill that the authors believed should be retained in the test. O17 and O18 are used to assess isolated movements of the right and left ankle, respectively, easy items that were not redundant in difficulty with other items. Infit values were satisfactory so the items were retained.

Of the five Elicited Items selected for possible revision or deletion from the TIMP because of high infit values, E3 and E28 were removed. In addition to support for deleting these items based on misfit statistics, they were items deemed by therapists to be ones families liked less well than others and both often resulted in crying. Although the point biserial value for E3 was moderate at .40, neither item was identified as being similar to caretaker demands for movement in Murney and Campbell's study (1998) of the ecologic relevance of TIMP Elicited Items. Other items were available with similar difficulties so removal did not seem likely to affect the precision of the test.

Items E13, E14, and E29 were not removed but were instead revised, in part because each of them was found to occur moderately or frequently in naturalistic interactions (Murney and Campbell, 1998). Problems with E13, one of the easiest items in the test, were believed to come from infrequent use of several response categories and little variation in infant performance, leading to an exceptionally high outfit value (7.90) and poor point biserial value (.10). Rather than being removed from the test, the item was revised by collapsing two levels, making it a 4-level rather than 5-level item, and by changing the time allowed for the response to occur to make it a more difficult item.

E14 had a point biserial correlation of .56 and was not redundant with other items in average difficulty. The item, therefore, was revised to combine the old steps 2 and 3 into one step, leaving a 5- rather than 6-level item.

E29 is a relatively difficult item (difficulty measure 62) that is used to assess supported standing, an activity therapists believed could not be removed from the test because virtually all infants are placed in supported standing by their parents (91% in the study of ecologic relevance of TIMP Elicited Items), making it important in daily life interactions. E29's point biserial correlation of .58 also favored keeping the item but revising it to better reflect what infants actually did during item administration. To accomplish this task, videoclips of the standing item from 55 infant observations were reviewed by an experienced tester. She proposed changes to the item that reflected the infant performances caught on videotape and the item descriptions were revised.

Finally, item E11 with a poor outfit was not removed because another item used to test the same activity on the opposite side of the body (E12) had excellent fit statistics and E11's point biserial correlation of .59 favored keeping the item. Items that would be capable of reflecting asymmetry of performance on the two sides of the body were believed to be important items to retain because of their possible diagnostic value.

The resulting V.4 of the TIMP has 42 items: 13 Observed Items and 29 Elicited Items, 6 of which are tests of the same activity on either side of the body. After deleting the 17 selected items from the test, a new BIGSTEPS analysis was run using the 42 items of V.4 (before revision of rating scale descriptions of items E13, E14, and E29). The resulting summary statistics demonstrate that deletion of these items shortens the length of the test but does not significantly change the quality of the measuring device. The mean Rasch measure increased to 57.58 with a SD of 13.05 and mean SE of 2.34 with a SD of .64. Overall infit mean square for persons increased slightly to 1.01 and outfit mean square to 1.14. The person separation index improved to 4.85 with a reliability of .96.

Again, a wide range of item difficulty was demonstrated with an item separation index of 23.79 and reliability of 1.00. The mean item measure (as set for the analysis) was 50.00 with a SD of 13.49, model error of .47 with a SD of .25. Item difficulties ranged from 20 to 87. Point biserial correlations for items ranged from .01 to .81 with 28 of 42 (67%) greater than .50. Infit mean square for items was 1.02 with an outfit of 1.35.

Review of the infit mean square data for each item revealed that 5 items remained with infits greater than 1.2: O9 (1.35), O10 (1.39), E13 (1.63), E14 (1.57), and E29 (1.37), all items previously identified and purposely not removed from the test. As previously mentioned, E13, E14, and E29 have been revised with the expectation that they will perform better in data analyses with a new sample of infants. O9 and O10 will be retained for the reasons previously given and because further reanalysis removing these items did not produce substantial improvement in Rasch fit statistics for the test as a whole. Thirteen items remain with high outfit statistics. Although these could perhaps be deleted without affecting reliability or precision, they are items which assess meaningful activities that therapists wish to have included for the purposes of obtaining a fuller understanding of infant development for use in treatment planning and anticipatory guidance of families regarding their infant's development.

Discussion

Rasch analysis was used to reduce the length of a new assessment of infant posture and movement. The 42-item TIMP has high precision, good fit to the Rasch psychometric model, and good fit to the ability levels of infants tested in the age range for which the test is intended, especially those with low to moderately high levels of ability (measure ability range from 10-85). The lack of a floor effect indicates sufficient room for assessment of low functioning infants, a major purpose of the examination. Harder items would need to be added if precision was desired for 4-month-old infants of the highest ability. This is not the test developers' intent, however, because other tests are available for assessing infants from 3-4 months of age onward while no other quantitative assessment is currently available with precision for assessing infants' functional motor performance under the age of 3 months.

With the elimination of 15 items from the set of Observed Items, many of which involved arm and hand functions, it becomes more obvious that the TIMP is an assessment primarily of gross (large muscle), rather than fine, motor function. It is likely that the misfit of these items reflected the fact that they are part of a different construct than that of posture and movement needed for early functional activities or that it is too early in life to reliably assess hand and arm functions. Unfortunately, removal of the large number of items that were Observed Items did little to reduce the physical demands of the TIMP or the time required for testing because Observed Items do not involve handling. Reducing the number of Observed Items from 28 to 13, however, does reduce the attentional demand of the tester because these items must be observed for throughout the examination. In addition, one position change was deleted and removal of two Elicited Items contributes slightly to reducing the time required for testing. Future research will assess the feasibility of using a smaller set of the best items as a screening test and for assessment of the most fragile infants.

Because we are aware of no other test for young infants that was developed using Rasch analytic methods, we have no other comparable data to evaluate relative to our results. In general, however, other tests for newborns and premature infants have a small range of age, are not interval-scaled, and have not been studied with longitudinal assessment of infants to document a linear relationship between age or ability and test scores. Neither do they provide the capability to scale individual items to identify difficulty level of steps involved in learning various tasks and how they should appear in the sequence of early development of prematurely born infants.

Scales developed with Rasch methods are available for older children with disabilities (Coster, Ludlow, and Mancini, 1999), including the Gross Motor Function Measure (GMFM). A recent report of the use of Rasch analysis to reduce the number of items in the GMFM showed that a 66-item test was as sensitive to change with age in children with cerebral palsy as the earlier 88-item test, but the criteria used to delete items were not reported (Russell, Avery, Rosenbaum, Raina, Walter, and Palisano, 2000).

Along with evidence that the TIMP discriminates among infants based on early medical complications (Campbell and Hedeker, 2001), predicts development at 12 months from TIMP assessment at 3 months (Campbell, Kolobe, Wright, and Linacre, in press), and has ecologic validity for assessing activities that are important in daily life (Murney and Campbell, 1998), the evidence provided by Rasch analysis provides support for the validity of the TIMP for use in clinical practice and research to measure the development of motor skills in early infancy. The next stage in development of the TIMP will use V.4 for a re-assessment of the test's scaling properties in a population-based sample of 1200 infants selected to match the race/ethnicity distribution of the population of low-birth-weight infants in the U.S. Use of data from this cross-sectional sample in a new Rasch analysis will alleviate the limitations of the current data obtained from repeated assessment of about two-thirds of the infants. Data from the national sample will also be used to establish age standards for performance of infants in 14 different age groups in support of its use as a diagnostic measure of infant motor development.

Acknowledgments

Research was conducted at the University of Illinois at Chicago Medical Center, the University of Chicago Hospitals, and Lutheran General Hospital, Park Ridge, IL, with funding (R01 HD32567) from the National Center for Medical Rehabilitation Research of the National Institute of Child Health and Human Development, USPHS. We are grateful to the physicians at each institution, Nagamani Beligere, Jaideep Sing[...], and David Sheftel, for providing access to their patients. The authors would like to thank Dolores Schorr, Pat Byme-Bowens, Dawn Kuerschne[...], Carrie Ryan, and Kathy Tolzien for assistance in recruiting subjects; Beth Osten, Kathy Kamm, Pai-jun Mao Liao, Vinod Subramonian, and Srira[...] Balachandran for assistance with data management; and Betty Brame[...], Mary Carter, Judy Hegel, LouAnn Gouker, Thubi Kolobe, Maureen Lenke, Pai-jun Mao Liao, Gail Liberg, Mary Murney, Beth Osten, Celina Wis[...]Martin, and Laura Zawacki for testing of infants. Without the participation of infants and their parents who believed that our research would help other families, this work would not have been possible.

References

Campbell, S. K., and Hedeker, D. (2001). Validity of the Test of Infant Motor Performance for discriminating among infants with varying risk for poor motor outcome. Journal of Pediatrics, 139, 546-551.
Campbell, S. K., Kolobe, T. H. A., Osten, E. T., Lenke, M., and Girolami, G. L. (1995). Construct validity of the Test of Infant Motor Performance. Physical Therapy, 75, 585-596.
Campbell, S. K., Kolobe, T. H. A., Wright, B. D., and Linacre, J. M. (In press). Predictive validity of the Test of Infant Motor Performance with the Alberta Infant Motor Scale. Developmental Medicine and Child Neurology.
Campbell, S. K., Osten, E. T., Kolobe, T. H. A., and Fisher, A. G. (1993). Development of the Test of Infant Motor Performance. Physical Medicine and Rehabilitation Clinics of North America, 4(3), 541-550.
Coster, W., Ludlow, L., and Mancini, M. (1999). Using IRT variable maps to enrich understanding of rehabilitation data. Journal of Outcome Measurement, 3, 123-133.
Hambleton, R. K. (2000). Emergence of item response modeling in instrument development and data analysis. Medical Care, 38 (9, Suppl. II), 60-65.
Fanaroff, A. A., Wright, L. L., Stevenson, D. K., Shankaran, S., Donovan, E. F., Ehrenkrans, R. A., Younes, N., Korones, S. B., Stoll, B. J., Tyson, J. E., Bauer, C. R., Oh, W., Lemons, J. A., Papile, L. A., and Verter, J. (1995). Very-low-birth-weight outcomes of the National Institute of Child Health and Human Development Neonatal Research Network, May 1991 through December 1992. American Journal of Obstetrics and Gynecology, 173, 1423-1431.
Girolami, G., and Campbell, S. K. (1994). Efficacy of a Neuro-Developmental Treatment program to improve motor control of preterm infants. Pediatric Physical Therapy, 6, 175-184.
Linacre, J. M. (1988). FACETS. Computer Program for Many-faceted Rasch Measurement. Chicago: MESA Press.
Majnemer, A., Riley, P., Shevell, M., Birnbaum, R., Greenstone, H., and Coates, A. L. (2000). Severe bronchopulmonary dysplasia increases risk for later neurological and motor sequelae in preterm survivors. Developmental Medicine and Child Neurology, 42, 53-60.
McCormick, M. C., McCarton, C., Tonascia, J., and Brooks-Gunn, J. (1993). Early educational intervention for very low birth weight infants: Results from the Infant Health and Development Program. Journal of Pediatrics, 123, 527-533.
Murney, M. E., and Campbell, S. K. (1998). The ecological relevance of the Test of Infant Motor Performance Elicited Scale items. Physical Therapy, 78, 479-489.
Pinto-Martin, J. A., Whitaker, A. H., Feldman, J. F., Van Rossem, R., and Paneth, N. (2000). Relation of cranial ultrasound abnormalities in low-birthweight infants to motor or cognitive performance at ages 2, 6, and 9 years. Developmental Medicine and Child Neurology, 41, 826-833.
Russell, D. J., Avery, L. M., Rosenbaum, P. L., Parminder, S. R., Walter, S. D., and Palisano, R. J. (2000). Improved scaling of the Gross Motor Function Measures for children with cerebral palsy: evidence of reliability and validity. Physical Therapy, 80, 873-885.
Wildin, S. R., Smith, K. E., Anderson, A. E., Swank, P. R., Denson, S. E., and Landry, S. H. (1997). Prediction of developmental patterns through 40 months from 6- and 12-month neurologic examinations in very low birth weight infants. Developmental and Behavioral Pediatrics, 18, 215-221.
Windmeijer, F. A. G. (1990). The asymptotic distribution of the sum of weighted squared residuals in binary choice models. Statistica Neerlandica, 44(2), 69-78.
Wright, B. D., and Linacre, J. M. (1996). A User's Guide to BIGSTEPS. Chicago: MESA Press.
Wright, B. D., and Masters, G. N. (1982). Rating Scale Analysis. Chicago: MESA Press.

JOURNAL OF APPLIED MEASUREMENT, 3(2), 205-231
Copyright 2002

Detecting and Evaluating the Impact of Multidimensionality
using Item Fit Statistics and Principal Component Analysis of Residuals

Everett V. Smith, Jr.
The University of Illinois at Chicago

The purpose of this research is twofold. First is to extend the
work of Smith (1992, 1996) and Smith and Miao (1991, 1994) in
comparing item fit statistics and principal component analysis as
tools for assessing the unidimensionality requirement of Rasch
models. Second is to demonstrate methods to explore how violations
of the unidimensionality requirement influence person measurement.
For the first study, rating scale data were simulated to represent
varying degrees of multidimensionality and the proportion of items
contributing to each component. The second study used responses to a
24 item Attention Deficit Hyperactivity Disorder scale obtained from
317 college undergraduates. The simulation study reveals both an
iterative item fit approach and principal component analysis of
standardized residuals are effective in detecting items simulated
to contribute to multidimensionality. The methods presented in Study
2 demonstrate the potential impact of multidimensionality on norm
and criterion-reference person measure interpretations. The results
provide researchers with quantitative information to help assist
with the qualitative judgment as to whether the impact of
multidimensionality is severe enough to warrant removing items from
the analysis.
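
As an illustration of the principal component analysis of standardized residuals named in the abstract, the following is a minimal sketch and not the author's procedure. It assumes a dichotomous Rasch model rather than the rating scale model used in the study, and it uses the generating person and item values in place of estimated measures. All names (`rasch_probability`, `standardized_residuals`, `residual_pca`, `off_dim`) and all simulation settings are hypothetical choices made for the example.

```python
import numpy as np

def rasch_probability(theta, b):
    """P(X = 1) under the dichotomous Rasch model (persons in rows, items in columns)."""
    return 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))

def standardized_residuals(x, theta, b):
    """(observed - expected) / model standard deviation, cell by cell."""
    p = rasch_probability(theta, b)
    return (x - p) / np.sqrt(p * (1.0 - p))

def residual_pca(z):
    """Eigenvalues (descending) and eigenvectors of the inter-item residual correlations."""
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

# Simulated data (assumption): 12 items, the last 4 driven by a second,
# uncorrelated person dimension.
rng = np.random.default_rng(0)
n_persons, n_items = 2000, 12
theta1 = rng.normal(size=n_persons)
theta2 = rng.normal(size=n_persons)
b = np.linspace(-1.5, 1.5, n_items)
off_dim = np.arange(n_items) >= 8
ability = np.where(off_dim[None, :], theta2[:, None], theta1[:, None])
p_true = 1.0 / (1.0 + np.exp(-(ability - b[None, :])))
x = (rng.random(p_true.shape) < p_true).astype(float)

# Residuals are computed as if every item measured theta1 (the
# unidimensional assumption); generating values stand in for estimates.
z = standardized_residuals(x, theta1, b)
eigvals, eigvecs = residual_pca(z)
first_loadings = eigvecs[:, 0] * np.sqrt(eigvals[0])

print("largest residual eigenvalue:", round(float(eigvals[0]), 2))
print("items with largest |loading|:", sorted(np.argsort(np.abs(first_loadings))[-4:].tolist()))
```

In this sketch, items that share variance the model does not explain show up with large loadings on the first residual component. In practice the person and item measures would come from a Rasch calibration of the observed responses, not from known generating values.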
Requests for reprints should be sent to Everett Smith,
University of Illinois at Chicago, 1040 West Harrison Street, M/C
147, Chicago, IL, 60607, e-mail [email protected].