COGNITIVE PSYCHOLOGY 20, 38-64 (1988)
Surface versus Edge-Based Determinants of Visual Recognition
IRVING BIEDERMAN AND GINNY JU
State University of New York at Buffalo
Two roles hypothesized for surface characteristics, such as
color, brightness, and texture, in object recognition are that such
information can (a) define the gradients needed for a 2½-D sketch
so that a 3-D representation can be derived (e.g., Marr &
Nishihara, 1978) and (b) provide additional distinctive features
for accessing memory. In a series of five experiments, subjects
either named or verified (against a target name) brief (50-100
ms) presentations of slides of common objects. Each object was
shown in two versions: professionally photographed in full color or
as a simplified line drawing showing only the object’s major
components (which typically corresponded to its parts). Although one
or the other type of picture would be slightly favored in a
particular condition of exposure (duration or masking), overall mean
reaction times and error rates were virtually identical for the
two types of stimuli. These results support a view that edge-based
representations mediate real-time object recognition in contrast to
surface gradient or multiple cue representations. A previously
unexplored distinction of color diagnosticity allowed us to
determine whether color (and brightness) was employed as an
additional feature in accessing memory for those objects or
conditions where there might have been an advantage for the color
slides. For some objects, e.g., banana, fork, fish, and camera,
color is diagnostic as to the object’s classification. For other
objects, e.g., chair, pen, mitten, and bicycle pump, color is not
diagnostic, as such objects can be of any color. If color was
employed in accessing memory, color-diagnostic objects should have
shown a relative advantage when presented as color slides
compared to the line drawing versions of the same objects. Also,
this advantage would be magnified when subjects could anticipate
the color of an object in the verification task, particularly on NO
trials when the foil was of a different color. Neither an overall
advantage for color-diagnostic objects when presented in color nor
a magnification of a relative advantage on the NO trials in the
verification task was obtained. Although differences in surface
characteristics such as color, brightness, and texture can be
instrumental in defining edges and can provide cues for visual
search, they play only a secondary role in the real-time
recognition of an intact object when its edges can be readily
extracted. © 1988 Academic Press, Inc.
This research was supported by research Grants F492083C0086 and
86-0106 from the Air Force Office of Scientific Research. We thank
John Clapper for designing and constructing many of the line
drawings; Melinda Boa for running subjects and preparing stimuli in
a pilot study; Mary Lloyd for her design, construction, and
photography of the mask; Thomas Blickle and Deborah A. Gagnon for
their helpful suggestions; and Fred Kwiesien of the SUNY/Buffalo
Educational Communications Department for his expert photographic
work. Professors Julian Hochberg and Irvin Rock made helpful
comments on an earlier version of this manuscript. Correspondence
should be addressed to Irving Biederman, who is now at the
Department of Psychology, University of Minnesota, Elliott Hall, 75
East River Road, Minneapolis, MN 55455.
0010-0285/88 $7.50 Copyright © 1988 by Academic Press, Inc.
All rights of reproduction in any form reserved.
OBJECT RECOGNITION 39
This investigation compared the latency at which objects could
be identified as members of basic level categories when they were
depicted either as line drawings or by color photography. We
presented pictures of common objects, such as a chair or a fork, at
brief durations and measured the speed and accuracy at which
subjects were able to name or verify them. To our knowledge, there
are no adequate experiments comparing the latency of
identification of photographed objects to line drawing depictions
of the same objects. An oft-cited study by Ryan and Schwartz (1956)
will be considered under Discussion.
Theoretical Significance
We are concerned with two issues of object perception relevant
to the comparison between color photography and line drawings. One
is with the role of surface gradients, such as those from
variations in brightness, texture, and color, in defining the
physical description of the stimulus. In Marr and Nishihara’s
(1978) view, surface gradients were central to both the
establishment of a primal sketch and the construction of the 2½-D
sketch in which the depth and orientation of local patches of
surface were represented. Although not as explicit about the course
of recognition, Gibson (1966) expressed a similar view as to the
importance of surface gradients in shape definition.
In contrast are those accounts which emphasize the sufficiency
of an edge-based (or contour) representation of an object (e.g.,
Ullman, 1984; Biederman, 1987; Witkin & Tenenbaum, 1983). One
recent account, recognition by components (RBC) (Biederman,
1987), assumes that the image is segmented at regions of sharp
concavity into simple, convex volumetric primitives such as blocks,
cones, wedges, and cylinders. Objects are represented as an
arrangement of these components. This representation can be
completely specified by a line drawing. For example, one perceives
the curvature of the bowl of the pipe or barrel of the hair dryer
in Fig. 1 even in the absence of characteristic variations in the
surface intensity map over those surfaces.
The present edge-based account should not be interpreted as
suggesting that the perception of surface characteristics per se is
delayed relative to the perception of the components. In both
edge-based and gradient accounts of object representation, sharp
changes in surface attributes provide the edges for the
edge-based description. The empirical issue is whether the presence
of the gradients facilitates the determination of an object’s 3-D
structure (or whatever representation is used for matching to
memory) over what can be derived solely by depiction of an object’s
edges.
The second issue concerns the nature of the representation that
determines the initial activation of the representation of an object
in memory. It may be that the recognition of visual entities is
based on multiple cues, in which both contour and surface
information provide simultaneous routes to recognition (e.g.,
Bruner, 1957; Gibson, 1969). In addition to their characteristic
edges, rolling pins tend to be made of lightly colored wood. By
this perspective, surface information functions just as any other
cue for basic level categorization. Under the right conditions
(viz., overlap and independence of the latency distributions for
the processing of the cues, sufficient variability of the
distributions, response initiation once sufficient information is
available), redundancy gains in identification latency might be
expected and naming reaction times (RTs) should be shorter for
objects depicted by color photography (cf. Biederman &
Checkosky, 1970).
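The statistical logic behind the expected redundancy gain can be illustrated with a small simulation. The sketch below assumes a simple independent race between a contour route and a surface route, with the response initiated by whichever route finishes first. The function name and the normal latency distributions (means and SDs in ms) are hypothetical choices made only for illustration; they are not estimates from the experiments.

```python
import random

def mean_race_latency(n_trials, edge=(600.0, 80.0), surface=(620.0, 80.0), seed=0):
    """Return (mean edge-only latency, mean two-cue race latency) in ms.

    `edge` and `surface` are hypothetical (mean, SD) pairs for the time
    each cue alone needs to activate a category; they are illustrative
    assumptions, not values from the experiments.
    """
    rng = random.Random(seed)
    edge_total = 0.0
    race_total = 0.0
    for _ in range(n_trials):
        t_edge = rng.gauss(*edge)        # latency of the contour route
        t_surface = rng.gauss(*surface)  # latency of the surface route
        edge_total += t_edge
        # With both cues present, the faster route triggers the response.
        race_total += min(t_edge, t_surface)
    return edge_total / n_trials, race_total / n_trials

edge_mean, race_mean = mean_race_latency(20_000)
print(f"edge cue alone: {edge_mean:.0f} ms; both cues (race): {race_mean:.0f} ms")
```

Under these assumptions the two-cue mean is lower than the edge-only mean. The gain depends on the overlap condition named above: if the surface distribution is moved far to the right, e.g. `surface=(1200.0, 20.0)`, the surface route almost never wins the race and the gain effectively vanishes.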
The alternative, edge-based account assumes that surface cues
are generally less efficient routes for accessing the memorial
representation of an object category and are primarily used as
secondary routes for recognition. By this account, we may know
that an image of a chair has a particular color, brightness, and
texture simultaneously with its edge-based description, but it is
only the edge-based description that provides efficient access to
the mental representation of CHAIR.
The present effort was directed toward an account of the
determinants of the first contact between a single, isolated,
undegraded, unanticipated object and a representation in memory.
This first contact is termed primal access (Biederman, 1987).
Often, but not always, this initial categorization will be at a
basic level (Rosch, Mervis, Gray, Johnson, & Boyes-Braem,
1976), for example, when we know that a given object is a
typewriter, banana, or giraffe. Much of our knowledge about objects
is organized at this level of categorization, the level at which
there is typically some readily available name to designate that
category (Rosch et al., 1976).¹ We take the naming to be an
indicant of the achievement of a basic level categorization of the
image.
¹ RBC holds that it is a structural description of the largest
components in their specific arrangement that controls recognition.
When exemplars of a basic level category have the same structural
description of a category prototype, then classification will
appear to be made at the basic level. An Asian elephant and an
African elephant would first be classified as ELEPHANT. However,
nonprototypical exemplars (defined as those with a different
structural description than the prototype) will be initially
classified at the subordinate level. So we might know that a given
object is a floor lamp, sports car, or dachshund more rapidly than
we know that it is a LAMP, CAR, or DOG (cf., Jolicoeur, Gluck,
& Kosslyn, 1984).
Experimental Strategy
Comparing line drawings to color photography presents something of
an apples and oranges problem in that one is faced with
different specifications for photography and drawings. Our pilot
work established that by varying the quality of the photography or
the drawings we could readily confer an advantage in recognition
speed for one or the other kind of image. We attempted to optimize
the perceptibility of both kinds of stimuli but bent over backwards
to select parameters of photography and exposure that favored the
color slides. We also engaged a professional photographer (after we
tried it ourselves). The line drawings were done by students in the
lab and were subject to the constraint that they be readily
generated from the 36 simple convex components assumed by Biederman
(1987).
Fortunately, a previously unexplored color-diagnosticity
distinction among objects allowed us to determine whether color and
brightness (but not texture) were providing a contribution to
primal access independent of the main effect of photos vs drawings.
For some kinds of objects, such as bananas, forks, fishes, or
cameras, color (and brightness) is diagnostic as to the object’s
identity. For other kinds, such as chairs, pens, blowdryers, or
mittens, color is not diagnostic. The detection of a yellow region
might facilitate the perception of a banana but the detection
of the color of a chair is unlikely to facilitate its
identification, as chairs can be any color. If color was
contributing to primal access, then we should find that the former
kinds of objects, for which color is diagnostic, should benefit
more than the nondiagnostic objects by their depiction as color
photographs rather than as line drawings.
We studied the recognition of objects with two kinds of tasks.
With the naming task (Experiments I, II, and III), subjects saw a
slide of an object, one made either by color photography or from
a line drawing, and had to name it. With the verification tasks
(Experiments IV and V), subjects had to verify the name of an
object by pressing either a YES or a NO microswitch. For example,
when given the target MUSHROOM, subjects would respond YES if a
picture of a mushroom was actually shown, otherwise a NO response
was made. In this task, subjects could anticipate the texture and
details of almost all of the targets, and for the diagnostic
objects, the color as well. Consequently, if subjects were using
the color or texture to access an object category, then benefits
for objects when photographed in color or those that were
diagnostic should be increased relative to the naming tasks.
The initial experiments and observations indicated that longer
exposure durations and slightly dimmer projector intensities
would favor the color photography, so experiments were included
that allowed determination of the extent to which the relative
improvement for the color slides could be attributed to the
employment of diagnostic color.
METHOD
Stimuli
Color photography. Twenty-nine objects were photographed by a
professional photographer against a homogeneous white background.
(Thirty objects were used in the actual experiments but one, a
screwdriver, was eliminated after it became apparent that its tip
was so small and far removed from central fixation that attempts at
its identification yielded RTs and error rates that were more than
twice those of any other object. It was only included as a
distractor object in Experiments IV and V.) The f-stop (aperture
setting) was determined by photographing a sample of the
experimental objects with a range of seven half stops centered in
a region where, by the photographer’s judgment, the best rendition
of the object would be obtained. A panel of three judges then
decided on the particular aperture setting, from the seven, where
four functions would be satisfied: (a) the colors would appear most
representative of the photographed objects, viz., not “washed out”
or too dark; (b) there would be high contrast of the object against
its background; (c) there would be high contrast among the object’s
parts; and (d) the objects “looked best.” Differences among the
middle three f-stop values were judged to be small and all three
were judged to be of high quality. The panel’s final choice for an
f-stop matched the photographer’s.
Line drawings. The line drawings were done by pen and ink, black
on white, and were of the same object in the same orientation as
those in the photograph. The drawings were designed only to reveal
the object’s major components (Biederman, 1987). Small details,
texture, shading, shadowing, and minor departures from asymmetry of
parts were omitted. Some sample line drawing stimuli are shown in
Fig. 1. Figure 2 shows three examples of the photography (here
printed in grey scale only). The experimental stimuli thus
consisted of 58 slides, half by color photography, half from
drawings.
FIG. 1. Sample line drawings for six of the experimental
objects. In the verification tasks (Experiments IV and V), if the
object on the left was a target, the center object would be a
similar distractor and the object on the right a dissimilar
distractor.
FIG. 2. Photographic examples (showing grey scale only) for
three of the objects. The line drawings for the telephone and pipe
are shown in FIG. 1. The experimental slides were of considerably
higher contrast and clarity than illustrated in this figure.
Two judges rated both types of images with respect to the
overall “quality” of the representation, with 1 being
“EXCELLENT,” and 2, 3, and 4 being “VERY GOOD,” “GOOD,” and “POOR,”
respectively. The ratings were based on a combination of (a) the
degree of prototypicality of the particular instance for the basic
level category (e.g., how good an instance of the class of
blowdryers that particular blowdryer might be) and (b) the adequacy
of the viewpoint (and depiction) for conveying the shape of the
object. The judges also rated the color slides on the contrast of
the bounding contour with the background and the contrast among the
object’s components. The ratings are shown in Table 1. Because all
the images were of fairly high quality, judges were encouraged to
use the scale relatively, so that the full range would be used.
The low to modest inter-rater reliability coefficients were
likely consequences of the restricted range because of the uniform
(and high) quality of both the line drawings and color slides. A
slight (nonsignificant) difference between scores on the quality
judgments favored the color slides. Differences between quality and
contrast ratings for diagnostic and nondiagnostic slides were slight
and not significant.

TABLE 1
Ratings of Quality and Contrast as a Function of Diagnosticity of Stimuli

                                    Rating
                      Quality           Contrast (color slides)
Measure               LD      Color     Background   Components
Reliability (r)       .42*    .51*      .34*         .18
Diagnosticity
  Diagnostic          1.98    1.25      1.71         1.77
  Nondiagnostic       1.78    1.54      1.83         1.91

Note. Ratings were on a four-point scale with one being
“excellent” and four being “poor,” relative to the set of objects
in the experiment. Reliability values are means for the interrater
r’s among three judges. LD, Line drawings; color, color
photography. Background contrast is the judged contrast of an
object’s major components against the background; component
contrast is the contrast among the object’s components. Contrast
ratings were only taken of the color slides.
* Significant at .01 level.

The slides were projected on a screen 2.59 m from the subject
and the projected borders of the slide frame subtended an angle of
5°25' horizontally and 8°54' vertically. Sizes were specified by
measuring the smallest rectangle (of any orientation) that would
completely enclose each object’s image. For the line drawings, the
mean length of these rectangles was 5°19' (SD = 1°55') and mean
width was 3°13' (SD = 1°33'). For the color slides, the mean length
was 4°53' (SD = 1°48') and the mean width was 2°54' (SD = 1°18').
The slight variation in size was not correlated with either RTs or
error rates.
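For readers who want the stimulus sizes as physical extents on the screen, the standard visual-angle relation, extent = 2 d tan(θ/2), can be applied to the reported viewing distance. The sketch below assumes the sizes above are visual angles in degrees and minutes of arc; the function name is illustrative only.

```python
import math

VIEWING_DISTANCE_M = 2.59  # screen distance reported in the text

def extent_from_angle(degrees, minutes=0.0, distance_m=VIEWING_DISTANCE_M):
    """Physical extent (m) on the screen that subtends the given visual angle."""
    angle_rad = math.radians(degrees + minutes / 60.0)
    return 2.0 * distance_m * math.tan(angle_rad / 2.0)

# Mean length of the rectangles enclosing the line drawings,
# read here as 5 degrees 19 minutes of arc.
print(f"{extent_from_angle(5, 19):.3f} m")
```

At this distance a visual angle of about 5° corresponds to roughly a quarter of a meter on the screen.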
Design and Procedure
Each presentation was immediately followed by a mask in all
experiments except III and V. The mask was a random-appearing
collage of shapes of varied textures (e.g., papers, wood, metals,
fabrics, wires) and color.
Five experiments were run. In all experiments, subjects viewed
all 58 slides, equally distributed in random-appearing order
between those made from line drawings and those made from color
photography. Half of the objects were shown first as color slides;
the other objects were shown first as line drawings. Slides were
shown for 50-, 65-, and 100-ms durations that our previous work
had indicated would provide a broad range of performance. By
reversing the sequence of slides, the order of slides over subjects
was balanced so that each slide had the same mean serial position
(29.5) and an initial appearance as a line drawing or colored
photograph. Each slide also appeared at each exposure duration an
equal number of times. Each experiment thus had a 2 (photograph or
drawing) × 3 (exposure duration) design. Any one subject could have
only one-sixth of the possible combinations of variables. To
perform a quasi-F analysis using objects as a random variable, the
data from subgroups of six subjects were combined to produce a full
balanced design. There were five such subgroups in each of the
first three experiments. Experiments IV and V included a
between-groups main effect (similarity of distractors) with eight
balanced subgroups (of six subjects each) in each of the
experimental groups.
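The arithmetic behind the balanced serial position can be checked directly. The sketch below assumes only what the text states, namely that half of the subjects saw the 58 slides in one order and the other half in the reverse order; the variable names are illustrative.

```python
N_SLIDES = 58  # 29 objects x 2 image types

forward = list(range(1, N_SLIDES + 1))  # serial positions in one order
backward = forward[::-1]                # positions when the sequence is reversed

# Slide k sits at position forward[k] for half of the subjects and at
# backward[k] for the other half, so its mean position is their average.
mean_positions = [(f + b) / 2 for f, b in zip(forward, backward)]
assert all(p == 29.5 for p in mean_positions)

# 2 image types x 3 exposure durations give the six cells of the design.
cells = [(image, ms) for image in ("photograph", "drawing") for ms in (50, 65, 100)]
print(len(cells), mean_positions[0])  # prints "6 29.5"
```

Because position k and position 59 - k always average to 29.5, reversal alone equates mean serial position across slides, which matches the subgroups of six subjects needed to fill the six design cells.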
The subjects were fully familiarized with the task and read
through a list of names of the experimental and practice objects.
[Other research (Biederman, 1987) indicated that there was
virtually no effect of the name familiarization procedure.] In
addition to the experi- mental slides, subjects had approximately
20 practice trials and two “buffer” slides for warm up before the
experimental trials. The objects used in these practice and warm-up
trials were not part of the experimental set.
The subject would initiate each trial by pressing a key on the
terminal. In Experiments I, II, and III subjects named the stimuli
into a voice key. In these experiments errors were recorded (by a
press of a microswitch) by the experimenter. Immediately after each
trial, error and RT feedback were provided on the subject’s
terminal in all experiments. RTs over 3 s were recorded as errors
but this criterion was rarely invoked (less than 10 times over all
the experiments).
Verification Task
In Experiments IV and V, subjects performed a verification task. A
target name, e.g., LAMP, was presented on the terminal. Subjects
were to press a YES microswitch if the slide matched the target; a
NO microswitch if it did not. The similarity of the distractors (on
the NO trials), defined in terms of the silhouette (bounding
contour) of the object, was varied, as illustrated in Fig. 1. Half
of the subjects in the experiments had distractors judged by the
panel of three judges to be similar to the target. For example, if
the target was FLASHLIGHT, a similar distractor was a rolling pin.
For the other half of the subjects, the distractors were dissimilar
to the object; camera was the dissimilar distractor for FLASHLIGHT.
Table 2 shows the targets and their similar and dissimilar
distractors along with their color-diagnosticity designations.
Exposure Variations
The experiments differed in the conditions of exposure and masking
and number (N) of subjects, as follows:
I. High intensity, mask. Naming task. N = 30.
II. Low intensity, mask. Naming task. N = 30.
III. Low intensity, no mask. Naming task. N = 30.
IV. Low intensity, mask. Verification task. N = 96.
V. Low intensity, no mask. Verification task. N = 96.
The intensity parameter refers to the setting on the Kodak
carousel projector (Ektagraphic Model B-2). The lowered intensity
appeared to slightly enhance the appearance of some of the objects.
The high- and low-intensity settings produced a background
luminance of the line drawings of approximately 70 and 55 cd/m²,
respectively. The corresponding values for the color slides were
approximately 56 and 44 cd/m². We omitted the mask entirely in
Experiments III and V because we wanted to explore RT differences
when error rates were minimal and the lower contrast of the color
slides was less of a potential disadvantage.
RESULTS
Over the five experiments, mean RTs and error rates for naming or
verifying line drawings were virtually identical to those
measures of performance for color slides, as shown in Table 3.
Eight of the 10 F′ ratios (five experiments, two response measures
each) for image type (photography vs line drawing) were near or
less than 1.00. Only in Experiment III was a significant image type
advantage (for RTs) of the color slides obtained [and this was not
replicated in the other experiment (V) where a mask was not used].
In all experiments, near errorless performance was possible from a
100-ms exposure.

TABLE 2
Diagnosticity and Similar and Dissimilar Distractors for the Verification Task

                                      Distractor
Target                   Similar              Dissimilar
Nail (D)                 Pen                  Apple
Whistle (D)              Pencil sharpener     Cane
Mushroom (D)             Apple                Pen
Lock (D)                 Chair                Drill
Pen (ND)                 Nail                 Mushroom
Fork (D)                 Pipe                 Telephone
Knife (D)                Fish                 Chair
Pipe (D)                 Fork                 Iron
Apple (D?)               Mushroom             Nail
Banana (D)               Stapler              Mitten
Fish (D)                 Knife                Flower pot
Flowerpot (D)            Telephone            Fish
Scissors (D)             Screwdriver          Pencil sharpener
Mitten (ND)              Tea kettle           Banana
Stapler (ND)             Banana               Tire pump
Rolling pin (D)          Flashlight           Blowdryer
Flashlight (ND)          Rolling pin          Camera
Pencil sharpener (D)     Whistle              Scissors
Camera (D)               Briefcase            Flashlight
Iron (D)                 Pot                  Pipe
Blowdryer (ND)           Drill                Rolling pin
Drill (ND)               Blowdryer            Lock
Pot (D)                  Iron                 Briefcase
Tea kettle (ND)          Mitten               Screwdriver
Telephone (ND?)          Flower pot           Fork
Cane (ND)                Tire pump            Whistle
Tire pump (ND)           Cane                 Stapler
Briefcase (ND?)          Camera               Pot
Chair (ND)               Lock                 Knife

Note. D, Color diagnostic to object’s identity; ND, color
nondiagnostic to object’s identity. Objects with question marks
had variable judgments. Reassignment of such objects did not affect
results.

TABLE 3
Mean Correct Reaction Times (ms) and Percentage Errors as a Function of
Stimulus Type (Color Slide or Line Drawing) and Experimental Condition

                  Color Slides      Line Drawings
Experiment        RT     Error      RT     Error
I                 916    14.7       903    11.9
II                831    11.4       839     7.3
III               783     1.7       807     2.0
IV Sim-Yes        571    10.1       564     8.9
IV Sim-No         652    13.7       641    13.1
IV Dis-Yes        513    10.4       497    10.4
IV Dis-No         574     9.7       580     8.8
V Sim-Yes         425     4.0       436     6.8
V Sim-No          513     8.7       495     7.2
V Dis-Yes         410     6.6       421     6.2
V Dis-No          455     7.6       460     5.5
Mean              604     9.0       604     8.0

Note. Experiments I, II, and III were naming tasks. Experiments
IV and V were verification tasks. Sim, Similar distractors; Dis,
dissimilar distractors. Experiments III and V were run without a
mask. All experiments but I were run at low intensity. Thirty
subjects participated in each of the naming experiments; 96
subjects participated in each of the verification experiments.
Naming Task
Experiment I. (High Intensity, Mask)
Overall there was a slight nonsignificant advantage for the line
drawings. Figures 3 and 4 show the error rates and RTs as a
function of exposure duration for Experiment I. The slightly lower
RTs (12 ms overall) and error rates (2.9%) for the line drawings
were not significant (both F’s approximately 1.00). All of this
advantage of the line drawings came from the briefest exposure
durations, producing a highly significant image type × duration
interaction for errors; F′(2,15) = 13.04, p < .001. (The comparable
F-ratio for the RTs was less than 1.00.) The overall effects of
exposure duration were significant for both RTs (F′(2,19) = 5.22,
p < .05) and errors (F′(2,21) = 35.17, p < .001).
FIG. 3. Mean percentage naming errors in Experiment I as a
function of exposure duration and image type.
FIG. 4. Mean correct naming reaction times (ms) in Experiment I
as a function of exposure duration and image type.
Experiment II. (Low Intensity, Mask)
Figures 5 and 6 show the error rates and RTs, respectively, for
Experiment II, which was identical to Experiment I except for the
lowered projector intensity. The 4.1% overall advantage in error
rates for the line drawings was significant; F(1,12) = 5.42,
p < .05.
Because all the advantage in errors for the line drawings came
at the briefest (50-ms) exposure duration, the interaction of image
type with duration was highly significant (F′(2,19) = 10.42,
p < .001), as was the main effect of duration (F′(2,14) = 15.74,
p < .001). There was an 8-ms net advantage of the color slides with
RTs (F < 1.00, ns). With single presentations of the stimuli
and F′ statistics, neither the effects of duration nor the image
type × duration interaction, in which the color slide advantage
increased with increasing exposure duration, was significant.
FIG. 5. Mean percentage naming errors in Experiment II as a
function of exposure duration and image type.
FIG. 6. Mean correct naming reaction times (ms) in Experiment II
as a function of exposure duration and image type.
Experiment III. (Low Intensity, No Mask)
As shown in Fig. 7, error rates were so low (1.8% overall) when
the mask was omitted that none of the F’s for errors were
significant. However, the 24-ms RT advantage for the color slides
was significant (Fig. 8); F′(1,9) = 6.09, p < .05. (This color
slide advantage when no mask was presented was not replicated in
Experiment V.) Performance in this experiment was unaffected by
exposure duration. The F′ ratios for duration and the image
type × duration interaction were both less than 1.00.
FIG. 7. Mean percentage naming errors in Experiment III as a
function of exposure duration and image type.
FIG. 8. Mean correct naming reaction times (ms) in Experiment
III as a function of exposure duration and image type.
Verification Tasks
Experiment IV. (Verification Task, Low Intensity, Mask)
The RTs and error rates for the positive and negative trials as
a function of distractor similarity and exposure duration in
Experiment IV are shown in Figs. 9, 10, 11, and 12.
No significant overall effect of image type was found for either
the RTs (7 ms, favoring the color slides) or error rates (.4%
favoring the line drawings); both F’s < 1.00. Distractor
similarity had a sizable effect on the RTs: Latencies in the
similar group were 66 ms longer than those in the dissimilar group;
F(1,15) = 6.05, p < .05. However, this effect was equivalent for
both color photography and line drawings. The F′ for the similarity
× image type interaction was less than 1.00 for both RTs and
errors. Despite significant differences in RTs and error rates
among the objects, p < .001 and .05 for RTs and error rates,
respectively, the similarity × image type × objects F’s were less
than 1.00 for both measures. F′ ratios for the duration and the
duration × image type interactions for both RTs and errors were
close to or less than 1.00. RTs for the YES trials were markedly
lower, by 76 ms, than those for the NO trials (F(1,27) = 31.60,
p < .001), but this variable also did not interact with image
type, similarity, or their interaction: F < 1.00 in all cases.
FIG. 9. Mean percentage verification errors on negative trials
in Experiment IV as a function of exposure duration, distractor
similarity, and image type.
FIG. 10. Mean correct verification reaction times (ms) on
negative trials in Experiment IV as a function of exposure
duration, distractor similarity, and image type.
Experiment V. (Verification Task, Low Intensity, No Mask)
Figures 13, 14, 15, and 16 show the RTs and error rates for the
positive and negative trials as a function of distractor similarity
and exposure duration in Experiment V. Verification performance
was virtually identical for color slides and line drawings; mean
correct RTs were 451 ms (6.7% errors) for the color slides and 453
ms (6.4% errors) for the line drawings (F′ < 1.00 for both
measures). As in Experiment III, with so few errors none of the
major experimental variables had reliable effects on error rates.
As in Experiment IV, latencies in the similar group were longer
than those in the dissimilar group, by 31 ms, although this
between-group difference fell short of significance; F(1,15) =
3.66, .05 < p < .10. Neither similarity nor response type
interacted with image type.
FIG. 11. Mean percentage verification errors on positive trials
in Experiment IV as a function of exposure duration, distractor
similarity, and image type.
FIG. 12. Mean correct verification reaction times (ms) on
positive trials in Experiment IV as a function of exposure
duration, distractor similarity, and image type.
Color Diagnosticity Analysis
The 29 objects were partitioned into two sets according to
whether their color was diagnostic to the object’s identity or not
as indicated in Table 2.² Objects with question marks were those
where there was some uncertainty among the raters about their
diagnostic assignment. The alternate assignment of these objects
produced only a negligible effect on the results.
Table 4 shows the magnitude of the color advantage (line
drawings minus color slides) for both RTs and error rates for the
five experiments as a function of diagnosticity. The nondiagnostic
objects had higher RTs overall, but these were completely
attributable to the presence of three nondiagnostic objects (TIRE
PUMP, PENCIL SHARPENER, and DRILL) that were relatively unfamiliar
or had multisyllable names.
² Inferences concerning diagnosticity are limited to color (and
brightness and lightness) but not to texture. Very few objects are
without a characteristic texture. CHAIR was the only one in the
present experiment.
FIG. 13. Mean percentage verification errors on negative trials
in Experiment V as a function of exposure duration, distractor
similarity, and image type.
FIG. 14. Mean correct verification reaction times (ms) on
negative trials in Experiment V as a function of exposure duration,
distractor similarity, and image type.
To determine whether larger (more positive) color advantages
were associated with increased effect of diagnosticity, point
biserial correlation coefficients were computed between
diagnosticity (where 1 = diagnostic and 0 = nondiagnostic) and the
magnitude of the color advantage for each of the three exposure
durations for the seven experimental conditions. A positive r
would indicate that larger color advantages were associated with
the diagnostic objects. The mean of these 33 correlations (between
diagnosticity and the color advantage) was -.09 for RTs and
-.09 for errors (neither distinguishable from zero), indicating
that within experiments, more diagnostic objects were not
associated with larger color advantages.3 The verification tasks
gave no evidence of a larger color
FIG. 15. Mean percentage verification errors on positive trials
in Experiment V as a function of exposure duration, distractor
similarity, and image type.
3 It will be recalled that the diagnostic objects had higher
quality ratings than the nondiagnostic objects. The point
biserial correlation between the quality ratings and diagnosticity
was .30. Partialing out the effects of quality ratings
rendered the correlations between diagnosticity and the color
advantage slightly more negative: -.10 for RTs and -.11 for
error rates, but still indistinguishable from zero.
FIG. 16. Mean correct verification reaction times (ms) on
positive trials in Experiment V as a function of exposure duration,
distractor similarity, and image type.
advantage or stronger diagnosticity effect than the naming
tasks. For RTs, the mean color advantage for the naming experiments
was 8 ms; for the verification task, -7 ms. It might be argued that
even if one were anticipating color on the verification task, e.g.,
expecting yellow for a BANANA, then even if yellow were detected,
it would still be necessary to determine shape before a response
could be made. But this was only true on the YES trials. On the NO
trials for diagnostic objects, as soon as a mismatched color could
be detected, a response could be initiated without the need to
determine shape. From this account, diagnostic objects would be
expected to enjoy an advantage when presented as color slides on
the NO trials of the verification task. But this did not occur. The
mean color advantage for diagnostic objects on such trials was -9
ms (an advantage for the line drawings) compared to a 3-ms
advantage for the
TABLE 4
Mean Color Advantage for Correct RTs (ms) and Percentage Errors as a
Function of Diagnosticity of the Four Experiments

               Error rate (percentage)      Reaction time (ms)
Experiment     Diagnostic  Nondiagnostic    Diagnostic  Nondiagnostic
I                  .07          .03            -35           12
II                -.05         -.02              8           25
III                .01          .0?             29           15
IV Sim-Yes       -2.56         1.14            -13            1
IV Sim-No         -.52          .38            -26            9
IV Dis-Yes       -1.81         1.91            -46           24
IV Dis-No         -.26         1.52             -1           14
V Sim-Yes         1.54         4.51             15            7
V Sim-No          -.53        -2.78            -19          -12
V Dis-Yes        -3.68         4.17             10           13
V Dis-No         -2.70        -2.03              9            1
Mean              -.95          .79             -6           10
nondiagnostic objects, an ordering opposite what would be
expected from the diagnostic use of color on these trials.4
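The statistics used in this analysis can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function names, the six-object data set, and its RT values are hypothetical and serve only to show the computations: the color advantage (line drawing minus color slide), the point biserial correlation with diagnosticity coded 1 or 0, and the first-order partial correlation of the kind mentioned in footnote 3. The NO-trial values are read from the RT columns of Table 4; the V Sim-No nondiagnostic entry is taken to be -12 ms, which is an assumption about that entry.

```python
from math import sqrt
from statistics import mean, pstdev


def color_advantage(line_drawing, color_slide):
    """Color advantage as defined for Table 4: line drawing minus color slide."""
    return line_drawing - color_slide


def point_biserial(binary, scores):
    """Point biserial r between a 0/1 variable and a continuous variable.

    Numerically identical to a Pearson r computed on the 0/1 coding.
    """
    ones = [s for b, s in zip(binary, scores) if b == 1]
    zeros = [s for b, s in zip(binary, scores) if b == 0]
    p = len(ones) / len(scores)
    return (mean(ones) - mean(zeros)) / pstdev(scores) * sqrt(p * (1 - p))


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical per-object data (NOT from the study): 1 = diagnostic color.
diagnostic = [1, 1, 1, 0, 0, 0]
line_rt = [640, 655, 700, 690, 720, 705]   # invented mean RTs (ms), line drawings
color_rt = [650, 650, 715, 680, 715, 700]  # invented mean RTs (ms), color slides
advantage = [color_advantage(l, c) for l, c in zip(line_rt, color_rt)]
r_pb = point_biserial(diagnostic, advantage)  # negative: no diagnostic benefit

# RT color advantages (ms) on the NO trials, read from Table 4 as
# (diagnostic, nondiagnostic); -12 for V Sim-No is an assumed reading.
no_trials = {
    "IV Sim-No": (-26, 9),
    "IV Dis-No": (-1, 14),
    "V Sim-No": (-19, -12),
    "V Dis-No": (9, 1),
}
mean_diag = mean(d for d, _ in no_trials.values())
mean_nondiag = mean(n for _, n in no_trials.values())
print(round(r_pb, 2), mean_diag, mean_nondiag)
```

Under these assumptions the NO-trial means come out at -9.25 ms for the diagnostic objects and 3 ms for the nondiagnostic objects, in line with the -9 ms and 3 ms reported in the text. A negative point biserial r, as in the invented data, corresponds to a larger color advantage for the nondiagnostic objects.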
DISCUSSION
The absence of a main effect of image type and the lack of any
interaction between diagnosticity and image type is counter to a
conceptualization that would favor surface characteristics as a
route to speeded object recognition.
Especially significant was the failure to obtain evidence of the
use of color (either in a larger color slide advantage or a larger
diagnosticity effect) from the verification task, particularly on
the NO trials, where subjects would have had full opportunity to
anticipate surface characteristics prior to the presentation of
the image and initiate a response if the color was disconfirmed.
Not only was the color advantage smaller in the verification
task, but diagnosticity was even more negatively correlated with the
color advantage than in the naming tasks.
The verification task provided additional support for the view
that images made by color photography were recognized in the same
manner as line drawings. The similarity variation, determined from
the contour of the object, had virtually identical effects for
color slides and line drawings. The same was true for the effects
of response type (YES vs NO) in the verification task.
It is not resolved why the line drawings enjoyed an advantage
over the color slides at the briefest exposure durations in
Experiments I and II and why the color slides enjoyed an advantage
in Experiment III. One possibility was that the lower contrast of
the color slides might have rendered them more susceptible to
masking. However, color slides did not enjoy an advantage when no
mask was used with a verification task in Experiment V. Moreover,
an analysis of the correlations of the contrast ratings with
performance gave no indication that lower contrast slides were more
adversely affected at the brief masked exposures. Additional
experimental work is needed to replicate and explore the
conditions under which one or the other form of depiction shows an
advantage.
Although this experiment did not provide an explicit test of the
primitives (geons) proposed by RBC (Biederman, 1987), the line
drawings were generated from the set of 36 convex volumetrics
proposed by that theory. Consequently, the equivalence of these
line drawings to the color
4 For 14 of the 17 diagnostic objects, the distracters were of a
different color than the target. Data for the three diagnostic
targets (WHISTLE, SCISSORS, and PENCIL SHARPENER) whose distracters
had surface appearances (viz., metallic) similar to those of the
targets were indistinguishable from the 14 diagnostic objects whose
distracters differed in color.
slides provides indirect support for the sufficiency of the
class of edge-based object descriptors such as those assumed by
RBC in accounting for primal access. (The sufficiency of any
edge-based theory that proposed the same contours would also be
supported.)
The Ryan and Schwartz Experiment
The results of the present experiment are also consistent with portions
of the popular construal (though not necessarily the actuality)
of the Ryan and Schwartz (1956) experiment. Ryan and Schwartz did
compare the perceptibility of photography (black and white) against
line and shaded drawings and cartoons. They reported that the
objects depicted by cartoons enjoyed an advantage over
photographs and shaded drawings, which were about equivalent, and
the latter were superior to line drawings, the stimulus types used
in the present experiment. Ryan and Schwartz used only three
objects: a plate with five double-throw electrical knife
switches, a steam valve, and a hand. There were four possible
configurations (switch positions, valve cycles, or finger postures)
for each of these objects and the subject had to report not the
basic level categorization of the object but which one of the four
configurations was being presented for a given object. The subjects
knew which of the three stimulus types was to be presented prior to
its presentation.5 For two of the objects, the switch and the
valve, responses were more accurate when depiction was by
cartoons. But as Tversky and Baratz (1985) noted, these objects
required that fine detail be discriminated against a busy
background. This visual noise was removed in the cartoon versions.
In addition, the switch handles were darkened in the cartoon
versions so that they had higher contrast with the background
contacts, as shown in Fig. 17. Thus subjects needed extraordinarily
long exposure durations, by general perceptual standards, 1133 and
2564 ms for the photo and line drawing versions, respectively, to
determine the configurations of the switch handles shown in Fig.
17. The cartoon version required a presentation duration of 680
ms. Such contrast problems were not an obvious source of difficulty
in determining the configuration of the fingers of the hand, the
stimulus example (Fig. 18) that is most frequently shown in
secondary sources (e.g., Gibson, 1969; Neisser, 1967; Rock, 1984).
Yet the cartoon version of this category did not have lower
thresholds than the photographs.6 That threshold presentation
durations often were
5 Ryan and Schwartz adopted this paradigm and stimuli because
they were exploring not how subjects come to identify an object but
how they are able to perceive its “. . . detailed structure” (p.
61).
6 It is possible that the Ryan and Schwartz experiment would
never have received its widespread recognition had the switch been
presented in secondary sources as a sample
FIG. 17. The double pole-double throw switch stimuli (“Position
1”) from the Ryan and Schwartz (1956) experiment. Subjects had to
report the positions of the four switches. Note the higher contrast
of the handles in the cartoon version (lower right) as compared to
the line drawing.
longer than 1 s, even without a mask, indicates that the studied
processes were not intimately involved in object recognition (but
see footnote 5). They exceeded, by an order of magnitude, the
masked presentation durations required in the present study. Not
only were a number of the absolute threshold durations exceedingly
long; within each stimulus type, some configurations were
dramatically more difficult than others. The photo of the switch
positions in Fig. 17 required a presentation duration of 1133 ms
but the photo for another switch configuration required
experimental stimulus instead of the hand. The cartoon version
of the hand does not appear to be noticeably more identifiable, so
the (misleading) suggestion from secondary sources that it was the
favored form of representation (because the cartoons, overall, were
favored) lent a counter-intuitive flavor to the reported
result.
FIG. 18. The hand stimuli (“Position 1”) from the Ryan and
Schwartz (1956) experiment. The cartoon version (D) had higher
thresholds than the photo (A) or the shaded drawing (B).
less than one-twentieth of that exposure duration: 50 ms!7 These
stimulus sampling, drawing, and procedural specifications render
interpretation of this experiment highly problematical with respect
to conclusions about real-time access to object recognition.
7 Ryan and Schwartz were aware of this item variability problem
but argued that their materials should be regarded as random
samples of the various kinds of photographic and drawn images of
the kind that might be found in instruction manuals. We take no
issue with this claim, but it does not address the problem of why
there was so much variability. Our own experience is that
instructional materials for assembling equipment are more easily
followed when the parts are drawn than when photographed. The major
reason for this drawing advantage, in our opinion, is that
reproduced photographic images typically have insufficient contrast
for determination of the contours of the components. This is a
problem even if the results are to be limited to the study of the
perception of the detailed structure of prespecified objects.
Certainly, cartoons enjoy no general advantage. Tversky and Baratz
(1985) recently reported that photographic images of famous persons
were more rapidly identified in tachistoscopic exposures than
political cartoons of these same persons.
When Do Surface Cues Affect Recognition?
Although the present results support, at best, only a minimal
role of surface cues in speeded recognition of intact, undegraded
objects, there are four cases where a significant contribution from
such cues would be expected. However, in every case recognition
would be expected to require more time than required for the
identification of the kinds of objects studied in the present
investigation.
Mass Nouns
The objects used in the present experiments were the kind that
have characteristic boundaries, as distinct from those objects that
can assume any shape. This distinction between those objects that
do and do not have specifiable boundaries is reflected in the
distinction in our language between count and mass nouns. Count
nouns, as the name implies, are concrete entities that tend to have
specifiable boundaries and to which we can apply number and the
indefinite article. For example, for a count noun such as CHAIR we
can say “a chair” or “three chairs.” Mass nouns, by contrast, are
concrete entities to which the indefinite article or number cannot
be applied, such as sand, water, or snow. So we cannot say “a
water” or “three waters,” unless we refer to a count noun shape as
in “a drop of water,” “a bucket of water,” or “a grain of sand,”
each of which does have a simple volumetric description. We
conjecture that mass nouns are identified primarily through surface
characteristics such as texture and color (and position in a
scene), rather than through contour-based volumetric
primitives.
Compound Texture-Volumetric Objects
There are some count objects that require a texture region in
addition to a volumetric description for a complete representation,
such as hairbrushes, typewriter keyboards, and corkscrews. It is
unlikely that many of the individual bristles, keys, or coils are
parsed and identified prior to the identification of the object.
Instead those regions are represented through the statistical
processing that characterizes their texture (Hochberg, 1984),
although we retain a capacity to zoom down and attend to the
volumetric nature of the individual elements (as we can with any
texture field). The structural description that would serve as a
representation of such objects would presumably include a
statistical specification of the texture region along with a
specification of the larger volumetric components. A recent study
in our laboratory (Biederman & Hilton, 1987) revealed that RTs
and error rates were greater for naming such
compound texture-volumetric objects than for naming control
objects that were closely matched in silhouette but did not require
a texture region. Examples of such pairs were zebra-horse,
broom-spoon, and file-knife. Rather than serve as a redundant cue
which would facilitate RTs, the texture region may function as a
necessary component with a long access time because of the high
spatial frequencies required to determine its structure.
Volumetric Cohorts
Another subclass of objects for which surface characteristics
play a role in their classification can be illustrated with the
pairs peach-plum or leopard-panther. Because these objects have
virtually identical edge (volumetric) descriptions, speeded
recognition will obviously be dependent on surface attributes.
Such subclasses, in which objects with identical contour
descriptors have different basic level classifications, are rare
and because one would have to appeal to surface features,
recognition would be expected to be relatively slow. In general, when two
or more objects have highly similar edge descriptions with respect
to their major components, appeal will have to be made to other
sources of differentiation. Often this appeal is to small details
(and labels), as one is forced to do when attempting to distinguish
a Honda Accord from a Mazda 626. These alternate sources of
differentiation will typically require additional time for their
employment.8
Degraded or Occluded Objects
Under restricted viewing and positional uncertainty conditions,
as when an object is partially occluded or its position in a field
of distracters is unspecified, texture, color, and other cues (such
as position in the scene and labels) may contribute to the
identification of count nouns, as, for example, when we identify a
particular shirt in the laundry pile from just a bit of fabric.
Such identifications are indirect, typically the result of
inference over a limited set of possible objects. That is, we know
it is a shirt because it is the only item in the laundry pile of a
particular color and surface pattern.
The expectation from RBC is that identification latencies for
the various cases listed above will generally be long, relative to
the kinds of objects used in the present investigation. If this
were true, that is, if it took more time to say “plum” or “panther”
to a line drawing of a plum or
8 Recognition of a particular face (rather than the recognition
that some stimulus is a face) might be included in this case. What
needs to be explored is whether the mechanisms for identifying
individual faces enable more rapid recognition than would be
expected from the type and scale of the relevant information.
panther compared to control objects, then these cases would
actually provide support for the primacy of edge-based descriptions
for primal access.9
Indirect Contributions of Surface Attributes to Recognition Performance
The four cases described above are concerned with the role of
surface attributes in the activation of a representation in memory
of an object for recognition. For purposes of completeness,
additional, often critical, roles played by surface attributes
should also be specified. Although the surface gradients may not
directly activate a representation of an object, as noted in the
introduction, it is the sharp changes in surface attributes that
provide the edge information that does control matching. In
addition, surface gradients provide information as to a region’s
curvature, potentially facilitating the determination of a
region’s concavity, convexity, or planarity. But as noted in the
introduction, this information is often redundant with the
edge-based representation. One can determine the curvature of a
cylinder or planarity of a square or volumetric characteristics of
a nonsense object from a line drawing, without the presence of
surface gradients. Perhaps even more striking was that the
functional definition of several of the objects in the present
experiment requires that they have a hollow (concave) component. The
views of the whistle, pipe, flowerpot, and pot all included the
hollow regions for these objects, yet the line drawings did not
have the shadow gradients to represent hollowness. The near overall
equivalence between the color slides, where such gradients were
present, and the line drawings suggests that when edges can be
readily extracted from the input (as with the present line
drawings), the gradients may play only a secondary role.10
Consistent with this result is Witkin and Tenenbaum’s (1983)
demonstration that when an object’s edges and brightness gradients
(from which an object’s curvature can be inferred) are in conflict,
the gradient loses its capacity to convey curvature. However,
under conditions where an object’s full edge description
9 Ostergaard and Davidoff (1985) recently reported that color
pictures of objects were more quickly named than black and white
photographs of those same objects. But the stimuli were limited to
fruits and vegetables, many of which would be expected to show a
color advantage because of the high shape similarity among members
within that class. Although between-experiment comparison can be
tenuous, it should also be noted that naming latencies in the most
comparable condition in the present experiment (no mask), even with
our brief exposure durations, were shorter than those from the
Ostergaard and Davidoff experiment (where the exposure was,
apparently, terminated by the response).
10 Although the surface gradients may not have a major effect on
recognition, it is obvious that they have critical roles in
nonrecognition activities such as locomotion and navigation in the
environment.
is not present in the image, as when the object is partially
occluded, then surface gradients would be expected to play a more
significant role.
Despite the subordinate role played by surface features in the
present experiments, it is nonetheless likely that real-world
search is more often organized around surface features than around
edge descriptions. Given uncertainty about where an object will be
in a field of distracters, surface attributes, viz., color,
texture, and lightness, are more frequently diagnostic to where a
target object might be. Thus if one is searching for a red car in a
parking lot full of cars, it is more efficient to organize search
around color than around a particular contour. Only a small
proportion of the cars will be red but any contour attribute (such
as a curved patch) will likely be found among almost all the
distractors.11 Another benefit of organizing search around
surface attributes is that occlusion or deformation will hide or
alter a particular contour but leave a surface attribute
unaffected. Thus the search for a shirt in a laundry pile is better
made on the basis of color than contour.
Even though surface attributes may not control primal access, it
does not necessarily follow that inconsistent coloring or texture
will produce no interference effect on recognition RTs. It is
possible that the representation of BANANA is only weakly
activated by the presence of yellow in a presented object but that
objects that are typically not yellow, such as forks or telephones,
will be inhibited. The reason for this is that gross features of an
object, such as its overall size, aspect ratio, and surface
characteristics, may index all objects possessing that property by
inhibiting the activation of all objects not possessing that
property. However, if many objects share that property it will
still be necessary to engage in detailed edge processing before
that object could be identified. The gain in inhibiting other
objects may then be modest, if not absent. The issue here may be
whether it will be necessary to assume the same kind of strong
bottom-up inhibition for object perception that McClelland and
11 Simple contour descriptors (e.g., the presence of a curve or
a cylindrical volume) are not an effective basis around which to
organize search because in most cases of real-world search for
objects it is the arrangement of the descriptors that produces the
unique edge descriptions that characterize each object. A
conjunctive search may thus be required among distracters that
contain the same set of primitive contours but in a different
arrangement (Treisman & Gelade, 1980). Consistent with this
possibility is that object search among nonscene displays of
distractor objects shows the same linear effect of the number of
distractors as has been reported for conjunctive search for other
types of stimuli (Biederman, Blickle, Teitelbaum, & Klatsky, in
press). However, the conjunctive limitation does not appear to
limit performance when only a single object is presented. Under
such conditions, complex objects, defined as those requiring a
relatively large number of parts to look complete, are more rapidly
identified than simple objects (Biederman, 1987).
Rumelhart (1981) found necessary to assume for letter
recognition. Some support for potential interference effects of
surface features derives from Bruner and Postman’s (1949) report
that the perception of a red 10 of spades suffered compared to a
black 10 of spades. This problem should be investigated with a
large sample of common objects.
CONCLUSION
The conclusion from these studies is that a simple line drawing
can be identified about as quickly and as accurately as a fully
detailed, textured, colored photographic image of that same object.
No contribution to speeded recognition was apparent when objects
with diagnostic color and lightness were presented by color
photography compared to objects with little or no constraints on
their color. Being able to anticipate an object’s surface features
likewise conferred no beneficial effects for color photography.
These results support the premise that the initial access to a
mental representation of an object can be modeled as a matching of
an edge-based representation of a few simple components. Such an
edge-based description is thus sufficient for primal access.
REFERENCES
Biederman, I. (1987). Recognition-by-components: A theory of
human image understanding. Psychological Review, 94, 115-145.
Biederman, I., Blickle, T. W., Teitelbaum, R. C., & Klatsky,
G. J. Object search in nonscene displays. Journal of Experimental
Psychology: Learning, Memory and Cognition, in press.
Biederman, I., & Checkosky, S. F. (1970). Processing
redundant information. Journal of Experimental Psychology, 83,
486-490.
Biederman, I., & Hilton, H. J. (1987). Recognition of
objects that require texture specification. Unpublished
manuscript, SUNY/Buffalo, NY.
Binford, T. O. (1981). Inferring surfaces from images.
Artificial Intelligence, 17, 205-244.
Bruner, J. S. (1957). Going beyond the information given. In
Contemporary approaches to cognition. Cambridge, MA: Harvard Univ.
Press.
Bruner, J. S., & Postman, L. (1949). On the perception of
incongruity: A paradigm. Journal of Personality, 18, 206-223.
Gibson, E. J. (1969). Principles of Perceptual Learning and
Development. New York: Appleton-Century-Crofts.
Gibson, J. J. (1966). The senses considered as perceptual
systems. Boston: Houghton Mifflin.
Hochberg, J. (1984). Form perception: Experience and
explanations. In P. C. Dodwell and T. Caelli (Eds.), Figural
synthesis. Hillsdale, NJ: Erlbaum.
Jolicoeur, P., Gluck, M. A., & Kosslyn, S. M. (1984). Pictures
and names: Making the connection. Cognitive Psychology, 16,
243-275.
Julesz, B. (1981). Textons, the elements of texture perception,
and their interaction. Nature, 290, 91-97.
Marr, D. (1982). Vision. San Francisco: Freeman.
Marr, D., & Nishihara, H. K. (1978). Representation and
recognition of the spatial organization of three-dimensional
shapes. Proceedings of the Royal Society of London B, 200,
269-294.
McClelland, J. L., & Rumelhart, D. E. (1981). An interactive
activation model of context effects in letter perception. Part I:
An account of basic findings. Psychological Review, 88, 375-407.
Neisser, U. (1967). Cognitive psychology. New York:
Appleton-Century-Crofts.
Ostergaard, A. L., & Davidoff, J. B. (1985). Some effects of
color on naming and recognition of objects. Journal of
Experimental Psychology: Learning, Memory, and Cognition, 11,
579-587.
Rock, I. (1984). Perception. San Francisco: Freeman.
Rosch, E., Mervis, C. B., Gray, W., Johnson, D., & Boyes-Braem,
P. (1976). Basic objects in natural categories. Cognitive
Psychology, 8, 382-439.
Ryan, T., & Schwartz, C. (1956). Speed of perception as a
function of mode of representation. American Journal of
Psychology, 69, 60-69.
Treisman, A., & Gelade, G. (1980). A feature integration theory
of attention. Cognitive Psychology, 12, 97-136.
Tversky, B., & Baratz, D. (1985). Memory for faces: Are
caricatures better than photographs? Memory & Cognition, 13,
45-49.
Ullman, S. (1984). Visual routines. Cognition, 18, 97-159.
Witkin, A. P., & Tenenbaum, J. M. (1983). On the role of
structure in vision. In J. Beck, B. Hope, & A. Rosenfeld (Eds.),
Human and machine vision. New York: Academic Press.
(Accepted May 1, 1987)