
Feature integration in human vision

Ilmari Kurki

Department of Psychology

University of Helsinki

Finland

Academic dissertation to be publicly discussed, by due permission of the Faculty of Behavioural Sciences

at the University of Helsinki in Auditorium XV at the Main Building on the 12th of December, 2009 at 10 o’clock

UNIVERSITY OF HELSINKI

Department of Psychology

Studies 63: 2009


Supervisors: Docent Jussi Saarinen

Department of Psychology

University of Helsinki, Finland

Professor Aapo Hyvärinen

Department of Mathematics and Statistics, Department of Computer

Science, Department of Psychology and Helsinki Institute of

Information Technology

University of Helsinki, Finland

Reviewers: Professor Lynn Olzak

Department of Psychology

Miami University, Oxford (OH), USA

Docent Simo Vanni

Low Temperature Laboratory,

Helsinki University of Technology, Finland

Opponent: Professor Dennis Levi

School of Optometry,

University of California, Berkeley (CA), USA


Contents

Abstract
Tiivistelmä
Acknowledgments
List of original publications
1 Introduction
    1.1 From early features to configurations, surfaces and shapes
    1.2 Background
        1.2.1 Processing in the primary visual cortex
        1.2.2 Object recognition in extrastriate areas
    1.3 Feature integration in early- and mid-vision psychophysics
        1.3.1 Contrast detection and collinear facilitation
        1.3.2 Integration of surface brightness
        1.3.3 Form integration in Glass patterns
2 The aims of the studies
3 Methods & results
    3.1 Classification image method
        3.1.1 Non-linear classification images
    3.2 General experimental methods
        3.2.1 Subjects and apparatus
    3.3 Study I: Mechanisms of collinear facilitation
        3.3.1 Stimuli, procedure and data analysis
        3.3.2 Results
    3.4 Study II: Edge integration and brightness perception
        3.4.1 Stimuli, procedure and data analysis
        3.4.2 Results
    3.5 Studies III & IV: Form integration in Glass patterns
        3.5.1 Stimuli and procedure
        3.5.2 Results
4 Discussion
    4.1 Mechanisms of collinear facilitation
    4.2 Edge integration and brightness perception
    4.3 Feature integration in Glass patterns
5 Conclusions
7 References


Abstract

The earliest stages of human cortical visual processing can be conceived as extraction of local stimulus features. However, more complex visual functions, such as object recognition, require integration of multiple features. Recently, neural processes underlying feature integration in the visual system have been under intensive study. A specialized “mid-level” stage preceding the object recognition stage has been proposed to account for the processing of contours, surfaces and shapes as well as configuration.

This thesis consists of four experimental, psychophysical studies on human visual feature integration. Two of the studies used the classification image method, a recently developed psychophysical reverse-correlation technique. In this method, visual noise is added to near-threshold stimuli. By investigating the relationship between random features in the noise and the observer’s perceptual decision in each trial, it is possible to estimate which features of the stimuli are critical for the task. The method allows the critical features used in a psychophysical task to be visualized directly as a spatial correlation map, yielding an effective “behavioral receptive field”.

Visual context is known to modulate the perception of stimulus features. Some of these interactions are quite complex, and it is not known whether they reflect early or late stages of perceptual processing. The first study investigated the mechanisms of collinear facilitation, where nearby collinear Gabor flankers increase the detectability of a central Gabor. The behavioral receptive field of the mechanism mediating the detection of the central Gabor stimulus was measured by the classification image method. The results show that collinear flankers increase the extent of the behavioral receptive field for the central Gabor, in the direction of the flankers. The increased sensitivity at the ends of the receptive field suggests a low-level explanation for the facilitation.

The second study investigated how visual features are integrated into percepts of surface brightness. A novel variant of the classification image method with a brightness matching task was used. Many theories assume that perceived brightness is based on the analysis of luminance border features. Here, this assumption was tested directly for the first time. The classification images show that the perceived brightness of both an illusory Craik-O’Brien-Cornsweet stimulus and a “real” uniform step stimulus depends solely on the border. Moreover, the spatial tuning of the features remains almost constant when the stimulus size is changed, suggesting that brightness perception is based on the output of a single spatial frequency channel.

The third and fourth studies investigated global form integration in random-dot Glass patterns. In these patterns, a global form can be perceived immediately even if only a small proportion of the random dots are paired into dipoles according to a geometrical rule. In the third study, the discrimination of orientation structure in highly coherent concentric and Cartesian (straight) Glass patterns was measured. The results showed that the global form was discriminated more efficiently in concentric patterns. The fourth study investigated how form detectability depends on the global regularity of the Glass pattern. The local structure was either Cartesian or curved. It was shown that randomizing the local orientation impaired performance only with the curved pattern. The results support the idea that curved and Cartesian patterns are processed in at least partially separate neural systems.


Tiivistelmä

The earliest human cortical processing of visual information can be understood as the coding of local elementary features of the image information. More complex visual functions, such as the perception of objects, require the integration of several elementary features. The topic has recently been studied intensively, and it has been proposed that feature integration is carried out by “mid-level” neural mechanisms that precede object recognition proper. These mechanisms are thought to be responsible for, among other things, the perception of contours, surfaces, configuration and shapes.

This thesis investigates the integration of visual features in human visual perception experimentally, by means of psychophysics. Two of the studies use the classification image method, a reverse-correlation technique recently developed in psychophysics. Random noise is added on top of a stimulus that is near the detection threshold. By examining how the random features in the noise affect the perceptual decision on each trial, it is possible to infer which visual features are used in performing the perceptual task. The method allows the features critical for task performance to be visualized as a correlation map, or “behavioral receptive field”.

Visual context is known to affect how individual features are perceived. Some of the interactions are quite complex, and it is unclear at which level of visual processing they occur. The first study investigated collinear facilitation, in which nearby collinear flanking masks improve the detectability of a central Gabor stimulus. The classification image method was used to measure the behavioral receptive field of the mechanism responsible for detecting the central Gabor stimulus. The results show that collinear flankers increase the length of the behavioral receptive field in the direction of the flankers. The improved sensitivity at the ends of the receptive field suggests that the facilitation takes place in low-level processing.

The second study investigated how visual features are integrated in the perception of surface brightness. The classification image method was applied in a new way, using a surface brightness judgment task. Many theories assume that brightness perception is based on an analysis of the features at the edges of a surface. The study tested this assumption directly for the first time. The results showed that the brightness of both a uniform surface stimulus and an illusory Craik-O’Brien-Cornsweet surface is integrated solely on the basis of edge information. In addition, the spatial scale of the features remains almost constant when the stimulus size is increased, suggesting that brightness perception is processed by only a single spatial frequency channel.

The third and fourth studies investigated the perception of global forms in Glass stimuli composed of random dots. The third study measured the discriminability of orientation structure in coherent concentric (circular) and straight Glass stimuli. The results showed that the global form could be perceived better with concentric stimuli. The fourth study investigated how detectability depends on the regularity of the stimulus. The local structure was either curved or straight. Randomizing the local orientation impaired the perception of curved forms only. This supports the view that curved and straight forms are integrated by at least partly separate neural systems.


Acknowledgments

The present study was carried out in the Visual Science Group of the Department of Psychology at the University of Helsinki as a joint project between the Department of Psychology and the Neuroinformatics research group of the Helsinki Institute of Information Technology and the Finnish Centre of Excellence for Algorithmic Data Analysis. I gratefully acknowledge the financial support from the Academy of Finland (project #203344) that enabled me to spend three years as a full-time researcher.

I feel very privileged to have had two excellent advisors, to whom I am deeply grateful for superb guidance. Without inspiration and support from Docent Jussi Saarinen I would probably never have written a PhD thesis on visual perception. Professor Aapo Hyvärinen’s guidance was crucial in introducing me to the reverse-correlation methods that were used in this thesis.

I warmly remember our past principal investigator, Docent Pentti Laurinen (1945-2009). His striving for high scientific standards, originality and clear thinking made this research group a very stimulating working environment. I express my warmest thanks to Tarja Peromaa for the essential methodological and technical guidance and the many discussions that have improved my work. Many thanks for discussions and comments to my other colleagues and friends who have been associated with this research group: Viljami Salmela, Toni Saarela, Maria Olkkonen, Markku Kilpeläinen, Lari Vainio, Lauri Nurmela, Miika Pihlaja and Kaisa Tiippana. I also thank my other colleagues at the Department of Psychology for generating an ambitious working environment.

I am very grateful to Professor Lynn Olzak and Docent Simo Vanni for reviewing my thesis and to Kimmo Alho for being my supervising Professor.

I thank my parents Merja and Markku and my sister Inkeri for the support and encouragement they have provided throughout my life. I also want to thank all my friends both in Finland and abroad. Lastly, I thank my darling Sofi for her endless support at every stage of this project.


List of original publications

I Kurki, I., Hyvärinen, A. & Laurinen, P. (2006). Collinear context (and learning) change the profile of the perceptual filter. Vision Research, 46, 2009-14.

II Kurki, I., Peromaa, T., Saarinen, J. & Hyvärinen, A. (2009). Visual features underlying perceived brightness as revealed by classification images. PLoS ONE, 4, e7432.

III Kurki, I. & Saarinen, J. (2004). Shape perception in human vision: specialized detectors for concentric spatial structures? Neuroscience Letters, 360, 100-2.

IV Kurki, I. & Saarinen, J. (2004). Detection of irregular spatial structures. Spatial Vision, 19, 375-88.

The articles are reprinted with the permission of the copyright holders.


1 Introduction

1.1 From early features to configurations, surfaces and shapes

The first neural representation of the visual world is in the retina, where photoreceptors sample the local illumination. The light pattern in the image is highly dependent on lighting sources and shadows that provide no information about the identity of objects. Depending on the viewpoint and the distance, objects project to different locations, at different angles and sizes on the retina. Higher visual functions like object recognition, on the other hand, are generally invariant to proximal aspects of the stimulus such as location, size and, by and large, orientation. Recognition is possible from photographs, line drawings, abstract paintings and caricatures, even when the physical signals bear little similarity to the actual object, suggesting a higher-level, abstract analysis of the stimulus information. Computational theories of vision (Marr, 1982) suggest that this is achieved by a series of processing stages where the representation of the scene becomes increasingly symbolic.

Processing in early vision is often conceived as coding of the retinal representation by a set of elementary stimulus features. Cells in the primary visual cortex can be conceived as neural filters selective to certain local features (roughly: orientation and width) of the stimulus in a very limited spatial area. However, it is evident that even for the most basic visual functions, this analysis is just the first step. For example, contour curvature is not determined by any single orientation in the visual field, but rather by the relative orientation of successive contour elements. Neural computation of such higher-level features requires integration of responses over a population of early filters.

It has become commonplace to think that the extraction of configural information and the integration of global structure in spatial vision are carried out by an at least somewhat specialized neural system, loosely termed mid-level vision (Loffler, 2008). The idea of specific integrative functions in vision dates back to the Gestalt school of psychology in the early 1900s, famous for the “laws” of perceptual organization (Westheimer, 1999). It was demonstrated that the visual system tends to perceive isolated elements in the stimulus as groups. Various aspects, such as the distance between the elements or their similarity, dramatically change the way the elements are grouped.


Compared to the fairly good understanding of early vision, very little is known about feature integration processes and representation in mid-level vision. In this thesis, the results of four psychophysical investigations on feature integration in three different tasks and domains are presented. Study I investigates contextual effects in a presumably low-level feature detection task. Study II investigates which visual features determine surface brightness. Integration of surfaces is assumed to happen in the stages immediately following the early feature analysis. Studies III and IV investigate the integration of local orientation signals into global forms by using Glass patterns.

Special emphasis is placed on the recent reverse-correlation-based classification image method (Eckstein & Ahumada, 2002). This method enables direct measurement and visualization of the stimulus information that the visual system uses in a perceptual task, yielding a behavioral receptive field. The method was used in Studies I and II.
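
The reverse-correlation logic behind a classification image can be illustrated with a minimal Python sketch. It simulates a hypothetical linear observer and estimates the observer's classification image from the noise fields and the responses. The observer model (template matching plus internal noise), the use of noise-only trials, and all names and parameter values are assumptions made for illustration; they are not the procedure used in Studies I and II.

```python
import numpy as np

def simulate_trials(template, n_trials, noise_sd=1.0, internal_sd=1.0, seed=0):
    """Simulate yes/no responses of a hypothetical template-matching observer."""
    rng = np.random.default_rng(seed)
    # One independent Gaussian noise field per trial (noise-only trials, an assumption).
    noise = rng.normal(0.0, noise_sd, size=(n_trials,) + template.shape)
    # Decision variable: match between noise and template, plus internal noise.
    match = (noise * template).sum(axis=(1, 2))
    decision = match + rng.normal(0.0, internal_sd, size=n_trials)
    responses = decision > 0.0  # True = "yes" response
    return noise, responses

def classification_image(noise, responses):
    """Mean noise field on 'yes' trials minus mean noise field on 'no' trials."""
    return noise[responses].mean(axis=0) - noise[~responses].mean(axis=0)

# Illustrative template: a vertical bar in an 8 x 8 pixel image.
template = np.zeros((8, 8))
template[:, 3:5] = 1.0

noise, responses = simulate_trials(template, n_trials=5000)
ci = classification_image(noise, responses)
```

With enough trials the estimated map approaches a scaled copy of the template that the simulated observer uses, which is the sense in which a classification image can be read as a behavioral receptive field.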

1.2 Background

1.2.1 Processing in the primary visual cortex

Since the pioneering studies by Hubel and Wiesel (1959; 1962) it has become well established that neurons in the primary visual cortex (V1) analyze the local orientation and the spatial frequency content of the stimulus. Cells are excited by light patterns (bars) that match the cell’s preferred orientation and spatial frequency (bar width). A class of cells called simple cells show linear summation of the light pattern in the receptive field, which consists of distinct excitatory and inhibitory areas. The receptive field structure of simple cells is often modeled by Gabor functions (Marcelja, 1980). Another class of cells, complex cells, do not sum light linearly, but respond to both light increments and decrements while having orientation and spatial frequency selectivity similar to that of simple cells. In mathematical terms, they show phase invariance. By using linear systems theory, Campbell and associates (Campbell & Green, 1965; Campbell & Robson, 1968) introduced the view that early processing can be understood as a local Fourier analysis in which the stimulus is processed at multiple spatial resolutions (spatial frequency channels) and orientations in a spatially localized manner.
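
The difference between the two cell classes can be sketched in a few lines of Python. The sketch below builds a Gabor receptive field, models a simple cell as rectified linear summation, and models a complex cell with a quadrature pair of Gabor filters whose outputs are squared and summed. The quadrature-pair ("energy") construction is a common idealization assumed here for illustration, not something the text above specifies; function names and parameter values are hypothetical.

```python
import numpy as np

def gabor(size, wavelength, theta, phase, sigma):
    """Gabor patch: a cosine grating under a circular Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Coordinate along the direction of luminance modulation.
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * xr / wavelength + phase)

def simple_cell(image, rf):
    """Linear summation over the receptive field, then half-wave rectification."""
    return max(0.0, float((image * rf).sum()))

def complex_cell(image, rf_even, rf_odd):
    """Phase-invariant response from a quadrature pair (assumed energy model)."""
    return float(np.hypot((image * rf_even).sum(), (image * rf_odd).sum()))

even = gabor(33, 8.0, 0.0, 0.0, 4.0)          # cosine-phase receptive field
odd = gabor(33, 8.0, 0.0, -np.pi / 2, 4.0)    # sine-phase receptive field
stim = gabor(33, 8.0, 0.0, 0.0, 4.0)          # stimulus matching the even filter
```

In this sketch the simple cell responds to `stim` but not to its contrast-reversed version `-stim`, whereas the complex cell gives the same response to both, which is one way to read the statement that complex cells respond to both light increments and decrements.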

There is a remarkable agreement between psychophysical system-level analysis of stimulus detectability and electrophysiological recordings on the properties of the early neural filters (De Valois & De Valois, 1988). Both psychophysical grating contrast detection tasks and neurophysiological data suggest that early neural filters show almost linear summation of the stimulus information in the receptive field, followed by a nonlinear transducer function. The transducer is assumed to have a sigmoid shape: therefore, at very low baseline (pedestal) contrasts, contrast discrimination thresholds drop below the detection threshold, which is often referred to as the dipper effect (Foley & Legge, 1981).
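
How a nonlinear transducer produces a dipper can be sketched numerically. The Python example below assumes a transducer that accelerates at low contrasts and is compressive at high contrasts, and defines the threshold as the contrast increment needed to raise the transducer output by a fixed criterion. The functional form, the parameter values (`p`, `q`, `z`, `criterion`) and the arbitrary contrast units are illustrative assumptions, not values taken from the studies cited above.

```python
def transducer(c, p=2.4, q=2.0, z=1.0):
    """Assumed transducer: accelerating for c << z, compressive for c >> z."""
    return c**p / (c**q + z**q)

def increment_threshold(pedestal, criterion=0.1):
    """Smallest increment that raises the transducer output by `criterion`."""
    base = transducer(pedestal)
    lo, hi = 0.0, 1.0
    # Expand the bracket until the criterion is reached (the transducer is monotonic).
    while transducer(pedestal + hi) - base < criterion:
        hi *= 2.0
    # Bisection on the increment.
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if transducer(pedestal + mid) - base < criterion:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

detection = increment_threshold(0.0)    # no pedestal: plain detection threshold
dip = increment_threshold(0.4)          # weak pedestal near the detection threshold
masked = increment_threshold(10.0)      # strong pedestal
```

Under these assumptions the threshold on a weak pedestal falls below the detection threshold, while the threshold on a strong pedestal rises above it, giving the characteristic dipper-shaped curve.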

In this “classical” view of V1 processing, the neural filters process only very local stimulus information and all the integrative functions take place in subsequent stages of vision. However, since the 1980s it has become clear that even when a stimulus outside the classical receptive field does not drive the cell, it may change the way the cell is activated by a stimulus in the receptive field. Psychophysical studies have shown that a high-contrast surround grating decreases the perceived contrast of the central grating (Cannon & Fullenkamp, 1991; Chubb, Sperling, & Solomon, 1989; Foley, 1994; Olzak & Laurinen, 1999). Similarly, a simple transducer model falls short of explaining the masker orientation effects in contrast discrimination (Foley, 1994).

Neurophysiological animal studies have shown that placing a stimulus outside the classical receptive field typically suppresses the neuron’s response to the stimulus in the receptive field (e.g. Maffei & Fiorentini, 1976; for a review see Series, Lorenceau, & Fregnac, 2003). These interactions have been modeled by assuming that a neuron’s response depends on two mechanisms: an excitatory center mechanism corresponding to the classical receptive field and a spatially much larger and more broadly tuned inhibitory surround mechanism (Carandini, Heeger, & Movshon, 1997; Cavanaugh, Bair, & Movshon, 2002; DeAngelis, Freeman, & Ohzawa, 1994; Sceniak, Hawken, & Shapley, 2001). Many models assume that the nature of the suppression is divisive (Carandini et al., 1997; Cavanaugh et al., 2002). According to one conceptualization, the surround acts as a normalization mechanism, regulating the cell’s response by the integrated response of the local cell population (Heeger, 1992).
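
The idea of divisive surround suppression can be made concrete with a minimal Python sketch in which the center drive is divided by a pooled signal that includes the surround drive. The exact formula, the exponent `n` and the semi-saturation constant `sigma` are illustrative assumptions in the spirit of normalization models, not the specific equations of the papers cited above.

```python
def normalized_response(center_drive, surround_drive, sigma=0.1, n=2.0):
    """Center drive divided by a normalization pool (assumed functional form)."""
    excitation = center_drive**n
    # The pool sums the center and surround drives plus a constant.
    pool = sigma**n + center_drive**n + surround_drive**n
    return excitation / pool

alone = normalized_response(0.5, 0.0)        # center stimulus only
suppressed = normalized_response(0.5, 0.5)   # same center with an active surround
```

In this sketch the surround never drives the cell on its own (zero center drive gives zero response), yet it scales down the response to the center stimulus, which matches the qualitative description of a divisive surround.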

The surround has been shown to have a broad spatial frequency and orientation tuning. It has been shown that in humans the suppression in V1 is the most likely neural substrate for the perceptual suppression (Zenger-Landolt & Heeger, 2003). The surround mechanism, however, may involve a feedback loop from higher visual areas (Angelucci et al., 2002; Schwabe, Obermayer, Angelucci, & Bressloff, 2006).


1.2.2 Object recognition in extrastriate areas

In the macaque monkey, the neural stream involved in object and shape recognition runs from V1 to V2 to V4 and on to further visual areas in the inferior temporal cortex, which show selectivity for very complex feature combinations, for example stimuli resembling monkey faces (Kobatake & Tanaka, 1994). Similar areas have also been identified in humans with fMRI (see Kanwisher & Yovel, 2006).

Processing in the middle stages of the object recognition stream, especially in area V4, has been linked to the integration of features in mid-level vision. V4 cells are known to have large receptive fields (Gallant, Braun, & Van Essen, 1993; Pasupathy & Connor, 2001) and show transformation invariance with respect to the absolute position of the stimulus in the receptive field. Kobatake and Tanaka (1994) sought the critical stimulus features that activate V4 cells. They first showed pictures of natural objects to find out which complex stimulus activates the cell maximally. Then, they removed features present in the stimulus step by step to find the simplest feature configuration that still activates the cell maximally. These responses were further compared with responses to simple stimuli like light bars. It was found that a sizable proportion of the cells in V4 responded best to rather complex features containing multiple orientations, for example crosses and circular patterns.

Gallant et al. (Gallant et al., 1993; Gallant, Connor, Rakshit, Lewis, & Van Essen, 1996) used Cartesian straight (“linear”) and polar (concentric and radial) gratings to characterize V4 processing, motivated by the finding that cells in the early visual areas prefer linear gratings. Theoretical considerations (Dodwell, 1983) suggest that radial and concentric forms could be used for shape representation. It was found that 16-17 % of the cells in V4 were more sensitive to either concentric or radial gratings than to linear gratings, whereas very few cells showed a preference for linear gratings. Thus the cells in V4 are involved in a more complex analysis of form than those in early visual areas and show less preference for linear forms. In another study (David, Hayden, & Gallant, 2006), linearized spectral reverse correlation was used to estimate the Fourier power spectrum of the stimulus features that drive V4 cells. It was found that receptive fields had a very broad spatial frequency and orientation tuning. Bimodal orientation tuning, i.e. responding best to a stimulus that has more than one orientation, was common.

Another approach to V4 processing was taken by Pasupathy and Connor (1999; 2001), who investigated V4 shape representation with a set of contour features (angles and curves). The properties of contour features (such as convexity/concavity, acuteness and contour feature orientation) were systematically varied to measure a cell’s tuning. A large majority of V4 cells responded more strongly to shape primitives than to simple bars or edge stimuli. Many (about 30 %) of the cells showed systematic tuning for contour feature parameters. A combined sensitivity to the contour feature components could not explain the tuning, as was shown by permuting the positions of the component features (Pasupathy & Connor, 2001). In addition to the specificity to a single feature, many cells are modulated by the other features along the shape. However, individual V4 neurons are typically not selective to any specific global shape (Pasupathy & Connor, 2001; Pasupathy & Connor, 2002).

Currently, it is not well known how closely the neural circuits for shape processing in macaque V4 resemble those in the human brain, as the human homologue of macaque V4 is unclear (Orban, Van Essen, & Vanduffel, 2004).

1.3 Feature integration in early- and mid-vision psychophysics

1.3.1 Contrast detection and collinear facilitation

In the “classical view” of early vision, neural filters were thought to code the elementary features independently of each other. However, complex perceptual interactions between elementary stimulus features have been reported even in simple detection and discrimination tasks, which presumably reflect the earliest cortical processing. Polat and Sagi (1993; 1994a) psychophysically measured the contrast detection thresholds for a small Gabor patch in the presence of spatially displaced, high-contrast Gabor masks (flankers). Flanker location, orientation and the global configuration as well as the relative position of the flankers and the target were varied.

It was found that when the flankers were placed next to the target Gabor, they suppressed the sensitivity to the Gabor largely irrespective of orientation, similarly to what was previously found in grating surround suppression studies. However, placing the flankers further apart caused facilitation, i.e. an increase in detectability compared to the baseline with no flankers. The facilitation was specific to the spatial frequency and the relative orientation of the flankers and the target. The facilitation was prominent only in the collinear (coaxial) configuration (Figure 1 A), whereas orthogonal (1 B) or parallel (1 C) flankers had little effect on the detectability.

Figure 1. Stimulus configurations in the lateral masking paradigm (Polat & Sagi, 1993). Detection thresholds for the central Gabor were measured with static high-contrast flanker masks. The distance d between the masks and the central Gabor (expressed in wavelengths λ) was varied, as well as the relative orientation between the target and the flankers. Facilitation of detection was observed with the collinear configuration (A) when d > 2λ, whereas orthogonal (B) and parallel (C) masks had little effect.

The most pronounced facilitation was found at flanker-target distances around 2.5-3 λ, but the effect was measurable at distances as long as 9 λ (Polat & Sagi, 1993). Similar effects were later reported from single-cell recordings (Polat, Mizobe, Pettet, Kasamatsu, & Norcia, 1998) as well as from VEP studies (Polat & Norcia, 1996; Polat & Norcia, 1998).

The range of the effect and its independence of the target phase were interpreted as evidence for long-range, configuration-specific excitatory interactions between neural filters. It was suggested that the possible neural substrate could be the horizontal connections in the primary visual cortex (Polat & Sagi, 1993; Polat & Sagi, 1994b). Anatomical studies have shown that there are orientation-specific, long-range horizontal connections between V1 cells (Gilbert & Wiesel, 1983; Gilbert & Wiesel, 1989; Rockland & Lund, 1982)¹. It was suggested that the long-range collinear facilitation may be involved in feature integration, for example in contour integration (Polat & Bonneh, 2000) and in bridging contour gaps (Polat & Sagi, 1993; Polat & Sagi, 1994b).

¹ Note, however, that a more recent study suggests a more limited functional range for the horizontal connections (Angelucci et al., 2002).

Excitatory long-range interactions between the neural filters sensitive to the flankers and the target have been modeled by assuming that the flankers increase the sensitivity of the filters sensitive to the central Gabor (Adini, Sagi, & Tsodyks, 1997; Chen & Tyler, 2001; Chen & Tyler, 2002; Chen & Tyler, 2008; Zenger & Sagi, 1996). Often, only filters at the location of the central Gabor are thought to mediate the detection.

However, some studies have asked whether excitatory interactions between the

neural filters have to be assumed at all. Solomon, Watson and Morgan (1999) proposed

that the non-optimal neural filters located between the target and the flankers may get a

weak signal from the flankers while also being sensitive to the target, even when the

flankers and the target are quite far away. The weak activation from the flankers acts

like the low-contrast pedestal stimulus in the dipper effect, thus explaining the collinear

facilitation without interactions between the filters. The model could explain the effect

of flankers with opposite phase (Solomon et al., 1999).

It is known that one fundamental limiting factor in visual pattern detection is the

uncertainty of the exact stimulus parameters (Pelli, 1985). The problem in the lateral

masking paradigm is that in addition to low-level factors, the high-contrast collinear

flankers may give cues about the location, spatial frequency and orientation of the target

compared to the no-flankers condition. It has been proposed that high-level uncertainty

reduction might explain the facilitation (Williams & Hess, 1998).

Further, it has been argued that collinear contrast facilitation and contour integration

are not related processes, since contour integration is largely independent of the contrast

code (Hess, Dakin, & Field, 1998; Williams & Hess, 1998). Collinear facilitation may

be highly dependent on the specific stimulus configuration, as adding non-collinear

flankers cancels the facilitation (Solomon & Morgan, 2000). Some studies have even

suggested that collinear effects are restricted to patterns near the detection threshold

(Meese, Hess, & Williams, 2001).

1.3.2 Integration of surface brightness

Surfaces are among the most important building blocks of the visual world. However,

neural filters in the early visual areas are very sensitive to luminance differences but

typically give no response to uniform illumination (Hubel & Wiesel, 1959). It is often

assumed that the surface information is not explicitly presented in the early cortical

stages (V1), but is integrated from the early neural filter responses in the subsequent

processing [for attempts to localize the locus see for example: (Cornelissen, Wade,

Vladusich, Dougherty, & Wandell, 2006; Morrone & Burr, 1988; Perna, Tosetti,

Montanaro, & Morrone, 2005)].

Many models of brightness perception suggest that surface brightness information is

extrapolated from the border responses, either by an active filling-in process (Grossberg

& Todorovic, 1988) or by a symbolic interpretation of “edge” and “bar” border

responses (Morrone & Burr, 1988). Borders typically have a broad spatial frequency

band, whereas neural filters are narrowly tuned to spatial frequency. This raises the

question how the border information at different spatial scales is integrated in the

surface representation.

The structure of luminance borders can dramatically change the perceived brightness

of the surface. This is demonstrated in the illusory Craik-O’Brien-Cornsweet surface

(figure 2 A). A bright, homogeneous surface is seen even when there is no increment in

the physical luminance outside the border area. Similarly, in the simultaneous contrast

illusion (figure 2 B) the background luminance modulates the brightness of identical

patches.

Figure 2. Brightness illusions show that the border structure generates and changes the perceived

brightness. A: Craik-O’Brien-Cornsweet illusion. A slowly changing luminance gradient generates an

illusion of a uniform bright surface, even when there is no physical increment in the middle of the surface

(intensity profile shown at the bottom). B: Simultaneous contrast illusion. Two patches have identical

physical luminance, but the patch on the dark background (right) appears brighter than the patch on the

bright background (left).

1.3.3 Form integration in Glass patterns

Integration of elementary features to global forms has been investigated by form

integration tasks. The most intensively studied class of form stimuli are the Glass

patterns (Glass, 1969). They are composed of random dots. A proportion of dots are

paired to form dipoles according to a certain geometrical rule (e.g. rotation, Cartesian

translation²). Even when the local structure consists of just random dots, the global form

is immediately perceived (figure 3).

Figure 3. Glass patterns. A geometrically transformed (here: rotated) copy (B) is made of a random-dot

pattern (A). When it is superimposed on the original image, structure can be perceived (C) even though the

image contains just random dots.

The perception of global structure in the Glass stimulus implies that noisy local

orientation signals are integrated into a global form. The number of all dot pairings in a

pattern consisting of n dots is n(n-1)/2, but the number of “signal dipoles” is at most

n/2. The problem is how to find the “correct” signal dipoles and omit the dipoles at

random orientations. In the spatial frequency domain, Glass patterns are broad-band

stimuli, but only certain spatial scales (determined by the dot separation) contain the

orientation structure of the target and thus a problem in a filter-based stimulus analysis

is to find the relevant spatial scale (see e.g. Dakin, 1997).
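The pairing problem described above can be illustrated with a minimal sketch. It assumes a fixed dot separation and a local dipole orientation that is tangent to a circle around the pattern centre (concentric) or horizontal (translational); the function names, the parameter values and the `coherence` parameter are hypothetical and are not taken from the original studies.

```python
import numpy as np


def pairing_counts(n_dots):
    """Return (all possible dot pairings, maximum number of signal dipoles)."""
    return n_dots * (n_dots - 1) // 2, n_dots // 2


def glass_pattern(n_dipoles, kind="concentric", separation=0.05,
                  coherence=1.0, seed=None):
    """Return two (n_dipoles, 2) arrays: the random dots and their partners."""
    rng = np.random.default_rng(seed)
    first = rng.uniform(-1.0, 1.0, size=(n_dipoles, 2))
    if kind == "concentric":
        # Dipole orientation is tangent to a circle around the origin.
        theta = np.arctan2(first[:, 1], first[:, 0]) + np.pi / 2.0
    elif kind == "translational":
        # All dipoles share one (here: horizontal) orientation.
        theta = np.zeros(n_dipoles)
    else:
        raise ValueError("kind must be 'concentric' or 'translational'")
    # A proportion (1 - coherence) of dipoles gets a random orientation.
    is_noise = rng.random(n_dipoles) >= coherence
    theta = np.where(is_noise, rng.uniform(0.0, np.pi, n_dipoles), theta)
    offset = separation * np.column_stack((np.cos(theta), np.sin(theta)))
    return first, first + offset


first, partner = glass_pattern(100, kind="concentric", seed=1)
all_pairs, max_signal = pairing_counts(2 * len(first))
print(all_pairs, max_signal)
```

With 200 dots there are 19900 possible pairings but at most 100 signal dipoles, which is why a mechanism that considered all pairings equally would be dominated by spurious orientations.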

Early studies suggested that the perception of Glass patterns was based on computing the local spatial autocorrelation between the dots. The first stage was assumed to be conveyed by V1 simple cells. The perception of the pattern is more difficult when the separation between the dots is larger (Glass, 1969; Glass & Perez, 1973). If the dot pairs in signal dipoles are of opposing polarity, the pattern cannot be seen (Glass & Switkes, 1976). A competing hypothesis (Stevens, 1978) was that the integration of the dipole orientation was done by an explicit grouping process, namely nearest-neighbor matching at a higher “symbolic” level. However, the detectability (d') of the structure in Glass patterns followed Weber’s function of signal to noise ratio, and detection was shown to be possible even when there were 6-10 random dots closer than the “correct” counterpart. This provided strong support to the idea that the structure extraction is done by a local autocorrelation process, such as linear filtering (Maloney, Mitchison, & Barlow, 1987).

² In the original studies III and IV, the pattern with Cartesian translation is referred to as linear.

Current models of Glass pattern perception suggest that the first stage of processing

consists of filtering the stimulus by linear neural filters. Some models postulate

sophisticated algorithms which find the correct scale containing the signal (Dakin,

1997). However, recent models have shifted the focus away from the scale selection

problem. Instead of a general pattern detector they suggest that different geometric

translations are detected by specialized mechanisms.

The idea of a special mechanism was first proposed on the basis of spatial integration

results (Wilson, Wilkinson, & Asaad, 1997; Wilson & Wilkinson, 1998). It was

reported that concentric and radial Glass patterns show very strong area summation

(improvement of detectability when increasing the pattern size), whereas Cartesian

patterns show summation only with the smallest stimulus sizes (see however: Dakin &

Bex, 2003b; Kurki, Laurinen, Peromaa, & Saarinen, 2003). In addition, profoundly

lower thresholds were reported with concentric and radial (whole) patterns than with

translational patterns (Wilson et al., 1997; Wilson & Wilkinson, 1998).

The results were modeled by a filter-rectify-filter model, where in the second stage

linear and concentric patterns are integrated by different neural mechanisms. In the

mechanism sensitive to concentric orientation, the outputs of large second-order filters

are summed across the orientations (Wilson et al., 1997; Wilson & Wilkinson, 1998).

Support for concentric and radial detectors also comes from studies of spiral Glass

patterns, showing that detection of spiral forms is inferior to that of concentric or radial forms (Seu &

Ferrera, 2001).

However, the stimuli used by Wilson et al. contained a possible artifact: the dots in a

translational pattern cause a problem at the edge of the stimulus, since the translated

dots fall outside the original random dot area. The edge of the translated stimulus is

therefore jagged. This problem does not exist in a concentric pattern, since the dots are

always translated inside the stimulus area (Dakin & Bex, 2002). Dakin and Bex (2002)

used stimuli in which this edge artifact was controlled, either by using a square window or

by adding random dots to the background of the stimulus. The results showed no

difference in thresholds for concentric and translational patterns.

2 The aims of the studies

Studies I and II used the new classification image method that allows direct estimation

of the behavioral receptive field: a spatial correlation map of the stimulus features that are

used for perceptual decisions.

Study I investigated how the local collinear context (collinear flankers) changes

sensitivity to a Gabor stimulus in a detection task, presumably reflecting information

processing in the early visual areas. The classification image method was used to

measure behavioral receptive fields for a Gabor target and the same target, flanked by

two collinear high-contrast Gabor flankers.

Study II estimated, using the classification image method with a novel brightness

matching task, what stimulus features of a surface stimulus are critical for the perceived

brightness. Both “real” step and illusory stimuli were used. Further, the size of the

target stimulus was varied, in order to see whether the results are determined by tuning

of low-level spatial frequency channel(s).

Studies III and IV used Glass patterns to investigate detection and integration of

spatial forms.

Study III investigated how highly visible forms are processed in a task, where

subjects had to discriminate highly coherent Glass patterns. Both Cartesian translational

and Polar concentric geometrical transformations were used to test the idea that they are

processed in separate neural systems.

Study IV tested the idea that linear and concentric (curved) forms are processed in

separate systems and the concentric form requires integration over a larger area, thus

showing a greater dependency on the global form. Form integration was investigated by

introducing “irregularity” to the global form. Glass patterns were divided into patches,

which each had independently randomized local orientation (either: curved or

translational). The number of stimulus patches was varied.

3 Methods & results

3.1 Classification image method

In studies I and II a psychophysical reverse-correlation method known as classification

images was used³ (Ahumada & Beard, 1999; Beard & Ahumada, 1998; Chung, Levi, &

Tjan, 2005; Eckstein & Ahumada, 2002; Gold, Murray, Bennett, & Sekuler, 2000; Levi

& Klein, 2003; Li, Levi, & Klein, 2004; Neri & Heeger, 2002; Neri & Levi, 2006;

Solomon, 2002; Tjan & Nandy, 2006). The aim of this method is to directly estimate

what information in the stimulus correlates with the perceptual decision (e.g. seeing the

target stimulus) and thus estimate a systems-level “behavioral receptive field” for the

target stimulus and the task (e.g. detection, discrimination or assessing the brightness).

The result, a classification image, can be presented as a correlation map in the spatial

domain, revealing directly what features in the stimuli are important for the task. Key

benefits of the method are (1) the ability to measure the sensitivity of the perceptual

mechanism directly in the spatial domain (versus for example the spatial frequency

domain in the critical band masking paradigm); (2) relatively few a priori assumptions

about the observer have to be made (versus for example adaptation or spatial frequency

masking paradigm) and (3) the performance can be quantified and compared by a

mathematically exactly defined ideal observer.

Classification image analysis can, in principle, be applied to any psychophysical task

where visual noise added to the target has an effect. For simplicity, let us assume that

a contrast detection task is used. On each trial, there is a 50 % chance that a constant

target stimulus (e.g. a Gabor patch) masked with white noise is presented; otherwise white noise

without the target is presented. The stimuli are static and discretely sampled (pixels), varying along the

luminance axis. Stimulus information can be expressed as (two-dimensional) matrices of

local contrast relative to the mean luminance, negative values meaning decrements and

positive values increments. On each trial, the subject reports whether s/he saw the target

by answering either “yes” or “no”. The target contrast has been set to the detection

threshold so that both correct and incorrect responses are obtained.

The signal detection theory proposes that the outcome of the perceptual processing in a detection task can be modeled by a one-dimensional variable, which represents the (subjective) likelihood of the target being present (Green & Swets, 1974). The subject responds “yes” if the value of this response variable exceeds a subjective criterion and “no” otherwise. The subjective criterion is dependent on non-perceptual factors, such as the subjective estimate of the likelihood of the target. It is often illustrative to think that the observer’s response variable is the output of a single “perceptual mechanism” under study. The classification image technique is easiest to understand if we assume that the perceptual mechanism has a linear response to the stimulus information, as is often the case especially in early vision. In this case, the mechanism responds by weighting the stimulus information by weights determined by the receptive field profile and then summing the output.

³ The method is also known as noise image classification: psychophysical reverse-correlation

Let us consider the influence of just a single pixel. The luminance in the pixel is the

sum of the random value of the noise mask and the fixed luminance value of the target,

either present or not. There are the following possibilities: (1) The pixel is not used for

the perceptual decision. (2) The pixel is used for the perceptual decision and the

behavioral receptive field has a positive weight; an increment in this pixel “drives” the

perceptual mechanism, i.e. makes a “yes” response more likely. (3) The pixel is used for

the perceptual decision and the receptive field has a negative weight; a decrement in

this pixel drives the perceptual mechanism. In case (1), obviously, whatever value the

noise has, it bears no consequences for the decision. In case (2), if the random

value of the noise is positive (increment), it adds to the value of the target and drives

the mechanism more, i.e. increases the probability of a “yes” response. If the value of the

noise is negative (decrement), it subtracts from the value of the target and lessens the

probability of a “yes” response. Thus, there is a positive correlation between the pixel’s

luminance value and the “yes” response. In case (3), the relation is opposite:

negative values in the noise add up to decrements in the stimulus and increase the

probability of a “yes” response and vice versa. Therefore, the correlation is negative.

When there are more pixels, the response is dependent on how the information from

individual pixels is integrated in the perceptual mechanism. Most often it is simply

assumed that the perceptual mechanism has a purely linear response (equation 1): every

pixel in the stimulus is first weighted by the weight of the receptive field and then

summed.
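The purely linear mechanism described above can be written compactly. The following is one common way to express it, given as a sketch: the symbols r (response variable), w_i (receptive field weight at pixel i), s_i (stimulus contrast at pixel i) and c (subjective criterion) are notation assumed here for illustration.

```latex
r = \sum_{i} w_i \, s_i , \qquad
\text{respond ``yes'' if } r > c , \quad \text{``no'' otherwise.}
```

In this form, a pixel with w_i = 0 corresponds to case (1) above, w_i > 0 to case (2) and w_i < 0 to case (3).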

In order to estimate the weights a technique resembling linear regression can be used.

Since all the pixels in the random noise are mutually independent, the estimation

formula is particularly simple. A standard way to estimate the classification image C, an

estimator for the behavioral receptive field, is to take the average (overline) of the noise

masks N, classified by the trial types and outcomes.
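Following the verbal description of this estimator (averages of the noise masks from “yes” trials added together, averages from “no” trials subtracted), it is commonly written in the form below (cf. Ahumada, 2002). The subscript layout is an assumption made here; t_1 and t_0 denote target present and absent, a_1 and a_0 the responses “yes” and “no”.

```latex
C = \left( \overline{N}_{t_1 a_1} + \overline{N}_{t_0 a_1} \right)
  - \left( \overline{N}_{t_1 a_0} + \overline{N}_{t_0 a_0} \right)
```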

In equation (2), a denotes the subject’s response, “yes” (a1) or “no” (a0), and t the

target presence, present (t1) or absent (t0) (Ahumada, 2002). This estimation formula

maximizes the expected signal-to-noise ratio of the estimated receptive field (Murray,

Bennett, & Sekuler, 2002). Thus, to see what common features all the noise masks that

cause “yes” responses have, the averages of the noise masks where the subject reported

seeing the target (whether it was present or not) are added together. The averages of the

noise masks where the subject reported that no target was present are then subtracted

from this. It should be noted that the target profile is not used in the analysis: classification

images are averages of classified random noise. The two-alternative forced choice

method can be analyzed in a similar way, computing first the pixel-wise difference

between the noise masks containing the target and the noise mask containing the

comparison stimulus and then subtracting the average of noise in incorrect trials from

the average noise in correct trials (Abbey & Eckstein, 1999).
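The yes/no analysis described above can be sketched with a simulated linear observer. This is a minimal illustration under stated assumptions, not the analysis code of the studies: the one-dimensional Gabor template, the noise levels, the unbiased criterion and all function names are hypothetical.

```python
import numpy as np


def simulate_yes_no(template, target, n_trials, noise_sd=1.0,
                    internal_sd=0.5, seed=None):
    """Simulate a linear observer in a yes/no detection task."""
    rng = np.random.default_rng(seed)
    present = rng.random(n_trials) < 0.5               # target on 50 % of trials
    noise = rng.normal(0.0, noise_sd, size=(n_trials, target.size))
    stimulus = noise + present[:, None] * target
    # Linear mechanism: weight each pixel, sum, add internal noise.
    response_var = stimulus @ template + rng.normal(0.0, internal_sd, n_trials)
    criterion = 0.5 * (target @ template)              # assumed unbiased criterion
    return noise, present, response_var > criterion


def classification_image(noise, present, said_yes):
    """Average noise masks by trial type and response, then combine them."""
    def mean_noise(t, a):
        return noise[(present == t) & (said_yes == a)].mean(axis=0)
    return (mean_noise(True, True) + mean_noise(False, True)
            - mean_noise(True, False) - mean_noise(False, False))


# Hypothetical one-dimensional Gabor profile used as receptive field and target.
x = np.linspace(-1.0, 1.0, 64)
template = np.exp(-x ** 2 / (2 * 0.3 ** 2)) * np.cos(2 * np.pi * 1.5 * x)
template /= np.linalg.norm(template)
target = 1.0 * template

noise, present, said_yes = simulate_yes_no(template, target, 5000, seed=0)
ci = classification_image(noise, present, said_yes)
match = np.corrcoef(ci, template)[0, 1]
print(round(match, 2))
```

Note that only the noise masks enter `classification_image`; the target profile is used solely to generate the simulated responses, yet the estimate recovers the shape of the simulated receptive field.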

3.1.1 Non-linear classification images

It can be shown that the linear perceptual mechanism with a receptive field profile

matching the target stimulus profile is the ideal strategy for detecting a “constant” signal

in white noise when its properties are known exactly (Green & Swets, 1974). For many

stimuli and paradigms an assumption of perfect knowledge may be unrealistic. A very

general limit in detection comes from an inability to focus optimally on the target

information: often subjects behave as if they were uncertain of the exact properties of

the stimulus, responding “yes” not only to the target waveform, but to all similar stimuli even

when they are able to discriminate them in a discrimination task (Pelli, 1985). For

example, if the target is a Gabor at [0,0], they would respond also “yes” to a Gabor at

[0,1’] or [0,-1’].

This phenomenon is called intrinsic uncertainty, in contrast to external or stimulus

uncertainty, where (some) parameters of the target stimulus are randomized. The

detection under uncertainty has been modeled by a maximum of outputs, “winner takes

all” model. For example, instead of using the output of a single receptive field, the

mechanism monitors over outputs of multiple receptive fields at [0,0] [0,1’] and [0,-1’]

and chooses the one with maximum output. It can be shown (Tjan & Nandy, 2006) that

under uncertainty, using the standard classification images analysis (equation 2) reveals

a superposition of all the monitored receptive fields. For example, spatial uncertainty

causes “smearing” of the receptive field profile in the classification image. One may

now erroneously conclude that the observer uses a single receptive field with a large

area whereas only local information over many locations is used.
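The smearing effect can be illustrated with a small simulation. The sketch below compares an observer using a single receptive field with a maximum-of-outputs observer that monitors three positions; the Gaussian receptive field profile, the shift of 8 pixels, the criterion and the function names are assumptions chosen for illustration only.

```python
import numpy as np


def gaussian_template(n_pix, centre, sd=2.0):
    """Unit-norm Gaussian receptive field profile centred on one pixel."""
    w = np.exp(-0.5 * ((np.arange(n_pix) - centre) / sd) ** 2)
    return w / np.linalg.norm(w)


def run_observer(templates, target, n_trials, criterion, rng):
    """Observer responding "yes" when the maximum template output exceeds the criterion."""
    present = rng.random(n_trials) < 0.5
    noise = rng.normal(0.0, 1.0, size=(n_trials, target.size))
    stimulus = noise + present[:, None] * target
    outputs = stimulus @ templates.T            # one column per monitored field
    return noise, present, outputs.max(axis=1) > criterion


def classification_image(noise, present, said_yes):
    """Standard estimator: "yes" averages minus "no" averages."""
    def mean_noise(t, a):
        return noise[(present == t) & (said_yes == a)].mean(axis=0)
    return (mean_noise(True, True) + mean_noise(False, True)
            - mean_noise(True, False) - mean_noise(False, False))


rng = np.random.default_rng(0)
n_pix, centre, shift = 41, 20, 8
central = gaussian_template(n_pix, centre)
shifted = gaussian_template(n_pix, centre + shift)
monitored = np.stack([gaussian_template(n_pix, centre + s)
                      for s in (-shift, 0, shift)])
target = 1.5 * central

ci_single = classification_image(
    *run_observer(central[None, :], target, 10000, 1.0, rng))
ci_uncertain = classification_image(
    *run_observer(monitored, target, 10000, 1.0, rng))

# Weight that each classification image gives to the shifted position.
weight_single = float(ci_single @ shifted)
weight_uncertain = float(ci_uncertain @ shifted)
print(weight_single, weight_uncertain)
```

In this sketch the uncertain observer's classification image carries clear weight at the shifted position although the target never appears there, which is the superposition of monitored receptive fields that the standard analysis cannot distinguish from a single wide receptive field.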

To tackle this problem, it is important to minimize uncertainty in the experiment by,

e.g., enabling the subjects to get accustomed to the stimuli and task, stabilizing fixation

etc. Methods have been developed to test how well the linear model assumption

explains the observed performance in the experiment (Murray, Bennett, & Sekuler,

2005). Another possibility is to estimate how likely an uncertainty-based explanation is

by simulating a model perceptual mechanism (see study I). It is also possible to use a

paradigm specially tailored to isolate the effects of uncertainty. Tjan and Nandy showed

that the effect of uncertainty in classification images is most pronounced in target absent

trials. Using a high-contrast target and a special data analysis, it is possible to estimate

both the receptive field and the extent of the uncertainty (Tjan & Nandy, 2006).

3.2 General experimental methods

3.2.1 Subjects and apparatus

The subjects were volunteers. Both authors and subjects naïve to the purpose of the

studies participated. The results of the classification image experiments (study I & II)

shown here are individual data; the Glass pattern experiments use the mean of 6 (study III) and 4

(study IV) subjects.

Experiments were conducted in a dimly lit laboratory. Stimuli were generated by

(study I) Cambridge Research Systems (Rochester, UK) 2/5, (study II) Cambridge

Research Systems ViSaGe and (studies III-IV) Vision Research Graphics (Durham,

NH) Vision Works environments providing 12-15 bit grayscale resolution. High-quality

linearized and calibrated CRT monitors were used.

3.3 Study I: Mechanisms of collinear facilitation

Collinear spatial interactions have been the object of intensive research, yet no

consensus exists on whether they reflect interactions between neural filters or can be

explained by previously known classical effects such as dipper function and uncertainty

reduction. In study I, this was investigated by directly estimating the behavioral

receptive field mediating the target. Classification images were measured for a central

Gabor patch (no-flankers condition) and with collinear flanker stimuli in a configuration

where collinear facilitation is very strong.

Models which explain the facilitation by increased sensitivity in the neural filter,

conveyed by long-range interactions between the target and the flankers, predict an

increase in the signal-to-(internal) noise ratio of the classification image, i.e. amplitude.

If collinear facilitation is caused by reduction in uncertainty, the classification

images in the no-flankers condition should show signs of using non-optimal orientation or

spatial frequency band of the target; or smearing of the classification image as a result

of spatial uncertainty. These effects should be reduced in the collinear-flankers condition.

The dipper effect model (Solomon et al., 1999) predicts an increased sensitivity in

the area between the target and the flankers.

3.3.1 Stimuli, procedure and data analysis

The target was a low spatial frequency (1.5 cpd) Gabor (figure 3 A), masked by low-

contrast (rms 0.1 linear contrast units) white noise. Classification images were measured

for two conditions: target alone (figure 3 B) or the target surrounded by two high-

contrast (40%) flanking collinear Gabors at 1.7° (2.5 Gabor wavelengths) distance

(figure 3 C). Large, low-spatial-frequency stimuli and a fixation crosshair were used to

reduce uncertainty effects. A two-interval forced-choice (2IFC) detection task was used.

Figure 3. Stimuli in study I. The target was a low-frequency (1.5 cpd) Gabor patch, masked with low-

contrast white noise. In the first condition (B), classification images were measured for the target Gabor.

In the second condition (C), high-contrast static flankers were added. In the experiment, the contrast of

the target was much lower than in this illustration.

The classification images were analyzed using the forced-choice version of equation 2

(Abbey & Eckstein, 1999). All the noise masks that were shown to the subject and the

corresponding responses (correct, incorrect) in every trial were recorded for later

analysis. 4,000-5,000 trials per subject per condition were measured. All behavioral

receptive fields as estimated from the classification images resembled Gabor stimuli

with varying parameters. To quantify the effect of the collinear context, Gabor functions

were fitted to the classification images. Confidence intervals for the Gabor function

parameters were obtained by using the Bootstrap method (Efron & Tibshirani, 1993).
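A percentile bootstrap of the kind referred to above can be sketched as follows. The resampling unit (trials), the percentile interval and the function name `bootstrap_ci` are assumptions; in the study the statistic would be a parameter of a Gabor function refitted to each resampled classification image, whereas the example below uses a simple mean of stand-in values to stay self-contained.

```python
import numpy as np


def bootstrap_ci(trials, statistic, n_boot=2000, level=0.95, seed=None):
    """Percentile bootstrap confidence interval for statistic(trials)."""
    rng = np.random.default_rng(seed)
    n = len(trials)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        # Resample trials with replacement and recompute the statistic.
        resample = trials[rng.integers(0, n, size=n)]
        estimates[b] = statistic(resample)
    tail = 100.0 * (1.0 - level) / 2.0
    lo, hi = np.percentile(estimates, [tail, 100.0 - tail])
    return lo, hi


# Stand-in data: 400 hypothetical per-trial measurements.
rng = np.random.default_rng(3)
values = rng.normal(5.0, 1.0, size=400)
lo, hi = bootstrap_ci(values, np.mean, seed=4)
print(lo, hi)
```

Any function that maps a set of resampled trials to a single number can be passed as `statistic`, so the same routine would serve for each fitted parameter separately.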

3.3.2 Results

Classification images are shown in figure 4. Collinear flankers improved the absolute

detection efficiency (F) by 7 % - 30 %. In the collinear-flankers condition, the

behavioral receptive fields were clearly horizontally elongated towards the flankers

when compared to the no-flankers condition. Figure 5 shows the best-fitting parameters.

The horizontal extent of the behavioral receptive field is 22 % to 143 % wider than in

the no-flankers condition. Other parameters, e.g. amplitude, show no consistent differences.

Figure 4. Study I: results. Classification images (5 subjects) (behavioral receptive fields mediating the

detection of a Gabor patch). Left columns: Gabor patch without flankers. Right columns: Gabor patch

with collinear flankers. Ideal (target stimulus) behavioral receptive field is shown in the first row, left. F-

value expresses the detection efficiency for each condition and observer (ideal observer = 1). Behavioral

receptive fields measured with collinear flankers are elongated in the direction of the flankers. SS, TR

were naïve subjects.

Figure 5. Study I results: vertical (height) and horizontal (width) extent of the behavioral receptive field

(best-fitting Gabor) for 5 subjects. Blue bars: no flankers, red bars: collinear flankers. Dashed line

represents the ideal (target stimulus) extent. Behavioral receptive fields measured with the collinear

flankers are systematically elongated. Error bars represent 95% confidence interval.

The (linear) sampling efficiency (the match between the behavioral receptive field and

the target stimulus) was estimated by computing the cross-correlation between the

(normalized) fit and the target stimulus. A notable increase was observed in just two

(out of 5) subjects.
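The match between a fitted receptive field and the target can be computed as a normalized cross-correlation at zero lag. The sketch below is an assumption about the details: the function name `template_match` is hypothetical, and whether the reported efficiency is this correlation itself or a function of it is not specified here.

```python
import numpy as np


def template_match(fit, target):
    """Normalized cross-correlation (zero lag) between two profiles."""
    fit = np.ravel(fit).astype(float)
    target = np.ravel(target).astype(float)
    return float(fit @ target / (np.linalg.norm(fit) * np.linalg.norm(target)))


def gabor(xx, yy, sd_x, sd_y, freq=1.5):
    """Hypothetical Gabor profile with separate horizontal and vertical extents."""
    envelope = np.exp(-(xx ** 2 / (2 * sd_x ** 2) + yy ** 2 / (2 * sd_y ** 2)))
    return envelope * np.cos(2 * np.pi * freq * yy)


yy, xx = np.mgrid[-2:2:64j, -2:2:64j]
target = gabor(xx, yy, 0.4, 0.4)
elongated = gabor(xx, yy, 0.8, 0.4)      # horizontally elongated receptive field

print(template_match(target, target), template_match(elongated, target))
```

A receptive field identical in shape to the target gives a match of 1 regardless of amplitude, whereas an elongated field gives a lower value.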

Surprisingly, we found that the behavioral receptive fields changed in the course of

the data collection. In particular, the difference between conditions in horizontal extent

parameter is evident only at the beginning of the experiments (figure 6). Practice

apparently retunes the behavioral receptive fields so that they converge during the

learning, over a period of about two sessions.

Figure 6. Study I: results. Horizontal extents of the behavioral receptive fields analyzed separately from

the early sessions (first 2,000 trials) and from the late sessions (last 3,000 trials), two subjects. Solid lines:

no-flankers condition. Dashed lines: collinear flankers. The elongation of the behavioral receptive field is

apparent only in the early trials. Error bars represent 95 % confidence interval. Data from the other

subjects is comparable.

3.4 Study II: Edge integration and brightness perception

Study II investigated perception of surface brightness using the classification image

method with a novel brightness matching task. Noise causes small fluctuations in the

perceived brightness of the stimulus, depending on how the random visual features

within the noise match to the (unknown) receptive field of the perceptual mechanism

responsible for the neural brightness computation. By analyzing the correlation between

each point in the noise and the perceived brightness, it is possible to directly

estimate the receptive field profile, i.e. what parts of the stimulus are critical for the

brightness. Both illusory Craik-O’Brien-Cornsweet and “real” step stimuli were used.

Early filters typically have a narrow spatial frequency band (about 1 octave), whereas

stimulus borders are broad-band stimuli. How are the neural filters tuned to different

spatial scales integrated? By decreasing the stimulus size, the spatial frequency band

shifts to higher frequencies. It has been shown that in tasks like face (Näsänen, 1999;

Pelli, 1999) and letter recognition (Majaj, Pelli, Kurshan, & Palomares, 2002; Solomon

& Pelli, 1994), the visual system uses just a very limited band of the broad-band

stimulus information. Recently, it has been suggested that brightness perception shows a

similar dependency on a single spatial frequency scale that is either low-frequency and

constant irrespective of the stimulus size (Perna & Morrone, 2007), or medium-

frequency and changes slightly with the stimulus size (Salmela & Laurinen, 2005).

Classification images were measured with three retinal sizes (4-fold difference) to

investigate the tuning and scaling of the behavioral receptive field mediating the

perceived brightness. Spatial frequency tuning of the classification images was

characterized by fitting exponential functions.

3.4.1 Stimuli, procedure and data analysis

In the first experiment, the test stimulus was a circularly symmetric illusory Craik-

O’Brien-Cornsweet patch (see figure 2). The radius of the stimulus was 1.33 deg and

peak-to-peak contrast 10 %. The luminance of the stimulus at the center was the

background luminance, 50 cd/m². The stimulus was masked by adding a one-

dimensional annular “ring noise”, made by randomizing the luminance (s.d. 2 cd/m²) of

64 concentric rings whose total radius was 2.67 deg.

In the second experiment, the test stimulus was a circular “step” patch of uniform

luminance (65 cd/m²), whose radius was varied (0.33, 0.66 or 1.33 deg). The stimulus was

masked by a ring noise mask (s.d. 3 cd/m²) composed of 64 rings, total radius 0.66, 1.33

or 2.67 deg. In both experiments, the target and the noise contrasts were chosen so that

the target stimuli were clearly above the detection threshold, but the noise masks still

had a noticeable effect on the perceived brightness.

The comparison stimulus was always an unmasked uniform circular patch with the

same radius as the test. The luminance was varied (4 levels, chosen so that the lowest

luminance resulted in ca. 10 % brighter-than judgments and the highest ca. 90 %).

Figure 7. Study II: stimuli and the method. Two presentation intervals, one containing the test stimulus

and the other containing the comparison stimulus were shown to the subject in random order in each trial.

The test stimulus was either a constant Craik-O’Brien-Cornsweet or a uniform step [in the illustration]

patch masked with white noise. Physical luminance of the comparison stimulus was varied (4 levels). The

subject chose the interval in which the stimulus appeared brighter.

The method of constant stimuli (MOCS) with a two-interval brightness matching task

was used. In every trial, subjects were asked to choose the interval in which the patch

(test or comparison, presented in random order) appeared brighter (figure 7). Noise

masks as well as comparison stimulus luminance and the response (comparison/test

brighter) were recorded for the classification image analysis.

Standard classification image analysis was used, except that separate sub-classification

images were first computed for each comparison stimulus level (see figure 8).

Figure 8. Study II: Data analysis. Classification images were analyzed in two steps. First, sub-

classification images CIk for each comparison stimulus luminance level were computed by taking the

average (across the trials) of the noise mask profile with a “target brighter than the comparison” response

n> and subtracting from this the average of the noise masks associated with “target not brighter” response

n<. The classification image CI was then computed by taking the average of the sub-classification images.
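
The two-step analysis can be sketched as below. The function and argument names are hypothetical, and skipping a comparison level that contains only one response type is an assumption made here to keep the sketch well defined; the text does not say how such cases were handled.

```python
import numpy as np

def classification_image(noise, level, test_brighter):
    """Average of per-level sub-classification images.

    noise         : (n_trials, n_rings) noise mask profiles
    level         : (n_trials,) comparison luminance level of each trial
    test_brighter : (n_trials,) True if the masked test was judged brighter
    """
    noise = np.asarray(noise, dtype=float)
    level = np.asarray(level)
    test_brighter = np.asarray(test_brighter, dtype=bool)

    sub_images = []
    for k in np.unique(level):
        in_level = level == k
        brighter = noise[in_level & test_brighter]
        not_brighter = noise[in_level & ~test_brighter]
        if len(brighter) == 0 or len(not_brighter) == 0:
            continue  # assumption: a level with one response type is skipped
        # CI_k: mean noise for "brighter" minus mean noise for "not brighter"
        sub_images.append(brighter.mean(axis=0) - not_brighter.mean(axis=0))

    if not sub_images:
        raise ValueError("no comparison level contains both response types")
    return np.mean(sub_images, axis=0)

# Tiny hand-made example: two levels, two rings.
ci = classification_image(
    noise=[[2, 0], [0, 0], [4, 2], [0, 0]],
    level=[0, 0, 1, 1],
    test_brighter=[True, False, True, False],
)
```

Computing the difference within each comparison level before averaging keeps changes in the comparison luminance from mixing with the noise-driven response differences.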

The classification image estimates the “total” weight the subject gives to each of the

different rings in the brightness task. However, as the area of a stimulus ring is directly

proportional to its distance from the center, the “raw” classification images do not

reflect the sensitivity per unit area of the stimulus but per ring. Thus, even if the

sensitivity to the brightness information were uniform in the area of the target

stimulus, the raw classification image would have more weight farther from the center.

Therefore, the classification images were normalized by the number of pixels in the

stimulus to estimate the sensitivity per unit area.
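
A minimal sketch of this normalization, assuming rings of equal width and using the analytic ring area in place of a pixel count (both are assumptions; the function name is hypothetical). For ring i (counted from 0) of width w, the area is π((i+1)² − i²)w² = π(2i+1)w², which is proportional to the distance of the ring's midline from the center.

```python
import numpy as np

def normalize_per_area(raw_ci, ring_width=1.0):
    """Divide each ring's weight by the area of that ring.

    Assumes equal-width concentric rings, ring 0 being the central disc.
    """
    raw_ci = np.asarray(raw_ci, dtype=float)
    i = np.arange(raw_ci.size)
    ring_area = np.pi * ring_width**2 * (2 * i + 1)
    return raw_ci / ring_area

# A raw profile that grows with ring area corresponds to uniform
# sensitivity per unit area after normalization.
uniform = normalize_per_area(np.pi * np.array([1.0, 3.0, 5.0, 7.0]))
```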

Finally, the bootstrap method (Efron & Tibshirani, 1993) was used to estimate the

standard error of the classification images.
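
A bootstrap standard error of this kind can be sketched as follows. The resampling unit (single trials), the number of resamples and the function names are assumptions; the text states only that the bootstrap was used.

```python
import numpy as np

def bootstrap_se(estimator, n_trials, n_boot=1000, rng=None):
    """Bootstrap standard error of a per-ring estimate.

    estimator : callable taking an index array and returning a 1-D array
                computed from the selected trials (e.g. a classification image)
    n_trials  : number of trials in the original data set
    """
    rng = np.random.default_rng() if rng is None else rng
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, n_trials, size=n_trials)  # resample with replacement
        estimates.append(estimator(idx))
    return np.std(estimates, axis=0, ddof=1)

# Illustration with synthetic data: the standard error of a mean of 400
# unit-variance samples should be close to 1 / sqrt(400) = 0.05.
rng = np.random.default_rng(1)
data = rng.normal(size=(400, 3))
se = bootstrap_se(lambda idx: data[idx].mean(axis=0), 400, n_boot=500, rng=rng)
```

Passing the estimator as a callable lets the same routine wrap the full classification image computation, so that each resample repeats every analysis step.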

3.4.2 Results

Classification image profiles (both raw and normalized) for the brightness of the illusory

Craik-O’Brien-Cornsweet stimulus are shown in figure 9. Classification images peak

inside the stimulus border of the patch and have negative peaks in the background, next

to the border. The normalized classification image clearly has a narrower positive lobe

than the stimulus profile and the weight drops close to zero in the “illusory” area.

Figure 9. Study II results: Craik-O’Brien-Cornsweet stimulus. Classification image profiles for three

subjects. The black line shows the target profile. The blue curve is the “raw” classification image and the

red curve is the spatially normalized classification image. The classification image reveals a positive peak

inside the border area. The weight is nonzero just at the border, whereas the illusory area is almost flat.

This implies that subjects use only stimulus information at the border when assessing the brightness of the

patch. Error bars: 1 standard error of the mean.

The mean perceived brightness (point of subjective equality) of the stimulus was about

77 % of the peak luminance of the Craik-O’Brien-Cornsweet pattern, suggesting a vivid

illusion of a bright surface.

The results for the patch stimulus in study II are shown in figure 10. The

classification image profiles peak at the location of the border, and have a less

prominent negative peak at the background. The tuning is clearly band-pass: lack of the

highest frequencies is evident with the smallest 0.33 deg stimulus in which the weight

drops gradually rather than sharply at the border. Lack of lowest spatial frequencies can

be seen from larger stimuli, as the weight of the classification images drops to zero farther

from the center. Classification images for the 1.33 deg step stimulus and the illusory

Craik-O’Brien-Cornsweet stimuli are highly similar.

Figure 10. Study II results: uniform step patch. Classification image profiles for three subjects and three

stimulus sizes. Black curves: target profile. Blue curves: “raw” classification images and red curves:

spatially normalized classification images. Classification image profiles peak inside the stimulus, near the

border. The relative extent of the classification image profile compared to the target size is dependent on

the size: with small 0.33 deg stimulus, profile covers the entire stimulus but with the large 1.33 deg

stimulus, it covers just the border. Error bar: 1 standard error of mean.

The tuning of the normalized classification images can be seen when plotted to the same

coordinates in figure 11. Classification image profiles largely overlap.

Figure 11. Study II results: normalized classification image profiles for the step stimuli plotted to the

same coordinates, for three subjects. Blue curve: 1.33 deg stimulus; green curve: 0.66 deg stimulus; red

curve: 0.33 deg stimulus. Tuning of the classification image is almost invariant to the stimulus size.

3.5 Studies III & IV: Form integration in Glass patterns

Studies III and IV investigated the integration of form in Glass patterns.

Most of the Glass pattern studies have used near-threshold stimuli, measuring the

minimum proportion of signal fragments among the noise. How well do the results

generalize to the arguably more common suprathreshold stimuli? The relation is not always

trivial: for example, the contrast detection threshold does not predict the suprathreshold

apparent contrast of gratings (Georgeson & Sullivan, 1975). In study III the

sensitivity to concentric and translational orientation structures was investigated in a

suprathreshold regime using Glass patterns with a highly coherent structure. The target

pattern had 100 % proportion of the signal dots whereas the proportion of noise dots in

the comparison stimulus was varied. The task of the subject was to discriminate

between these two patterns, i.e. to detect the stimulus that had “more Glass pattern”.

In the study IV the spatial integration of concentric and translational Glass patterns

was investigated by introducing a global “irregularity”. Patterns were composed from a

number of areas which had a local coherent structure (translational or

concentric/curved) at random orientation. Increasing the number of areas decreases the

amount of globally coherent information in the pattern while keeping the

signal area constant. We compared the effect of irregularity in (locally) translational and

curved patterns.

3.5.1 Stimuli and procedure

Target stimuli were Glass patterns, random dot stimuli where a proportion of black dots

(4.3 x 4.3 arc min) on a bright background are paired to coherently oriented signal

dipoles. Orientation of the dipoles is controlled by a geometrical rule (concentric or

translational). Presentation time was 125 ms, to avoid multiple fixations. An illustration

of the stimuli in study III is shown in figure 12.

The number of stimulus sub-areas in study IV was varied (1, 4 or 9), keeping the size of

the whole stimulus constant (see figure 13). The orientation structure within each sub-

area was randomized (translational: horizontal or vertical; curved: one of the four

quadrants of a global concentric pattern).

The comparison stimulus in the study III was identical to the standard stimulus,

except that the proportion of signal dots was varied (100 % in the target). The rest of

the dots (“noise”) were dipoles in random orientations. The comparison stimulus in the

study IV was identical to the standard stimulus, except for the signal dipoles being in

random orientations. Patterns were thus matched in distribution of dipole lengths.
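
The construction of such patterns can be sketched as follows. This is a minimal illustration, not the stimulus code used in the studies: the function name, the circular aperture, the uniform placement of dipole centers and the horizontal direction of the translational structure are all assumptions.

```python
import numpy as np

def glass_pattern(n_dipoles, dipole_len, coherence=1.0,
                  structure="concentric", radius=1.0, rng=None):
    """Return two (n_dipoles, 2) arrays holding the dots of each dipole.

    coherence : proportion of signal dipoles; the rest get random orientations
    structure : "concentric" (dipoles tangent to circles around the center)
                or "translational" (all signal dipoles horizontal)
    """
    rng = np.random.default_rng() if rng is None else rng
    # Dipole centers uniformly distributed over a disc.
    r = radius * np.sqrt(rng.uniform(size=n_dipoles))
    phi = rng.uniform(0.0, 2.0 * np.pi, size=n_dipoles)
    cx, cy = r * np.cos(phi), r * np.sin(phi)

    if structure == "concentric":
        theta = np.arctan2(cy, cx) + np.pi / 2.0   # tangential orientation
    elif structure == "translational":
        theta = np.zeros(n_dipoles)                # horizontal orientation
    else:
        raise ValueError("structure must be 'concentric' or 'translational'")

    # Noise dipoles keep the same length but get a random orientation.
    n_signal = int(round(coherence * n_dipoles))
    theta[n_signal:] = rng.uniform(0.0, np.pi, size=n_dipoles - n_signal)

    dx = 0.5 * dipole_len * np.cos(theta)
    dy = 0.5 * dipole_len * np.sin(theta)
    first = np.column_stack([cx - dx, cy - dy])
    second = np.column_stack([cx + dx, cy + dy])
    return first, second

first, second = glass_pattern(200, 0.1, rng=np.random.default_rng(0))
```

Because noise dipoles differ from signal dipoles only in orientation, target and comparison patterns generated this way have identical dipole length distributions, as the text requires.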

Figure 12. Study III Glass pattern stimuli. Left: translational orientation structure. Right: concentric

orientation structure. Subject’s task was to discriminate between a “perfect” Glass pattern with 100 %

coherent signal dots (in the figure) and the comparison stimuli, where a proportion of dots were at random

orientations (< 100 % signal dots). Dipole length (the separation between the signal dots) was varied from

optimal 8.6 arc min to 21.5 arc min. The stimulus radius was 6.65 deg and the dot density 4.4 dots/deg2.

2IFC procedure with an adaptive staircase method was used. Two stimulus intervals,

one containing the target and one containing the comparison was shown to the subject in

a random order. Subject chose the target interval. A staircase algorithm adjusted the

proportion of signal dipoles to 77.9 % of correct choices.
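
The text does not state which staircase rule was used. As a hedged illustration only, the sketch below implements a transformed up-down rule: with n consecutive correct responses required before the task is made harder, the track converges on the proportion correct p satisfying p^n = 0.5, which for a three-down/one-up rule is about 79.4 %. The function names, the step size and the choice of the noise-dot proportion as the tracked variable are assumptions.

```python
def staircase_step(level, correct, run, step, n_down=3, lo=0.0, hi=1.0):
    """Update a transformed up-down staircase after one trial.

    level   : proportion of noise dipoles in the comparison stimulus
              (a smaller value makes the discrimination harder)
    correct : whether the subject chose the target interval
    run     : number of consecutive correct responses so far
    Returns the new (level, run).
    """
    if correct:
        run += 1
        if run == n_down:            # n_down correct in a row: make it harder
            level, run = level - step, 0
    else:                            # any error: make it easier
        level, run = level + step, 0
    return min(max(level, lo), hi), run

def convergence_point(n_down=3):
    """Proportion correct on which an n_down-down/1-up track converges."""
    return 0.5 ** (1.0 / n_down)

level, run = 0.5, 0
for response in (True, True, True, False):
    level, run = staircase_step(level, response, run, step=0.1)
```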

Figure 13. Study IV: Glass pattern stimuli. The area of the stimulus was divided to a number of sub-

areas. The local orientation of each area was randomized. Both curved and translational patterns were

used. A: curved pattern, 1 sub-area. B: translational pattern, 1 sub-area. C: curved pattern, 4 sub-areas. D:

curved pattern, 9 sub-areas. The area of the whole stimulus was always 6 x 6 deg. The dot density was

6.25 dots/deg2. Dipole length was 8.6 arc min.

3.5.2 Results

Results of the study III, expressed as a proportion of random dots in the comparison

stimulus at 79.4 % discrimination threshold are shown in figure 14. The discriminability

of the highly coherent patterns is easier for concentric than for translational patterns

across range of dipole lengths. In a control experiment, the stimulus border was

smoothed with a Gaussian spatial window. This did not cause a systematic effect (2

observers), suggesting that the difference is not due to a “border artifact” (Dakin & Bex,

2002).

Figure 14. Results of the study III: discrimination thresholds of highly coherent Glass patterns plotted as

a proportion of random orientation dipoles in the comparison stimulus versus the dipole length. Results

are mean of 6 subjects. Error bars represent ±1 standard error of mean.

Results of the study IV, plotted as a detection threshold versus the number of sub-areas

are shown in figure 15. When increasing the number of sub-areas, the detectability of

translational structure remains almost constant whereas thresholds for the curved Glass

pattern increase steeply. A possible problem with the experimental design was that the

number of possible orientations within each sub-area was 4 in the curved pattern but 2

in the translational. Observers might have more uncertainty of the local stimulus

orientation in the curved pattern, and thus use a less efficient detection strategy, which

could explain the difference. To rule out this possibility a control experiment was run in

which the orientations of the sub-elements were fixed for the course of the experiment.

The main result was the same: increasing the number of sub-areas with the curved

patterns increases the threshold whereas the performance with the translational patterns

remains constant.

Figure 15. Results of the study IV: detection thresholds for translational and curved Glass patterns

plotted as a function of the number of sub-areas in the stimulus. Results are the mean of 4 subjects. Blue

curve: curved structure. Black curve: translational structure. Increasing the number of sub-areas with an

independently randomized structure deteriorates performance with the curved patterns but has little effect

with the translational patterns. Error bars represent ±1 standard error of mean.

4 Discussion

4.1 Mechanisms of collinear facilitation

In the study I we directly measured how the behavioral receptive field changes with the

collinear context. With the collinear flankers, the behavioral receptive field is elongated

towards the flankers, i.e. the flankers increase the sensitivity in the collinear axis, at the

ends of the receptive field.

When the data was analyzed in two parts, the elongation was found to be significant

only in the early trials. It has been previously reported that also the collinear facilitation

disappears after prolonged perceptual practice when the flankers are placed at a single

distance (Polat & Sagi, 1994b). The similar dependence on the perceptual learning

provides further support that collinear facilitation without the noise masks and the

observed elongation of the behavioral receptive fields have a common neural cause.

Recently Petrov et al. (2006) showed that cueing the spatial position of the Gabor

(without flankers) with a faint, low-contrast circle results in about the same improvement

in detectability as flanker-induced collinear facilitation, arguing that collinear

facilitation is spatial uncertainty reduction. However, Chen and Tyler (2008) observed

collinear facilitation in the contrast discrimination paradigm, which should be almost

immune to uncertainty effects.

An uncertainty-based explanation, and especially one based on spatial uncertainty, is

problematic here: the flankers change only the length of the filter but not

its other properties, even though flankers could act as cues for the correct target spatial

frequency, location, etc. Uncertainty effects were further investigated by a computer

simulation. It was found that only uncertainty about the orientation of the target may

cause “shrinking” of the behavioral receptive field and only with a massive uncertainty

about the target orientation. Moreover, our data shows that with some subjects (TP, TR,

VS; TR statistically significant at p<.05) the horizontal length of the behavioral

receptive field was longer than the ideal observer’s, which cannot be explained by a

reduction of orientation uncertainty. Lastly, the sampling efficiency of the behavioral

receptive field did not show systematic increase.

The classification images give a more detailed picture of how the flankers change the

sensitivity to the target across the space. The results point towards a low-level

explanation of sensitivity changes between filters sensitive to the target. Arguably the

simplest such explanation is to assume that flankers act like a pedestal stimulus. The

results do not rule out the possibility that even more complex interactions between

cortical filters have a role in collinear interactions, if these act by increasing selectively

the sensitivity of the filters situated near to the end of the target. However, no general

increase in the gain of the behavioral receptive field mediating the target was seen in

the collinear flankers condition, as estimated from the amplitude of the classification image.

These results give some support to the idea that collinear facilitation might be related to

processes like filling-in contour gaps, as it increases the sensitivity selectively along the

collinear axis, sometimes even outside the stimulus area.

From the viewpoint of natural image statistics, the elongation of the perceptual

receptive field could possibly be understood as the visual system’s more general adaptation

to the statistical structure of the natural stimuli. Contours cause strong spatial

correlations in the orientation content of natural images. If we think of two collinear flanker

Gabors as two samples from a natural image, it is more likely that the space between

them also contains an orientation signal in the same direction than not. In a preliminary

study, an “ecologically valid” model observer that utilizes the statistical structure of

natural scenes was constructed. It was shown that the ecologically valid model observer

had an elongated receptive field for the flankers condition compared with the no-flankers

condition (Hyvärinen, 2005). If this explanation is correct, it is interesting that

observers cannot switch off the “ecological” strategy, even when it is not ideal for

this detection task with white noise, and even when they are well accustomed to the

stimuli.

4.2 Edge integration and brightness perception

Study II investigated what stimulus information the visual system uses to compute the

surface brightness, using the classification image paradigm with a novel brightness

matching task.

Behavioral receptive fields for both “real” step and illusory Craik-O’Brien-

Cornsweet surfaces peaked at the border of the surface, suggesting that brightness is

computed from the border information of the stimulus. Results provide direct support to

the idea that brightness representations of surfaces are perceptually completed or ‘filled-

in’ from the border information: stimulus information farther from the border does not

contribute to the surface brightness. The illusory Craik-O’Brien-Cornsweet stimulus and

the step stimulus had highly similar classification image profiles, supporting the idea

that the illusion arises because it shares the critical border features with a

“real” surface stimulus. Behavioral receptive field tuning for the illusory stimulus was

not wider than the stimulus profile. This is evidence against the “low-frequency

boosting” hypothesis of the Craik-O’Brien-Cornsweet illusion (Dakin & Bex, 2003a).

Changing the size of the step stimulus caused a remarkably small change to the

tuning of the behavioral receptive field estimated from the classification image. This

supports the idea that brightness information is processed in a specialized neural spatial

frequency channel (Perna & Morrone, 2007; Salmela & Laurinen, 2005). The tuning of

the channel, estimated from the exponential function fits, shows band-pass

characteristics with a low-frequency cut-off near 1 cpd. A similar low-frequency cut-off

was obtained also with a paradigm where the critical stimulus information for the

brightness was investigated by filtering the spatial frequency content of the stimulus

(Perna & Morrone, 2007).

Lastly, the classification images for the perceived brightness resemble closely the

classification images for contrast discrimination using similar stimuli (Shimozaki,

Eckstein, & Abbey, 2005). This supports the idea that perceived contrast and brightness

in simple stimulus displays (as here) reflect processing in similar mechanisms sensitive

to the border structure (Arend & Spehar, 1993a; Arend & Spehar, 1993b).

4.3 Feature integration in Glass patterns

Studies III and IV investigated the form integration in Glass patterns. In the study III

discrimination of supra-threshold concentric and translational Glass patterns was

investigated. Detection of noise was easier (smaller proportion of noise dots) among

concentric than among translational Glass patterns, across a wide range of dot

separations. The results were not dependent on the border structure.

In study IV, a new paradigm to study the integration of translational and concentric

patterns was used by dividing the global Glass stimulus into a number of sub-areas that

each consisted of fragments of Glass patterns at random orientation. The number of sub-

areas was varied to test how the integration depends on the coherent global structure.

Dividing the stimulus into sub-areas caused a large drop in the detection performance for

concentric Glass patterns, whereas it had relatively little effect with the translational

patterns.

The results support the idea that concentric and translational patterns are processed in

at least partially separate neural systems. Concentric patterns are more sensitive to the

global stimulus structure, suggesting that they are detected in a relatively late stage. The

difference between concentric and translational patterns may be even more pronounced

in suprathreshold stimuli, as studies with near-threshold stimuli have given somewhat

inconsistent results (Dakin & Bex, 2003b).

As further evidence for separate mechanisms, translational and concentric patterns

have been shown to have different spatio-temporal integration characteristics (Aspell,

Wattam-Bell, & Braddick, 2006). In addition, recent fMRI studies show that higher

cortical areas, such as V4 show stronger BOLD responses to concentric and radial

compared to translational patterns (Dumoulin & Hess, 2007; Wilkinson et al., 2000).

5 Conclusions

Despite a fairly good understanding of the early visual processing, relatively little is

known about the integration of the elementary features. Studies in this thesis

investigated integration of visual features in low- and mid-level visual tasks.

Two studies used the recently developed classification image method (Eckstein &

Ahumada, 2002) to estimate directly in the spatial domain what stimulus information

the visual system uses in a psychophysical task. In study I, the effect of collinear

flanker stimuli that facilitate the visibility of a target was investigated. It was shown that

behavioral receptive fields measured with the flankers are elongated to the direction of

the flankers. The results provide direct information about how the local context changes

the visual processing and provide support for the idea that collinear facilitation

operates on the low-level neural filters. In study II, the information the visual system

uses for brightness perception was measured using the classification image method. A novel

brightness matching task with “real” step and illusory Craik-O’Brien-Cornsweet stimuli

was used. The results show that brightness of a stimulus is computed from the

luminance borders of the surface. In addition, results provide support for the idea that

perceived brightness reflects the output of a fixed frequency channel.

Studies III and IV investigated form integration in Glass patterns. Study III

showed that concentric global form was more easily discriminated than translational form in highly coherent

patterns. Study IV showed that curved forms are more dependent on orientation

coherence of the pattern, measured by varying the number of stimulus sub-areas that

contained independently randomized local orientation.

7 References

Abbey, C. K., & Eckstein, M. P. (1999). Theory for estimating human-observer templates in two-alternative forced choice experiments. Information Processing in Medical Imaging: 17th International Conference, IPMI 2001, 24-36.

Adini, Y., Sagi, D., & Tsodyks, M. (1997). Excitatory-inhibitory network in the visual cortex: Psychophysical evidence. Proceedings of the National Academy of Sciences of the United States of America, 94, 10426-10431.

Ahumada, A. J. (2002). Classification image weights and internal noise level estimation. Journal of Vision, 2, 121-131.

Ahumada, A. J., & Beard, B. L. (1999). Classification images for detection. Investigative Ophthalmology and Visual Science, S3015.

Angelucci, A., Levitt, J. B., Walton, E. J., Hupe, J. M., Bullier, J., & Lund, J. S. (2002). Circuits for local and global signal integration in primary visual cortex. The Journal of Neuroscience : The Official Journal of the Society for Neuroscience, 22, 8633-8646.

Arend, L. E., & Spehar, B. (1993a). Lightness, brightness, and brightness contrast: 1. illuminance variation. Perception & Psychophysics, 54, 446-56.

Arend, L. E., & Spehar, B. (1993b). Lightness, brightness, and brightness contrast: 2. reflectance variation. Perception & Psychophysics, 54, 457-68.

Aspell, J. E., Wattam-Bell, J., & Braddick, O. (2006). Interaction of spatial and temporal integration in global form processing. Vision Research, 46, 2834-2841.

Beard, B. L., & Ahumada, A. J. (1998). Technique to extract relevant image features for visual tasks. Proceedings of SPIE, 3299, 79-85.

Campbell, F. W., & Green, D. G. (1965). Optical and retinal factors affecting visual resolution. Journal of Physiology, 181, 576-93.

Campbell, F. W., & Robson, J. G. (1968). Application of Fourier analysis to the visibility of gratings. Journal of Physiology (London), 197, 551-566.

Cannon, M. W., & Fullenkamp, S. C. (1991). Spatial interactions in apparent contrast: Inhibitory effects among grating patterns of different spatial frequencies, spatial positions and orientations. Vision Research, 31, 1985-1998.

Carandini, M., Heeger, D. J., & Movshon, J. A. (1997). Linearity and normalization in simple cells of the macaque primary visual cortex. Journal of Neuroscience, 17, 8621-8644.

Cavanaugh, J. R., Bair, W., & Movshon, J. A. (2002). Nature and interaction of signals from the receptive field center and surround in macaque V1 neurons. Journal of Neurophysiology, 88, 2530-2546.

Chen, C. C., & Tyler, C. W. (2001). Lateral sensitivity modulation explains the flanker effect in contrast discrimination. Proceedings of the Royal Society - Series B, 268, 509-516.

Chen, C. C., & Tyler, C. W. (2002). Lateral modulation of contrast discrimination: Flanker orientation effects. Journal of Vision, 2, 520-530.

Chen, C. C., & Tyler, C. W. (2008). Excitatory and inhibitory interaction fields of flankers revealed by contrast-masking functions. Journal of Vision, 8, 10.1-14.

Chubb, C., Sperling, G., & Solomon, J. A. (1989). Texture interactions determine perceived contrast. Proceedings of the National Academy of Sciences, U.S.A., 86, 9631-9635.

Chung, S. T., Levi, D. M., & Tjan, B. S. (2005). Learning letter identification in peripheral vision. Vision Research, 45, 1399-1412.

Cornelissen, F. W., Wade, A. R., Vladusich, T., Dougherty, R. F., & Wandell, B. A. (2006). No functional magnetic resonance imaging evidence for brightness and color filling-in in early human visual cortex. Journal of Neuroscience, 26, 3634-3641.

Dakin, S. C. (1997). The detection of structure in Glass patterns: Psychophysics and computational models. Vision Research, 37, 2227-2246.

Dakin, S. C., & Bex, P. J. (2002). Summation of concentric orientation structure: Seeing the Glass or the window? Vision Research, 42, 2013-2020.

Dakin, S. C., & Bex, P. J. (2003a). Natural image statistics mediate brightness 'filling in'. Proceedings of the Royal Society - Series B, 270, 2341-8.

Dakin, S. C., & Bex, P. J. (2003b). Response to Wilson & Wilkinson: Evidence for global processing but no evidence for specialised detectors in the visual processing of Glass patterns. Vision Research, 43, 563-564.

David, S. V., Hayden, B. Y., & Gallant, J. L. (2006). Spectral receptive field properties explain shape selectivity in area V4. Journal of Neurophysiology, 96, 3492-3505.

De Valois, R. L., & De Valois, K. (1988). Spatial vision. New York: Oxford University Press.

DeAngelis, G. C., Freeman, R. D., & Ohzawa, I. (1994). Length and width tuning of neurons in the cat's primary visual cortex. Journal of Neurophysiology, 71, 347-374.

Dodwell, P. C. (1983). The Lie transformation group model of visual perception. Perception & Psychophysics, 34, 1-16.

Dumoulin, S. O., & Hess, R. F. (2007). Cortical specialization for concentric shape processing. Vision Research, 47, 1608-1613.

Eckstein, M. P., & Ahumada, A. J. (2002). Classification images: A tool to analyze visual strategies. Journal of Vision, 2, i.

Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman & Hall.

Foley, J. M. (1994). Human luminance pattern-vision mechanisms: Masking experiments require a new model. Journal of the Optical Society of America A, 11, 1710-1719.

Foley, J. M., & Legge, G. E. (1981). Contrast detection and near-threshold discrimination in human vision. Vision Research, 21, 1041-1053.

Gallant, J. L., Braun, J., & Van Essen, D. C. (1993). Selectivity for polar, hyperbolic, and Cartesian gratings in macaque visual cortex. Science, 259, 100-103.

Gallant, J. L., Connor, C. E., Rakshit, S., Lewis, J. W., & Van Essen, D. C. (1996). Neural responses to polar, hyperbolic, and Cartesian gratings in area V4 of the macaque monkey. Journal of Neurophysiology, 76, 2718-2739.

Georgeson, M. A., & Sullivan, G. D. (1975). Contrast constancy: Deblurring in human vision by spatial frequency channels. The Journal of Physiology, 252, 627-656.

Gilbert, C. D., & Wiesel, T. N. (1983). Clustered intrinsic connections in cat visual cortex. Journal of Neuroscience, 3, 1116-1133.

Gilbert, C. D., & Wiesel, T. N. (1989). Columnar specificity in intrinsic horizontal connections. Journal of Neuroscience, 9, 2432-2442.

Glass, L. (1969). Moiré effect from random dots. Nature, 223, 578-580.

Glass, L., & Perez, R. (1973). Perception of random dot interference patterns. Nature, 246, 360-362.

Glass, L., & Switkes, E. (1976). Pattern recognition in humans: Correlations which cannot be perceived. Perception, 5, 67-72.

Gold, J. M., Murray, R. F., Bennett, P. J., & Sekuler, A. B. (2000). Deriving behavioural receptive fields for visually completed contours. Current Biology, 10, 663-6.

Green, D. M., & Swets, J. A. (1974). Signal detection theory and psychophysics (Reprint ed.). New York: John Wiley and Sons.

Grossberg, S., & Todorovic, D. (1988). Neural dynamics of 1-D and 2-D brightness perception: A unified model of classical and recent phenomena. Perception & Psychophysics, 43, 241-77.

Heeger, D. J. (1992). Normalization of cell responses in cat striate cortex. Visual Neuroscience, 9, 181-197.

Hess, R. F., Dakin, S. C., & Field, D. J. (1998). The role of "contrast enhancement" in the detection and appearance of visual contours. Vision Research, 38, 783-787.

Hubel, D. H., & Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. Journal of Physiology (London), 160, 106-154.

Hubel, D. H., & Wiesel, T. N. (1959). Receptive fields of single neurones in the cat's striate cortex. Journal of Physiology, 148, 574-591.

Hyvärinen, A. (2005). Classification images and ecologically ideal observers [Abstract]. Perception, 34, 235.

Kanwisher, N., & Yovel, G. (2006). The fusiform face area: A cortical region specialized for the perception of faces. Philosophical Transactions of the Royal Society of London - Series B, 361, 2109-2128.

Kobatake, E., & Tanaka, K. (1994). Neuronal selectivities to complex object features in the ventral visual pathway of the macaque cerebral cortex. Journal of Neurophysiology, 71, 856-867.

Kurki, I., Laurinen, P., Peromaa, T., & Saarinen, J. (2003). Spatial integration in Glass patterns. Perception, 32, 1211-1220.

Levi, D. M., & Klein, S. A. (2003). Noise provides new signals about the spatial vision of amblyopes. Journal of Neuroscience, 23, 2522-2526.

Li, R. W., Levi, D. M., & Klein, S. A. (2004). Perceptual learning improves efficiency by re-tuning the decision 'template' for position discrimination. Nature Neuroscience, 7, 178-183.

Loffler, G. (2008). Perception of contours and shapes: Low and intermediate stage mechanisms. Vision Research, 48, 2106-2127.

Maffei, L., & Fiorentini, A. (1976). The unresponsive regions of visual cortical receptive fields. Vision Research, 16, 1131-1139.

Majaj, N. J., Pelli, D. G., Kurshan, P., & Palomares, M. (2002). The role of spatial frequency channels in letter identification. Vision Research, 42, 1165-84.

Maloney, R. K., Mitchison, G. J., & Barlow, H. B. (1987). Limit to the detection of Glass patterns in the presence of noise. Journal of the Optical Society of America A-Optics & Image Science, 4, 2336-2341.

Marcelja, S. (1980). Mathematical description of the responses of simple cortical cells. Journal of the Optical Society of America, 70, 1297-1300.

Marr, D. (1982). Vision. A computational investigation into the human representation and processing of visual information. New York, NY: W.H. Freeman.

Meese, T. S., Hess, R. F., & Williams, C. B. (2001). Spatial coherence does not affect contrast discrimination for multiple Gabor stimuli. Perception, 30, 1411-1422.

Morrone, M. C., & Burr, D. C. (1988). Feature detection in human vision: A phase-dependent energy model. Proceedings of the Royal Society of London - Series B, 235, 221-45.

Murray, R. F., Bennett, P. J., & Sekuler, A. B. (2002). Optimal methods for calculating classification images: Weighted sums. Journal of Vision, 2, 79-104.

Murray, R. F., Bennett, P. J., & Sekuler, A. B. (2005). Classification images predict absolute efficiency. Journal of Vision, 5, 139-149.

Näsänen, R. (1999). Spatial frequency bandwidth used in the recognition of facial images. Vision Research, 39, 3824-33.

Neri, P., & Heeger, D. (2002). Spatiotemporal mechanisms for detecting and identifying image features in human vision. Nature Neuroscience, 5, 812-816.

Neri, P., & Levi, D. M. (2006). Receptive versus perceptive fields from the reverse-correlation viewpoint. Vision Research, 46, 2465-74.

Olzak, L. A., & Laurinen, P. I. (1999). Multiple gain control processes in contrast-contrast phenomena. Vision Research, 39, 3983-3987.

Orban, G. A., Van Essen, D., & Vanduffel, W. (2004). Comparative mapping of higher visual areas in monkeys and humans. Trends in Cognitive Sciences, 8, 315-324.

Pasupathy, A., & Connor, C. E. (1999). Responses to contour features in macaque area V4. Journal of Neurophysiology, 82, 2490-2502.

Pasupathy, A., & Connor, C. E. (2001). Shape representation in area V4: Position-specific tuning for boundary conformation. Journal of Neurophysiology, 86, 2505-2519.

Pasupathy, A., & Connor, C. E. (2002). Population coding of shape in area V4. Nature Neuroscience, 5, 1332-1338.

Pelli, D. G. (1985). Uncertainty explains many aspects of visual contrast detection and discrimination. Journal of the Optical Society of America A-Optics & Image Science, 2, 1508-1532.

Pelli, D. G. (1999). Close encounters--an artist shows that size affects shape. Science, 285, 844-6.

Perna, A., & Morrone, M. C. (2007). The lowest spatial frequency channel determines brightness perception. Vision Research, 47, 1282-91.

Perna, A., Tosetti, M., Montanaro, D., & Morrone, M. C. (2005). Neuronal mechanisms for illusory brightness perception in humans. Neuron, 47, 645-651.

Petrov, Y., Verghese, P., & McKee, S. P. (2006). Collinear facilitation is largely uncertainty reduction. Journal of Vision, 6, 170-178.

Polat, U., & Bonneh, Y. (2000). Collinear interactions and contour integration. Spatial Vision, 13, 393-401.

Polat, U., Mizobe, K., Pettet, M. W., Kasamatsu, T., & Norcia, A. M. (1998). Collinear stimuli regulate visual responses depending on cell's contrast threshold. Nature, 391, 580-584.

Polat, U., & Norcia, A. M. (1996). Neurophysiological evidence for contrast dependent long-range facilitation and suppression in human visual cortex. Vision Research, 36, 2099-2109.

Polat, U., & Norcia, A. M. (1998). Elongated physiological summation pools in the human visual cortex. Vision Research, 38, 3735-3741.

Polat, U., & Sagi, D. (1993). Lateral interactions between spatial channels: Suppression and facilitation revealed by lateral masking experiments. Vision Research, 33, 993-999.

Polat, U., & Sagi, D. (1994a). The architecture of perceptual spatial interactions. Vision Research, 34, 73-78.

Polat, U., & Sagi, D. (1994b). Spatial interactions in human vision: From near to far via experience-dependent cascades of connections. Proceedings of the National Academy of Sciences, U.S.A., 91, 1206-1209.

Rockland, K. S., & Lund, J. (1982). Widespread periodic intrinsic connections in the tree shrew visual cortex. Science, 215, 1532-1534.

Salmela, V. R., & Laurinen, P. I. (2005). Spatial frequency tuning of brightness polarity identification. Journal of the Optical Society of America, A, Optics, Image Science, & Vision, 22, 2239-45.

Sceniak, M. P., Hawken, M. J., & Shapley, R. (2001). Visual spatial characterization of macaque V1 neurons. Journal of Neurophysiology, 85, 1873-1887.

Schwabe, L., Obermayer, K., Angelucci, A., & Bressloff, P. C. (2006). The role of feedback in shaping the extra-classical receptive field of cortical neurons: A recurrent network model. The Journal of Neuroscience, 26, 9117-9129.

Seu, L., & Ferrera, V. P. (2001). Detection thresholds for spiral Glass patterns. Vision Research, 41, 3785-3790.

Shimozaki, S. S., Eckstein, M. P., & Abbey, C. K. (2005). Spatial profiles of local and nonlocal effects upon contrast detection/discrimination from classification images. Journal of Vision, 5, 45-57.

Solomon, J. A. (2002). Noise reveals visual mechanisms of detection and discrimination. Journal of Vision, 2, 105-120.

Solomon, J. A., & Morgan, M. J. (2000). Facilitation from collinear flanks is cancelled by non-collinear flanks. Vision Research, 40, 279-286.

Solomon, J. A., & Pelli, D. G. (1994). The visual filter mediating letter identification. Nature, 369, 395-7.

Solomon, J. A., Watson, A. B., & Morgan, M. J. (1999). Transducer model produces facilitation from opposite-sign flanks. Vision Research, 39, 287-292.

Stevens, K. A. (1978). Computation of locally parallel structure. Biological Cybernetics, 29, 19-28.

Tjan, B. S., & Nandy, A. S. (2006). Classification images with uncertainty. Journal of Vision, 6, 387-413.

Westheimer, G. (1999). Gestalt theory reconfigured: Max Wertheimer's anticipation of recent developments in visual neuroscience. Perception, 28, 5-15.

Wilkinson, F., James, T. W., Wilson, H. R., Gati, J. S., Menon, R. S., & Goodale, M. A. (2000). An fMRI study of the selective activation of human extrastriate form vision areas by radial and concentric gratings. Current Biology, 10, 1455-1458.

Williams, C. B., & Hess, R. F. (1998). Relationship between facilitation at threshold and suprathreshold contour integration. Journal of the Optical Society of America A, 15, 2046-2051.

Wilson, H. R., & Wilkinson, F. (1998). Detection of global structure in Glass patterns: Implications for form vision. Vision Research, 38, 2933-2947.

Wilson, H. R., Wilkinson, F., & Asaad, W. (1997). Concentric orientation summation in human form vision. Vision Research, 37, 2325-2330.

Zenger, B., & Sagi, D. (1996). Isolating excitatory and inhibitory nonlinear spatial interactions involved in contrast detection. Vision Research, 36, 2497-2513.