Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions

Mar 28, 2015


Gabriella Parks
Page 1: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Conceptual Spaces

P.D. Bruza

Information Ecology Project

Distributed Systems Technology Centre

Part 1: Fundamental notions

Page 2: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Opening remarks

This tutorial is more about cognitive science than IR, is fragmented and offers a somewhat personal interpretation

The content is drawn mostly from Gärdenfors’ “Conceptual Spaces: The geometry of thought”, MIT Press, 2000.

Also driven by some personal intuition:
– The model theory for IR should be rooted in cognitive semantics
– How do you capture these semantics in a computational form, and what can you do with them?

Page 3: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Gärdenfors’ point of departure

How can representations (information) in a cognitive system be modelled in an appropriate way?

– Symbolic perspective: representation via symbols; a cognitive system is described by a Turing machine (cognition = computation = symbol manipulation)

– Associationist perspective: representation via associations between “different kinds of information elements” (e.g. connectionism – associations modelled by artificial neural networks)

Page 4: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

The problem with the symbolic and associationist perspectives

“mechanisms of concept acquisition, which are paramount for the understanding of many cognitive phenomena, cannot be given a satisfactory treatment in any of these representational forms”

– Concept acquisition (learning) is closely tied to similarity
– Geometric representation: similarity can be “modelled in a natural way”

Page 5: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Gärdenfors’ cognitive model

(Figure: three levels of representation)
– Symbolic level: propositional representation
– Conceptual level: geometric representation
– Associationist (sub-conceptual) level: connectionist representation

Page 6: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Conceptual spaces outline

Quality dimension

Domain

Concept, property

“Conceptual spaces are a framework for a number of empirical theories: concept formation, induction, semantics”

(Context)

How can conceptual spaces be realized (e.g., for IR)

Page 7: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Quality dimensions

Represent various “qualities” of an object:
– Temperature
– Weight
– Brightness
– Pitch
– Height
– Width
– Depth

A distinction is made between “scientific” and “phenomenal” (psychological) dimensions

Page 8: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Quality dimensions (cont’d)

“Each quality dimension is endowed with certain geometrical structures (in some cases topological or ordering relations)”

Weight: isomorphic to the non-negative reals (figure: a half-line starting at 0)

Page 9: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Quality dimensions may have a discrete geometric structure

Discrete structure divides objects into disjoint classes

Kinship relation: father, mother, sister, etc. (geometric structure = discrete points)

“Even for discrete dimensions we can distinguish a rudimentary geometric structure”

Page 10: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Phenomenal vs. scientific interpretations of dimensions

Phenomenal interpretation: dimensions originate from cognitive structures (perception, memories) of humans or other organisms

– E.g. (height, width, depth), hue, pitch

Scientific interpretation: dimensions are treated as part of a scientific theory

– E.g., weight

Page 11: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Example: colour

Hue: the particular shade of colour
– Geometric structure: circle
– Value: polar coordinate

Chromaticity: the saturation of the colour; from grey to higher intensities
– Geometric structure: segment of reals
– Value: real number

Brightness: black to white
– Geometric structure: reals in [0,1]
– Value: real number

Page 12: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Example: colour (hue, chromaticity, brightness)

N.B. The geometric structure allows phenomenologically “complementary” and “opposite” hues to be distinguished

Page 13: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Integral and separable dimensions

Dimensions are integral if an object cannot be assigned a value in one dimension without giving it a value in another:

– E.g. cannot distinguish hue without brightness, or pitch without loudness

Dimensions that are not integral are said to be separable.

Psychologically, integral and separable dimensions are assumed to differ in cross-dimensional similarity:
– Integral dimensions are higher in cross-dimensional similarity than separable dimensions.
– (This point will motivate how similarities in the conceptual space are calculated depending on whether dimensions are integral or separable. N.B. IR matching functions treat all dimensions equally.)

Page 14: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Where do dimensions originate from?

Scientific dimensions: tightly connected to the measurement methods used

Psychological dimensions:
– Some dimensions appear innate, or are developed very early; e.g. inside/outside, dangerous/not-dangerous. (These appear to be pre-conscious)
– Dimensions are necessary for learning – to make sense of “blooming, buzzing, confusion”. Dimensions are added by the learning process to expand the conceptual space:
– E.g., young children have difficulty in identifying whether two objects differ w.r.t. brightness or size, even though they can see the objects differ in some way. “Both differentiation and dimensionalization occur throughout one’s lifetime”.

Page 15: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

In summary,

Quality dimensions are the building blocks of representations within a conceptual space

Gärdenfors’ rebuttal of logical positivism:
– “Humans and other animals can represent the qualities of objects, for example, when planning an action, without presuming an internal language or another symbolic system in which these qualities are expressed. As a consequence, I claim that the quality dimensions of conceptual spaces are independent of symbolic representations and more fundamental than these”

Page 16: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Conceptual spaces outline

Quality dimension

Domain

Concept, property

“Conceptual spaces are a framework for a number of empirical theories: concept formation, induction, semantics”

(Context)

How can conceptual spaces be realized (e.g., for IR)

Page 17: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Domains and conceptual space

A domain is a set of integral dimensions that forms a separable subspace (e.g., hue, chromaticity, brightness)

A conceptual space is a collection of one or more domains
– Cognitive structure is defined in terms of domains as it is assumed that an object can be ascribed certain properties independently of other properties

Not all domains are assumed to be metric – a domain may be an ordering with no distance defined

Domains are not independent, but may be correlated, e.g., the ripeness and colour domains co-vary in the space of fruits

Page 18: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Conceptual spaces outline

Quality dimension

Domain

Concept, property

“Conceptual spaces are a framework for a number of empirical theories: concept formation, induction, semantics”

(Context)

How can conceptual spaces be realized (e.g., for IR)

Page 19: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Properties and concepts: general idea

A property is a region in a subspace (domain)

A concept is based on several separable subspaces

Page 20: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Example property: “red”

(Figure: the colour domain with axes hue, chromaticity and brightness)

Criterion P: A natural property is a convex region of a domain (subspace)

“natural” – those properties that are natural for the purposes of problem solving, planning, communicating, etc

Page 21: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Motivation for convex regions

(Figure: two regions, each containing points x and y; one region is convex, the other is not convex)

x and y are points (objects) in the conceptual space. If x and y both have property P, then any object between x and y is assumed to have property P
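The convexity requirement can be illustrated with a rough numerical check. The sketch below is only an illustration under stated assumptions: "between" is taken to mean "on the straight line segment" (the Euclidean reading), the two regions (`in_disc`, `in_ring`) and the sample grid are hypothetical, and a finite sample can only refute convexity, not prove it.

```python
from typing import Callable, Sequence

Point = Sequence[float]

def points_between(x: Point, y: Point, steps: int = 10) -> list[tuple[float, ...]]:
    """Sample points on the straight line segment from x to y."""
    return [
        tuple(a + (b - a) * t / steps for a, b in zip(x, y))
        for t in range(steps + 1)
    ]

def looks_convex(has_property: Callable[[Point], bool],
                 samples: Sequence[Point], steps: int = 10) -> bool:
    """Return False if some sampled point between two members lacks the property."""
    members = [p for p in samples if has_property(p)]
    for x in members:
        for y in members:
            if not all(has_property(z) for z in points_between(x, y, steps)):
                return False
    return True

# Hypothetical regions in a two-dimensional domain.
def in_disc(p: Point) -> bool:
    """A convex region: the unit disc."""
    return p[0] ** 2 + p[1] ** 2 <= 1.0

def in_ring(p: Point) -> bool:
    """A non-convex region: a ring with a hole in the middle."""
    return 0.25 <= p[0] ** 2 + p[1] ** 2 <= 1.0

grid = [(i / 4, j / 4) for i in range(-4, 5) for j in range(-4, 5)]
disc_is_convex = looks_convex(in_disc, grid)
ring_is_convex = looks_convex(in_ring, grid)
```

The ring fails because the point midway between two opposite members falls into the hole, which is exactly the situation Criterion P rules out for natural properties.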

Page 22: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Remarks about Criterion P

Criterion P: A natural property is a convex region of a domain (subspace)

Assumption: “Most properties expressed by simple words in natural languages can be analyzed as natural properties”

“The semantics of the linguistic constituents (e.g. “red”) is severely constrained by the underlying conceptual space” (I.e. no “bleen”)

“Criterion P provides an account of properties that is independent of both possible worlds and objects”

Strong connection between convex regions and prototype theory (categorization)

(Easier to understand how inductive inferences are made)

Page 23: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Example concept: “apple”

Apple = < , , , texture, fruit, nutrition>

< , , >

Criterion C: A natural concept is represented as a set of regions in a number of domains together with an assignment of salience weights to the domains and information about how the regions in the different domains are correlated
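One possible way to hold the three ingredients of Criterion C in a data structure is sketched below. Everything concrete here is an assumption made for illustration: the class name `Concept`, the choice of axis-aligned boxes as regions (boxes are convex, but real regions need not be boxes), and all numbers for the "apple" example, which are made up and not taken from Gärdenfors.

```python
from dataclasses import dataclass, field

Interval = tuple[float, float]

@dataclass
class Concept:
    name: str
    # domain name -> one (low, high) interval per dimension, i.e. a box-shaped region
    regions: dict[str, list[Interval]]
    # domain name -> salience weight
    salience: dict[str, float]
    # (domain, domain) -> strength of correlation between their regions
    correlations: dict[tuple[str, str], float] = field(default_factory=dict)

    def covers(self, domain: str, point: tuple[float, ...]) -> bool:
        """True if the point lies inside this concept's region for the domain."""
        box = self.regions[domain]
        return all(lo <= v <= hi for v, (lo, hi) in zip(point, box))

    def most_salient_domain(self) -> str:
        """Domain with the largest salience weight."""
        return max(self.salience, key=self.salience.get)

# Hypothetical values, for illustration only.
apple = Concept(
    name="apple",
    regions={"texture": [(0.2, 0.6)], "nutrition": [(0.5, 1.0), (0.0, 0.3)]},
    salience={"texture": 0.3, "nutrition": 0.7},
    correlations={("texture", "nutrition"): 0.4},
)
```

Changing the `salience` values is one way a context could be modelled in such a structure.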

Page 24: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Concepts and inference (in passing)

The salience of different domains determines which associations can be made, and which inferences can be triggered

– Context: moving a piano – leads to association “heavy”

More about this next time...

Page 25: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

How to model relevance: concept?

Topicality       About my topic
Novelty          Unique or the only source; familiar
Currency         Up-to-date
Quality          Well written, credible
Presentation     Comprehensive
Source aspects   Prominent author
Info aspects     Theoretical paper
Appeal           Enjoyable

Table from Yuan, Belkin and Kim, ACM SIGIR 2002 Poster

Page 26: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

How to model a document(s): ?

“An exosomatic memory is a computerized system that operates as an extension to human memory. Ideally, use of an exosomatic system would be transparent, so that finding information would seem the same as remembering it to the human user” (B.C. Brookes, 1975)

– To create computerized representations of data sets that are consistent with human perception of the data sets
– To enable personalized relations to representations of data sets
– To provide natural interfaces for interaction with exosomatic memory

Newby, G. Cognitive space and information space. JASIST 52(12), 2001

Page 27: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Term = dimension

“Since many of the fundamental quality dimensions are determined by our perceptual mechanisms, there is a direct link between properties described by regions of such dimensions and perceptions” (rats!)

However, dimensional spaces based on terms have shown marked correlation with human information processing:

– HAL and note (“It is difficult to know how to encode abstract concepts with traditional semantic features. Global co-occurrence models, such as HAL, may provide a solution to part of this problem”)

– So, using terms as dimensions in a global co-occurrence model leads to useful vector representations of abstract concepts

– HAL’s results seem to be echoed by Newby using Principal Component Analysis on a term-term co-occurrence matrix
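The "term = dimension" idea can be made concrete with a small co-occurrence sketch. This is a deliberate simplification and not HAL's actual weighting scheme: it counts neighbours in a symmetric, unweighted window, and the function name `cooccurrence_vectors`, the window size and the toy sentence are all assumptions for illustration.

```python
from collections import defaultdict

def cooccurrence_vectors(tokens, window=2):
    """Map each term to a sparse vector whose dimensions are neighbouring terms."""
    vectors = defaultdict(lambda: defaultdict(int))
    for i, term in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                # Each co-occurring term is one dimension of the vector for `term`.
                vectors[term][tokens[j]] += 1
    return vectors

tokens = "the cat sat on the mat".split()
vectors = cooccurrence_vectors(tokens, window=1)
cat_vector = dict(vectors["cat"])
the_vector = dict(vectors["the"])
```

Each term ends up as a point in a space with one dimension per vocabulary term, which is the kind of representation the distance measures later in this tutorial operate on.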

Page 28: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Text fragment = dimension

For example, a (term x document) matrix.

Latent semantic analysis (LSA) produces vector representations of words in a reduced dimensional space:
– LSA correlates with human information processing on a number of tasks, e.g., semantic priming
– Landauer et al. often use short fragments (dimension = 1 or 2 sentences)

Dimensional reduction is apparently successful in reproducing cognitive compatibility, but the reason for this is unknown

Determining the appropriate dimensional structure for IR models is still an open question, especially in light of cognitive aspects

Page 29: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Similarity: introductory remarks

Similarity is central to many aspects of cognition: concept formation (learning), memory and perceptual organization

Similarity is not an absolute notion but relative to a particular domain (or dimension)

– “an apple and an orange are similar as they have the same shape”
– Similarity defined in terms of the “number of shared properties” leads to arbitrary similarity – “a writing desk is like a raven”

Similarity is an exponentially decreasing function of distance

N.B. clustering in IR often uses an “absolute” notion of similarity

Page 30: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Metric spaces

A real-valued function d(x,y) is said to be a distance function for a space S if it satisfies the following conditions for all points x, y and z in S:

– Minimality: $d(x,y) \geq 0$, and $d(x,y) = 0$ only if $x = y$
– Symmetry: $d(x,y) = d(y,x)$
– Triangle inequality: $d(x,y) + d(y,z) \geq d(x,z)$

A space that has a distance function is called a metric space
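The three conditions can be spot-checked on a finite sample of points. The sketch below is an assumption-laden illustration: the helper `satisfies_metric_axioms`, the tolerance, the sample points and the deliberately asymmetric function `lopsided` are all hypothetical, and passing on a sample does not prove that a function is a metric on the whole space.

```python
import itertools
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def satisfies_metric_axioms(d, points, tol=1e-9):
    """Check minimality, symmetry and the triangle inequality on a finite sample."""
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < 0:
            return False                      # minimality: never negative
        if x != y and d(x, y) <= tol:
            return False                      # minimality: zero only if x = y
        if abs(d(x, y) - d(y, x)) > tol:
            return False                      # symmetry
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) + d(y, z) < d(x, z) - tol:
            return False                      # triangle inequality
    return True

def lopsided(x, y):
    """A deliberately asymmetric 'distance' (hypothetical), for contrast."""
    return 2 * euclidean(x, y) if x < y else euclidean(x, y)

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
euclidean_ok = satisfies_metric_axioms(euclidean, points)
lopsided_ok = satisfies_metric_axioms(lopsided, points)
```

A judgement pattern like the Tel Aviv and New York example would behave like `lopsided` here: it fails the symmetry condition.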

(There is debate about whether distance is symmetric from a psychological viewpoint. E.g., Tversky et al.: “Tel Aviv judged more similar to New York” than vice versa. Gärdenfors accepts the symmetry axiom)

Page 31: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Equi-distance under the Euclidean metric

$d_E(x,y) = \sqrt{\sum_i (x_i - y_i)^2}$

The set of points at distance d from a point x forms a circle. Points between x and y are on a straight line.

Page 32: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Equi-distance under the city-block metric

$d_C(x,y) = \sum_i |x_i - y_i|$

The set of points at distance d from a point x forms a diamond. The set of points between x and y is a rectangle generated by x and y and the directions of the axes.

Page 33: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Between-ness in the city-block metric

(Figure: points x and y at opposite corners of an axis-aligned rectangle)

All points in the rectangle are considered to be between x and y
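A small sketch can show why the rectangle appears. It assumes the usual metric definition of betweenness, which is not spelled out on the slide: z is between x and y when d(x,z) + d(z,y) = d(x,y). The function names and the sample points are chosen for illustration only.

```python
import math

def city_block(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def is_between(z, x, y, d):
    """z is between x and y if passing through z adds nothing to the distance."""
    return math.isclose(d(x, z) + d(z, y), d(x, y), abs_tol=1e-9)

x, y = (0.0, 0.0), (2.0, 1.0)
corner = (2.0, 0.0)      # a corner of the rectangle spanned by x and y
midpoint = (1.0, 0.5)    # on the straight line from x to y
outside = (3.0, 0.0)     # outside the rectangle

corner_city = is_between(corner, x, y, city_block)
corner_euclid = is_between(corner, x, y, euclidean)
midpoint_city = is_between(midpoint, x, y, city_block)
midpoint_euclid = is_between(midpoint, x, y, euclidean)
outside_city = is_between(outside, x, y, city_block)
```

The corner counts as "between" only under the city-block metric, while the midpoint qualifies under both, so the choice of metric changes which regions count as convex.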

Page 34: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Metrics: integral and separable dimensions

For separable dimensions, calculate the distance using the city-block metric:

– “If two dimensions are separable, the dissimilarity of two stimuli is obtained by adding the dissimilarity along each of the two dimensions”

For integral dimensions, calculate distance using the Euclidean metric:

– “When two dimensions are integral, the dissimilarity is determined by both dimensions taken together”
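One way to combine the two rules is sketched below, as an assumption rather than something stated on the slide: use the Euclidean metric inside each domain (integral dimensions) and add the per-domain distances across domains (separable, city-block style). The domain names and coordinates are made up.

```python
import math

def within_domain(x, y):
    """Euclidean distance: dimensions inside one domain are treated as integral."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def across_domains(a, b):
    """City-block combination: separable domains contribute additively."""
    return sum(within_domain(a[name], b[name]) for name in a)

# Hypothetical objects described in two domains:
# "spatial" = (height, width, depth) and "weight" = (weight,).
obj_a = {"spatial": (0.0, 0.0, 0.0), "weight": (1.0,)}
obj_b = {"spatial": (3.0, 4.0, 0.0), "weight": (3.0,)}

spatial_part = within_domain(obj_a["spatial"], obj_b["spatial"])
total = across_domains(obj_a, obj_b)
```

Here the spatial dimensions are taken together (distance 5), and the weight difference (2) is simply added on top.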

Page 35: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Minkowski metrics

Euclidean and city-block are special cases of Minkowski metrics:

$d_k(x,y) = \sqrt[r]{\sum_i |x_i - y_i|^r}$

– City-block: r = 1
– Euclidean: r = 2
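The Minkowski family can be written as a single function with the exponent r as a parameter. This is a minimal sketch; the function name and the example points are assumptions for illustration.

```python
import math

def minkowski(x, y, r):
    """Minkowski distance; r = 1 gives city-block, r = 2 gives Euclidean."""
    if r < 1:
        raise ValueError("r must be at least 1 for the triangle inequality to hold")
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)

x, y = (0.0, 0.0), (3.0, 4.0)
d_city_block = minkowski(x, y, r=1)
d_euclidean = minkowski(x, y, r=2)
```

For these two points the city-block distance is 7 and the Euclidean distance is 5, so the same pair of objects is judged further apart when the dimensions are treated as separable.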

Page 36: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Scaling dimensions

Due to context, the scales of the different dimensions cannot be assumed identical

$d_k(x,y) = \sqrt[r]{\sum_i w_i |x_i - y_i|^r}$

where $w_i$ is the dimensional scaling factor.
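The effect of the scaling factors can be seen by re-ranking neighbours under two sets of weights. The sketch below assumes that a "context" is modelled simply as a weight vector; the objects, weights and helper names are hypothetical.

```python
def weighted_minkowski(x, y, weights, r=2):
    """Minkowski distance with one scaling factor w_i per dimension."""
    return sum(w * abs(a - b) ** r for w, a, b in zip(weights, x, y)) ** (1.0 / r)

def nearest(query, candidates, weights, r=2):
    """Name of the candidate closest to the query under the given weights."""
    return min(candidates,
               key=lambda name: weighted_minkowski(query, candidates[name], weights, r))

query = (0.0, 0.0)
candidates = {"a": (1.0, 4.0), "b": (3.0, 1.0)}

# Context 1: both dimensions matter equally.
nearest_equal = nearest(query, candidates, weights=(1.0, 1.0))
# Context 2: the second dimension is largely ignored.
nearest_reweighted = nearest(query, candidates, weights=(1.0, 0.1))
```

With equal weights "b" is the closer object, but once the second dimension is down-weighted "a" becomes closer, so the same space yields different similarity judgements in different contexts.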

Page 37: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Similarity as a function of distance

A common assumption in the psychological literature is that similarity is an exponentially decaying function of distance:

$s(x,y) = e^{-c \cdot d(x,y)}$

The constant c is a sensitivity parameter.

The similarity between x and y drops quickly when the distance between the objects is relatively small, while it drops more slowly when the distance is relatively large.

The formula captures the similarity-based generalization performances of human subjects in a variety of settings
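The shape of the decay is easy to check numerically. In this sketch the function name and the value c = 1 are assumptions for illustration.

```python
import math

def similarity(distance, c=1.0):
    """Exponentially decaying similarity; c is the sensitivity parameter."""
    return math.exp(-c * distance)

# Same step of one unit, taken near the object and far from it.
drop_near = similarity(0.0) - similarity(1.0)
drop_far = similarity(3.0) - similarity(4.0)
```

The one-unit step next to the object costs far more similarity than the same step taken at distance 3, which is the behaviour described above. A larger c makes the drop-off steeper.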

Page 38: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

IR-related comments on similarity

In the vector-space model, similarity is determined by the cosine function, which is not exponentially decaying

IR models don’t distinguish between integral and separable dimensions, even though this distinction is significant from a cognitive point of view

Experience so far with computational cognitive models is mixed:
– LSA uses cosine similarity (not exponentially decaying)!!
– HAL used Minkowski (r = 1) to measure semantic distance, i.e. a non-Euclidean distance metric was employed
– (Non-Euclidean metrics should perhaps be explored)
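The difference between the two notions of similarity shows up on vectors that point in the same direction but differ in length. This is a sketch with made-up vectors; `exp_similarity` combines the city-block distance with exponential decay purely for illustration and is not a measure proposed in the sources cited here.

```python
import math

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def city_block(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def exp_similarity(x, y, c=1.0):
    """Exponential decay applied to the city-block (Minkowski r = 1) distance."""
    return math.exp(-c * city_block(x, y))

short_vec, long_vec = (1.0, 1.0), (10.0, 10.0)
cos_sim = cosine(short_vec, long_vec)
exp_sim = exp_similarity(short_vec, long_vec)
```

Cosine treats the two vectors as maximally similar because it depends only on the angle between them, while the distance-based measure treats them as almost completely dissimilar.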

Page 39: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Prototypes and categorical perception: introductory remarks

Human subjects judge “a robin as a more prototypical bird than a penguin”

Classifying an object is accomplished by determining its similarity to the prototype:

– Similarity is judged w.r.t. a reference object/region
– Similarity is context-sensitive: a robin is a prototypical bird, but a canary is a prototypical pet bird

Continuous perception: membership to a category is graded

Page 40: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Prototype regions in animal space

(Figure: animal space with the labels reptile, mammal, bird, bat, platypus, penguin, robin, emu and archaeopteryx)

Based on Gärdenfors & Williams IJCAI 2001

Categorical perception: stimuli between categories distinguished with more ease and accuracy than within them

Page 41: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Computing categories in conceptual space: Voronoi tessellations

Given prototypes $p_1, \ldots, p_n$, require that q be in the same category as its most similar prototype.

Consequence: partitioning of the space into convex regions
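The nearest-prototype rule can be sketched in a few lines. The prototype coordinates below are invented for illustration and do not come from Gärdenfors and Williams; the Euclidean metric is assumed, under which the resulting Voronoi cells are convex.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def classify(q, prototypes):
    """Assign q to the category whose prototype is nearest (most similar)."""
    return min(prototypes, key=lambda name: euclidean(q, prototypes[name]))

# Hypothetical prototype points in a two-dimensional "animal space".
prototypes = {
    "bird": (0.0, 0.0),
    "mammal": (4.0, 0.0),
    "reptile": (0.0, 4.0),
}

category_near_bird = classify((1.0, 0.5), prototypes)
category_near_mammal = classify((3.0, 1.0), prototypes)
```

The category boundaries are never stored explicitly: they fall out of the prototypes and the metric, and moving a prototype reshapes all neighbouring regions.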

Page 42: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Voronoi Tessellations (cont’d)

Much psychological data concords with tessellating conceptual spaces into star-shaped (and sometimes convex) regions around prototypes (e.g., stop consonants in phoneme classification)

Boundaries produced by Voronoi tessellations provide the threshold of similarity and support a mechanism explaining categorical perception

Gärdenfors & Williams, Reasoning about categories in conceptual spaces, Proceedings IJCAI 2001

Page 43: Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

Part II

– Concept combination
– Induction
– Semantics
– Non-monotonic aspects of concepts
– Realizing (approximating) conceptual spaces