Page 1: Physics of Intelligence

Physics of Intelligence

Robert L. Fry

[email protected]

Johns Hopkins University/Applied Physics Laboratory

Page 2: Physics of Intelligence

Outline

• Some background and history
• Introduce the concept of a physics of computation
• Demonstrate the engineering formalism by reverse-engineering a cortical neuron – remainder of talk
• If time permits, discuss relevance to weapon systems and fire control

Page 3: Physics of Intelligence

Research History

• 1991: APL IRAD (back when APL paid for crazy ideas)
• 1995: IEEE Trans. paper “Observer-Participant Models of Neural Processing”
• 1995-2000: Several NIPS papers
• 2002-2006: 3-4 Computational Neuroscience (CNS) papers where the neural model is refined
• 2005-2007: AFOSR Project IONS jointly with Dr. Mung Chiang at Princeton (Lagrange duality and Geometric Programming)
• 2008, 2009: Consulted to Dr. Todd Hylton, who managed the DARPA Physical Intelligence Program
• 2008: CNS Information Theory workshop, “Computation by neural and cortical systems”
• 2015: “Computation by biological systems,” Modern Biomedicine – conjoint computational morphological adaptation (formation of new connections)

Page 4: Physics of Intelligence

Computational Theory Overview


Page 5: Physics of Intelligence

Systems are Expressible as Dynamics

[Figure: a system embedded in its environment. Examples of physically dissipative, “far from equilibrium” systems: neurons, communication systems, intelligent systems, biological systems and their hierarchical organization, and internal combustion engines. The system-environment boundary defines what the system “sees” and what the system “does.”]

One should take the “first-person” view in what follows

Page 6: Physics of Intelligence

Questions

Might there be a common computational explanation for system dynamics?

If so, what are its principles?

Two possible axioms for a common physical principle are proposed.


Page 7: Physics of Intelligence


Axioms

(1) To distinguish is the most basic operation possible

(2) Computation must abide by causality

Can one develop a fundamental theory of computation based on quantifying causality and what it means to distinguish?

Page 8: Physics of Intelligence

Let us first look at what it means to distinguish

Page 9: Physics of Intelligence

Do you see me?

Page 10: Physics of Intelligence

Do you see me?

Page 11: Physics of Intelligence

Something from Nothing

When we say there is “nothing,” what we really mean is that we cannot distinguish “something” within our local subjective frame.

When a system says there is “something,” it must possess a minimum of two possible internal subjective states: one state corresponding to the presence of the “something” and the other indicating its absence.

Page 12: Physics of Intelligence

A Dynamical View of Boolean Algebra (Cox¹)

a – Logical Assertion (e.g., action potentials)
A – Logical Question (e.g., dendritic synapses)

1. A Question is defined by the assertions that answer it
2. An Assertion is defined by the questions it answers

• This recursive definition gives rise to a symmetry breaking whereby two algebras are created
• One is a Boolean Algebra of Assertions
• The other is a Boolean Algebra of Questions
• Both are PHYSICAL and capture the dynamic processing of information within the subjective frame of a system

¹ Richard T. Cox, JHU Physics Department, 1944-1987

Page 13: Physics of Intelligence

This Might be a Very Old Idea

[Figure: “All is One” – Question and Assertion depicted as two aspects of one whole.]

Page 14: Physics of Intelligence

Detectors and Elementary Questions

[Figure: an incident photon – superimposed “present” and “absent” – strikes a detector. The detector poses the elementary question P = {d, ¬d} = “Is a photon present or absent?”, with assertions d = “photon present” and ¬d = “photon absent.” An inquiring physicist applies the reflection operator ~ (complementation) to form the logical “quartet” {d, ¬d, D, ¬D}.]

Detectors are Elementary Questions

• Elementary questions can only be answered “true” or “nothing”
• Coincidence detectors are D1 ∧ D2, and so D1 ∧ D2 = d1 ∧ d2

Page 15: Physics of Intelligence

What Cox Proposed

Boolean Algebra of Assertions:

~~a = a
a ∧ a = a,  a ∨ a = a
a ∧ b = b ∧ a,  a ∨ b = b ∨ a
~(a ∧ b) = ~a ∨ ~b,  ~(a ∨ b) = ~a ∧ ~b
a ∧ b ∧ c = a ∧ (b ∧ c) = (a ∧ b) ∧ c
a ∨ b ∨ c = a ∨ (b ∨ c) = (a ∨ b) ∨ c
(a ∧ b) ∨ c = (a ∨ c) ∧ (b ∨ c)
(a ∨ b) ∧ c = (a ∧ c) ∨ (b ∧ c)
(a ∧ b) ∨ b = b,  (a ∨ b) ∧ b = b
(a ∧ ~a) ∨ b = b,  (a ∨ ~a) ∧ b = b
a ∧ ~a ∧ b = a ∧ ~a,  a ∨ ~a ∨ b = a ∨ ~a

Boolean Algebra of Questions (where “~” is the reflection operator):

~~A = A
A ∧ A = A,  A ∨ A = A
A ∧ B = B ∧ A,  A ∨ B = B ∨ A
~(A ∧ B) = ~A ∨ ~B,  ~(A ∨ B) = ~A ∧ ~B
A ∧ B ∧ C = A ∧ (B ∧ C) = (A ∧ B) ∧ C
A ∨ B ∨ C = A ∨ (B ∨ C) = (A ∨ B) ∨ C
(A ∧ B) ∨ C = (A ∨ C) ∧ (B ∨ C)
(A ∨ B) ∧ C = (A ∧ C) ∨ (B ∧ C)
(A ∧ B) ∨ B = B,  (A ∨ B) ∧ B = B
(A ∧ ~A) ∨ B = B,  (A ∨ ~A) ∧ B = B
A ∧ ~A ∧ B = A ∧ ~A,  A ∨ ~A ∨ B = A ∨ ~A

Page 16: Physics of Intelligence

What Cox Proposed


“Strict” logical implication provides the unique relational operation on assertions, and likewise the unique relational operation on questions:

a → b  ⟺  a ∧ b = a  ⟺  a ∨ b = b

B → A  ⟺  A ∨ B = A  ⟺  A ∧ B = B

Soccer signup: s = “It is my son!” and b = “He is a boy.” Then s → b.

Card guessing: S = “What is the suit of the card?” and C = “What is the color of the card?” Then S → C.
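The implication identities can be machine-checked over the four truth assignments; a minimal sketch using Python booleans (not from the talk):

```python
from itertools import product

# Material implication a -> b alongside the two lattice identities:
# a -> b holds exactly when (a AND b) = a, and exactly when (a OR b) = b.
for a, b in product((False, True), repeat=2):
    implies = (not a) or b
    assert implies == ((a and b) == a) == ((a or b) == b)
```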

Page 17: Physics of Intelligence

Degrees of Implication

(a → b) = p(b|a), the probability of b on the premise a.

(B → A) = b(B|A), the “bearing” of B on the issue A (entropy).

Probability p and bearing b are reflections of one another in a “generalized” information theory: knowledge ↔ uncertainty.

Page 18: Physics of Intelligence

Generalized Information Theory

Bearing ↔ Information Theory Correspondence:

b(X|X)       ↔  H(X)
b(X ∨ Y|A)   ↔  H(X,Y)    (“∨” ↔ “,”)
b(X ∧ Y|A)   ↔  I(X;Y)    (“∧” ↔ “;”)
b(X ∧ ~Y|A)  ↔  H(Y|X)    (“~” ↔ “|”)
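The information-theory column obeys the standard Shannon identities. A quick numerical check on an arbitrary joint distribution (the probability values below are made up purely for illustration):

```python
import math

# Made-up joint distribution p(x, y) over two binary variables (illustration only).
p = {(0, 0): 0.30, (0, 1): 0.20, (1, 0): 0.10, (1, 1): 0.40}

px = {x: p[(x, 0)] + p[(x, 1)] for x in (0, 1)}   # marginal p(x)
py = {y: p[(0, y)] + p[(1, y)] for y in (0, 1)}   # marginal p(y)

def H(d):
    """Shannon entropy in bits of a distribution given as a dict of probabilities."""
    return -sum(v * math.log2(v) for v in d.values() if v > 0)

Hxy = H(p)                      # joint entropy H(X,Y)
Ixy = H(px) + H(py) - Hxy       # mutual information I(X;Y)
Hy_given_x = Hxy - H(px)        # conditional entropy H(Y|X)

# Chain rule and the symmetric form of I(X;Y), mirroring the correspondence table
assert abs(Hxy - (H(px) + Hy_given_x)) < 1e-12
assert abs(Ixy - (H(py) - Hy_given_x)) < 1e-12
assert Ixy >= 0
```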

Page 19: Physics of Intelligence


Back to the Axioms

(1) To distinguish is the most basic computational operation possible

(2) Computation must abide by causality

Page 20: Physics of Intelligence


Causality and Claude Shannon

“You can know the past, but not control it. You can control the future, but have no knowledge of it.”

• Cryptic comment he made 50 years ago [1]
• Promised to expand on it later but never did
• Has puzzled information theorists since

[1] “Source Coding with a Fidelity Criterion,” Proc. IRE, 1959.

Page 21: Physics of Intelligence

Two Kinds of Computation Are Possible

I. “Machines” that reconstruct the past

II. “Machines” that control the future

In both cases, causality induces the computational requirements of memory, memory storage, and memory reset.

The present is when and where computation is done. There is no such thing as time – just changes in system states.

Page 22: Physics of Intelligence

Computational Cycle Phases

Type I (think “archeological dig”): information is stored in the past; entropy increases over time; the system attempts to reconstruct the stored information, decreasing entropy at the present.

Type II (think “intelligence systems”): reset memory → acquire information → store in memory → make a decision to control the future.

Page 23: Physics of Intelligence

Energy, Information, and Entropy

[Figure: two temperature-entropy diagrams. Type I: work is extracted between heat flows QH and QL, with W = QH − QL = the area of the rectangle – make this area as big as possible. Type II: energy E = QH − QL must be supplied to do the work of acquiring information and reducing entropy – make this area as little as possible.]

Page 24: Physics of Intelligence


Carnot Cycles

Type I:
1. Engines
2. Communication systems
3. Dissipative physical systems
4. Archeological digs
5. Consumption of fossil fuel

Type II:
1. Intelligent systems
2. Living systems?
3. Technologically “smart” stuff
4. Refrigerators
5. Air conditioning

Page 25: Physics of Intelligence

Type I System: Archeology

Archeology (at the present):

Ancient civilization “state” at demise → physical storage → passage of time (loss of information) → archeologists perform a “dig” → analysis of artifacts, etc. → reconstruction of the past at the present.

Page 26: Physics of Intelligence

Type I System: Undersea Fiber-Optic Communication System

Undersea fiber-optic communication system:

Transmit pulse → storage in fiber → attenuation and dispersion → detection of pulse → recovery of information → re-transmit (requires energy).

[Figure: repeated channel / detect-and-amplify stages along the fiber link.]

Page 27: Physics of Intelligence

Energy and Computational Efficiency

Type II systems:
1. Intelligent systems
2. Biological systems
3. Technologically “smart” stuff
4. Refrigerators
5. Air conditioning

• The area of the Type II Carnot cycle is the energy that the system requires to operate – its energy efficiency
• We will see that this is important

Page 28: Physics of Intelligence

Engineering a Cortical Neuron


Page 29: Physics of Intelligence

Neuron Input-Output Codebooks

[Figure: a single-neuron system with binary inputs X1, X2, …, Xn and a single binary output Y.]

• Input codebook size = 2^n. With n ~ 10,000 inputs, that is 2^10000 possible input codes – what look-up table could hold that?
• Output codebook size = 2 (the output is 0 or 1).
• The inputs carry “subjective” information; the output is a “subjective” decision.

Page 30: Physics of Intelligence

Cortical Information Architecture

“What do I see?” – the input question X = X1 ∨ X2 ∨ … ∨ Xn. “What do I do?” – the output question Y.

Actionable information: X ∧ Y.

Information measure to be optimized:

b(X ∧ Y | A) = b[ (X1 ∧ Y) ∨ (X2 ∧ Y) ∨ … ∨ (Xn ∧ Y) | A ]

A = “How can I maximize my information throughput?”

Page 31: Physics of Intelligence

Subjective Cortical Inquiry

Suppose a cortical neuron can ask

X ∧ Y = (X1 ∧ Y) ∨ (X2 ∧ Y) ∨ … ∨ (Xn ∧ Y)

Further suppose that it can observe its own answers Y. Together, this means it asks (X ∧ Y) ∨ Y:

(X ∧ Y) ∨ Y = (x1 ∧ y) ∨ (x2 ∧ y) ∨ … ∨ (xn ∧ y) ∨ y

Each term xi ∧ y is multiplicative (Hebbian); the terms combine by superposition (additive); observing its own answer y is the “reflection.”

Page 32: Physics of Intelligence

Maximum Entropy Formalism

Suppose that the cortical neuron can approximate the expectations

⟨x1 y⟩, ⟨x2 y⟩, …, ⟨xn y⟩, ⟨y⟩.

Applying the maximum entropy principle subject to these constraints yields the input-output distribution

p(x, y) = (1/Z) exp( y λᵀx − θ y ),  Z = Σ_{x,y} exp( y λᵀx − θ y )

Page 33: Physics of Intelligence

This Distribution has Many Fascinating Properties

• λ – the synaptic efficacies
• θ – the decision threshold

Page 34: Physics of Intelligence

Neural Decisioning Using Log-Bayes’ Theorem

ln [ p(y=1|x) / p(y=0|x) ] = ln [ p(x|y=1) / p(x|y=0) ] + ln [ p(y=1) / p(y=0) ]

The left-hand side is the total somatic potential; the likelihood term is the induced somatic potential, and the prior term sets the firing threshold.

Some properties of the evidence φ(x) = λᵀx – optimal in almost every regard:

• Statistical evidence (log odds)
• Sufficient statistic
• Bayes’ theorem
• Optimum nonlinear estimator for y given x
• Induced somatic potential

[Figure: “Somatic Decisioning” – the probability of firing as a logistic function of the evidence φ(x), rising from 0 toward 1.]
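From the maximum-entropy joint distribution p(x, y) ∝ exp(y λᵀx − θ y), the posterior firing probability is logistic in λᵀx − θ. A brute-force check for small n (the gain and threshold values below are illustrative, not the talk’s):

```python
import itertools, math, random

random.seed(0)
n = 5
lam = [random.uniform(-1.0, 1.0) for _ in range(n)]   # illustrative synaptic efficacies
theta = 0.7                                           # illustrative decision threshold

def potential(x):
    # Induced somatic potential lam . x
    return sum(l * xi for l, xi in zip(lam, x))

# Maximum-entropy joint: p(x, y) proportional to exp(y * (lam.x - theta))
states = [(x, y) for x in itertools.product((0, 1), repeat=n) for y in (0, 1)]
Z = sum(math.exp(y * (potential(x) - theta)) for x, y in states)
p = {(x, y): math.exp(y * (potential(x) - theta)) / Z for x, y in states}

# The posterior firing probability is logistic in the evidence lam.x - theta
for x in itertools.product((0, 1), repeat=n):
    posterior = p[(x, 1)] / (p[(x, 0)] + p[(x, 1)])
    phi = potential(x) - theta
    assert abs(posterior - 1.0 / (1.0 + math.exp(-phi))) < 1e-12
```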

Page 35: Physics of Intelligence

Neural Adaptation


Page 36: Physics of Intelligence

The Concept of Double-Matching

• Simultaneously optimizes the information processing efficiency and the energy efficiency of a communication system or an “intelligent” system.

• M. Gastpar, et al., “To code, or not to code: Lossy source-channel communications revisited,” IEEE Trans. on Information Theory, 1147-1158, May 2003.

• R. Fry, “Dual Matching as a Problem Solved by Neurons,” Computational Neuroscience Meeting, Neurocomputing, 69, pp. 1086-1090, 2005.

Page 37: Physics of Intelligence

Solving Min-Max Optimization Problem Gives 3 Adaptation Equations Over 2n+1 Neural System Parameters

Define the firing-conditioned input covariance

R ≜ E[ x xᵀ | y=1 ] − E[ x | y=1 ] E[ x | y=1 ]ᵀ

(1) The gain vector λ is the largest eigenvector of R
(2) The decision threshold θ is the expected Hebbian induced potential
(3) Dendritic delay equalization rules

Page 38: Physics of Intelligence

Single-Neuron Adaptation

[Figure: a single-neuron system with inputs X1, X2, …, Xn, gains λ1, …, λn, and output Y; a Hebbian gate drives adaptation rules (1)-(3).]

Three Hebbian learning equilibria result that can be realized using simple, biologically plausible algorithms:

1) Threshold adaptation (θ): the optimal decision threshold is the average somatic potential, reached by the running average θ(t+1) = (1 − ε) θ(t) + ε λᵀx(t).

2) Gain adaptation (λ): the optimal gain vector is the largest eigenvector of the input covariance matrix R = E[ (x − ⟨x⟩)(x − ⟨x⟩)ᵀ | y=1 ].

3) Delay equalization (τ): elements of the optimal time-delay vector must satisfy “momentum” equalization – the update dτi/dt ∝ λi y(t) dxi(t − τi)/dt reaches equilibrium when λi y(t) dxi(t − τi)/dt = 0 on average.
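The largest-eigenvector equilibrium for the gains can be illustrated with Oja’s rule – a standard, biologically plausible Hebbian update used here as a stand-in, not necessarily the talk’s exact algorithm. With two correlated input bits and one noise bit, the learned gains concentrate on the correlated pair:

```python
import random

random.seed(1)

# Toy ensemble: inputs x1 and x2 agree 90% of the time; x3 is sparse noise.
def sample_x():
    b = random.random() < 0.5
    flip = random.random() < 0.1
    return [1.0 if b else 0.0,
            1.0 if (b != flip) else 0.0,
            1.0 if random.random() < 0.2 else 0.0]

# Oja's rule: dl = eta * y * (x - y * l), with a linear Hebbian unit y = l . x.
# It converges to the principal eigenvector of the input correlation matrix.
lam = [0.1, 0.1, 0.1]
eta = 0.01
for _ in range(20000):
    x = sample_x()
    y = sum(l * xi for l, xi in zip(lam, x))
    lam = [l + eta * y * (xi - y * l) for l, xi in zip(lam, x)]

# Gains concentrate on the correlated pair; the noise input is largely ignored.
assert abs(lam[0]) > 1.5 * abs(lam[2])
assert abs(lam[1]) > 1.5 * abs(lam[2])
```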

Page 39: Physics of Intelligence

Binary Space-Time Codes are Defined by the “Local” Single-Neuron System Coordinate Frame

[Figure: pre-synaptic binary signals xi, …, xi+4 (the gain space) arrive through dendritic delays t − τj, …, t − τj+5 (the delay space), advanced or delayed, and feed the somatic integration and decisioning of a stable neural system. All times and spatial inputs are referenced to the somatic location and the decision time t; the output is y(t).]

Action potentials are binary signals that define space-time codes. These codes are “subjectively” referenced to and defined by each neuron through its learned spatial dendritic gains and temporal delays, with each code referenced to somatic decisioning event times.

Page 40: Physics of Intelligence

Closed-Form Partition Function Z

Z = 2^n + e^(−γ²/4) ∏_{i=1}^{n} 2 cosh(λi/2)

[Figure: Z^(1/2n) versus the number of neural inputs n (10 to 10,000), for inverse temperatures β = 1/T from 0.001 to 10 on a log-log scale.]

• “Carnot cyclic operation”: Z varies between 2^n and 2^(n+1)
• A system with fixed Z = 2^n has no outputs possible; system entropy approaches n bits
• Higher n means higher energy efficiency
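Under the assumption (made here only to keep the check self-contained) that the threshold is tied to the gains as θ = Σi λi/2 + γ²/4, the partition function of p(x, y) ∝ exp(y λᵀx − θ y) has the closed form Z = 2^n + e^(−γ²/4) ∏ 2 cosh(λi/2), which a brute-force sum over all 2^(n+1) states confirms; the gain and γ values below are illustrative:

```python
import itertools, math

n = 6
lam = [0.8, -0.5, 1.2, 0.3, -0.9, 0.6]   # illustrative synaptic gains
g2 = 2.0                                 # illustrative gamma^2
theta = sum(lam) / 2 + g2 / 4            # assumed tie between threshold and gains

# Brute-force partition function over all (x, y) in {0,1}^n x {0,1}
Z_brute = sum(
    math.exp(y * (sum(l * xi for l, xi in zip(lam, x)) - theta))
    for x in itertools.product((0, 1), repeat=n) for y in (0, 1)
)

# Closed form: Z = 2^n + exp(-g2/4) * prod(2 cosh(lam_i / 2))
Z_closed = 2 ** n + math.exp(-g2 / 4) * math.prod(2 * math.cosh(l / 2) for l in lam)

assert abs(Z_brute - Z_closed) / Z_closed < 1e-12
# Z indeed sits between 2^n and 2^(n+1), as the slide's Carnot cycle requires
assert 2 ** n < Z_closed < 2 ** (n + 1)
```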

Page 41: Physics of Intelligence

Summary: Neural Carnot Cycle

A single neuron operates as a Carnot refrigerator with a Carnot efficiency of 85-90%.

1. Acquire information through the synapses (at temperature T = 1/β)
2. Store the information in the soma
3. Make a decision in the soma (~0.9 bit/decision)
4. Reset memory during the refractory period: re-establish the Na+ and K+ ion concentrations across the membrane

H(Y) = b(Y|A) at Z = 2^n; I(X;Y) = b(X∧Y|A) at Z = 2^(n+1). Over one cycle, Z swings from 2^n up to 2^(n+1) and back.

[Figure: the cycle drawn on a temperature-entropy diagram, operating between T = 1 and T = 0.2.]

Page 42: Physics of Intelligence

Simulation Output

[Figure: synaptic gains versus training vector bit (1-20), ranging from −0.4 to 0.4.] Gains on non-informative inputs are driven to zero.

[Figure: vectors inducing firing versus training vector index (1-20), binary 0/1.] The neuron learns to fire on almost exactly half of the training vectors.

Page 43: Physics of Intelligence

Geometric View of Single Cortical Model

[Figure: the 2^n input codes split by a hyperplane defined by λ and θ, with the maximum-eigenvector direction normal to it. Codes on one side mean “Decide to fire!”; on the other, “Do not fire!”]

Adaptation serves to define a look-up table given by two equally probable partitions of the input codebook X, thereby ensuring that H(Y) = 1 bit/decision.

The model says that all neurons do is learn how to distinguish.

Page 44: Physics of Intelligence


Essence of the Fire Control Loop

Target localization space X and weapon kinematic space Y. [Figure: fire control loop snapshot at an instant in time.]

Y = places where the missile can go in the future
X = places where the target can be at the same future time

So is the “state” of the weapon system depicted above good or bad? The weapon system must operate its fire control loop such that the weapon can always kinematically contain the places where the target can be.
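The containment condition can be sketched in one dimension with constant-speed reachable sets; all positions and speeds below are hypothetical:

```python
# Hypothetical 1-D sketch: at a future time t, the weapon's reachable interval
# must contain the target's reachable interval.

def reachable(pos, vmax, t):
    # Interval of positions reachable by time t at speeds up to vmax.
    return (pos - vmax * t, pos + vmax * t)

def contains(outer, inner):
    return outer[0] <= inner[0] and inner[1] <= outer[1]

t = 10.0
weapon = reachable(pos=0.0, vmax=9.0, t=t)    # Y: where the missile can go
target = reachable(pos=20.0, vmax=3.0, t=t)   # X: where the target can be

assert contains(weapon, target)   # this fire-control state is "good"
```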

Page 45: Physics of Intelligence

Additional


Page 46: Physics of Intelligence


Page 47: Physics of Intelligence

Temporal Adaptation

[Figure: post-synaptic potentials with rising (ẋ > 0) and falling (ẋ < 0) slopes arrive at times t − τi relative to the somatic decision event at time t, for both positive and negative efficacies λi.]

(1) Equilibrium condition for temporal adaptation: dτi/dt ∝ λi y(t) dxi(t − τi)/dt

(2) The equilibrium condition represents “zero average momentum transfer”: mass ↔ weight λi, velocity ↔ dxi/dt, Hebbian gate y(t) = 1

(3) Positive and negative gains are processed the same way

Page 48: Physics of Intelligence

Temporal Adaptation

[Figure: panels (a)-(c) show an action potential generated by q within the acquisition window: a late input (decrease delay), an early input (increase delay), and a nominal arrival time (no delay change), together with the learned dendritic channel delay.]

Hebbian temporal adaptation equation:

dτi/dt ∝ λi y(t) dxi(t − τi)/dt

Equalization of the inputs at the soma also guarantees maximal “delivered” power and, in some sense, provides an impedance-matching function.

Page 49: Physics of Intelligence

Temporal Adaptation

Delay adaptation serves to ensure the spatiotemporal confluence of information within the soma, allowing decisions to be made on the instantaneously available evidence:

dτi/dt ∝ λi y(t) dxi(t − τi)/dt

Table II: Summary of the eight conditions under which the input momentum pi = λi [dxi(t − τi)/dt] y(t) is zero, in terms of the output y(t), the efficacy λi, and the derivative dxi(t − τi)/dt (+/− denotes a positive or negative derivative of the input action potential):

• Conditions 1-4: no adaptation due to lack of output y(t)
• Conditions 5-6: no or little adaptation due to minimal or nonexistent synaptic efficacy λi
• Condition 7: no input, or equilibrated adaptation
• Condition 8 (derivative +/−): the only case in which temporal adaptation occurs

Page 50: Physics of Intelligence

Example

Page 51: Physics of Intelligence

Modeling and Simulation

Training vectors (row = vector index, columns = bits 1-20):

 1: 1 1 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 0 1
 2: 0 1 0 0 0 1 0 0 0 0 0 1 1 1 0 1 1 1 0 1
 3: 1 1 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1
 4: 1 1 1 1 0 1 1 0 1 0 0 0 1 0 0 1 1 0 0 1
 5: 1 1 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 0 1
 6: 1 1 1 1 1 0 1 0 0 0 1 0 0 0 0 0 1 0 0 1
 7: 0 1 0 1 0 1 1 0 1 0 0 0 0 1 0 1 1 0 0 1
 8: 1 1 1 1 0 1 1 0 1 0 0 0 1 0 0 1 1 0 0 1
 9: 1 1 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 0 1
10: 1 0 0 1 0 0 0 1 1 0 1 1 1 1 0 1 1 0 0 1
11: 1 1 0 0 0 1 1 0 0 0 0 0 0 1 0 0 1 0 1 1
12: 1 1 1 1 0 1 1 0 1 0 0 0 1 0 0 1 1 0 0 1
13: 1 1 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 0 1
14: 1 0 0 0 1 1 1 0 1 0 1 1 0 0 1 1 1 1 0 1
15: 0 0 0 0 1 1 1 1 0 0 1 0 1 0 1 0 0 1 1 1
16: 1 1 1 1 0 1 1 0 1 0 0 0 1 0 0 1 1 0 0 1
17: 0 1 1 0 1 1 1 0 0 0 1 1 1 1 0 1 0 0 1 1
18: 1 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 1 1
19: 1 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1 1 1
20: 1 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 0 0 0 1

Table I: Training set of codes xi, i = 1, 2, ..., 20, with each code containing 20 bits.
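A few entries of the Hamming-distance table that follows (Figure 5.4) can be spot-checked directly from the first five codes of Table I:

```python
# First five codes from Table I (enough to spot-check row 1 of Figure 5.4).
codes = {
    1: [1,1,1,1,0,1,1,0,1,0,0,0,1,1,0,1,1,0,0,1],
    2: [0,1,0,0,0,1,0,0,0,0,0,1,1,1,0,1,1,1,0,1],
    3: [1,1,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,1],
    4: [1,1,1,1,0,1,1,0,1,0,0,0,1,0,0,1,1,0,0,1],
    5: [1,1,1,1,0,1,1,0,1,0,0,0,1,1,0,1,1,0,0,1],
}

def hamming(a, b):
    # Number of bit positions in which the two codes differ
    return sum(x != y for x, y in zip(a, b))

# Entries from row 1 of the distance table: d(1,2)=7, d(1,3)=10, d(1,4)=1, d(1,5)=0
assert hamming(codes[1], codes[2]) == 7
assert hamming(codes[1], codes[3]) == 10
assert hamming(codes[1], codes[4]) == 1
assert hamming(codes[1], codes[5]) == 0
```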

Page 52: Physics of Intelligence

Modeling and Simulation

Hamming distances (row = code number, columns = codes 1-20):

 1: 0 7 10 1 0 7 3 1 0 7 6 1 0 10 20 1 8 10 12 7
 2: 7 0 11 8 7 12 6 8 7 8 7 8 7 9 13 8 7 9 9 10
 3: 10 11 0 9 10 7 9 9 10 9 8 9 10 12 10 9 14 10 6 5
 4: 1 8 9 0 1 6 4 0 1 8 7 0 1 9 19 0 9 11 11 6
 5: 0 7 10 1 0 7 3 1 0 7 6 1 0 10 20 1 8 10 12 7
 6: 7 12 7 6 7 0 8 6 7 10 7 6 7 9 13 6 9 11 9 6
 7: 3 6 9 4 3 8 0 4 3 8 5 4 3 9 17 4 9 11 11 6
 8: 1 8 9 0 1 6 4 0 1 8 7 0 1 9 19 0 9 11 11 6
 9: 0 7 10 1 0 7 3 1 0 7 6 1 0 10 20 1 8 10 12 7
10: 7 8 9 8 7 10 8 8 7 0 11 8 7 9 13 8 11 9 7 8
11: 6 7 8 7 6 7 5 7 6 11 0 7 6 10 14 7 8 8 8 7
12: 1 8 9 0 1 6 4 0 1 8 7 0 1 9 19 0 9 11 11 6
13: 0 7 10 1 0 7 3 1 0 7 6 1 0 10 20 1 8 10 12 7
14: 10 9 12 9 10 9 9 9 10 9 10 9 10 0 10 9 10 8 10 9
15: 20 13 10 19 20 13 17 19 20 13 14 19 20 10 0 19 12 10 8 13
16: 1 8 9 0 1 6 4 0 1 8 7 0 1 9 19 0 9 11 11 6
17: 8 7 14 9 8 9 9 9 8 11 8 9 8 10 12 9 0 8 14 11
18: 10 9 10 11 10 11 11 11 10 9 8 11 10 8 10 11 8 0 10 11
19: 12 9 6 11 12 9 11 11 12 7 8 11 12 10 8 11 14 10 0 9
20: 7 10 5 6 7 6 6 6 7 8 7 6 7 9 13 6 11 11 9 0

Figure 5.4: Hamming distances between the 20 codes listed in Table 5.1.

Page 53: Physics of Intelligence

Sample Learning Dynamics

[Figure: three rows of panels at training iterations i = 1, 30, and 120. Left panels: decisions y and p(y=1|xi) versus code index (1-20), with the 0.5 probability level marked. Right panels: dendritic gains (scaled by 10^4) versus input index.]

Page 54: Physics of Intelligence

Adaptation and Criticality

[Figure: output entropy H(Y) versus inverse temperature β on a log scale (10^-3 to 10^1), dropping sharply at the onset of criticality.]

• System entropy transitions from n+1 to n bits
• The output freezes

Page 55: Physics of Intelligence

Monte-Carlo Simulation Results

[Figure: two surface plots of E{H} versus gamma-squared (10^-3 to 10^3, log scale) and input dimension n (2 to 12).]

• Output entropy is maximized at 1 bit per decision
• The decision threshold undergoes only modest changes (small variations)
• Both are stable over wide operational ranges of the parameters, gains, and number of inputs n

Page 56: Physics of Intelligence

Intelligent Systems

• Intelligent systems require energy to work
• “Intelligence,” whatever it is, allows us to design more energy-efficient systems: “smart” grids, “smart” buildings, “smart” phones

Page 57: Physics of Intelligence

Biological Systems

• Biological systems require energy to work
• They are the most energy-efficient systems known – efficiencies of 90% or more!

. . . How can we build stuff like this?

Page 58: Physics of Intelligence

The Maximum Entropy Principle

ME objective function:

J = − Σ_{x,y} p(x,y|a) ln p(x,y|a)
    + λᵀ Σ_{x,y} p(x,y|a) [ x y − ⟨x y⟩ ]
    − θ Σ_{x,y} p(x,y|a) [ y − ⟨y⟩ ]
    − λ0 [ Σ_{x,y} p(x,y|a) − 1 ]

Single-neuron input-output distribution:

p(x, y) = (1/Z) exp( y λᵀx − θ y )

Partition function*:

Z = Σ_{x∈X, y∈Y} exp( y λᵀx − θ y )

* Determines all neural dynamical properties

Page 59: Physics of Intelligence

Optimization Details (1994)

Minimize I(X;Y;λ,θ) over λ subject to a normalization constraint on the length of λ (|λ|² fixed), with θ fixed:

L(λ) = I(X;Y;λ,θ) − γ |λ|²

Maximize I(X;Y;λ,θ) over θ with λ fixed:

L(θ) = I(X;Y;λ,θ),  ∂L(θ)/∂θ = 0

One can also make I(X;Y;λ,θ) = I(X;Y;λ,θ,τ) a function of the individual dendritic transmission delays τ to find the temporal adaptation rule.

Page 60: Physics of Intelligence

Modeling and Simulation Training Set

Training vectors (row = vector index, columns = bits 1-20):

 1: 1 1 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 0 1
 2: 0 1 0 0 0 1 0 0 0 0 0 1 1 1 0 1 1 1 0 1
 3: 1 1 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1
 4: 1 1 1 1 0 1 1 0 1 0 0 0 1 0 0 1 1 0 0 1
 5: 1 1 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 0 1
 6: 1 1 1 1 1 0 1 0 0 0 1 0 0 0 0 0 1 0 0 1
 7: 0 1 0 1 0 1 1 0 1 0 0 0 0 1 0 1 1 0 0 1
 8: 1 1 1 1 0 1 1 0 1 0 0 0 1 0 0 1 1 0 0 1
 9: 1 1 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 0 1
10: 1 0 0 1 0 0 0 1 1 0 1 1 1 1 0 1 1 0 0 1
11: 1 1 0 0 0 1 1 0 0 0 0 0 0 1 0 0 1 0 1 1
12: 1 1 1 1 0 1 1 0 1 0 0 0 1 0 0 1 1 0 0 1
13: 1 1 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 0 1
14: 1 0 0 0 1 1 1 0 1 0 1 1 0 0 1 1 1 1 0 1
15: 0 0 0 0 1 1 1 1 0 0 1 0 1 0 1 0 0 1 1 1
16: 1 1 1 1 0 1 1 0 1 0 0 0 1 0 0 1 1 0 0 1
17: 0 1 1 0 1 1 1 0 0 0 1 1 1 1 0 1 0 0 1 1
18: 1 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 1 1
19: 1 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1 1 1
20: 1 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 0 0 0 1

Page 61: Physics of Intelligence

Hamming Distance Between Vectors Conveys Structure

Hamming distances (row = code number, columns = codes 1-20):

 1: 0 7 10 1 0 7 3 1 0 7 6 1 0 10 20 1 8 10 12 7
 2: 7 0 11 8 7 12 6 8 7 8 7 8 7 9 13 8 7 9 9 10
 3: 10 11 0 9 10 7 9 9 10 9 8 9 10 12 10 9 14 10 6 5
 4: 1 8 9 0 1 6 4 0 1 8 7 0 1 9 19 0 9 11 11 6
 5: 0 7 10 1 0 7 3 1 0 7 6 1 0 10 20 1 8 10 12 7
 6: 7 12 7 6 7 0 8 6 7 10 7 6 7 9 13 6 9 11 9 6
 7: 3 6 9 4 3 8 0 4 3 8 5 4 3 9 17 4 9 11 11 6
 8: 1 8 9 0 1 6 4 0 1 8 7 0 1 9 19 0 9 11 11 6
 9: 0 7 10 1 0 7 3 1 0 7 6 1 0 10 20 1 8 10 12 7
10: 7 8 9 8 7 10 8 8 7 0 11 8 7 9 13 8 11 9 7 8
11: 6 7 8 7 6 7 5 7 6 11 0 7 6 10 14 7 8 8 8 7
12: 1 8 9 0 1 6 4 0 1 8 7 0 1 9 19 0 9 11 11 6
13: 0 7 10 1 0 7 3 1 0 7 6 1 0 10 20 1 8 10 12 7
14: 10 9 12 9 10 9 9 9 10 9 10 9 10 0 10 9 10 8 10 9
15: 20 13 10 19 20 13 17 19 20 13 14 19 20 10 0 19 12 10 8 13
16: 1 8 9 0 1 6 4 0 1 8 7 0 1 9 19 0 9 11 11 6
17: 8 7 14 9 8 9 9 9 8 11 8 9 8 10 12 9 0 8 14 11
18: 10 9 10 11 10 11 11 11 10 9 8 11 10 8 10 11 8 0 10 11
19: 12 9 6 11 12 9 11 11 12 7 8 11 12 10 8 11 14 10 0 9
20: 7 10 5 6 7 6 6 6 7 8 7 6 7 9 13 6 11 11 9 0

Figure 5.4: Hamming distances between the 20 codes listed in Table 5.1.

Page 62: Physics of Intelligence

Simulation Output

[Figure: synaptic gains versus training vector bit (1-20), ranging from −0.4 to 0.4.] Gains on non-informative inputs are driven to zero.

[Figure: vectors inducing firing versus training vector index (1-20), binary 0/1.] The neuron learns to fire on almost exactly half of the training vectors.