ECE 517 Reinforcement Learning in Artificial Intelligence
Lecture 21: Deep Machine Learning
Dr. Itamar Arel
College of Engineering, Department of Electrical Engineering and Computer Science
The University of Tennessee, Fall 2010
November 8, 2010
ECE 517 - Reinforcement Learning in AI
RL and General AI

RL seems like a good AI framework, but some pieces are missing:
- Long/short-term memory: what is the optimal value (or cost-to-go) function to be used?
- How do we treat multi-dimensional reward signals?
- How do we deal with high-dimensional inputs (observations)?
- How do we generalize to address a near-infinite state space?
- How long will it take to train such a system?

If we want to use hardware, how do we go about doing it?
- Storage capacity: the human brain has ~10^14 synapses (i.e. weights)
- Processing power: ~10^11 neurons
- Communications: fully or partially connected architectures
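The scale of these numbers can be turned into a back-of-the-envelope estimate; a minimal sketch, where the one-byte-per-weight precision is an assumption not taken from the slides:

```python
# Back-of-the-envelope storage estimate for a brain-scale network.
SYNAPSES = 10**14          # ~10^14 synapses (weights), per the slide
NEURONS = 10**11           # ~10^11 neurons, per the slide
BYTES_PER_WEIGHT = 1       # assumed precision: 1 byte per weight

weights_per_neuron = SYNAPSES // NEURONS
storage_bytes = SYNAPSES * BYTES_PER_WEIGHT

print(f"weights per neuron: ~{weights_per_neuron}")        # ~1000
print(f"storage: ~{storage_bytes / 10**12:.0f} TB")        # ~100 TB at 1 B/weight
```

Even at this crude precision, the storage alone dwarfs conventional hardware, which motivates the architectural questions above.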
Why Deep Learning?

Mimicking the way the brain represents information is a key challenge:
- Deals efficiently with high dimensionality
- Handles multi-modal data fusion
- Captures temporal dependencies spanning large scales
- Allows incremental knowledge acquisition

The challenge with high dimensionality:
- Real-world problems
- Curse of dimensionality (Bellman)
- Spatial and temporal dependencies
- How to represent key features?
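The curse of dimensionality can be made concrete with a small count; a sketch assuming a uniform grid discretization (the bin count is illustrative):

```python
# Discretizing each input dimension into k bins gives k**d cells to cover:
# the number of distinct states explodes exponentially with dimension d.
def grid_cells(dims: int, bins_per_dim: int = 10) -> int:
    return bins_per_dim ** dims

for d in (1, 2, 6, 100):
    print(d, grid_cells(d))   # 10, 100, 10^6, 10^100
```

A tabular value function over 10^100 cells is hopeless, which is why generalization and compact representation are central themes here.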
Main application: classification

A hard (unsolved) problem due to:
- High-dimensional data
- Distortions (noise, rotation, displacement, perspective, lighting conditions, etc.)
- Partial observability

Mainstream approach: ROI detection → Feature Extraction → Classification, reducing dimensionality at each stage (on the order of 10^6 → 10^4 → 10^2)
The power of hierarchical representation

Core idea: partition high-dimensional data into small patches, model them, and discover dependencies between them
- Decomposes the problem
- Suggests a trade-off: more scope ↔ less detail

Key ideas:
- Basic cortical circuit
- Massively parallel architecture
- Discovers structure based on regularities in the observations
- Multi-modal

Goal: situation/state inference
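The patch-partitioning idea can be sketched as splitting a 2-D input into small tiles, each of which would be handled by its own local model; the image and patch sizes below are illustrative:

```python
def partition(image, patch):
    """Split an H x W grid (list of lists) into patch x patch tiles."""
    h, w = len(image), len(image[0])
    return [
        [[image[r][c] for c in range(pc, pc + patch)]
         for r in range(pr, pr + patch)]
        for pr in range(0, h, patch)
        for pc in range(0, w, patch)
    ]

img = [[r * 4 + c for c in range(4)] for r in range(4)]  # toy 4x4 "image"
patches = partition(img, 2)
print(len(patches))   # 4 patches, each 2x2
```

Each patch is low-dimensional and easy to model; the hierarchy's job is then to discover the dependencies between patch models.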
The power of hierarchical representation (cont'd)

Hypothesis: the brain represents information using a hierarchical architecture that comprises basic cortical circuits

An effective way of dealing with large-scale POMDPs:
- DL: state inference
- RL: decision making under uncertainty

Suggests a semi-supervised learning framework:
- Unsupervised: learns the structure of natural data
- Supervised: maps states to classes
The Deep Learning Theory

The basic idea is to decompose the large image into smaller images that can each be modeled

The hierarchy is one of abstraction:
- Higher levels of the state represent more abstract notions
- The higher the layer, the more scope it encompasses and the less detail it offers
- Multi-scale spatial-temporal context representation
- Lower levels interpret or control limited domains of experience, or sensory systems

Connections from the higher-level states predispose some selected transitions in the lower-level state machines
Inspiration: Role of the Cerebral Cortex

The cerebral cortex (aka neocortex), made up of four lobes, is involved in many complex cognitive functions, including memory, attention, perceptual awareness, "thinking", language and consciousness

The cortex is the primary brain subsystem responsible for learning:
- Rich in neurons (>80% of those in the human brain)
- It is the one embedding the hierarchical auto-associative memory architecture
- Receives sensory information from many different sensory organs (e.g. eyes, ears) and processes that information
- Areas that receive a particular kind of information are called sensory areas
Deep Machine Learning – general framework

- The lower layers predict short-term sequences
- Going higher in the hierarchy: "less accuracy, broader perspective"
- Analogy: a general commanding an army, or a poem being recited
- "Surprise" sequences should propagate up to the appropriate layer
DL for Invariant Pattern Recognition

Initial focus is on the visual cortex, which offers invariant visual pattern recognition
- Recognizing objects despite different scaling, rotations and translations is something humans perform without conscious effort
- The same holds for lighting conditions and various noises (additive, multiplicative)
- This is currently difficult for machine learning to achieve

The approach taken is that geometric invariance is linked to motion
- When we look at an object, the patterns on our retina change a lot while the object (cause) remains the same
- Thus, learning persistent patterns on the retina would correspond to learning objects in the visual world
- Associating patterns with their causes corresponds to invariant pattern recognition
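The "persistent patterns" idea can be sketched as temporal grouping: patterns observed within one uninterrupted run of frames are assigned to the same cause. The toy frames and the blank-separator convention below are assumptions, not part of the lecture's model:

```python
def temporal_groups(frames, blank=None):
    """Assign every pattern within one uninterrupted run the same cause.

    Runs are separated by `blank` frames (standing in for saccades or
    scene changes); patterns seen back-to-back are treated as views of
    the same underlying object."""
    cause_of, cause, in_run = {}, -1, False
    for f in frames:
        if f == blank:
            in_run = False
            continue
        if not in_run:
            cause += 1
            in_run = True
        cause_of.setdefault(f, cause)
    return cause_of

# 'A1'..'A3' are retinal views of object A; 'B1','B2' of object B
frames = ['A1', 'A2', 'A3', None, 'B1', 'B2']
groups = temporal_groups(frames)
print(groups)   # the A-views share one cause, the B-views another
```

Despite the A-views being different patterns, they map to one cause, which is exactly the invariance the slide describes.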
DL for Invariant Pattern Recognition (cont'd)

- Each level in the system hierarchy has several modules that model cortical regions
- A module can have several children and one parent; thus modules are arranged in a tree structure
- The bottom-most level is called level 1, and the level number increases as you go up the hierarchy
- Inputs go directly to the modules at level 1
- Level 1 modules have small receptive fields compared to the size of the total image, i.e., these modules receive their inputs from a small patch of the visual field
- Several such level 1 modules tile the visual field, possibly with overlap
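The tiling of the visual field by level 1 receptive fields amounts to a strided sliding window; a sketch with assumed field and window sizes:

```python
def tile_positions(field, rf, stride):
    """Top-left coordinates of rf x rf receptive fields over a field x field image.

    stride == rf tiles the field exactly; stride < rf produces overlap."""
    return [(r, c)
            for r in range(0, field - rf + 1, stride)
            for c in range(0, field - rf + 1, stride)]

print(len(tile_positions(16, 4, 4)))  # 16 non-overlapping 4x4 tiles
print(len(tile_positions(16, 4, 2)))  # 49 overlapping tiles
```

Overlap trades redundancy for robustness at patch boundaries, at the cost of more level 1 modules.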
General System Architecture

- A level 2 module thus covers more of the visual field than a level 1 module; however, a level 2 module gets its information only through level 1 modules
- This pattern is repeated up the hierarchy
- Receptive field sizes increase as one goes up the hierarchy
- The module at the root of the tree covers the entire visual field by pooling inputs from its child modules
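The growth of receptive-field coverage up the tree can be computed directly; a sketch assuming each module pools a fixed, non-overlapping block of children (the base size and branching factor are illustrative):

```python
def receptive_field(level, base_rf=4, fanin_per_dim=2):
    """Side length of the image region covered by a module at `level`.

    Level 1 modules see base_rf x base_rf patches; each higher module
    pools a fanin_per_dim x fanin_per_dim block of child modules."""
    return base_rf * fanin_per_dim ** (level - 1)

for lvl in (1, 2, 3):
    print(lvl, receptive_field(lvl))   # 4, 8, 16 — the root spans the full field
```

With these numbers, a three-level tree already covers a 16 x 16 field from 4 x 4 inputs, showing how coverage grows geometrically with depth.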
Learning Framework

Let X_n^(1) and X_n^(2) denote the sequences of inputs to modules 1 and 2

Learning occurs in three phases:
1. First, each module learns the most likely sequences of its inputs
2. Second, each module passes up an index of its most-likely observed input sequence
3. Third, each module learns the most frequent "coincidences" of indices originating from the lower-layer modules
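The three phases above can be sketched with frequency counting: each child module memorizes its most frequent input sequences, forwards only the index (rank) of the best match, and the parent counts coincidences of child indices. The toy input streams and sequence length are assumptions:

```python
from collections import Counter

def learn_sequences(inputs, seq_len=2):
    """Phase 1: count length-`seq_len` input sequences; frequent = likely."""
    seqs = [tuple(inputs[i:i + seq_len]) for i in range(len(inputs) - seq_len + 1)]
    return Counter(seqs)

def best_index(model, observed):
    """Phase 2: the index (rank by frequency) of the observed sequence."""
    ranked = [s for s, _ in model.most_common()]
    return ranked.index(tuple(observed))

# Phase 3: the parent counts coincidences of indices sent by its children.
child1 = learn_sequences(['a', 'b', 'a', 'b', 'a', 'b'])
child2 = learn_sequences(['x', 'y', 'x', 'y', 'x', 'y'])
coincidences = Counter()
coincidences[(best_index(child1, ['a', 'b']), best_index(child2, ['x', 'y']))] += 1
```

Passing only an index up the tree is what keeps the parent's input space small: it sees a pair of small integers, not the raw high-dimensional data.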
Contextual Embedding (if it exists)

- Feedback loop from layer 2 back to layer 1 (its children)
- This feedback provides contextual inference (from higher layers)
- This stage is initiated once the level 2 module has formed its alphabet, Y_k
- Lower-layer nodes eventually learn the CPD matrix P(X^(1) | Y)
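Learning the CPD matrix P(X^(1) | Y) can be sketched as normalized co-occurrence counting between a child's observation x and the parent's current alphabet symbol y. The toy symbol names are made up for illustration:

```python
from collections import defaultdict

def learn_cpd(pairs):
    """Estimate P(x | y) from observed (x, y) pairs by normalized counts."""
    counts = defaultdict(lambda: defaultdict(int))
    for x, y in pairs:
        counts[y][x] += 1
    return {y: {x: n / sum(xs.values()) for x, n in xs.items()}
            for y, xs in counts.items()}

# child observation x paired with the parent's alphabet symbol y
pairs = [('edge', 'corner'), ('edge', 'corner'),
         ('blob', 'corner'), ('blob', 'texture')]
cpd = learn_cpd(pairs)
print(cpd['corner']['edge'])   # 2/3
```

Once this table is learned, the parent's symbol acts as context: it reshapes the child's distribution over its own observations, which is the top-down inference the slide describes.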
Bayesian Network Obtained

- Bottom-layer random variables correspond to quantizations of input patterns
- The r.v.'s at the intermediate layers represent object parts that move together persistently
- R.v.'s at the top layer correspond to objects
Learning algorithm (cont.)

After the system has learned (seen many examples) and obtained the CPD at each layer, we seek

    z* = arg max_z P(z | I)

where I is the image observed.

A Bayesian belief propagation method is typically used to determine the above, based on the hierarchy of beliefs

Drawbacks of current schemes:
- No "natural" spatiotemporal information representation
- Layer-by-layer training is needed
- Lack of modality independence (most current schemes are limited to image data sets)
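The inference step z* = arg max_z P(z | I) can be sketched with Bayes' rule over a small discrete model; the priors and likelihoods below are made-up numbers, and a real system would obtain them via belief propagation through the hierarchy:

```python
def map_estimate(prior, likelihood, image):
    """Return arg max_z P(z | I), using P(z | I) proportional to P(I | z) P(z)."""
    return max(prior, key=lambda z: likelihood[z].get(image, 0.0) * prior[z])

prior = {'cat': 0.5, 'dog': 0.5}
likelihood = {'cat': {'img1': 0.8, 'img2': 0.1},
              'dog': {'img1': 0.2, 'img2': 0.7}}
print(map_estimate(prior, likelihood, 'img1'))  # 'cat'
```

Because the maximization ignores the common normalizer P(I), the unnormalized product P(I | z) P(z) suffices to pick the winner.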
Alternative Explanations for Biological Phenomena

Physiological experiments found that neurons sometimes respond to illusory contours in a Kanizsa figure
- In other words, a neuron responds to a contour that does not exist in its receptive field
- Possible interpretation: the activity of a neuron represents the probability that a contour should be present
- This probability originates from the neuron's own state and the state information of higher-level neurons

DL offers an explanation for this phenomenon, contrary to current hypotheses that assume "signal subtraction" occurs for some reason