Transcript
Buenos Aires, June 2016
Eduardo Poggi
Agenda
EBL
Inductive vs Analytical learning
Perfect Domain Theories
Examples
Problems
Summary
EBL?
Learning general problem-solving techniques by observing and analyzing human solutions to specific problems.
EBL attempts to formulate a generalization after observing only a single (few) example.
Introduced by Gerald DeJong in 1981. Originally applied in robotics and problem solving.
The EBL Hypothesis
EBL is based on the hypothesis that an intelligent system can learn a general concept after observing only a single example.
By understanding why an example is a member of a concept, the system can learn the essential properties of the concept.
EBL uses prior knowledge to analyze or explain each training example in order to infer what properties are relevant to the target function and which are irrelevant.
Learning by Generalizing Explanations
Given:
Goal concept (e.g., some predicate calculus statement)
Training example (facts)
Domain theory (inference rules)
Operationality criterion
Given these four inputs, the task is to determine a generalization of the training example that is a sufficient concept definition for the goal concept and that satisfies the operationality criterion.
The operationality criterion requires that the final concept definition be described in terms of the predicates used to describe the training example.
Learning algorithms such as neural networks, decision trees and inductive logic programming require a large number of examples to make good predictions.
[Diagram: dataset -> Learning Algorithm -> hypothesis. The dataset must be sufficiently large.]
Inductive learning
We do not need a large dataset if, besides taking examples as input, the learning algorithm can also take prior knowledge.
[Diagram: dataset + prior knowledge -> Learning Algorithm -> hypothesis. The dataset does not need to be large.]
Analytical learning
Prior knowledge is used to reduce the size of the hypothesis space.
It analyzes each example to infer which features are relevant and which ones are irrelevant.
[Diagram: prior knowledge narrows the hypothesis space HS down to a reduced HS.]
Explanation-based learning
Example, learning to play chess
Suppose we want to learn a concept like “what is a board position in which black will lose the queen in x moves?”.
Chess is a complex game. Each piece can occupy many positions.
We would need many examples to learn this concept.
But humans can learn this type of concept from very few examples. Why?
Example, learning to play chess
Humans can analyze an example and use prior knowledge related to legal moves.
From there they can generalize from only a few examples.
Reasoning like: “Because the white knight is attacking both the king and the queen, black must move out of check, letting white capture the queen.”
Example, learning to play chess
What is the prior knowledge involved in playing chess?
It is knowledge about the rules of chess:
Legal moves for the pieces.
Players alternate moves in games.
To win you must capture the opponent’s king.
Inductive Learning
Input: HS, D
Output: a hypothesis h such that h is consistent with D
Analytical Learning
Input: HS, D, BK
Output: a hypothesis h such that h is consistent with D and BK (¬h cannot be inferred from BK)
With:
HS: Hypothesis space
D: Training set
BK: Background knowledge (domain theory)
Inductive and Analytical Learning
Standard Approach to EBL
[Diagram: an explanation is a detailed proof that leads from the facts to the goal. After learning, the system goes directly from facts to solution.]
The EBL Process
Example Analytical Learning
Examples: a dataset where each instance is a pair of objects described by the following predicates: Type, Color, Volume, Owner, Material, Density, On. Example:
On(Obj1,Obj2)
Type(Obj1,Box)
Type(Obj2,Endtable)
Color(Obj1,Red)
Color(Obj2,Blue)
Volume(Obj1,2)
Density(Obj1,0.3)
Owner(Obj1,Fred)
Owner(Obj2,Louise)
Material(Obj1,Cardboard)
Material(Obj2,Wood)
Example, Analytical Learning
Domain theory:
SafeToStack(x,y) <- ¬Fragile(y)
SafeToStack(x,y) <- Lighter(x,y)
Lighter(x,y) <- Weight(x,wx) ^ Weight(y,wy) ^ LessThan(wx,wy)
Weight(p,w) <- Density(p,d) ^ Volume(p,v) ^ Equal(w, times(d,v))
…
Fragile(x) <- Material(x,Glass)
Example, Analytical Learning
Hypothesis space: the set of Horn clause rules.
The head of each rule uses the predicate SafeToStack.
The body of each rule is built from the predicates that describe the instances, the predicates LessThan, Equal and Greater, and the functions plus, minus and times.
Example: SafeToStack(x,y) <- Volume(x,vx) ^ Volume(y,vy) ^ LessThan(vx,vy)
Example, Analytical Learning
Determine: a hypothesis consistent with both the training examples and the domain theory.
Notes:
The domain theory refers to predicates not contained in the examples.
The domain theory is sufficient to prove the example is true.
Perfect Domain Theories
A domain theory is correct if each statement is true.
A domain theory is complete, with respect to a given target concept and instance space, if it covers every positive example in the instance space.
A perfect domain theory is correct and complete.
Perfect Domain Theories
Examples of where to find perfect domain theories:
Rules of chess
Examples of where not to find perfect domain theories:
SafeToStack problem
In general, only learning problems with perfect domain theories are considered.
EBL Algorithm
We consider an algorithm that has the following properties:
It is a sequential covering algorithm considering the data incrementally
For each positive example not covered by the current rules it forms a new rule by:
Explain how training example satisfies target concept, in terms of domain theory.
Analyze the explanation to find the most general conditions under which this explanation holds.
Refine the current hypothesis by adding a new Horn Clause rule to cover the example.
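The three steps above fit into a sequential covering loop. The following is only a minimal sketch: the function names `ebl_sequential_covering`, `explain`, `analyze` and `covers` are hypothetical, and the toy stand-ins below (an object counts as positive when volume times density is below 5) are assumptions made purely so the loop can run.

```python
def ebl_sequential_covering(positives, covers, explain, analyze):
    """Learn one rule for each positive example not yet covered."""
    rules = []
    for example in positives:
        if any(covers(rule, example) for rule in rules):
            continue                      # an earlier rule already covers it
        explanation = explain(example)    # step 1: explain
        rule = analyze(explanation)       # step 2: analyze (generalize)
        rules.append(rule)                # step 3: refine the hypothesis
    return rules


# Toy stand-ins (hypothetical, not a real prover).
def explain(example):
    # Pretend the proof only used these two attributes.
    return ("volume", "density")


def analyze(explanation):
    # Turn the attributes used by the proof into a reusable rule.
    return {"uses": explanation, "max_weight": 5}


def covers(rule, example):
    weight = 1
    for attribute in rule["uses"]:
        weight *= example[attribute]
    return weight < rule["max_weight"]


positives = [
    {"volume": 2, "density": 0.3, "owner": "Fred"},
    {"volume": 1, "density": 2.0, "owner": "Louise"},
]
rules = ebl_sequential_covering(positives, covers, explain, analyze)
print(len(rules))  # the second example is covered by the first rule
```

Only one rule is learned here because the rule built from the first example already covers the second one.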
Explanation
There might be more than one explanation to the example. In that case one or all explanations may be used.
An explanation is obtained using a backward chaining search as is done by Prolog. Prolog-EBG stops when it finds the first proof.
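A backward-chaining search of this kind can be sketched in a few lines. This is an illustrative toy, not PROLOG-EBG itself. The tuple encoding, the `?` prefix for variables and the helper names (`unify`, `prove`, `leaves`) are assumptions. The rule giving an end table a weight of 5 is also an assumption, suggested by the LessThan(0.6,5) test in the explanation; the slides elide that part of the domain theory. The ¬Fragile rule is left out to keep the sketch free of negation.

```python
from itertools import count

FACTS = [
    ("On", "Obj1", "Obj2"), ("Type", "Obj1", "Box"), ("Type", "Obj2", "Endtable"),
    ("Color", "Obj1", "Red"), ("Color", "Obj2", "Blue"), ("Volume", "Obj1", 2),
    ("Density", "Obj1", 0.3), ("Owner", "Obj1", "Fred"), ("Owner", "Obj2", "Louise"),
    ("Material", "Obj1", "Cardboard"), ("Material", "Obj2", "Wood"),
]
RULES = [  # (head, body); strings starting with "?" are variables
    (("SafeToStack", "?x", "?y"), [("Lighter", "?x", "?y")]),
    (("Lighter", "?x", "?y"),
     [("Weight", "?x", "?wx"), ("Weight", "?y", "?wy"), ("LessThan", "?wx", "?wy")]),
    (("Weight", "?p", "?w"),
     [("Density", "?p", "?d"), ("Volume", "?p", "?v"),
      ("Equal", "?w", ("times", "?d", "?v"))]),
    (("Weight", "?p", 5), [("Type", "?p", "Endtable")]),  # assumed rule
]
_fresh = count()

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, s):
    """Apply substitution s to term t."""
    while is_var(t) and t in s:
        t = s[t]
    return tuple(walk(a, s) for a in t) if isinstance(t, tuple) else t

def unify(a, b, s):
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def rename(term, n):
    """Give the variables of a rule fresh names for each use."""
    if is_var(term):
        return f"{term}_{n}"
    return tuple(rename(a, n) for a in term) if isinstance(term, tuple) else term

def prove(goal, s):
    """Yield (substitution, proof tree) for every proof of goal."""
    goal = walk(goal, s)
    if goal[0] == "LessThan":              # built-in test
        if goal[1] < goal[2]:
            yield s, (goal, [])
        return
    if goal[0] == "Equal":                 # built-in: w = d * v
        _, w, (_, d, v) = goal
        s2 = unify(w, d * v, s)
        if s2 is not None:
            yield s2, (goal, [])
        return
    for fact in FACTS:
        s2 = unify(goal, fact, s)
        if s2 is not None:
            yield s2, (goal, [])
    for head, body in RULES:
        n = next(_fresh)
        s2 = unify(goal, rename(head, n), s)
        if s2 is not None:
            yield from prove_all([rename(g, n) for g in body], s2, goal, ())

def prove_all(goals, s, parent, children):
    if not goals:
        yield s, (parent, list(children))
        return
    for s2, tree in prove(goals[0], s):
        yield from prove_all(goals[1:], s2, parent, children + (tree,))

def leaves(tree, s):
    goal, kids = tree
    return [walk(goal, s)] if not kids else [l for k in kids for l in leaves(k, s)]

# Like Prolog-EBG, stop at the first proof found.
subst, proof = next(prove(("SafeToStack", "Obj1", "Obj2"), {}))
used = {leaf[0] for leaf in leaves(proof, subst)}
print(sorted(used))
```

The leaves of the first proof mention Density, Volume and Type but never Owner, Color or Material, which is what the analysis step exploits.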
Analyze
Many features appear in an example. Of them, how many are truly relevant?
We consider as relevant those features that show in the explanation.
Example: Relevant feature: Density Irrelevant feature: Owner
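As a small illustration of this relevance test, the sketch below splits the training example into facts that appear among the explanation's leaves and facts that do not. The tuple encoding and the names `example`, `explanation_leaves`, `relevant` and `irrelevant` are assumptions made for the illustration.

```python
# Training example from the slides, encoded as tuples (an assumption).
example = [
    ("On", "Obj1", "Obj2"), ("Type", "Obj1", "Box"), ("Type", "Obj2", "Endtable"),
    ("Color", "Obj1", "Red"), ("Color", "Obj2", "Blue"), ("Volume", "Obj1", 2),
    ("Density", "Obj1", 0.3), ("Owner", "Obj1", "Fred"), ("Owner", "Obj2", "Louise"),
    ("Material", "Obj1", "Cardboard"), ("Material", "Obj2", "Wood"),
]

# Facts of the example that the explanation actually used as leaves.
explanation_leaves = [
    ("Volume", "Obj1", 2),
    ("Density", "Obj1", 0.3),
    ("Type", "Obj2", "Endtable"),
]

relevant = [fact for fact in example if fact in explanation_leaves]
irrelevant = [fact for fact in example if fact not in explanation_leaves]

print("relevant:", relevant)
print("irrelevant:", irrelevant)
```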
Analyze
Taking the leaf nodes of the explanation and substituting variables x and y for Obj1 and Obj2:
SafeToStack(x,y) <- Volume(x,2) ^ Density(x,0.3) ^ Type(y,Endtable)
Remove features that are independent of x and y such as Equal(0.6,times(2,0.3)) and LessThan(0.6,5).
The rule is now more general and can serve to explain other instances matching the rule.
A more general form of generalization (called “regression”) finds the most general rule explaining the example.
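The simple variable-substitution step described above can be sketched as follows. The leaf encoding and the helper name `generalize` are assumptions. The sketch implements only the simple substitution, not regression.

```python
# Leaves of the explanation, encoded as tuples (an assumption).
leaves = [
    ("Volume", "Obj1", 2),
    ("Density", "Obj1", 0.3),
    ("Equal", 0.6, "times(2,0.3)"),
    ("LessThan", 0.6, 5),
    ("Type", "Obj2", "Endtable"),
]
binding = {"Obj1": "x", "Obj2": "y"}


def generalize(leaves, binding):
    """Replace the example's objects by variables and drop leaves
    that mention neither object."""
    body = []
    for pred, *args in leaves:
        if not any(arg in binding for arg in args):
            continue  # independent of x and y, e.g. LessThan(0.6,5)
        new_args = [str(binding.get(arg, arg)) for arg in args]
        body.append(f"{pred}({','.join(new_args)})")
    return body


rule = "SafeToStack(x,y) <- " + " ^ ".join(generalize(leaves, binding))
print(rule)
# SafeToStack(x,y) <- Volume(x,2) ^ Density(x,0.3) ^ Type(y,Endtable)
```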
Refine
The current hypothesis is the set of Horn clauses that we have constructed up to this point.
Using sequential covering we keep adding more rules, thus refining our hypothesis.
A new instance is classified as negative if it is not covered by any rule.
Computing the weakest preimage of explanation
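As a hedged illustration, the weakest preimage for the SafeToStack example can be worked out by regressing the goal through the rules of the explanation, one step at a time. The last step assumes, as the leaves Type(y,Endtable) and LessThan(0.6,5) of the explanation suggest, that the weight of y was established as 5 from its type. The slides elide the corresponding rule, so treat that step as an assumption.

```latex
\begin{align*}
&SafeToStack(x,y) \\
&\quad\Leftarrow\; Lighter(x,y) \\
&\quad\Leftarrow\; Weight(x,w_x) \wedge Weight(y,w_y) \wedge LessThan(w_x,w_y) \\
&\quad\Leftarrow\; Density(x,d_x) \wedge Volume(x,v_x) \wedge Equal(w_x, times(d_x,v_x))
      \wedge Weight(y,w_y) \wedge LessThan(w_x,w_y) \\
&\quad\Leftarrow\; Density(x,d_x) \wedge Volume(x,v_x) \wedge Equal(w_x, times(d_x,v_x))
      \wedge LessThan(w_x,5) \wedge Type(y,Endtable)
\end{align*}
```

Unlike the simple substitution of x and y, this rule no longer fixes the volume at 2 or the density at 0.3. It only requires their product to be below 5.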
Discovering new features
The PROLOG-EBG system described can formulate new features that are not in the training examples:
Example: Volume * Density < 5, derived from the domain theory.
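A minimal sketch of what such a learned feature looks like when used as a classifier. It assumes the threshold 5 that appears in the explanation's LessThan(0.6,5) test, so the product must be below 5. The function name `safe_to_stack` and the dictionary encoding of objects are hypothetical.

```python
def safe_to_stack(x, y):
    """Learned rule: the product of x's volume and density is below 5
    and y is an end table."""
    weight_x = x["volume"] * x["density"]  # new feature, absent from the raw example
    return weight_x < 5 and y["type"] == "Endtable"


box = {"type": "Box", "volume": 2, "density": 0.3, "owner": "Fred"}
table = {"type": "Endtable", "owner": "Louise"}
anvil = {"type": "Anvil", "volume": 10, "density": 7.8}

print(safe_to_stack(box, table))    # weight is about 0.6, below 5
print(safe_to_stack(anvil, table))  # weight is about 78, not below 5
```

Owner, color and material never enter the test, which matches the analysis step: only features used in the explanation survive.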
Inductive Bias in Explanation-Based Learning
What is the inductive bias of explanation based learning?
The hypothesis h follows deductively from D (the training data) and B (the background knowledge, i.e. the domain theory).
Bias: prefer small sets of maximally general Horn Clauses
Problems with EBL
The number of control rules that must be learned is very large.
When there are many control rules, much time is spent searching for the best one.
Utility analysis is used to determine what rules to keep and what rules to forget.
Another problem with EBL is that it is sometimes difficult to create an explanation for the target concept.
For example, in chess, learning a concept like: “states for which operator A leads to a solution”
The search here grows exponentially.
Summary
Unlike inductive learning, analytical learning looks for a hypothesis that fits the background knowledge and covers the training examples.
Explanation based learning is one kind of analytical learning that divides into three steps:
Explain the target value for the current example
Analyze the explanation (generalize)
Refine the hypothesis
PROLOG-EBG constructs intermediate features after analyzing examples.
Explanation based learning can be used to find search control rules.
EBL depends on a perfect domain theory.
Components of EBL
Problem solving: use the domain theory and the example to reach the goal.
Trace analysis: analyze the solution to obtain an explanation, using two criteria:
Relevance criterion: information that lies on the path to the solution.
Operationality criterion: rules that fire directly.
Filtering: filter the information in the trace to arrive at the explanation.
Generalization: replace constants with variables in such a way that the expression remains valid.
The most widely used method is the goal regression algorithm. Regressing a formula f through a rule r means determining the necessary and sufficient conditions under which rule r can be used to derive f.
Building new information: the information we build can be:
Domain rules, which express new concept definitions. The right-hand side of the rule holds the necessary and sufficient combination of conditions from the generalized explanation tree, and the root of the tree becomes the left-hand side.
Control rules, which are built in a similar way.
Incorporation: the new rules must be added to the knowledge base, but sometimes there are too many rules to add to the system.
A trade-off has to be found (inference speed versus a larger base of assertions).
Example 1
Example 2
Target concept: TAZA (cup)
Functional definition:
TAZA(x) <- RECIPIENTE_ABIERTO(x) ^ ESTABLE(x) ^ ALZABLE(x)
(a cup is an open vessel that is stable and liftable)
Training example:
TIENE_PARTE(OBJ1, CONCAVIDAD1)
COLOR(OBJ1, ROJO)
CONCAVIDAD(CONCAVIDAD1)
ORIENTADA_HACIA_ARRIBA(CONCAVIDAD1)
TIENE_DUEÑO(OBJ1, SAM)
TIENE_PARTE(OBJ1, FONDO1)
FONDO(FONDO1)
PLANO(FONDO1)
LIGERO(OBJ1)
TIENE_PARTE(OBJ1, ASA1)
ASA(ASA1)
LONGITUD(ASA1, 5)
Example 2
Domain theory:
ESTABLE(OBJ) <- TIENE_PARTE(OBJ, F) ^ FONDO(F) ^ PLANO(F)
RECIPIENTE_ABIERTO(OBJ) <- TIENE_PARTE(OBJ, C) ^ CONCAVIDAD(C) ^ ORIENTADA_HACIA_ARRIBA(C)
ALZABLE(OBJ) <- TIENE_PARTE(OBJ, A) ^ ASA(A) ^ LIGERO(OBJ)
Operationality criterion: the concept must be defined in terms of the predicates used in the example.
Example 2
The learning process has two steps:
The domain theory is used to build an explanation (a proof) that the training example is a positive example of the target concept. The terminal nodes of the explanation tree must be operational.
The terminal nodes are transformed into a set of conditions sufficient for the proof to remain valid (the explanation is generalized according to the domain theory).
Example 2
Result of the learning process:
TIENE_PARTE(X, Y) ^ CONCAVIDAD(Y) ^ ORIENTADA_HACIA_ARRIBA(Y) ^ TIENE_PARTE(X, Z) ^ FONDO(Z) ^ PLANO(Z) ^ TIENE_PARTE(X, W) ^ ASA(W) ^ LIGERO(X)
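To check the learned conditions against the training example, they can be evaluated directly over the example's facts. This is a minimal sketch: the tuple encoding and the helper names `has`, `parts` and `taza` are assumptions, while the predicate names are those of the example.

```python
FACTS = {
    ("TIENE_PARTE", "OBJ1", "CONCAVIDAD1"), ("COLOR", "OBJ1", "ROJO"),
    ("CONCAVIDAD", "CONCAVIDAD1"), ("ORIENTADA_HACIA_ARRIBA", "CONCAVIDAD1"),
    ("TIENE_DUEÑO", "OBJ1", "SAM"), ("TIENE_PARTE", "OBJ1", "FONDO1"),
    ("FONDO", "FONDO1"), ("PLANO", "FONDO1"), ("LIGERO", "OBJ1"),
    ("TIENE_PARTE", "OBJ1", "ASA1"), ("ASA", "ASA1"), ("LONGITUD", "ASA1", 5),
}


def has(facts, pred, *args):
    return (pred, *args) in facts


def parts(facts, x):
    """All p such that TIENE_PARTE(x, p) holds."""
    return [f[2] for f in facts if f[0] == "TIENE_PARTE" and f[1] == x]


def taza(facts, x):
    """Learned sufficient conditions, using operational predicates only."""
    ps = parts(facts, x)
    open_vessel = any(has(facts, "CONCAVIDAD", p)
                      and has(facts, "ORIENTADA_HACIA_ARRIBA", p) for p in ps)
    stable = any(has(facts, "FONDO", p) and has(facts, "PLANO", p) for p in ps)
    liftable = any(has(facts, "ASA", p) for p in ps) and has(facts, "LIGERO", x)
    return open_vessel and stable and liftable


print(taza(FACTS, "OBJ1"))                        # the training example is covered
print(taza(FACTS - {("ASA", "ASA1")}, "OBJ1"))    # without a handle it is not
```

COLOR, TIENE_DUEÑO and LONGITUD are never consulted, because they do not appear in the explanation.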
Problems with EBL
When EBL is applied to real domains, a number of problems arise that must be solved. They fall into two classes:
Problems caused by the new knowledge added to the theory (theory reformulation)
Problems caused by the quality of the theory (theory revision)
Problems with EBL – Theory reformulation
Systematically adding new knowledge degrades the theory, for two reasons:
Low application frequency: the new knowledge may be used only rarely.
High rule-matching cost: checking whether the new knowledge is useful is very expensive.
This reduces the efficiency of the system.
As a solution, the theory can be rewritten to reduce the cost of matching the rules, and the frequency of use of each rule can be measured so that rarely used rules are removed.
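One way to combine the two factors above, application frequency and matching cost, is a utility score per learned rule. The formula below (average savings times application frequency, minus average match cost) is one common formulation from the EBL literature, shown here only as a sketch. The class `RuleStats`, the functions `utility` and `prune`, and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RuleStats:
    name: str
    times_matched: int     # how often the rule was tested
    times_applied: int     # how often it actually fired
    avg_savings: float     # average time saved when it fires
    avg_match_cost: float  # average time spent testing it


def utility(rule):
    """Expected savings per match attempt minus the cost of the attempt."""
    if rule.times_matched == 0:
        return 0.0
    frequency = rule.times_applied / rule.times_matched
    return rule.avg_savings * frequency - rule.avg_match_cost


def prune(rules):
    """Keep only rules whose estimated utility is positive."""
    return [rule for rule in rules if utility(rule) > 0]


rules = [
    RuleStats("often_useful", times_matched=100, times_applied=40,
              avg_savings=10.0, avg_match_cost=1.0),
    RuleStats("rarely_useful", times_matched=100, times_applied=2,
              avg_savings=10.0, avg_match_cost=1.0),
]
kept = prune(rules)
print([rule.name for rule in kept])
```

With these made-up statistics the rarely applied rule costs more to match than it saves, so it is dropped.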
Problems with EBL – Theory revision
The theory in use does not solve the problems correctly and must be corrected:
Incomplete theories: the theory cannot solve every problem because suitable rules are missing. The theory can be completed by inductive methods.
Incorrect theories: the theory gives wrong solutions to some problems. One can try to determine which rules are wrong and remove them or correct them inductively.
Inconsistent theories: they reach contradictory solutions.
Intractable theories: solving a problem requires more resources than are available.
eduardopoggi@yahoo.com.ar
eduardo-poggi
http://ar.linkedin.com/in/eduardoapoggi
https://www.facebook.com/eduardo.poggi
@eduardoapoggi