Presented by Alp Sardağ Algorithms for POMDP

Dec 22, 2015

Page 1:

Presented by Alp Sardağ

Algorithms for POMDP

Page 2:

Monahan Enumeration Phase

Generate all vectors.

Number of generated vectors = |A| M^|Ω|, where M is the number of vectors from the previous step and Ω is the set of observations.
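The enumeration can be sketched in a few lines of Python. This is a minimal illustration rather than the presenter's code: the array layouts `T[a][s][s2]`, `O[a][s2][o]`, `R[a][s]`, the function name `enumerate_vectors`, and the toy numbers are all assumptions. One previous vector is chosen per observation, which is where the count |A| M^|Ω| (Ω being the observation set) comes from.

```python
from itertools import product

def enumerate_vectors(prev, T, O, R, gamma):
    """Return every candidate vector for the next step (no pruning).

    prev        : list of M vectors from the previous step (lists over states)
    T[a][s][s2] : transition probability  (assumed layout)
    O[a][s2][o] : observation probability (assumed layout)
    R[a][s]     : immediate reward        (assumed layout)
    """
    n_states = len(R[0])
    n_obs = len(O[0][0])
    out = []
    for a in range(len(R)):
        # One previous vector is picked for each observation: M**|Omega| choices.
        for choice in product(range(len(prev)), repeat=n_obs):
            vec = []
            for s in range(n_states):
                future = sum(
                    T[a][s][s2] * O[a][s2][o] * prev[choice[o]][s2]
                    for s2 in range(n_states)
                    for o in range(n_obs)
                )
                vec.append(R[a][s] + gamma * future)
            out.append(vec)
    return out

# Toy model: 2 states, 2 actions, 2 observations (numbers are made up).
T = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.5, 0.5]]]
O = [[[0.8, 0.2], [0.3, 0.7]], [[0.5, 0.5], [0.5, 0.5]]]
R = [[1.0, 0.0], [0.0, 1.5]]
prev = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]   # M = 3
vectors = enumerate_vectors(prev, T, O, R, gamma=0.95)
print(len(vectors))   # |A| * M**|Omega| = 2 * 3**2 = 18
```

The exponential growth in the number of observations is visible directly in the `product(..., repeat=n_obs)` loop.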

Page 3:

Monahan Reduction Phase

All vectors can be kept:

Each time, maximize over all vectors.

This carries a lot of excess baggage.

The number of vectors in the next step will be even larger.

LP is used to trim away useless vectors.

Page 4:

Monahan Reduction Phase

For a vector to be useful, there must be at least one belief point at which it gives a larger value than every other vector:

Page 5:

Monahan Algorithm

Page 6:

Monahan’s LP Complication

Formulate LP and check for :
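One common way to write the usefulness test as an LP is sketched below. The slide's own equation is not captured in the transcript, so this formulation is an assumption; the symbols π (belief state), α (the vector being tested), Γ (the set of enumerated vectors), and δ (a slack variable) are chosen here for illustration.

```latex
% Usefulness condition for a vector \alpha:
\exists\, \pi:\quad \pi \cdot \alpha > \pi \cdot \alpha'
\quad \forall\, \alpha' \in \Gamma \setminus \{\alpha\}

% LP that searches for such a belief point:
\begin{aligned}
\max_{\pi,\,\delta}\quad & \delta \\
\text{s.t.}\quad & \pi \cdot (\alpha - \alpha') \ge \delta
    \quad \forall\, \alpha' \in \Gamma \setminus \{\alpha\}, \\
& \sum_{s} \pi(s) = 1, \qquad \pi(s) \ge 0 \quad \forall s .
\end{aligned}
```

Under this formulation, α is kept when the optimal δ is strictly positive, and the maximizing π is a belief point at which α beats every other vector.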

Page 7:

Eagle’s Variant of Monahan

The optimization occurs in the enumeration phase.

If, in the enumeration process, a vector’s components are completely dominated by another vector’s components, discard it.

If α_i^j(t) is generated and the following condition holds:

Discard α_i^j(t).

The same test can be applied to check whether a new vector dominates any previously enumerated vector.
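The componentwise domination test is cheap enough to run on every vector as it is generated. A minimal sketch follows, with hypothetical helper names `dominates` and `add_with_domination_check`. Componentwise domination is sufficient but not necessary for a vector to be useless, so some useless vectors can survive this check.

```python
def dominates(u, v):
    """True if u is at least as large as v in every component (state)."""
    return all(ui >= vi for ui, vi in zip(u, v))

def add_with_domination_check(kept, new):
    """Add `new` to `kept` unless it is dominated; drop vectors it dominates."""
    if any(dominates(old, new) for old in kept):
        return kept                      # new vector is dominated: discard it
    # Reverse check: remove earlier vectors that the new one dominates.
    return [old for old in kept if not dominates(new, old)] + [new]

kept = []
for vec in [[1.0, 0.0], [0.5, -0.2], [0.0, 1.0], [1.2, 0.1]]:
    kept = add_with_domination_check(kept, vec)
print(kept)   # [[0.0, 1.0], [1.2, 0.1]]
```

Here `[0.5, -0.2]` is discarded on arrival because `[1.0, 0.0]` dominates it, and `[1.0, 0.0]` is later removed when `[1.2, 0.1]` arrives.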

Page 8:

Sondik’s One-Pass Algorithm

Find the proper set of belief states to plug into the formula below to get all necessary vectors:

The algorithm is guaranteed to visit a finite number of regions.

The union of these regions is the entire belief space.
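The formula this slide refers to is not reproduced in the transcript. A standard way to generate the vector that is optimal at a given belief state is the point-based Bellman backup, sketched below on the assumption that this is what is meant. The function name `backup`, the array layouts `T[a][s][s2]`, `O[a][s2][o]`, `R[a][s]`, and the toy numbers are illustrative.

```python
def backup(b, prev, T, O, R, gamma):
    """Return (value, vector, action) of the best next-step vector at belief b."""
    n_states = len(b)
    n_obs = len(O[0][0])
    best = None
    for a in range(len(R)):
        vec = list(R[a])
        for o in range(n_obs):
            # Project every previous vector back through action a, observation o.
            cands = [
                [sum(T[a][s][s2] * O[a][s2][o] * alpha[s2]
                     for s2 in range(n_states))
                 for s in range(n_states)]
                for alpha in prev
            ]
            # Keep the projection that is best at this particular belief.
            g = max(cands, key=lambda c: sum(bi * ci for bi, ci in zip(b, c)))
            vec = [v + gamma * gi for v, gi in zip(vec, g)]
        value = sum(bi * vi for bi, vi in zip(b, vec))
        if best is None or value > best[0]:
            best = (value, vec, a)
    return best

# Toy model: 2 states, 2 actions, 2 observations (numbers are made up).
T = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.5, 0.5]]]
O = [[[0.8, 0.2], [0.3, 0.7]], [[0.5, 0.5], [0.5, 0.5]]]
R = [[1.0, 0.0], [0.0, 1.5]]
value, vec, action = backup([0.5, 0.5], [[1.0, 0.0], [0.0, 1.5]], T, O, R, 0.95)
print(action, vec, value)
```

Each call yields one vector, so the difficulty the slide points at is choosing a finite set of belief states whose vectors together cover the whole belief space.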

Page 9:

Simplified version of Sondik’s algorithm:

Sondik’s One-Pass Algorithm

Page 10:

How can we define a region around this belief state in which that vector is guaranteed to be the true linear portion of the value function?

Construct a series of constraints; when they are satisfied, the region is found.

Then go to step (5).

Sondik’s One-Pass Algorithm

Page 11:

The condition that α*(t), generated at π, stays larger than every other α^a(t) as π varies:

Variations in π can cause changes in α^a(t).

A new constraint is needed to ensure the components of α^a(t) stay the same.

Sondik’s One-Pass Algorithm

Page 12:

What affects α*(t) and α^a(t)?

To ensure that no part of the function changes, these constraints exist for every combination of a and

Sondik’s One-Pass Algorithm

Page 13:

Constraints restrict belief states to lie on the belief state space simplex:

Sondik’s One-Pass Algorithm

Page 14:

A constraint defines a region consisting of all the points on one side of a line:

Sondik’s One-Pass Algorithm

Page 15:

The LP constraints at step (4):

Sondik’s One-Pass Algorithm

Page 16:

In step (5), find belief states guaranteed not to be in the region defined in step (4).

With the new point, proceed exactly as in step (4).

The algorithm continues until a complete partition of the belief space is found.

Sondik’s One-Pass Algorithm

Page 17:

To find points in the neighboring regions, points lying on the edge of the region defined by the constraints are used:

Sondik’s One-Pass Algorithm

Page 18:

Which constraints are binding?

For each constraint, change its inequality into an equality.

Solve this LP.

If the LP has a solution, the constraint is binding; a non-binding constraint cannot pass through the region defined by all the other constraints.
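The binding test can be illustrated in the two-state case, where a belief is a single number p = Pr(state 0) and each constraint has the form c·p ≤ d. This is a deliberate simplification, not the general method: turning one constraint into an equality pins p to a single point, so the feasibility LP collapses to checking the remaining constraints at that point. With more states a real LP solver would be needed. The function name `binding_constraints` and the numbers are assumptions.

```python
def binding_constraints(constraints, tol=1e-9):
    """Indices of binding constraints; each constraint (c, d) means c * p <= d."""
    binding = []
    for i, (c, d) in enumerate(constraints):
        if abs(c) < tol:
            continue                     # constraint does not restrict p at all
        p = d / c                        # the only point where constraint i is tight
        others = constraints[:i] + constraints[i + 1:]
        # Feasible if every other constraint still holds at that point.
        if all(cj * p <= dj + tol for cj, dj in others):
            binding.append(i)
    return binding

# Simplex bounds 0 <= p <= 1 plus two region constraints (made-up numbers).
cons = [(-1.0, 0.0),    # p >= 0
        (1.0, 1.0),     # p <= 1
        (1.0, 0.6),     # p <= 0.6
        (-1.0, -0.2)]   # p >= 0.2
print(binding_constraints(cons))   # [2, 3]
```

The region here is 0.2 ≤ p ≤ 0.6, so only the last two constraints touch it; the simplex bounds never pass through the region and are reported as non-binding.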

Sondik’s One-Pass Algorithm

Page 19:

Cheng’s Relaxed Region

Same as Sondik’s One-Pass algorithm, except that each region is specified with fewer constraints.

It defines regions that will typically be larger than the actual vectors’ regions.

Page 20:

Set of constraints for the relaxed regions of Cheng:

Cheng’s Relaxed Region

Page 21:

Cheng’s Relaxed Region

Corners found with interior algorithm:

Page 22:

The algorithm defines an approximate value function over the entire belief space.

It refines this approximation until it reaches the optimal value function.

Cheng’s Linear Support

Page 23:

Difference between two algorithms:

Cheng’s Linear Support

Page 24:

Initialize a search list with the extreme points of the belief simplex (e.g. [1,0,0,...], [0,1,0,...]) and an empty set of vectors.

For each of these points, the true α(t) vector is calculated and added to the set of vectors.

Cheng’s Linear Support

Page 25:

Since both the true value function and the approximation are PWLC (piecewise linear and convex), the largest difference must occur at a corner point.

Cheng then finds all the corner points of the regions induced by the approximation.

Disregard the corner points seen before and add those not seen before to the search list.

Pick a point from the search list and generate its vector.

If it differs from all the vectors in the approximation, add it to the approximation set.

Repeat the whole procedure with the new approximation.
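The loop described above can be sketched for the two-state case, where a belief is p = Pr(state 0) and the corner points of the approximation are simply the intersections of vectors on its upper surface. This is an illustrative simplification under stated assumptions, not Cheng's general procedure, which must enumerate region vertices in higher dimensions. The names `backup`, `corners`, `linear_support`, the array layouts, and the toy numbers are all hypothetical.

```python
def backup(b, prev, T, O, R, gamma):
    """Best next-step vector at belief b (point-based Bellman backup)."""
    n = len(b)
    best_val, best_vec = None, None
    for a in range(len(R)):
        vec = list(R[a])
        for o in range(len(O[a][0])):
            cands = [[sum(T[a][s][s2] * O[a][s2][o] * al[s2] for s2 in range(n))
                      for s in range(n)] for al in prev]
            g = max(cands, key=lambda c: sum(x * y for x, y in zip(b, c)))
            vec = [v + gamma * gi for v, gi in zip(vec, g)]
        val = sum(x * y for x, y in zip(b, vec))
        if best_val is None or val > best_val:
            best_val, best_vec = val, vec
    return best_vec

def value(vec, p):
    """Value of a two-state vector at belief [p, 1 - p]."""
    return vec[0] * p + vec[1] * (1 - p)

def corners(vectors, tol=1e-9):
    """Intersection points of vector pairs that lie on the upper surface."""
    pts = set()
    for i, u in enumerate(vectors):
        for v in vectors[i + 1:]:
            denom = (u[0] - u[1]) - (v[0] - v[1])
            if abs(denom) < tol:
                continue                 # parallel lines never cross
            p = (v[1] - u[1]) / denom
            if 0 <= p <= 1 and value(u, p) >= max(value(w, p) for w in vectors) - tol:
                pts.add(round(p, 9))
    return pts

def linear_support(prev, T, O, R, gamma, tol=1e-9):
    vectors, seen, search = [], {0.0, 1.0}, [0.0, 1.0]   # simplex extreme points
    while search:
        p = search.pop()
        vec = backup([p, 1 - p], prev, T, O, R, gamma)
        if any(all(abs(x - y) < tol for x, y in zip(vec, w)) for w in vectors):
            continue                     # already in the approximation
        vectors.append(vec)
        for q in corners(vectors) - seen:   # only corners not seen before
            seen.add(q)
            search.append(q)
    return vectors

# Toy model: 2 states, 2 actions, 2 observations (numbers are made up).
T = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.5, 0.5]]]
O = [[[0.8, 0.2], [0.3, 0.7]], [[0.5, 0.5], [0.5, 0.5]]]
R = [[1.0, 0.0], [0.0, 1.5]]
prev = [[1.0, 0.0], [0.0, 1.5]]
vectors = linear_support(prev, T, O, R, gamma=0.95)
print(len(vectors))
```

Unlike full enumeration, every vector produced this way is optimal at some belief point, so no pruning pass is needed afterwards.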

Cheng’s Linear Support

Page 26:

Cheng’s Linear Support

Page 27:

Cheng’s Linear Support