Heuristic Search Techniques in Video-Game Pathfinding: A Survey of Issues and Techniques

International Journal of Research (IJR) Vol-1, Issue-7, August 2014 ISSN 2348-6848


Heuristic Search Techniques in Video-Game Pathfinding: A Survey of Issues and Techniques

Azeem Mohammad1, Supreethi K.P2

1 Department of Computer Science and Engineering, JNTU College of Engineering, Hyderabad, India. Email: [email protected]

2 Department of Computer Science and Engineering, JNTU College of Engineering, Hyderabad, India. Email: [email protected]

Abstract: Real-time heuristic search algorithms must satisfy a time bound per move that is independent of problem size. They are used in environments where memory and time are limited and a fast response is required. Pathfinding in video games is a prime example, since multiple units need to react promptly to the player's commands. Classical heuristic search techniques cannot be applied directly because of their state re-visitation problem. Recent algorithms use a database of pre-computed subgoals to improve performance. However, pre-computation time can be long, and there is no guarantee that the pre-computed data adequately cover the search space. To address these shortcomings, hill climbing and dynamic programming are combined to eliminate state re-visitation.

Keywords: Pathfinding, search space, real-time search.

1. INTRODUCTION

Pathfinding is an active research area in many computing domains, and gaming is one of the most important. Many methods have been devised to find the least-cost path between two points. Because movement is a central aspect of video games, there is a need for methods that compute a path in little time and with little memory. Finding a shortest path within the bounded time period imposed by a real-time gaming environment is a difficult task. This constraint makes pathfinding methods more complex, because they must produce a path in very little time while using very little memory. Pathfinding computes the shortest route between two nodes, which makes it possible to move from one point to another, and video games are one of its real-time applications. Heuristic search methods play a significant part in video-game pathfinding, and more advanced methods are still being developed to reduce time and memory requirements.

2. HEURISTIC SEARCH

Heuristic search is a core area of Artificial Intelligence (AI) research, and its algorithms have been used in planning, game playing and agent control. A heuristic function informs the search about the goal: it provides an informed way to guess which neighbour of a node will lead to a goal. This information is commonly expressed as a heuristic function h(n), which takes a node n and returns a non-negative real number that estimates the path cost from n to a goal node. Commonly listed heuristic search techniques include Generate and Test, Hill Climbing, Simulated Annealing, Depth-First Search, Breadth-First Search, and Best-First Search (including A* search).




The term heuristic is used for algorithms that find solutions among all possible ones but give no guarantee that the best one will be found. They may therefore be considered approximate rather than exact algorithms.
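As an illustration of a heuristic function h(n), the following minimal Python sketch computes two estimates that are often used on grid maps. The function names, the coordinate convention (row, column) and the diagonal move cost of sqrt(2) are assumptions made for this example; they are not taken from the surveyed papers.

```python
import math

def manhattan(a, b):
    """Estimate for four-connected grids: sum of the axis distances."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def octile(a, b):
    """Estimate for eight-connected grids, assuming diagonal moves cost sqrt(2)."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    # Take min(dx, dy) diagonal steps, then finish with straight steps.
    return max(dx, dy) + (math.sqrt(2) - 1) * min(dx, dy)
```

Both functions ignore obstacles, so they never overestimate the true path cost on the corresponding grid type.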

3. REALTIME HEURISTIC SEARCH METHODS

Real-time heuristic search algorithms satisfy a constant upper bound on the amount of planning per action, independent of problem size. This property is important in a number of applications, including autonomous robots and agents in video games. A common problem in video games is finding a path between two points. In most real-time games, agents are expected to act quickly in response to the player's commands and to other agents' actions.

3.1 LRTA*: CORE ALGORITHM

The core of most real-time heuristic search algorithms is an algorithm called Learning Real-Time A* (LRTA*). LRTA* is a special case of value iteration, or real-time dynamic programming, and has a problem that has prevented its use in video-game pathfinding. Specifically, the algorithm updates a single heuristic value per move on the basis of the heuristic values of nearby states. As a result, when the initial heuristic values are overly optimistic (i.e., too low), LRTA* re-visits the same states many times, each time making an update of small magnitude. This behaviour is known as scrubbing and appears highly irrational to an observer. There have been several attempts to speed up the learning process in LRTA*. Most of the resulting algorithms can be described by the following four attributes:

The local search space is the set of states whose heuristic values are accessed in the planning stage. The local learning space is the set of states whose heuristic values are updated; common choices are the current state only, all states within the local search space, and previously visited states and their neighbours. The learning rule is used to update the heuristic values of the states in the learning space. The control strategy decides on the move to make after the planning and learning phases; commonly used strategies include the first move of an optimal path to the most promising frontier state, the entire path, and backtracking moves.
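The four attributes can be made concrete with a minimal sketch of basic LRTA* with a lookahead of one. This is an illustration under stated assumptions, not the implementation used in the surveyed work: the names lrta_star and grid_neighbors are hypothetical, every move is assumed to cost 1, the map is a four-connected grid of '.' (free) and '#' (wall) cells, and every state is assumed to have at least one neighbour.

```python
def lrta_star(start, goal, neighbors, h0, max_moves=10_000):
    """Basic LRTA* with a lookahead of one; unit move cost is an assumption."""
    h = {}                                   # learned heuristic values
    value = lambda s: h.get(s, h0(s, goal))  # fall back to the initial heuristic
    path, s = [start], start
    while s != goal and len(path) <= max_moves:
        # Planning: the local search space is the set of immediate neighbours.
        best = min(neighbors(s), key=lambda n: 1 + value(n))
        # Learning: the learning space is the current state only.
        h[s] = max(value(s), 1 + value(best))
        # Control: take a single step to the most promising neighbour.
        s = best
        path.append(s)
    return path, h

def grid_neighbors(grid):
    """Return a neighbour function for a four-connected grid of '.' and '#'."""
    def neighbors(s):
        r, c = s
        steps = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
        return [(y, x) for y, x in steps
                if 0 <= y < len(grid) and 0 <= x < len(grid[0])
                and grid[y][x] != "#"]
    return neighbors

def manhattan(a, b):
    """Initial heuristic: Manhattan distance, which ignores walls."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

# A 3x3 map with a wall in the centre; the agent must walk around it.
demo_grid = ["...",
             ".#.",
             "..."]
demo_path, demo_h = lrta_star((1, 0), (1, 2), grid_neighbors(demo_grid), manhattan)
```

On this small map the heuristic of the start state is raised from 2 to 4 on the first move, which is the single-state update that causes scrubbing on larger maps with deeper heuristic depressions.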

3.2 THE ADVENT OF LRTA*

With the dynamic-programming-style learning rule in place, researchers have attempted to speed up the learning process and make state re-visitation less apparent. A later variant, LSS LRTA*, expands its local search space using A* and updates the heuristic values of all states in that space in order to speed up learning. This substantially reduces state re-visitation but does not eliminate scrubbing, and it can still produce highly suboptimal paths.

3.2.1 Pre-computed subgoals

Performance can be improved significantly by solving a number of problems off-line and storing the solutions in a database. On-line, these solved problems can then be used to guide the agent by directing it to a nearby subgoal instead of a distant goal. Several previously developed real-time heuristic search algorithms use pre-computed subgoals.

4. D LRTA*

Although in general planning a goal is often represented as a conjunction of simple subgoals, the only real-time heuristic search algorithm considered so far that implements subgoaling is D LRTA* [1]. In its pre-processing phase, D LRTA* uses the clique abstraction of Sturtevant and Buro (2005) to create a smaller search graph. The clique abstraction collapses a set of fully connected states into a single abstract state and can be applied iteratively to compute progressively smaller graphs. For example, a 2-level abstraction applies the clique abstraction to a graph that has already been abstracted once. Similarly, an a-level abstraction applies the clique abstraction a times. If we assume that each abstraction step reduces the graph by a constant factor c > 1, an a-level abstract graph contains roughly c^a times fewer states than the original graph. This abstraction technique in effect partitions the map into a number of regions, with each region corresponding to a single abstract state. Then, for every pair of distinct abstract states, D LRTA* computes an optimal path between the corresponding representative states (e.g., the centroids of the regions) in the original non-abstracted space.

EXAMPLE OF D LRTA* OPERATION

(a) Off-line, the map is partitioned into seven regions (or abstract states). Each vacant cell is labelled with its region number.

(b) Off-line, an optimal path between the centroids of two regions (C1 and C2) is computed, and the entry state to the next region (E) is recorded as a subgoal for this pair of regions.

(c) On-line, the agent intends to travel from S to G. It determines the corresponding regions and sets the pre-computed entry state E as its subgoal.

There are three key problems with D LRTA*.




First, due to the fact that entry states (i.e., subgoals) have to be computed and stored for each pair of distinct regions, the number of regions has to be kept relatively small. In D LRTA* this is accomplished by applying the clique abstraction procedure multiple times so that the regions become progressively larger and fewer in number. A side effect is that regions will no longer be cliques and may, in fact, be quite complex in themselves. As a result, LRTA* may encounter heuristic depressions within a region. Second, each state in the original space needs to be assigned to a region. Since the regions are irregular in shape, explicit membership records must be maintained. This may require as much additional memory as storing the original grid-based map. Third, clique abstraction is a non-trivial process and puts an extra programming burden on practitioners (e.g., game developers).
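The memory pressure behind the first two problems can be estimated with a few lines of arithmetic. The sketch below is only a back-of-the-envelope model: the function names, the reduction factor of 4 per abstraction level and the map size of one million states are hypothetical values chosen for illustration, and one stored entry state per ordered pair of distinct regions is assumed.

```python
def abstract_state_count(num_states, reduction_factor, levels):
    """Approximate number of regions after `levels` rounds of abstraction."""
    return max(1, round(num_states / reduction_factor ** levels))

def subgoal_table_entries(num_regions):
    """One entry state per ordered pair of distinct regions (an assumption)."""
    return num_regions * (num_regions - 1)

# Hypothetical map with one million states, abstracted five times.
regions = abstract_state_count(1_000_000, 4, 5)
entries = subgoal_table_entries(regions)
```

Under these assumptions the subgoal table grows quadratically with the number of regions, while the membership records grow linearly with the number of ground-level states.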

5. TIME BOUNDED A* SEARCH

Another recent high-performance real-time search algorithm is Time-Bounded A* (TBA*), a time-bounded variant of classic A*. It expands states in A* fashion, using a closed list and an open list, away from the original start state and towards the goal until the goal state is expanded. However, unlike A*, which computes a complete path before committing to the first action, TBA* time-slices the planning: it interrupts its search periodically and acts. Initially, before a complete path to the goal is known, the agent takes an action that moves it towards the most promising state on the open list. If on a subsequent time slice a different most promising path is formed and the agent is not on that path, it backtracks along its steps as necessary. This interleaving of planning, acting and backtracking is done in such a way that both real-time behaviour and completeness are ensured. The size of the time slice is given as a parameter to the algorithm, measured as the number of states that may be expanded before planning must be interrupted. Within a single time slice, however, the algorithm must perform both state expansions and the tracing back through the closed list that forms the path to the most promising state on the open list. The cost of the latter operations is therefore converted to state-expansion equivalents (typically, several backtracing steps can be performed at the same computational cost as a single state expansion). A key advantage of TBA* over LRTA*-based algorithms is that it retains its closed and open lists across planning steps. Thus, on each planning step it does not start planning from scratch but continues with the open and closed lists from the previous planning step. It also does not need to update heuristics on-line to ensure completeness, nor does it require a pre-computation phase. While the lack of pre-computation is certainly its strength, its weaknesses include high suboptimality when the amount of time per move is low and high on-line space complexity due to storing the closed and open lists.
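The time-slicing idea can be sketched as an A* search whose open and closed lists persist between calls. The class below is a simplified illustration, not the published TBA* algorithm: the class name, the unit move cost and the assumption of a consistent heuristic are choices made for this example, the agent's own movement and backtracking are omitted, and the cost of tracing the path is not charged against the expansion budget.

```python
import heapq

class TimeSlicedAStar:
    """A* whose open and closed lists persist between planning slices."""

    def __init__(self, start, goal, neighbors, h):
        self.goal, self.neighbors, self.h = goal, neighbors, h
        self.g = {start: 0}                    # best known cost from the start
        self.parent = {start: None}            # back-pointers for path tracing
        self.closed = set()
        self.open = [(h(start, goal), start)]  # priority queue ordered by f = g + h
        self.done = False

    def plan(self, budget):
        """Expand at most `budget` states, then return the path to the goal if
        it has been expanded, otherwise to the most promising open state."""
        while budget > 0 and self.open and not self.done:
            _, s = heapq.heappop(self.open)
            if s in self.closed:
                continue                       # stale queue entry
            self.closed.add(s)
            budget -= 1
            if s == self.goal:
                self.done = True
                break
            for n in self.neighbors(s):
                g_new = self.g[s] + 1          # unit move cost (assumption)
                if g_new < self.g.get(n, float("inf")):
                    self.g[n], self.parent[n] = g_new, s
                    heapq.heappush(self.open, (g_new + self.h(n, self.goal), n))
        if self.done:
            return self._trace(self.goal)
        while self.open and self.open[0][1] in self.closed:
            heapq.heappop(self.open)           # drop stale entries before peeking
        return self._trace(self.open[0][1]) if self.open else None

    def _trace(self, s):
        """Follow back-pointers from s to the start and return the path."""
        path = []
        while s is not None:
            path.append(s)
            s = self.parent[s]
        return path[::-1]
```

A caller would invoke plan once per move with the expansion budget of one time slice and steer the agent towards the returned path. Because nothing is discarded between calls, the total work matches a single A* search, at the price of keeping both lists in memory.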

6. INTUITION FOR KNN LRTA*

kNN LRTA* attempts to address the shortcomings of D LRTA* without using state abstraction. In our design of kNN LRTA* we address the three shortcomings of D LRTA* listed earlier. In doing so, we identify two key aspects of subgoal-based real-time heuristic search. First, we need to define a set of subgoals that is efficient to compute and store off-line. Second, we need to define a way for the agent to find, on-line, a subgoal relevant to its current problem.

Intuitively, if an LRTA*-controlled agent is in state s and is going to state s_goal, then the best subgoal is a state s_ideal that resides on an optimal path between s and s_goal and can be reached by LRTA* along an optimal path with no state re-visitation. Given that there can be multiple optimal paths between two states, it is unclear how to detect, in a computationally efficient way, the LRTA* agent's deviation from an optimal path immediately after it occurs. On the positive side, state re-visitation can be detected efficiently by running a simple greedy hill-climbing agent. This is based on the fact that if a hill-climbing agent can reach a state b from a state a without encountering a local minimum or a plateau in the heuristic, then an LRTA* agent can travel from a to b without state re-visitation. Thus, we propose an efficiently computable approximation to s_ideal. Namely, we define the subgoal for a pair of states s and s_goal as the state s_subgoal farthest along an optimal path between s and s_goal that can be reached by a simple hill-climbing agent. In summary, we select subgoals so as to remove any scrubbing, but we do not guarantee that the LRTA* agent stays on an optimal path between the subgoals. In practice, however, only a tiny fraction of our subgoals are reached by the hill-climbing agent suboptimally, and even then the suboptimality is minor.

This approximation to the ideal subgoal allows us to effectively compute a series of subgoals for a given pair of start and goal states. Intuitively, we compress an optimal path into a series of key states such that each of them can be reached from its predecessor without scrubbing. The compression allows us to save a large amount of memory without much impact on time per move. Indeed, hill climbing from one key state to the next requires inspecting only the immediate neighbours of the current state and selecting one of them greedily. The re-visitation-free reachability of one subgoal from another addresses the first key shortcoming of D LRTA*, where the agent may get trapped within a single complex region and thus be unable to reach its prescribed subgoal.

However, it is still infeasible to compute and then compress an optimal path between every two distinct states in the original search space. This problem is solved by compressing, off-line, only a pre-determined fixed number of optimal paths between random states. Then, on-line, kNN LRTA*, tasked with going from s to s_goal, retrieves the most similar compressed path from its database and uses the associated subgoals. We define the (dis-)similarity of a database path to the agent's current situation as the maximum of the heuristic distances between s and the path's beginning and between s_goal and the path's end. The maximum is used because we would like both ends of the path to be heuristically close to the agent's current state and to the goal, respectively. Indeed, the heuristic distance ignores walls, and thus a large heuristic distance to either end of the path tends to make that end unreachable by hill climbing.

We illustrate this intuition with a simple example. The following figure shows kNN LRTA* operating off-line. On this map, two random start and goal pairs are selected and optimal paths are computed between them. Then each path is compressed into a series of subgoals such that each subgoal can be reached from the previous one via hill climbing. The path from S1 to G1 is compressed into two subgoals and the other path is compressed into a single subgoal.
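The hill-climbing reachability test and the path compression described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the farthest reachable state is found by a simple linear scan from the end of the path (the surveyed work may use a different search), consecutive path states are assumed to be adjacent, and the returned list ends with the final state of the path.

```python
def hill_climb_reaches(a, b, neighbors, h, max_steps=1000):
    """True if greedy descent on h(., b) leads from a to b without meeting
    a local minimum or a plateau in the heuristic."""
    s = a
    for _ in range(max_steps):
        if s == b:
            return True
        best = min(neighbors(s), key=lambda n: h(n, b), default=None)
        if best is None or h(best, b) >= h(s, b):
            return False            # local minimum or plateau
        s = best
    return False

def compress_path(path, neighbors, h):
    """Compress a path into subgoals such that each subgoal is reachable
    from the previous one (or from the path's start) by hill climbing."""
    subgoals, anchor = [], 0
    while anchor < len(path) - 1:
        far = anchor + 1            # the adjacent state is the fallback
        # Scan from the end of the path for the farthest reachable state.
        for i in range(len(path) - 1, anchor, -1):
            if hill_climb_reaches(path[anchor], path[i], neighbors, h):
                far = i
                break
        subgoals.append(path[far])
        anchor = far
    return subgoals
```

On a 3x3 map with a wall in the centre, for example, a five-state path around the wall compresses to the corner state behind the wall followed by the path's final state, because the end of the path is not hill-climb reachable from its start.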

EXAMPLE OF KNN LRTA* OFF-LINE OPERATION:




(a): two (start, goal) pairs are chosen: (S1;G1) and (S2;G2).

(b): optimal paths between them are computed by running A*.

(c): the two paths are compressed into a total of three subgoals.

Once this database of two records is built, kNN LRTA* can be tasked with solving a problem on-line. In the on-line example it is tasked with going from state S to state G. The database is scanned and the similarity between (S;G) and each of the two database records is determined. The records are sorted by their similarity: (S1;G1) followed by (S2;G2). Then the agent runs reachability checks from S to Si and from Gi to G, where i runs over the database indices in the order of record similarity. In this example, S1 is found to be unreachable by hill climbing from S, and thus the record (S1;G1) is discarded. The second record passes the hill-climbing checks and the agent is tasked with going to its first subgoal.

EXAMPLE OF KNN LRTA* ON-LINE OPERATION:




(a): the agent intends to travel from S to G.

(b): similarity of (S;G) to (S1;G1) and (S2;G2) is computed.

(c): while (S1;G1) is more similar to (S;G) than (S2;G2), its beginning S1 is not reachable from S via hill climbing, and hence the record (S2;G2) is selected and the agent is tasked with going to subgoal 1.

The similarity plus hill-climbing check approach makes the state abstraction of D LRTA* unnecessary, thereby addressing its other two key shortcomings: high memory requirements and a complex pre-computation phase.
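The on-line record selection walked through in this example can be sketched in a few lines. The record layout (a dict with the keys 'start', 'goal' and 'subgoals'), the function name and the reachable callback are hypothetical choices for this illustration; the callback stands for a hill-climbing reachability check, and k limits how many of the most similar records are examined.

```python
def select_record(s, goal, database, h, reachable, k=3):
    """Return the most similar of the k nearest database records that passes
    both hill-climbing checks, or None if none of them does."""
    def dissimilarity(rec):
        # Maximum of the heuristic distances at the two ends of the path.
        return max(h(s, rec["start"]), h(rec["goal"], goal))

    for rec in sorted(database, key=dissimilarity)[:k]:
        # The agent must reach the record's start, and the true goal must
        # be reachable from the record's end.
        if reachable(s, rec["start"]) and reachable(rec["goal"], goal):
            return rec
    return None
```

If no record passes the checks, a caller would have to fall back to some other behaviour, for example plain LRTA* towards the goal; that fallback is outside the scope of this sketch.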

7. HILL CLIMBING AND DYNAMIC PROGRAMMING SEARCH (HCDPS)

The HCDPS algorithm operates in two stages: offline and online. The offline stage is performed once, before any searches, and pre-computes information to speed up subsequent searches. The offline stage may take a considerable amount of time and is not real-time. The online stage takes a given search problem and uses the pre-computed information to solve the problem efficiently in real time.

During the offline stage, the algorithm analyzes its search space and pre-computes a database of subgoals. The database covers the space such that any pair of start and goal states will have a series of subgoals in the database. This is accomplished by abstracting the space. We partition the space into regions in such a way that every state in a region is mutually reachable via hill climbing with a designated state, called the representative of the region. Since the abstraction builds regions using hill climbing, which is also used in the online phase, we are guaranteed that for any start state s_start, our agent can hill climb to the representative of some region R_start. Likewise, for any goal state s_goal, there is a region R_goal that the goal falls into, which means that the agent will be able to hill climb from the representative of R_goal to s_goal. All we need now is a hill-climbable path between the representative of region R_start and the representative of region R_goal. For every pair of close regions, we run A* in the ground-level space to compute an optimal path between the region representatives. We then use dynamic programming to assemble the computed optimal paths into paths between more distant regions, until we have an approximately optimal path between the representatives of any two regions. Once the paths are computed, they are compressed into a series of subgoals in the kNN LRTA* fashion. Specifically, each subgoal is selected to be reachable from the preceding one via hill climbing. Each such sequence of subgoals is stored as a record in the subgoal database. Finally, we build an index for the database that maps any state to its region representative in constant time.

Online, for a given pair of start and goal states, we use the index to find their region representatives. The subgoal path between the region representatives is retrieved from the database. The agent first hill climbs from its start state to the region representative. The agent then follows the record's subgoals one by one until the end of the record is reached. Finally, the agent hill climbs from the region representative to the goal state.
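The dynamic programming step that assembles paths between distant regions can be illustrated with a Floyd-Warshall-style recurrence over region representatives. The choice of this particular recurrence, the function names and the input format are assumptions made for this sketch; the text above only states that dynamic programming is used. The input maps a pair (i, j) of neighbouring regions to the cost of the optimal path that A* found between their representatives.

```python
def assemble_region_paths(num_regions, neighbor_costs):
    """Combine optimal paths between neighbouring regions into paths
    between all pairs of regions by dynamic programming."""
    inf = float("inf")
    cost = [[0 if i == j else inf for j in range(num_regions)]
            for i in range(num_regions)]
    nxt = [[None] * num_regions for _ in range(num_regions)]
    for (i, j), c in neighbor_costs.items():
        cost[i][j], nxt[i][j] = c, j
    for k in range(num_regions):            # allow region k as a waypoint
        for i in range(num_regions):
            for j in range(num_regions):
                if cost[i][k] + cost[k][j] < cost[i][j]:
                    cost[i][j] = cost[i][k] + cost[k][j]
                    nxt[i][j] = nxt[i][k]
    return cost, nxt

def region_route(i, j, nxt):
    """Sequence of regions to traverse from region i to region j, or None."""
    if i != j and nxt[i][j] is None:
        return None
    route = [i]
    while i != j:
        i = nxt[i][j]
        route.append(i)
    return route
```

Concatenating the stored ground-level paths along the returned route gives a path between two distant representatives. Such a path is only approximately optimal, because it is forced to pass through the intermediate representatives.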

8. CONCLUSION

In this paper we considered the problem of real-time heuristic search in which planning time per move does not depend on the number of states. We reviewed kNN LRTA*, which introduces a mechanism for selecting subgoals automatically. The resulting algorithm was shown to be theoretically complete and, on large video-game maps, substantially outperformed the previous state-of-the-art algorithms D LRTA* and TBA* along several important performance measures. We also reviewed HCDPS, the first real-time heuristic search algorithm with neither heuristic learning nor maintenance of open and closed lists. Database pre-computation with HCDPS is two orders of magnitude faster than with kNN LRTA* and D LRTA*. Finally, its read-only database gives it a smaller per-agent memory footprint than A* or TBA* when two or more agents are used. Overall, we feel HCDPS is presently the best real-time search algorithm for video-game pathfinding on static maps.

9. REFERENCES

[1] V. Bulitko, Y. Björnsson, and R. Lawrence, Case-based subgoaling in real-time heuristic search for video game pathfinding, J. Artif. Intell. Res., vol. 39, pp. 269-300, 2010.

[2] W. Zhang, Complete anytime beam search, in Proc. 15th Nat. Conf. Artif. Intell., 1998, pp. 425-430.

[3] R. Lawrence and V. Bulitko, Database-driven real-time heuristic search in video-game pathfinding, IEEE Trans. Comput. Intell. AI in Games, vol. 5, no. 3, pp. 227-241, Sep. 2013.

[4] R. Korf, Real-time heuristic search, Artif. Intell., vol. 42, no. 2-3, pp. 189-211, 1990.

[5] S. Koenig and X. Sun, Comparing real-time and incremental heuristic search for real-time situated agents, Autonom. Agents Multi-Agent Syst., vol. 18, no. 3, pp. 313-341, 2009.

[6] V. Bulitko, M. Luštrek, J. Schaeffer, Y. Björnsson, and S. Sigmundarson, Dynamic control in real-time heuristic search, J. Artif. Intell. Res., vol. 32, pp. 419-452, 2008.

[7] V. Bulitko, Y. Björnsson, N. R. Sturtevant, and R. Lawrence, Real-time heuristic search for pathfinding in video games, July 7, 2010.

[8] R. Lawrence and V. Bulitko, Taking learning out of real-time heuristic search for video-game pathfinding, in AI 2010: Advances in Artificial Intelligence, Dec. 2010.

[9] N. Sturtevant and M. Buro, Partial pathfinding using map abstraction and refinement, in Proc. Nat. Conf. Artif. Intell., 2005, pp. 1392-1397.

[10] N. Sturtevant, Memory-efficient abstractions for pathfinding, in Proc. Artif. Intell. Interactive Digit. Entertain., 2007, pp. 31-36.

[11] R. Korf, Depth-first iterative deepening: An optimal admissible tree search, Artif. Intell., vol. 27, no. 3, pp. 97-109, 1985.

[12] M. Shimbo and T. Ishida, Controlling the learning process of real-time heuristic search, Artif. Intell., vol. 146, no. 1, pp. 1-41, 2003.

[13] Y. Björnsson, V. Bulitko, and N. Sturtevant, TBA*: Time-bounded A*, in Proc. Int. Joint Conf. Artif. Intell., 2009, pp. 431-436.

[14] I. Pohl, Heuristic search viewed as path finding in a graph, Artif. Intell., vol. 1, no. 3, pp. 193-204, 1970.

[15] C. Hernández and J. A. Baier, Fast subgoaling for pathfinding via real-time search, in Proc. Int. Conf. Artif. Intell. Planning Syst., F. Bacchus, C. Domshlak, S. Edelkamp, and M. Helmert, Eds., 2011, pp. 327-330.
