A SIMD APPROACH TO LARGE-SCALE REAL-TIME SYSTEM AIR TRAFFIC CONTROL USING ASSOCIATIVE PROCESSOR AND CONSEQUENCES FOR PARALLEL COMPUTING A dissertation submitted to Kent State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy by Man Yuan August 2012
Dissertation written by
Man Yuan
B.S., Hefei University of Technology, China 2001
M.S., University of Western Ontario, Canada 2003
Ph.D., Kent State University, 2012
Approved by
Dr. Johnnie W. Baker, Chair, Doctoral Dissertation Committee
Dr. Lothar Reichel, Members, Doctoral Dissertation Committee
Dr. Mikhail Nesterenko
Dr. Ye Zhao
Dr. Richmond Nettey
Accepted by
Dr. Javed I. Khan, Chair, Department of Computer Science
Dr. Raymond Craig, Dean, College of Arts and Sciences
This chapter describes the algorithms for each of the eight real-time tasks: report
correlation and tracking, cockpit display, controller display update, sporadic requests,
automatic voice advisory, terrain avoidance, conflict detection and resolution (CD&R),
and final approach (runway optimization). The solutions are implemented on the CSX600
architecture, but similar algorithms scale readily from its 96 PEs to an AP with 14,000
PEs. The best algorithm will always depend on the exact architecture of the AP.
For instance, using the swazzle network is the more efficient choice on the CSX600,
whereas on a true AP each radar report would ordinarily be broadcast from the host to
all PEs simultaneously, since a true AP can do this in constant time.
7.1 Report Correlation and Tracking
Report correlation and tracking is a task that correlates aircraft data about position
as reported by radar and the predicted data of established tracks for the aircraft in the
system [3–5]. A track is the best estimate of position and velocity of each aircraft under
observation. This task is present in many command and control real-time systems and is
executed every 0.5 seconds, so it is a major limitation on ATC performance. In terms of
total time consumed, this is easily the most expensive ATC task, because it is performed
much more frequently than the other tasks. It is challenging because each report has to
be compared with each track, and some aircraft are changing flight mode.
The main idea of this task is from [3–5, 87]. Since all these data are stored in a
shared relational database, the input data are two database relations: aircraft data from
radar reports (Relation R) and the predicted positional data of established tracks for the
aircraft (Relation T). The correlated radar reports are used to smooth the position and
velocity of the tracks to obtain the next estimate of position and velocity. Given an
unordered set of tracks, each report record must be evaluated against every track record
in the system to ensure that a match (correlation) is not missed. Multiple matches are
treated differently from unique matches, and any report records that do not match a track
record are entered as new tentative tracks.
We use the SIMD architecture to perform this task. First, we create a box around each
report and each track to accommodate the uncertainties in the report and track positions.
The two database relations R and T contain the x and y coordinates of points in 3-D
space. The objective of the task is to determine the join of the two relations, defined
by the intersection of their boxes, which is a many-to-many join. Each box from R is
evaluated for intersection with every box in T until all boxes from R have been compared
with all boxes in T.
As presented in [19, 21], the data structures of R and T are shown in Table 4 and
Table 5, in which the positions for the next period are predicted from the smoothed
current positions and velocities. Let T have columns X, Y, j, X1, Y1, and q, and let R
have columns X, Y, r, and k. Here (X, Y) is the position of the aircraft in a record of
T or R; j and r give the sizes of the boxes developed for aircraft in T and R,
respectively. (X1, Y1) in T is the reported position for the track record, taken from the
currently correlated radar report. Both q and k are flags set during the correlation
procedure.

Table 4: Data Structures for Radar Reports
Attribute     Type    Comments
report_id     int     Report identity
r             float   Report box size
X             float   X position
Y             float   Y position
Hr(tk)        int     Altitude
match_count   int     Number of correlated tracks
match_id      int     ID of the correlated track
k             int     Correlation flag

Table 5: Data Structures for Tracks
Attribute     Type    Comments
ID            int     Flight identity
q             int     Track state (seven values)
C             int     Error measures (3 values)
j             float   Track box size
report_id     int     ID of correlated report
X             float   Current X position
Y             float   Current Y position
Ht            int     Current altitude
Vx(tk)        float   Current X velocity
Vy(tk)        float   Current Y velocity
X1            float   X position of correlated report
Y1            float   Y position of correlated report

Figure 8: Track/Report correlation

As shown in Figure 8, a box is developed around each point (X, Y) in T with the four
corner points (X ± j, Y ± j). A similar box is developed around each point (X, Y) in R
with the four corner points (X ± r, Y ± r). Here r > 0 reflects the uncertainty in the
radar report, and j > 0 reflects the uncertainty of each track, compensating for turns or
larger errors in the radar report. Initially j = 0.5 nautical mile for all tracks and
r = 0.5 nautical mile for all radar reports. Each radar report box in R is compared with
each track box in T. If an intersection is found between a report box and a track box,
for example r2 and t2 in Figure 8, then the report position is entered into X1 and Y1 of
the correlated record in T, a correlation flag is set in column k, the match_count of
this report is incremented, its ID is entered into the correlated track's report_id, and
the ID of the track is entered into the radar report's match_id. We calculate the distance
between the report and the track and record it as the track's shortest distance. If two
tracks correlate with the same radar report, i.e., the radar report's match_count > 1,
then this radar report is discarded. If a second radar report correlates with the same
track, we calculate the distance between the track and this report and call it the
track's current distance; if it is less than the shortest distance, the shortest distance
is updated to this value and this report's position is recorded as the candidate report
position for the track. Next, another radar report is broadcast to all PEs and compared
with all of the track records using the same procedure. After all report boxes have been
compared with all track boxes, the error measure is set to C = 1 for the tracks that have
correlated reports. The details of the smoothing process are explained in
[4, 5, 19, 21, 87].
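The first correlation pass described above can be illustrated with a short sequential sketch. It is not the CSX600 implementation: the class names, the `shortest` field, and the use of -1 for "no match" are assumptions made for illustration, and the inner list comprehension stands in for the comparison that all PEs would perform at once.

```python
import math
from dataclasses import dataclass


@dataclass
class Report:
    report_id: int
    x: float
    y: float
    r: float = 0.5           # report box half-width in nm
    match_count: int = 0     # number of track boxes this report intersects
    match_id: int = -1       # ID of the correlated track (-1 = none, assumed)


@dataclass
class Track:
    track_id: int
    x: float
    y: float
    j: float = 0.5           # track box half-width in nm
    report_id: int = -1      # ID of the correlated report (-1 = none, assumed)
    x1: float = 0.0          # position of the correlated report
    y1: float = 0.0
    shortest: float = math.inf   # shortest report distance seen so far


def boxes_intersect(rep: Report, trk: Track) -> bool:
    # Two axis-aligned squares overlap when their centres are no further
    # apart than the sum of the half-widths on both axes.
    reach = rep.r + trk.j
    return abs(rep.x - trk.x) <= reach and abs(rep.y - trk.y) <= reach


def correlate(reports, tracks):
    for rep in reports:                      # one report is "broadcast" at a time
        hits = [t for t in tracks if boxes_intersect(rep, t)]
        rep.match_count = len(hits)
        if len(hits) != 1:
            continue                         # unmatched, or ambiguous and discarded
        trk = hits[0]
        rep.match_id = trk.track_id
        dist = math.hypot(rep.x - trk.x, rep.y - trk.y)
        if dist < trk.shortest:              # keep the closest report per track
            trk.shortest = dist
            trk.report_id = rep.report_id
            trk.x1, trk.y1 = rep.x, rep.y
```

A report that intersects two track boxes keeps `match_count == 2` and updates neither track, which mirrors the discard rule in the text.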
When all report boxes have been compared with all track boxes, a track that is not
produced by noise might still not correlate with any report, either because the flight it
corresponds to is accelerating, turning, or maneuvering, or because of greater noise in
the report. We enlarge the box of any track that has not correlated with a radar report
to increase its probability of intersecting a radar report box, as shown in Figure 9.
First we double the box sizes of tracks that do not correlate with any report, i.e.,
j = j × 2. Next, we apply the same algorithm to compare them with the uncorrelated
reports, whose match_count is 0. The error measure C is set to 2 for tracks that
correlate with reports in this round. The process is repeated for all unmatched radar
reports, which have the "not match" flag of column k set in R. When no intersections are
found, the process is repeated with j set to three times its original value.
If there still remain unmatched radar reports, new tentative tracks are started for
these reports in order to detect the arrival of any new flights. If the reports are due
to noise, the tentative tracks will usually be dropped within a few periods (several
seconds) based on further evaluation of other radar reports. The details of this task are
shown in Algorithm 4. The SIMD features of the AP allow this task to be completed both
accurately and within its deadline.

Figure 9: Search box size for flight maneuver
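The three search rounds can be sketched as follows. This is a simplified illustration under stated assumptions: reports and tracks are plain dictionaries mapping an ID to an (x, y) position, the function name is hypothetical, the multiple-match discard rule is omitted, and setting C = 3 in the third round is an assumption, since the text only names the values 1 and 2.

```python
import math


def correlate_in_rounds(reports, tracks, base=0.5):
    """Match tracks to reports with track boxes of 1x, 2x, then 3x the base size.

    reports, tracks: dict mapping id -> (x, y) in nautical miles.
    Returns (matched, error_measure, new_track_reports).
    """
    matched = {}          # track id -> report id
    error_measure = {}    # track id -> round in which it correlated (C)
    free = set(reports)   # reports not yet correlated with any track
    for c in (1, 2, 3):
        j = base * c              # track box half-width for this round
        reach = base + j          # report half-width r stays at the base size
        for tid, (tx, ty) in tracks.items():
            if tid in matched:
                continue
            hits = [rid for rid in sorted(free)
                    if abs(reports[rid][0] - tx) <= reach
                    and abs(reports[rid][1] - ty) <= reach]
            if hits:
                # Keep the closest intersecting report for this track.
                best = min(hits, key=lambda rid: math.hypot(
                    reports[rid][0] - tx, reports[rid][1] - ty))
                matched[tid] = best
                error_measure[tid] = c
                free.discard(best)
    # Reports still unmatched after three rounds start new tentative tracks.
    return matched, error_measure, sorted(free)
```

Scaling from the base size in every round, rather than compounding, follows the "triple the original box sizes" wording of Algorithm 4.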
Algorithm 4 Algorithm for Aircraft Tracking
1: Radar reports are transferred from host to mono memory, then distributed from mono to PEs.
2: for i = 1 → 96 in parallel do
3:   Boxes are created around each radar report and each track in each PE to accommodate report and track uncertainties.
4:   Check intersection of each report box with every track box in each PE.
5:   If there is an intersection, the radar report and the track are correlated. The match_count of this report is incremented, which indicates that it correlates with one track, and its ID and positions are entered into the correlated track's record.
6:   All radar reports in each PE are transferred to the next PE using the ring/swazzle network, and steps 3 to 6 are repeated.
7: end for
8: After the 96 iterations, all reports have been compared with all tracks. A track that is not produced by noise might not correlate with any report because the aircraft that it corresponds to is maneuvering.
9: Double the box sizes of tracks that have not correlated with any report to increase their probability of intersecting a report box, and repeat the steps of the for loop above to compare them with uncorrelated reports.
10: Triple the original box sizes of tracks that have not correlated yet, and run the algorithm again.
11: After 3 rounds, if there are still any uncorrelated reports, they are used to start new tracks.

If a track created for a potentially new aircraft does not correlate with any report
after several passes through Algorithm 4, it is viewed as being due to noise and dropped.
Note that if this algorithm were to be executed on an AP, it would need to be modified so
that each radar report is broadcast in constant time to all PEs and processed before the
next radar report is broadcast. Modifying the algorithms in this section to run on an AP
will be easy, but the details will depend on the exact architecture of the AP. In the AP
solution, each report record is tested against every track record in one set operation
that requires constant time. That is, the overall AP time is O(n), where n is the number
of reports in this period. On the other hand, the MIMD process is O(n^2) because it has
|R| × |T| processes, assuming O(1) processors can access and update the needed data from
the dynamic database and work on this problem simultaneously. In the worst case we
anticipate 12,000 reports against 14,000 tracks. For the AP this is 12,000 operations,
which is O(n), while for the MIMD it is R(T − 1)/2, or about 8.4 × 10^7, which is O(n^2).
Note that this number does not include many operations that are needed by MIMD but not by
AP: dynamic scheduling, load balancing, data distribution, processor assignment, mutual
exclusion of data access, etc.
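The worst-case figures quoted above can be reproduced with a few lines of arithmetic. The variable names are illustrative only; the full cross product |R| × |T| is shown alongside the quoted R(T − 1)/2 expression for comparison.

```python
reports, tracks = 12_000, 14_000

# AP: one constant-time set operation per broadcast report.
ap_steps = reports

# MIMD: pairwise comparisons, using the expression quoted in the text.
mimd_quoted = reports * (tracks - 1) // 2

# Full cross product |R| x |T|, for comparison.
mimd_cross = reports * tracks

print(ap_steps, mimd_quoted, mimd_cross)
```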
7.2 Cockpit Display
First the associative operation PickOne is used to select one aircraft. Next, the
broadcast operation is used to broadcast the x, y, and altitude coordinates of the
aircraft picked in the previous step. For each of its aircraft records, each processor
computes the x-distance, y-distance, and altitude distance between the location of its
aircraft and the location of the broadcast aircraft. The processor then identifies the
aircraft that are approaching this aircraft, using the conflict detection algorithm
covered in Section 7.6.1 to find all aircraft that will be within 2 × 2 nm in x and y and
within 1,000 feet in altitude in 2 minutes. Next, the identity, x and y positions,
altitude, velocity, heading, conflict information, etc., of these selected aircraft are
transferred to the display server. Finally, the conflict resolution algorithm in
Section 7.6.2 is used to obtain a conflict advisory, which is also transferred to the
server. This algorithm uses the SIMD architecture to parallelize the computation and
improve its efficiency.
7.3 Controller Display Update
This task is similar to cockpit display. We transfer the updated flight identity,
positions, altitude, speed, heading, etc., from the PEs to the ClearSpeed server, which
plays the role of the controller display in this simulation. The SIMD architecture is
used to speed up the display process.
7.4 Automatic Voice Advisory
Automatic Voice Advisory (AVA) automatically advises an uncontrolled flight (VFR), by
voice, of near-term conditions involving other aircraft and terrain. This task is
simulated by having the ClearSpeed server print advisories from the conflict detection
and resolution and terrain avoidance tasks, etc. For example, if another aircraft is
approaching the aircraft being called, the message might be "aircraft at 4 miles ahead,
4,500 feet above, in 1 minute"; if the aircraft being called is heading toward terrain,
the message might be "terrain, 4 miles, 3,100 feet ahead". We use an AP style of
computation to do this efficiently.
7.5 Sporadic Requests
Sporadic requests include information requests or changes in data. For example,
aircraft must avoid an area that has bad weather, aircraft maneuver to avoid bad
weather, or controllers make a request for runway usage, etc. This task is executed once
every second. Although the requests are not processed immediately, they are processed
very quickly. We simulate this task as follows. We first process the next unprocessed
sporadic request (assuming there is more than one). If the request is to divert aircraft
so that they will miss a storm area, all affected aircraft are processed using the
associative operation PickOne to select one aircraft at a time to redirect.
7.6 Conflict Detection and Resolution (CD&R)
7.6.1 Conflict Detection
This paper considers a conflict to occur when two aircraft are predicted to be within
a distance of three nautical miles in x and y and within 1,000 feet in altitude. To
assure timely evaluation we let the detection cycle be eight seconds, and we determine
the possibility of a future conflict between any pair of aircraft within a 5-minute
"look ahead" period (i.e., 300 seconds) [3–5].
All IFR flights (defined in Section 2.3.2) are evaluated for their future relative space
positions with respect to each other, and each of these IFR flights is also evaluated
against all VFR flights (defined in Section 2.3.2) for their future space positions. In
the worst case we anticipate 4,000 IFR records and 10,000 VFR records according to
Section 2.3.2. This means that every 8 seconds we must evaluate each of the 4,000 IFR
flights against the other 13,999 IFR and VFR flights in the worst-case environment. The
process is essentially a recursive join on the track relation in which the best estimate
of each flight's future position is projected as an envelope into future time. The
envelope has some uncertainty in its future position that is a function of the track
state. It is modified by adding 1.5 miles to each x, y edge of the future position (to
provide a 3 mile minimum miss distance) and 1,000 feet in altitude (to provide a 2,000
foot minimum miss height). For each future space envelope, its data is first broadcast
to all remaining PEs. Then the intersection of this envelope with every other space
envelope in the environment can be checked simultaneously in constant time. First, all
future flight envelopes are generated simultaneously, which takes O(1) steps. Then, the
first "trial envelope" of the 4,000 IFR future flight envelopes is compared for possible
conflict with all the other 13,999 flight envelopes for a look-ahead period of 5 minutes.
Thus the equivalent of at most 13,999 jobs is completed simultaneously in this jobset.
This occurs because each of the 14,000 records in the track table is simultaneously
available to each of the 14,000 PEs that are active in the AP. The next operation selects
the second trial envelope and repeats the conflict tests against the remaining 13,998
tracks. When a trial envelope has been tested, it is marked "done", and future trial
envelopes will exclude all prior trial tracks. When the last of the 4,000 trial envelopes
has been tested, conflict detection is finished.
We implement the algorithm on the CSX600 to emulate the AP solution. The input data
comes from the track records in the PEs. We copy each track's ID, 3-D position Xt, Yt,
and Ht, and velocity Vxt and Vyt at time t to the variables ID, Xc, Yc, Hc, Vxc, and Vyc,
respectively, in the poly structure trial in the PEs. Each trial's time_till, which is
the aircraft's earliest collision time with another aircraft, is initialized to 300.00.
The intuitive idea is to compare each trial aircraft with all the other ones, and we use
the CSX600 architecture to parallelize this computation. The details of the conflict
detection algorithm are shown in Algorithm 5 (see Figure 10), known as Batcher's
algorithm.
Algorithm 5 Algorithm for Conflict Detection
1: for i = 1 → 96 in parallel do
2:   if for each trial and track record in each PE, their flight IDs are different and altitudes are within 1000 feet then
3:     Project their positions into 5 minutes ahead, add 1.5 to each x and y coordinate to provide a 3.0 minimum miss distance in each dimension. The x dimension case is shown in Figure 10.
4:     Calculate the min_x, max_x, min_y and max_y for minimum and maximum intersection times in x and y dimensions, as shown in equations 1, 2, 3 and 4.
5:     Find the largest minimum time time_min and smallest maximum time time_max across the two dimensions using equations 5 and 6.
6:     If time_min is less than time_max, there is a potential conflict between the aircraft whose ID is trial.ID and another aircraft whose ID is track.ID.
7:     If time_min is less than trial.time_till, then trial.time_till is updated to time_min.
8:   end if
9:   All trial records in each PE are passed along the swazzle/ring network to the next PE and the above steps are repeated to calculate trial.time_till(trial, track).
10: end for
11: After 96 iterations, all trial records have been compared with all track records. The time_till of each trial is its soonest collision time with another track.
Figure 10: Conflict detection
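The ring rotation used in Algorithm 5 can be checked with a small simulation. This is a sequential sketch, not ClearSpeed code: the function name and the representation of records as (PE, index) tuples are assumptions. It confirms that after as many shifts as there are PEs, every trial record has met every track record other than itself.

```python
def ring_all_pairs(num_pes, records_per_pe):
    """Return the set of (trial, track) pairs compared during a full ring rotation."""
    # Track records stay in their home PE; each record is labelled (pe, index).
    tracks = [[(pe, k) for k in range(records_per_pe)] for pe in range(num_pes)]
    # Trial records start as copies of the local tracks and travel around the ring.
    trials = [list(local) for local in tracks]
    compared = set()
    for _ in range(num_pes):
        for pe in range(num_pes):          # conceptually, all PEs work at once
            for trial in trials[pe]:
                for track in tracks[pe]:
                    if trial != track:     # an aircraft is never tested against itself
                        compared.add((trial, track))
        # Shift: PE p receives the trial records held by PE p - 1.
        trials = [trials[(pe - 1) % num_pes] for pe in range(num_pes)]
    return compared


pairs = ring_all_pairs(num_pes=4, records_per_pe=3)
print(len(pairs))  # 12 records, each compared with the 11 others
```

With P PEs the loop body runs P times regardless of how many records each PE holds, which is why the per-PE record count, not the total, drives the running time on the CSX600.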
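The following formulas are used in Algorithm 5:

\begin{align}
\mathit{min\_x} &= \frac{|\mathit{trial}.X_c - \mathit{track}.X_t| - 3}{|\mathit{trial}.V_{xc} - \mathit{track}.V_{xt}|} \tag{1} \\
\mathit{max\_x} &= \frac{|\mathit{trial}.X_c - \mathit{track}.X_t| + 3}{|\mathit{trial}.V_{xc} - \mathit{track}.V_{xt}|} \tag{2} \\
\mathit{min\_y} &= \frac{|\mathit{trial}.Y_c - \mathit{track}.Y_t| - 3}{|\mathit{trial}.V_{yc} - \mathit{track}.V_{yt}|} \tag{3} \\
\mathit{max\_y} &= \frac{|\mathit{trial}.Y_c - \mathit{track}.Y_t| + 3}{|\mathit{trial}.V_{yc} - \mathit{track}.V_{yt}|} \tag{4} \\
\mathit{time\_min} &= \max\{\mathit{min\_x},\ \mathit{min\_y}\} \tag{5} \\
\mathit{time\_max} &= \min\{\mathit{max\_x},\ \mathit{max\_y}\} \tag{6}
\end{align}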
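Equations 1 to 6 translate directly into a small function. The sketch below assumes positions in nautical miles and velocities in nautical miles per second, so that times come out in seconds; the dictionary keys and function names are illustrative. The guard for zero relative velocity is an addition not stated in the source, included only to avoid division by zero.

```python
import math


def axis_window(pos_a, pos_b, vel_a, vel_b, sep=3.0):
    """Minimum and maximum intersection times along one axis (equations 1 to 4)."""
    gap = abs(pos_a - pos_b)
    closing = abs(vel_a - vel_b)
    if closing == 0.0:
        # Assumed guard: with no relative motion the separation never changes,
        # so the axes overlap either always or never.
        return (0.0, math.inf) if gap <= sep else (math.inf, math.inf)
    return (gap - sep) / closing, (gap + sep) / closing


def conflict_time(trial, track, look_ahead=300.0):
    """Return time_min if a conflict is predicted within look_ahead, else None."""
    min_x, max_x = axis_window(trial["x"], track["x"], trial["vx"], track["vx"])
    min_y, max_y = axis_window(trial["y"], track["y"], trial["vy"], track["vy"])
    time_min = max(min_x, min_y)   # equation 5
    time_max = min(max_x, max_y)   # equation 6
    if time_min < time_max and time_min < look_ahead:
        return time_min
    return None
```

For example, two aircraft 33 nm apart in x, closing at 0.25 nm/s and 1 nm apart in y, give min_x = 120 and max_x = 144, so `conflict_time` returns 120.0. In Algorithm 5 this value would replace trial.time_till whenever it is smaller than the current one.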
In the AP solution, each IFR record is tested against every other IFR record and every
VFR record in one set operation that requires constant time, as described above. So the
overall AP process requires at most 4,000 operations, which is O(n) steps, where n is the
number of IFR flights in this period. On the other hand, the MIMD process is O(n^2)
because it has IFR × (IFR − 1)/2 + IFR × VFR processes, assuming O(1) processors can
access and update the needed data from the dynamic database and work on this problem
simultaneously. For the MIMD this is 4,000 × 3,999/2 + 4,000 × 10,000 = 47,998,000, which
is O(n^2). The AP will complete the entire set of jobs in at most 4,000 steps due to the
simultaneous execution of all jobs in each jobset. Note that this number does not include
many operations that are needed by MIMD but not by AP: dynamic scheduling, load
balancing, data distribution, processor assignment, mutual exclusion of data access, etc.
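The job counts for the worst-case environment follow from the expression IFR × (IFR − 1)/2 + IFR × VFR. A brief check, with variable names chosen for illustration:

```python
ifr, vfr = 4_000, 10_000

ifr_pairs = ifr * (ifr - 1) // 2       # each unordered IFR pair tested once
ifr_vfr_pairs = ifr * vfr              # every IFR flight against every VFR flight
mimd_jobs = ifr_pairs + ifr_vfr_pairs

ap_steps = ifr                         # one jobset per IFR trial envelope

print(ifr_pairs, ifr_vfr_pairs, mimd_jobs, ap_steps)
```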
7.6.2 Conflict Resolution
We use the SIMD architecture of the CSX600 to perform conflict resolution accurately and
efficiently. Conflict resolution occurs as quickly as possible after conflict detection
has been executed, so that each aircraft has an updated time_till value. In practice,
conflicts should be reasonably rare. First, we find the aircraft with the shortest
conflict time. This is the trial aircraft that will make the initial heading change.
Next, the PEs evaluate in parallel different trial trajectories for the trial aircraft.
These trajectories are created by changing the direction of the current trajectory of the
trial aircraft both clockwise and counterclockwise in increments of 5 degrees, up to a
maximum change of 30 degrees. The various trajectories are rotated through all processors
using the ring (or swazzle) network, and the conflict detection algorithm is used to
determine the time until each trajectory first collides with another aircraft. Finally,
the maximum of the collision times of all trajectories is determined, and the
corresponding trajectory is used as the best heading change that the trial aircraft can
make. If more than one best heading change is found, the one with the minimal degree is
used. If two have the same degree of change (e.g., +10 and −10 degrees), one of them is
chosen arbitrarily. Although there is no theoretical guarantee that all conflicts can be
resolved, the method performs very well in our simulation, and in our tests all conflicts
are resolved after several rounds. In an actual implementation, any unresolved conflicts
could be resolved by changing the altitude of a plane that still has a conflict after
Algorithm 6 is executed.
The algorithm for conflict resolution is described in Algorithm 6. The input data are
the track and trial records in the PEs. Each PE can hold 1 to 17 track and trial records.
Algorithm 6 is based on the CSX600, not an AP. An AP could handle collision avoidance
in a much simpler manner, due to its ability to do a constant-time broadcast [3–6]. On an
AP, conflict resolution can be included as part of conflict detection. Proceeding one
aircraft at a time, each selected aircraft's trajectory information is broadcast (in
constant time) to all other aircraft, and any potential conflict is immediately corrected
as follows. The heading of the selected aircraft is altered by an increment of perhaps 10
degrees and immediately rechecked for a conflict. This procedure continues until a path
that is conflict-free with respect to all other aircraft is found, with the heading being
altered by a maximum of 30 degrees. This method is more efficient on an AP since the
broadcast of one aircraft's trajectory information can be done in constant time.
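The heading search used for conflict resolution can be sketched as below. The function names are hypothetical, and `collision_time_for` stands in for a run of the conflict detection algorithm on one candidate trajectory. Treating positive angles as counterclockwise rotations is an assumption; the source only says that changes alternate between the two directions.

```python
import math


def candidate_headings(step=5, limit=30):
    """Heading changes in degrees: 5, -5, 10, -10, ..., 30, -30."""
    changes = []
    for deg in range(step, limit + 1, step):
        changes.extend((deg, -deg))
    return changes


def rotate_velocity(vx, vy, degrees):
    """Rotate a velocity vector by the given heading change (positive = counterclockwise)."""
    a = math.radians(degrees)
    return (vx * math.cos(a) - vy * math.sin(a),
            vx * math.sin(a) + vy * math.cos(a))


def best_heading(collision_time_for):
    """Pick the change with the longest time to collision, preferring smaller changes.

    collision_time_for: callable mapping a heading change in degrees to the
    earliest collision time of the resulting trajectory.
    """
    best = None
    for deg in candidate_headings():
        t = collision_time_for(deg)
        if (best is None or t > best[1]
                or (t == best[1] and abs(deg) < abs(best[0]))):
            best = (deg, t)
    return best
```

When +d and −d tie, this sketch keeps whichever was evaluated first, which is one way to realize the arbitrary choice the text allows.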
Algorithm 6 Algorithm for Conflict Resolution
1: In parallel, find the minimum time_till of all trial records sequentially in each PE.
2: Compute the global minimum time_till by taking the minimum of the local time_till value in each PE.
3: Use the pick-one command to select a processor whose local time_till is equal to the global time_till, and transfer the trajectory record of an aircraft in this PE whose time_till is minimal to the mono memory.
4: Create the array projectedpath[0..11] (12 records) in mono memory, copy the best trial aircraft's ID and positions to each of its records, and initialize each record's collision_time, the soonest collision time with other aircraft, to ∞.
5: Each projectedpath[i] represents a path where the best aircraft makes a different heading change of 5 to 30 degrees in increments of 5 degrees, alternating between left and right (e.g., 5, −5, 10, −10 degrees, etc.). This allows numerous different possible paths for the trial aircraft to be evaluated in parallel.
6: Transfer projectedpath[0], ..., projectedpath[11] from mono memory to the PEs in parallel, with each PE receiving one projectedpath[i].
7: for i = 1 → 96 do
8:   Each projectedpath is compared to all aircraft records in each PE with a different ID (so that the trial aircraft will not be compared to itself) unless their altitudes differ by more than 1000 feet.
9:   Calculate minimum and maximum intersection times in x and y dimensions using the earlier conflict detection equations 1, 2, 3 and 4.
10:   Use equations 5 and 6 to get time_min and time_max. If time_min is less than time_max, there is a potential conflict between the aircraft whose ID is the trial ID and this path.
11:   Check whether time_min is less than the collision_time of this projectedpath; if so, this projectedpath.collision_time is updated to time_min. If no conflict is found, the time is not updated and nothing needs to be done.
12:   The projectedpath records in each PE are then passed to the next PE along the ring network to be compared with the records in the neighbor PE.
13: end for
14: After 96 iterations, all projectedpath records have been compared to all the other aircraft.
15: Find the maximum projectedpath.collision_time across all PEs. This path is the best scenario.
16: Change the best trial aircraft's x and y velocity according to the best scenario path, display the resolution advisory and change the flight plan in the host.

7.7 Terrain Avoidance
Terrains are represented by lines forming a box that encloses a terrain height, e.g., a
TV tower is a 1.0 by 1.0 nm box with a height of 3,100 feet. All terrains and tracks are
entered in each PE. Terrain avoidance is as challenging as the report correlation and
tracking task. The terrain avoidance algorithm in Algorithm 7 is similar to the conflict
detection algorithm. The challenge is its computational intensity, and we again use the
CSX600 architecture to speed up the computation.
Algorithm 7 Algorithm for Terrain Avoidance
1: for i = 1 → 96 do
2:   if for each terrain and track record in each PE, the track record's height is lower than the terrain record's then
3:     Project the track record's position to 2 minutes in the future, add 1.5 to each x and y edge of the future position to provide a 3.0 minimum miss distance. The terrain records are 1.0 by 1.0 nm boxes.
4:     Calculate the minimum and maximum intersection times in both x and y dimensions.
5:     Set time_min to be the larger of the two minimum intersection times in the x and y dimensions from step 4. Similarly, set time_max to be the smaller of the two maximum intersection times in the x and y dimensions from step 4.
6:     if time_min < time_max then
7:       There is a potential conflict between the track and the terrain.
8:     end if
9:   end if
10:   All track records in each PE are passed to the next PE and steps 2 to 8 are repeated.
11: end for
12: After 96 iterations, all track records have been compared with all terrain records for terrain avoidance.
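A single track-versus-terrain test might look like the sketch below. It is a variant rather than a transcription: it uses signed entry and exit times for a moving point against a fixed box (a slab-style interval test) instead of the absolute-value form of equations 1 to 4. The dictionary keys, the function names, and the reading of the 1.5 nm margin as an addition to the half-width of the terrain box are all assumptions.

```python
import math


def axis_interval(pos, vel, lo, hi):
    """Times t at which pos + vel * t lies within [lo, hi], as (enter, exit)."""
    if vel == 0.0:
        # No motion on this axis: inside forever or never (empty interval).
        return (0.0, math.inf) if lo <= pos <= hi else (math.inf, -math.inf)
    t1, t2 = (lo - pos) / vel, (hi - pos) / vel
    return min(t1, t2), max(t1, t2)


def terrain_alert(track, terrain, look_ahead=120.0, margin=1.5):
    """Return the predicted time of entering the padded terrain box, or None."""
    if track["alt"] >= terrain["height"]:
        return None                       # the track is above the obstacle
    half = terrain["size"] / 2 + margin   # padded half-width of the terrain box
    x_in, x_out = axis_interval(track["x"], track["vx"],
                                terrain["x"] - half, terrain["x"] + half)
    y_in, y_out = axis_interval(track["y"], track["vy"],
                                terrain["y"] - half, terrain["y"] + half)
    time_min = max(x_in, y_in, 0.0)       # larger of the two entry times
    time_max = min(x_out, y_out)          # smaller of the two exit times
    if time_min < time_max and time_min <= look_ahead:
        return time_min
    return None
```

As in conflict detection, a conflict requires the largest entry time to precede the smallest exit time.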
7.8 Final Approach (Runways)
The final approach task optimizes runway usage. Each flight has a flight plan that
specifies its departure terminal, planned departure time, destination terminal, and
planned arrival time. The runways in the region managed by an ATC system can be
distributed among the processors. Each processor manages the information for the runways
assigned to it. Here, we assume that there are 96 runways in the sector managed by this
ATC system and assign one runway to each processor. The PE assigned to a runway collects
the departure times of aircraft leaving from this runway and the arrival times of
aircraft landing on it, and sorts these times. The PEs then instruct the aircraft to
increase or decrease their speed to optimize runway usage and fuel cost. This last step
is currently done manually by controllers.
7.9 Comparison of AP and MIMD solutions for ATC Tasks
We assume that there is a different PE for each flight and that the data for each of
the n flights is stored in its PE. We apply static scheduling in our AP solution to the
ATC problem, which is fundamentally different from the heuristic scheduling normally
used in MIMD solutions [3–5]. Static scheduling is possible for three reasons: the
simultaneous execution of thousands of jobs in each jobset, the wider data access
bandwidth (a 200- to 500-fold increase), and the ability to predict the worst-case
execution time for the AP, all of which are explained in Section 2.4. However, like most
other real-time systems, the ATC problem cannot be efficiently scheduled using MIMD. One
of the main reasons is that an MIMD approach uses dynamic scheduling to schedule all the
tasks. The data keeps changing, so the MIMD system cannot determine, a priori, how
many records will actually participate in the computation of a task or how much actual
computation time a task in a cycle will require. In the AP, since tasks are executed as
jobsets, there is no need to differentiate between a jobset with one record and a jobset
with thousands of records, since both require the same amount of time. Thus, the
number of records in a jobset is simply a "don't care" parameter. All set operation times
are based on the worst-case assumption. This makes it possible to schedule ATC tasks
statically in the AP solution.
Table 6 shows the comparison of the complexities of AP and MIMD solutions for
ATC tasks [4, 5]. It is noted that we do not include the MIMD software for dynamic
version handling, race conditions, data dependencies, and deadlocks, etc. In particular,
this solution avoids the use of exact or approximate solutions to any of the numerous
multiprocessor NP-hard problems of the type given in [8]. The solution used is very
similar to a sequential solution for this problem, both in style and in code size. As a
result, the size of this software solution is only a very small fraction of the size of
the MIMD solutions to the ATC problem. This results in a dramatic drop in the cost of
both creating and maintaining this software when compared to the MIMD solutions that
have been given for the ATC problem.
Second, the proposed AP solution will support accurate and meaningful predictions of
worst-case execution times and will guarantee that all deadlines are met. In contrast,
MIMD systems optimize average-case running time and have highly unpredictable worst-case
running time. These contributions can provide major help in meeting the goals of the
FAA's NextGen Plan: fly more aircraft, more safely, more precisely, more efficiently, and
use less fuel [17].
Third, the AP hardware easily scales to handle larger problem sizes, either by
increasing the maximum number of records stored in each processor (which will slow down
the run time) or by building a larger AP computer with more processors, which is easy to
do. Based on the simplicity of the architecture of the STARAN and ASPRO, which are the
only APs that have been built, building a larger AP system should be both easy and much
cheaper than building a MIMD system of comparable size. The CM-2 Connection Machine was a
SIMD computer built by Thinking Machines in the late 1980s that had over 64K processors.
Paracel developed a parallel processor which was generally believed to be a SIMD that had
one million processors. It would seem reasonable to expect that an AP with a million
processors could easily be built today.
Fourth, as a result, Validation and Verification (V&V) is simpler than for current MIMD
software.
The ATC problem has requirements similar to those of most embedded real-time problems
with periodic tasks and hard deadlines, such as military command and control, autonomous
driving, video surveillance, image processing, simulations of complex physical systems
(e.g., weather forecasting, molecular modeling), and massive data processing. The AP's
ability to easily handle the ATC problem would also enable it to easily handle many other
real-time problems, both large and small. As a result, this research is relevant not only
to the ATC problem, but also to numerous other important applications that involve
real-time problems with hard deadlines. For example, command and control systems such as
an air defense system would be natural candidates. Other examples may include embedded
real-time systems as well.
The ATC problem is a large dynamic database problem. The AP excels at handling it
because the AP was designed to handle real-time dynamic database activities rapidly.
For example, locating records with a particular property, reading a value from such a
record, changing a value in such a record, and determining whether a record with a
certain property exists can all be done in constant time. By use of the flip network,
large portions of records can be moved into parallel memory or copied out of parallel
memory in constant time, so records can be entered and shipped elsewhere rapidly.
The AP’s capability of handling dynamic databases makes it very useful for numerous
other applications.
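The database operations listed above can be sketched as a toy model. This is only an illustration under stated assumptions: it holds one record per simulated PE, and each Python loop stands in for a step that the AP hardware performs on all PEs at once, so the code shows the semantics of the operations and not their constant-time cost. The class, method, and field names are hypothetical and are not part of any AP or ClearSpeed library.

```python
from dataclasses import dataclass


@dataclass
class Record:
    flight_id: str
    altitude_ft: int


class AssociativeMemory:
    """Toy associative memory with one record per simulated PE."""

    def __init__(self, records):
        self.cells = list(records)

    def search(self, predicate):
        # Every PE tests its own record; the result is one responder bit per PE.
        return [predicate(record) for record in self.cells]

    def any_responder(self, mask):
        # Does at least one record with the searched property exist?
        return any(mask)

    def read(self, mask, field):
        # Pick the first responder and read one field from it.
        for record, hit in zip(self.cells, mask):
            if hit:
                return getattr(record, field)
        return None

    def update(self, mask, field, value):
        # Every responder changes the same field in its own record.
        for record, hit in zip(self.cells, mask):
            if hit:
                setattr(record, field, value)


if __name__ == "__main__":
    memory = AssociativeMemory(
        [Record("A1", 9000), Record("B2", 31000), Record("C3", 8000)]
    )
    low = memory.search(lambda r: r.altitude_ft < 10000)
    print(memory.any_responder(low), memory.read(low, "flight_id"))
```

The responder mask is the key design element: a search produces it once, and reads, updates, and existence tests all reuse it without scanning the records again.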
A natural application area is controlling fleets of Unmanned Aircraft Systems (UASs).
In fact, the ATC system developed here with the ClearSpeed CSX600 could be expanded
to manage flight control for a small fleet of 480 to 960 UASs (depending on the number
of tasks performed, the size of the records stored, etc.) in applications such as: (1)
patrolling areas such as international borders and reporting unusual activities; (2)
identifying forest fires early and, during actual fires, maintaining updated information
about regions that are burnt, threatened, etc.; and (3) surveilling agricultural crops
and performing functions such as spraying when needed or turning water on at various
locations when plants need it. The ClearSpeed CSX700, which provides a larger SIMD
system with 192 processors and a roughly 20% faster clock, should more than double the
capabilities of the CSX600. As ClearSpeed SIMD accelerators are well known for their
very small power consumption, size, and weight, it should be possible to build an AP
with similar characteristics. For example, it should be possible to build an AP that
requires less power than a light bulb and is light enough to be easily deployed in the
field while still being able to control 1000 UASs in the immediate vicinity. Since UASs
will soon be permitted to enter airspace controlled by the FAA, many avionics experts
expect the total number of aircraft in the skies to increase rapidly as the anticipated
civilian use of UASs explodes. Since the demands on our ATC system are likely to
increase at a much more rapid pace than in the past, the FAA should quickly initiate an
investigation into the use of APs for ATC and into how best to integrate APs into the
FAA system.
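The figures quoted for the CSX600 and CSX700 can be checked with simple arithmetic. The sketch below assumes a naive model in which capability grows in proportion to the number of PEs times the clock rate; that model, the constant names, and the reading of the fleet sizes as records per PE are illustrative assumptions, not measurements from this work. The 96-PE count for the CSX600 is the one used throughout this dissertation.

```python
CSX600_PES = 96      # PEs on the CSX600 used in this work
CSX700_PES = 192     # PEs on the CSX700
CLOCK_RATIO = 1.2    # "roughly 20% faster" clock, taken at face value


def uas_per_pe(fleet_size: int, num_pes: int) -> float:
    """Average number of UAS records each PE would hold."""
    return fleet_size / num_pes


def naive_capability_ratio(new_pes: int, old_pes: int, clock_ratio: float) -> float:
    """Capability ratio under the assumed model: PE count times clock rate."""
    return (new_pes / old_pes) * clock_ratio


if __name__ == "__main__":
    print(uas_per_pe(480, CSX600_PES), uas_per_pe(960, CSX600_PES))
    print(naive_capability_ratio(CSX700_PES, CSX600_PES, CLOCK_RATIO))
```

Under these assumptions a fleet of 480 to 960 UASs corresponds to 5 to 10 records per PE on the CSX600, and the CSX700 ratio works out to about 2.4, which is consistent with the claim of more than doubling.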
Sections 9 and 10 above show that the AP approach is novel and contributes to many
other applications. We now discuss its limitation. The AP is ideal for solutions that
have a lot of data parallelism, especially at large scale, while MIMD fits distributed
applications. For example, as noted at the end of Section 8.2.6, it takes the same time
to process 10 runways as 96, so a problem with little data parallelism leaves most PEs
idle. A natural way to overcome this limitation is to combine AP and MIMD in the same
system. ClearSpeed boards have been used extensively as accelerators for MIMD systems.
In several cases, supercomputers improved their ranking in the TOP500 list by adding
multiple ClearSpeed accelerators to their system. MIMD processors could hand off
problems that the SIMD accelerator could compute more efficiently and perform other
work while waiting for the ClearSpeed to return the solution. It may be useful to
investigate whether using both AP and MIMD hardware in the same system as co-equal
partners could also prove beneficial. Either system could serve as the main system and
would have the option of handing off to the other system any problems that it could not
handle efficiently. Such a system would need to be able to convert from one mode to the
other efficiently or else transfer data from one system to the other efficiently. Such
a combination might be useful in building a system that could handle a wider range of
problems efficiently. It also might be useful in large systems (e.g., exascale) that
must handle a variety of very large applications.
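The hand-off pattern described above, in which the host submits a data-parallel problem to the accelerator and continues with other work until the result returns, can be sketched as follows. This is a minimal illustration only: a single worker thread stands in for the SIMD accelerator, no ClearSpeed API is used, and all function names and the sample workload are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def accelerator_job(altitudes_ft, floor_ft):
    """Stand-in for a data-parallel task: flag every aircraft below a floor."""
    return [altitude < floor_ft for altitude in altitudes_ft]


def host_other_work(messages):
    """Stand-in for unrelated work the host does while it waits."""
    return [message.upper() for message in messages]


def run_hybrid(altitudes_ft, floor_ft, messages):
    # One worker thread plays the role of the accelerator.
    with ThreadPoolExecutor(max_workers=1) as accelerator:
        pending = accelerator.submit(accelerator_job, altitudes_ft, floor_ft)
        handled = host_other_work(messages)   # host keeps working meanwhile
        flags = pending.result()              # block only when the answer is needed
    return flags, handled


if __name__ == "__main__":
    print(run_hybrid([9000, 12000], 10000, ["ack"]))
```

The point of the design is that the host blocks only at the moment it needs the accelerator's answer; the cost of moving data between the two systems, which the text identifies as a requirement, is not modeled here.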
A possibly important extension of our current research would be an implementation
of ATC on NVIDIA GPU hardware using CUDA; these GPUs have many SIMD PE groups on
each chip. The NVIDIA technology, including the latest Fermi chip, has a lot in common
with the MTAP approach of ClearSpeed. Implementing the CSX600 ATC algorithms on this
architecture may provide another useful platform for this project and would provide
useful information about its suitability for the types of real-time applications
mentioned earlier. Furthermore, using CUDA GPUs, we can investigate whether the
shortcomings or bottlenecks of the AP can be reduced. For example, GPUs have evolved
into highly parallel multicore systems that manipulate large blocks of data very
efficiently. This design is more effective than general-purpose CPUs for algorithms in
which large blocks of data are processed in parallel. We will investigate whether ATC
runs well using CUDA and whether the bottlenecks can be reduced. Another potential
project is to investigate implementing our prototype on other parallel systems, e.g.,
Cray systems such as their vector processors, IBM’s Cell processor, and Convex with
FPGA reconfigurable hardware.
BIBLIOGRAPHY
[1] S. Kahne and I. Frolow, “Air traffic management: Evolution with technology,” IEEE Control Systems Magazine, vol. 16, no. 4, pp. 12–21, November 1996.
[2] M. Nolan, Fundamentals of Air Traffic Control, 3rd ed. Wadsworth: Brooks/Cole, 1998.
[3] M. Jin, “Evaluating the power of the parallel masc model using simulations and real-time applications,” Ph.D. dissertation, Department of Computer Science, Kent State University, August 2004.
[4] W. Meilander, M. Jin, and J. Baker, “Tractable real-time air traffic control automation,” in Proc. of the 14th IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS), Cambridge, MA, November 2002, pp. 483–488.
[5] W. Meilander, J. Baker, and M. Jin, “Predictable real-time scheduling for air traffic control,” in Fifteenth International Conference on Systems Engineering, August 2002, pp. 533–539.
[6] ——, “Importance of simd computation reconsidered,” in Proc. of the 17th International Parallel and Distributed Processing Symposium (IEEE Workshop on Massively Parallel Processing), Nice, France, April 2003.
[7] W. Meilander, J. Potter, K. Liszka, and J. Baker, “Real-time scheduling in command and control,” in Proc. of the 1999 Midwest Workshop on Parallel Processing, August 1999.
[8] M. Garey and D. Johnson, Computers and Intractability: a Guide to the Theory of NP-completeness. New York: W.H. Freeman, 1979.
[9] J. A. Stankovic, M. Spuri, M. Di Natale, and G. Buttazzo, “Implications of classical scheduling results for real-time systems,” IEEE Computer, pp. 16–25, June 1995.
[10] M. Jin, J. Baker, and K. Batcher, “Timings for associative operations on the masc model,” in Proc. of the 15th International Parallel and Distributed Processing Symposium (IEEE Workshop on Massively Parallel Processing), San Francisco, CA, April 2001, pp. 193–200.
[11] K. Batcher, “Staran parallel processor system hardware,” in National Computer Conference and Exposition (AFIPS74), New York, NY, May 1974, pp. 405–410.
[12] W. Meilander, “Staran an associative approach to multiprocessing,” in Multiprocessor Systems, Infotech State of the Art Reports, Infotech International, 1976, pp. 347–372.
[13] J. A. Rudolph, “A production implementation of an associative array processor - staran,” in The Fall Joint Computer Conference (FJCC), Los Angeles, CA, December 1972.
[14] W. Meilander, “Aspro-vme hardware/architecture,” June 1992, eR3418-5 LORAL Defense Systems.
[15] J. Potter, J. Baker, S. Scott, A. Bansal, C. Leangsuksun, and C. Asthagiri, “Asc: An associative-computing paradigm,” Computer, vol. 27, no. 11, pp. 19–25, 1994.
[16] “Faa grants for aviation research program solicitation no. faa-06-01,” 2011. [Online]. Available: http://www.tc.faa.gov/logistics/grants/solicitation/97solict.doc
[19] M. Yuan, J. Baker, F. Drews, and W. Meilander, “Efficient implementation of air traffic control (atc) using the clearspeed csx620 system,” in Proc. of the 21st IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS), Cambridge, MA, November 2009, pp. 353–360.
[20] S. Guy, J. Chhugani, C. Kim, N. Satish, M. Lin, D. Manocha, and P. Dubey, “Clearpath: highly parallel collision avoidance for multi-agent simulation,” in ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA). ACM, August 2009, pp. 177–187.
[21] M. Yuan, J. Baker, F. Drews, L. Neiman, and W. Meilander, “An efficient associative processor solution to an air traffic control problem,” in Large Scale Parallel Processing (LSPP) IEEE Workshop at the International Parallel and Distributed Processing Symposium (IPDPS), Atlanta, GA, April 2010.
[22] M. Yuan, J. Baker, W. Meilander, and K. Schaffer, “Scalable and efficient associative processor solution to guarantee real-time requirements for air traffic control systems,” in Large Scale Parallel Processing (LSPP) IEEE Workshop at the International Parallel and Distributed Processing Symposium (IPDPS), Shanghai, China, May 2012, pp. 1682–1689.
[23] M. Yuan, J. Baker, and W. Meilander, “Comparisons of air traffic control implementations on an associative processor with a mimd and consequences for parallel computing,” Journal of Parallel and Distributed Computing (JPDC), 2012, accepted, to appear.
[25] S. Akl, Parallel Computing: Models and Methods. New York: Prentice Hall, 1997.
[26] M. Quinn, Parallel Programming in C with MPI and OpenMP. McGraw-Hill,2004.
[27] T. Cormen, C. Leiserson, and R. Rivest, Introduction to Algorithms, 1st ed. McGraw Hill and MIT Press, 1990, chapter 30 on Parallel Algorithms.
[28] M. Flynn, “Some computer organizations and their effectiveness,” IEEE Transactions on Computers, vol. C-21, no. 9, pp. 948–960, 1972.
[29] M. Quinn, Parallel Computing: Theory and Practice. McGraw-Hill, 1994.
[30] W. Chantamas, “A multiple associative computing model to support the execution of data parallel branches using the manager-worker paradigm,” Ph.D. dissertation, Department of Computer Science, Kent State University, December 2009.
[31] M. Garey, R. Graham, and D. Johnson, “Performance guarantees for scheduling algorithms,” Operations Research, vol. 26, no. 1, pp. 3–21, Jan-Feb 1978.
[32] J. Stankovic, “Misconceptions about real-time computing,” IEEE Computer, vol. 21, no. 10, pp. 17–25, Oct. 1988.
[33] J. Stankovic, S. Son, and J. Hansson, “Misconceptions about real-time databases,” Computer, June 1999.
[34] J. Stankovic, M. Spuri, K. Ramamritham, and G. Buttazzo, Deadline Scheduling for Real-time Systems. Kluwer Academic Publishers, 1998.
[35] J. W. Liu, Real-Time Systems. Prentice Hall, 2000.
[36] G. Buttazzo, Hard Real-Time Computing Systems: Predictable Scheduling Algorithms and Applications, 2nd ed. New York: Springer Science, 2005.
[37] C. Murthy and G. Manimaran, Resource Management in Real-time Systems and Networks. MIT Press, 2001.
[38] M. Klein, J. Lehoczky, and R. Rajkumar, “Rate-monotonic analysis for real-time industrial computing,” IEEE Computer, pp. 24–33, 1994.
[39] C. L. Liu and J. Layland, “Scheduling algorithms for multiprogramming in a hard-real-time environment,” Journal of the ACM, vol. 20, no. 1, 1973.
[40] J. Anderson, J. Calandrino, and U. Devi, “Real-time scheduling on multicore platforms,” in Proc. of the 12th IEEE Real-Time and Embedded Technology and Applications Symposium, San Jose, CA, April 2006, pp. 179–190.
[41] J. Potter, Associative Computing: A Programming Paradigm for Massively Parallel Computers. New York: Plenum Press, 1992.
[42] J. Trahan, M. Jin, W. Chantamas, and J. Baker, “Relating the power of the multiple associative computing model (masc) to that of reconfigurable bus-based models,” Journal of Parallel and Distributed Computing, Elsevier Publishers, vol. 70, pp. 458–466, May 2010.
[43] R. Chandra, R. Menon, L. Dagum, D. Kohr, D. Maydan, and J. McDonald, Parallel Programming in OpenMP, 1st ed. Morgan Kaufmann, 2000.
[44] B. Chapman, G. Jost, and R. van der Pas, Using OpenMP: Portable Shared Memory Parallel Programming. MIT Press, 2007.
[45] H. Casanova, A. Legrand, and Y. Robert, Parallel Algorithms. CRC Press, 2009.
[46] B. Wilkinson and M. Allen, Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers. Prentice Hall, 1999.
[47] K. Lakshmanan, S. Kato, and R. Rajkumar, “Scheduling parallel real-time tasks on multi-core processors,” in Proc. of the 31st IEEE Real-Time Systems Symposium (RTSS’10), San Diego, CA, December 2010, pp. 259–268.
[48] I. Hwang, “Air traffic surveillance and control using hybrid estimation and protocol-based conflict resolution,” Ph.D. dissertation, Stanford University, 2003.
[49] L. Yang and J. Kuchar, “Prototype conflict alerting logic for free flight,” AIAA Journal of Guidance, Control, and Dynamics, vol. 20, no. 4, pp. 768–773, July-August 1997.
[50] J. Kuchar and L. Yang, “A review of conflict detection and resolution modeling methods,” IEEE Transactions on Intelligent Transportation Systems, vol. 1, no. 4, pp. 179–189, 2000.
[51] Y. Bar-Shalom and T. Fortmann, Tracking and Data Association. Academic Press, 1988.
[52] H. Blom, R. Hogendoorn, and B. van Doorn, “Design of a multisensor tracking system for advanced air traffic control,” in Multitarget-Multisensor Tracking: Application and Advances, Y. Bar-Shalom, Ed., vol. 2. Artech House, 1990, pp. 31–63.
[53] I. Hwang, H. Balakrishnan, K. Roy, and C. Tomlin, “Multiple-target tracking and identity management in clutter for air traffic control,” in Proceedings of the AACC American Control Conference, Boston, MA, June 2004.
[54] I. Hwang, J. Hwang, and C. Tomlin, “Flight-mode-based aircraft conflict detection using a residual-mean interacting multiple model algorithm,” in Proceedings of the AIAA Guidance, Navigation, and Control Conference, Austin, Texas, August 2003.
[55] I. Hwang and C. Tomlin, “Protocol-based conflict resolution for finite information horizon,” in Proceedings of the AACC American Control Conference, Anchorage, Alaska, May 2002.
[56] K. Liu, “Composition of kalman and heuristic tracking algorithms for air traffic control,” Master’s thesis, Department of Computer Science, Kent State University, August 1999.
[57] E. Mazor, A. Averbuch, Y. Bar-Shalom, and J. Dayan, “Interacting multiple model methods in tracking: A survey,” in IEEE Transactions on Aerospace and Electronic Systems, vol. 34(1), 1998, pp. 103–123.
[58] D. Lainiotis, “Partitioning: A unifying framework for adaptive systems i: Estimation,” in Proceedings of the IEEE, vol. 64, August 1976, pp. 1126–1142.
[59] Y. Bar-Shalom and X. Li, Estimation and Tracking: Principles, Techniques and Software. Boston, Massachusetts: Artech House, 1993.
[60] D. Sworder and J. Boyd, Estimation Problems in Hybrid Systems. Cambridge University Press, 1999.
[61] H. Blom and Y. Bar-Shalom, “The interacting multiple model algorithm for systems with markovian switching coefficients,” IEEE Transactions on Automatic Control, vol. 33, no. 8, pp. 780–783, August 1988.
[62] X. Li and Y. Bar-Shalom, “Design of an interacting multiple model algorithm for air traffic control tracking,” IEEE Transactions on Control Systems Technology, vol. 1, no. 3, pp. 186–194, September 1993.
[63] P. Menon, G. Sweriduk, and B. Sridhar, “Optimal strategies for free flight air traffic conflict resolution,” Journal of Guidance, Control, and Dynamics, vol. 22, no. 2, pp. 202–211, 1999.
[64] J. Krozel, M. Peters, K. Bilimoria, C. Lee, and J. Mitchell, “System performance characteristics of centralized and decentralized air traffic separation strategies,” in Fourth USA/Europe Air Traffic Management Research and Development Seminar, 2001.
[65] Y.-J. Chiang, J. Klosowski, C. Lee, and J. Mitchell, “Geometric algorithms for conflict detection/resolution in air traffic management,” in 36th IEEE Conference on Decision and Control, San Diego, CA, December 1997, pp. 1835–1840.
[66] R. Paielli and H. Erzberger, “Conflict probability estimation for free flight,” Journal of Guidance, Control, and Dynamics, vol. 20, no. 3, pp. 588–596, 1997.
[67] M. Prandini, J. Hu, J. Lygeros, and S. Sastry, “A probabilistic approach to aircraft conflict detection,” IEEE Transactions on Intelligent Transportation Systems, vol. 1, no. 4, pp. 199–219, 2000.
[68] “Faa grants for aviation research program solicitation,” 2011. [Online]. Available: http://www.tc.faa.gov/logistics/grants/
[69] K. Park, N. Singhal, M. Lee, S. Cho, and C. Kim, “Design and performance evaluation of image processing algorithms on gpus,” IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 1, pp. 91–104, January 2011.
[70] S. Reddaway, W. Meilander, J. Baker, and J. Kidman, “Overview of air traffic control using an simd cots system,” in Proc. of the International Parallel and Distributed Processing Symposium (IPDPS’05), Denver, CO, April 2005.
[71] K. Batcher, “Staran/radcap hardware architecture,” in Sagamore Computer Conf. on Parallel Processing, 1973, pp. 147–152.
[72] ——, “The multi-dimensional access memory in staran,” in Sagamore Computer Conf. on Parallel Processing, 1975, pp. 167–168.
[73] ——, “The flip network in staran,” in International Conf. on Parallel Processing, 1976, pp. 65–71.
[74] ——, “Staran series e,” in International Conf. on Parallel Processing, 1977, pp. 140–143.
[75] ——, “The multi-dimensional access memory in staran,” IEEE Transactions on Computers, vol. C-26, no. 2, pp. 174–177, Feb 1977.
[76] H. Wang and R. Walker, “Implementing a scalable asc processor,” in Proc. of the 17th International Parallel and Distributed Processing Symposium (Workshop in Massively Parallel Processing), Nice, France, April 2003, pp. abstract on page 267, full text on CDROM.
[77] ——, “Implementing a multiple-instruction stream associative masc processor,” in Proc. of the 18th International Conference on Parallel and Distributed Computing and Systems (PDCS), Dallas, Texas, November 2006, pp. 460–465.
[78] R. Walker, J. Potter, Y. Wang, and M. Wu, “Implementing associative processing: Rethinking earlier architectural decisions,” in Proceedings of the 15th International Parallel and Distributed Processing Symposium (Workshop on Massively Parallel Processing), San Francisco, CA, April 2001, pp. abstract on p. 195, full text on accompanying CDROM.
[79] K. Schaffer and R. Walker, “A prototype multithreaded associative simd processor,” in Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS)-Workshop on Advances in Parallel and Distributed Computing Models (APDCM), Long Beach, CA, March 2007, p. 228.
[80] ——, “Using hardware multithreading to overcome broadcast/reduction latency in an associative simd processor (extended version),” Parallel Processing Letters, vol. 18, no. 4, pp. 491–509, Dec 2008.
[81] ——, “Using hardware multithreading to overcome broadcast/reduction latency in an associative simd processor,” in Proc. 22nd Int’l Parallel and Distributed Processing Symp. (IPDPS)-Workshop on Large-Scale Parallel Processing (LSPP), Miami, FL, April 2008, p. 289.
[85] J. Gustafson and B. Greer, “Clearspeed whitepaper: Accelerating the intel math kernel library,” Tech. Rep., 2007. [Online]. Available: http://www.clearspeed.com/docs/resources/ClearSpeedIntelWhitepaperFeb07.pdf
[86] K. Schaffer, “Asc library for clearspeed,” 2012. [Online]. Available: http://www.cs.kent.edu/~kschaffe/asc/
[87] E. Eddey and W. Meilander, “Application of an associative processor to aircraft tracking,” in Proceedings of the Sagamore Computer Conference on Parallel Processing. Springer-Verlag, Aug 1974, pp. 417–428.
[88] “The video of the performance of the staran at dulles is posted at,” 2012, this is a reproduction of part of the original 16mm film made in the 1980’s. If a complete professional restoration is feasible, it will also be posted here. [Online]. Available: http://www.cs.kent.edu/~jbaker/ATC/ and XXXX at Elsevier site.
[89] A. Marowka, “Back to thin-core massively parallel processors,” IEEE Computer Journal, vol. 44, no. 12, pp. 49–54, December 2011.
[90] W. Chantamas, J. Baker, and M. Scherger, “An extension of the asc language compiler to support multiple instruction streams in the masc model using the manager-worker paradigm,” in Proc. of the 2006 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2006), June 2006, pp. 521–527.
[91] W. Chantamas and J. Baker, “A multiple associative model to support branches in data parallel applications using the manager-worker paradigm,” in Proc. of the 19th International Parallel and Distributed Processing Symposium (WMPP Workshop), April 2005, pp. 266–273.
[92] J. Potter, ASC Software, 1992, includes a Primer, Windows Compiler, and Windows Emulator, can be downloaded at:. [Online]. Available: http://www.cs.kent.edu/~parallel/
[93] M. Jin and J. Baker, “Two graph algorithms on an associative computing model,” in International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA), Las Vegas, June 2007, p. 7 pages.
[94] M. Atwah and J. Baker, “An associative dynamic convex hull algorithm,” in Proc. of the Tenth IASTED International Conference on Parallel and Distributed Computing and Systems, Las Vegas, NV, October 1998, pp. 250–254.
[95] ——, “An associative implementation of a parallel convex hull algorithm,” in Proc. of the 15th International Parallel and Distributed Processing Symposium (IEEE Workshop on Massively Parallel Processing), San Francisco, CA, April 2001, pp. abstract on page 64, full text on CDROM.
[96] ——, “An associative static and dynamic convex hull algorithm,” in Proc. of the 16th International Parallel and Distributed Processing Symposium (IEEE Workshop on Massively Parallel Processing), Ft. Lauderdale, FL, April 2002, pp. abstract on page 249, full text on CDROM.
[97] M. Atwah, J. Baker, and S. Akl, “An associative implementation of graham’s convex hull algorithm,” in Proc. of the Seventh IASTED International Conference on Parallel and Distributed Computing and Systems, Washington D.C., October 1995, pp. 273–276.
[98] ——, “An associative implementation of classical convex hull algorithm,” in Proc. of the Eighth IASTED International Conference on Parallel and Distributed Computing and Systems, Chicago, IL, October 1996, pp. 435–438.
[99] M. Esenwein, “String matching algorithms for an associative computer,” Master’s thesis, Department of Computer Science, Kent State University, 1995.
[100] M. Esenwein and J. Baker, “Vlcd string matching for associative computing and multiple broadcast mesh,” in Proc. of the IASTED International Conference on Parallel and Distributed Computing and Systems, October 1997, pp. 69–74.
[101] S. Steinfadt and J. Baker, “Swamp: Smith-waterman using associative massive parallelism,” in IEEE Workshop on Parallel and Distributed Scientific and Engineering Computing, 2008 International Parallel and Distributed Processing Symposium (IPDPS), Miami, FL, April 2008.
[102] S. Steinfadt, M. Scherger, and J. Baker, “A local sequence alignment algorithm using an associative model of parallel computation,” in Proc. of IASTED Computational and Systems Biology (CASB 2006), Dallas, TX, Nov 2006, pp. 38–43.
[103] D. Ulm and J. Baker, “Solving a 2d knapsack problem on an associative computer augmented with a linear network,” in Proc. of the International Conference on Parallel and Distributed Processing Techniques and Applications, Sunnyvale, CA, Aug 1996, pp. 29–32.
[104] J. Lee, “Developing parallel simd algorithms for the traveling salesman problem,” Master’s thesis, Department of Computer Science, Kent State University, November 1989.
[105] P. Berra, “Some problems in associative processor applications to database management,” in Proceedings of the National Computer Conference and Exposition, May 1974, pp. 1–5.
[106] P. Berra and E. Oliver, “The role of associative array processors in database machine architecture,” IEEE Transactions on Computers, no. 4, pp. 53–61, 1979.
[107] C. Asthagiri and J. Potter, “Associative parallel lexing,” in Proceedings of the 6th International Parallel Processing Symposium (IPPS), March 1992, pp. 466–469.
[108] ——, “Parallel compilation on associative processors,” in Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques (PACT), North-Holland Publishing Co, The Netherlands, 1994, pp. 315–318.
[109] ——, “Parallel context-sensitive compilation,” Software-Practice and Experience, vol. 24, no. 9, pp. 801–822, Sept 1994.
[110] A. Bansal and J. Potter, “Exploiting data parallelism for efficient execution of logic programs with large knowledge bases,” in Proceedings of the 2nd International IEEE Conference on Tools for Artificial Intelligence, 1990, pp. 674–681.
[111] B. Reed, “An implementation of lisp on a simd parallel processor,” in First annual aerospace applications of AI, Dayton, OH, 1985, pp. 81–90.
[112] G. Steele and W. Hillis, “Connection machine lisp: Fine-grained parallel symbolic processing,” in Proceedings of the 1986 ACM conference on LISP and Functional Programming (LFP). New York, NY: ACM, 1986, pp. 279–297.
[113] T. Hasten, “An ops5 implementation on a simd computer,” Master’s thesis, Department of Computer Science, Kent State University, 1987.
[114] J. Potter, M. Rivett, and T. Hasten, “Rule-based systems on simd computers,” in Proceedings of ROBEXS, 1987, pp. 198–204.
[115] B. Reed, “The aspro parallel inference engine (p.i.e.): A real time production rule system,” Tech. Rep. 85-6048, 1985.