UNDERSTANDING MAP INTEGRATION USING GIS SOFTWARE
Submitted July 29, 2016
Michelle Pasco
Undergraduate Research Assistant
Civil and Environmental Engineering Department
Old Dominion University
135 Kaufman Hall, Norfolk, VA 23529
E-mail: [email protected]
Total Word Count = 3880 words + 7 Figures * 250 + 4 Tables * 250 = 6630 words
Overlap (OL) Milepost (MPST) LRS.” The segments are based on mileposts displayed in
Virginia. The LRS contains all the VDOT-maintained roads in Virginia. The features in this layer of
the LRS are road links, usually directional and short in length, that are broken at significant
points in the geometry of the road.
INRIX is a company that provides traffic data and services through connected devices,
including road speed parameters, incident information, and text alerts. The speed and travel time
information is based mainly on vehicle probes (1). This information is delivered based on
geographical links called XD segments, which define specific sections of roadways (6). The XD road
network is less detailed than the LRS: among the LRS’s secondary and urban roads, it includes only
those for which INRIX can collect probe data. Like the LRS links, the XD links are
directional and short in length and are broken at logical points. Generally, the XD links are
shorter than 1.5 miles, but many of them are longer than LRS links.
VDOT also provided a sample conflation between the LRS and XD road networks. This sample
conflation clearly contained gaps and overlaps. After further
research and cross-referencing with Google Earth and Google Maps, it was concluded that
possible reasons for these issues include disjointed segmentation in the road
geometry, road geometric disparity, mismatched attributes, and outdated information. To
further understand these issues, this project attempted to perform a conflation of the five
study area roadways using several methods in GIS:
Spatial Join (with variants of same and different GCS and EDGE- vs. non-EDGE matching),
Transfer Attributes,
Other methods (edge-matching and join by attributes).
All the analysis for this work was done using the ESRI ArcGIS software suite.
Spatial Join: A spatial join joins features based on the intersection of their geometries (7).
Dataset objects are digitized with vector components such as points, lines, and polygons. With
this tool, the vector components of each of the two datasets become either the “source” or the
“target” layer. After the spatial join conflation process, an output field called “Count_” is
created in the attribute table. This “Count_” is the number of intersections (or matches) that
each feature has.
Figure 2: The "Count_" field is shown in light blue. 0 indicates that the feature did not match with any other features.
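The way a per-feature match count arises from intersecting geometries can be sketched in plain Python. This is only an illustration of the idea, not the ArcGIS implementation: the helper names `segments_intersect` and `spatial_join_count` and the toy coordinates are hypothetical, and real road links are polylines rather than single straight segments.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Signed area test: > 0 if a-b-c turns left, < 0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _in_box(a: Point, b: Point, c: Point) -> bool:
    """True if c lies inside the bounding box of segment a-b."""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def segments_intersect(s: Segment, t: Segment) -> bool:
    """True if two straight segments cross or touch."""
    p1, p2 = s
    q1, q2 = t
    d1, d2 = _orient(q1, q2, p1), _orient(q1, q2, p2)
    d3, d4 = _orient(p1, p2, q1), _orient(p1, p2, q2)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True  # proper crossing
    # Collinear or touching cases
    return ((d1 == 0 and _in_box(q1, q2, p1)) or (d2 == 0 and _in_box(q1, q2, p2))
            or (d3 == 0 and _in_box(p1, p2, q1)) or (d4 == 0 and _in_box(p1, p2, q2)))


def spatial_join_count(features: List[Segment], others: List[Segment]) -> List[int]:
    """For each feature, count how many features of the other layer intersect it."""
    return [sum(segments_intersect(f, o) for o in others) for f in features]


# Toy layers (hypothetical coordinates)
layer_a = [((0, 0), (2, 0)), ((3, 0), (5, 0)), ((10, 10), (12, 10))]
layer_b = [((1, -1), (1, 1)), ((1.5, -1), (1.5, 1)), ((4, -1), (4, 1))]

counts = spatial_join_count(layer_a, layer_b)
print(counts)  # a count of 0 marks a feature with no match
```

In this toy example the third feature of `layer_a` intersects nothing, so it receives a count of 0, which corresponds to an unmatched feature in the attribute table.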
In this project, the LRS was set as the source layer and the XD was set as the target layer. In an
attempt to increase the accuracy, the influence of two factors was studied. The first factor was
the geographic coordinate system (GCS), with the options of keeping the original GCS or
converting to a common GCS. In the first case, the spatial join was performed with the LRS and
XD in their respective original GCSs: “GCS_North_American_1983” for the LRS and
“GCS_WGS_1984” for the XD. In the second case, the same GCS was used for
both datasets. XD’s GCS was projected to match the LRS’s GCS using the Project tool under the Data
Management Toolbox in GIS.
The second factor was the layer within the LRS: EDGE and Non-EDGE. “EDGE” means that the
roads are broken up into multiple segments to increase the accuracy of the road geometry.
“Non-EDGE” means that the road is made of two segments, each bearing one direction
of the road. For example, I-64’s two segments are labeled as I-64 west and I-64 east.
Taking into account these two factors, each with two options, four sets of spatial join
conflations were generated in total.
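The combination of the two factors can be enumerated with a short Python sketch. The label format (for example `64_EDGE_ORG`) follows the naming used in the results table; the variable names are hypothetical and the snippet only builds the list of runs, it does not perform any spatial join.

```python
from itertools import product

interstates = ["64", "564", "95", "395", "495"]
lrs_layers = ["EDGE", "NON"]   # EDGE vs. Non-EDGE layer of the LRS
xd_versions = ["ORG", "GCS"]   # XD in its original GCS vs. projected to the LRS's GCS

# Two factors with two options each give four cases
cases = [f"{layer}_{xd}" for layer, xd in product(lrs_layers, xd_versions)]

# Each case is run once per interstate
runs = [f"{road}_{case}" for case in cases for road in interstates]

print(cases)
print(len(runs))
```

Running each of the four cases on each of the five interstates gives the 20 spatial joins reported in the results.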
Figure 3: GCS vs Original (ORG) spatial join case on part of I-64 where the top segments are the source and target layers and the bottom segments are the conflations
To increase the GIS processing speed, the spatial joins were run separately for each of
the five interstates. After the completion of each spatial join, the attribute table was analyzed
to see how many features had matches in the “Count_” field. An equation was created to
measure the “Conflation Accuracy” (ca) and to study the relationship between the count and the
spatial join:
ca (%) = (number of “no <Null>” features / total number of features) × 100
where “no <Null>” means that the features that did not match were removed to only include
the features that did match. For the spatial joins, these are the features with a “Count_”
greater than 0.
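The accuracy measure can be sketched in a few lines of Python. The sketch assumes, based on the columns of Table 1, that the conflation accuracy is the percentage of features that have at least one match; the function name `conflation_accuracy` is hypothetical, and the counts are the EDGE_ORG values reported in Table 1.

```python
def conflation_accuracy(matched: int, total: int) -> float:
    """Percentage of features that found at least one match, rounded to 2 decimals."""
    if total <= 0:
        raise ValueError("total number of features must be positive")
    return round(100.0 * matched / total, 2)


# (matched features, total features) per interstate, EDGE_ORG case from Table 1
edge_org = {
    "64": (632, 728),
    "564": (10, 12),
    "95": (452, 573),
    "395": (89, 136),
    "495": (102, 167),
}

per_road = {road: conflation_accuracy(m, t) for road, (m, t) in edge_org.items()}
average = round(sum(per_road.values()) / len(per_road), 2)

print(per_road)
print(average)
```

Under this assumption the computed values agree with the per-road accuracies and the average listed for the EDGE_ORG case.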
Figure 5: Comparison between 0.1 mi and 0.3 mi conflations for transfer attributes on part of I-64 where the top segments are the source and target layers and the bottom segments are the conflations.
To further understand how the search distance affects the conflation result, the buffer tool was
used to visually observe the relationship between the search distance and the roads. The tool
creates a buffer around a feature whose width is determined by the search distance (7). First, the
source layer was LRS and the target layer was XD. Additionally, the analysis was completed in
the reverse order. There are two types of buffer: round and flat. A round buffer surrounds the
segment with a radial buffer, while a flat buffer terminates at the segment’s endpoints.
Figure 6: The top segment represents each of the round buffers and the bottom segment represents the flat buffer.
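The difference between the two buffer types can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: a straight, non-degenerate segment, planar coordinates in arbitrary units, and hypothetical helper names (`in_round_buffer`, `in_flat_buffer`). It is not the ArcGIS Buffer tool.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def _projection(p: Point, a: Point, b: Point) -> float:
    """Position of p's perpendicular projection along a-b (0 at a, 1 at b)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    return ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)


def _point_at(a: Point, b: Point, t: float) -> Point:
    """Point on the line through a and b at parameter t."""
    return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))


def in_round_buffer(p: Point, a: Point, b: Point, dist: float) -> bool:
    """Round ends: p is within dist of the segment, including past its endpoints."""
    t = min(1.0, max(0.0, _projection(p, a, b)))
    return math.dist(p, _point_at(a, b, t)) <= dist


def in_flat_buffer(p: Point, a: Point, b: Point, dist: float) -> bool:
    """Flat ends: the buffer stops at the endpoints, so p must project onto the segment."""
    t = _projection(p, a, b)
    if not 0.0 <= t <= 1.0:
        return False
    return math.dist(p, _point_at(a, b, t)) <= dist


a, b = (0.0, 0.0), (1.0, 0.0)
beside = (0.5, 0.05)      # alongside the segment
beyond_end = (1.05, 0.0)  # just past the endpoint b

print(in_round_buffer(beside, a, b, 0.1), in_flat_buffer(beside, a, b, 0.1))
print(in_round_buffer(beyond_end, a, b, 0.1), in_flat_buffer(beyond_end, a, b, 0.1))
```

A point just past an endpoint falls inside the round buffer but outside the flat one, which is why a round buffer can pick up neighbouring links that a flat buffer misses.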
Other methods: The two other methods that were tested are called “edge-matching” and “join
by attributes.” Edge-match is a tool that physically transforms features during the conflation
process (7). Prior to using the edge-match tool, a separate tool called “generate edge-match
links” is required to assess each of the segment features. These tools need a source layer and a
target layer, and require that both layers have an identical GCS. After running the
generate edge-match links tool, an output table is created that includes the start and end points
of each segment and an “edge-match confidence” or “em_conf.” Edge-match confidence scores each
of the features on a scale of 0 to 100, with 100 being the maximum confidence level. The
lower the confidence level, the less likely the segment will conflate. The project does not
include the edge-match tool because a great deal of manual intervention is needed to
transform the segments with low confidence levels to increase the accuracy. This is not suitable
for large-scale projects.
Join by attributes is similar to transfer attributes, but it requires users to choose two fields,
one in each of the datasets, in order to conflate (7). To conflate, the datasets are not required
to have the same GCS, but the tool still needs a source and a target layer. The process joins the
source layer’s attribute table to the target layer’s attribute table, attempting to match
corresponding features. If a target feature has no corresponding source feature, the joined fields
in the target dataset’s attribute table are recorded as <Null>. The project did not use join by
attributes because it is difficult to find two fields that will match the features within the two
datasets, and doing so involves several rounds of trial and error. The common result is that all
of the joined fields in the target attribute table are <Null>.
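The behaviour of an attribute join, and the reason it so easily yields only <Null> values, can be sketched in Python. This is a generic left join on exactly equal key values, not the ArcGIS tool; the function name `join_by_attribute`, the field names, and the sample rows are all hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def join_by_attribute(target_rows: List[Row], source_rows: List[Row],
                      target_key: str, source_key: str) -> List[Row]:
    """Left-join source rows onto target rows where the two key fields are exactly equal."""
    source_fields = sorted({name for row in source_rows for name in row})
    lookup = {row[source_key]: row for row in source_rows}
    joined = []
    for row in target_rows:
        match = lookup.get(row[target_key])
        merged = dict(row)
        for name in source_fields:
            # None plays the role of <Null> in the attribute table
            merged["src_" + name] = match.get(name) if match else None
        joined.append(merged)
    return joined


# Hypothetical attribute tables: XD as target, LRS as source
xd_rows = [{"xd_id": 1, "road": "I-64 E"}, {"xd_id": 2, "road": "I-564 W"}]
lrs_rows = [{"route": "I-64 E", "milepost": 12.4},
            {"route": "I-564 WEST", "milepost": 0.0}]

result = join_by_attribute(xd_rows, lrs_rows, "road", "route")
for row in result:
    print(row)
```

The second target row stays empty because "I-564 W" and "I-564 WEST" are not identical strings. With two independently maintained road networks, such naming differences can affect every row, which is consistent with the all-<Null> outcome described above.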
V. RESULTS
Spatial Join: The table below displays the result of 20 spatial joins for the four cases.
Table 1: Spatial join results.
Road Name & Spatial Join Type   # of features   Count>0 features   Conflation Accuracy, ca (%)   Average ca (%)
64_EDGE_ORG                     728             632                86.81                         75.11
564_EDGE_ORG                    12              10                 83.33
95_EDGE_ORG                     573             452                78.88
395_EDGE_ORG                    136             89                 65.44
495_EDGE_ORG                    167             102                61.08
64_EDGE_GCS                     728             359                49.31                         44.11
564_EDGE_GCS                    12              10                 83.33
95_EDGE_GCS                     573             287                50.09
395_EDGE_GCS                    136             27                 19.85
495_EDGE_GCS                    167             30                 17.96
64_NON_ORG                      2               2                  100.00                        100.00
564_NON_ORG                     2               2                  100.00
95_NON_ORG                      2               2                  100.00
395_NON_ORG                     2               2                  100.00
495_NON_ORG                     2               2                  100.00
64_NON_GCS                      2               2                  100.00                        100.00
564_NON_GCS                     2               2                  100.00
95_NON_GCS                      2               2                  100.00
395_NON_GCS                     2               2                  100.00
495_NON_GCS                     2               2                  100.00
Where:
The number represents the interstate
“EDGE” represents the EDGE layer in LRS
“ORG” represents the original XD
“GCS” represents the Geographic Coordinate System or the projected XD layer
“NON” represents the Non-EDGE layer in LRS.
In the GCS versus ORG cases, the recorded data show that ORG is on average about 31 percentage
points more accurate in the EDGE layer (75.11% versus 44.11%), and there is no difference in the
Non-EDGE layer. In the EDGE versus Non-EDGE cases, Non-EDGE is 100% accurate because there are
only two segments for each interstate. Although the Non-EDGE conflation seems to be complete, the
XD matched with the entire road instead of a segment on the LRS. This will not be sufficient for
many applications that would use the conflated datasets. Since both Non-EDGE comparisons gave
identical results, the count for each feature was investigated further.
Table 2: NON-EDGE segments count results.
Road Name & Spatial Join Type   First Segment   Match Count   Second Segment   Match Count
64_NON_ORG                      I-64E           628           I-64W            620
564_NON_ORG                     I-564E          9             I-564W           15
95_NON_ORG                      I-95S           443           I-95N            438
395_NON_ORG                     I-395S          84            I-395N           96
495_NON_ORG                     I-495S          77            I-495N           79
64_NON_GCS                      I-64E           190           I-64W            208
564_NON_GCS                     I-564E          4             I-564W           2
95_NON_GCS                      I-95S           127           I-95N            146
395_NON_GCS                     I-395S          12            I-395N           23
495_NON_GCS                     I-495S          9             I-495N           7
In the table above, N, S, E, and W are North, South, East, and West, respectively. According to
the data, the ORG layer receives more matches than the GCS layer. The greater the count
is, the more trouble GIS will have matching the correct segments. Although this may
suggest that ORG is less accurate than GCS, it may actually be the opposite: GCS is
less accurate because projecting a shapefile to a different GCS introduces geometric
discrepancies. Therefore, it was concluded that the main reason the NON_GCS case
has a lower count is that GIS is not reading the segments correctly.
Transfer Attributes: The table below displays the result of 20 transfer attribute conflations.
Table 3: Transfer attributes results.
Road Name & Search Distance   # of features   No <Null> features   Conflation Accuracy, ca (%)
I-64_0.1 mi                   754             686                  90.98
I-564_0.1 mi                  11              8                    72.73
I-95_0.1 mi                   477             435                  91.19
I-395_0.1 mi                  64              60                   93.75
I-495_0.1 mi                  52              50                   96.15
I-64_0.3 mi                   754             689                  91.38
I-564_0.3 mi                  11              8                    72.73
I-95_0.3 mi                   477             435                  91.19
I-395_0.3 mi                  64              61                   95.31
I-495_0.3 mi                  52              50                   96.15
I-64_0.5 mi                   754             689                  91.38
I-564_0.5 mi                  11              8                    72.73
I-95_0.5 mi                   477             434                  90.99
I-395_0.5 mi                  64              61                   95.31
I-495_0.5 mi                  52              50                   96.15
I-64_1 mi                     754             689                  91.38
I-564_1 mi                    11              8                    72.73
I-95_1 mi                     477             434                  90.99
I-395_1 mi                    64              61                   95.31
I-495_1 mi                    52              50                   96.15
Where:
“ca” represents the “Conflation Accuracy”
“no <Null>” is the removal of <Null> values
“mi” represents miles
The red text indicates change in the number of “no <Null>” features between one
search distance and the next.
The shorter the interstate, the less the result changes with the search distance. This is the
case for I-564 and I-495, which do not change at all, and for I-395, which changes by a single
feature. Observing the red text, there are subtle changes in the number of “no
<Null>” features. After experimenting, 0.1 mi was set as the minimum search distance and 1 mi
as the maximum search distance because anything lower than the minimum or higher than the
maximum produced inaccurate results.
Particularly on I-64, three segments matched only at 0.3 mi or higher. To understand why,
the buffer tool was utilized to mimic how GIS processes the conflation.
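The changes between consecutive search distances can also be recomputed directly from the counts in Table 3. The short Python sketch below does this; the dictionary layout and the function name `changes` are hypothetical, while the numbers are the “no <Null>” feature counts from the table.

```python
from typing import Dict, List, Tuple

# "No <Null>" feature counts from Table 3, keyed by search distance in miles
matched: Dict[str, Dict[float, int]] = {
    "I-64":  {0.1: 686, 0.3: 689, 0.5: 689, 1.0: 689},
    "I-564": {0.1: 8,   0.3: 8,   0.5: 8,   1.0: 8},
    "I-95":  {0.1: 435, 0.3: 435, 0.5: 434, 1.0: 434},
    "I-395": {0.1: 60,  0.3: 61,  0.5: 61,  1.0: 61},
    "I-495": {0.1: 50,  0.3: 50,  0.5: 50,  1.0: 50},
}


def changes(counts: Dict[float, int]) -> List[Tuple[float, int]]:
    """List (search distance, change in matched features) wherever the count changes."""
    distances = sorted(counts)
    return [(d, counts[d] - counts[prev])
            for prev, d in zip(distances, distances[1:])
            if counts[d] != counts[prev]]


changed = {road: changes(c) for road, c in matched.items() if changes(c)}
print(changed)
```

The result singles out the three additional I-64 matches at 0.3 mi, one additional I-395 match at 0.3 mi, and one fewer I-95 match at 0.5 mi. I-564 and I-495 do not change at any search distance.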
The LRS segments were numbered 1 through 4 from left to right, and the number of lines
that intersected with each buffer was recorded. The intersections will be referred to as matches.
The first set of results was obtained using the round buffer, with LRS as the source layer
and XD as the target layer, at three search distances: 0.1 mi, 0.3 mi, and 0.5 mi. A single
flat buffer was applied with a search distance of 0.1 mi because it was apparent that the
matches would all be identical. The second case uses the same source and target layers, but
records which XD segments intersected the buffers. The XD segments are labeled by their
feature number. The results are as follows:
Table 4: Buffer tool results.
LRS Segment / Search Distance   Round Buffer 0.1 mi   Round Buffer 0.3 mi   Round Buffer 0.5 mi   Flat Buffer 0.1 mi
LRS Segment 1                   1                     1                     1                     1
LRS Segment 2                   3                     3                     3                     2
LRS Segment 3                   2                     2                     2                     2
LRS Segment 4                   2                     2                     2                     1

XD Segment                      Round Buffer 0.1 mi   Round Buffer 0.3 mi   Round Buffer 0.5 mi   Flat Buffer 0.1 mi
XD Segment 1 (4100330)          2                     2                     2                     2
XD Segment 2 (4100331)          3                     3                     3                     2
XD Segment 3 (4100515)          3                     3                     3                     2
In the table above, each number is the number of matches for the given buffer type and search
distance. As the results in Figure 7 show, the 0.1 mi round buffer intersects some of the XD
segments by an insignificant amount. When the buffer search distance is increased to 0.3 mi or
0.5 mi, GIS has enough data to read the features and match them correctly.
VI. CONCLUSIONS & OUTLOOK
Understanding the different methods of conflation is crucial in measuring the accuracy of the
process. There are a few methods that are suitable for all types of research. Spatial joining is
better to use if the datasets comprise many, potentially small, features or if the
datasets are comparatively similar in geometry and data structure. Transfer attributes is overall
more accurate because it matches features not only by their geometry, but also by how similar
their attributes are. This method covers the two most important aspects of the conflation
process.
Analyzing the five roadways from the study areas, it can be noted that the smaller interstates
(I-395, I-495, and I-564) seem to be more accurate than the two main interstates (I-64 and I-95).
Observing how the conflation process affected areas along the roads, there seem to be
more gaps within urban areas than within rural areas. The most likely cause is that
urban roads are in close proximity to each other, so they are more likely to be
mismatched by GIS. The few gaps and overlaps in rural areas are due to
geometric discrepancies or mislabeling of road names.
In the future, these methods can be used on a wider scale. For example, instead of
the five interstates, the entire Virginia road network can be studied. The side effects of working
on a larger scale will likely be that the conflation process will take much longer and that the
resulting road networks will be more susceptible to problems such as gaps and overlaps. This is
because the closer the geometric features are, the higher the chance of mismatching is. To
combat these issues, an understanding of the problems associated with the gaps and
overlaps is required. Manual intervention, such as manually transforming segments, may be
needed for some spots that are not correctly digitized. Once these issues are pinpointed and
resolved, a quasi-automatic or automatic process can be coded and used in GIS. This will create
a smoother conflation process and make it easier to research the spatial data and attributes of
multiple datasets.
VII. ACKNOWLEDGEMENTS
A special thanks to Simona Babiceanu, who advised me and kept me on the right path during
my research. I want to also thank Dr. Emily Parkany for the constant support and
encouragement that she gave to me and the other interns. Lastly, I want to give a warm thank
you to Daniela Gonzales for motivating me to apply for the Mid-Atlantic Transportation
Sustainability Center – University Transportation Center (MATS UTC) program.
VIII. REFERENCES
1. G. v. Gösseln, M. Sester. Integration of Geoscientific Data Sets and the German Digital Map
Using A Matching Approach. Commission IV, WG IV/7.
http://www.cartesia.org/geodoc/isprs2004/comm4/papers/534.pdf. Accessed June 15, 2016.
2. Davis, Curt H., Haithcoat, Timothy L., Keller, James M., Song, Wenbo. Relaxation-Based Point
Feature Matching for Vector Map Conflation, 2011. Transactions in GIS, 15(1), pg. 43-60.
http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9671.2010.01243.x/full. Accessed June 20,
2016.
3. Virginia Department of Transportation. Roadway Network System. Release Notes, Linear