Importing Data from Excel Spreadsheets
Dos, don'ts, and updated procedures for ArcGIS 10
By Mike Price, Entrada/San Juan, Inc.
This exercise models data from a well-known gold and base metals
mining area in northern Nevada located near the town of Battle
Mountain. Although the sample data is synthetic, it is true to the
underlying geology of Battle Mountain, Nevada.
Many organizations keep valuable data in Microsoft Excel and
comma-separated values (CSV) files. Learn a methodology for
importing data kept in Excel and CSV files into ArcGIS that has
been updated for ArcGIS 10 and Microsoft Office 2007/2010.
Excel spreadsheets have been used since the release of ArcGIS 8
to prepare and import tabular data into a GIS. Previous ArcUser
articles described the benefits and limitations of spreadsheets in
the version of ArcGIS current at that time. In early 2004,
ArcUser editor Monica Pratt wrote Working with Excel in ArcGIS
(esri.com/news/arcuser/0104/files/excel.pdf). In 2007, the author
wrote another article on the same topic, Mapping and Modeling
Groundwater Geochemistry
(esri.com/news/arcuser/0207/files/groundwater.pdf). Since these
articles were published, Microsoft has released two new versions,
Excel 2007 and Excel 2010. With each release, spreadsheet
capabilities have improved and the processes for importing data
into ArcGIS have changed. This article updates and refines rules
and procedures for importing Excel 2003 files into ArcGIS 9.x.
This exercise reexamines the Excel spreadsheet as a data import
tool, focusing on ArcGIS 10 and Excel 2007/2010. The tutorial
uses spreadsheets to create and enhance geologic data. Field
samples include Hydrogeochemical Stream Sediment Reconnaissance
(HSSR) points plus custom soil and rock data. In this exercise, we
will model a well-known gold and base metals mining area in
northern Nevada, located near the town of Battle Mountain. The
custom samples are typical of data that might come from the field,
assayed by a modern analytic laboratory.
su Spring 2012 esri.com
Hands On
What You Will Need
  ArcGIS 10 for Desktop
  Microsoft Excel 2010/2007 or 2003 or the 2007 Office System Driver
  Sample dataset from the ArcUser website
Best Practices When Using Excel Files with ArcGIS
How many times have you imported spreadsheet data into ArcGIS only
to find some cells empty or formatted with an unwanted field type?
To avoid these and other problems, follow these practices when
creating and maintaining data in Excel that you will use in ArcGIS.
1. When creating spreadsheets, make sure fields are fewer than
255 characters.
ArcGIS reads the first 255 field characters. Fields with more than
255 characters are converted to BLOB fields and are not readable.
Abbreviate, manually truncate, or split any fields longer than
255 characters.

A Word about Microsoft Excel Versions
If you have installed Office 2007, you can read .xls and .xlsx
files. If you have Office 2003 or 2010 installed, you can read .xls
files, but you will need to install the 2007 Office System Driver
to read .xlsx files. If you do not have Microsoft Excel installed,
you must install the 2007 driver before you can use either .xls or
.xlsx files. The 2007 Office System Driver can be downloaded from
the Microsoft Download Center at links.esri.com/office_07_sysdriver.
Carefully follow the installation instructions before you restart
ArcGIS. Also, if you have previously specified on the File Types
tab of the Customize > ArcCatalog Options dialog box that
ArcCatalog show you .xls files, you'll need to remove this file
type to be able to access Excel files directly. Before beginning to
work the exercise, read the accompanying article, Best Practices
When Using Excel Files with ArcGIS, for valuable tips on working
with Excel data.

2. Check the numeric field type before and after importing Excel
data.
ArcGIS typically converts spreadsheet numeric fields to double
precision (Double), which may not meet your needs. If necessary,
create new fields of the desired type and calculate values into
them.
3. Check the format for date fields.
ArcGIS 10 uses the Lotus date/time format. In this format, the
calendar date is represented by a whole number value that
represents the number of days since January 1, 1900, plus one day
(due to a bug in Lotus 1-2-3 that was carried over to Excel). Time
is represented as the decimal portion of a 24-hour day. If
date/time data is important, format the input spreadsheet using a
standard Excel date/time format.
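
As a rough illustration of the serial date/time format described in practice 3, the Python sketch below converts a serial number into a calendar date and time. The function name is hypothetical. The sketch assumes serial values of 61 or greater (March 1, 1900, onward), where the extra day from the Lotus 1-2-3 bug is already included in the count, so counting days from December 30, 1899, lands on the correct date.

```python
from datetime import datetime, timedelta

# Day zero that reproduces the spreadsheet calendar for serials >= 61.
# (Assumption: earlier serials are rejected rather than corrected.)
EXCEL_EPOCH = datetime(1899, 12, 30)


def serial_to_datetime(serial):
    """Convert a serial date (whole days plus day fraction) to a datetime."""
    if serial < 61:
        raise ValueError("serials before March 1, 1900, are affected by the leap year bug")
    return EXCEL_EPOCH + timedelta(days=serial)


# The whole number is the date; the decimal portion is the time of day.
print(serial_to_datetime(40909.5))  # noon on January 1, 2012
```

A field that shows up in ArcGIS as a bare number such as 40909.5 is usually a date that was not formatted as a date in the spreadsheet before import.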
Getting Started: Examining Files in ArcCatalog
To begin this exercise, go to the ArcUser website and download the
training data, excelmagic.zip. Unzip the excelmagic.zip data into a
project area on your local machine and start ArcCatalog. Navigate
to the Battle_Mountain folder and locate the XLSFiles folder. When
ArcCatalog displays an Excel file, it adds a dollar sign ($) to
each worksheet name.
Inside this folder, expand all files. Locate Sample_Locations.xlsx
and preview Rock$. This Excel 2010 spreadsheet contains two
worksheets named Rock$ and Soil$. Rock$ and Soil$ contain sample
numbers, universal transverse Mercator (UTM) coordinates, and field
information that allow this data to be posted on a map. Next,
preview HSSR_LonLat83.xlsx and study its only worksheet, HSSR$.
Next, locate and preview two CSV files, Rock_Data and Soil_Data.
[Figure: Preview the sample data in ArcCatalog.]
4. Follow ArcGIS field naming rules when creating Excel column
names.
The first row of an Excel worksheet sets the name for each column.
Column names become field names when an Excel worksheet is imported
into ArcGIS. Always follow these naming rules:
  Column/Field names must begin with a letter.
  Column/Field names must contain only letters, numbers, and the
  underscore character.
  Column/Field names must be no more than 64 characters. If a name
  is longer than 64 characters, ArcGIS retains the first 63
  characters.
  Column/Field names may not consist solely of reserved words. Do
  not use these words in field names. See
  support.microsoft.com/kb/286335 for the list of reserved words.
  ArcGIS typically adds a trailing underscore to reserved word
  field names added by copying and pasting from other sources.
  Column/Field naming rules also apply to named ranges imported
  from Excel.
For more information, search ArcGIS 10.0 web help for Excel
worksheet.
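
The naming rules in practice 4 are simple enough to check before a worksheet ever reaches ArcGIS. The following Python sketch is one hedged way to do that. The function name is hypothetical, and the reserved word set is only a small illustrative subset (an assumption), not the full list from the Microsoft article cited above.

```python
import re

MAX_LENGTH = 64
# Illustrative subset only (assumption); see the Microsoft list for the full set.
RESERVED_WORDS = {"DATE", "TIME", "SELECT", "TABLE", "VALUE", "YEAR"}


def check_field_name(name):
    """Return a list of naming rule violations; an empty list means the name passes."""
    problems = []
    if not re.match(r"[A-Za-z]", name):
        problems.append("must begin with a letter")
    if re.search(r"[^A-Za-z0-9_]", name):
        problems.append("may contain only letters, numbers, and underscores")
    if len(name) > MAX_LENGTH:
        problems.append("must be no more than 64 characters")
    if name.upper() in RESERVED_WORDS:
        problems.append("is a reserved word")
    return problems


# Hypothetical column names, checked one at a time.
for column in ["SAMPLENO", "Sample No", "2nd_Assay", "Date"]:
    print(column, check_field_name(column))
```

Reporting the violations rather than silently rewriting the names keeps the decision about each new column name with the person who maintains the spreadsheet.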
In Sample_Locations.xlsx, open the Soil worksheet and inspect the
data. This location table contains 20,096 soil sample points
posted in UTM North American Datum of 1983 (NAD83) Zone 11 meters.
Coordinates are posted and displayed using a precision of 0.01
meters. Many samples are coded by claim group.
Save this spreadsheet as a new file so you can retain the
original data as an archive. Name the new file Sample_Locations
Import.xls. Look at the first row of data. Many text strings in
this row contain spaces. Change these spaces to underscores. (Hint:
Select only the first row and use Find and Replace.) Next, clarify
the coordinate system columns. Change Easting to UTM83Z11_E and
Northing to UTM83Z11_N. Now define a named range. Move to cell
[Figure: After field names have been corrected, create a named
range in Excel called Rock_Locations_Import_R.]
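
The header cleanup in this step (spaces changed to underscores, Easting and Northing renamed) can also be sketched in Python for readers who script their spreadsheet preparation. This is a minimal sketch, not part of the exercise itself: the function name and the sample header row are hypothetical, and only the two coordinate renames come from the text.

```python
# Renames taken from the exercise text.
RENAMES = {"Easting": "UTM83Z11_E", "Northing": "UTM83Z11_N"}


def clean_header_row(headers):
    """Replace spaces with underscores, then apply the coordinate renames."""
    cleaned = []
    for header in headers:
        name = header.strip().replace(" ", "_")
        cleaned.append(RENAMES.get(name, name))
    return cleaned


# Hypothetical header row; the real worksheet columns may differ.
print(clean_header_row(["Sample No", "Easting", "Northing", "Claim Group"]))
```

Only the header row is passed in, which mirrors the hint about selecting only the first row before using Find and Replace, so spaces inside data values are left alone.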
The Rock_Data and Soil_Data CSV files contain companion analytic
data for the Rock$ and Soil$ worksheets. The [SAMPLENO] field in
both CSV files will support a one-to-one tabular join with the same
field in the Soil$ and Rock$ worksheets. Closely inspect the
alignment of data in Soil_Data columns. Notice that [SAMPLENO] and
[SB_PPM] are aligned on the left side of the column while
[AU_PPB], [AG_PPM], [AS_PPM], and [HG_PPB] are aligned on the
right. Scroll down through the table and observe that many cells in
the right-aligned columns are empty. In the source CSV file, many
of these cells contain nonnumeric strings that do not display
properly. Notice that [SB_PPM], a left-aligned field, contains many
values that begin with a less than (<) symbol, which marks results
below the detection limit. Over-limit results are flagged with a
greater than (>) symbol (e.g., >10,000 for gold). Fortunately, the
over-limit samples in this dataset have already been resolved, so
only the less than values need fixing. Since it is statistically
meaningful to recognize that some small amount of each element
exists in all samples, it is not appropriate to change all < values
to zero. Instead, change them to a smaller absolute value,
typically 20 to 50 percent of the detection limit. Take a more
conservative
approach and use 20 percent. Table 1 lists the current value and
smaller absolute value for elements below the minimum detection
limit.
Element: Antimony, Arsenic, Gold, Mercury, Silver
Abbr.: Sb, As, Au, Hg, Ag
Unit: PPM, PPM, PPB, PPB, PPM
Detection Limit: 5.0 ppm, 5.0 ppm, 5.0 ppb, 10.0 ppb, 0.5 ppm
Change From