Google Analytics by Justin Cutroni
Copyright 2007 O'Reilly Media, Inc.
ISBN: 978-0-596-51496-9
Released: September 11, 2007
Web analytics is the process of measuring your web site, analyzing the data, and making changes based on the analysis. Many businesses are just starting to learn how they can increase the performance of their web site by using web analytics. For many people, their first exposure to web analytics is Google Analytics, a free tool available to everyone.
Although analysis is vital to web analytics, you can't do analysis without good data. Configuring Google Analytics correctly is the key to collecting good data.
This Short Cut provides a thorough description of how the Google Analytics system works, information about many different types of implementations, and ways to avoid common pitfalls. It also shares some best practices to get your setup correct the first time.
Contents
Getting Setup Correct
How Google Analytics Works
Profiles and Profile Settings
Filters
Goals and Funnels
Common Web Site Configurations
Marketing Campaign Tracking
E-Commerce Tracking
Custom Segmentation
CRM Integration
Tips and Tricks
Reference
Conclusion
Getting Setup Correct
I wrote this Short Cut for one primary reason: to help you configure Google Analytics correctly. If it's not configured correctly, then you may have incorrect data, leading to incorrect analysis. In my experience, getting Google Analytics configured correctly is the biggest roadblock to using it effectively.
Throughout the Short Cut, I try to explain how Google Analytics works so you can understand the impact of various configuration choices. Remember, Google Analytics is used to analyze business data, which means each business will configure it differently. You need to identify what's best for your business and configure Google Analytics accordingly.
I believe this Short Cut can be used in two distinct ways. First, it can be used as a reference manual. If you're already using Google Analytics, and you have a question about filters, just flip to the "Filters" section. Or, if you're not sure you've set up cross-domain tracking correctly, read "Tracking Across Multiple Domains" in the "Common Web Site Configurations" section. My goal is to make each section a resource that can be used without reading the entire Short Cut.
Second, you can view this Short Cut as a complete work. I've tried to structure the sections to follow a typical implementation. One of the tips I include in the "Tips and Tricks" section is a short implementation process. If you're just getting started with Google Analytics, you may want to review that process first and keep it in mind as you progress through this Short Cut.
How Google Analytics Works
Understanding the architecture of the Google Analytics system (how it collects data, identifies visitors, and creates reports) is the key to understanding many of the advanced topics that will be discussed later in this Short Cut. Before we begin discussing filters, goals, and advanced implementations, let's review the fundamentals of how the system works.

Data Collection and Processing
I'm going to explain how Google Analytics collects, processes, and displays data using Figure 1. The data collection process begins when a visitor requests a page from the web server. The server responds by sending the requested page back to the visitor's browser (step #1). As the browser processes the data, it contacts other servers that may host parts of the requested page. This is the case with the Google Analytics Tracking Code (GATC).
The visitor's browser requests the code from a Google Analytics server (step #2), and the server responds by sending the code to the visitor's browser. All of the code is contained within one file named urchin.js. Once the browser receives the code, the GATC begins to execute while the rest of the page loads.
During execution, the code identifies attributes of the visitor and his browsing environment, such as how many times he's been to your site, where he came from, etc.
After all the appropriate data has been collected, the GATC sets (or updates, depending on the situation) a number of cookies (step #3), which are discussed later in this Short Cut. The cookies are used to store information about the visitor. After writing the cookies, the tracking code sends the data back to the Google Analytics server. The data is transmitted to the server via a request for an invisible GIF file (step #4).
When the Google Analytics server receives this request, it stores the data in a large text file called a logfile (step #5). There is one line in the logfile for each pageview created by Google Analytics.
Each line in the logfile contains numerous attributes of the pageview. This includes:
• When the pageview occurred (date and time)
• Where the visitor came from (referring web site, search engine, etc.)
• How many times the visitor has been to the site (number of visits)
• Where the visitor is located (geographic location)
• Who the visitor is (IP address)
Figure 1. Google Analytics processing flow
After the pageview is stored in the logfile, the data collection process is complete. The next step is data processing.
At some regular interval, usually every few hours, Google Analytics processes the data in the logfile. During processing, each line is split into pieces, one piece for each attribute of the pageview. Here's a sample logfile line. (Note that this is not an actual logfile line from Google Analytics. It is a representation.)
    65.57.245.11 www.epikone.com - [21/Nov/2006:19:05:06 -0600]
    "GET /__utm.gif?utmwv=1&utmn=323703347&utmcs=utf-8&utmsr=1600x1200&utmsc=32-bit&utmul=en-us&utmje=1&utmfl=8.0&utmcn=1&utmdt=EpikOne%20-%20Google%20Analytics%20Support%2C%20Training%20-%20Urchin%205%20Software%2C%20Analytics%20Consulting&utmhn=www.epikone.com&utmr=-&utmp=/ HTTP/1.1"
    200 35 "http://www.epikone.com/"
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
    "__utma=100957269.323703347.1164157501.1164157501.1164157501.1; __utmb=100957269; __utmc=100957269; __utmz=100957269.1164157501.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none)"
While most of this data is difficult to understand, a few things stand out. The date and time (Nov 21, 2006 at 19:05:06) and the IP address of the visitor (65.57.245.11) are easily identifiable.
Google Analytics turns each piece of data in the logfile line into a data element called a field. For example, the IP address becomes the "Visitor IP" field. It's important to understand that each pageview has many, many attributes, and that each one is stored in a different field.
After each line has been broken into fields (step #6), filters are applied to the data (step #7). Filters are business rules that you add to Google Analytics. They control what data appears in your reports and how it appears.
Finally, after the filters have been applied, the reports are created (step #8) and stored in the database (step #9). Each report in Google Analytics is created by comparing a field, like the Visitor City, to a piece of numeric data (Visits, Pageviews, Bounce Rate, Conversion Rate, etc.).
Once the data is in the database, the process is complete. When you (or any other user) request a report, the appropriate data is retrieved from the database and sent to the browser.
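To make the idea of splitting a logfile line into fields more concrete, here is a rough sketch in JavaScript. It is not Google's processing code, which is not public. parseHit() is a hypothetical helper, and the meanings given to utmhn, utmp, and utmsr in the comments are read off the sample line above rather than taken from official documentation.

```javascript
// Illustrative only: parseHit() is a hypothetical helper, not part of
// Google Analytics. It splits the query string of a __utm.gif request
// into named values, one per attribute of the pageview.
function parseHit(requestPath) {
  const queryStart = requestPath.indexOf("?");
  const query = queryStart === -1 ? "" : requestPath.slice(queryStart + 1);
  const fields = {};
  // URLSearchParams also decodes percent-encoded values such as %20.
  for (const [name, value] of new URLSearchParams(query)) {
    fields[name] = value;
  }
  return fields;
}

// A shortened version of the request in the sample logfile line.
const hit = parseHit(
  "/__utm.gif?utmwv=1&utmcs=utf-8&utmsr=1600x1200&utmul=en-us" +
    "&utmhn=www.epikone.com&utmr=-&utmp=/"
);

console.log(hit.utmhn); // hostname of the page that was viewed
console.log(hit.utmp);  // path of the page that was viewed
console.log(hit.utmsr); // screen resolution reported by the browser
```

Each property of the returned object plays the role of one field. The real system also derives fields from the rest of the line, such as the IP address and the cookie values, which this sketch leaves out.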
Warning
Once Google Analytics has processed the data and it is in the database, it can never be changed. This means historical data can never be altered or reprocessed. Any mistakes made during setup or configuration can permanently affect the quality of the data.
This also means that any changes made to the configuration of Google Analytics will not alter historical data.
About the GATC
Google Analytics uses a very common web analytics technology called page tagging to identify visitors, track their actions, and collect the data. Each page on your web site that you want to track must be "tagged" with a small snippet of JavaScript. If the tracking code is not on a page, then that page will not be tracked. The following JavaScript snippet is the standard GATC:
    <script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
    </script>
    <script type="text/javascript">
    _uacct = "UA-XXXXX-X";
    urchinTracker();
    </script>
Google suggests that you place the tracking code immediately before the closing </body> tag of each page, so if the browser has any problems requesting the urchin.js file from a Google Analytics server (shown in Figure 1, step #2), it does not slow down the loading of the page.
However, there may be cases where the tracking code must be placed at the beginning of the page (these will be discussed later). In these cases you can place the GATC right after the opening <body> tag or even right before the closing </head> tag.
If your web site uses a content management system or some type of templating engine, then the tracking code can be added to template files or another mechanism that automatically generates common HTML elements. This is a fast, effective way to tag all web site pages.
If you cannot place the tracking code before the closing </body> tag, then it is possible to add it to some other part of the web page. It can be placed in the <head> tag or almost anywhere in the main <body> tag. If you do place the GATC in the <body> tag, make sure it appears inline and not nested within another tag.
However, placing the tracking code early in the page can have a negative effect on the visitor's experience. If there is any latency in the Google Analytics server, then the browser will pause while waiting for urchin.js to download. The visitor will experience a delay while the web site loads in the browser. Remember, the visitor experience is very important. Anything that can degrade the experience should be minimized.
Note
It is possible to host the urchin.js file on your own server. To do so, copy the contents of urchin.js by viewing the file in your browser. Just enter http://www.google-analytics.com/urchin.js into your browser, copy the resulting code, and place it in a file on your server. Then, update the GATC to reference the new file location on your server and not the urchin.js located on the Google Analytics server:
    _uacct = "UA-XXXXX-X";
    urchinTracker();
Note that Google updates urchin.js without notifying users. If you decide to host urchin.js on your own servers, make sure you periodically check for updates.
There are two versions of the Google Analytics Tracking Code: a secure version and a nonsecure version. The secure version should be used on secure web pages. If the nonsecure version is used on a secure page, then the browser will display a security warning to the visitor. Security warnings can negatively impact a visitor's engagement with your web site.
To simplify the installation, the secure version can be used on both secure and nonsecure pages:
    <script src="https://ssl.google-analytics.com/urchin.js" type="text/javascript">
    </script>
    <script type="text/javascript">
    _uacct = "UA-XXXX-X";
    urchinTracker();
    </script>
Customizing the GATC
There are many settings and features that can be enabled or modified by adding or changing variables in the GATC. A complete list of these variables, and how they can be used to modify Google Analytics tracking, can be found in the "Reference" section of this Short Cut.

About urchinTracker()
The most important part of the GATC is a JavaScript function named urchinTracker(). This function is used to collect visitor data, store that data in cookies, and send the data to the Google Analytics server. urchinTracker() appears in the GATC:
    <script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
    </script>
    <script type="text/javascript">
    _uacct = "UA-XXXXX-X";
    urchinTracker();
    </script>
Every time urchinTracker() executes, a pageview is created, and the data is sent to Google Analytics (Figure 1, step #4). Each pageview has a unique "name" that can be found in the URL column of the Top Content report. Figure 2 shows some sample data from the Top Content report.
Figure 2. Top Content report showing pageview "names"
During the data-collection process, urchinTracker() extracts the information from the location bar of your browser. It modifies the value by removing the domain name and domain extension. The only things left are the directories, filename, and query-string variables. This is called the Request URI, and it's one of the fields created during data processing (Figure 1, step #6). Here's an example. The URL http://www.epikone.com/pages/index.php?id=110 would appear in the Top Content report as /pages/index.php?id=110. So, in this example, the Request URI is the part of the URL that comes after www.epikone.com.
That's the default behavior of urchinTracker(). You can override this behavior and specify how urchinTracker() names a pageview by passing a value to urchinTracker(). For example, to change the way the pageview for /index.php appears in Google Analytics, you would modify urchinTracker() on the index.php page as follows:
    urchinTracker('index page');
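The naming rule can be sketched in a few lines of JavaScript. This is only an approximation of the behavior described here, not the code inside urchin.js. requestUri() and pageviewName() are hypothetical helper names introduced for the illustration.

```javascript
// Hypothetical helpers, not functions from urchin.js.

// Default naming: keep everything after the hostname, that is, the
// directories, the filename, and the query-string variables.
function requestUri(locationHref) {
  const url = new URL(locationHref);
  return url.pathname + url.search;
}

// Override: a value passed in by the page author wins over the default.
function pageviewName(locationHref, override) {
  return override !== undefined ? override : requestUri(locationHref);
}

console.log(requestUri("http://www.epikone.com/pages/index.php?id=110"));
// -> /pages/index.php?id=110

console.log(pageviewName("http://www.epikone.com/index.php"));
// -> /index.php

console.log(pageviewName("http://www.epikone.com/index.php", "index page"));
// -> index page
```

The last call corresponds to urchinTracker('index page'): the name passed in replaces the Request URI entirely.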
Passing the value 'index page' forces urchinTracker() to name the pageview 'index page' rather than /index.php. The deeper effect of this change is that the value for the Request URI is not /index.php, but index page. This will have an impact on other configuration settings. I'll discuss this later in this Short Cut.
urchinTracker() is just like any other JavaScript function. This means that it can be executed anywhere a normal JavaScript function can be executed. So, if you place urchinTracker() in the onClick attribute of an image, a pageview will be created in Google Analytics when a visitor clicks on the image. How will the pageview appear in Google Analytics? By default, it will use the Request URI. However, if you pass a value to the function you can name the click anything you want.
This technique can be used to track visitor clicks, actions, and other browser events. For example, to track clicks on links to other web sites (called outbound links), simply add the urchinTracker() function to the onClick attribute of the appropriate anchor tags. Don't forget to pass urchinTracker() a value so the visitor click is identifiable. There are more tricks on using urchinTracker() to track Flash, JavaScript, and non-HTML files in the "Tips and Tricks" section, later in this Short Cut.

About the Tracking Cookies
Google Analytics uses up to five first-party cookies to track a visitor and store information about that visitor. These cookies, set by the urchinTracker() function, track attributes of the visitor, such as how many times she has been to the site and where she came from. Note that the cookies do not store any personally identifiable information about the visitor. Here is a list of all the tracking cookies, their format, and other information:
__utma
Expiration: 2038
Format: domain-hash.unique-id.ftime.ltime.stime.session-counter
The __utma cookie is the visitor identifier. The unique-id value is a number that identifies this visitor. ftime, ltime, and stime are all used to compute visit length (along with the __utmb and __utmc cookies). The final value in the cookie is the session counter. It tracks how many times the visitor has visited the site and is incremented every time a new visit begins.
__utmb
Expiration: End of session
Format: hashcode
The __utmb cookie, in conjunction with the __utmc cookie, is used to compute the visit length.
__utmc
Expiration: End of session
Format: hashcode
The __utmc cookie, in conjunction with the __utmb cookie, is used to compute the visit length.
__utmz
Expiration: By default, six months, but it can be customized
Format: domain-hash.ctime.nsessions.nresponses.utmcsr=X(|utmccn=X|utmctr=X|utmcmd=X|utmcid=X|utmcct=X|utmgclid=X)
The __utmz cookie is the referral-tracking cookie. It tracks all referral information regardless of the referral medium or source. This means that all organic, CPC, campaign, or plain referral information is stored in the __utmz cookie. Data about the referrer is stored in a number of name-value pairs, one for each attribute of the referral:
utmcsr
    Identifies a search engine, newsletter name, or other source specified in the utm_source query parameter. See the "Marketing Campaign Tracking" section for more information about query parameters.
utmccn
    Stores the campaign name or the value in the utm_campaign query parameter.
utmctr
    Identifies the keywords used in an organic search or the value in the utm_term query parameter.
utmcmd
    A campaign medium or the value of the utm_medium query parameter.
utmcct
    Campaign content or the content of a particular ad (used for A/B testing). The value from the utm_content query parameter.
utmgclid
    A unique identifier used when AdWords auto-tagging is enabled. This value is reconciled during data processing with information from AdWords.
The expiration date for the campaign cookie can be set in the GATC. See the "Reference" section for more information about how to change the default value. Also, there is more information about referral tracking in the "Marketing Campaign Tracking" section.
__utmv
Expiration: 2038
Format: domain-hash.value
The custom segmentation cookie. This cookie is not present unless custom segmentation has been implemented. The cookie is created using the __utmSetVar() function, which will be discussed in a later section. The value passed to __utmSetVar() is stored in the value section of the __utmv cookie.
In this Short Cut, I don't discuss any of the issues regarding cookies and visitor tracking. There are numerous studies, white papers, and blog posts estimating the rate at which cookies are blocked by browsers and deleted by users. Eric Petersen first wrote about the pitfalls of cookies in a 2005 study for Jupiter Research. A summary of his work can be found in this press release:
http://www.jupitermedia.com/corporate/releases/05.03.14-newjupresearch.html
In my opinion, the best way to mitigate the effects of visitor behavior on your data is to look for trends and patterns in your data and to avoid absolute numbers.
Note
Google Analytics does not track any visitor who has configured his browser to block first-party cookies or any visitor who has disabled JavaScript. There is no way to circumvent this limitation.
Also, if a visitor deletes his cookies, he will appear as a new visitor the next time he visits the web site.
Profiles and Profile Settings
The data for each web site that you track is stored in a profile. Most documentation describes a profile as data for a web site. But a profile is more than just data. Each profile has a number of settings that can affect the data within the profile.
A more accurate way to describe a profile is a collection of data and business rules. The business rules modify the data in the profile. In Google Analytics, the business rules are called filters (I'll discuss them more in a later section), and each profile can have different filters.
Multiple profiles can be created for the same web site. Each profile can have different filters, thus changing the data in each additional profile created for a web site. So, even though you may have two profiles for http://www.epikone.com, the data in the reports could be dramatically different because of the different filters applied to each profile.
Why would you create multiple profiles for a single web site? To create different sets of data for different types of analysis. I'll discuss this more in the "Filters" and "Tips and Tricks" sections, where I suggest some profiles that you should consider creating.
In addition to filters, there are other settings that are common to each profile. Understanding how each setting alters the data in the profile is important when you're setting things up.

Website URL
The Website URL is used for two simple tasks. First, it is used to check the installation of the tracking code. After a profile has been created, Google Analytics will spider the Website URL value and search for the tracking code to ensure that it has been installed correctly.
The Website URL is also used to create the Site Overlay report. When the Site Overlay report is generated, Google Analytics loads the Website URL value and displays the page in a separate window. It then adds the appropriate data to each link on the page.

Profile Name
The Profile Name identifies each profile in a list. There are no restrictions on how to name a profile. You can even create two profiles with the same name, but I don't recommend this (how would you differentiate them in a list?). I suggest naming profiles something descriptive that all users will understand. If a profile has a filter applied to it, then include a small description explaining how the filter changes the data in the profile.
For example, I might name a profile "www.epikone.com - vermont traffic only". When viewing it in a list, I can easily understand that the profile is for the www.epikone.com domain and that the profile contains traffic only from Vermont.
Tip
Wondering when a profile was created? Add the "start date" to the profile name; that way, you'll always know how much data exists in the profile.
Time Zone
This setting can be changed only if your Google Analytics account is not linked to an AdWords account. If you've linked your Analytics account to an AdWords account, then the Time Zone setting will be the time zone you defined in your Google AdWords account. Applying the AdWords time zone to the Analytics data ensures that the Google AdWords reporting in Google Analytics is accurate.

Default Page
Setting the default page for a web site is a simple configuration step that ensures the quality of your Google Analytics report data. The default page for a web site is the page shown to a visitor when they enter just the web site domain into the browser's location bar. If you type http://www.epikone.com/ into your browser, the web server returns index.php. You won't see index.php in the browser's location bar, but that's the page the server returns. This is the same for directories within your web site. http://www.epikone.com/blog also returns index.php.
Why does this matter? When the GATC executes, it creates pageviews using the page name that the visitor requested. What if there is no page name, as is the case in this example? Google Analytics creates a pageview and names it /. However, when the user types http://www.epikone.com/index.php, Google Analytics creates a pageview for index.php. Although the visitor sees the same page, Google Analytics creates a pageview for / and a pageview for index.php: two pageviews for the same page. Pageviews for a content page should be summarized as a single line item, not two. Figure 3 illustrates how two pageviews can exist for a single page.
To remedy this problem, enter the default page for your web site in the "Default Page:" field in the Main Website Profile Information configuration section. Be sure to enter only the page name. Do not include a slash before the page name, and do not use regular expressions. As Figure 4 shows, just input the name of the page, nothing else.
Exclude URL Query Parameters
See the "Dynamic Websites" section for coverage of this setting. Even if you do not have a dynamic web site, you should still read this section.
Figure 3. In the above image, /index.html and / are the same page and should be consolidated into a single line item
Figure 4. "The Default page" setting
E-Commerce Settings
There are two e-commerce settings in the profile information section, shown in Figure 5. The first, E-Commerce Website, is a switch. When set to Yes, a series of e-commerce reports will be added to the reporting interface. By default, this setting is set to No. So, if you have an e-commerce web site, make sure you change this setting when creating a profile.
The second e-commerce setting formats the currency in your Google Analytics reports. Google Analytics can display currency in eight different formats. Changing the currency setting will alter the way e-commerce data is displayed in the reporting interface.

Apply AdWords Cost Data
Most profile settings can be configured in the Google Analytics interface. However, one profile setting, "Apply Cost Data" (Figure 6), can only be activated via the AdWords interface. This setting is used to automatically import cost data from AdWords into Google Analytics.
You can read more about the Apply Cost Data setting in the "Marketing Campaign Tracking" section.
Figure 5. Profile e-commerce settings.
Note
Remember, making changes to any profile settings will not alter the data that has already been processed by Google Analytics. It will affect the data processed after the settings have been changed.
Filters
There is no Google Analytics concept that is more important, but less understood, than filters. Functionally, filters are business rules. They are added to a profile because there is a business need to modify the data in a profile. For example, it is very common to exclude web site traffic generated by internal employees. This data can skew the data generated by actual customers, thus causing incorrect analysis.
I believe the key to understanding filters is understanding how web site data is structured in Google Analytics. I discussed this earlier in "How Google Analytics Works." If you have not read that section, please do so.
There are two types of filters in Google Analytics: predefined filters and custom filters. Predefined filters are common filters that most people use. Google has bundled these common filters together and simplified the implementation.
Figure 6. How to apply Google AdWords cost data to a Google
Analytics profile
Custom filters are different. You need to do all the configuration work when creating a custom filter. While it can be challenging, custom filters truly offer you advanced control over the data in your profiles.
In general, custom filters and predefined filters work off of the same premise.

How Profile Filters Work
Figure 7 displays the three common components of a filter:
• Filter field
• Filter pattern
• Filter type
I find it easy to define filters as a process involving these components. As Google Analytics processes site data, it executes the filters that have been applied to the profile. It compares the filter field against the filter pattern and, if the field matches the pattern, the filter performs an action. When the action occurs, the data in the profile is changed.
Multiple filters can be applied to a profile. When more than one filter is applied to a profile, they are executed sequentially, in the order they are listed. The output from one filter is used as the input to the next filter.
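The process just described can be modeled in a few lines of JavaScript. Treat this as a conceptual sketch under stated assumptions, not a description of Google's implementation: applyFilters() is a hypothetical function, the field names visitorIp and requestUri are illustrative labels rather than Google Analytics identifiers, and only two filter types (include and exclude) are modeled.

```javascript
// Hypothetical model of a profile's filter list. Each filter has a field,
// a pattern (a regular expression), and a type.
function applyFilters(pageviews, filters) {
  let rows = pageviews;
  for (const filter of filters) {
    // The output of one filter becomes the input to the next.
    rows = rows.filter((row) => {
      const matches = filter.pattern.test(String(row[filter.field] || ""));
      return filter.type === "include" ? matches : !matches;
    });
  }
  return rows;
}

const pageviews = [
  { visitorIp: "10.0.0.5", requestUri: "/index.php" },
  { visitorIp: "65.57.245.11", requestUri: "/index.php" },
  { visitorIp: "65.57.245.11", requestUri: "/blog/" },
];

const filters = [
  // First drop traffic from an assumed internal address range...
  { field: "visitorIp", pattern: /^10\.0\.0\./, type: "exclude" },
  // ...then keep only what remains under /blog/.
  { field: "requestUri", pattern: /^\/blog\//, type: "include" },
];

const remaining = applyFilters(pageviews, filters);
console.log(remaining); // only the external /blog/ pageview is left
```

Because the second filter only ever sees what the first one let through, the order of the list matters.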
Figure 7. The three primary parts of a filter
Warning
Filters forever modify the data in the profile. This means an incorrect filter will forever alter profile data. Be careful when applying filters to your profiles. Test your filters on a "test" profile to ensure they work as expected (see the "Reference" section for more information about test profiles).
Filter Field
The first part of a filter is the filter field. Filter fields are data elements created when Google Analytics processes the data in the logfile (Figure 1, step #6). Each filter field is an attribute of a pageview in Google Analytics. There are 37 different filter fields in Google Analytics, each of which can be used to create a filter. A complete list of filter fields, and what they represent, can be found in the Google Analytics support documents:
http://www.google.com/support/googleanalytics/bin/answer.py?answer=55588
Some of the most common filter fields are listed in Table 1.

Table 1. The most commonly used filter fields
Filter Field         Description
------------         -----------
Request URI          The request URI is created using the information in the
                     location bar of the browser. Google Analytics removes the
                     subdomain, the hostname, and the domain extension.
                     Everything remaining becomes the request URI.
Hostname             This is the primary domain and subdomain (if present),
                     listed in the location bar of the visitor's browser.
Visitor IP Address   The IP address of the person visiting your web site. While
                     this field is available for use in filters, it is not
                     visible. The value of the IP address, which is protected
                     by the Google Analytics Privacy Policy, cannot be
                     displayed in reports.
Tip
Each field represents a piece of data that can have many values. For example, the field for Visitor City can contain Boston, San Francisco, Seattle, etc. To find the different values stored in a field, use the Google Analytics reports. As mentioned previously, each report is constructed using a field, so all the values in a field are displayed in the report built from that field. At the time of writing, there is no reliable documentation about which reports are created from each filter field. This information should appear in the Google Analytics support documentation soon.
Filter Pattern
The second part of a filter is the filter pattern. The pattern is applied to the filter field, and if the pattern matches any part of the field, the pattern returns a positive result, thus causing an action to occur. The patterns used in Google Analytics are called regular expressions.
A regular expression is a set of characters that represents a larger set of data. These characters may be standard alphanumeric characters (like letters or numbers) or special characters (like the * or +).
Rather than begin a discussion of regular expressions in this section, I feel that it is better to continue the conceptual discussion of filters. If you understand the basic concept of a regular expression, that it is a pattern applied to data, then you will be able to follow this section. (More information about regular expressions can be found in the "Reference" section of this Short Cut.)

Filter Type
The final part of a filter is the filter type. The filter type is what happens to the data if the filter pattern matches the filter field. There are seven different types of filters, each with a distinct function. They are all custom filters, and their descriptions follow next.

Include/Exclude filters
Include/Exclude filters are the most common custom filters in Google Analytics. They're also the easiest to understand. The action for an Include filter is "inclusion." This means that if the regular expression matches the filter field, the data is included in the profile.
Exclude filters operate in the opposite manner. If the filter pattern matches the filter field, then the data will be excluded from the profile.
Include/Exclude filters are extremely powerful because they can be used to segment your data in different ways. For example, to analyze the visitation habits of visitors from California, create an Include filter as shown in Figure 8.
The result of this filter would be that all data stored in the profile, and displayed in the reports, would be for visitors from "california." (The Visitor Region field stores the U.S. state name.)
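To see how an Include filter and an Exclude filter with the same pattern divide the data, consider this small sketch. filterByRegion() is a hypothetical function, visitorRegion is an illustrative property name, and the case-insensitive pattern /california/i is an assumption made for the example, since the exact settings in Figure 8 are not reproduced here.

```javascript
// Hypothetical sketch of Include and Exclude filters on the Visitor Region
// field. The pattern only needs to match part of the field's value.
function filterByRegion(pageviews, pattern, type) {
  return pageviews.filter((pv) => {
    const matches = pattern.test(pv.visitorRegion);
    return type === "include" ? matches : !matches;
  });
}

const pageviews = [
  { visitorRegion: "California", requestUri: "/" },
  { visitorRegion: "Vermont", requestUri: "/blog/" },
  { visitorRegion: "California", requestUri: "/pages/index.php?id=110" },
];

// Assumption: matching is case-insensitive, so "california" matches
// the stored value "California".
const californiaOnly = filterByRegion(pageviews, /california/i, "include");
const everyoneElse = filterByRegion(pageviews, /california/i, "exclude");

console.log(californiaOnly.length); // 2
console.log(everyoneElse.length);   // 1
```

The two results never overlap and together account for every pageview, which is why the same pattern can be used to build two complementary profiles.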
A similar example (Figure 9) would be an Include filter for visitors from New York.
Figure 8. An include filter that "lets in" data from California
Figure 9. A filter to include visitors from New York
Now, how would you include visitors from New York and California? The answer is not as easy as applying two filters to the profile. Remember, filters are applied sequentially during processing, and the output from filter #1 is used as the input for filter #2. Applying two filters to a profile, one to include visitors from New York and one to include visitors from California, would not work. The reason is that the first filter would exclude the very data that the second filter is meant to match. To combine the functionality of these two filters, a single filter, shown in Figure 10, must be used.
This filter uses a regular expression to indicate that the Visitor Region must be "New York" or "California".
Search and Replace filter
The Search and Replace filter is a simple way to replace one piece of data with a different piece of data. This is most often used to replace long, unreadable URLs with more "human-readable" information.
Search and Replace filters are slightly different from other filters because they do not have a filter pattern. Instead they have a search string, which is the same as a filter pattern.
When a Search and Replace filter is applied to a profile, the filter searches the filter field for the search string. If the search string is found in the filter field, then the filter replaces the entire search string with the replace string. Figure 11 illustrates a common Search and Replace filter.
Figure 10. A filter to include visitors from New York and
California
This filter searches the Request URI field for the pattern
"category_id=1234". If the search string is found in the Request URI,
then the entire Request URI will be replaced with the string "Chairs".

Warning
The Replace String is standard text. It is not a regular expression.
Also note that the entire Filter Field is replaced with the Replace
String. So, if there are any other filters attached to a profile, and
those filters use the same filter field as the Search and Replace
filter, then those filters may not work if the Search and Replace
filter first modifies the value in the filter field.

Lowercase/Uppercase filters

Lowercase/Uppercase filters (see Figure 12) are different from other
filters in that they do not require a filter pattern, only a filter
field. Simply put, a Lowercase or Uppercase filter changes the
selected filter field to all lowercase characters or all uppercase
characters, respectively.
Figure 11. A common Search and Replace filter
Why is this filter needed? Some web servers, particularly Microsoft
IIS servers, create pageviews with mixed-case URLs. This means that
Google Analytics creates multiple line items for the same physical
page in various reports.

As an example, the URLs http://www.epikone.com/default.asp and
http://www.epikone.com/Default.asp will generate the same page for the
visitor. However, Google Analytics will create two line items in the
Top Content report, one for default.asp and one for Default.asp.
Obviously these are the same page and should be tracked as a single
line item. A Lowercase filter forces the filter field (in this case,
the Request URI) to a consistent case, thereby consolidating all
versions of the same page into a single line item in the Google
Analytics reports.
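The consolidation described above can be illustrated with a short sketch. It assumes a hypothetical list of Request URIs and a made-up helper, countPageviews, that tallies line items the way a report might. It is an illustration of the idea, not the actual processing code.

```javascript
// Hypothetical pageview log: the same page requested with mixed case.
const requestUris = [
  "/default.asp",
  "/Default.asp",
  "/DEFAULT.ASP",
  "/about.asp",
];

// Count pageviews per Request URI, optionally lowercasing each URI first.
function countPageviews(uris, lowercase) {
  const counts = {};
  for (const uri of uris) {
    const key = lowercase ? uri.toLowerCase() : uri;
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

const raw = countPageviews(requestUris, false);
const filtered = countPageviews(requestUris, true);

console.log(Object.keys(raw).length); // 4 separate line items
console.log(filtered); // { '/default.asp': 3, '/about.asp': 1 }
```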

Tip
Another good use of the Lowercase/Uppercase filter is for keywords.
Many users want to see "EpikOne", "epikone", and "EPIKONE" as the same
keyword, not three different keywords. An Uppercase or Lowercase
filter, applied to the Campaign Term field, will change the keyword
case.
Figure 12. Lowercase filter setup form; Uppercase filters have
the same settings

Lookup Table filters

Lookup Table filters are not currently active in Google Analytics. If
you were previously an Urchin on Demand customer, and were using
Lookup Table filters, then they will still work. However, new Lookup
Table filters cannot be created.

The idea behind lookup tables is that they are an automated
search-and-replace filter. A lookup table is a text file that can be
uploaded to Google Analytics. Google Analytics then applies the
information in the lookup table to a specific filter field. For
example, a lookup table could be used to replace product IDs in Google
Analytics with the name of the actual product.

This filter can be extremely useful. Hopefully it will be added to
Google Analytics soon.

Advanced filters

Advanced filters can alter data fields by combining elements from
multiple filter fields, removing unnecessary parts of filter fields,
or replacing one filter field with another.

Unlike most filters, advanced filters have two filter fields: Field A
and Field B. Along with each filter field, there is an Extract field.
The Extract field is synonymous with the filter pattern; it is the
regular expression that is applied to the filter field. It should be
noted that you don't have to use both fields.

So, Extract A is applied to Field A, and Extract B is applied to
Field B. The reason why the filter patterns are named Extract for
advanced filters is that certain parts of Field A and Field B can be
removed, or extracted, from each field. The part of the filter field
that is extracted is specified using a regular expression.

In Figure 13, two fields are referenced: the Request URI for Field A
and the Hostname for Field B. The pattern applied to Field A means
"capture all the characters in the Request URI and retain those
characters." The pattern applied to Field B means "match all the
characters in the Hostname and retain those characters."

What happens to those characters that are captured in the Extract
fields? Google Analytics allows you to combine the extracted pieces of
data and output them to another field, called a Constructor. The
Constructor is simply a field. After a part of the field has been
captured, Google Analytics stores it in a variable. Variables holding
data extracted by Extract A start with $A, and variables holding data
extracted by Extract B start with $B.

You can configure Google Analytics to permanently change the value of
the constructor using the Override Output Field setting (see Figure
13). When you select Yes, the data in the constructor overwrites the
Output Field value. So any reports that are created using that field
will be modified. Table 2 breaks down the process of combining two
extracts and exporting them to a constructor.

Table 2. How two extracts can be combined to modify an existing field

Captured part of      Captured part of    Output to Constructor:
Request URI [$A1]     Hostname [$B1]      Request URI [$B1$A1]

/pages/index.html     www.epikone.com     www.epikone.com/pages/index.html

If this filter is applied to a profile, then all the reports based on
the Request URI change; they will include the hostname as well as the
directory path, filename, and query-string variables.
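The process in Table 2 can be modeled roughly as follows. This is a sketch under stated assumptions: the function advancedFilter, the property names on the config object, and the (.*) extract patterns are all hypothetical stand-ins for the form settings, and both fields are treated as required. It shows only how captured groups could be substituted into a constructor string.

```javascript
// Hypothetical model of an Advanced filter; names are illustrative only.
function advancedFilter(hit, config) {
  const a = config.extractA.exec(hit[config.fieldA]);
  const b = config.extractB.exec(hit[config.fieldB]);

  // In this sketch both fields are required: no match, no change.
  if (a === null || b === null) return hit;

  // Replace $A1, $B1, ... in the constructor with the captured groups.
  const value = config.constructorPattern.replace(
    /\$([AB])(\d)/g,
    (match, field, index) => {
      const groups = field === "A" ? a : b;
      return groups[Number(index)] || "";
    }
  );

  // Return a copy of the hit with the output field overwritten.
  return { ...hit, [config.outputField]: value };
}

const hit = { requestUri: "/pages/index.html", hostname: "www.epikone.com" };

const result = advancedFilter(hit, {
  fieldA: "requestUri",
  extractA: /(.*)/, // assumed pattern: capture the whole Request URI
  fieldB: "hostname",
  extractB: /(.*)/, // assumed pattern: capture the whole hostname
  outputField: "requestUri",
  constructorPattern: "$B1$A1",
});

console.log(result.requestUri); // www.epikone.com/pages/index.html
```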

Warning
Modifying the Request URI field using an advanced filter can affect
other settings in Google Analytics, most notably goal settings. (Goals
can be calculated using the Request URI.) So, if an advanced filter
changes the Request URI, make sure to check the effect on your goal
settings.

Figure 13. Advanced filters have two filter fields, Field A and Field
B; a portion of each filter field can be extracted using regular
expressions

There are two other settings specific to an Advanced filter: Field A
Required and Field B Required. These settings control the logic of an
Advanced filter. When you set either one of these options to Yes,
Google Analytics will place some constraints on when the filter takes
action. If Field A does not match the pattern in Extract A, then the
filter will not "execute." The same goes for Field B.

Here's one way an Advanced filter can be used to gain more insight
into the actions of visitors. In some applications it may be useful to
identify which organizations are consuming content on your web site.
Maybe it's a competitor or potential client. An Advanced filter can be
used to attach the ISP or network name to the Request URI. This will
display the name of the organization's network along with the page
requested. Figure 14 shows the settings for such a filter.

The above filter would modify the Constructor (Request URI in this
case) by adding the visitor's ISP or organization's name to the
Request URI. So, all reports created using the Request URI would be
modified. The result of this filter is shown in Figure 15.

Now, let's take the above example one step further. Let's say we'd
also like to see the keyword the visitor used to find the web site
along with the network name.
Figure 14. Filter to concatenate the visitor's ISP organization
and Request URI
We can use a series of filters, passing data from one filter to the
next, to modify a constructor.

The first filter would be an Advanced filter to add the keyword to the
ISP or network name. While this filter is similar to the previous
filter, there is one main difference. The constructor used in this
filter, shown in Figure 16, is Custom Value 1. This field is a
temporary field that is not used by any reports. It is meant to be
used when you need to pass data from one filter to another.

Figure 15. The Top Content report displaying the modified Request URI

Figure 16. Advanced filter that stores data in a temporary variable

The second filter, shown in Figure 17, must add the Request URI to the
value we previously created and stored in Custom Value 1. I've used a
comma in the constructor to separate the Campaign Term from the
Visitor ISP Organization. I like using the comma to separate values
because if I export the data from Google Analytics into Excel, I can
easily import it as a comma-separated file, and Excel will place each
value in a new column.

I've chosen to use the "Visitor Java Enabled?" field as the
constructor for the second filter. The reason is that I want to place
the data in a field that is not used by any "major" reports. If I used
the Request URI as the constructor, then all of the reports that are
based on the Request URI would break. So, by using "Visitor Java
Enabled?", I break only the Java Enabled report. In my experience,
this report is not used very often and can be sacrificed. If you use
the Java Enabled report, then consider using a different field for the
constructor or create an additional profile for this filter.

Remember, for the above filters to work correctly, they must be in a
specific order: the filter that writes to Custom Value 1 must run
before the filter that reads from it.

Figure 17. Advanced filter that modifies a value stored in a temporary
variable

Google Analytics does not limit the number of extracts for each field.
Multiple parts of Field A and Field B can be captured. If more than
one part of a field is captured, then Google Analytics will retain
multiple variables for that Extract field. Figure 18 shows multiple
values extracted from a filter field.

The above filter will capture all of Field A (the entire hostname) and
two parts of Field B. The first part of Field B that will be captured
is "v" followed by any character. The second extract from Field B will
be everything after "/23/".

Then, the filter will combine the hostname with the two extracts from
Field B. It will separate the values in $A1, $B1, and $B2 with
slashes. The characters entered in the Output To field are literal
characters, meaning that they appear exactly as you type them, except
for those that begin with $A or $B.

Predefined filters

Before we conclude our discussion on filters, I should mention that
Google Analytics has three predefined filters. These filters are the
three most commonly used filters. Google has simplified the
implementation by removing the filter field and filter pattern. With a
predefined filter, just choose the filter type and enter the
appropriate data into the form. Figure 19 lists the three predefined
filters.

The "Exclude all traffic from a domain" filter uses a reverse lookup
to identify the domain of the site visitors. The visitor domain
excluded is the domain associated with the visitor's IP address.

"Exclude all traffic from an IP address" removes all data coming from
the address entered into the filter pattern. This filter is primarily
used to exclude traffic from internal company resources.

The "Include only traffic to a subdirectory" filter isolates data for
a specific directory on the web site. This filter is usually used to
create profiles that focus on one part of the web site.
Figure 18. Multiple extracts using an Advanced filter

Goals and Funnels

Another common profile configuration is the creation of goals and
funnels. While it is not necessary to create any goals or funnels, it
is highly recommended. Google Analytics automatically segments report
data and displays the goal conversion rate for each line item in the
report. So, if you're looking at a report containing keywords, Google
Analytics will display the goal conversion rate for each keyword. This
type of segmentation is extremely valuable when analyzing data but is
only possible if you set up goals for your profiles.

Goals

Google Analytics goals are a way to measure conversion activities on
your web site. A goal is simply a pageview that indicates the visitor
has completed some type of high-value process. This process could be
filling out a contact form, purchasing a product, or downloading a
file. Each process usually concludes with some type of "thank you"
page. In Google Analytics, this is called the goal page.

A goal is defined by the URL of the goal page. As Google Analytics
processes site data, it increments the goal counter each time a goal
page is found. If the goal page is found multiple times during a
single visit, the goal counter is incremented only once. This is
important because it means that a visitor can convert only once during
a visit.
Figure 19. Predefined filters
There are multiple ways to define a goal, depending on the complexity
of your web site. The easiest way to create a goal is to paste the URL
of your goal page into the Goal URL field. Figure 20 shows the goal
setup form and the Goal URL field.

So, if your checkout process ends with
http://www.epikone.com/thankyou.php, enter
"http://www.epikone.com/thankyou.php" in the Goal URL field.

If the URL of the goal page is
http://www.epikone.com/thankyou.php?submit=true then enter
http://www.epikone.com/thankyou.php?submit=true into the Goal URL
field.

A goal can also be defined using a regular expression. Rather than
enter an exact URL in the Goal URL field, you can enter a regular
expression. This is particularly helpful if the web site is dynamic
and contains query-string parameters that may differ from one visitor
to the next. If the goal page contains a unique identifier, then you
can't copy and paste a URL into the Goal URL field; every goal URL
will be different. You'll need to use a regular expression for the
Goal URL. I'll discuss this later in the "Additional Settings"
section.

The Goal setup form also includes a field for "Goal name." The "Goal
name" will be used to identify the goal in the reports. The Activate
Goal setting is an on-off switch. Switching the setting to Off will
stop tracking for the goal. Why would you want to turn a goal off?
Google Analytics will calculate an overall web site conversion rate
using all of the goals you define for the site. If you create a goal
that is temporary, say for a specific campaign, then it could
artificially skew the overall site conversion rate if you leave the
goal on after the campaign ends.
Figure 20. Paste the URL for a goal page in the Goal URL text
field
Note
In reality, Google Analytics uses only the Request URI when
calculating goals. So, even if you specify the entire URL as a goal
page, Google Analytics will use only the Request URI. This also means
that if you modify the Request URI using a filter (like an Uppercase
filter or Lowercase filter), you may need to change your Goal URL.

For example, if a goal is defined as /pages/html/thankyou.html, but an
advanced filter has been applied to the profile and changes the
Request URI to /pages/thankyou.html, then the goal will not work.
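The relationship between a full URL and the Request URI can be sketched as follows. This is a minimal illustration, assuming the Request URI is the path plus the query string. The helper toRequestUri is hypothetical and not part of Google Analytics.

```javascript
// Hypothetical helper: keep only the path and query string of a URL.
function toRequestUri(fullUrl) {
  const url = new URL(fullUrl);
  return url.pathname + url.search;
}

const goalUrl = "http://www.epikone.com/thankyou.php?submit=true";
console.log(toRequestUri(goalUrl)); // /thankyou.php?submit=true

// If a filter rewrites the Request URI, a goal defined on the old value
// no longer matches the processed data.
const goal = "/pages/html/thankyou.html";
const filteredUri = "/pages/thankyou.html";
console.log(goal === filteredUri); // false
```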

Funnels

A funnel is a series of predefined steps, or pages, that a visitor
must go through before reaching a goal. Not every goal will have an
associated funnel, so defining a funnel is optional. You should set up
a funnel if you have a predefined process that the visitor must go
through before reaching the goal. This could be as simple as
specifying the form used on a "Contact Us" page or as complicated as a
multistep checkout process. The funnel is an excellent way to
visualize problems in the conversion process.

Setting up a funnel is very similar to setting up a goal. Each step in
a funnel is a pageview. So, to create a funnel, paste the URL for each
page in your process into the setup form (Figure 21).
Figure 21. The funnel setup form
The "Required step" checkbox can affect the number of goal conversions
in the Funnel Visualization report. When selected, visitors who
complete the goal without starting at the first step in the defined
funnel will not be shown as completing the goal in the Funnel
Visualization report. However, the conversion will be recorded in
other conversion reports.

Warning
Google Analytics will "backfill" your predefined funnels. For example,
if you have a four-step funnel, and a visitor completes only the first
step and the final step, Google Analytics will go back and indicate
that the visitor actually hit every step in the funnel process.
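The effect of backfilling can be pictured with a small model. This is a sketch only: the funnel paths and the function backfill are hypothetical, and the rule used here (mark every step between the first and last step actually seen) is an assumption chosen to reproduce the four-step example above, not a description of the real processing logic.

```javascript
// Hypothetical four-step funnel ending in the goal page.
const funnel = ["/cart", "/shipping", "/billing", "/thankyou"];

// Mark which funnel steps are reported as visited. Steps lying between
// the first and last step actually seen are filled in.
function backfill(funnelSteps, pagesViewed) {
  const seen = funnelSteps.map((step) => pagesViewed.includes(step));
  const first = seen.indexOf(true);
  const last = seen.lastIndexOf(true);
  if (first === -1) return seen; // the visitor never entered the funnel
  return seen.map((hit, i) => hit || (i > first && i < last));
}

// The visitor saw only the first and final steps.
console.log(backfill(funnel, ["/cart", "/thankyou"]));
// [ true, true, true, true ]
```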

Additional Settings

Each goal/funnel has an "Additional Settings" section that can aid
configuration in unique situations. Any changes made in this section
will be applied to both the goal and the steps in the funnel. Figure
22 shows the options available in the "Additional Settings" section.

Figure 22. The additional settings for a goal and funnel

The "Case sensitive" setting can be used with web sites that have
mixed-case URLs. If the Goal URL, or any of the steps in your funnel,
are case-sensitive, check this checkbox. Remember, filters can affect
this setting. If an Uppercase or Lowercase custom filter has been
applied to the profile, then there will be no mixed-case URLs, and
this setting will be irrelevant.

The Match Type setting is a powerful setting that can facilitate goal
tracking. For example, if each goal page contains a unique customer
identifier, then it will be impossible to paste a single URL into the
Goal URL field. Because each URL will be unique, your web site will
not have a single URL that represents the goal page.

Google Analytics has three different match types that can be used to
match multiple URLs and resolve goal setup issues. When selected, each
match type will change how Google Analytics applies the value in the
Goal URL field to the data it processes.

Exact Match
The value in the Goal URL field must exactly match the URL in the
location bar of the visitor's browser.

Head Match
The Head Match setting can be used when a small part of the goal URL
differs from one visitor to another. When using a Head Match, the URL
in the visitor's browser must exactly match the value in the Goal URL.
However, if there is any additional data at the end of the visitor's
URL that does not appear in the Head Match value, the goal will still
count. The Head Match will match both path data and query-string
variables.

Regular Expression
This setting defines a goal using a regular expression rather than a
static URL. If the regular expression entered into the Goal URL
matches any part of the URL in the visitor's browser, then the goal is
counted. This includes subdomains, primary domains, path information,
and query-string variables. (To learn more about regular expressions,
see the "Regular Expressions" section in the "Reference" material.)

Warning
The "Case sensitive" and Match Type settings are applied to the values
in both the Goal URL and funnel steps. It is impossible to use a match
type of Exact Match for your funnel steps and a Regular Expression
match type for the Goal URL.

Tip
You can define goals and funnels for data created by urchinTracker().
Remember, if you pass a value to urchinTracker(), then that data
becomes a pageview in Google Analytics. These pageviews can then be
defined as goals by placing the value passed to urchinTracker() in the
Goal URL field.

The final option in the "Additional Settings" section is "Goal value."
Use this field to monetize non-e-commerce goals. For example, if each
contact form submitted by a user is worth $100, enter 100 in the "Goal
value" field. Google Analytics will use 100 to calculate return on
investment (ROI) and other revenue-based calculations. If e-commerce
tracking is active for a profile, and you would like to use e-commerce
data for your goals, simply leave this field blank. Google Analytics
will insert a zero and use the value from e-commerce transactions as
the goal value.

You should really try to monetize your non-e-commerce goal values. The
reason is that Google Analytics will use the goal value to calculate
an $Index value. This is a metric that indicates how much each page on
your web site is worth. You can use the $Index value to determine
which content is most important to the conversion process. The $Index
value can be found in the Top Content report.

Using Regular Expressions to Extend Goals

Some users are frustrated by the four-goals-per-profile limit. With a
little creativity, the limit can be circumvented by tracking multiple
conversion activities in a single goal. This can be done with regular
expressions. Remember, the Goal URL can be a regular expression. This
means that multiple URLs on a web site can match the regular
expression defined as a goal.

Here's an example. The following two pages represent a conversion on a
site:

http://www.analyticstalk.com/blog/outbound/rss/google
http://www.analyticstalk.com/blog/outbound/rss/rss

The regular expression used for the Goal URL could be
/rss/(google|rss)$. Both URLs match the regular expression and would
count toward the goal tally. To drill down into the data and
differentiate which URL generated more goals, use the Goal
Verification report. This report, shown in Figure 23, segments a goal
by the different pages that contribute to it.

Figure 23. The Goal Verification report

Note
What goals should you configure for your web site? That depends on
your online business model. Remember, Google Analytics collects
business data, so the goals for an e-commerce business may be very
different from those for a lead-generation business. In general, goals
usually involve one of the following:

* Completing an e-commerce transaction
* Submitting a "Contact Us" form
* Subscribing to an email newsletter
* Viewing certain content on a web site

Common Web Site Configurations

This section of the Short Cut addresses common web site architectures
that can cause problems with the Google Analytics tracking. Remember,
the technology that Google Analytics uses to track visitors, called
page tagging, is based on JavaScript and cookies. So any web site
architectures, like multiple domain names, that affect cookies or
JavaScript can interfere with tracking. Most of the changes required
to deal with these web site configurations are made to your web site
and not to Google Analytics directly.

If your web site contains five static HTML pages, then it is very
likely that this section will not apply to you. However, if you have a
dynamic web site that crosses multiple domains and subdomains, then
this section will offer you valuable information about how, and why,
you should configure Google Analytics.

Dynamic Web Sites

A dynamic web site is one that uses query-string parameters, or
variables, to determine which content the visitor is consuming. As
discussed in the section on urchinTracker(), Google Analytics includes
query-string parameters when it creates a pageview. Table 3
illustrates how a URL in the browser's location bar would appear in a
Google Analytics report.

Table 3. How Google Analytics creates page "names"

URL in Browser:
http://www.mysite.com/dir/index.php?sess=1234&cat=3&prod=foo&var2=bar
Resulting URL in Google Analytics:
/dir/index.php?sess=1234&cat=3&prod=foo&var2=bar

URL in Browser:
http://www.mysite.com/dir/index.php?sess=4567&cat=6&prod=bar&var2=foo
Resulting URL in Google Analytics:
/dir/index.php?sess=4567&cat=6&prod=bar&var2=foo

However, not all query-string parameters are created equal. Some
query-string parameters indicate the content that a visitor is
viewing. These parameters are necessary for analysis. Other
query-string parameters are used by your web server or web application
and provide no insight into the visitor's actions or the content she
views. These variables are not needed and should be eliminated from
Google Analytics.

To configure Google Analytics to remove query-string parameters during
processing, simply list the unwanted parameters in the Exclude URL
Query Parameters field in the "Profile Information" section (see
Figure 24). List multiple query-string parameters as a comma-separated
list.

In the above example, the query-string variables sess and var2 would
be removed from Google Analytics during processing. Table 4 indicates
how a URL will appear after certain parameters have been excluded.

Table 4. How a URL looks after removing unnecessary query-string
parameters

URL in Google Analytics:
/dir/index.php?sess=1234&cat=3&prod=foo&var2=bar
Resulting value after excluding parameters:
/dir/index.php?cat=3&prod=foo

URL in Google Analytics:
/dir/index.php?sess=4567&cat=6&prod=bar&var2=foo
Resulting value after excluding parameters:
/dir/index.php?cat=6&prod=bar
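The transformation in Table 4 can be reproduced with a few lines of JavaScript. This is a sketch, not the Google Analytics implementation: the function excludeQueryParameters is hypothetical, and it assumes simple name=value pairs separated by ampersands.

```javascript
// Hypothetical helper: drop the listed query-string parameters from a
// Request URI and keep the rest in their original order.
function excludeQueryParameters(requestUri, excluded) {
  const [path, query] = requestUri.split("?");
  if (query === undefined) return requestUri; // nothing to remove

  const kept = query
    .split("&")
    .filter((pair) => !excluded.includes(pair.split("=")[0]));

  return kept.length > 0 ? path + "?" + kept.join("&") : path;
}

const excludedParameters = ["sess", "var2"];

console.log(
  excludeQueryParameters(
    "/dir/index.php?sess=1234&cat=3&prod=foo&var2=bar",
    excludedParameters
  )
); // /dir/index.php?cat=3&prod=foo

console.log(
  excludeQueryParameters(
    "/dir/index.php?sess=4567&cat=6&prod=bar&var2=foo",
    excludedParameters
  )
); // /dir/index.php?cat=6&prod=bar
```

Once the session value is gone, pageviews of the same category and product collapse into one line item regardless of who viewed them.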

Figure 24. Enter unwanted query-string variables in the Exclude URL
Query Parameters text box
Excluding query-string parameters from Google Analytics will affect
other parts of the application. Once a query-string variable has been
added to the list, it will be completely removed from the system. This
means that the parameter data will not be accessible via filters, goal
settings, or funnel settings.

So, if a filter utilizes a particular query-string variable, and the
variable is excluded, then the filter will break. This also holds true
for goal settings and funnel settings.

What parameters should you eliminate? Any parameter that does not
provide insight into what the visitor is doing, or what the visitor is
viewing, should be removed. How will you know which query-string
parameters to exclude and which ones to include? I've found the
easiest way is to let Google Analytics collect some data. Then use the
Top Content report to identify all query-string parameters. Create a
list of the parameters and check with your IT staff to learn what each
one means. This process is not easy, but it is important.

Warning
It is very common for web sites to use a query-string variable called
a session ID to identify each individual visitor. Session IDs are
unique strings that may appear in the query string of every page. A
session ID will make every pageview unique because each session ID is
unique. Session IDs should be eliminated from Google Analytics in the
method described above.

If every page comes through as unique, because of the unique session
ID, every page will have only one pageview. Only when you aggregate
the data, by removing the session ID from the query string, will you
fix the problem.

Some web sites add the session ID as a directory in the file path. In
this case, an Advanced filter must be used to restructure the Request
URI field. Please see the section "Advanced filters" for more
information.

As usual, changing the Exclude URL Query Parameters setting will not
affect data that has already been processed by Google Analytics. Only
data processed in the future will reflect this change.

Warning
It is against the Google Analytics Privacy Policy to store any
personally identifiable information in Google Analytics. If your web
site uses query-string parameters to pass personal information about
your visitors, then that information will be stored in Google
Analytics, thus violating the privacy policy. You must exclude all
query-string variables that may contain personally identifiable
information.

Tracking Across Multiple Domains

Google Analytics has the ability to track visitors across multiple
domains. This functionality is primarily used on web sites that have a
third-party shopping cart, but it can be used for other purposes. If
your web site traverses multiple domains, then you will want to track
your visitors as they move from one domain to another. If you do not
track them across domains, then each visitor will appear as a new
visitor (i.e., a new person) each time they move from one of your web
sites to another.

However, tracking across multiple domains should be implemented only
if there is some functional connection between the web sites. If there
is no business relationship between the sites, then there may not be a
need to track visitors between domains. Only you can decide if you
need to track visitors between multiple domains.

Critical to cross-domain tracking is the concept of first-party
cookies. First-party cookies are cookies whose domain is the same as
the web site that the visitor is currently visiting. For example,
cookies for a user visiting www.epikone.com have a domain of
epikone.com. Google Analytics uses first-party cookies. Therefore, the
GATC on www.epikone.com can interact only with cookies that have a
domain of epikone.com. If the visitor leaves www.epikone.com, as is
the case when a web site uses a third-party shopping cart, then the
tracking cookies cannot be accessed by the GATC on the shopping cart
pages.

How it works

When a visitor arrives at the web site for the first time, the Google
Analytics Tracking Code sets a number of cookies that uniquely
identify the visitor. No matter where the visitor goes on the web
site, he can always be identified by the cookies.

Things change if the visitor leaves the web site. The tracking cookies
are first-party cookies, which means they can be used only by the web
site that sets them. If the visitor leaves the site to use a shopping
cart located on a different domain, then the tracking cookies will no
longer work. There needs to be some mechanism to transfer the cookies,
along with the visitor, from one domain to another. This is shown in
Figure 25.
Google Analytics provides two functions to transfer the tracking
cookies between domains: __utmLinker() and __utmLinkPost(). Both
functions operate in the same manner. They extract the tracking cookie
values from the cookies and place the data in the destination-page URL
as query-string parameters. The tracking cookies in Figure 25, colored
green, are passed in the query string as the visitor moves from
epikone.com to cutroni.com. The name of each tracking cookie is
highlighted in orange.

When the visitor lands on cutroni.com, the GATC removes the cookie
values from the query string and resets the tracking cookies on
cutroni.com. When the process is complete, the visitor has two sets of
cookies with the same values. One set of cookies is for epikone.com,
and one set is for cutroni.com.

There are two critical conditions that must be met for this technique
to work:

* Both domains must have the GATC installed.
* The third-party domain must accept query-string parameters.

If the third-party domain does not meet both of these conditions, then
Google Analytics cannot track visitors from one domain to the other.
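The hand-off described above can be pictured with a simplified sketch. It does not reproduce __utmLinker() or __utmLinkPost(): the cookie names and values, the /cart.php path, and both helper functions are hypothetical, and the security key mentioned later is left out. It shows only the general idea of carrying cookie values in the query string and reading them back on the destination domain.

```javascript
// Placeholder cookie names and values, for illustration only.
const trackingCookies = { visitor: "1.2.3", campaign: "1.source" };

// On the first domain: append every tracking cookie value to the
// destination URL as a query-string parameter.
function addCookiesToUrl(destination, cookies) {
  const url = new URL(destination);
  for (const [name, value] of Object.entries(cookies)) {
    url.searchParams.set(name, value);
  }
  return url.toString();
}

// On the second domain: read the values back out of the query string
// so the same cookies can be set again there.
function readCookiesFromUrl(landingUrl, names) {
  const params = new URL(landingUrl).searchParams;
  const cookies = {};
  for (const name of names) {
    if (params.has(name)) cookies[name] = params.get(name);
  }
  return cookies;
}

const link = addCookiesToUrl("http://cutroni.com/cart.php", trackingCookies);
const restored = readCookiesFromUrl(link, Object.keys(trackingCookies));

console.log(link);
console.log(restored);
```

After the round trip, restored holds the same values as trackingCookies, which mirrors the "two sets of cookies with the same values" outcome.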

Figure 25. When a visitor moves from one domain to another, his
tracking cookies must move with him
Implementation

First, make sure the pages on both web sites have the GATC installed.
If the tracking code cannot be added to both web sites, then tracking
will not work. In addition to adding the tracking code, the pages must
be modified as follows:

    _uacct = "UA-xxxx-x";
    _udn = "none";
    _ulink = 1;
    urchinTracker();

The above modifications are necessary to change how the tracking code
interacts with, and configures, the tracking cookies. The _udn
variable determines the domain for the tracking cookies. Normally
Google Analytics uses the subdomain and primary domain in the location
bar of the browser for the cookie domain. By setting _udn to "none",
the tracking code uses the entire hostname for the cookie domain.

The _ulink variable is a switch that activates a security feature
within the tracking code. This feature creates a "key" that ensures
that the tracking variables are set with the same values on both
domains. You can see the key in the URL. It is stored in a
query-string parameter named __utmk.

Once the tracking code has been modified and installed, __utmLinker()
or __utmLinkPost() must be added to your web site. As mentioned above,
these functions extract the tracking cookie values and add them to the
destination URL as query-string parameters.

If the web site transfers the visitor between domains using standard
anchor tags, then __utmLinker() must be added as follows:
<a href="javascript:__utmLinker('http://www.example.com/');">Buy Now</a>

If a form is used to transfer the visitor between domains, then
__utmLinkPost() must be added to the necessary forms. Modify all
appropriate forms as follows:

<form action="http://www.example.com/form.cgi" onsubmit="__utmLinkPost(this)">

__utmLinkPost() changes the form action when the visitor submits the
form, by adding query-string parameters to the value in the action
attribute.

It is important to note that you may need to tag the links and forms
on both web sites. Why? Every page on Web Site A and Web Site B can
appear on a search engine results page, so every page is a potential
starting point for a visitor's visit. However, if there is no chance
that the visitor's visit will start on Web Site B and then move to
Web Site A, then there is no need to change the links or forms on Web
Site B.

Tip

Many people have been experimenting with DOM scripts to dynamically
call __utmLinker() or __utmLinkPost() rather than manually add the
functions to the HTML. While these scripts will work some of the
time, success has been inconsistent due to variations in the DOM from
one browser to another. Care should be taken when experimenting with
this customization.

Once the tracking code has been modified, and __utmLinker() or
__utmLinkPost() has been added, visitors will be tracked between
domains. To aid in reporting, it is a good idea to add a filter that
attaches the web site hostname to the request URI. This makes it
easier to identify common content on each domain. The settings for
the filter are shown in Figure 26.
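
The effect of such a filter on the report data can be sketched in a few lines. This is a simulation of the outcome only; the field names are illustrative and are not identifiers from the Google Analytics filter interface.

```javascript
// Sketch of what a "hostname + request URI" filter does to each
// pageview record. Field names here are illustrative assumptions.
function addHostnameToRequestUri(pageview) {
  return {
    hostname: pageview.hostname,
    requestUri: pageview.hostname + pageview.requestUri,
  };
}

const rawPageviews = [
  { hostname: "epikone.com", requestUri: "/index.html" },
  { hostname: "cutroni.com", requestUri: "/index.html" },
];

// Before filtering, both rows would be reported as "/index.html".
const filteredPageviews = rawPageviews.map(addHostnameToRequestUri);
console.log(filteredPageviews.map((p) => p.requestUri));
```

With the hostname attached, two pages that share a path on different domains show up as separate rows in the content reports.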

Tracking Across Multiple Subdomains

Like tracking across multiple domains, the primary issue with
tracking across multiple subdomains has to do with the cookie domain.
Figure 26. Advanced filter to add the hostname to the Request
URI
By default, the GATC includes the web site subdomain in the cookie
domain. This means that a cookie set by the GATC while the visitor is
visiting one subdomain cannot be utilized by the GATC on a different
subdomain. So, a visitor who visits multiple subdomains on a web site
will receive a different set of tracking cookies for each subdomain.
I've illustrated this issue in Table 5.

Table 5. How subdomains affect cookie domains

Domain             Cookie Domain        Can be accessed by
support.foo.com    .support.foo.com     support.foo.com only
secure.foo.com     .secure.foo.com      secure.foo.com only

To resolve this issue, the cookie domain must be consistent from one
subdomain to another. The subdomain must be removed from the cookie
domain. Once the subdomain is removed, the cookie can be accessed by
the GATC that appears on any subdomain, as illustrated in Table 6.

Table 6. Changing cookie domain enables tracking across different
subdomains

Domain             Cookie Domain    Can be accessed by
support.foo.com    .foo.com         support.foo.com or secure.foo.com
secure.foo.com     .foo.com         support.foo.com or secure.foo.com

The tracking cookie domain can be changed using the _udn variable. In
the default configuration, _udn is set to a value of "auto", causing
the subdomain to be included in the cookie domain.

_udn can be set to a specific value, which will in turn be used for
the cookie domain. So, setting _udn to the web site's primary domain
lets the tracking code access the cookies on various subdomains.
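
The rule behind Tables 5 and 6 can be checked with a small sketch. It models the general browser behavior that a page can read a cookie when its hostname equals the cookie domain or is a subdomain of it; the function name is made up, and real browsers apply further checks (path, expiry, and so on) that are left out here.

```javascript
// Simplified model of cookie-domain matching. canReadCookie is a
// hypothetical helper, not part of the tracking code.
function canReadCookie(pageHostname, cookieDomain) {
  const bareDomain = cookieDomain.replace(/^\./, ""); // drop leading dot
  return (
    pageHostname === bareDomain ||
    pageHostname.endsWith("." + bareDomain)
  );
}

// Default behavior (Table 5): the subdomain is part of the cookie domain.
console.log(canReadCookie("support.foo.com", ".support.foo.com"));
console.log(canReadCookie("secure.foo.com", ".support.foo.com"));

// With _udn set to the primary domain (Table 6).
console.log(canReadCookie("support.foo.com", ".foo.com"));
console.log(canReadCookie("secure.foo.com", ".foo.com"));
```

Only the second call returns false: a cookie scoped to one subdomain is invisible to the other, while a cookie scoped to the primary domain is visible to both.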

Implementation

You can configure Google Analytics to track visitors across multiple
subdomains with the following process:
1. Modify the tracking code to include the _udn variable.
2. Apply a filter to clarify Google Analytics reports.
3. Segment traffic into multiple profiles for improved reporting
   (this step is optional but recommended).

Begin by modifying the GATC to include the _udn variable. Set this
variable to the primary domain for the web site:

_uacct = "UA-XXXXX-X";
_udn = "primarydomain.com";
urchinTracker();

Once the tracking code has been modified and installed, you have to
add a filter to the appropriate profile (Figure 27). The filter will
differentiate pages that appear on multiple subdomains. For example,
the page index.html may appear on multiple subdomains but will appear
as index.html in the reports. Adding the hostname to the Request URI
will differentiate multiple versions of the same page.
Figure 27. An Advanced filter that concatenates the hostname and
the Request URI
The final step in configuring multiple subdomains is optional but
recommended. It is a good idea to create a separate profile for each
subdomain. This provides a greater level of reporting and more
insight into visitor actions on each subdomain.

Warning

This filter, or any filter that modifies the Request URI field, will
break the Site Overlay report. The reason is that the Site Overlay
report uses the Request URI to identify which links in the Site
Overlay report correspond to specific data (like clicks and visits).

To create the additional profiles, use an Include filter (Figure 28)
based on the Hostname field. When complete, there should be one main
profile that contains summary data for all subdomains and individual
profiles for each subdomain.
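
As a rough picture of what an Include filter on the Hostname field achieves, the sketch below keeps only the pageviews whose hostname matches a pattern. The record layout and function name are assumptions for illustration; the actual filter is configured in the Google Analytics interface, not in code.

```javascript
// Sketch: an Include filter keeps only rows whose hostname matches.
// includeByHostname and the record fields are hypothetical.
function includeByHostname(pageviews, hostnamePattern) {
  return pageviews.filter((p) => hostnamePattern.test(p.hostname));
}

const allSubdomainTraffic = [
  { hostname: "support.foo.com", requestUri: "/index.html" },
  { hostname: "secure.foo.com", requestUri: "/checkout.html" },
  { hostname: "support.foo.com", requestUri: "/faq.html" },
];

// Main profile: no filter, so it holds summary data for all subdomains.
const mainProfile = allSubdomainTraffic;

// Per-subdomain profile: escape the dots so they match literally.
const supportProfile = includeByHostname(
  allSubdomainTraffic,
  /^support\.foo\.com$/
);

console.log(mainProfile.length, supportProfile.length);
```

Anchoring the pattern with ^ and $ keeps a hostname such as notsupport.foo.com out of the support profile.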
Figure 28. An Include filter used to create a profile for a specific
subdomain

Tracking Across Multiple Domains with Multiple Subdomains

Tracking visitors across multiple primary domains, each of which
contains multiple subdomains, can be done. The key to a successful
implementation is making sure the Google Analytics tracking cookies
are set with the correct domain and that the cookies are passed
between the primary domains. There are three steps to configure this
type of tracking:
1. Modify tracking code on each subdomain and primary domain.
2. Modify links and forms on both sites to use __utmLinker() or
   __utmLinkPost().
3. Add a filter to clarify the data within reports.

Many of the GATC modifications for this configuration are similar to
the settings used in the multiple domains and multiple subdomain
tracking. _udn is used to remove the subdomain from the cookie
domain, thus making tracking across each subdomain possible. _ulink
is used to trigger certain actions in the tracking code necessary for
cross-domain tracking. Table 7 lists how the tracking code should be
modified for each domain.

Table 7. The GATC configuration for a web site that uses multiple
domains and multiple subdomains

Site 1 hostnames      Tracking code
secure.site1.com      _uacct = "UA-XXXXX-X";
products.site1.com    _uhash = "off";
                      _udn = "site1.com";
                      _ulink = 1;
                      urchinTracker();

Site 2 hostnames      Tracking code
secure.site2.com      _uacct = "UA-XXXXX-X";
support.site2.com     _uhash = "off";
                      _udn = "site2.com";
                      _ulink = 1;
                      urchinTracker();

The primary difference is the _uhash variable. _uhash creates a
unique hash (or numerical representation) of the domain name. This
number is then placed in the tracking cookies.

Originally _uhash was created to speed the processing of profile data
and to ensure the integrity of the tracking cookies. With
improvements to the tracking code and the server architecture of
Google Analytics, _uhash has become antiquated. However, it must be
set appropriately for tracking to work.
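
One way to picture why the hash has to be switched off for this configuration is the toy model below. Everything in it is an assumption made for illustration: the hash function is not the algorithm used by the tracking code, and the idea that the receiving site compares the number stamped in the cookie against its own is a simplification.

```javascript
// Toy stand-in for a domain hash. This is NOT the algorithm used by
// the real tracking code; it only shows that different domain names
// produce different numbers.
function toyDomainHash(domain) {
  let hash = 0;
  for (const ch of domain) {
    hash = (hash * 31 + ch.charCodeAt(0)) % 1000000007;
  }
  return hash;
}

// Assumption: with hashing on, cookies are stamped with a per-domain
// number; with hashing off, every domain uses the same fixed stamp.
function cookieStamp(domain, hashSetting) {
  return hashSetting === "off" ? 1 : toyDomainHash(domain);
}

// Assumption: a site only trusts cookies whose stamp matches its own.
function stampsMatch(originDomain, currentDomain, hashSetting) {
  return (
    cookieStamp(originDomain, hashSetting) ===
    cookieStamp(currentDomain, hashSetting)
  );
}

console.log(stampsMatch("site1.com", "site2.com", "on"));
console.log(stampsMatch("site1.com", "site2.com", "off"));
```

In this model the stamps differ between site1.com and site2.com while hashing is on, and agree once it is turned off on both sites.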
In addition to changing the tracking code, each site must be modified
to use __utmLinker() or __utmLinkPost(). Remember, these functions
pass the Google Analytics tracking cookies between the domains via
the query string. If the cookies are not passed between domains, then
the visitor's session will not be tracked between web sites.

Finally, to help clarify the data in your reports, use an Advanced
custom filter to attach the hostname to the Request URI.

To aid in analysis, it is wise to create separate profiles for each
primary domain and/or each subdomain. Do this by using a simple
Include filter based on the Hostname field. Sample filters can be
found in Figure 27 and Figure 28.

Theoretically, there is no limit to the number of primary domains or
subdomains that you can track a visitor across. However, it may be
impractical to track across more than two or three primary domains.

Frames and Iframes

You can use Google Analytics on sites that have frames. However, take
care during installation and configuration. The most common problem
with sites that use frames is that the original referral information
can become distorted. This can lead to problems when tracking online
marketing.

When implementing Google Analytics on a site that uses frames, make
sure that both the frameset and the pages within the frame are tagged
with the Google Analytics tracking code. If both pages are not
tagged, then Google Analytics does not track referral information
correctly.

A side effect of tagging both the frameset page and the pages within
the frameset is that there will be an artificially high number of
pageviews for some pages, specifically the frameset. If the frameset
page is not critical (i.e., it is simply a navigation menu or page
header), then consider removing it from the profile using an Exclude
filter.

Warning

If your web site uses frames, and the number of pageviews is a
critical metric for your business, be sure you filter your profile
data appropriately to ensure accurate metrics.

iframes

The effect of iframes on Google Analytics tracking is similar to that
of standard frames. If the outer page is not tagged with the Google
Analytics tracking code, then the original referral information will
be lost.

Another common issue with iframes is the use of third-party shopping
carts. If you are using a third-party shopping cart and embedding the
shopping cart pages in an iframe, it will be impossible to track the
visitor session from the originating web site to the shopping cart
web site. The reason is that the __utmLinker() and __utmLinkPost()
functions cannot be used in the SRC attribute of an iframe.

Warning

Some have tried to pass the Google Analytics tracking cookies
directly to the source of an iframe. This technique will not work.
When __utmLinker() or __utmLinkPost() execute, they create a hash, or
key, based on the values of the cookies. The hash is sent to the
third-party domain and used to check the accuracy of the values.

Marketing Campaign Tracking

Another important part of setting up Google Analytics correctly is
the identification of all URLs used in online marketing. Unlike other
configuration steps, marketing campaign tracking is not done in the
Google Analytics administrative interface or on your web site.
Marketing campaign tracking involves changing the links used in your
marketing activities. I'll discuss this more in a moment.

The reason why marketing campaign tracking is so important is that,
by default, Google Analytics places your visitors in three basic
referral segments:

organic
    Visitors who click on a search engine results page
referral
    Visitors who click on a link on some other web site
direct
    Visitors who go directly to your web site by typing the URL into
    their browsers

While these segments are useful, they do not identify paid marketing
activities. You want to measure paid marketing activities so you can
better understand if they're successful! This can only be done via
marketing campaign tracking.

How It Works

Marketing campaign tracking is based on the process of link tagging,
which is adding extra information to the destination URLs used in
your online marketing activities. The extra information is actually a
number of query-string parameters that describe the marketing
activity. I've illustrated how link tagging identifies your marketing
activity in Figure 29.

It all begins with the ad that the visitor sees (step #1). In this
example the ad is a paid search ad on Google AdWords. When the user
clicks on the ad, she is sent to a destination URL. Within the
destination URL there are additional query-string parameters that
Google Analytics uses to identify the ad (step #2).

When the visitor arrives on the web site landing page, the
urchinTracker() function begins to execute. It examines the URL in
the location bar and identifies the query-string parameters that mark
the URL as a campaign URL. urchinTracker() extracts the query-string
parameters (step #3). Then it splits the query-string parameters into
their name-value pairs, reformats them (step #4), and finally stores
them in the __utmz cookie (step #5). Because the values are now
stored in a cookie, any actions that the visitor performs can be
linked to the ad that drove them to the site.

Figure 29. How link tagging works

Let's dig a bit deeper and learn about the specific query-string
parameters used in link tagging. Table 8 shows how the tagged link in
Figure 29 was created.

Table 8. A destination URL before and after link tagging

Link Before Tagging
    http://www.google.com/analytics/
Link After Tagging
    http://www.google.com/analytics/?utm_source=google&utm_medium=CPC&utm_campaign=en&utm_term=google%20analytics

Parsing the tagged link above identifies the query-string parameters
used for identifying the ad. Table 9 identifies each parameter and
value.

Table 9. The name-value pairs extracted from a destination URL

Parameter       Value
utm_source      google
utm_medium      CPC
utm_campaign    en
utm_term        google%20analytics

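
The extraction summarized in Table 9 can be reproduced with a few lines of JavaScript. This is a minimal sketch using the standard URL API, not the logic inside urchinTracker(); the function name is made up. Note that the URL API decodes the value, so %20 comes back as a space.

```javascript
// Sketch: split a tagged destination URL into its campaign
// name-value pairs. extractCampaignParameters is a hypothetical name.
function extractCampaignParameters(link) {
  const campaignPairs = {};
  for (const [name, value] of new URL(link).searchParams) {
    if (name.startsWith("utm_")) {
      campaignPairs[name] = value; // value is already URL-decoded
    }
  }
  return campaignPairs;
}

const taggedLink =
  "http://www.google.com/analytics/?utm_source=google&utm_medium=CPC" +
  "&utm_campaign=en&utm_term=google%20analytics";

const extractedPairs = extractCampaignParameters(taggedLink);
console.log(extractedPairs);
```

Parameters that do not begin with utm_ are ignored, so existing query-string parameters on the landing page do not pollute the campaign data.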
Now I'll describe what each parameter actually represents:

utm_campaign
    The name of the marketing campaign. Think of this as a bucket. It
    holds all of the marketing activities in some bigger effort. For
    example, buying some keywords on Google, running some banner ads,
    and sending out an email blast may all be part of the marketing
    plan for some type of sale. These three activities, which are all
    part of the same campaign, can be grouped together for easy
    reporting.

utm_medium
    The medium is the mechanism, or how the message is delivered to
    the recipient. Some popular mediums are email, banner, and
    cost-per-click (CPC).

utm_source
    Think of the source as the "who." With whom are you partnering to
    distribute the message? If you're tagging CPC links, the source
    may be Google, Yahoo!, or MSN. If you're using banner ads, the
    source could be the name of the web site where the banner ad is
    displayed.

utm_term
    The search term or keyword that the visitor entered into the
    search engine. This value is automatically set for organic links
    but must be set manually for CPC links.

utm_content
    The version of the ad. This is used for A/B testing. You can
    identify two versions of the same ad using this variable. This
    parameter is not included in Figure 29.

It's important to note that not all parameters are required. The core
parameters are utm_campaign, utm_source, and utm_medium. These three
should always be used when tagging a marketing link. utm_term should
be used for tracking paid search advertising, and utm_content can be
used for A/B testing advertising.

You determine the value for each parameter. In reality it does not
matter what value you use. Whatever data you do use will appear in
Google Analytics. However, it is important to follow some basic
guidelines:
• Keep the value short.
• Use alphanumeric characters and avoid white spaces.
• Make sure the value is understandable to you, or to whoever uses
  these reports.

The value of each parameter will be imported directly into the Google
Analytics reports. This is very powerful. Google Analytics is
importing information that is specific to your business, like the
name of a marketing campaign, and segmenting your data based on the
values. It displays the data, exactly as it appears in the
query-string parameter, in a series of reports. These reports segment
web site traffic and conversions, thus providing insight into which
marketing activities are working. Figure 30, the Campaign Report,
shows how Google Analytics segments visitation data based on a
marketing campaign.

Warning

All cost-per-click links that are not tagged will be categorized as
"organic." This can artificially inflate organic traffic volume,
leading to incorrect analysis. If you're using Google AdWords it is
highly recommended that Auto Tagging be enabled. If other paid search
systems are used, like Yahoo! Search Marketing or Microsoft AdCenter,
then the destination URLs must be manually tagged. This is absolutely
vital to configuring Google Analytics correctly.

How to tag links

The process of link tagging is simple. Start by identifying the
marketing information to be placed in the query-string parameters.
Specifically, you need to identify the campaigns, mediums, sources,
and potentially keyword and content values. Remember, the keyword
parameter is used only for tracking search-based ads, and the content
parameter is used to identify different variations of an ad. I
recommend using some type of spreadsheet to organize the information.

Once all the parameter values have been identified, modify the
destination URLs to include the parameters and values. Place a
question mark at the end of the destination URL followed by the
query-string parameters. Separate each name-value pair using an
ampersand (&).

If the destination URL already has query-string parameters, simply
add the Google Analytics parameters at the end of the URL. Separate
the Google Analytics parameters from the existing parameters using an
ampersand (&).
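
The tagging rules above are mechanical enough to script, which helps when a spreadsheet holds many links. The following is a minimal sketch; tagLink is a hypothetical helper, and the second destination URL is an invented example.

```javascript
// Sketch: append campaign parameters to a destination URL, using "?"
// when the URL has no query string yet and "&" when it already does.
// tagLink is a hypothetical helper name.
function tagLink(destinationUrl, campaignValues) {
  const pairs = Object.entries(campaignValues).map(
    ([name, value]) => name + "=" + encodeURIComponent(value)
  );
  const separator = destinationUrl.includes("?") ? "&" : "?";
  return destinationUrl + separator + pairs.join("&");
}

const campaignValues = {
  utm_source: "google",
  utm_medium: "CPC",
  utm_campaign: "en",
  utm_term: "google analytics",
};

// A URL without existing parameters (the link from Table 8).
const taggedPlain = tagLink("http://www.google.com/analytics/", campaignValues);

// A URL that already carries a query string (invented example).
const taggedExisting = tagLink("http://www.example.com/product?id=42", campaignValues);

console.log(taggedPlain);
console.log(taggedExisting);
```

encodeURIComponent turns the space in the keyword into %20, so the first result matches the tagged link shown in Table 8.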

Warning

If your web site uses redirects on the landing pages, then there may
be trouble with link tagging. The Google Analytics campaign tracking
parameters must be present in the URL of the landing page. If the URL
does not physically contain the tracking parameters, then the visit
will not be attributed to the correct ad.

Figure 30. The Campaign Report automatically segments site data based
on the utm_campaign value

Link tagging works for any destination URL. So, if you are sending
out emails or using banner ads, you should be tagging the destination
URLs. In general, anytime you pay for advertising on the web, you
should try to tag the URL used in the ad.

Tip

Some destination URLs, especially those used in email marketing, can
be very long before the addition of the Google Analytics campaign
tags. One trick is to create a custom URL on your web site and direct
all traffic from the email to the custom URL. Then, when a visitor
lands on the custom URL, you dynamically append the campaign tracking
variables to the URL. This can be done using application-level code
or with a simple HTML META refresh.

More information about dynamically tagging campaign URLs can be found
in this Conversion University article:
http://www.google.com/analytics/cu/tt_offline_campaigns.html

Understanding Conversion Referrals

Visitor campaign information is stored in the __utmz cookie on the
visitor's machine. This cookie not only stores campaign information,
but also all referral information including organic referrals, tagged
campaign links, untagged referral links, and direct visits.

Each time a visitor visits the web site, the urchinTracker() function
updates the __utmz cookie with the appropriate campaign information.
When the cookie is updated, Google Analytics discards the previous
campaign information. As a result Google Analytics tracks only the
current campaign information, not previous campaign information.

With that said, there is a hierarchy of data importance that Google
Analytics references before it updates the __utmz cookie and
overwrites the referral information. Remember, Google Analytics
buckets traffic in four basic ways:

Campaigns
    Links that are tagged with campaign information
Referrals
    Visitors who click on an untagged link residing on a web page
Direct
    Visitors who type the URL directly into the browser
Organic
    Visitors who click on an organic search result

Here is how Google Analytics updates the campaign-tracking cookie
based on the referrer:
• Direct traffic is always overwritten by referrals, organic links,
  and tagged links.
• Referrals, organic links, and tagged links always override existing
  campaign information.

For example, a user may visit a site via a tagged link in a
newsletter. When the visitor leaves the site, the campaign-tracking
cookie persists for six months and indicates that the visitor arrived
via the newsletter.

The same visitor decides to come back to the site one day later and
types the web site URL into the browser. This is a direct visit. The
campaign cookie will still indicate that the visitor arrived via the
newsletter because the second visit was a direct visit, and direct
traffic does not overwrite existing campaign information.

One day later, the visitor clicks on a tagged CPC link. The __utmz
cookie is updated to indicate the visitor clicked on a paid search
link, and the visit is attributed to the CPC link.

Note

The timeout value for the campaign cookie can be changed. By default
it is set to six months. This value can be altered by changing the
_ucto variable in the GATC.

Google Analytics can be configured to retain the original campaign
data stored in the __utmz cookie. To enable this feature, add an
additional query-string parameter to a destination URL. The
query-string parameter, utm_nooverride=1, will alert Google Analytics
that the existing campaign information should be retained.

While helpful, this technique does not prevent the GATC from updating
the campaign cookie if a visitor arrives by organic search or
untagged referral link. This technique is helpful only in preventing
tagged campaign links from overwriting previous referral information.

Tracking AdWords

The Google Analytics and Google AdWords systems are connected. To
take advantage of the interconnectivity, the AdWords account must be
linked to the Analytics account. If the accounts are connected, there
are two primary features that become enabled: Auto Tagging and Apply
Cost Data.

Auto Tagging automates the link-tagging process that is usually used
to track a CPC campaign. When Auto Tagging is enabled, as shown in
Figure 31, Google AdWords automatically adds a query-string parameter
to the destination URL that identifies Google AdWords as the
referring site. While this parameter is different than the standard
link-tagging parameters, it does the same thing. The query-string
parameter is named gclid and contains a random value.

The second benefit of linking an AdWords account to an Analytics
account is the Apply Cost Data feature (Figure 32). If you enable
this option, Google Analytics imports your AdWords cost data and uses
it in ROI (Return On Investment) and other calculations. It is
recommended that the Apply Cost Data feature be activated as it
provides an extremely powerful view of how AdWords ad campaigns are
performing.
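
To make the ROI idea concrete, here is a small worked example. It uses the common definition of return on investment, (revenue - cost) / cost, and assumes the reports compute something equivalent; the campaign names and figures are invented.

```javascript
// Sketch: ROI from imported cost data, using the common definition
// (revenue - cost) / cost. Campaign names and numbers are made up.
function returnOnInvestment(revenue, cost) {
  if (cost === 0) {
    return null; // ROI is undefined when nothing was spent
  }
  return (revenue - cost) / cost;
}

const adwordsCampaigns = [
  { name: "spring-sale", cost: 200, revenue: 500 },
  { name: "brand-terms", cost: 100, revenue: 80 },
];

const roiByCampaign = adwordsCampaigns.map((c) => ({
  name: c.name,
  roi: returnOnInvestment(c.revenue, c.cost),
}));

console.log(roiByCampaign);
```

Here the first campaign returns 1.5 (150 percent) and the second returns -0.2, a loss, which is the kind of comparison cost data makes possible.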
Figure 31. The Auto Tagging feature is activated from the Account
preferences section of the My Account tab

Warning

When Apply Cost Data is activated, cost data from the entire AdWords
account is applied to each profile in the Analytics account. Google
Analytics does not match the campaigns in AdWords to the profiles in
Analytics. This means if you are managing AdWords campaigns for
multiple web sites and you link the AdWords account to Analytics,
cost data for the entire AdWords account is applied to each Analytics
profile, even though each profile contains data for a single web
site. This is especially problematic if you try to link a
Client Center Ad