1520-9202/13/$31.00 © 2013 IEEE
Published by the IEEE Computer Society
computer.org/ITPro
Leveraging Big Data
Rhoda C. Joseph, Pennsylvania State University Harrisburg
Norman A. Johnson, University of Houston
Can big data and associated analytics help e-government evolve
into transformational government? The authors discuss how
government agencies, such as the US Department of Veterans Affairs,
can extract and analyze big data to more efficiently and
effectively deliver services.
Big data can improve decision making and increase organizational
efficiency and effectiveness, but only if organizations employ a
variety of analytical tools and methods to make sense of the data.
For example, descriptive analytics produces standard reports, ad hoc
reports, and alerts; predictive analytics is concerned with
forecasting and statistical modeling; and prescriptive analytics
focuses on optimization and randomized testing.[1] Furthermore,
organizations must know how to apply the statistical results to
improve organizational functions.[2] Profit motives make it urgent
for companies in the private sector to learn how to leverage such
data, but what about the public sector?
In particular, government services could be greatly improved through
the use of big data and associated analytics. Here, we describe some
of the drivers and barriers affecting the use of big data in
e-government. Big data can increase e-government efficiency and
effectiveness, helping it evolve into transformational government
(t-government), which is viewed as the ultimate evolutionary stage of
e-government.[3] We illustrate this potential using data from the US
Department of Veterans Affairs (VA).
Big Data in Government
The US federal government comprises three branches (executive,
legislative, and judicial), all of which generate large volumes of
data. Here, we focus on data in the executive branch. Table 1 lists
the 15 federal executive departments and some sources generating big
data for these departments.[4]
Big Data and Transformational Government
IT Pro, November/December 2013
For example, the VA's 312,878 employees are producing data.
Furthermore, access to the VA via its e-government platform generates
large volumes of data as well, with 2,700,000 users visiting the VA
website each month. Collecting data (such as the length of time spent
on a page, frequency of visits, transactions completed per visit, and
so on) drives the need for Web analytics.
Exploiting Web Analytics
Web analytics provides a comprehensive view of online users'
interactions with an organization. On business websites, less than 2
percent of visitors make purchases,[5] but all of the visitors
contribute information, as these sites gather data on such things as
visitors' landing points, page views, time on page, abandoned
shopping carts, downloads, and searches. Web-based companies, such as
Alexa, Compete, Comscore, Google, and Quantcast, also provide a
variety of Web metrics, including site traffic, global and local
rankings, and site reputation, all of which are useful data for Web
analytics.
Similar types of information should be gathered for government
websites that offer services. Although such websites typically have
much less traffic than commercial sites, we can still use Web
analytics to extract enough information from the collected data to
improve customer service. By analyzing which information is being
consumed, we can develop metrics for monitoring the effectiveness of
e-government platforms.
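As a minimal sketch of such metrics, the Python snippet below summarizes a small page-view log into average time on page, visits per visitor, and transactions completed per visit. The record layout, field names, and values are assumptions made up for illustration; they do not describe the schema of any real government analytics system.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical page-view records; every field name and value is illustrative.
PAGE_VIEWS = [
    {"visitor": "a", "visit": 1, "seconds_on_page": 40, "transaction_done": False},
    {"visitor": "a", "visit": 1, "seconds_on_page": 200, "transaction_done": True},
    {"visitor": "a", "visit": 2, "seconds_on_page": 30, "transaction_done": False},
    {"visitor": "b", "visit": 1, "seconds_on_page": 50, "transaction_done": False},
]


def summarize(page_views):
    """Return simple service metrics computed from raw page-view records."""
    transactions = defaultdict(int)         # (visitor, visit) -> completed transactions
    visits_per_visitor = defaultdict(set)   # visitor -> distinct visit numbers
    for view in page_views:
        key = (view["visitor"], view["visit"])
        transactions[key] += int(view["transaction_done"])
        visits_per_visitor[view["visitor"]].add(view["visit"])
    return {
        "avg_seconds_on_page": mean(v["seconds_on_page"] for v in page_views),
        "avg_visits_per_visitor": mean(len(s) for s in visits_per_visitor.values()),
        "transactions_per_visit": sum(transactions.values()) / len(transactions),
    }


metrics = summarize(PAGE_VIEWS)
print(metrics)
```

Tracking how these numbers move after a site change is one simple way to turn raw traffic data into an effectiveness measure.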
The Push for T-Government
With e-government, government institutions are adopting information
and communications technologies (ICTs) to improve public sector
services.[6] Transforming government services using ICTs can be a
complex and costly task, but it has the potential for significant
returns on the investment.[7] Successful t-government can lead to
product innovation, improved customer service delivery, better
management of the public sector infrastructure, and lower operational
costs.[8]
T-government can occur through
• aggregation: forming cooperative alliances with other government
  agencies to facilitate information flow;
• syndication: exploiting core competencies to improve service
  delivery to citizens or other agencies;
• consumption: engaging different entities, such as universities and
  businesses, as external data sources; or
Table 1. Federal government executive departments and some of their
sources of big data.

                                                 Sources of big data
Department                         Website       Employees   Monthly site visitors*  Sites linking in*
                                                 (2010)[4]   (Quantcast.com)         (Alexa.com)
1. Agriculture                     usda.gov      98,235      2,800,000               69,658
2. Commerce                        commerce.gov  45,348      7,700                   42,281
3. Defense                         defense.gov   771,614     441,100                 10,237
4. Education                       ed.gov        4,611       2,900,000               53,467
5. Energy                          energy.gov    16,651      501,400                 24,334
6. Health and Human Services       hhs.gov       83,745      173,200                 35,950
7. Housing and Urban Development   hud.gov       9,818       789,700                 32,376
8. Homeland Security               dhs.gov       191,197     680,600                 21,280
9. Interior                        doi.gov       72,168      41,100                  6,550
10. Justice                        justice.gov   118,104     446,000                 19,488
11. Labor                          dol.gov       16,554      547,400                 23,760
12. State                          state.gov     12,086      1,800,000               61,586
13. Transportation                 dot.gov       58,189      841,300                 36,432
14. Treasury                       treasury.gov  112,541     308,100                 8,053
15. Veterans Affairs               va.gov        312,878     2,700,000               33,798

*Data collected in April 2013
• co-creation: innovating with external partners to improve processes
  and the management of intellectual property.[8]
There are many benefits to t-government, but it will encounter many
challenges associated with technology implementation and usage,[3]
including limited access due to the digital divide, limited
availability of skilled labor, high costs, and integration issues
with legacy systems, all coupled with organizational and political
vagaries. One path forward is to migrate data to alternate platforms,
such as a cloud architecture, and analyze the big data sets to
improve decision making.
Big Data Consumption
There's a performance gap when e-government websites are compared to
e-commerce websites, with e-government sites lagging behind.[9]
Developments in the private sector can provide benchmarks and best
practices for public sector entities to emulate. As governments move
to extract value from big data, they risk continuing to lag behind
their private-sector counterparts. This lag in technological
advancement also creates another important issue: the simultaneous
overproduction and underconsumption of data in the government sector.
This raises the question of how we might explain both the potential
value and the inherent challenges associated with such a situation.
Overproduction and Underconsumption
The economic theory of overproduction and underconsumption helps
frame both the potential and the challenges of using big data
analysis to improve the federal government. Overproduction refers to
the instability of distribution of production from time to time and
inequality of distribution of productive activity from place to
place.[10] The accrual of big data can result in a surplus of
information that's underconsumed in the e-government domain. This
overproduction, accompanied by sustained underconsumption of a
physical good, would normally result in a crisis situation,
characterized by decreased pricing and a devaluation of the product.
However, big data presents an anomaly to this economic rationale.
In this case, because of the lower level of consumption, more
unconsumed data accrues. As more data is produced, instead of the
data losing value, there's increased value in the potential insight
that can be gleaned from analyzing such data. By transforming big
data into useful knowledge and quantifiable metrics through
appropriate analysis, this potential profit from the overproduction
of data can be accessed and the rates of data production and
consumption brought closer together. Even though the consumption of
big data might continue to lag behind the rate of production, the
value can continue to increase because of opportunities for analysis,
and this will affect how the information is technically processed and
organizationally managed.
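The accrual of unconsumed data can be illustrated with a simple stock-and-flow calculation. The sketch below is only a toy model under stated assumptions: production and consumption are constant per period, the units are arbitrary, and the function name is hypothetical.

```python
def unconsumed_over_time(produced_per_period, consumed_per_period, periods):
    """Running stock of unanalyzed data when production outpaces consumption."""
    stock, history = 0, []
    for _ in range(periods):
        # The stock grows by whatever is produced but not consumed; it cannot go negative.
        stock = max(0, stock + produced_per_period - consumed_per_period)
        history.append(stock)
    return history


# Hypothetical rates in arbitrary units per period.
slow_consumption = unconsumed_over_time(100, 60, 5)
faster_consumption = unconsumed_over_time(100, 90, 5)
print(slow_consumption)
print(faster_consumption)
```

Raising the consumption rate narrows the gap but does not close it while production stays higher, which matches the point that consumption can lag production even as analysis capacity improves.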
US Department of Veterans Affairs
For example, consider the US Department of Veterans Affairs (VA). The
recent winding down of wars in Iraq and Afghanistan has produced an
influx of returning veterans requiring different levels of service,
and there are often long delays associated with processing their
benefits.
The VA's collection of pending claims grew from 400,000 in 2009 to
880,000 in 2012,[11] and disability claims more than 125 days old
increased from 180,000 in 2010 to 594,000 by the end of 2012.[12]
Internally, there's an overproduction of data, coupled with an
underconsumption in the VA's processing units. As veterans continue
to submit data to the VA, there's an opportunity to reduce the
apparent processing backlog through big data analytics. Using
predictive and prescriptive analytics, new metrics can be created to
improve processing of each new claim, thereby creating economies of
scale.
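To put the backlog figures in perspective, the short calculation below derives overall growth factors and an implied average annual growth rate from the counts cited above. Treating the intervals as whole years (three and two, respectively) is a simplifying assumption, and the function names are illustrative.

```python
def growth_factor(start, end):
    """Ratio of the final count to the initial count."""
    return end / start


def annual_growth_rate(start, end, years):
    """Compound annual growth rate implied by two counts `years` apart."""
    return growth_factor(start, end) ** (1 / years) - 1


# Counts quoted in the text; the year spans are simplifying assumptions.
pending = {"start": 400_000, "end": 880_000, "years": 3}   # 2009 to 2012
aged = {"start": 180_000, "end": 594_000, "years": 2}      # 2010 to end of 2012

for label, series in (("pending claims", pending), ("claims over 125 days old", aged)):
    factor = growth_factor(series["start"], series["end"])
    rate = annual_growth_rate(series["start"], series["end"], series["years"])
    print(f"{label}: x{factor:.2f} overall, about {rate:.0%} per year")
```

Under these assumptions, the aged claims grew faster than the total number of pending claims, which is the pattern a processing bottleneck would produce.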
One other immediate challenge facing data management in the VA is the
proliferation of paper-based processing. This surplus of data would
immediately benefit from automation and the redesign of some internal
functions. The analysis of big data can create radical change in the
VA's
technical processing of claims. As more big data is analyzed,
patterns will arise that can clearly show correlations among
different segments of the veteran population. Big data analytics
can also drive more collaboration between the US Department of
Defense (DoD) and the VA.
For example, the VA aims to increase the number of fully developed
claims arriving at the VA from 3 to 20 percent, primarily by having
the DoD provide electronic access to all service and personnel
records of departing Active Duty, National Guard, and Reserve Service
members.[12] A fully developed claim contains all pertinent
information from the DoD, such as service records and entrance and
exit exams.
Efficient and Effective Services
As governments become more efficient, the amount of time and effort
needed to complete a task will decrease. Outside of government,
data-driven companies outperformed their competition and were, on
average, 5 percent more productive and 6 percent more
profitable.[13] For example, the retailer Sears used big data
analytics to reduce its advertising promotion development time from
eight weeks to less than one week.[13]
If the relevant internal entities can gather and appropriately
analyze big data, it will reduce the time required to produce reports
and run additional, more specific kinds of analytics. Furthermore,
the amount of effort required to process tasks should also be reduced
through big data consumption and analysis. Big data can also provide
specific metrics for measuring outcomes in the e-government domain.
We propose using big data to increase government efficiency by
automating and redesigning data analysis processes. We also argue
that big data analysis can increase government effectiveness through
data segmentation and information transparency (see Figure 1).
Automation
Automation is one of the cornerstones of implementing ICTs in
business and government. It can streamline big data and support its
analysis by targeting bottlenecks. For example, in the VA case, a
backlog of paper-filed cases reveals an opportunity for data
automation. By moving paper filing to electronic-based filing, some
data will be automatically processed, thereby reducing the amount of
time required to process a claim.
Redesign
Through the use of big data analytics such as genetic algorithms,
regression analysis, and sentiment analysis tools, processes can be
redesigned.[14] (Regression analysis examines relationships among
well-defined variables, and sentiment analysis seeks to extract
polarity, positive or negative, from opinion statements.) These types
of analytical tools can also improve service delivery by helping
employees better understand the needs of their customers. For
example, the Internal Revenue Service (IRS) has redesigned tax filing
processes to incorporate big data analytics to improve fraud
detection and discover noncompliance.[15] As analysis of big data and
the judicious use of the information derived from it increase across
different federal departments and other government agencies, less
time and effort will be needed to process transactions.
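The two analytical tools described in the parenthetical above can be sketched in a few lines of Python. This is a toy illustration, not a description of any agency's tooling: the least-squares fit uses made-up data, and the polarity function relies on a tiny hand-picked word list, whereas real sentiment analysis tools use far richer models.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return y_bar - slope * x_bar, slope


# Toy polarity lexicon; the word lists are illustrative assumptions.
POSITIVE = {"fast", "helpful", "clear", "easy"}
NEGATIVE = {"slow", "confusing", "lost", "delay"}


def polarity(statement):
    """Classify an opinion statement by counting lexicon hits."""
    words = [w.strip(".,!?").lower() for w in statement.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


# Hypothetical data: documents attached to a claim vs. days needed to process it.
documents = [1, 2, 3, 4]
days = [30, 50, 70, 90]
intercept, slope = fit_line(documents, days)
print(f"days = {intercept:.1f} + {slope:.1f} * documents")
print(polarity("The online form was clear and easy."))
print(polarity("Processing was slow and my file was lost."))
```

A fitted slope like this one suggests where a process might be redesigned (for instance, by reducing the documents required), while polarity scores on feedback indicate which steps frustrate the people using the service.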
Segmentation
Segmentation reveals specific data clusters or groupings. This is a
common marketing concept, often used to create groups based on
demographics or geographical regions. Segmenting big data in the
public sector can reveal clusters that aren't intuitive or easily
visible with just a cursory examination of data. Big data analytics
can help government employees read the data from multiple
perspectives to reveal new
Figure 1. A model for leveraging big data to improve
e-government services, ultimately resulting in transformational
government. Automating data analysis and redesigning processes can
improve efficiency, while segmenting the data and making
information more transparent can improve effectiveness.
Figure 1 labels: Automation (improves efficiency), Redesign (improves
efficiency), Segmentation (improves effectiveness), Transparency
(improves effectiveness).
information and better tailor services to meet citizens' needs.[14]
Consider the case of big data aggregated from claims filed at the VA,
revealing some hidden link across clusters, such as a specific
medical condition, that helps the VA create new standards for quickly
processing claims.
Transparency
Big data tools can readily support reporting on large amounts of
data, thus making more information available to the public. For
example, the growth of social media in the e-government domain has
already increased transparency and reduced corruption in some areas
of government.[16] As big data analysis increases with t-government,
decision making will be driven more by data and less by conjecture,
increasing the effectiveness of public administration. The concept of
open government requires releasing more information to the public.
With increased expectations of openness, decision makers need to
justify outcomes based on data inputs.
Overcoming Big Data Barriers
Although t-government could be derived from the government's use of
big data analytics, several challenges would still need to be met.
Analyze Unstructured Data
First, traditional relational database systems aren't well suited to
manage the big data collected from unstructured sources such as
images, blogs, smartphones, GPS, mobile devices, and social
networks.[13] In fact, approximately 85 percent of data generated
today is unstructured.[15] Governments will need to maintain data
repositories that can support the manipulation and analysis of
unstructured data.
Build the Infrastructure
Second, it will be necessary for government departments and agencies
to adopt the appropriate technical infrastructure to manage big data.
Agencies such as NASA and the IRS are using warehouse optimization,
streaming data, Hadoop, and other technologies to manage big
data.[15] Hadoop is an open source solution that uses large clusters
of computers to analyze big data for industries ranging from retail
to bioinformatics. As real-time requests drive data consumption
needs, additional open source and proprietary products are emerging
to fill gaps that existing platforms don't address.[16] Innovative
technology tools can reduce the data overproduction and
underconsumption disparity.
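Hadoop's best-known processing model is MapReduce, in which a map step emits key-value pairs, a shuffle step groups them by key, and a reduce step aggregates each group. The single-process Python sketch below only mimics that flow to show the idea; it does not use Hadoop or its API, and the sample records and function names are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Aggregate all values that share a key."""
    return key, sum(values)


# Hypothetical free-text claim descriptions.
records = ["knee injury claim", "hearing loss claim", "knee surgery claim"]
mapped = chain.from_iterable(map_phase(r) for r in records)
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

On a real cluster the map and reduce steps would run in parallel across many machines, which is what lets the same pattern scale to data sets far too large for one computer.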
Accept Change
Government departments are large bureaucracies that can be steeped in
tradition and resistant to the winds of change. Organizational
inertia can hinder the growth of new ideas and new methodologies.
However, as the value of data is revealed, more champions will see
the value of big data as a tool for improving public administration.
In public sector entities, election cycles can also influence the
pace of change and the adoption of new ways of operation, either
speeding adoption of current technologies or slowing the spending
necessary to implement them.
Address Privacy Issues
Finally, privacy issues can inhibit the adoption of big data in the
public domain. Collecting and manipulating sensitive data is a
controversial topic of interest to many groups both inside and
outside of government. Government departments might not want to share
data that they consider proprietary with other government agencies.
Oversight and proper management are required to reduce this potential
barrier.
As mobile technology, social media, and ICT activities grow, big data
analytics in the public sector is a natural outcome. With more
government agencies using big data analytics to transform public
administration and public policy, executives must ask key questions:
What insights can we glean from big data? What analyses were done on
the data? How confident are we about the results?[13]
Furthermore, as the technologies used to analyze big data advance,
more options will emerge for viewing the data from different angles.
Governments can reduce operational costs by making use of new
information that emerges from big data and analytics, and they should
look to adopt best practices from the private sector to improve
outcomes.[9] To further enhance the transformative value of big data
in government, appropriate resource allocation must also occur,
ensuring that the data is properly captured and managed.
References
1. T.H. Davenport, "What Do We Talk About When We Talk About
   Analytics?" Enterprise Analytics, T.H. Davenport, ed., Pearson
   Education, 2013, pp. 9-18.
2. A. McAfee and E. Brynjolfsson, "Big Data: The Management
   Revolution," Harvard Business Rev., vol. 90, no. 10, 2012, pp.
   59-68.
3. A. Ghoneim, Z. Irani, and S. Sahraoui, "Guest Editorial," European
   J. Information Systems, vol. 20, no. 3, 2011, pp. 303-307.
4. C.W. Copeland, The Federal Workforce: Characteristics and Trends,
   CRS Report for Congress, Apr. 2011;
   http://assets.opencrs.com/rpts/RL34685_20110419.pdf.
5. B. Franks, "Analytics on Web Data: The Original Big Data,"
   Enterprise Analytics, T.H. Davenport, ed., Pearson Education, 2013,
   pp. 47-70.
6. V. Weerakkody and C.G. Reddick, "Public Sector Transformation
   through E-Government," Public Sector Transformation through
   E-Government Experiences from Europe and North America, V.
   Weerakkody and C.G. Reddick, eds., Taylor & Francis, 2013, pp. 1-8.
7. K. Siau and Y. Long, "Synthesizing E-Government Stage Models: A
   Meta-Synthesis Based on Meta-Ethnography Approach," Industrial
   Management & Data Systems, vol. 105, nos. 3 and 4, 2005, pp.
   443-458.
8. J. Feller, P. Finnegan, and O. Nilsson, "Open Innovation and
   Public Administration: Transformational Typologies and Business
   Model Impacts," European J. Information Systems, vol. 20, no. 3,
   2011, pp. 358-374.
9. F.V. Morgeson III and S. Mithas, "Does E-Government Measure Up to
   E-Business? Comparing End User Perceptions of US Federal
   Government and E-Business Web Sites," Public Administration Rev.,
   vol. 69, no. 4, 2009, pp. 740-752.
10. V. Jordan, "Overproduction and Business Organization," The Menace
    of Overproduction, S. Hamlin, ed., Books for Libraries Press,
    1969, pp. 131-142.
11. P. Hegseth and P. Rieckhoff, "No Medal for Veterans Affairs," The
    Wall Street J., 17 Oct. 2012.
12. Department of Veterans Affairs (VA) Strategic Plan to Eliminate
    the Compensation Claims Backlog, Veterans Benefits
    Administration, 25 Jan. 2013, pp. 1-20;
    http://benefits.va.gov/transformation/docs/VA_Strategic_Plan_to_Eliminate_the_Compensation_Claims_Backlog.pdf.
13. J. Manyika et al., Big Data: The Next Frontier for Innovation,
    Competition, and Productivity, McKinsey Global Inst., 2011;
    www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation.
14. Demystifying Big Data: A Practical Guide to Transforming the
    Business of Government, TechAmerica Foundation: Federal Big Data
    Commission, 2012;
    www.techamerica.org/Docs/fileManager.cfm?f=techamerica-bigdatareport-final.pdf.
15. J.C. Bertot, P.T. Jaeger, and J.M. Grimes, "Promoting
    Transparency and Accountability through ICTs, Social Media, and
    Collaborative E-Government," Transforming Government: People,
    Process and Policy, vol. 6, no. 1, 2012, pp. 78-91.
16. G. Mone, "Beyond Hadoop," Comm. ACM, vol. 56, no. 1, 2013, pp.
    22-24.
Rhoda C. Joseph is an associate professor at Pennsylvania State
University Harrisburg. Her primary research areas are e-government,
big data, and IT in emerging economies. Joseph received her PhD in
information systems from City University of New York. Contact her at
[email protected].
Norman A. Johnson is an associate professor in the Decision and
Information Sciences Department at the Bauer College of Business,
University of Houston. His recent research interest is the area of
big data and analytics. Johnson received his PhD in management
planning and information systems from the City University of New
York. Contact him at [email protected].