Big Data as a Service: A Neo-Metropolis Model Approach for Innovation
Hong-Mei Chen, Rick Kazman (University of Hawaii)
Serge Haziyev, Valentyn Kropov (SoftServe)
Dmitri Chtchourov (Cisco Systems)
Jan 21, 2017
Motivation
Success in big data analytics depends on having an infrastructure for ingesting, processing, storing, integrating, and visualizing data.
However, many companies fail to achieve this.
Motivation
According to a 2013 Infochimps survey, 55% of big data projects were not completed, due to technical roadblocks, system complexity, talent shortages, and heavy up-front costs.
Solution?
Many vendors are offering BDaaS platforms.
However, these are mostly proprietary and closed-world.
Choosing among them may limit the potential for innovation.
Solution
An open-world model for developing a BDaaS platform.
It integrates different open source technologies, eases prototyping, and broadens choices, allowing organizations to innovate while managing risk.
We call this model Neo-Metropolis.
The Neo-Metropolis Model
Metropolis comes from the Greek for “mother city.” The analogy is deliberate.
The Metropolis Model, introduced in 2009, helps us reason about system creation that is commons-based and peer-produced.
Metropolis Model Structure
Kernel, Periphery, Masses
Open Source: Kernel, Periphery (Developers), Masses (Users)
Open Content: Kernel, Periphery (Prosumers), Masses (Customers)
Neo-Metropolis Purpose
A Neo-Metropolis (N-M) system reflects a larger scale: it is a system-of-systems platform.
Intent: to make it easy for projects at the periphery to adopt, deploy, and scale systems.
An N-M system is an enabler.
N-M Characteristics: Mashability
Constituent systems are provided as services.
“Lego-blocks” approach: platform users create systems by plugging together, configuring, and provisioning open-source components in cloud infrastructures.
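The “Lego-blocks” idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Service` class, the `compose` helper, and the toy blocks are hypothetical names invented here, not part of any platform described in the talk. It shows blocks being configured (`make_filter`) and plugged together (`compose`) behind one uniform interface.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Service:
    """Hypothetical 'Lego block': a named service with a uniform interface."""
    name: str
    run: Callable[[List[str]], List[str]]


def make_filter(min_length: int) -> Service:
    """Configure a block: keep only rows with at least min_length characters."""
    return Service(
        name=f"filter(min_length={min_length})",
        run=lambda rows: [r for r in rows if len(r) >= min_length],
    )


def compose(blocks: List[Service]) -> Callable[[List[str]], List[str]]:
    """Plug blocks together: each block consumes the previous block's output."""
    def pipeline(rows: List[str]) -> List[str]:
        for block in blocks:
            rows = block.run(rows)
        return rows
    return pipeline


# Toy stand-ins for ingest, process, and store services (assumptions).
ingest = Service("ingest", lambda rows: [r.strip() for r in rows])
process = Service("process", lambda rows: [r.upper() for r in rows])
store = Service("store", lambda rows: sorted(rows))

mashup = compose([ingest, make_filter(1), process, store])
result = mashup([" b ", "", "a"])
print(result)
```

Because every block exposes the same `run` signature, swapping one block for another does not require changes to the rest of the pipeline, which is the property the Lego analogy relies on.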
N-M Characteristics: Conflicting, unknowable requirements
Requirements will always emerge from the periphery (the open source projects).
And they will always conflict.
N-M Characteristics: Continuous Evolution
Metropolis projects are never in a stable state.
The kernel might have traditional releases, but the periphery is continually changing
…like a city…
N-M Characteristics: Focus on Operations
Cloud services are called “the fifth utility.”
This requires a "DevOps" mindset.
N-M Characteristics: Sufficient Correctness
Perpetual beta of the periphery is the norm.
But the kernel must be stable and backwards compatible.
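One way to make “backwards compatible” concrete is a mechanical check between two kernel API descriptions. The sketch below is an assumption-laden illustration, not the platform's actual mechanism: it models an API as a mapping from operation name to its set of required parameters (the `Api` type and all operation names are hypothetical) and reports an old operation as broken if it disappears or starts requiring new parameters.

```python
from typing import Dict, List, Set

# Hypothetical API description: operation name -> required parameter names.
Api = Dict[str, Set[str]]


def breaking_changes(old: Api, new: Api) -> List[str]:
    """List the ways `new` would break a client written against `old`."""
    problems = []
    for op, required in sorted(old.items()):
        if op not in new:
            problems.append(f"removed operation: {op}")
        elif not new[op] <= required:
            # The new version demands parameters old clients never sent.
            extra = sorted(new[op] - required)
            problems.append(f"{op} requires new parameters: {extra}")
    return problems


old_api = {"provision": {"service"}, "scale": {"service", "replicas"}}

# Adding an operation is compatible under this model.
compatible = dict(old_api, monitor={"service"})

# Dropping `scale` and adding a required parameter are both breaking.
incompatible = {"provision": {"service", "region"}}

print(breaking_changes(old_api, compatible))
print(breaking_changes(old_api, incompatible))
```

Under this model the periphery stays free to change, while any kernel release that yields a non-empty list is flagged before it ships.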
N-M Characteristics: Scalable Resources
The platform, hosted on a cloud (or intercloud), provides scalable resources.
These resources are managed by the kernel.
N-M Characteristics: Gated Behaviors
A Metropolis system is subject to emergent behaviors. This is often desirable.
But gated behaviors are desirable in a Neo-Metropolis environment.
N-M Principles
1. Community Engagement and Negotiation
2. Bifurcated Requirements
3. Bifurcated Architecture
4. Fragmented Implementation
5. Distributed Testing/V&V
6. Distributed Delivery/Maintenance
7. Ubiquitous Operations
N-M Innovation
These principles and characteristics support:
Open innovation: participants, from the periphery and the edge, can interact dynamically, via the kernel, to generate “collective intelligence.”
The numbers game and “Lego” innovation: interoperability allows rapid mashups of services. More Lego blocks => more possible combinations.
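The “more blocks => more combinations” claim can be made concrete with a little arithmetic. The sketch below rests on two simplifying assumptions that are not from the talk: either a mashup picks exactly one block per pipeline stage (so the count is the product of the per-stage choices), or a mashup is any set of two or more blocks out of n (so the count is 2^n - n - 1). The function names and the stage counts are illustrative.

```python
from math import prod
from typing import List


def pipelines(blocks_per_stage: List[int]) -> int:
    """Mashups that pick exactly one block for each pipeline stage."""
    return prod(blocks_per_stage)


def subsets_of_two_or_more(n: int) -> int:
    """Sets of at least two blocks out of n: 2^n minus the empty set and n singletons."""
    return 2 ** n - n - 1


# Four stages with two interchangeable blocks each (illustrative numbers).
before = pipelines([2, 2, 2, 2])
# One extra block in a single stage.
after = pipelines([3, 2, 2, 2])
print(before, after)

# Under the looser subset model, one extra block more than doubles the count.
print(subsets_of_two_or_more(4), subsets_of_two_or_more(5))
```

Under either assumption, a single new interoperable block adds many new possible mashups rather than just one, which is the sense of the “numbers game.”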
Case Study: Cisco's BDaaS Platform
Cisco's mission is to increase its customer base via a platform- and vendor-agnostic (primarily open source) approach to big data analytics.
“We don’t compete directly with Amazon; our strategy is to develop technology for microservices (higher up the stack) so that it can be deployed anywhere.”
“Public product cloud offering is not our core business; we want to invest in the internet in general, providing the capabilities for B2B interactions, e.g., Cisco’s Intercloud network.”
An Example: Cisco
Realizing N-M Principles
Community engagement and negotiation:
for the edge, BDaaS customers are initially drawn from Cisco's existing customer base
Cisco provides cost/benefit analyses for these enterprise clients
for the periphery, Cisco draws participation from vendors of open-source products through collaboration, sub-contracting, and partnering
Realizing N-M Principles
Bifurcated architecture / Bifurcated requirements / Fragmented implementation:
Cisco is using a traditional top-down, plan-driven process to create the kernel of its platform
The requirements, architectures, and implementations of the products at the periphery are (largely) independent.
Realizing N-M Principles
Distributed testing:
Cisco manages the testing of its kernel.
It also exerts oversight over the quality of constituent projects via automated acceptance testing.
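An automated acceptance gate of this kind might look like the following minimal sketch. The talk does not describe the actual checks, so everything here is hypothetical: the check names, the service descriptor fields (`health_url`, `version`, `smoke_test`), and the `admit` function are illustrative only.

```python
from typing import Callable, Dict, List

ServiceInfo = Dict[str, object]
Check = Callable[[ServiceInfo], bool]

# Hypothetical acceptance checks the kernel owner applies to every
# constituent project, regardless of who develops it.
ACCEPTANCE_CHECKS: Dict[str, Check] = {
    "has health endpoint": lambda svc: bool(svc.get("health_url")),
    "declares a version": lambda svc: bool(svc.get("version")),
    "smoke test passed": lambda svc: svc.get("smoke_test") == "pass",
}


def failed_checks(service: ServiceInfo) -> List[str]:
    """Names of the acceptance checks this service does not satisfy."""
    return [name for name, check in ACCEPTANCE_CHECKS.items()
            if not check(service)]


def admit(service: ServiceInfo) -> bool:
    """Gate: a service joins the platform only if every check passes."""
    return not failed_checks(service)


good = {"health_url": "/health", "version": "1.2.0", "smoke_test": "pass"}
bad = {"version": "0.1.0", "smoke_test": "fail"}

print(admit(good), failed_checks(good))
print(admit(bad), failed_checks(bad))
```

The peripheral projects keep their own test suites; the gate only encodes the minimum the platform owner insists on, which is the split the distributed-testing principle describes.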
Realizing N-M Principles
Distributed delivery/maintenance:
automating repetitive and error-prone tasks (e.g., build, testing, and deployment)
maintaining consistent environments
employing automated testing analysis tools
Realizing N-M Principles
Ubiquitous Operations:
automating as much of operations as possible
employing performance dashboards
using tools like Apache Mesos to better manage and deploy resources
N-M Innovation
Innovation is supported by the characteristics and principles of the Neo-Metropolis model.
In particular: mashability, bifurcated requirements, bifurcated architecture and implementation, continuous operations
N-M Innovation
Components for big data applications (microservices) developed so far include:
Data Storage as a Service (e.g., HDFS)
Data Processing as a Service (e.g., MapReduce, Spark)
Data Insights as a Service (pre-processed data as Data Marts and Data Insights ready for consumption)
Data Visualization as a Service (e.g., Zoomdata)
Cisco believes everything can be a service: making it easy for others to create new ones, moving towards the vision of a “data mall” (e.g., IoT with a collection of data marts).
Conclusions
This is just a single case study.
However, it reflects the evolution of trends that are driving our software ecosystem:
1. the increasing prominence of cloud computing
2. the proliferation of open source products
3. sufficiently mature interoperability technologies
Neo-Metropolis instances are the future of service platform development.
Questions?