Research & Innovation API & Platform Business Strategy & Digital Transformation New Usages, Connected Business & Mobility Re-Imagination of Enterprise Architecture William El Kaim – April 2015 – Part 2 - V 2.3
Plan
• The Entrepreneurial Age
• Rise of Platforms & Ecosystems
• Business Agility Through API
• Looking for the Next Gen Architecture
• RESTful Architecture
• Antifragile Architecture
• MicroService Architecture
• 3rd Generation Mobile Architecture
• (Big-)Data Architecture (Re-)Invented
• New Databases That Scale High
• The Devops Movement
• From Virtual Machines to Containers
• Programmatic Infrastructure
• Backend as a Service
• User Experience
• Are You Ready?
Copyright © William El Kaim 2015
Source: Domo
What is Big Data?
• A collection of data sets so large and complex that it becomes difficult to
process using on-hand database management tools or traditional data
processing applications
• Due to its technical nature, the same challenges arise in Analytics at much lower
volumes than what is traditionally considered Big Data.
• Big Data Analytics is:
• The same as ‘Small Data’ Analytics, only with the added challenges (and potential) of
large datasets (~50M records or 50GB size, or more)
• Challenges :
• Data storage and management
• De-centralized/multi-server architectures
• Performance bottlenecks, poor responsiveness
• Increasing hardware requirements
Source: SiSense
Six V to Nirvana
and Visualization …
Source: James Higginbotham
Big Data: A collection of data sets so large
and complex that it becomes difficult to
process using on-hand database
management tools or traditional data
processing applications
Six V to Nirvana
Source: Bernard Marr
Enterprise Big Data Flows
[Figure: enterprise data flows around a Big Data Platform. Elements shown: business transactions & interactions (CRM, ERP, web, mobile, point of sale); business intelligence & analytics (dashboards, reports, visualization, …); unstructured data, log files, DB data, exhaust data, social media, sensors, devices; classic data integration & ETL.]
1. Capture Big Data: collect data from all sources, structured & unstructured
2. Process: transform, refine, aggregate, analyze, report
3. Distribute Results: interoperate and share data with applications/analytics
4. Feedback: use operational data within the big data platform
Source: HortonWorks
Enter Hadoop, the Big Data Refinery
• Hadoop is not replacing anything.
• Hadoop has become another component in an organization's enterprise data platform.
• Hadoop (Big Data Refinery) can ingest data from all types of different sources.
• Hadoop then exchanges data with traditional systems that provide transactions and interactions (relational databases) and with business intelligence and analytic systems (data warehouses).
Source: DBA Journey Blog
Hadoop: Open Source Bazaar Style Dev.
• Hadoop was first conceived at Yahoo as a distributed file system (HDFS)
and a processing framework (MapReduce) for indexing the Internet.
• It worked so well that other Internet firms in the Silicon Valley started using
the open source software too.
• Apache Hadoop, by all accounts, has been a huge success on the open
source front.
• Thousands of people have contributed to the codebase at the Apache
Software Foundation.
• The Hadoop project has spawned dozens of Apache projects, including:
• Hive, Impala, Spark, HBase, Cassandra, Pig, Tez, Ambari, and Mahout.
• Apart from the Apache Web Server, the Apache Hadoop family of projects is
probably the ASF’s most successful project ever.
Hadoop V1: Integration Options
[Figure: two diagrams, "Batch & Scheduled Integration" and "Near Real-Time Integration", each connecting existing infrastructure (databases & warehouses, applications & spreadsheets, visualization & intelligence, logs & files) to the Hadoop stack (HDFS, MapReduce, Pig, Hive, HBase, HCatalog). Connectors shown: REST, WebHDFS, Flume, data integration tools (Talend, Informatica), ODBC/JDBC, Sqoop.]
Source: HortonWorks
Hadoop: Elements
• Hive - A data warehouse infrastructure that runs on top of Hadoop. Hive supports SQL queries, star schemas, partitioning, join optimizations, caching of data, etc.
• Pig - A scripting language for processing Hadoop data in parallel.
• MapReduce - Java applications that can process data in parallel.
• Ambari - An open source management interface for installing, monitoring and managing a Hadoop cluster. Ambari has also been selected as the management interface for OpenStack.
• HBase - A NoSQL columnar database providing extremely fast scanning of column data for analytics.
• Sqoop, Flume and WebHDFS - Tools providing large data ingestion for Hadoop using SQL, streaming and REST API interfaces.
• Oozie - A workflow manager and scheduler.
• ZooKeeper - A coordination infrastructure.
• Mahout - A machine learning library supporting recommendation, clustering, classification and frequent itemset mining.
• Hue - A web interface that contains a file browser for HDFS, a job browser for YARN, an HBase browser, query editors for Hive, Pig and Sqoop, and a ZooKeeper browser.
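The MapReduce element listed above can be illustrated with a minimal, single-process sketch of the programming model (map, shuffle, reduce). This is plain Python and not the Hadoop Java API; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = word_count(["big data", "Big Data refinery"])
```

In a real cluster the map and reduce calls would run in parallel on different nodes; here they run sequentially to keep the data flow visible.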
Hadoop V1
Source: Octo Technology
Example: Unified Log Analytics
Before: batch-based; normally run overnight, sometimes every 4-6 hours
Source: Snowplow
Hadoop V2: YARN
• YARN (Yet Another Resource Negotiator) is the foundation for parallel
processing in Hadoop.
• YARN:
• Scales to systems of 10,000+ data nodes.
• Supports different types of workloads such as batch, real-time queries (Tez), streaming,
graph data, in-memory processing, messaging systems, streaming video, etc. You
can think of YARN as a highly scalable, parallel-processing operating system that
supports all kinds of workloads.
• Supports batch processing, providing high throughput for sequential read scans.
• Supports real-time interactive queries with low latency and random reads.
Hadoop V2: Spark Revolution
• Apache Spark has been winning over users since it was developed at the
University of California, Berkeley, AMPLab in 2009.
• All of the major Hadoop distributions now support it. It is a top-level Apache
Software Foundation project, and a startup called Databricks is
dedicated to productizing, supporting and certifying Spark.
• Spark extends and generalizes the MapReduce execution model so that it can
perform more types of computations more efficiently.
Hadoop V2: Spark Stack
Spark powers a stack of high-level tools including Spark SQL, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application.
Hadoop V2: Spark Stack Evolutions (2015)
Goal: a unified engine across data sources, workloads and environments
DataFrame: a distributed collection of data organized into named columns
ML pipeline: defines a sequence of data pre-processing, feature extraction, model fitting, and validation stages
Source: Databricks
Big Data: New Dev Tools
http://www.dataiku.com/dss/
Free Community Edition (Mac, Linux, Docker, AWS)
Hadoop V2: Spark Advantages
• Spark replaces MapReduce.
• MapReduce is inefficient at handling iterative algorithms as well as interactive data
mining tools.
• Spark is fast: it uses memory differently and efficiently.
• Runs programs up to 100x faster than MapReduce in memory, or 10x faster on disk.
• Spark excels at programming models involving iterations, interactivity (including streaming) and more.
• Spark offers over 80 high-level operators that make it easy to build parallel apps.
• Spark runs everywhere.
• Runs on Hadoop, Mesos, standalone, or in the cloud. It can access diverse data sources
including HDFS, Cassandra, HBase, S3.
Beyond Hadoop: The streaming future of big data
• With Apache Spark and a number of other technologies (Storm, Kafka, and
so on), we seem to be veering away from batch processing with Hadoop and
toward a real-time future.
• The Lambda Architecture will marry the two!
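As a rough illustration of how a Lambda Architecture combines batch and real-time processing, the sketch below keeps an append-only master dataset, a batch view recomputed from scratch, and a speed view updated per event; a query merges both views. The class `LambdaCounter` and its method names are hypothetical, and the whole thing is a toy in-memory model under simplifying assumptions (for instance, the batch job is assumed to cover every event ingested so far).

```python
from collections import Counter


class LambdaCounter:
    """Toy page-view counter with a batch layer, a speed layer and a merged query."""

    def __init__(self):
        self.master = []             # append-only master dataset
        self.batch_view = Counter()  # recomputed from scratch by the batch job
        self.speed_view = Counter()  # updated incrementally for recent events

    def ingest(self, page):
        """Store the raw event and update the real-time view immediately."""
        self.master.append(page)
        self.speed_view[page] += 1

    def run_batch(self):
        """Recompute the batch view from all raw events, then reset the speed view."""
        self.batch_view = Counter(self.master)
        self.speed_view.clear()

    def query(self, page):
        """Serving layer: merge the batch result with the recent increments."""
        return self.batch_view[page] + self.speed_view[page]
```

Queries stay correct between batch runs because events not yet covered by the batch view are still counted in the speed view.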
Beyond Hadoop: The streaming future of big data
[Figure: Netflix Data Pipeline, shown in three stages: Before, Today, Future]
Source: Netflix
Apache Kafka is publish-subscribe messaging rethought as a distributed commit log
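The idea of publish-subscribe messaging built on a commit log can be sketched as an append-only list plus one read offset per consumer. This is not the Kafka client API: the class `CommitLog` and its `append` and `poll` methods are hypothetical names for a single-process illustration without partitions, replication or persistence.

```python
class CommitLog:
    """Append-only log; each consumer tracks its own read position (offset)."""

    def __init__(self):
        self._records = []
        self._offsets = {}  # consumer name -> next offset to read

    def append(self, record):
        """Publish a record and return the offset it was stored at."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer, max_records=10):
        """Return the next unread records for this consumer and advance its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch
```

Because records are never removed when read, several consumers can process the same stream independently and at their own pace.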
Beyond Hadoop: The streaming future of big data
Unified Log with Amazon Kinesis and Snowplow
Source: Snowplow
StreamTools
http://blog.nytlabs.com/streamtools/
Beyond Hadoop: The streaming future of big data
Edmunds.com streaming platform to build a near real-time dashboard
Source: Cloudera
Big Data Integration
TIBCO ActiveMatrix BusinessWorks 6 + Apache Hadoop = Big Data Integration
Source: Tibco
Hadoop Distribution: A Game of Three
• According to Wikibon’s latest market analysis, spending on Hadoop software
and subscriptions accounted for a mere $187 million in 2014, or less than 1
percent of $27.4 billion in overall big data spending.
• Wikibon expects Hadoop spending on software and subscriptions to grow to
$677 million by 2017, when the overall big data market will have grown to
$50 billion.
• That’s just over 1 percent, and if you include professional services, it more than doubles
to about 3 percent.
• None of the three pure-play Hadoop distributors has yet turned a profit:
• Cloudera, Hortonworks, and MapR Technologies
Source: Wikibon’s Big Data Vendor Revenue and Market Forecast 2011-2020 report.
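The market-share percentages quoted above can be checked with a few lines of arithmetic. The figures are the ones given on the slide (in millions of dollars); the helper name `share_percent` is hypothetical.

```python
def share_percent(part_musd, total_musd):
    """Return part as a percentage of total (both in millions of dollars)."""
    return 100.0 * part_musd / total_musd


# 2014: $187 million of Hadoop software and subscriptions out of $27.4 billion
share_2014 = share_percent(187, 27_400)   # about 0.68 percent
# 2017 forecast: $677 million out of $50 billion
share_2017 = share_percent(677, 50_000)   # about 1.35 percent
```

Both results agree with the slide: less than 1 percent in 2014 and just over 1 percent in the 2017 forecast.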
Cloudera Adds Proprietary Tools
Source: Cloudera
Hadoop V2: Cloudera Impala
• Impala is Cloudera’s massively parallel processing (MPP) SQL query engine
for data stored in a computer cluster running Apache Hadoop.
• Can be seen as an Analytical Database on Top of Hadoop
• Impala enables users to issue low-latency SQL queries to data stored in
HDFS (Hadoop’s distributed file system) and Apache HBase (non-relational,
distributed database) without requiring data movement or transformation.
• Impala is integrated with Hadoop to use the same file and data formats,
metadata, security, and resource management frameworks used by
MapReduce, Apache Hive, Apache Pig, and other Hadoop software.
• Compared with classic Hadoop processing using MapReduce, Impala is much
faster: in many use cases a query response takes only a few seconds.
IBM BigInsights
Source: IBM
Source: ComputerWorld
Interesting Tools
• Druid is an open-source analytics data store designed for OLAP queries on
timeseries data (trillions of events, petabytes of data).
• SyncSort Hadoop ETL Solution extends the capabilities of Hadoop, turning it
into a highly scalable, affordable, and easy-to-use data integration
environment.
• ZoomData: Big Data exploration, visualization & analytics platform
• Snowplow is an event analytics platform. It gives you all your event-level,
customer-level data in your own data warehouse to power analytics, or
enables a unified log for real-time processing
• OpenTSDB (HBase) and Kairos: time-series databases built on top of open-source NoSQL data stores
• Aerospike, VoltDB: database software for handling large amounts of real-time
event data.
Interesting Tools
• Graphistry, Splunk, SumoLogic, ScalingData, and CloudPhysics use some
open source software components in their IT monitoring products.
• Graphite, Anodot or SignalFx: modern monitoring platforms using streaming
analytics
• Hfactory: the application stack for Hadoop
Others: Pachyderm
http://www.pachyderm.io/
History of Database
Source: Robin Purohit
Data Storage Limitations
• Today, most structured data storage is managed in a relational database.
Most relational databases enforce a set of rules to ensure that data is
consistent (see the CAP theorem). They also ensure transactions are atomic: they
either succeed or fail as a whole (ACID).
• With these rules in place, it becomes much harder to spread data out to
multiple nodes to increase retrieval speed and therefore processing speed.
• Newer databases offer different approaches to the CAP theorem and
ACID compliance to overcome these limitations.
Source: James Higginbotham
CAP theorem
• CAP stands for Consistency, Availability, and Partition Tolerance. CAP
determines how your data storage will operate when data is written and the
availability of the data when you go to retrieve it later (even under failure). You
cannot have everything from all three categories, so each database must choose
what it will implement and what it will sacrifice. In general:
1. Relational databases choose Consistency and Availability (CA), ensuring writes are
consistent and immediately available across all instances
2. Many new database vendors are opting for Availability and Partition Tolerance (AP), accepting
new/updated records without immediate confirmation from all instances (“eventually consistent”)
3. Other database vendors are opting for Consistency and Partition Tolerance (CP), tolerating
arbitrary loss of messages between instances while keeping the data consistent, at the cost of
refusing some requests until the partition heals
• Many vendors are experimenting with various combinations to satisfy specific
use cases, and some also require specific infrastructure/architecture to
support their implementation.
Source: James Higginbotham
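The difference between the AP and CP choices can be made concrete with a toy two-replica cluster. This is a deliberately simplified sketch, not how any particular database works: the classes `Replica` and `Cluster`, the `mode` values and the `partitioned` flag are all hypothetical.

```python
class Replica:
    """One copy of the data."""

    def __init__(self):
        self.data = {}


class Cluster:
    """Two replicas; mode is 'CP' or 'AP'; partitioned simulates a network split."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False

    def write(self, replica, key, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # CP: refuse the write rather than let the copies diverge
                raise RuntimeError("unavailable: cannot reach peer replica")
            # AP: accept the write locally; copies must be reconciled later
            replica.data[key] = value
            return
        # No partition: replicate synchronously to both copies
        replica.data[key] = value
        other.data[key] = value
```

During a partition the CP cluster stays consistent but rejects writes, while the AP cluster keeps accepting writes and ends up with replicas that disagree until they are reconciled.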
The Need for Scalability
• By understanding the CAP theorem, we can see that traditional relational
databases generally trade performance for transaction
consistency across one or more database servers.
• This is because the data must be stored on each database
server, which limits scaling to the storage speed of the slowest server.
• While transaction consistency may be critical for some systems, when
datasets reach extreme scale, traditional databases often cannot keep up
and require alternative approaches to data storage and retrieval.
• The result: big data architectures are required to overcome these limitations
as our data grows beyond the reach of a single server or database cluster.
Source: James Higginbotham
What is NoSQL?
• Stands for Not Only SQL
• The term NoSQL was introduced by Carlo Strozzi in 1998 to name his file-based database
• It was reintroduced by Eric Evans when an event was organized to
discuss open source distributed databases
• Eric states that “… but the whole point of seeking alternatives is that you need to solve a
problem that relational databases are a bad fit for. …”
• Three major papers were the “seeds” of the NoSQL movement:
• BigTable (Google), Dynamo (Amazon)
• CAP Theorem
NoSQL Seeds …
• Three major papers were the “seeds” of the NoSQL movement:
• BigTable (Google)
• Dynamo (Amazon)
• Ring partition and replication
• Gossip protocol (discovery and error detection)
• Distributed key-value data stores
• Eventual consistency
• CAP Theorem
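Ring partitioning and replication, listed above, can be sketched with a small consistent-hashing ring: nodes and keys are hashed onto the same circle, and a key is stored on the next few nodes found clockwise from its position. This is a minimal illustration without virtual nodes; the class `HashRing`, its `nodes_for` method and the node names are hypothetical, and SHA-256 is an arbitrary choice of hash function.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a position on the ring."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Place nodes on a hash ring and assign each key to consecutive nodes."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self._ring = sorted((_hash(node), node) for node in nodes)
        self._points = [point for point, _ in self._ring]

    def nodes_for(self, key):
        """Return the nodes that store the key, walking clockwise from its hash."""
        start = bisect.bisect_right(self._points, _hash(key))
        count = min(self.replicas, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


ring = HashRing(["node-a", "node-b", "node-c"], replicas=2)
owners = ring.nodes_for("user:42")
```

The first node returned acts as the primary owner and the following ones hold replicas, so a key remains readable if one of its nodes fails.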
Brewer’s CAP Theorem
• For any system sharing data, it is “impossible” to guarantee simultaneously
all three of these properties:
• Consistency: all copies have the same value
1. Strong consistency: ACID (Atomicity, Consistency, Isolation, Durability)
2. Weak consistency: BASE (Basically Available, Soft state, Eventual consistency)
• Availability: reads and writes always succeed
• Partition tolerance: system properties (consistency and/or availability) hold even when network failures prevent some machines from communicating with others
• You can have at most two of these three properties for any shared-data
system
ACID vs. CAP
• ACID
A DBMS is expected to support “ACID transactions,” processes that are:
• Atomic: either the whole process is done or none of it is
• Consistent: only valid data are written
• Isolated: concurrent transactions behave as if they ran one at a time
• Durable: once committed, it stays that way
• CAP
• Consistency: all copies of the data on the cluster have the same value
• Availability: the cluster always accepts reads and writes
• Partition tolerance: guaranteed properties are maintained even when network failures prevent some machines from communicating with others
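Atomicity can be demonstrated with Python's built-in `sqlite3` module. In the sketch below the table `accounts`, the account names and the helpers `transfer` and `balance` are hypothetical; the point is only that when the second statement of a transaction fails, the first one is rolled back as well.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move amount between accounts: both updates are committed, or neither is."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )


def balance(conn, name):
    """Read the current balance of one account."""
    row = conn.execute("SELECT balance FROM accounts WHERE name = ?", (name,)).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()
```

A transfer larger than the source balance violates the `CHECK` constraint on the debit, which raises `sqlite3.IntegrityError` and also undoes the credit that had already been applied.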
NoSQL Databases Taxonomy
• Key-value (SimpleDB, Riak, Redis, DynamoDB, Voldemort)
• Focus on scaling to huge amounts of data
• Designed to handle massive load
• Based on Amazon’s Dynamo paper
• Document-based (MongoDB, CouchDB, CouchBase, Riak)
• Can model more complex objects
• Data model: collection of documents
• Document: JSON, XML, other semi-structured formats
• Column-based (BigTable, HBase, HyperTable, Cassandra)
• Based on Google’s BigTable paper
• Like column-oriented relational databases (store data in column order) but with a twist
• Tables similar to those of an RDBMS, but able to handle semi-structured data
• Data model: collection of column families
• Column family = (key, value) where value = set of related columns (standard, super)
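The column-family data model described above (a row key mapped to sets of related columns) can be approximated with nested dictionaries. This is an in-memory sketch only, ignoring super columns, timestamps and storage layout; the class `ColumnFamilyTable` and its `put` and `get` methods are hypothetical names.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Rows keyed by row key; each row holds column families of free-form columns."""

    def __init__(self, families):
        self.families = set(families)
        # row key -> column family -> column name -> value
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        """Store one column value; the family must have been declared up front."""
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self._rows[row_key][family][column] = value

    def get(self, row_key, family):
        """Return all columns stored for one row in one family (possibly empty)."""
        return dict(self._rows.get(row_key, {}).get(family, {}))
```

Unlike a relational table, two rows in the same family may carry different columns, which is what makes the model suitable for semi-structured data.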
NoSQL Databases Taxonomy
• Graph-based (Neo4j, FlockDB, Pregel, InfoGrid, Titan)
• Focus on modeling the structure of data (interconnectivity)
• Scales to the complexity of data
• Inspired by mathematical Graph Theory (G = (V, E))
• Data model: nodes and edges and key-value pairs on both
• Nodes may have properties (including ID)
• Edges may have labels or roles
• Tools
• Gremlin graph query language, Frames object-to-graph mapper, Rexster graph server and
Blueprints standard graph API
• Distributed graph analysis frameworks: Combinatorial BLAS, GraphLab, SociaLite, and Giraph
• NodeXL is a free, open-source template for Microsoft® Excel® 2007 and 2010 that lets you
enter a network edge list into a workbook, click a button, and see the network graph.
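The graph data model above (nodes and labeled edges, with key-value properties on both) can be sketched in a few lines. This is not the API of Neo4j, Blueprints or any other product named on the slide; the class `PropertyGraph` and its methods are hypothetical.

```python
from collections import deque


class PropertyGraph:
    """Nodes and labeled edges, each carrying free-form key-value properties."""

    def __init__(self):
        self.nodes = {}  # node id -> properties
        self.edges = []  # (source id, label, target id, properties)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, src, label, dst, **props):
        self.edges.append((src, label, dst, props))

    def neighbors(self, node_id, label):
        """Targets of the outgoing edges of node_id that carry the given label."""
        return [dst for src, lbl, dst, _ in self.edges if src == node_id and lbl == label]

    def reachable(self, start, label):
        """Breadth-first traversal following only edges with the given label."""
        seen, queue = {start}, deque([start])
        while queue:
            for nxt in self.neighbors(queue.popleft(), label):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen - {start}
```

Traversals such as `reachable` are the kind of query graph databases are optimized for: they follow relationships instead of joining tables.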
CAP and Databases Visual Guide
Source: http://blog.beany.co.kr/archives/275
NoSQL Databases
Source: Ben Scofield
Source: Octo Technology
NoSQL Data Modeling Techniques
Source: Highly Scalable
Application Performance Management (APM)
• First generation APM solutions were built around a set of assumptions that
are in many cases no longer true today.
• The application was going to get built or bought, then run inside the firewalls of the
enterprise data center.
• Applications were going to get built in Java or .NET, which were for a while the dominant
development environments used by developers.
• The average application would only get enhanced several times a year.
• Finally, many first generation APM solutions completely ignored the fact that in most
enterprises, 80% of the applications are purchased commercial applications and not
custom developed by the enterprise itself.
• In fact, all of the above assumptions are being invalidated in modern
enterprise environments.
Source: APMExperts
Application Performance Management (APM)
Source: APMExperts
Agile Development and Devops
• The unrelenting pressure to deliver more application functionality in less time
has given rise to other important trends: Agile Development as a
development methodology and “DevOps” as a methodology for managing
applications in production.
• Agile Development focuses upon having one developer responsible for each
component of an application system, and then having those developers work
as a self-coordinating team to deliver new functionality into production on
regular and short time intervals (every week, two weeks or a month at most).
Source: APMExperts
Continuous Integration
• “Software development practice where engineers integrate frequently,
leading to multiple integrations per day. Each integration is verified by an
automated build and test to detect integration errors as quickly as possible”.
Martin Fowler
• Key Principles:
• Maintain a Single Source Repository
• Everyone commits to mainline every day
• Every commit should build the mainline on an integration machine
• Keep the build fast
• Everyone can see what's happening
Continuous Delivery
• “Continuous Delivery is a set of practices and principles aimed at building,
testing, and releasing software faster and more frequently”.
• Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment
Automation, by Jez Humble and David Farley.
• Key Principles:
• The process for releasing/deploying software MUST be repeatable and reliable.
• Automate everything!
• Done means “released”.
• If something is difficult or painful, do it more often.
• Keep everything in source control.
• Build quality in!
• Improve continuously.
• Everybody is responsible for the release process.
Devops
• DevOps is about eliminating the walls between application development and
production application support – essentially creating one team that builds the
application and supports it in production.
• “A set of processes, methods, and systems for communication, collaboration,
and integration among the IT functions responsible for application
development, infrastructure and operations, and quality assurance; with the
functions working together to produce fit-for-purpose and timely software
products and services”. Forrester
Devops Tools
• Ansible
• Bower
• Capistrano
• Chef
• Puppet
• Travis CI / Jenkins CI / Snap-CI
• Vagrant
http://stackshare.io/devops
Terraform
• Infrastructure as Code: Infrastructure is described using a high-level configuration syntax. This allows a blueprint of your datacenter to be versioned and treated as you would any other code. Additionally, infrastructure can be shared and re-used.
• Execution Plans: Terraform has a "planning" step where it generates an execution plan. The execution plan shows what Terraform will do when you call apply.
• Resource Graph: Terraform builds a graph of all your resources, and parallelizes the creation and modification of any non-dependent resources.
• Change Automation: Complex change sets can be applied to your infrastructure with minimal human interaction.
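The resource-graph idea can be illustrated with Python's standard `graphlib` module (Python 3.9+): resources are grouped into waves, and everything inside one wave has no unmet dependency, so it could be created in parallel. This is not Terraform's implementation or configuration syntax; the function `plan_waves` and the resource names are hypothetical.

```python
from graphlib import TopologicalSorter


def plan_waves(dependencies):
    """Group resources into waves; a wave only depends on earlier waves."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for a stable display order
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Each resource maps to the set of resources it depends on.
resources = {
    "network": set(),
    "database": {"network"},
    "web_server": {"network"},
    "load_balancer": {"web_server"},
}
waves = plan_waves(resources)
```

Printing `waves` before applying anything plays the role of a simple execution plan: it shows what would be created, and in which order.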
From Virtual Machine to Container
• Deployment of server applications is getting increasingly complicated since
software can have many types of requirements:
• Dependencies on installed software and libraries
• Dependencies on running services
• Dependencies on a specific operating system
• Dependencies on Resources
• minimum amount of available memory ("requires 1GB of available memory")
• ability to bind to specific ports ("binds to port 80 and 443")
• To solve these issues, the main technical answer was: run each individual
application on a separate virtual machine.
Source: infoQ
Virtual Machine: Expensive in two ways
• Money
• You need to predict the instance size you will need, because if you need more resources
later, you need to stop the VM to upgrade it (or over-pay for resources you don't end up
needing).
• Unless you use Solaris Zones, like on Joyent, which can be resized dynamically.
• Time
• Many operations related to virtual machines are typically slow!
• Booting takes minutes, snapshotting can take minutes, creating an image takes minutes.
• Enter Docker.
Source: infoQ
Docker: Package Once Deploy Anywhere
• Docker is an open platform for developers and sysadmins to build, ship, and
run distributed applications.
• Consists of Docker Engine, a portable, lightweight runtime and packaging
tool, and Docker Hub, a cloud service for sharing applications and
automating workflows.
• Docker enables apps to be quickly assembled from components and
eliminates the friction between development, QA, and production
environments.
• As a result, IT can ship faster and run the same app, unchanged, on laptops,
data center VMs, and any cloud.
https://www.docker.com/whatisdocker/
Docker
https://www.docker.com/whatisdocker/
Docker
• From a technical perspective Docker is plumbing to make two existing
technologies easier to use:
• LXC: Linux Containers, which allow individual processes to run at a higher level of
isolation than a regular Unix process. The term used for this is containerization: a process
is said to run in a container. Containers support isolation at the level of:
• File system: a container can only access its own sandboxed filesystem (chroot-like), unless
a host path is specifically mounted into the container's filesystem.
• User namespace: a container has its own user database (i.e. the container's root does not
equal the host's root account).
• Process namespace: within the container, only the processes that are part of that container are visible
(i.e. a very clean ps aux output).
• Network namespace: a container gets its own virtual network device and virtual IP (so it can
bind to whatever port it likes without taking up its host's ports).
• AUFS: advanced multi-layered unification filesystem, which can be used to create union,
copy-on-write filesystems.
Source: infoQ
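The union, copy-on-write behavior attributed to AUFS above can be imitated with `collections.ChainMap`: reads fall through a stack of layers, while writes land only in the top layer. Dictionaries stand in for filesystem layers here; the layer contents and the helper `start_container` are hypothetical, and real union filesystems also handle deletions and file metadata, which this sketch ignores.

```python
from collections import ChainMap

# Read-only image layers, shared by every container started from them
base_image = {"/bin/sh": "shell v1", "/etc/motd": "welcome"}
app_layer = {"/app/server.py": "print('hi')"}


def start_container(*read_only_layers):
    """Stack a fresh, empty writable layer on top of shared read-only layers."""
    return ChainMap({}, *read_only_layers)


c1 = start_container(app_layer, base_image)
c2 = start_container(app_layer, base_image)

# The write goes to c1's private top layer; the shared layers stay untouched.
c1["/etc/motd"] = "patched"
```

This is why starting a container is cheap: nothing is copied up front, and each container only stores the files it changes.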
Docker Advantages
• It's very lightweight.
• Booting up a Docker container has very little CPU and memory overhead and is very
fast, almost comparable to starting a regular process.
• Not only is running a container fast; building an image and snapshotting the filesystem are as well.
• Amazon Lambda is reportedly built on Docker, and usage of the service is billed in 100 ms increments.
• It works in already virtualized environments.
• You can run Docker inside an EC2 instance, a Rackspace VM or VirtualBox.
• On Mac and Windows, use Vagrant.
• Docker containers are portable to any operating system that runs Docker.
• Whether it's Ubuntu or CentOS, if Docker runs, your container runs.
• Docker is supported by tools.
• Clocker is an open source project which lets you spin up a Docker cloud.
• Google Kubernetes manages clusters; Tectonic is a commercial version of it from CoreOS.
Source: infoQ
Cloud OS, Docker
• New Bare Metal Lightweight OS will natively support Docker
• Canonical Snappy Ubuntu Core (Snappy)
• Red Hat started an initiative called Project Atomic Hosts
• Suse JeOS
• CoreOS: Open Source Projects for Linux Containers
• New Cloud OS
• ClickOS, Drawbridge, ErlangOnXen, HalVM, GUK11, MiniOS, MirageOS, NetBSD “rump”, OSv.
• Docker Load Balancing
• Open Source: Nginx and HAProxy
• Proprietary: Appcito CAFE for Docker / Nginx Plus
• Cloud Computing Stack
• Openstack / Eucalyptus
Microsoft and Docker
• Hyper-V Containers will ensure code running in one container remains isolated and cannot
impact the host operating system or other containers running on the same host
• powered by Hyper-V virtualization
• While Hyper-V containers offer an additional deployment option between Windows
Server Containers and the Hyper-V virtual machine, you will be able to deploy them
using the same development, programming and management tools you would use for
Windows Server Containers
Source: Azure Blog
Microsoft and Docker
• Nano Server: The Nucleus of Modern Apps and Cloud
• OS for the primary purpose of powering born-in-the-cloud applications.
• The result is Nano Server, a minimal footprint installation option of Windows Server that
is highly optimized for the cloud, including containers.
• This small footprint makes Nano Server an ideal complement for Windows Server
Containers and Hyper-V Containers, as well as other cloud-optimized scenarios.
• Nano Server focuses on two scenarios:
• Born-in-the-cloud applications: support for multiple programming languages and
runtimes (e.g. C#, Java, Node.js, Python, etc.) running in containers, virtual machines,
or on physical servers.
• Microsoft Cloud Platform infrastructure: support for compute clusters running Hyper-V
and storage clusters running Scale-out File Server.
• You can read more about the technology on the Windows Server blog.
Source: Azure Blog
Docker: Asset of the devops
• Docker allows each development team to implement services using
whatever language, framework or runtime they deem appropriate.
• The only requirement they have to get their service to production is to
provide a Docker image (plus some basic run configuration in a YAML file) to
the Ops Team.
Source: Tom Leach
Docker: Asset of the devops
• The Ops Team’s responsibilities are now restricted to simply building and
maintaining a pipeline for deploying Docker containers without needing to
concern themselves with what code each container actually contains.
• The contents of the Docker image are solely the responsibility of the
development team.
• This allows the Ops Team to focus on core deployment problems
• Moreover, this arrangement allows engineering teams to scale.
• You can add more and more development teams and, as long as you adhere to the rule
that every shippable service must be bundled in a Docker image, you add no additional
cognitive load to the Ops Team.
Source: Tom Leach
Docker Tools
• Kubernetes: Docker management tool developed by Google for deploying
containers across clusters of computers.
• https://github.com/GoogleCloudPlatform/kubernetes
• Dockersh: Dockersh lets multiple users connect to a given box, with each
user running a shell spawned from a separate Docker container.
• https://github.com/Yelp/dockersh
• DockerUI: A web front end that lets you handle, from a web browser, many tasks normally
managed from the command line.
• https://github.com/crosbymichael/dockerui
• Shipyard: Shipyard uses the Citadel cluster management toolkit to facilitate
management of Docker container clusters that span multiple hosts.
• https://github.com/shipyard/shipyard
Source: InfoWorld
Docker Tools
• Kitematic: makes Docker useful as a desktop-environment developer’s tool
for OS X-based programmers. Bought by Docker.
• https://github.com/kitematic/kitematic
• Other solutions for Mac: DVM, Docker OS X, and OS X Installer
• Logspout: route container-app logs to a single central location, such as a
single JSON object or a streamed endpoint available through an HTTP API.
• https://github.com/progrium/logspout
• Autodock: Deploys new containers as fast as possible by determining which
servers in a given Docker cluster have the least load.
• https://github.com/cholcombe973/autodock
• DIND (Docker-in-Docker): A way for you to run Docker within Docker
containers.
• https://github.com/jpetazzo/dind
Source: InfoWorld
How Does Docker Eliminate These Risks
• Docker Advantages
• Assets are baked into an immutable image at build time
• No deploy-time dependencies on 3rd party repository
• Docker registry is simple and easy to scale
• Dependencies simple, explicit and direct
• Rollback is trivial
• Docker Misconceptions (source: LockerDome)
• If I learn Docker then I don't have to learn the other systems stuff!
• You should have only one process per Docker container!
• If I use Docker then I don't need a configuration management (CM) tool!
• I have to use Docker in order to get these speed and consistency advantages!
Source: Tom Leach
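Why "rollback is trivial" follows from immutable images can be shown with a small model. This is a sketch under assumptions: `Registry` and `Deployment` are hypothetical in-memory stand-ins rather than Docker APIs, and tags are treated as write-once, which is a team convention and not something the slide states.

```python
class Registry:
    """Hypothetical in-memory stand-in for an image registry."""

    def __init__(self):
        self._images = {}

    def push(self, tag, contents):
        # Assumption: a tag is written once and never overwritten.
        if tag in self._images:
            raise ValueError(f"tag {tag!r} already exists")
        self._images[tag] = contents

    def pull(self, tag):
        return self._images[tag]


class Deployment:
    """Tracks which image tag is live; rollback re-points to the previous tag."""

    def __init__(self, registry):
        self.registry = registry
        self.history = []

    def deploy(self, tag):
        self.registry.pull(tag)  # raises KeyError if the image was never built
        self.history.append(tag)

    def rollback(self):
        if len(self.history) < 2:
            raise RuntimeError("nothing to roll back to")
        self.history.pop()
        return self.history[-1]


registry = Registry()
registry.push("shop:1.0", "assets and code baked at build time")
registry.push("shop:1.1", "a newer build")

live = Deployment(registry)
live.deploy("shop:1.0")
live.deploy("shop:1.1")
print(live.rollback())  # the previous image is still intact in the registry
```

Because nothing is fetched or rebuilt at deploy time, rolling back only changes which existing image is running.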
Programmatic Infrastructure
• Developers can now program without taking care of the infrastructure and
the different platforms to deploy on
• AWS Elastic Beanstalk lets you deploy your code in seconds.
• AWS CodeDeploy is a continuous delivery and deployment service
• Even the developer environment (IDE) is now in the Cloud
• Codenvy simplifies setting up environments and running apps
• And devops teams can now program the infrastructure like code
• AWS CloudFormation is like « magic » and lets you manage infrastructure with text files
that you can store in a configuration management tool
• Devops teams can also reuse Cloud infrastructure and software bricks and assemble them
like Lego (like Amazon AWS) or Duplo (like BitNami) depending on their granularity.
• Non-functional requirements are now available as services
• Everybody can benefit from a world-class infrastructure from day one
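To make "infrastructure as text files" concrete, here is a minimal sketch that generates a CloudFormation-style JSON template describing a single S3 bucket. The function name and the logical ID `AssetsBucket` are illustrative assumptions, and the script only produces the text; it does not call AWS.

```python
import json


def s3_bucket_template(logical_id):
    """Build a minimal CloudFormation template declaring one S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative template: one S3 bucket",
        "Resources": {
            # logical_id is a name chosen for this example
            logical_id: {"Type": "AWS::S3::Bucket"},
        },
    }


template_text = json.dumps(s3_bucket_template("AssetsBucket"), indent=2, sort_keys=True)
print(template_text)
```

The resulting text can be committed, reviewed and diffed like any other source file, which is the point the slide makes.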
Codenvy & GitHub: The New Dev Platform
https://codenvy.com/
https://enterprise.github.com/aws
Infrastructure As Lego: Amazon AWS
Source: Amazon AWS
Infrastructure as Lego
Source: Amazon AWS
Infrastructure As Lego: Google Cloud Platform
http://googlecloudplatform.blogspot.fr/2015/03/deploy-popular-software-packages-using-Cloud-Launcher.html
https://cloud.google.com/actual-cloud/
Infrastructure As Lego: Microsoft Azure
http://azure.microsoft.com/en-us/
Infrastructure As Lego: IBM BlueMix
https://console.ng.bluemix.net/
http://www-01.ibm.com/software/bluemix/
Scalability As A Service
• Search
• Algolia, a YC-backed startup.
• Unlike ElasticSearch’s open source solution, they offer their proprietary search technology via a hosted model
• Amazon CloudSearch
• ElasticSearch (with Kibana)
• Apache Solr (with Banana)
• Newsfeed / Activity Streams
• Stream Framework
• GetStream.io
• Others: Cassandra, Redis, Celery and RabbitMQ
• Realtime Service
• Faye and PubNub let you be ready in a few minutes
• StreamData.io transforms any JSON API into a real-time push API without a single line of server-side code
Source: High Scalability
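One way to turn a polled JSON API into a push feed is to compare successive responses and send only the changes. The sketch below illustrates that idea for flat JSON objects. It is an assumption about the general technique, not StreamData.io's implementation; the operation format is loosely modelled on JSON Patch and the sample data is made up.

```python
def diff_snapshots(old, new):
    """Return the list of changes that turn flat dict `old` into `new`."""
    ops = []
    for key in sorted(old.keys() - new.keys()):
        ops.append({"op": "remove", "path": f"/{key}"})
    for key in sorted(new.keys()):
        if key not in old:
            ops.append({"op": "add", "path": f"/{key}", "value": new[key]})
        elif old[key] != new[key]:
            ops.append({"op": "replace", "path": f"/{key}", "value": new[key]})
    return ops


# Two successive (hypothetical) responses from the same polled API
previous = {"price": 101.5, "volume": 2000}
latest = {"price": 102.0, "volume": 2000, "status": "open"}

changes = diff_snapshots(previous, latest)
for op in changes:
    print(op)  # only these small messages would be pushed to clients
```

Clients apply the changes to their local copy instead of re-downloading the full document on every poll.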
Scalability As A Service
• Event Analytics
• Snowplow
• Perfkit
• PerfKit Benchmarker: PerfKit is unique because it measures the end to end time to
provision resources in the cloud, in addition to reporting on the most standard metrics of
peak performance.
• Perfkit Explorer: a visualization tool
• Cloud Cost Advisor
• AWS Simple Monthly Calculator & AWS TCO calculator & AWS Trusted Advisor
• CloudCheckr
• Cloudability
• Cloud Cruiser
Machine Learning as A Service
• Machine learning is a scientific discipline that explores the construction and
study of algorithms that can learn from data.
• Such algorithms operate by building a model from example inputs and using
that to make predictions or decisions, rather than following strictly static
program instructions.
• Available as a Service
• Algorithms.io, Amazon ML, BigML,
Google Prediction API, Microsoft Azure ML
• Examples
• BVA with Microsoft Azure ML
• Quick Review of Amazon Machine Learning
• BigML training Series
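"Building a model from example inputs and using it to make predictions" can be illustrated with the simplest possible learner: a straight line fitted by ordinary least squares. This is a plain-Python sketch with made-up data, not the API of any of the services listed above.

```python
def fit_line(xs, ys):
    """Learn slope and intercept from example (x, y) pairs by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned model to an input it has not seen."""
    slope, intercept = model
    return slope * x + intercept


# Example inputs (made up): the underlying rule is y = 2x + 1
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

model = fit_line(xs, ys)
print(predict(model, 10))
```

The rule y = 2x + 1 is never written into the program; it is recovered from the examples, which is the distinction the slide draws with static program instructions.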
CIMI: Cloud IAAS Standard
• Cloud Infrastructure Management Interface
• Specification that standardizes interactions between cloud environments to achieve
interoperable cloud infrastructure management between service providers and their
consumers and developers, enabling users to manage their cloud infrastructure use
easily and without complexity.
• Primer
• Cloud Infrastructure Management Interface Model and RESTful HTTP-based Protocol
Backend As A Service (BAAS)
• BaaS is an approach for providing web and mobile app developers with a
way to connect their applications to backend cloud storage and processing
while also providing common features such as user management, push
notifications, social networking integration, and other features that mobile
users demand from their apps these days.
• This new breed of BaaS services is provided via custom software
development kits (SDKs) and application programming interfaces (APIs).
• BaaS is a relatively recent development in cloud computing, with most BaaS
start-ups dating from 2011 or later.
• The global BaaS market is estimated to grow from $216.5 million in 2012 to
$7.7 billion in 2017, according to a report published by MarketsandMarkets.
Source: API Evangelist
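The market estimate above implies a compound annual growth rate that is easy to check. The short computation below uses only the two figures quoted on the slide and assumes five yearly compounding periods (2012 to 2017).

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the slide, in millions of dollars
rate = cagr(216.5, 7700.0, 5)
print(f"Implied growth rate: {rate:.1%} per year")
```

The result is a little over 100% per year, meaning the estimate assumes the market roughly doubles every year over the period.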
How Does BaaS Differ From IaaS and PaaS?
• BaaS has evolved out of frustration with having to deploy IaaS platforms like
Amazon Web Services just to fire up a single new mobile application.
• BaaS is about abstracting away the complexities of launching and managing
your own infrastructure, then bridging a stack of meaningful resources
targeting exactly what developers need to build the next generation of mobile
apps.
• BaaS has much the same intent as PaaS, to speed up the application
development process, but BaaS is purely a backend
• providing an infrastructure that automatically scales and optimizes, bundled with a set of
essential resources developers require
Source: API Evangelist
What Are The Benefits of BaaS?
• Efficiency Gains - Reducing overhead in all aspects of app development,
increasing efficiency at all stages of development
• Faster Times to Market - Reducing the obstacles to take a mobile app from
idea to production and overhead with operations once in production
• App Delivery With Fewer Resources
• Optimize for Mobile and Tablets - BaaS providers have put a lot of time and
resources into optimization of data and network for mobile apps, and reduce
fragmentation problems across multiple platforms and devices.
• Secure and Scalable Infrastructure
• Stack of Common API resources - BaaS brings common and essential 3rd
party API resources into a single stack, preventing developers from having to
go gather them separately
Source: API Evangelist
Facebook Parse / Google EndPoints
https://cloud.google.com/endpoints/
https://www.parse.com/
Microsoft Azure AppServices
http://azure.microsoft.com/en-us/services/app-service/
AppDynamics
http://www.appdynamics.com/
Backend As A Service (BAAS)
• Firebase: offers a real-time service for managing apps that orchestrates
back-end tasks (Bought by Google!)
• StackMob: impressive set of services. Their storage engine uses schemas
which can be auto-generated for you based on your data, or you can use a
dashboard they provide to define them. They support file storage if you have
an Amazon S3 account
• Wibidata: has developed a framework for companies to develop services that
store the user’s data across multiple dimensions
• Telerik Backend Services (formerly known as “Everlive”): object store at the
core, but also lets you create relationships between objects, upload files,
store geospatial data (and query it by distance to other locations)
and more.
Backend As A Service (BAAS)
• Anypresence: http://www.anypresence.com/
• Apigee App Services: http://apigee.com/docs/app_services
• APISpark: http://apispark.com/
• Built.io: http://www.built.io/
• Emergent One: http://www.emergentone.com/
• IFTTT: https://ifttt.com/login
• Kinvey: http://www.kinvey.com/
• Proxomo: http://www.proxomo.com/
UX Tools: Main Needs
• Create: Flexible creation for custom details and better authenticity.
• Print: Ability to create high quality documentation as tangible printed
deliverables.
• Prototype: Ability to prototype experiences with interactions and motion.
• Present: Ability to present digitally.
• Collaborate: Ability to collaborate with stakeholders and make documentation
of feedback easier with version control.
• Test: Integrated user testing options.
Source: ixdeology.com
UX Tools
• JustinMind & Axure: Industry-leading, widely used tools.
• Adobe Illustrator/ Photoshop & Adobe InDesign: if you can afford them!
• MindNode: For paper-less brainstorming of site user flows, etc.
• inVision or MarvelApp: Focus on prototyping and collaboration
• Lucid Chart: For clean and fast process flow creation for digital strategies,
and site architecture maps.
• VerifyApp: Pre-development A/B Testing, design surveys, etc.
• SolidifyApp: Multi-device user testing, behavior analysis, etc. User Testing
• Full comparison chart
Source: ixdeology.com
Embrace The Legacy
http://www.convertigo.com/
Extend The Legacy
https://www.kimonolabs.com/
Unified Experience
The App Runtime for Chrome, or ARC, lets you run your favorite Android apps on Chrome OS.
https://developer.chrome.com/apps/getstarted_arc
Experience is Key
Source: Mobile Megatrends 2014
The House of Cards
• Touch screens and, more recently, wearable devices are beginning to
change the dynamics of content consumption
• While most apps still ask users to scroll down pages of content and poke at
buttons with fingers and thumbs, they are now increasingly favoring gestures
• swiping, pinching, tapping, and even clawing
• With apps like Apple’s Siri and Google Now, developers should be looking to
a far more futuristic paradigm: the natural-language, contextually aware
human-computer “conversation.”
• Move from “applications” to “in context cards” (like Google Now cards or Apple
Passbook)
• Wearable devices using voice command
Information Pushed To Me In Context
• Load contextual content onto the “back” of a card to design a distraction-free
front side that shows off the anchor content well.
• Many cards (built in HTML5) will ultimately be portable
• They can be transported intact from app to app and platform to platform without losing
their design context or built-in architecture.
• Examples
• Cards are now a major feature of Google+ and Google Maps, complementing the
existing experience on Google’s newest and most innovative projects, Glass and Google
Now (opening its API …)
• Apple Passbook
• Twitter cards
• Summary Card, Large Image Summary Card, Photo Card, Gallery Card, App Card, Player
Card, Product Card
Standards for Cards
Source: http://cardstack.io/
Source: TheFamily
[Figure: hype cycle, Visibility plotted against Maturity]
• Phases: Innovation Trigger, Peak of Inflated Expectation, Trough of Disillusionment,
Slope of Enlightenment, Plateau of Productivity
• Actors: Blind Leaders, Impeding Challengers, Opportunistic Leaders
• Rambling: Patent, Algorithm. Lots of small tests, learning by doing
• Building: Co-creation, Design Thinking, API, MVP, platforms, business models
• Funding ($$$): research tax credit, VCs, Innovation budget
• Selling: Selling, marketing, pricing, advertising, ecosystems
• Commercial Deployment
• Right Market, Right Product, Right Time, Right Business Model
• CIR: is a French
Source: Extended from Gartner Hype Cycle and Philippe MÉDA (Merkapt)
Digital Darwinism
Source: Altimeter
Digital Transformation Requires an Agile Approach
Source: BCG
[Figure: architecture layers and the practices attached to them]
• Layers: Strategy, Technology, Business Architecture, Application Portfolio,
Application Architecture, Technical Architecture
• Practices: Lean Startup, New Business Model, Service Design, Hybrid Cloud,
Infrastructure as Code & Lego, Devops & Elastic Infra., New Databases,
Non structured and immutable data, Authentication & Digital Keys Mgt, Appstore Mgt,
Agile Dev, Hackathon, API - RESTful, Elastic App, Big Data Analytics & Intelligence,
Minimum Viable Digital Platform, Microservices, API, Ephemeral Apps, Adaptive UX,
External SaaS Services (CRM, Marketing, Ads)
Design, Design, Design…
Source: Mike Clark
Agile Development & Devops
• Scale breaks hardware
• Scale enables elastic Cloud architecture and devops techniques
• Speed breaks software
• Speed enables and encourages new microservice architectures
Source: Adrian Cockcroft
Minimum Viable Digital Platform Needed…
The new digital, networked, real-time society forces us to start thinking and acting as an ecosystem.
Build your own Minimum Viable Digital Platform and create open APIs to encourage
startups and partners to hook in.
Minimum Viable Digital Platform Needed
[Figure labels: IT core services; Legacy Bus. App; Commodity services (CRM, social
network, etc.); Product-As-A-Service; Sovereign IT; Avoid Accidental Architecture]
Build Your Own Digital Platform!
[Figure labels: content; Bus. Services (API) or IS capability; Features in products;
Platform, microservice; Transition; Leverage]
• Stages: Ideation; Prototyping: Testing the Idea in Real Business; Deployment & Scaling
• 1st Question: Inside or outside Sovereign IT?
• 2nd Question: Granularity level?
• Adopt a microservice approach when possible when prototyping
Modern Architectures for Adaptive Apps
Source: Apigee
Are You Ready?
• Data: Did you open your data silos and offer data to all?
• Microservice: Did you leverage innovation by testing often and building
microservice-based prototypes validated with clients?
• Platform: Did you adopt a fully digital approach, creating the experience
and the platform first instead of focusing on a product?
• Mobile: Could a customer buy any of your company's products or services
using one finger in less than 15 seconds?
• Big data: Are the data concerning your products and your customers big
enough that they can no longer be managed in Excel?
• Infrastructure: Did you treat infrastructure as code and leverage continuous
development and deployment (devops)?
Are You Ready?
• Conversation: Could a client easily engage in a digital conversation or share
something about your company's products or services at any time, in different
ways?
• Customer service: Are you offering an end-to-end service, and are you able to
track every step of your customers' digital journeys?
• Brand reputation: Do you monitor your brand reputation in real time, and do
you nurture brand advocates?
• Agile: Are you ready to build and market a product/service in weeks and kill
it after two or three years?
Twitter: http://www.twitter.com/welkaim
SlideShare: http://www.slideshare.net/welkaim
LinkedIn: http://fr.linkedin.com/in/williamelkaim
La Revue Du Digital: http://www.larevuedudigital.com/william-el-kaim/