Structure Europe IT Operations of the future AI and analytics based automation 18. – 19. September 2013 Hans-Christian Boos @boosc
Sep 12, 2014
Wouldn't it be cool to have more time?
Because you can choose how to use your time
convert your time to
quality
innovation
$ $ $ savings
Focus on creating time out of operations
operations improvement
In IT we spend 80% of our time on operations
Converting time to money
success story on operating
a major banking application
[Chart: number of FTE experts needed, by level of automation]
status at project start, no automation: 112
30% automation: 78
50% automation: 45
80% automation: 22
arago, 93% automation: 8
[Chart annotations: minimum expected time gain, maximum expected result, time gain achieved]
business case for
first automation project
Converting time to quality
success story on automating the portal of the
world market leader in lighting solutions
[Timeline chart: availability of the application]
Apr 2006: start of project
Aug 2006: 1 month after end of migration
Dec 2006
Mar 2013: 10th generation of application, 900 Mio€ transactions p.a.
Availability values shown: 70 %, 95 %, 99,999 %, 99,998 %
Converting time to innovation
success story on moving from a governance to a
brokerage culture in IT provider management
[Diagram: Quality, Changeability, Cost: choose 2, forget 1. Quality, Changeability, Cost, Fun: have all.]
We treat IT like industrial production:
standardize, taylorize, consolidate
8 years in design and development –
unlikely
15+ years of monetisation on one platform –
wishful thinking
Accepting complexity as a fact of IT is key
We take a new approach combining
the best of both worlds
there are 2 approaches to operate IT
ROBOTIC / STANDARDIZED
+ efficiency
+ repeatability
- no flexibility
- limited to IaaS / PaaS
- no durability
PEOPLE CENTRIC
+ agility
+ accountability
- cost
- varying quality
- availability
A machine continuously trained by humans
and learning from its own activity is the solution
a new way of operating IT
ROBOTIC / STANDARDIZED
+ efficiency
+ repeatability
- no flexibility
- limited to IaaS / PaaS
- no durability
PEOPLE CENTRIC
+ agility
+ accountability
- cost
- varying quality
- availability
ANALYTICS BASED & KNOWLEDGE CENTRIC
+ continuous cost optimization
+ stable quality
+ scalability
+ agility
+ accountability
Automation augmented engineers
Engineering enabled automation
We have built a machine that operates IT just
like a human would – we call it AutoPilot
AutoPilot does not act like a machine!
To solve a maze, a machine follows a
pre-defined run-book.
1. Go forward 1
2. Turn left
3. Go forward 1
4. Turn right
5. Go forward 2
6. Turn left
7. Go forward 2
8. Turn right
9. Go forward 3
10. Turn right
11. Go forward 1
12. Turn left
13. Go forward 1
But we do not know the future….
… and the machine cannot use the
same run-book for a different maze!
AutoPilot only needs two pieces of
knowledge (Knowledge Item 1 and Knowledge Item 2) to solve these mazes!
IF my right hand touches the wall THEN walk forward
IF I cannot walk forward THEN turn left 90°
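The contrast between a fixed run-book and two rules can be sketched in a few lines of Python. This is only an illustration, not AutoPilot code: the grid encoding, the starting direction (north) and the rule order (the "cannot walk forward" rule is checked first) are assumptions made here. The two rules as stated cover corridors where the wall stays on the walker's right and every turn is a left turn, so both sample mazes are built that way.

```python
# Headings in clockwise order: north, east, south, west, as (row, column) steps.
HEADINGS = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def is_wall(maze, row, col):
    return maze[row][col] == "#"


def solve(maze, max_steps=100):
    """Apply the two rules until the goal 'G' is reached; return the actions taken."""
    # Locate the start cell 'S'; the walker initially faces north (an assumption).
    row, col = next((r, line.index("S")) for r, line in enumerate(maze) if "S" in line)
    heading = 0
    actions = []
    for _ in range(max_steps):
        if maze[row][col] == "G":
            return actions
        d_row, d_col = HEADINGS[heading]
        r_row, r_col = HEADINGS[(heading + 1) % 4]  # cell to the walker's right
        blocked = is_wall(maze, row + d_row, col + d_col)
        right_wall = is_wall(maze, row + r_row, col + r_col)
        if blocked:
            # IF I cannot walk forward THEN turn left 90°
            heading = (heading - 1) % 4
            actions.append("turn left")
        elif right_wall:
            # IF my right hand touches the wall THEN walk forward
            row, col = row + d_row, col + d_col
            actions.append("walk forward")
        else:
            raise RuntimeError("no knowledge item applies here")
    raise RuntimeError("step limit reached")


MAZE_A = [
    "#####",
    "#G..#",
    "###.#",
    "###S#",
    "#####",
]
MAZE_B = [
    "######",
    "#G...#",
    "####.#",
    "####.#",
    "####S#",
    "######",
]

for maze in (MAZE_A, MAZE_B):
    print(solve(maze))
```

The same two rules produce a different action sequence for each maze, which is the point of the slide: the knowledge stays fixed, the solution is generated per situation.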
AutoPilot’s solutions start out more
complicated, but AutoPilot learns.
Before machine learning After machine learning
And this works for every maze (non
cyclic ones, with the given knowledge).
Before machine learning After machine learning
No matter how big they get!
AutoPilot even deals with unexpected
or unforeseeable ad hoc changes.
As AutoPilot approaches this wall it suddenly closes.
And this one opens.
Because AutoPilot looks at its situation
like a subject matter expert would.
Sounds cool, doesn’t it?
But how is this relevant
to IT operations?
Because finding a way to solve an IT
task is like solving a maze on the fly.
Solving IT tasks is rarely a straightforward
process (contrary to the anticipation of run-books).
AutoPilot picks knowledge one by one
to create a solution on the fly.
A piece of knowledge – a Knowledge
Item – is a simple rule with context.
Knowledge Item – KI
Abstraction
IF IN CONTEXT (bind condition)
AND THIS (condition)
THEN THAT (execute action)
Knowledge Item – KI
Example
IF IN CONTEXT: On Linux machine
AND THIS: Want to look at log file
THEN THAT: Find location of log file
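The three parts of a Knowledge Item can be pictured as a small data structure. The sketch below is an assumption-laden illustration, not AutoPilot's actual model: the `KnowledgeItem` class, the `Situation` dictionary and the `/var/log` value are all hypothetical and chosen only to mirror the log file example.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical: a flat description of the current situation.
Situation = Dict[str, str]


@dataclass
class KnowledgeItem:
    name: str
    context: Callable[[Situation], bool]    # IF IN CONTEXT
    condition: Callable[[Situation], bool]  # AND THIS
    action: Callable[[Situation], None]     # THEN THAT

    def applies(self, situation: Situation) -> bool:
        # A KI is usable only when both its context and its condition hold.
        return self.context(situation) and self.condition(situation)


def find_log_location(situation: Situation) -> None:
    # Hypothetical action: record an illustrative location instead of touching a real system.
    situation["log_dir"] = "/var/log"


find_log = KnowledgeItem(
    name="Find location of log file",
    context=lambda s: s.get("os") == "linux",
    condition=lambda s: s.get("want") == "look at log file",
    action=find_log_location,
)

linux_box = {"os": "linux", "want": "look at log file"}
windows_box = {"os": "windows", "want": "look at log file"}

for box in (linux_box, windows_box):
    if find_log.applies(box):
        find_log.action(box)

print(linux_box.get("log_dir"), windows_box.get("log_dir"))
```

The KI fires on the Linux machine and stays silent on the other one, because the context does not match there.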
A KI in its raw XML format (easy to
transform, easy to exchange).
So let us look at real life
usage of knowledge.
Here is an excerpt from a knowledge pool
we use at arago to perform SW tests.
Check is engine installed
Install SW if needed
Create EC2 Spot instance
Extract EC2 instance FQDN
Set EC2 AMI ID SLES
Read EC2 instance Information
Set EC2 AMI ID CentOS
Set Type/Price for Spot Inst. Request
Check EC2 Spot Request Status
Parse EC2 Spot Request Output
Shutdown unused EC2 inst.
Start some test with EC2 instance
Start tests if precond. are OK
EC2 install repository on SLES
Install SW on SLES
Install SW on Linux
Run Simple AutoPilot CLI Test
Retry Install
EC2 install repository
React on error „repository not found“
Check AutoPilot CLI
React on „No provider of“ msg.
Here knowledge is categorized into
four classes for easier visualization.
Solutions are built step-by-step using the
knowledge in the pool.
When examining the solution after
execution it looks like a script.
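How a solution can be assembled step by step from a pool is sketched below. This is a minimal forward-chaining loop written for illustration only: the KI names come from the pool shown on the slides, but the `requires` and `adds` facts, the `KI` class and the selection strategy (first applicable KI wins) are assumptions, not AutoPilot's engine.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class KI:
    name: str
    requires: FrozenSet[str]  # facts that must hold before the KI can be used
    adds: FrozenSet[str]      # facts that hold after the KI has run


def ki(name, requires, adds):
    return KI(name, frozenset(requires), frozenset(adds))


# Hypothetical facts attached to six KIs of the pool.
POOL = [
    ki("Set Type/Price for Spot Inst. Request", {"task:spot instance"}, {"price set"}),
    ki("Set EC2 AMI ID SLES", {"os:SLES"}, {"ami set"}),
    ki("Set EC2 AMI ID CentOS", {"os:CentOS"}, {"ami set"}),
    ki("Create EC2 Spot instance", {"price set", "ami set"}, {"spot requested"}),
    ki("Parse EC2 Spot Request Output", {"spot requested"}, {"output parsed"}),
    ki("Check EC2 Spot Request Status", {"output parsed"}, {"instance running"}),
]


def build_solution(task_facts: Set[str], goal: str) -> List[str]:
    """Pick KIs one by one until the goal fact holds; return the KI names in order."""
    facts = set(task_facts)
    script = []
    while goal not in facts:
        for item in POOL:
            # Usable if its requirements hold and it still contributes something new.
            if item.requires <= facts and not item.adds <= facts:
                facts |= item.adds
                script.append(item.name)
                break
        else:
            raise RuntimeError("no knowledge item applies; the pool needs more knowledge")
    return script


for os_name in ("CentOS", "SLES"):
    print(build_solution({"task:spot instance", f"os:{os_name}"}, "instance running"))
```

Read after the fact, each returned list looks like a script. It was not written in advance, though: changing one fact of the task (CentOS or SLES) changes which KI is picked.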
So let’s take a look at what is possible with
this pool of just 22 KIs.
Example 1
Setup AutoPilot test environment
and run software tests
First we give AutoPilot
a task it can identify.
Issue Detail View: Do AutoPilot Test
Give the task of performing AutoPilot tests to the machine
Start some test with EC2 instance: AutoPilot found knowledge how to do test on EC2
We prepare to allocate IaaS for our tests
at AWS using spot priced instances.
Issue Detail View: Do AutoPilot Test
Give the task of performing AutoPilot tests to the machine
Start some test with EC2 instance: AutoPilot found knowledge how to do test on EC2
Set Type/Price for Spot Inst. Request: Figure out what we are willing to pay for IaaS
The task we gave to AutoPilot
requested a test on CentOS.
Issue Detail View
Start some test with EC2 instance: AutoPilot found knowledge how to do test on EC2
Set Type/Price for Spot Inst. Request: Figure out what we are willing to pay for IaaS
Set EC2 AMI ID CentOS: Choose CentOS as OS for the IaaS
etc. etc.
After our tests are done we
decommission our EC2 instances.
Issue Detail View
Start tests if precond. are OK: Check other test preconditions
Run Simple AutoPilot CLI Test: Perform SW test for AutoPilot CLI package
Set EC2 AMI ID CentOS: Decommission EC2 instance
Fully automated.
We are finished
The steps shown before are
summarized in a run-book next.
1. Start some test with EC2 instance: AutoPilot found knowledge how to do test on EC2
2. Set Type/Price for Spot Inst. Request: Figure out what we are willing to pay for IaaS
3. Set EC2 AMI ID CentOS: Choose CentOS as OS for the IaaS
4. Create EC2 Spot instance: Request the server from AWS
5. Parse EC2 Spot Request Output: Analyze the output AWS gave us
6. Check EC2 Spot Request Status: Check if the spot pricing request issued was granted
7. Parse EC2 Spot Request Output: Analyze the output AWS gave us
8. Check EC2 Spot Request Status: Check if the spot pricing request issued was granted
9. Parse EC2 Spot Request Output: Analyze the output AWS gave us
10. Read EC2 instance Information: Retrieve information on AWS instance
11. Extract EC2 instance FQDN: Retrieve FQDN of machine using AWS API call
12. Install SW if needed: Install requested software
13. Install SW on Linux: Install software on Linux
14. Check is engine installed: Check for working AutoPilot installation
15. Install SW on Linux: Install software on Linux
16. React on error „repository not found“: Decide what to do with a “repository not found” error
17. EC2 install repository: Install repository from EC2
18. Retry Install: Clear installation history from new install
19. Install SW on Linux: Install software on Linux
20. Check is engine installed: Check for working AutoPilot installation
21. Check AutoPilot CLI: Check if AutoPilot CLI is available
22. Run Simple AutoPilot CLI Test: Perform SW test for AutoPilot CLI package
23. Set EC2 AMI ID CentOS: Decommission EC2 instance
24. Start tests if precond. are OK: Check other test preconditions
This is only one script, one automatically
generated to solve a specific problem.
The same knowledge can solve millions
of tasks.
A representation more adequate to
AutoPilot is the Knowledge Graph.
All steps, just explained as a
Knowledge Graph.
[Knowledge Graph: the KIs of the pool as nodes, labelled with the numbers of the steps in which they were used. Labels shown: 1; 2; 3; 4; 5,7,9; 6,8; 10; 11; 12; 16; 13,15,19; 17; 18; 14,20; 21; 22; 23; 24]
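The step labels of the Knowledge Graph can be derived from the sequence of executed KIs. The sketch below assumes the order of the first 21 run-book steps as read from the slides (the order of the last three is not unambiguous in the extracted text, so they are left out). The function names are hypothetical, and the typographic quotes in one KI name are replaced by plain quotes.

```python
from collections import defaultdict

# Executed KIs in order; the list position is the step number (starting at 1).
RUN_BOOK = [
    "Start some test with EC2 instance",
    "Set Type/Price for Spot Inst. Request",
    "Set EC2 AMI ID CentOS",
    "Create EC2 Spot instance",
    "Parse EC2 Spot Request Output",
    "Check EC2 Spot Request Status",
    "Parse EC2 Spot Request Output",
    "Check EC2 Spot Request Status",
    "Parse EC2 Spot Request Output",
    "Read EC2 instance Information",
    "Extract EC2 instance FQDN",
    "Install SW if needed",
    "Install SW on Linux",
    "Check is engine installed",
    "Install SW on Linux",
    'React on error "repository not found"',
    "EC2 install repository",
    "Retry Install",
    "Install SW on Linux",
    "Check is engine installed",
    "Check AutoPilot CLI",
]


def steps_per_ki(run_book):
    """Map each KI (graph node) to the step numbers at which it was executed."""
    steps = defaultdict(list)
    for number, name in enumerate(run_book, start=1):
        steps[name].append(number)
    return dict(steps)


def edges(run_book):
    """Directed edges between consecutive KIs, without duplicates, in first-seen order."""
    seen = []
    for pair in zip(run_book, run_book[1:]):
        if pair not in seen:
            seen.append(pair)
    return seen


labels = steps_per_ki(RUN_BOOK)
for name, numbers in labels.items():
    print(",".join(map(str, numbers)), name)
print(len(edges(RUN_BOOK)), "distinct edges")
```

Nodes that are visited several times, such as the spot request polling or the repeated installation attempt, carry several step numbers. This matches labels like 5,7,9 and 13,15,19 on the graph.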
Example 2
With no additional knowledge, just by
posing a different task, we can generate
any single EC2 instance, this time not
CentOS but SLES
We skip how AutoPilot develops the
solution step by step… The result:
[Knowledge Graph for the SLES task: the same KIs as nodes. Step labels shown: 1; 2,4; 3; 5; 6; 7,8; 9,14,15; 10; 11; 12; 13]
Same knowledge, different task,
still fully automated.
Example Summary
With these 2 examples you can begin
to imagine how many other possible
tasks AutoPilot can perform with a
Knowledge Pool of only 22 KIs.
Here are just a few more real tasks
AutoPilot can perform with these 22 KIs.
Install any software on SLES CentOS from repositories
Install any software on SLES CentOS from packages
Provide EC2 Instance Status information
Create AWS EC2 Instances
Download and install software packages
Scale up AWS EC2 instance size
Install AutoPilot 4.1 SLES CentOS
Install AutoPilot 4.0 SLES CentOS
Install AutoPilot unstable SLES CentOS
Start/Terminate AWS EC2 Instances
Create AWS EC2 Spot Instances
Run AutoPilot tests in AWS environment
Run AutoPilot tests in AWS Spot Instance
Dynamically create and setup AutoPilot cluster nodes in AWS
Add software repositories SLES CentOS
Provide information about broken dependencies
Fix broken dependencies SLES CentOS
Shutdown EC2 instance when no longer needed
Clone and configure general purpose server
Check if AutoPilot engine instance is running properly
Update dynamic domain name from instance
Run commands with AutoPilot CLI
Install standard Linux Web Server with Apache, Tomcat, …
Install mail server on Linux OS
Install proxy server on Linux OS
Install content filter on Linux OS
Create a cluster of 1..n Linux servers
Dynamically add Servers to a cluster of systems
Install MySQL database on Linux Server
Decommission single no longer used instance in a cluster
Create a copy of a running server based on model
…
From Example 2 – Automated Amazon Cloud Spot Market provisioning
From Example 1 – AutoPilot software test
Create EC2 Spot instance
Set Type/Price for Spot Inst. Request
Check EC2 Spot Request Status
Parse EC2 Spot Request Output
Start some test with EC2 instance
Create EC2 Spot instance
Check EC2 Spot Request Status
Parse EC2 Spot Request Output
For different tasks, AutoPilot uses
different entries in the KI graph.
We know exactly where to install.
We just do it.
We do not know where to install?
We can find out.
All or little input: no challenge to
AutoPilot – if it doesn’t know, it finds out.
Extract EC2 instance FQDN
Read EC2 instance Information
Install SW if needed
Install SW on SLES
Install SW if needed
Install SW on SLES
The software was not installed properly? AutoPilot fixes the problem. This specific event
did not have to be anticipated by the subject matter experts creating the knowledge
pool and neither did they have to program case specific error handling.
AutoPilot handles unforeseeable events
by working with them.
[Knowledge Graph excerpt with step labels 1 to 8: Check is engine installed, Install SW on Linux, Retry Install, EC2 install repository, React on error „repository not found“, Check AutoPilot CLI]
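The recovery path can be sketched with rules that react to an observed error instead of error handling coded into the install step. Everything in this sketch is hypothetical (the `env` dictionary, its keys and the simulated `install` function); only the KI names are taken from the slides.

```python
def install(env):
    # Hypothetical stand-in for a package installation on the target machine.
    if env["repo_configured"]:
        env["installed"] = True
    else:
        env["error"] = "repository not found"


# Each rule: (KI name, condition on the environment, action on the environment).
RULES = [
    ("Install SW on Linux",
     lambda e: not e["installed"] and e["error"] is None,
     install),
    ('React on error "repository not found"',
     lambda e: e["error"] == "repository not found" and not e["need_repo"],
     lambda e: e.update(need_repo=True)),
    ("EC2 install repository",
     lambda e: e["need_repo"] and not e["repo_configured"],
     lambda e: e.update(repo_configured=True)),
    ("Retry Install",
     lambda e: e["error"] is not None and e["repo_configured"],
     lambda e: e.update(error=None)),
    ("Check AutoPilot CLI",
     lambda e: e["installed"] and not e["cli_checked"],
     lambda e: e.update(cli_checked=True)),
]


def run(repo_configured):
    """Apply the first matching rule until the CLI check has passed; return the trace."""
    env = {"installed": False, "error": None, "need_repo": False,
           "repo_configured": repo_configured, "cli_checked": False}
    trace = []
    while not env["cli_checked"]:
        for name, condition, action in RULES:
            if condition(env):
                action(env)
                trace.append(name)
                break
        else:
            raise RuntimeError("no knowledge item applies")
    return trace


print(run(repo_configured=True))
print(run(repo_configured=False))
```

With the repository in place the trace has two steps. Without it, the same rules take the detour through the repository installation and the retry, and no rule had to name that sequence in advance.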
Let us talk about creating
Knowledge Pools
Do you think you have to hire PhDs or
wizards to create and maintain KIs?
No, you don’t!
Knowledge is created
by ordinary people…
…people who have the knowledge…
…people who do the job now….
…people who most likely
have something better to do!
We will show you how
the Knowledge Pool we worked with in
the examples was created.
Knowledge is entered into the pool
bottom up.
That means knowledge is entered, after
it was needed for the first time.
The thing done manually first:
Straightforward: install AutoPilot on AWS cloud
Create AWS environment, run tests,
decommission AWS instances
Extract EC2 instance FQDN
Read EC2 instance Information
Start some test with EC2 instance
Run Simple AutoPilot CLI Test
Start EC2 Instance
Shutdown unused EC2 instances
The beginning of the Knowledge Pool
you saw before.
OK… worked fine, but
We wanted to be able to run multiple
environments with different software.
Ensure compatible operating system (CentOS)
Install any software upon request, not pre-determined
So we needed to be able to install
software based on model information.
Install SW if needed
Install SW on Linux
Retry Install
EC2 install repository
React on error „repository not found“
Set EC2 AMI ID CentOS
Resolve version conflicts in an installation of AutoPilot
And deal with incompatible versions of
our SW being installed as part of tests.
Check is engine installed
Check AutoPilot CLI
Start tests if precond. are OK
The Knowledge Pool has evolved.
e.g. it is now able to install any RPM.
Check is engine installed
Install SW if needed
Extract EC2 instance FQDN
Read EC2 instance Information
Set EC2 AMI ID CentOS
Start some test with EC2 instance
Install SW on Linux
Run Simple AutoPilot CLI Test
Retry Install
EC2 install repository
React on error „repository not found“
Check AutoPilot CLI
Start tests if precond. are OK
Start EC2 Instance
Shutdown unused EC2 instances
OK… worked fine, but someone came
along and needed a test for a new OS.
Request SUSE Linux Enterprise Server from AWS
A new OS request KI was created to
request SLES servers on Amazon
Set EC2 AMI ID
SLES
Perform software installs on SLES and handle exceptions
and perform SLES compatible
installation procedures.
EC2 install repository on SLES
Install SW on SLES
React on „No provider of“ msg.
The new Knowledge Pool can install
packaged SW on either CentOS or SLES.
Check is engine installed
Install SW if needed
Extract EC2 instance FQDN
Set EC2 AMI ID SLES
Read EC2 instance Information
Start some test with EC2 instance
EC2 install repository on SLES
Install SW on SLES
Install SW on Linux
Run Simple AutoPilot CLI Test
Retry Install
EC2 install repository
React on error „repository not found“
React on „No provider of“ msg.
Start EC2 Instance
Set EC2 AMI ID CentOS
Shutdown unused EC2 instances
Start tests if precond. are OK
Check AutoPilot CLI
OK… worked fine, until our first bill from
AWS and then we wanted spot prices.
Adding new KIs to request AWS spot
instances, handle availability and pricing.
Request AWS spot instances and check for availability, re-request if desired pricing
cannot be obtained in time.
Create EC2 Spot
instance
Set Type/Price for
Spot Inst. Request
Check EC2 Spot
Request Status
Parse EC2 Spot
Request Output
The Knowledge Pool we used before is
complete now.
Check is engine installed
Install SW if needed
Create EC2 Spot instance
Extract EC2 instance FQDN
Set EC2 AMI ID SLES
Read EC2 instance Information
Set EC2 AMI ID CentOS
Set Type/Price for Spot Inst. Request
Check EC2 Spot Request Status
Parse EC2 Spot Request Output
Shutdown unused EC2 instances
Start some test with EC2 instance
Start tests if precond. are OK
EC2 install repository on SLES
Install SW on SLES
Install SW on Linux
Run Simple AutoPilot CLI Test
Retry Install
EC2 install repository
React on error „repository not found“
Check AutoPilot CLI
React on „No provider of“ msg.
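The bottom-up growth of the pool can be retraced by adding the KIs of each stage to a set. The stage names below are labels invented for this sketch, and typographic quotes in KI names are replaced by plain quotes. One assumption is marked in the code: "Start EC2 Instance" appears in the earlier pools but not in the final one, so the sketch treats it as superseded by the spot instance KIs.

```python
initial = {
    "Extract EC2 instance FQDN", "Read EC2 instance Information",
    "Start some test with EC2 instance", "Run Simple AutoPilot CLI Test",
    "Start EC2 Instance", "Shutdown unused EC2 instances",
}
install_any_software = {
    "Install SW if needed", "Install SW on Linux", "Retry Install",
    "EC2 install repository", 'React on error "repository not found"',
    "Set EC2 AMI ID CentOS",
}
version_conflicts = {
    "Check is engine installed", "Check AutoPilot CLI", "Start tests if precond. are OK",
}
sles_support = {
    "Set EC2 AMI ID SLES", "EC2 install repository on SLES",
    "Install SW on SLES", 'React on "No provider of" msg.',
}
spot_pricing = {
    "Create EC2 Spot instance", "Set Type/Price for Spot Inst. Request",
    "Check EC2 Spot Request Status", "Parse EC2 Spot Request Output",
}

pool = set()
sizes = []
for stage in (initial, install_any_software, version_conflicts, sles_support, spot_pricing):
    pool |= stage          # knowledge is added only when it was first needed
    sizes.append(len(pool))

# Assumption: "Start EC2 Instance" was superseded by "Create EC2 Spot instance".
final_pool = pool - {"Start EC2 Instance"}
print(sizes, len(final_pool))
```

Five small additions of three to six KIs each lead to the pool of 22 KIs used in the examples.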
Effort?
With the same effort needed in another
environment to create 4 scripts…
…ordinary subject matter experts used
AutoPilot to create the foundation for
potentially automating millions of tasks.
instead of protecting the status quo engineers
do what engineers do best…
… make things better !
Thank you for your time, which we hope was well invested, because dismissing good ideas can harm your future.
Register at
http://www.arago.de/autopilot-ce/