Cost Effective Machine Learning Technologies Machine Learning & AI Upstream Onshore Oil & Gas 2018 Congress Ricardo Vilalta Senior Data Scientist Adaptive Analytics LLC August 2018

Page 1

Cost Effective Machine Learning Technologies

Machine Learning & AI Upstream Onshore Oil & Gas 2018 Congress

Ricardo Vilalta

Senior Data Scientist Adaptive Analytics LLC

August 2018

Page 2

New Emerging Trends in Machine Learning

•  New Trends in Machine Learning

•  Transfer Learning

•  Active Learning

•  Deep Learning

•  Hardware for Machine Learning

•  Summary

Page 3

Machine Learning

[Diagram: Artificial Intelligence with subfields Search, Planning, Knowledge Representation, Machine Learning, and Robotics; Machine Learning with subareas Clustering, Classification, Genetic Algorithms, and Reinforcement Learning]

Page 4

Classification or Supervised Learning

Supervised Learning:

Training set x = {x1, x2, …, xN} (historic data)

Class or target vector y = {y1, y2, …, yN} (true labels, one per training example)

Find a function f(x) that takes a vector x and outputs a class y.

[Diagram: labeled pairs {(x, y)} are used to learn f(x)]
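
A minimal sketch of learning f(x) from labeled pairs, assuming a nearest-centroid classifier (chosen here only because it is short; the slides do not name a specific algorithm). The sensor readings and the helper name `fit_nearest_centroid` are made up for illustration.

```python
from collections import defaultdict

def fit_nearest_centroid(X, y):
    """Learn f(x) from labeled pairs {(x, y)}: store one mean vector per class."""
    sums = {}
    counts = defaultdict(int)
    for xi, yi in zip(X, y):
        if yi not in sums:
            sums[yi] = [0.0] * len(xi)
        sums[yi] = [s + v for s, v in zip(sums[yi], xi)]
        counts[yi] += 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def f(x):
        # Predict the class whose centroid is closest (squared Euclidean distance).
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return f

# Toy, made-up readings: two features per example.
X = [[1.0, 1.2], [0.8, 1.0], [3.0, 3.1], [3.2, 2.9]]
y = ["normal", "normal", "abnormal", "abnormal"]

f = fit_nearest_centroid(X, y)
print(f([0.9, 1.1]), f([3.1, 3.0]))
```

The returned `f` plays the role of f(x) on the slide: it takes a new vector and outputs a class.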

Page 5

Classification or Supervised Learning

[Figure labels: Normal Operation, Abnormal Operation]

Page 6

New Emerging Trends in Machine Learning

•  New Trends in Machine Learning

•  Transfer Learning

•  Active Learning

•  Deep Learning

•  Hardware for Machine Learning

•  Summary

Page 7

Transfer Learning

•  The goal is to transfer knowledge gathered from previous experience.

•  Also called Inductive Transfer or Learning to Learn.

•  Example: Invariant transformations across tasks.

[Diagram labels: Learn, Predictive Model, Transfer Experience, Adapt Model, New Predictive Model]

Page 8

Example

A problem occurs when the drill string is no longer free to move (i.e., to rotate or move vertically), a situation called Stuck Pipe.

Page 9

Importance

Motivation:

The problem of stuck pipes accounts for several billion dollars in losses from capital equipment and non-productive time. Developing a method to predict this event in real time has become a high priority for the drilling industry (now possible thanks to modern sensor techniques and advanced data analysis tools).

Page 10

Machine Learning Approach

Strategy: Use machine learning to analyze historical data and produce a model for prediction.

[Diagram label: Predictive Model]

Page 11

Model may fail on different wells

Reasons:

•  Different geological formations

•  Hook load profile varies with depth

•  Unexpected environmental conditions

Page 12

Transfer Learning

Scenarios:

1.  Labeling in a new domain is costly.

Example: Classification of Salt Deposits, with DB1 (labeled) and DB2 (unlabeled).

Page 13

Transfer Learning

Scenarios:

2.  The data is outdated: the model was created with one survey, but a new survey is now available.

[Diagram: Survey 1 feeds the Learning System; how to handle Survey 2 is the open question (?)]

Page 14

Traditional Approach to Classification

[Diagram: DB1, DB2, …, DBn each feed a separate Learning System]

Page 15

Transfer Learning

[Diagram: in the source domain, DB1 and DB2 each feed a Learning System; the resulting Knowledge is passed to the Learning System trained on DB new in the target domain]

Page 16

Knowledge of Parameters

Assume a prior distribution over the model parameters.

Source domain: learn the parameters and adjust the prior distribution.

Target domain: learn the parameters using the prior distribution obtained from the source.
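
One simple way to picture this is a conjugate Gaussian update on a single parameter. This is a sketch under stated assumptions, not necessarily the method behind the slide: the parameter values, the noise variance, and the function name `posterior_mean` are all hypothetical.

```python
from statistics import mean, pvariance

def posterior_mean(target_values, prior_mean, prior_var, noise_var):
    """Posterior mean of a Gaussian parameter under a Gaussian prior (conjugate update)."""
    n = len(target_values)
    precision = 1.0 / prior_var + n / noise_var
    return (prior_mean / prior_var + sum(target_values) / noise_var) / precision

# Source domain: hypothetical estimates of one parameter from several source tasks.
source_params = [9.0, 10.0, 11.0]
prior_mean = mean(source_params)      # center of the prior learned from the source
prior_var = pvariance(source_params)  # spread of the prior learned from the source

# Target domain: only a few observations are available.
target_values = [12.0, 14.0]
est = posterior_mean(target_values, prior_mean, prior_var, noise_var=4.0)
print(est)
```

With few target observations the estimate stays close to the source prior; as more target data arrives, the data term dominates.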

Page 17

Feature Transfer

Identify features common to all tasks.

Page 18

Example Weighting

[Figure legend: Source Class 1, Source Class 2, Target]

Page 19

Example Weighting

[Figure legend: Target Class 1, Target Class 2]

Page 20

Example Weighting

[Figure legend: Source Class 1 Data, Source Class 2 Data, Target Data, Source Model, Target Model]
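
The slides show example weighting only as figures. A rough sketch of the idea, assuming a Gaussian-kernel similarity to the target data as the weight (one of several possible weighting schemes, picked here for brevity), could look like this. All data values and function names are illustrative.

```python
import math

def example_weights(source_X, target_X, bandwidth=1.0):
    """Weight each source example by its mean Gaussian-kernel similarity to the target examples."""
    weights = []
    for xs in source_X:
        sims = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(xs, xt)) / (2 * bandwidth ** 2))
            for xt in target_X
        ]
        weights.append(sum(sims) / len(sims))
    return weights

def weighted_centroids(X, y, w):
    """Per-class weighted mean, so source examples that resemble the target dominate."""
    totals, sums = {}, {}
    for xi, yi, wi in zip(X, y, w):
        totals[yi] = totals.get(yi, 0.0) + wi
        prev = sums.get(yi, [0.0] * len(xi))
        sums[yi] = [p + wi * v for p, v in zip(prev, xi)]
    return {c: [s / totals[c] for s in sums[c]] for c in sums}

# Made-up one-dimensional data: labeled source examples, unlabeled target examples.
source_X = [[0.0], [1.0], [4.0], [5.0]]
source_y = [1, 1, 2, 2]
target_X = [[1.0], [1.5], [4.0]]

w = example_weights(source_X, target_X)
centroids = weighted_centroids(source_X, source_y, w)
print(w)
print(centroids)
```

Source examples far from the target data receive small weights, so the model fitted on the weighted source set shifts toward the target distribution.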

Page 21

Data Projection

When source instances cannot represent the target distribution at all in the parameter space, we can project the source and target datasets to a common feature space (i.e., we can align both datasets).
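
As a very simple illustration of aligning two datasets, the sketch below standardizes each feature within its own domain. This is an assumption made for brevity: the slide does not say which projection is used, and real alignment methods are usually more elaborate. The numbers are synthetic.

```python
from statistics import mean, pstdev

def standardize(X):
    """Map each feature to zero mean and unit variance within one domain."""
    cols = list(zip(*X))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant features
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in X]

# Synthetic source and target data recorded on very different scales.
source = [[10.0, 200.0], [12.0, 220.0], [14.0, 240.0]]
target = [[110.0, 20.0], [120.0, 22.0], [130.0, 24.0]]

source_aligned = standardize(source)
target_aligned = standardize(target)
print(source_aligned)
print(target_aligned)
```

After this step both datasets live in the same scaled space, so a model trained on the source rows can at least be applied to the target rows.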

Page 22

New Emerging Trends in Machine Learning

•  New Trends in Machine Learning

•  Transfer Learning

•  Active Learning

•  Deep Learning

•  Hardware for Machine Learning

•  Summary

Page 23

Classification is Costly: Labeling

A representative subset of objects is labeled as one of the following six classes:

•  Plain

•  Crater Floor

•  Convex Crater Walls

•  Concave Crater Walls

•  Convex Ridges

•  Concave Ridges

517 labeled segments.

Page 24

Active Learning

Pool-Based Sampling

Assume a small set of labeled examples and a large set of unlabeled examples. Here we evaluate and rank the whole set of unlabeled examples; we then choose one or more “important” examples.
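
The ranking step of pool-based sampling can be sketched as follows. The "importance" criterion here is an assumption (closeness of the predicted probability to 0.5), and `predict_proba` stands in for whatever model is currently trained; both are hypothetical.

```python
import math

def rank_pool_by_uncertainty(pool, predict_proba):
    """Rank every unlabeled example; the most uncertain (probability nearest 0.5) comes first."""
    return sorted(pool, key=lambda x: abs(predict_proba(x) - 0.5))

def select_queries(pool, predict_proba, k=1):
    """Choose the k examples to send to a human expert for labeling."""
    return rank_pool_by_uncertainty(pool, predict_proba)[:k]

# Hypothetical current model: a logistic curve with its decision boundary at x = 2.0.
def predict_proba(x):
    return 1.0 / (1.0 + math.exp(-(x - 2.0)))

pool = [0.0, 1.9, 3.5, 5.0]  # made-up unlabeled examples
queries = select_queries(pool, predict_proba, k=2)
print(queries)
```

In a full loop, the selected examples would be labeled, added to the training set, and the model retrained before ranking the pool again.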

Page 25

Sampling Based on Uncertainty

[Figure: uncertainty values 1.0, 0.5, 1.0]
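
The slide does not say which measure produced its uncertainty values. One common choice is the Shannon entropy of the predicted class distribution, sketched here as an assumption.

```python
import math

def entropy_bits(probs):
    """Shannon entropy (base 2) of a predicted class distribution."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

even_split = entropy_bits([0.5, 0.5])  # model cannot decide between two classes
confident = entropy_bits([1.0, 0.0])   # model is certain of one class
print(even_split, confident)
```

Under this measure, an example whose prediction is evenly split between two classes is maximally uncertain, and a fully confident prediction has zero uncertainty.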

Page 26

Sampling Based on Uncertainty

[Figure panels: 70% accuracy, 90% accuracy]

Figure taken from “Active Learning” by Burr Settles, Morgan & Claypool, 2012.

Page 27

Results with Active Learning

Page 28

New Emerging Trends in Machine Learning

•  New Trends in Machine Learning

•  Transfer Learning

•  Active Learning

•  Deep Learning

•  Hardware for Machine Learning

•  Summary

Page 29

Deep Learning

The idea is to disentangle factors of variation and to attain high-level representations.

[Figure: hierarchy of representations, from low to high level: Pixel Information; Edges and Contours; Small Object Parts; Engine, Main Fuselage; Commercial Planes, Military Planes]

Page 30

Deep Learning

•  We want to capture compact, high-level representations in an efficient and iterative manner.

Learning takes place at several levels of representation.

Think about a hierarchy of concepts of increasing complexity.

Low-level concepts are the foundation for high-level concepts.

Page 31

An Example in Deep Learning

Learn a “concept” (sedimentary rocks) from many images until a high-level representation is achieved.

Page 32

An Example in Deep Learning

Learn a hierarchy of abstract concepts using deep learning.

[Diagram labels: Local properties, Deep Learning, Global properties]

Page 33

Deep Learning on Seismic Data

Methodology

[Diagram labels: Cube of seismic data; Expert Labels; New training dataset; Learning Algorithm; Deep Learning]

Page 34

Supervised Learning of Geological Bodies

Challenges:

Single attributes bear incomplete information about the class.

Page 35

Supervised Learning of Geological Bodies

Challenges:

Deep learning can capture “global” features that detect entire geological bodies as the result of the non-linear combination of many local models.

Page 36

Deep Learning on Seismic Data

Decompose the seismic cube into small cubes to create a large number of examples.

Page 37

Deep Learning on Seismic Data

Each cube is an example that we can feed into a deep learning architecture.
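
The decomposition step can be sketched as below, assuming non-overlapping sub-cubes and a volume stored as nested lists (the slides do not say whether the cubes overlap or how the data is stored). The synthetic volume and the name `split_into_cubes` are for illustration only.

```python
def split_into_cubes(volume, size):
    """Split a 3-D volume (nested lists) into non-overlapping size x size x size sub-cubes."""
    nx, ny, nz = len(volume), len(volume[0]), len(volume[0][0])
    cubes = []
    for i in range(0, nx - size + 1, size):
        for j in range(0, ny - size + 1, size):
            for k in range(0, nz - size + 1, size):
                cube = [
                    [row[k:k + size] for row in plane[j:j + size]]
                    for plane in volume[i:i + size]
                ]
                cubes.append(((i, j, k), cube))  # keep the origin so labels can be attached later
    return cubes

# Synthetic 4 x 4 x 4 volume whose values encode their own position.
volume = [[[100 * i + 10 * j + k for k in range(4)] for j in range(4)] for i in range(4)]

cubes = split_into_cubes(volume, size=2)
print(len(cubes))
```

Each `(origin, cube)` pair is one training example; pairing it with the expert label at that location would give the training dataset.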

Page 38

New Emerging Trends in Machine Learning

•  New Trends in Machine Learning

•  Transfer Learning

•  Active Learning

•  Deep Learning

•  Hardware for Machine Learning

•  Summary

Page 39

Hardware for Machine Learning

Most machine learning applications require fast processing speeds and lots of memory and disk space. Applications are “computationally expensive”. Example: Deep learning.

Many applications in machine learning need matrix multiplications. Each calculation is simple, but there are many, MANY, of them.
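
To make "many" concrete, the sketch below multiplies two small matrices the naive way and counts the scalar multiply-add operations for a larger case. The matrix sizes are arbitrary examples.

```python
def matmul(A, B):
    """Naive product of A (n x m) and B (m x p), both given as nested lists."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)] for i in range(n)]

def multiply_add_count(n, m, p):
    """Number of scalar multiply-add operations in the naive product."""
    return n * m * p

product = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
ops = multiply_add_count(1024, 1024, 1024)
print(product)
print(ops)
```

A single product of two 1024 x 1024 matrices already needs over a billion multiply-adds, and training a deep network repeats products like this many times.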

Page 40

Hardware for Machine Learning

When CPUs are overwhelmed, one solution is to use GPUs. A GPU can execute many operations in parallel at very high speed. Disadvantage: about 4x more expensive than CPUs, and sometimes not really necessary.

Page 41

Hardware for Machine Learning

Suggested minimum requirements:

•  Memory: 16 GB (ideally 32 GB)

•  Disk Space: 2 TB

•  Processor: Intel 7th Generation or better; or AMD Ryzen 2nd generation

•  Very important: ** GPU **

If working remotely, it is better to use a simple device (tablet) and send information to a central server for analysis.

Page 42

Hardware for Machine Learning

Many manufacturers are already producing specialized chips that do deep learning at the hardware level:

•  TPUs (Tensor Processing Units) by Google

•  AMD’s new GPU

** It is too soon to know how well they will perform for future applications. **

Page 43

New Emerging Trends in Machine Learning

•  New Trends in Machine Learning

•  Transfer Learning

•  Active Learning

•  Deep Learning

•  Hardware for Machine Learning

•  Summary

Page 44

Summary

•  When we have similar classification tasks but there is an indication that the distributions have changed → Transfer Learning

•  When we have few training examples and labeling is expensive → Active Learning

•  When we need more abstract features → Deep Learning

•  Hardware for deep learning → look for large memory, disk space, top processors, and do NOT FORGET the GPU.