JPK Group Business Forecasting & Analytics Forum
September 18-19, 2017 • Chicago, IL
Analytics Landscape Evolution – September 19, 2:15 pm
View presentation online at: https://jpkgroupsummits.com/attendee1/
Siamak "Sammy" Amirghodsi – Options Clearing Corporation
How neural nets and deep learning are changing the analytics landscape
I am a Senior Enterprise Technology Executive delivering the strategic planning, technology leadership, and innovative platforms that create revolutionary insights from Big Data, substantial cost savings, and improved productivity. I harness technology to create end-to-end, massive change. A key advisor to executive management and an architect of multi-million-dollar projects, I deliver the insight and actions that shape and grow organizations. I break down complexity and technological barriers into language that leaders can unite behind. Leveraging my expertise and business acumen, I secure buy-in on a technology roadmap aligned to corporate and long-term goals. Having worked across many departments and industries, from financial services and healthcare to eCommerce and consumer product development, I develop the strategic relationships, built on value and integrity, that get everyone on board with complex projects.
Siamak Amirghodsi (Sammy) – Data Intelligence and Analytics Summit, Chicago
• Over 20 years of designing, building, managing and executing large-scale distributed systems in Fortune 20 companies. Led the build-out of a cutting-edge FX payments data platform and analytics solution for a tier-1 investment bank in the US. Currently leading the data platform and real-time analytics build-out for a tier-1 exchange-related institution in the United States.
• Certified on Cloudera Big Data Platform (Developer, Admin and HBase).
• Actively follow Hadoop (MapReduce, HDFS, YARN), Spark (Streaming / SQL / MLlib / GraphX / SparkR), Hive, Pig, Zookeeper, Amazon AWS, Cassandra, HBase, Neo4j, BlockChain, KDB+, RedShift and MongoDB while being fully grounded in traditional IBM/Oracle/Microsoft technology stacks for business continuity and integration.
• Key interests include cognitive models, big data, Hadoop, Spark, streaming systems, deep machine learning, the Google Brain project, swarm algorithms, quantum computing, trading-signal discovery, long-term commodity cycles, cryptography, digital/crypto currencies, BlockChain, probabilistic graphical models and NLP.
OCC's Role
- Issues and guarantees U.S. listed contracts
- Provides clearing and settlement services
- Provides risk management to ensure the marketplace is not disrupted in the event of a clearing member default
- Options listed on more than 3,600 stocks and more than 600 indices and ETFs
- Cleared more than 4.2 billion contracts in 2015
- Year-to-date average daily options volume: 16.4 million
- Highest-volume trading day, 8/8/11: 41.5 million
Options Clearing Corporation – Of course we are hiring!
Rise of Connected Consciousness & Deep Learning
The Dawn of the 4th Industrial Revolution – Data is the new Silicon!
A Darwinian moment in time…
Trends that are emerging at the same point in time
Ø Commoditization of data and infrastructure
ü Point to ponder: Commoditization and democratization are two sides of the same coin!
Ø The rise of augmented intelligence, facilitated by near real-time computational engines
Ø Spark-like platforms with the potential to be the common language (lingua franca, or trade language) for the enterprise in the new economy
Ø Advances in GPUs paired with cluster computing
Ø Cloud makes the world of technology frictionless
Ø Data Science Accelerator – extreme parallels to the quants on Wall Street and how they changed finance in the '70s, '80s, '90s and 2000s
Ø Survival of the fittest
ü The rise of the digital economy and changes in customer behavior will force companies to adopt machine learning
Ø Machine Learning as a competitive advantage
[Charts: image-processing and speech-recognition error rates over time]
Economic Value (EV): Why is Deep Learning getting so much attention?
Because it is better!
Beats the competition hands down, in a visible way
Better performance than the state of the art!
Equivalent performance at lower cost (labor)
Important use cases remain
Images and source credited to original creator
Stanford / Google: Deep Learning at Tera Scale
Face Detector • Human Body Detector • Cat Detector
Deep Learning and complex nonlinearity in real-world examples
Machine Learning Scheme vs. Deep Learning & Neural Network Based Scheme
What is Deep Learning? How is it different?
Ø It is a form of Machine Learning
ü It is a subclass of machine learning in which the system learns complex abstract concepts, in layers, from the data it is provided
Ø It is a form of Representation Learning
ü The choice of features has a profound effect on performance
ü Feature engineering is very important, but extremely labor intensive
ü It makes it easier to extract features and understand the surrounding world
ü In a probabilistic model, a good feature representation captures the posterior distribution of the underlying world it operates in
Ø It can deal with complex non-linear functions
ü It forms complex non-linear regions by combining simpler non-linear activation functions in a hierarchy
ü It uses weights and stacking of layers to achieve non-trivial non-linearity
ü The optimization (SGD, AdaGrad, Dropout) discovers the probability space for object classification
Ø It does not require hand-crafted feature engineering
ü It figures out what is relevant in a hierarchy of layers
ü It does not require human interaction
ü For example: you never tell it that an 8 is made of two circle-like objects
Ø It can be generative!
ü It can dream
ü It can produce a stereotypical representation of what it discovered – "The Google Cat"
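The idea of combining simpler non-linear activations in a hierarchy can be sketched in a few lines. The example below is my own minimal illustration, not from the talk: a tiny two-layer network with hand-set (not learned) weights, whose stacked ReLU units carve out the non-linear XOR region that no single linear layer can.

```python
import numpy as np

def relu(z):
    # A simple non-linear activation function
    return np.maximum(0.0, z)

# Hand-set weights for a tiny 2-2-1 network that computes XOR.
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
W2 = np.array([1.0, -2.0])
b2 = 0.0

def xor_net(x):
    h = relu(x @ W1 + b1)  # hidden layer: two simple non-linear units
    return h @ W2 + b2     # output layer combines them into a complex region

# XOR is not linearly separable, yet the stacked layers reproduce it exactly
for x, y in [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]:
    assert xor_net(np.array(x, dtype=float)) == y
```

Stacking more such layers is what lets deep networks form arbitrarily complex non-linear decision regions from very simple building blocks.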
Key Deep Learning Architectures
- Stacked Auto-Encoder (SAE)
- Restricted Boltzmann Machine (RBM)
- Recurrent Neural Network (RNN)
- Deep Belief Network (DBN)
- Convolutional Neural Network (CNN)
- Others: Multi-Modal / Multi-Tasking, RNTN, DSN, etc.
Cognitive Psychology: How should we teach machines to learn?
• We try to see the forest and blind ourselves to changes in individual trees
• The price of evolutionary survival?
Ø Invariant Recognition
• We are not fooled by variations of the same object
• We will recognize variations of the same object in complex, unnatural surroundings
Ø Visual Cortex
• Hubel and Wiesel
• Nobel Prize, 1981: https://www.youtube.com/watch?v=y_l4kQ5wjiw
• Discoveries in visual systems
• Vision is learned – the kitten experiment
• Revealed the pattern of organization of brain cells
• How connections between nerve cells filter and transform sensory information from the retina to the cerebral cortex
The machine sees, hears and understands objects differently!
A horse is no longer a horse in Deep Learning!
Multi-Layer Learning: a horse as seen by a ConvNet vs. a horse as seen by the human
Demo: Image recognition
Deep Learning – "The Intuition: Roots in the physical sciences"
• Derivation from the Ising model / Spin Glass – energy
• A Restricted Markov Random Field model, meaning we do not allow any connections within a level (Ln)
• The restriction makes it easy to learn
• Two layers of nodes
• Input, hidden layer(s), and then an output layer
• Tries to learn the abstractions via the hidden layers, then passes them to the output layer, which in turn classifies the output into the desired outcomes
• Symmetrical in nature
• Technically you could get away with just one hidden layer while still observing the restrictions, but in reality you need to stack more layers, which leads to DBNs (Deep Belief Networks)
Restricted Boltzmann Machine (RBM)
[Figure: visible and hidden units, weights, energy, probabilities]
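The energy view of an RBM can be sketched directly from the standard formulation: the energy of a joint visible/hidden configuration is E(v, h) = -aᵀv - bᵀh - vᵀWh, and the "restriction" (no intra-layer connections) makes the hidden units conditionally independent given the visible layer. The layer sizes and random weights below are illustrative assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 4, 3
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # visible-hidden weights
a = np.zeros(n_visible)  # visible biases
b = np.zeros(n_hidden)   # hidden biases

def energy(v, h):
    """Standard RBM energy: E(v, h) = -a.v - b.h - v^T W h."""
    return -a @ v - b @ h - v @ W @ h

def p_h_given_v(v):
    """Because no hidden unit connects to another hidden unit,
    the conditional factorizes into independent sigmoids."""
    return 1.0 / (1.0 + np.exp(-(b + v @ W)))

v = np.array([1.0, 0.0, 1.0, 0.0])
h = np.array([1.0, 1.0, 0.0])
print(energy(v, h))
print(p_h_given_v(v))
```

Lower-energy configurations are exponentially more probable under the Boltzmann distribution, which is exactly the "weights bias the system" idea on the Ising slide that follows.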
Deep Learning is Physics + Chemistry + some statistics in between – Say what?
ʘ Named after Ernst Ising, but invented by Wilhelm Lenz in 1920
ʘ A mathematical model of ferromagnetism in statistical mechanics
ʘ A discrete variable that can be in one of two states (-1, 1) in a graph setting
ʘ 1) W: real-valued weights, 2) X: Gaussian random variable, 3) Hamiltonian energy function, 4) Frustration: constraints
ʘ A two-dimensional lattice, but can be extended into 'n' dimensions
ʘ Spins interact with their neighbors
ʘ This model, even in its simplest form (square lattice), allows for identification of phase transitions
ʘ The system of spins and their neighbors likes to converge to a low-energy state
ʘ The relationship (degree of correlation) of the spins is governed by the weights – a funnel energy landscape
ʘ The probability that the entire system settles into a given state is closely related to the Boltzmann distribution
ʘ We can bias the system by changing the weights
ʘ The bias acts as a form of training
The Ising / Spin Glass Model – "Our team's focus: the p-spherical spin-glass frustration model"
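A minimal sketch of the model just described, under my own simplifying assumptions (a 1-D ring of four ±1 spins with uniform coupling J; the team's p-spherical spin-glass variant is far more elaborate): the Hamiltonian sums neighbor interactions, and the Boltzmann distribution assigns the highest probability to the low-energy, aligned states.

```python
import numpy as np
from itertools import product

J = 1.0     # uniform ferromagnetic coupling (assumption)
beta = 0.5  # inverse temperature (assumption)

def ising_energy(spins):
    """Hamiltonian of a 1-D ring of +/-1 spins: H = -J * sum_i s_i * s_{i+1}."""
    s = np.asarray(spins)
    return -J * np.sum(s * np.roll(s, 1))

# Boltzmann distribution over all 2^4 states of a small ring
states = list(product([-1, 1], repeat=4))
weights = np.array([np.exp(-beta * ising_energy(s)) for s in states])
probs = weights / weights.sum()  # P(state) ~ exp(-beta * H(state))

# The most probable configuration is a fully aligned (low-energy) one
best = states[int(np.argmax(probs))]
print(best)
```

Changing J (the weights) reshapes this distribution, which is the "bias the system by changing the weights" step that RBM training generalizes.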
From Ising Model to Restricted Boltzmann Machine – The Ising Model
A hierarchical Ising model yields a Restricted Boltzmann Machine (RBM) – The intuition
[Figure: input layer, hidden layer, output layer; output classes: triangle, rectangle, circle]
We know how to:
1. Calculate the energy output of two connected neurons
2. Calculate the energy of a neighborhood of neurons
3. Calculate the total energy of the whole system
We now assume a multi-dimensional Ising model:
1. The dimensions, or planes of neurons, are stacked in a hierarchical manner
2. No neuron is allowed to make a connection to another neuron on the same plane (level)
3. A neuron can only be connected to a neuron in the next plane
4. The weight of the connection determines the strength (degree of affinity) of the connection
5. Each neuron can be connected to as many other neurons as it wishes (usually fully connected) on the next plane
6. Each neuron follows the usual rules (input, output, activation, etc.), except that there are no connections to neurons on the same level (unlike the original Ising model)
7. Do all of this and you will have a DBN or RBM
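The stacking steps above can be sketched as a plane-to-plane sampling pass. This is a hypothetical toy (the layer sizes, random weights, and sigmoid activation are my assumptions, not the speaker's): because units on one plane connect only to the next plane, every unit's state can be sampled independently given the previous plane.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample_layer(inputs, W, bias):
    """One plane-to-plane pass: units connect only to the *next* plane,
    never to units on their own plane, so all units sample independently."""
    p = sigmoid(inputs @ W + bias)
    return (rng.random(p.shape) < p).astype(float), p

# Hypothetical stack: input plane (6) -> hidden plane (4) -> output plane (3)
W1, b1 = rng.normal(scale=0.1, size=(6, 4)), np.zeros(4)
W2, b2 = rng.normal(scale=0.1, size=(4, 3)), np.zeros(3)

v = rng.integers(0, 2, size=6).astype(float)  # binary input plane
h, _ = sample_layer(v, W1, b1)  # plane 1 -> plane 2
o, _ = sample_layer(h, W2, b2)  # plane 2 -> plane 3
print(v, h, o)
```

Training such a stack (e.g. layer-by-layer, as in DBNs) adjusts W1 and W2 so that low-energy configurations correspond to the patterns in the data.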
A hierarchical Ising model yields a Restricted Boltzmann Machine (RBM) – The intuition
[Figure: input layer, hidden layer(s), output layer (circle, triangle, rectangle); my unsupervised training set]
A hierarchical Ising model yields a Restricted Boltzmann Machine (RBM)
Does Figure 2 look familiar?
[Figure 1: Ising lattice | Figure 2: RBM layer diagram]
Why forecasting in finance is hard
[Panels: trending, noisy, meandering, quasi-periodic, cyclical, leveled series]
Time Series – Why is it so hard?
Variety: trending, noisy, meandering, periodic
Types: univariate, multivariate
Switching system: thresholds, jumps
Time-varying properties: mean, variance, other moments
Volume of data: randomness, nature of forecasting
Ø Temporal stability, strength of disturbance, lead time
Ø LSTM – not enough (signal/noise ratio in finance)
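The poor signal/noise ratio mentioned above can be illustrated with synthetic data (the amplitude, period, and noise scale below are arbitrary assumptions of mine, not market estimates): a weak quasi-periodic component buried in dominant noise yields a variance ratio far below one, which is why even capable sequence models like LSTMs struggle on financial series.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
t = np.arange(n)

signal = 0.05 * np.sin(2 * np.pi * t / 250)  # weak quasi-periodic component
noise = rng.normal(scale=1.0, size=n)        # dominant noise, as in asset returns
returns = signal + noise

# Signal-to-noise ratio as a variance ratio: far below 1
snr = signal.var() / noise.var()
print(snr)
```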
Easy learning functions vs. real-world learning
Smoothness and locality make it simple to predict; real-world problems do not have strong locality.
Ø Easy problems are smooth functions with occasional bumps, which makes them easy to estimate and optimize based on the principles of parsimony and locality.
Ø The function does not vary too much within a neighborhood
Ø I can exploit locality, interpolate, and predict
Ø Real-world problems often have a non-smooth function or surface without a continuous locality, which makes them hard to estimate and forecast.
Ø In the real world I might not have enough samples to interpolate
Ø We assign near-zero probability to most outcomes and very high probability to a few – the kernel function is not smooth but too skewed to be useful.
Ø Core issue: there is too much variation in the data for the sample size, which causes the Gaussian process to break down – it is the variations!
Ø How: we need to discover structure non-locally, especially in high dimensions.
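A toy demonstration of the locality argument (the functions and sample size are my own choices, not from the slides): local linear interpolation works when the function varies slowly within a neighborhood, and breaks down when it oscillates faster than the sample spacing.

```python
import numpy as np

rng = np.random.default_rng(2)
x_train = np.sort(rng.uniform(0, 1, 20))   # only 20 samples on [0, 1]
x_test = np.linspace(0.01, 0.99, 200)

smooth = lambda x: np.sin(2 * np.pi * x)   # varies slowly within a neighborhood
wiggly = lambda x: np.sin(60 * np.pi * x)  # varies faster than the sample spacing

def interp_error(f):
    """Mean squared error of local linear interpolation from the same 20 samples."""
    pred = np.interp(x_test, x_train, f(x_train))
    return np.mean((pred - f(x_test)) ** 2)

print(interp_error(smooth), interp_error(wiggly))
```

The smooth function interpolates well from 20 points; the fast-varying one does not, because no neighborhood of the samples constrains it. This is the sense in which deep models must "discover structure non-locally."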
Option Trading & Risk Management: as a game theory / optimization problem?
Variables of the standard (theoretical) Black-Scholes model:
• Stock price
• Strike price
• Time remaining until expiration, expressed as a percent of a year
• Current risk-free interest rate
• Volatility, measured by annual standard deviation
The Greeks
A collection of statistical values that give the investor a better overall view of how option premiums change given changes in pricing-model inputs. These values can help decide which options strategies to use.
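The five variables listed above are exactly the arguments of the closed-form Black-Scholes call price, and two of the Greeks fall out of the same quantities. A minimal sketch using only the standard formula (the sample inputs at the bottom are arbitrary):

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(S, K, T, r, sigma):
    """Black-Scholes call price plus two Greeks (delta, vega).
    S: stock price, K: strike price, T: time to expiration (fraction of a year),
    r: risk-free interest rate, sigma: annualized volatility."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    price = S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)
    delta = norm_cdf(d1)  # sensitivity of price to the stock price
    vega = S * math.exp(-0.5 * d1**2) / math.sqrt(2 * math.pi) * math.sqrt(T)
    return price, delta, vega

price, delta, vega = black_scholes_call(S=100, K=100, T=1.0, r=0.05, sigma=0.2)
print(round(price, 2), round(delta, 3), round(vega, 2))  # 10.45 0.637 37.52
```

Each Greek is a partial derivative of the price with respect to one input, which is why the slide frames risk management as an optimization problem over those inputs.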
Hadoop Summit: http://2013.hadoopsummit.org/ http://2014.hadoopsummit.org/ http://2015.hadoopsummit.org/
O'Reilly Books and Media: http://www.oreilly.com/
Lynda.com: http://www.lynda.com/
Google: http://Google.com, http://Youtube.com, http://slideshare.com (web images, video, news, search, books)
Plus various other books and white papers on big data, Hadoop, Machine Learning, programming, science and Spark
Quotes: http://www.azquotes.com/
Supporting material: www.slideshare.com
Tutorials
http://info.usherbrooke.ca/hlarochelle/neural_networks/content.html
http://deeplearning.stanford.edu/tutorial
Papers
Deep learning: Methods and Applications (Deng & Yu, 2013)
Representation learning: A review and new perspectives (Bengio et al., 2014)
Learning deep architectures for AI (Bengio, 2009)
Slides & Video
http://www.cs.toronto.edu/~fleet/courses/cifarSchool09/slidesBengio.pdf
http://nlp.stanford.edu/courses/NAACL2013
Book
Deep Learning (Bengio et al., 2015): http://www.iro.umontreal.ca/~bengioy/dlbook/
Libraries
Torch (Lua): https://github.com/torch/torch7
Theano (Python): https://github.com/Theano/Theano/
Deeplearning4j (word2vec for Java): https://github.com/SkymindIO/deeplearning4j
ND4J (Java): http://nd4j.org https://github.com/SkymindIO/nd4j
DeepLearn Toolbox (MATLAB): https://github.com/rasmusbergpalm/DeepLearnToolbox/graphs/contributors
convnetjs (JavaScript): https://github.com/karpathy/convnetjs
Gensim (word2vec for Python): https://github.com/piskvorky/gensim
Caffe (image): http://caffe.berkeleyvision.org
Deep Learning Libraries
http://deeplearning4j.org/compare-dl4j-torch7-pylearn.html
http://www.kdnuggets.com/2015/06/popular-deep-learning-tools.html
http://ucb-icsi-vision-group.github.io/caffe-paper/caffe.pdf
https://www.reddit.com/r/MachineLearning/comments/2c9x0s/best_framework_for_deep_neural_nets/
https://github.com/soumith/convnet-benchmarks
http://www.picalike.com/blog/2015/01/12/the-portrait-of-a-machine-learning-priestess/
http://openann.github.io/OpenANN-apidoc/OtherLibs.html
http://www.infoworld.com/article/2853707/machine-learning/11-open-source-tools-machine-learning.html#slide11
http://www.infoworld.com/article/2853707/machine-learning/11-open-source-tools-machine-learning.html#slide12
http://torch.ch
http://deeplearning4j.org
http://www.predictiveanalyticstoday.com/deep-learning-software-libraries/
http://fastml.com/torch-vs-theano/
https://www.quora.com/Which-is-the-best-deep-learning-framework-Theano-Torch7-or-Caffe
Demos
ConvNet in action: http://cs231n.github.io/convolutional-networks/ http://cs231n.stanford.edu
Stacked autoencoder: http://ufldl.stanford.edu/wiki/index.php/Stacked_Autoencoders
RNN: https://en.wikipedia.org/wiki/Backpropagation_through_time
Datasets: http://www.cs.toronto.edu/~kriz/cifar.html http://yann.lecun.com/exdb/mnist/