JPK Group Business Forecasting & Analytics Forum
September 18-19, 2017 • Chicago, IL
Analytics Landscape Evolution – September 19, 2:15 pm
View presentation online at: https://jpkgroupsummits.com/attendee1/
Siamak "Sammy" Amirghodsi – Options Clearing Corporation
How neural nets and deep learning are changing the analytics landscape
I am a Senior Enterprise Technology Executive delivering the strategic planning, technology leadership, and innovative platforms that create revolutionary insights from Big Data, substantial cost savings, and improved productivity. I harness technology to create end-to-end, massive change. A key advisor to executive management and an architect of multi-million-dollar projects, I deliver the insight and actions that shape and grow organizations. I break down complexity and technological barriers into language that leaders can unite behind. Leveraging my expertise and business acumen, I secure buy-in on a technology roadmap aligned to corporate and long-term goals. Having worked across many departments and industries, from financial services and healthcare to eCommerce and consumer product development, I develop the strategic relationships, built on value and integrity, that get everyone on board with complex projects.
Siamak Amirghodsi (Sammy) – Data Intelligence and Analytics Summit, Chicago
• Over 20 years of designing, building, managing and executing large-scale distributed systems in Fortune 20 companies. Led the build-out of a cutting-edge FX payments data platform and analytics solution for a tier-1 investment bank in the US. Currently leading the data platform and real-time analytics build-out for a tier-1 exchange-related institution in the United States.
• Certified on Cloudera Big Data Platform (Developer, Admin and HBase).
• Actively follow Hadoop (MapReduce, HDFS, YARN), Spark (Streaming / SQL / MLlib / GraphX / SparkR), Hive, Pig, Zookeeper, Amazon AWS, Cassandra, HBase, Neo4j, BlockChain, KDB+, RedShift and MongoDB while being fully grounded in traditional IBM/Oracle/Microsoft technology stacks for business continuity and integration.
• Key interests include cognitive models, big data, Hadoop, Spark, streaming systems, deep machine learning, the Google Brain project, swarm algorithms, quantum computing, trading-signal discovery, long-term commodity cycles, cryptography, digital/crypto currencies, BlockChain, probabilistic graphical models and NLP.
OCC's Role
- Issues and guarantees U.S. listed contracts
- Provides clearing and settlement services
- Provides risk management to ensure the marketplace is not disrupted in the event of a clearing member default
- Options listed on more than 3,600 stocks and more than 600 indices and ETFs
- Cleared more than 4.2 billion contracts in 2015
- Year-to-date average daily options volume: 16.4 million
- Highest-volume trading day, 8/8/11: 41.5 million
Options Clearing Corporation – Of course we are hiring!
Rise of Connected Consciousness & Deep Learning
The Dawn of the 4th Industrial Revolution – Data is the new Silicon!
A Darwinian moment in time…
Trends that are emerging at the same point in time
Ø Commoditization of data and infrastructure
ü Point to ponder: Commoditization and democratization are two sides of the same coin!
Ø The rise of augmented intelligence, facilitated by near real-time computational engines
Ø Spark-like platforms with the potential to be the common language (lingua franca, or trade language) for the enterprise in the new economy
Ø Advances in GPUs paired with cluster computing
Ø Cloud makes the world of technology frictionless
Ø Data Science Accelerator – extreme parallels to the quants on Wall Street and how they changed finance in the '70s, '80s, '90s and 2000s
Ø Survival of the fittest
ü The rise of the digital economy and changes in customer behavior will force companies to adopt machine learning
Ø Machine Learning as a competitive advantage
[Charts: image-processing and speech-recognition error rates over time]
Economic Value (EV): Why is Deep Learning getting so much attention?
Because it is better!
Beats the competition hands down, in a visible way
Better performance than the state of the art!
Equivalent performance at lower cost (labor)
Important use cases remain
Images and source credited to original creator
Stanford / Google: Deep Learning at Tera Scale
Face Detector • Human Body Detector • Cat Detector
Deep Learning and complex nonlinearity in real-world examples
Machine Learning Scheme vs. Deep Learning & Neural Network Based Scheme
What is Deep Learning? How is it different?
Ø It is a form of Machine Learning
ü It is a subclass of machine learning in which the system learns complex abstract concepts, in layers, from the data it is provided
Ø It is a form of Representation Learning
ü The choice of features has a profound effect on performance
ü Feature engineering is very important, but extremely labor intensive
ü It makes it easier to extract features and understand the surrounding world
ü In a probabilistic model, a good feature representation captures the posterior distribution of the underlying world it operates in
Ø It can deal with complex non-linear functions
ü It forms complex non-linear regions by combining simpler non-linear activation functions in a hierarchy
ü It uses weights and stacking of layers to achieve non-trivial non-linearity
ü The optimization (SGD, AdaGrad, Dropout) discovers the probability space for object classification
Ø It does not require hand-crafted feature engineering
ü It figures out what is relevant in a hierarchy of layers
ü It does not require human interaction
ü For example: you never tell it that an 8 is made of two circle-like objects
Ø It can be generative!
ü It can dream
ü It can produce a stereotypical representation of what it discovered – "The Google Cat"
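The idea of combining simpler non-linear activations in a hierarchy can be sketched in a few lines. The example below is my own minimal illustration, not from the talk: a tiny two-layer network with hand-set (not learned) weights, whose stacked ReLU units carve out the non-linear XOR region that no single linear layer can.

```python
import numpy as np

def relu(z):
    # A simple non-linear activation function
    return np.maximum(0.0, z)

# Hand-set weights for a tiny 2-2-1 network that computes XOR.
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
W2 = np.array([1.0, -2.0])
b2 = 0.0

def xor_net(x):
    h = relu(x @ W1 + b1)  # hidden layer: two simple non-linear units
    return h @ W2 + b2     # output layer combines them into a complex region

# XOR is not linearly separable, yet the stacked layers reproduce it exactly
for x, y in [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]:
    assert xor_net(np.array(x, dtype=float)) == y
```

Stacking more such layers is what lets deep networks form arbitrarily complex non-linear decision regions from very simple building blocks.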
Key Deep Learning Architectures
- Stacked Auto-Encoder (SAE)
- Restricted Boltzmann Machine (RBM)
- Recurrent Neural Network (RNN)
- Deep Belief Network (DBN)
- Convolutional Neural Network (CNN)
- Others: Multi-Modal / Multi-Tasking, RNTN, DSN, etc.
Cognitive Psychology: How should we teach machines to learn?
• We try to see the forest and blind ourselves to changes in individual trees
• The price of evolutionary survival?
Ø Invariant Recognition
• We are not fooled by variations of the same object
• We will recognize variations of the same object in complex, unnatural surroundings
Ø Visual Cortex
• Hubel and Wiesel
• Nobel Prize, 1981: https://www.youtube.com/watch?v=y_l4kQ5wjiw
• Discoveries in visual systems
• Vision is learned – the kitten experiment
• Revealed the pattern of organization of brain cells
• How connections between nerve cells filter and transform sensory information from the retina to the cerebral cortex
The machine sees, hears and understands objects differently!
A horse is no longer a horse in Deep Learning!
Multi-Layer Learning: a horse as seen by a ConvNet vs. a horse as seen by the human
Demo: Image recognition
Deep Learning – "The Intuition: Roots in the physical sciences"
• Derivation from the Ising model / Spin Glass – energy
• A Restricted Markov Random Field model, meaning we do not allow any connections within a level (Ln)
• The restriction makes it easy to learn
• Two layers of nodes
• Input, hidden layer(s), and then an output layer
• Tries to learn the abstractions via the hidden layers, then passes them to the output layer, which in turn classifies the output into the desired outcomes
• Symmetrical in nature
• Technically you could get away with just one hidden layer while still observing the restrictions, but in reality you need to stack more layers, which leads to DBNs (Deep Belief Networks)
Restricted Boltzmann Machine (RBM)
[Figure: visible and hidden units, weights, energy, probabilities]
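The energy view of an RBM can be sketched directly from the standard formulation: the energy of a joint visible/hidden configuration is E(v, h) = -aᵀv - bᵀh - vᵀWh, and the "restriction" (no intra-layer connections) makes the hidden units conditionally independent given the visible layer. The layer sizes and random weights below are illustrative assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 4, 3
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # visible-hidden weights
a = np.zeros(n_visible)  # visible biases
b = np.zeros(n_hidden)   # hidden biases

def energy(v, h):
    """Standard RBM energy: E(v, h) = -a.v - b.h - v^T W h."""
    return -a @ v - b @ h - v @ W @ h

def p_h_given_v(v):
    """Because no hidden unit connects to another hidden unit,
    the conditional factorizes into independent sigmoids."""
    return 1.0 / (1.0 + np.exp(-(b + v @ W)))

v = np.array([1.0, 0.0, 1.0, 0.0])
h = np.array([1.0, 1.0, 0.0])
print(energy(v, h))
print(p_h_given_v(v))
```

Lower-energy configurations are exponentially more probable under the Boltzmann distribution, which is exactly the "weights bias the system" idea on the Ising slide that follows.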
Deep Learning is Physics + Chemistry + some statistics in between – Say what?
ʘ Named after Ernst Ising, but invented by Wilhelm Lenz in 1920
ʘ A mathematical model of ferromagnetism in statistical mechanics
ʘ A discrete variable that can be in one of two states (-1, 1) in a graph setting
ʘ 1) W: real-valued weights, 2) X: Gaussian random variable, 3) Hamiltonian energy function, 4) Frustration: constraints
ʘ A two-dimensional lattice, but can be extended into 'n' dimensions
ʘ Spins interact with their neighbors
ʘ This model, even in its simplest form (square lattice), allows for identification of phase transitions
ʘ The system of spins and their neighbors likes to converge to a low-energy state
ʘ The relationship (degree of correlation) of the spins is governed by the weights – a funnel energy landscape
ʘ The probability that the entire system settles into a given state is closely related to the Boltzmann distribution
ʘ We can bias the system by changing the weights
ʘ The bias acts as a form of training
The Ising / Spin Glass Model – "Our team's focus: the p-spherical spin-glass frustration model"
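A minimal sketch of the model just described, under my own simplifying assumptions (a 1-D ring of four ±1 spins with uniform coupling J; the team's p-spherical spin-glass variant is far more elaborate): the Hamiltonian sums neighbor interactions, and the Boltzmann distribution assigns the highest probability to the low-energy, aligned states.

```python
import numpy as np
from itertools import product

J = 1.0     # uniform ferromagnetic coupling (assumption)
beta = 0.5  # inverse temperature (assumption)

def ising_energy(spins):
    """Hamiltonian of a 1-D ring of +/-1 spins: H = -J * sum_i s_i * s_{i+1}."""
    s = np.asarray(spins)
    return -J * np.sum(s * np.roll(s, 1))

# Boltzmann distribution over all 2^4 states of a small ring
states = list(product([-1, 1], repeat=4))
weights = np.array([np.exp(-beta * ising_energy(s)) for s in states])
probs = weights / weights.sum()  # P(state) ~ exp(-beta * H(state))

# The most probable configuration is a fully aligned (low-energy) one
best = states[int(np.argmax(probs))]
print(best)
```

Changing J (the weights) reshapes this distribution, which is the "bias the system by changing the weights" step that RBM training generalizes.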
From Ising Model to Restricted Boltzmann Machine – The Ising Model
A hierarchical Ising model yields a Restricted Boltzmann Machine (RBM) – The intuition
[Figure: input layer, hidden layer, output layer; output classes: triangle, rectangle, circle]
We know how to:
1. Calculate the energy output of two connected neurons
2. Calculate the energy of a neighborhood of neurons
3. Calculate the total energy of the whole system
We now assume a multi-dimensional Ising model:
1. The dimensions, or planes of neurons, are stacked in a hierarchical manner
2. No neuron is allowed to make a connection to another neuron on the same plane (level)
3. A neuron can only be connected to a neuron in the next plane
4. The weight of the connection determines the strength (degree of affinity) of the connection
5. Each neuron can be connected to as many other neurons as it wishes (usually fully connected) on the next plane
6. Each neuron follows the usual rules (input, output, activation, etc.), except that there are no connections to neurons on the same level (unlike the original Ising model)
7. Do all of this and you will have a DBN or RBM
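The stacking steps above can be sketched as a plane-to-plane sampling pass. This is a hypothetical toy (the layer sizes, random weights, and sigmoid activation are my assumptions, not the speaker's): because units on one plane connect only to the next plane, every unit's state can be sampled independently given the previous plane.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample_layer(inputs, W, bias):
    """One plane-to-plane pass: units connect only to the *next* plane,
    never to units on their own plane, so all units sample independently."""
    p = sigmoid(inputs @ W + bias)
    return (rng.random(p.shape) < p).astype(float), p

# Hypothetical stack: input plane (6) -> hidden plane (4) -> output plane (3)
W1, b1 = rng.normal(scale=0.1, size=(6, 4)), np.zeros(4)
W2, b2 = rng.normal(scale=0.1, size=(4, 3)), np.zeros(3)

v = rng.integers(0, 2, size=6).astype(float)  # binary input plane
h, _ = sample_layer(v, W1, b1)  # plane 1 -> plane 2
o, _ = sample_layer(h, W2, b2)  # plane 2 -> plane 3
print(v, h, o)
```

Training such a stack (e.g. layer-by-layer, as in DBNs) adjusts W1 and W2 so that low-energy configurations correspond to the patterns in the data.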
A hierarchical Ising model yields a Restricted Boltzmann Machine (RBM) – The intuition
[Figure: input layer, hidden layer(s), output layer (circle, triangle, rectangle); my unsupervised training set]
A hierarchical Ising model yields a Restricted Boltzmann Machine (RBM)
Does Figure 2 look familiar?
[Figure 1: Ising lattice | Figure 2: RBM layer diagram]
Why forecasting in finance is hard
[Panels: trending, noisy, meandering, quasi-periodic, cyclical, leveled series]
Time Series – Why is it so hard?
Variety: trending, noisy, meandering, periodic
Types: univariate, multivariate
Switching system: thresholds, jumps
Time-varying properties: mean, variance, other moments
Volume of data: randomness, nature of forecasting
Ø Temporal stability, strength of disturbance, lead time
Ø LSTM – not enough (signal/noise ratio in finance)
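The poor signal/noise ratio mentioned above can be illustrated with synthetic data (the amplitude, period, and noise scale below are arbitrary assumptions of mine, not market estimates): a weak quasi-periodic component buried in dominant noise yields a variance ratio far below one, which is why even capable sequence models like LSTMs struggle on financial series.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
t = np.arange(n)

signal = 0.05 * np.sin(2 * np.pi * t / 250)  # weak quasi-periodic component
noise = rng.normal(scale=1.0, size=n)        # dominant noise, as in asset returns
returns = signal + noise

# Signal-to-noise ratio as a variance ratio: far below 1
snr = signal.var() / noise.var()
print(snr)
```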
Easy learning functions vs. real-world learning
Smoothness and locality make it simple to predict; real-world problems do not have strong locality.
Ø Easy problems are smooth functions with occasional bumps, which makes them easy to estimate and optimize based on the principles of parsimony and locality.
Ø The function does not vary too much within a neighborhood
Ø I can exploit locality, interpolate, and predict
Ø Real-world problems often have a non-smooth function or surface without a continuous locality, which makes them hard to estimate and forecast.
Ø In the real world I might not have enough samples to interpolate
Ø We assign near-zero probability to most outcomes and very high probability to a few – the kernel function is not smooth but too skewed to be useful.
Ø Core issue: there is too much variation in the data for the sample size, which causes the Gaussian process to break down – it is the variations!
Ø How: we need to discover structure non-locally, especially in high dimensions.
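A toy demonstration of the locality argument (the functions and sample size are my own choices, not from the slides): local linear interpolation works when the function varies slowly within a neighborhood, and breaks down when it oscillates faster than the sample spacing.

```python
import numpy as np

rng = np.random.default_rng(2)
x_train = np.sort(rng.uniform(0, 1, 20))   # only 20 samples on [0, 1]
x_test = np.linspace(0.01, 0.99, 200)

smooth = lambda x: np.sin(2 * np.pi * x)   # varies slowly within a neighborhood
wiggly = lambda x: np.sin(60 * np.pi * x)  # varies faster than the sample spacing

def interp_error(f):
    """Mean squared error of local linear interpolation from the same 20 samples."""
    pred = np.interp(x_test, x_train, f(x_train))
    return np.mean((pred - f(x_test)) ** 2)

print(interp_error(smooth), interp_error(wiggly))
```

The smooth function interpolates well from 20 points; the fast-varying one does not, because no neighborhood of the samples constrains it. This is the sense in which deep models must "discover structure non-locally."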
Option Trading & Risk Management: as a game theory / optimization problem?
Variables of the standard (theoretical) Black-Scholes model:
• Stock price
• Strike price
• Time remaining until expiration, expressed as a percent of a year
• Current risk-free interest rate
• Volatility, measured by annual standard deviation
The Greeks
A collection of statistical values that give the investor a better overall view of how option premiums change given changes in pricing-model inputs. These values can help decide which options strategies to use.
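The five variables listed above are exactly the arguments of the closed-form Black-Scholes call price, and two of the Greeks fall out of the same quantities. A minimal sketch using only the standard formula (the sample inputs at the bottom are arbitrary):

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(S, K, T, r, sigma):
    """Black-Scholes call price plus two Greeks (delta, vega).
    S: stock price, K: strike price, T: time to expiration (fraction of a year),
    r: risk-free interest rate, sigma: annualized volatility."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    price = S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)
    delta = norm_cdf(d1)  # sensitivity of price to the stock price
    vega = S * math.exp(-0.5 * d1**2) / math.sqrt(2 * math.pi) * math.sqrt(T)
    return price, delta, vega

price, delta, vega = black_scholes_call(S=100, K=100, T=1.0, r=0.05, sigma=0.2)
print(round(price, 2), round(delta, 3), round(vega, 2))  # 10.45 0.637 37.52
```

Each Greek is a partial derivative of the price with respect to one input, which is why the slide frames risk management as an optimization problem over those inputs.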
Hadoop Summit: http://2013.hadoopsummit.org/ http://2014.hadoopsummit.org/ http://2015.hadoopsummit.org/
O'Reilly Books and Media: http://www.oreilly.com/
Lynda.com: http://www.lynda.com/
Google: http://Google.com, http://Youtube.com, http://slideshare.com (web images, video, news, search, books)
Plus various other books and white papers on big data, Hadoop, Machine Learning, programming, science and Spark
Quotes: http://www.azquotes.com/
Supporting material: www.slideshare.com
Tutorials
http://info.usherbrooke.ca/hlarochelle/neural_networks/content.html
http://deeplearning.stanford.edu/tutorial
Papers
Deep learning: Methods and Applications (Deng & Yu, 2013)
Representation learning: A review and new perspectives (Bengio et al., 2014)
Learning deep architectures for AI (Bengio, 2009)
Slides & Video
http://www.cs.toronto.edu/~fleet/courses/cifarSchool09/slidesBengio.pdf
http://nlp.stanford.edu/courses/NAACL2013
Book
Deep Learning (Bengio et al., 2015): http://www.iro.umontreal.ca/~bengioy/dlbook/
Libraries
Torch (Lua): https://github.com/torch/torch7
Theano (Python): https://github.com/Theano/Theano/
Deeplearning4j (word2vec for Java): https://github.com/SkymindIO/deeplearning4j
ND4J (Java): http://nd4j.org https://github.com/SkymindIO/nd4j
DeepLearn Toolbox (MATLAB): https://github.com/rasmusbergpalm/DeepLearnToolbox/graphs/contributors
convnetjs (JavaScript): https://github.com/karpathy/convnetjs
Gensim (word2vec for Python): https://github.com/piskvorky/gensim
Caffe (image): http://caffe.berkeleyvision.org
Deep Learning Libraries
http://deeplearning4j.org/compare-dl4j-torch7-pylearn.html
http://www.kdnuggets.com/2015/06/popular-deep-learning-tools.html
http://ucb-icsi-vision-group.github.io/caffe-paper/caffe.pdf
https://www.reddit.com/r/MachineLearning/comments/2c9x0s/best_framework_for_deep_neural_nets/
https://github.com/soumith/convnet-benchmarks
http://www.picalike.com/blog/2015/01/12/the-portrait-of-a-machine-learning-priestess/
http://openann.github.io/OpenANN-apidoc/OtherLibs.html
http://www.infoworld.com/article/2853707/machine-learning/11-open-source-tools-machine-learning.html#slide11
http://www.infoworld.com/article/2853707/machine-learning/11-open-source-tools-machine-learning.html#slide12
http://torch.ch
http://deeplearning4j.org
http://www.predictiveanalyticstoday.com/deep-learning-software-libraries/
http://fastml.com/torch-vs-theano/
https://www.quora.com/Which-is-the-best-deep-learning-framework-Theano-Torch7-or-Caffe
Demos
ConvNet in action: http://cs231n.github.io/convolutional-networks/ http://cs231n.stanford.edu
Stacked autoencoder: http://ufldl.stanford.edu/wiki/index.php/Stacked_Autoencoders
RNN: https://en.wikipedia.org/wiki/Backpropagation_through_time
Datasets: http://www.cs.toronto.edu/~kriz/cifar.html http://yann.lecun.com/exdb/mnist/