REVIEW SUMMARY

GEOPHYSICS

Machine learning for data-driven discovery in solid Earth geoscience

Karianne J. Bergen, Paul A. Johnson, Maarten V. de Hoop, Gregory C. Beroza*

BACKGROUND: The solid Earth, oceans, and atmosphere together form a complex interacting geosystem. Processes relevant to understanding Earth's geosystem behavior range in spatial scale from the atomic to the planetary, and in temporal scale from milliseconds to billions of years. Physical, chemical, and biological processes interact and have substantial influence on this complex geosystem, and humans interact with it in ways that are increasingly consequential to the future of both the natural world and civilization as the finiteness of Earth becomes increasingly apparent and limits on available energy, mineral resources, and fresh water increasingly affect the human condition. Earth is subject to a variety of geohazards that are poorly understood, yet increasingly impactful as our exposure grows through increasing urbanization, particularly in hazard-prone areas. We have a fundamental need to develop the best possible predictive understanding of how the geosystem works, and that understanding must be informed by both the present and the deep past. This understanding will come through the analysis of increasingly large geo-datasets and from computationally intensive simulations, often connected through inverse problems. Geoscientists are faced with the challenge of extracting as much useful information as possible and gaining new insights from these data, simulations, and the interplay between the two. Techniques from the rapidly evolving field of machine learning (ML) will play a key role in this effort.

ADVANCES: The confluence of ultrafast computers with large memory, rapid progress in ML algorithms, and the ready availability of large datasets places geoscience at the threshold of dramatic progress.
We anticipate that this progress will come from the application of ML across three categories of research effort: (i) automation, to perform a complex prediction task that cannot easily be described by a set of explicit commands; (ii) modeling and inverse problems, to create a representation that approximates numerical simulations or captures relationships; and (iii) discovery, to reveal new and often unanticipated patterns, structures, or relationships. Examples of automation include geologic mapping using remote-sensing data, characterizing the topology of fracture systems to model subsurface transport, and classifying volcanic ash particles to infer eruptive mechanism. Examples of modeling include approximating the viscoelastic response for complex rheology, determining wave speed models directly from tomographic data, and classifying diverse seismic events. Examples of discovery include predicting laboratory slip events using observations of acoustic emissions, detecting weak earthquake signals using similarity search, and determining the connectivity of subsurface reservoirs using groundwater tracer observations.

OUTLOOK: The use of ML in solid Earth geosciences is growing rapidly, but is still in its early stages and making uneven progress. Much remains to be done with existing datasets from long-standing data sources, which in many cases are largely unexplored. Newer, unconventional data sources such as light detection and ranging (LiDAR), fiber-optic sensing, and crowd-sourced measurements may demand new approaches through both the volume and the character of information that they present. Practical steps could accelerate and broaden the use of ML in the geosciences. Wider adoption of open-science principles such as open source code, open data, and open access will better position the solid Earth community to take advantage of rapid developments in ML and artificial intelligence.
Benchmark datasets and challenge problems have played an important role in driving progress in artificial intelligence research by enabling rigorous performance comparison, and they could play a similar role in the geosciences. Testing on high-quality datasets produces better models, and benchmark datasets make these data widely available to the research community. They also help recruit expertise from allied disciplines. Close collaboration between geoscientists and ML researchers will aid in making quick progress in ML geoscience applications. Extracting maximum value from geoscientific data will require new approaches for combining data-driven methods, physical modeling, and algorithms capable of learning with limited, weak, or biased labels. Funding opportunities that target the intersection of these disciplines, as well as a greater component of data science and ML education in the geosciences, could help bring this effort to fruition.

RESEARCH

Bergen et al., Science 363, 1299 (2019) 22 March 2019 1 of 1

The list of author affiliations is available in the full article online.
*Corresponding author. Email: [email protected]
Cite this article as K. J. Bergen et al., Science 363, eaau0323 (2019). DOI: 10.1126/science.aau0323

Digital geology. Digital representation of the geology of the conterminous United States. [Geology of the Conterminous United States at 1:2,500,000 scale; a digital representation of the 1974 P. B. King and H. M. Beikman map by P. G. Schruben, R. E. Arndt, W. J. Bawiec]

ON OUR WEBSITE: Read the full article at http://dx.doi.org/10.1126/science.aau0323

Downloaded from http://science.sciencemag.org/ on March 28, 2019


REVIEW

GEOPHYSICS

Machine learning for data-driven discovery in solid Earth geoscience

Karianne J. Bergen1,2, Paul A. Johnson3, Maarten V. de Hoop4, Gregory C. Beroza5*

Understanding the behavior of Earth through the diverse fields of the solid Earth geosciences is an increasingly important task. It is made challenging by the complex, interacting, and multiscale processes needed to understand Earth's behavior and by the inaccessibility of nearly all of Earth's subsurface to direct observation. Substantial increases in data availability and in the increasingly realistic character of computer simulations hold promise for accelerating progress, but developing a deeper understanding based on these capabilities is itself challenging. Machine learning will play a key role in this effort. We review the state of the field and make recommendations for how progress might be broadened and accelerated.

The solid Earth, oceans, and atmosphere together form a complex interacting geosystem. Processes relevant to understanding its behavior range in spatial scale from the atomic to the planetary, and in temporal scale from milliseconds to billions of years. Physical, chemical, and biological processes interact and have substantial influence on this complex geosystem. Humans interact with it too, in ways that are increasingly consequential to the future of both the natural world and civilization as the finiteness of Earth becomes increasingly apparent and limits on available energy, mineral resources, and fresh water increasingly affect the human condition. Earth is subject to a variety of geohazards that are poorly understood, yet increasingly impactful as our exposure grows through increasing urbanization, particularly in hazard-prone areas. We have a fundamental need to develop the best possible predictive understanding of how the geosystem works, and that understanding must be informed by both the present and the deep past.

In this review we focus on the solid Earth.

Understanding the material properties, chemistry, mineral physics, and dynamics of the solid Earth is a fascinating subject, and essential to meeting the challenges of energy, water, and resilience to natural hazards that humanity faces in the 21st century. Efforts to understand the solid Earth are challenged by the fact that nearly all of Earth's interior is, and will remain, inaccessible to direct observation. Knowledge of interior properties and processes is based on measurements taken at or near the surface; these measurements are discrete and are limited by natural obstructions, such that aspects of that knowledge are not constrained by direct measurement.

For this reason, solid Earth geoscience (sEg) is both a data-driven and a model-driven field, with inverse problems often connecting the two. Unanticipated discoveries increasingly will come from the analysis of large datasets, new developments in inverse theory, and procedures enabled by computationally intensive simulations. Over the past decade, the amount of data available to geoscientists has grown enormously, through larger deployments of traditional sensors and through new data sources and sensing modes. Computer simulations of Earth processes are rapidly increasing in scale and sophistication, such that they are increasingly realistic and relevant to predicting Earth's behavior. Among the foremost challenges facing geoscientists is how to extract as much useful information as possible and how to gain new insights from both data and simulations and the interplay between the two. We argue that machine learning (ML) will play a key role in that effort.

ML-driven breakthroughs have come initially in traditional fields such as computer vision and natural language processing, but scientists in other domains have rapidly adopted and extended these techniques to enable discovery more broadly (1–4). The recent interest in ML among geoscientists initially focused on automated analysis of large datasets, but has expanded into the use of ML to reach a deeper understanding of coupled processes through data-driven discoveries and model-driven insights. In this review we introduce the challenges faced by the geosciences, present emerging trends in geoscience research, and provide recommendations to help accelerate progress.

ML offers a set of tools to extract knowledge and draw inferences from data (5). It can also be thought of as the means to artificial intelligence (AI) (6), which involves machines that can perform tasks characteristic of human intelligence (7, 8). ML algorithms are designed to learn from experience and recognize complex patterns and relationships in data. ML methods take a different approach to analyzing data than classical analysis techniques (Fig. 1), an approach that is robust, fast, and allows exploration of a large function space (Fig. 2).

The two primary classes of ML algorithms are supervised and unsupervised techniques. In supervised learning, the ML algorithm "learns" to recognize a pattern or make general predictions using known examples. Supervised learning algorithms create a map, or model, f, that relates a data (or feature) vector x to a corresponding label or target vector y: y = f(x), using labeled training data [data for which both the input and the corresponding label (x(i), y(i)) are known and available to the algorithm] to optimize the model. For example, a supervised ML classifier might learn to detect cancer in medical images using a set of physician-annotated examples (9). A well-trained model should be able to generalize and make accurate predictions for previously unseen inputs (e.g., label medical images from new patients).

Unsupervised learning methods learn patterns or structure in datasets without relying on label characteristics. In a well-known example, researchers at Google's X lab developed a feature-detection algorithm that learned to recognize cats after being exposed to millions of images from YouTube without prompting or prior information about cats (10). Unsupervised learning is often used for exploratory data analysis or visualization in datasets for which no or few labels are available, and includes dimensionality reduction and clustering.

The many different algorithms for supervised and unsupervised learning each have relative strengths and weaknesses. The choice of algorithm depends on a number of factors, including (i) availability of labeled data, (ii) dimensionality of the data vector, (iii) size of the dataset, (iv) continuous- versus discrete-valued prediction target, and (v) desired model interpretability. The level of model interpretability may be of particular concern in geoscientific applications. Although interpretability may not be necessary in a highly accurate image recognition system, it is critical when the goal is to gain physical insight into the system.
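The supervised-learning loop described above, in which a model f is learned from labeled pairs and then applied to previously unseen inputs, can be made concrete with a deliberately simple nearest-centroid classifier. This is an illustrative sketch with invented toy data, not a method discussed in this review:

```python
import math

def fit_nearest_centroid(X, y):
    """Learn one centroid per class from labeled pairs (x_i, y_i)."""
    centroids = {}
    for label in set(y):
        pts = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return centroids

def predict(centroids, x):
    """The learned model f: assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))

# Invented two-feature training data: four labeled examples.
X_train = [(0.1, 0.2), (0.3, 0.1), (0.9, 1.0), (1.1, 0.8)]
y_train = ["noise", "noise", "event", "event"]
model = fit_nearest_centroid(X_train, y_train)
label = predict(model, (1.0, 0.9))  # generalize to a previously unseen input
```

Here `fit_nearest_centroid` plays the role of training (optimizing the model on labeled examples) and `predict` is the learned map y = f(x); a well-trained model assigns sensible labels to inputs it has never seen.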

Machine learning in solid Earth geosciences

Scientists have been applying ML techniques to problems in the sEg for decades (11–13). Despite the promise shown by early proof-of-concept studies, the community has been slow to adopt ML more broadly. This is changing rapidly. Recent performance breakthroughs in ML, including advances in deep learning and the availability of powerful, easy-to-use ML toolboxes, have led to renewed interest in ML among geoscientists. In sEg, researchers have leveraged ML to tackle a diverse range of tasks that we group into the three interconnected modes of automation, modeling and inverse problems, and discovery (Fig. 3).

RESEARCH

Bergen et al., Science 363, eaau0323 (2019) 22 March 2019 1 of 10

1Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA 94305, USA. 2Department of Earth and Planetary Sciences, Harvard University, Cambridge, MA 02138, USA. 3Geophysics Group, Los Alamos National Laboratory, Los Alamos, NM 87545, USA. 4Department of Computational and Applied Mathematics, Rice University, Houston, TX 77005, USA. 5Department of Geophysics, Stanford University, Stanford, CA 94305, USA.
*Corresponding author. Email: [email protected]



Automation is the use of ML to perform a complex task that cannot easily be described by a set of explicit commands. In automation tasks, ML is selected primarily as a tool for making highly accurate predictions (or labeling data), particularly when the task is difficult for humans to perform or explain. Examples of ML used for automation outside the geosciences include image recognition (14) or movie recommendation (15) systems. ML can improve upon expert-designed algorithms by automatically identifying better solutions among a larger set of possibilities. Automation takes advantage of a strength of ML algorithms, their ability to process and extract patterns from large or high-dimensional datasets, to replicate or exceed human performance. In the sEg, ML is used to automate the steps in large-scale data analysis pipelines, as in earthquake detection (16) or earthquake early warning (17–19), and to perform specialized, repetitive tasks that would otherwise require time-consuming expert analysis, such as categorizing volcanic ash particles (20).

ML can also be used for modeling, or creating a representation that captures relationships and structure in a dataset. This can take the form of building a model to represent complex, unknown, or incompletely understood relationships between data and target variables; e.g., the relationship between earthquake source parameters and peak ground acceleration for ground motion prediction (21, 22). ML can also be used to build approximate or surrogate models to speed large computations, including numerical simulations (23, 24) and inversion (25). Inverse problems connect observational data, computational models, and physics to enable inference about physical systems in the geosciences. ML, especially deep learning, can aid in the analysis of inverse problems (26). Deep neural networks, with architectures informed by the inverse problem itself, can learn an inverse map for critical speedups over traditional reconstructions, and the analysis of the generalization of ML models can provide insights into the ill-posedness of an inverse problem.

Data-driven discovery, the ability to extract new information from data, is one of the most exciting capabilities of ML for scientific applications. ML provides scientists with a set of tools for discovering new patterns, structure, and relationships in scientific datasets that are not easily revealed through conventional techniques. ML can reveal previously unidentified signals or physical processes (27–31) and extract key features for representing, interpreting, or visualizing data (32–34). ML can help to minimize bias, for example by discovering patterns that are counterintuitive or unexpected (29). It can also be used to guide the design of experiments or future data collection (35).

These themes are all interrelated; modeling and inversion can also provide the capability for automated predictions, and the use of ML for automation, modeling, or inversion may yield new insights and fundamental discoveries.

Methods and trends for supervised learning

Supervised learning methods use a collection of examples (training data) to learn relationships and build models that are predictive for previously unseen data. Supervised learning is a powerful set of tools that has successfully been used in applications spanning the themes of automation, modeling and inversion, and discovery (Fig. 4). In this section we organize recent supervised learning applications in the sEg by ML algorithm, which we order roughly by model complexity, starting with the relatively simple logistic regression classifier and ending with deep neural networks. In general, more complex models require more training data and less feature engineering.

Logistic regression

Logistic regression (36) is a simple binary classifier that estimates the probability that a new data


Fig. 2. The function space used by domain experts and that used by ML. The function space of user-defined functions employed by scientists, in contrast to the function space used by ML, is contained within the entire possible function space. The function space that ML employs is expanding rapidly as the computational costs and runtimes decrease and memory, depths of networks, and available data increase.

[Fig. 2 diagram labels: the full function space; functions explored by current machine learning methodologies; domain-specific classes of functions]

Fig. 1. How scientists analyze data: the conventional versus the ML lens for scientific analysis. ML is akin to looking at the data through a new lens. Conventional approaches applied by domain experts (e.g., Fourier analysis) are preselected and test a hypothesis or simply display data differently. ML explores a larger function space that can connect data to some target or label. In doing so, it provides the means to discover relations between variables in high-dimensional space. Whereas some ML approaches are transparent in how they find the function and mapping, others are opaque. Matching an appropriate ML approach to the problem is therefore extremely important.

RESEARCH | REVIEW



point belongs to one of two classes. Reynen and Audet (37) apply a logistic regression classifier to distinguish automatically between earthquake signals and explosions, using polarization and frequency features extracted from seismic waveform data. They extend their approach to detect earthquakes in continuous data by classifying each time segment as earthquake or noise, and use class probabilities at each seismic station to combine the detection results from multiple stations in the seismic network. Pawley et al. (38) use logistic regression to separate aseismic from seismogenic injection wells for induced seismicity, using features in the model to identify geologic factors, including proximity of the well to basement, associated with a higher risk of induced seismicity.
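As a concrete illustration of the kind of classifier described in this section, the sketch below fits a two-feature logistic regression by stochastic gradient descent. The "frequency" and "polarization" feature values are invented, and this is not the implementation used by Reynen and Audet (37):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=5000):
    """Fit weights w and bias b by stochastic gradient descent on the log-loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - t  # gradient of the log-loss with respect to the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def prob(w, b, x):
    """Estimated probability that segment x belongs to class 1 ("earthquake")."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Invented two-feature training set: (dominant frequency, polarization score),
# scaled to [0, 1]; label 0 = noise, 1 = earthquake. Not real seismic data.
X = [(0.2, 0.1), (0.3, 0.2), (0.8, 0.9), (0.7, 0.8)]
y = [0, 0, 1, 1]
w, b = train_logistic(X, y)
```

`prob` returns the class probability that such a detector would threshold, or combine across stations, to declare a detection.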

Graphical models

Many datasets in the geosciences have a temporal component, such as the ground motion time-series data recorded by seismometers. Although most ML algorithms can be adapted for use on temporal data, some methods, like graphical models, can directly model temporal dependencies. For example, hidden Markov models (HMMs) are a technique for modeling sequential data and have been widely used in speech recognition (39). HMMs have been applied to continuous seismic data for the detection and classification of alpine rockslides (40), volcanic signals (41, 42), regional earthquakes (43), and induced earthquakes (44). A detailed explanation of HMMs and their application to seismic waveform data can be found in Hammer et al. (42). Dynamic Bayesian networks (DBNs), another type of graphical model that generalizes HMMs, have also been used for earthquake detection (45, 46). In exploration geophysics, hierarchical graphical models have been applied to determine the connectivity of subsurface reservoirs from time-series measurements, using priors derived from convection-diffusion equations (47). The authors report that use of a physics-based prior is key to obtaining a reliable model. Graph-based ML emulators were used by Srinivasan et al. (48) to mimic high-performance physics-based computations of flow through fracture networks, making robust uncertainty quantification of fractured systems possible.
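At the core of an HMM-based detector is decoding: recovering the most likely sequence of hidden states (e.g., noise versus event) from a sequence of observations. Below is a minimal Viterbi decoder for a hypothetical two-state model; all probabilities are invented for illustration and are not taken from the cited studies:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state path for a discrete-observation HMM."""
    # Forward pass: best probability of reaching each state at each step,
    # stored with a back-pointer to the best previous state.
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        prev_row, row = V[-1], {}
        for s in states:
            p, best_prev = max(
                (prev_row[r][0] * trans_p[r][s] * emit_p[s][o], r)
                for r in states)
            row[s] = (p, best_prev)
        V.append(row)
    # Backtrack from the most probable final state.
    state = max(V[-1], key=lambda s: V[-1][s][0])
    path = [state]
    for row in reversed(V[1:]):
        state = row[state][1]
        path.append(state)
    return list(reversed(path))

# Invented two-state model; observations are discretized amplitude symbols.
states = ("noise", "event")
start_p = {"noise": 0.9, "event": 0.1}
trans_p = {"noise": {"noise": 0.9, "event": 0.1},
           "event": {"noise": 0.2, "event": 0.8}}
emit_p = {"noise": {"low": 0.8, "high": 0.2},
          "event": {"low": 0.1, "high": 0.9}}
obs = ["low", "low", "high", "high", "high", "low"]
decoded = viterbi(obs, states, start_p, trans_p, emit_p)
```

Runs of consecutive "event" states in the decoded path correspond to detections, which is how HMM detectors segment continuous records.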

Support vector machine

Support vector machine (SVM) is a binary classification algorithm that identifies the optimal boundary between the training data from two classes (49). SVMs use kernel functions, similarity functions that generalize the inner product, to enable an implicit mapping of the data into a higher-dimensional feature space. SVMs with linear kernels separate classes with a hyperplane, whereas nonlinear kernel functions allow for nonlinear decision boundaries between classes [see Cracknell and Reading (50) and Shahnas et al. (51) for explanations of SVMs and kernel methods, respectively].

Shahnas et al. (51) use an SVM to study mantle convection processes by solving the inverse problem of estimating mantle density anomalies from the temperature field. Temperature fields computed by numerical simulations of mantle convection are used as training data. The authors also train an SVM to predict the degree of mantle flow stagnation. Both support vector machines (18) and support vector regression (19) have been used for rapid magnitude estimation of seismic events for earthquake early warning. Support vector machines have also been used for discrimination of earthquakes and explosions (52) and for earthquake detection in continuous seismic data (53).
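The kernel idea can be made concrete with the widely used radial basis function (RBF) kernel. The sketch below shows the kernel and the form of the resulting decision function; the support vectors and coefficients are invented for illustration, since a real SVM learns them from training data:

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """RBF similarity k(x, z) = exp(-gamma * ||x - z||^2): equal to 1 when
    the points coincide and decaying toward 0 with distance."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

def decision(x, support_vectors, alphas, labels, b=0.0):
    """Kernel SVM decision function: a weighted sum of similarities to the
    support vectors, f(x) = sum_i alpha_i * y_i * k(x_i, x) + b."""
    return sum(a * y * rbf_kernel(sv, x)
               for sv, a, y in zip(support_vectors, alphas, labels)) + b

# Invented support vectors and coefficients; an SVM solver would learn these
# by optimizing the margin over the training data.
support_vectors = [(0.0, 0.0), (1.0, 1.0)]
alphas = [1.0, 1.0]
labels = [-1, +1]
score = decision((0.9, 0.9), support_vectors, alphas, labels)  # > 0: class +1
```

Because the kernel stands in for an inner product in a higher-dimensional feature space, the decision boundary is linear in that space but nonlinear in the original coordinates.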

Random forests and ensemble learning

Decision trees are a supervised method for classification and regression that learn a piecewise-constant function, equivalent to a series of if-then rules that can be visualized by a binary tree structure. A random forest (RF) is an ensemble learning algorithm that can learn complex relationships by voting among a collection ("forest") of randomized decision trees (54) [see Cracknell


Fig. 3. Common modes of ML. The top row shows an example of an automated approach to mapping lithology using remote-sensing data by applying a random forest ML approach. This approach works well with sparse ground-truth data and gives robust estimates of the uncertainty of the predicted lithology (35). The second row shows training a deep neural network to learn a computationally efficient representation of viscoelastic solutions in Earth, allowing calculations to be done quickly, reliably, and with high spatial and temporal resolutions (23). The third row shows an example of inversion where the input is a nonnegative least-squares reconstruction and the network is trained to reconstruct a projection into one subspace. The approach provides the means to address inverse problems with sparse data and still obtain good reconstructions (79). Here, (under)sampling is encoded in the training data, which can be compensated by the generation of a low-dimensional latent (concealed) space from which the reconstructions are obtained. The fourth row shows results from applying a random forest approach to continuous acoustic data to extract fault friction and bound fault failure time. The approach identified signals that were deemed noise beforehand (28). [Reprinted with permission from John Wiley and Sons (23), (28), and the Society of Exploration Geophysicists (35)]




and Reading (50) for a detailed description of RFs]. Random forests are relatively easy to use and interpret. These are important advantages over methods that are opaque or require tuning many hyperparameters (e.g., neural networks, described below), and have contributed to the broad application of RFs within the sEg. Kuhn et al. (35) produced lithological maps of Western Australia from geophysical and remote-sensing data, using a model trained on only a small subset of the ground area. Cracknell and Reading (55) compared multiple supervised ML algorithms and found that random forests provided the best performance for geological mapping. Random forest predictions have also improved three-dimensional (3D) geological models by using remotely sensed geophysical data to constrain geophysical inversions (56).

Trugman and Shearer (21) discern a predictive relationship between stress drop and peak ground acceleration, using RFs to learn nonlinear, nonparametric ground motion prediction equations (GMPEs) from a dataset of moderate-magnitude events in northern California. This departed from the typical use of linear regression to model the relationship between expected peak ground velocity or acceleration and the earthquake site and source parameters that define GMPEs.

Valera et al. (24) used ML to characterize the topology of fracture patterns in the subsurface for modeling flow and transport. A graph representation of discrete fracture networks allowed RFs and SVMs to identify subnetworks that characterize the flow and transport of the full network. The reduced network representations greatly decreased the computational effort required to estimate system behavior.

Rouet-Leduc et al. (28, 29) trained an RF on continuous acoustic emission in a laboratory shear experiment to determine instantaneous friction and to predict time-to-failure. Using continuous acoustic data from the same laboratory apparatus, Hulbert et al. (30) applied a decision tree approach to determine the instantaneous fault friction and displacement on the laboratory fault. Rouet-Leduc et al. (31) scaled the approach to Cascadia, applying continuous seismic data to predict the instantaneous displacement rate on the subducting plate interface, with GPS data as the label. In both the laboratory and the Cascadia field study, ML revealed previously unknown signals. Of interest is that the same features apply at both laboratory and field scale to infer fault physics, suggesting a universality across systems and scales.
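The ensemble idea behind RFs can be illustrated with a minimal sketch: many weak learners, each fit to a bootstrap resample of the data, vote on the predicted class. The two-band "pixels," the decision-stump learner, and all parameter choices below are illustrative toys, not the actual setup of any cited study; a real application would use a full RF implementation (e.g., Scikit-learn's).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a remote-sensing dataset: two synthetic "bands" per pixel,
# with class 0 and class 1 pixels drawn from offset distributions.
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(2.0, 1.0, (200, 2))])
y = np.array([0] * 200 + [1] * 200)

def fit_stump(X, y):
    """Exhaustively fit a one-split decision stump (feature, threshold, sign)."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = (sign * (X[:, j] - t) > 0).astype(int)
                acc = np.mean(pred == y)
                if best is None or acc > best[0]:
                    best = (acc, j, t, sign)
    return best[1:]

def predict_stump(stump, X):
    j, t, sign = stump
    return (sign * (X[:, j] - t) > 0).astype(int)

# Bagging: each stump sees a bootstrap sample; prediction is a majority vote.
stumps = []
for _ in range(25):
    idx = rng.integers(0, len(X), len(X))
    stumps.append(fit_stump(X[idx], y[idx]))

votes = np.mean([predict_stump(s, X) for s in stumps], axis=0)
y_hat = (votes > 0.5).astype(int)
print("training accuracy:", np.mean(y_hat == y))
```

A true random forest additionally grows deep trees and randomizes the features considered at each split, but the bootstrap-and-vote structure above is the core of the method.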

Neural networks

Artificial neural networks (ANNs) are algorithms loosely modeled on the interconnected networks of biological neurons in the brain (57). ANN models are represented as a set of nodes (neurons) connected by a set of weights. Each node takes a weighted linear combination of values from the previous layer and applies a nonlinear function to produce a single value that is passed to the next layer. "Shallow" networks contain an input layer (data), a single hidden layer, and an output layer (predicted response). Valentine and Woodhouse (58) present a detailed explanation of ANNs and the process of learning weights from training data. ANNs can be used for both regression and classification, depending on the choice of output layer.

ANNs have a long history of use in the geosciences [see (59, 60) for reviews of early work], and they remain popular for modeling nonlinear relationships in a range of geoscience applications. De Wit et al. (61) estimate both the 1D P-wave velocity structure and model uncertainties from P-wave travel-time data by solving the Bayesian inverse problem with an ANN. This neural network–based approach is an alternative to the standard Monte Carlo sampling approach for Bayesian inference. Käufl et al. (25) built an ANN model that estimates source parameters from strong motion data. The ANN model performs rapid inversion for source parameters in real time by precomputing computationally intensive simulations that are then used to train the neural network model.

ANNs have been used to estimate short-period response spectra (62), to model ground motion prediction equations (22), to assess data quality

Bergen et al., Science 363, eaau0323 (2019) 22 March 2019 4 of 10

[Fig. 4 diagram, showing ML methods and tasks grouped by learning paradigm: supervised learning (logistic regression, graphical models, support vector machines, random forests & ensembles, artificial, convolutional, and recurrent neural networks; tasks such as detection & classification, determining optimal boundaries, prediction, and dynamic decisions), unsupervised learning (clustering & self-organizing maps, dictionary learning, feature learning, autoencoder networks, deep generative models; tasks such as feature representation, dimensionality reduction, sparse representation, domain adaptation, and learning joint probability distributions), semi-supervised learning, reinforcement learning, and deep neural networks spanning tasks including featurization, inverse problems, and fast simulations & surrogate models.]

Fig. 4. ML methods and their applications. Most ML applications in the sEg fall within two classes: unsupervised learning and supervised learning. In supervised learning tasks, such as prediction (21, 28) and classification (16, 20), the goal is to learn a general model based on known (labeled) examples of the target pattern. In unsupervised learning tasks, the goal is instead to learn structure in the data, such as sparse or low-dimensional feature representations (27). Other classes of ML tasks include semi-supervised learning, in which both labeled and unlabeled data are available to the learning algorithm, and reinforcement learning. Deep neural networks represent a class of ML algorithms that include both supervised and unsupervised tasks. Deep learning algorithms have been used to learn feature representations (32, 89), surrogate models for performing fast simulations (23, 75), and joint probability distributions (98, 100).


for focal mechanism and hypocenter location (58), and to perform noise tomography (63). Logistic regression and ANN models allowed Mousavi et al. (64) to characterize the source depth of microseismic events induced by underground collapse and sinkhole formation. Kong et al. (17) use an ANN with a small number of easy-to-compute features for use on a smartphone-based seismic network to distinguish between earthquake motion and motion due to user activity.
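The mechanics of a shallow network, in which each hidden node applies a nonlinearity to a weighted linear combination of its inputs and the weights are learned by gradient descent, can be sketched in a few lines. The one-hidden-layer architecture, learning rate, and sine-curve regression target below are illustrative assumptions, not the setup of any cited study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy regression task (illustrative): learn y = sin(x) on [-pi, pi].
x = np.linspace(-np.pi, np.pi, 64).reshape(-1, 1)
y = np.sin(x)

# A "shallow" network: input -> one hidden layer (tanh) -> linear output.
W1, b1 = rng.normal(0, 0.5, (1, 16)), np.zeros(16)
W2, b2 = rng.normal(0, 0.5, (16, 1)), np.zeros(1)

def forward(x):
    # Each hidden node: weighted linear combination of inputs + nonlinearity.
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

losses = []
lr = 0.05
for _ in range(2000):
    h, y_hat = forward(x)
    err = y_hat - y                      # dL/dy_hat for mean squared error L
    losses.append(float(np.mean(err ** 2)))
    # Backpropagation: chain rule through the output and hidden layers.
    gW2 = h.T @ err / len(x); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)     # tanh'(z) = 1 - tanh(z)^2
    gW1 = x.T @ dh / len(x); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The steady decrease of the training loss is the "learning weights from training data" process that (58) describes in detail; practical applications use mature libraries rather than hand-written backpropagation.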

Deep neural networks

Deep neural networks (DNNs), or deep learning, are an extension of the classical ANN that incorporates multiple hidden layers (65). Deep learning does not represent a single algorithm, but a broad class of methods with diverse network architectures, including both supervised and unsupervised methods. Deep architectures include multiple processing layers and nonlinear transformations, with the outputs from each layer passed as inputs to the next. Supervised DNNs simultaneously learn a feature representation and a mapping from features to the target, enabling good model performance without requiring well-chosen features as inputs. Ross et al. (66) provide an illustrative example of a convolutional neural network (CNN), a popular class of DNNs, with convolutional layers for feature extraction and a fully connected layer for classification and regression. However, training a deep network also requires fitting a large number of parameters, which requires large training datasets and techniques to prevent overfitting the model (i.e., memorizing the training data rather than learning a general trend). The complexity of deep learning architectures can also make the models difficult to interpret.

DNNs trained on simulation-generated data can learn a model that approximates the output of physical simulations. DeVries et al. (23) use a deep, fully connected neural network to learn a compact model that accurately reproduces the time-dependent deformation of Earth as modeled by computationally intensive codes that solve for the response to an earthquake of an elastic layer over an infinite viscoelastic half space. Substantial computational overhead is required to generate simulation data for training the network, but once trained the model acts as a fast operator, accelerating the computation of new viscoelastic solutions by orders of magnitude. Moseley et al. (67) use a CNN, trained on synthetic data from a finite difference model, to perform fast full wavefield simulations.

Shoji et al. (20) use a CNN to classify volcanic ash particles on the basis of their shape, with each of the four classes corresponding to a different physical eruption mechanism. The authors use the class probabilities returned by the network to identify the mixing ratio for ash particles with complex shapes, a task that is difficult for expert analysts.

Several recent studies have applied DNNs with various architectures for automatic earthquake and seismic event detection (16, 68, 69), phase picking (66, 70), and classification of volcano-seismic events (71). Wiszniowski et al. (72) introduced a real-time earthquake detection algorithm using a recurrent neural network (RNN), an architecture designed for sequential data. Magaña-Zook and Ruppert (73) use a long short-term memory (LSTM) network (74), a sophisticated RNN architecture for sequential data, to discriminate natural seismicity from explosions. An advantage of DNNs for earthquake detection is that feature extraction is performed by the network, so minimal preprocessing is required. By contrast, shallow ANNs and other classical learning algorithms require the user to select a set of key discriminative features, and poor feature selection will hurt model performance. Because it may be difficult to define the distinguishing characteristics of earthquake waveforms, the automatic feature extraction of DNNs can improve detection performance, provided large training sets are available.

Araya-Polo et al. (75) use a DNN to learn an inverse for a basic type of tomography. Rather than using ML to automate or improve individual elements of a standard workflow, they aim to learn to estimate a wave speed model directly from the raw seismic data. The DNN model can compute models faster than traditional methods.

Understanding the fundamental properties and interpretability of DNNs is a very active line of research. A scattering transform (76, 77) can provide natural insights into CNNs relevant to geoscience. This transform is a complex CNN that discards the phase and thus exposes spectral correlations otherwise hidden beneath the phase fluctuations, to define moments. The scattering transform by design has desirable invariants. A scattering representation of stationary processes includes their second-order and higher-order moment descriptors. The scattering transform is effective, for example, in capturing key properties in multifractal analysis (78) and stratified continuum percolation, relevant to representations of sedimentary processes and transport in porous media, respectively. Interpretable DNN architectures have also been obtained through construction from the analysis of inverse problems in the geosciences (79); these offer potentially large improvements over the original reconstructions, along with algorithms that incorporate sparse data acquisition and acceleration.
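The convolution-plus-pooling feature extraction at the heart of the CNNs discussed in this section can be sketched compactly. Here the filter bank is random rather than learned, and the synthetic "seismogram," kernel sizes, and pooling width are all illustrative assumptions; in a trained CNN the kernel weights are fit from labeled data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic "waveform": background noise with a short embedded transient,
# an illustrative stand-in for a seismogram containing an event.
n = 512
signal = rng.normal(0, 0.2, n)
signal[200:232] += np.sin(np.linspace(0, 8 * np.pi, 32))

def conv_layer(x, kernels):
    """Valid-mode 1D convolution of one channel with a bank of kernels."""
    return np.stack([np.convolve(x, k, mode="valid") for k in kernels])

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, w):
    """Non-overlapping max pooling along the last axis."""
    m = x.shape[-1] // w
    return x[..., : m * w].reshape(*x.shape[:-1], m, w).max(axis=-1)

# A bank of random kernels stands in for learned filters.
kernels = rng.normal(0, 1, (8, 16))
features = max_pool(relu(conv_layer(signal, kernels)), 8)
print(features.shape)  # 8 feature maps over a downsampled time axis
```

Stacking several such layers, with the kernel weights learned end-to-end, yields the hierarchical feature maps that a fully connected output layer then maps to a class or a regression target.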

Methods and trends for unsupervised learning

Clustering and self-organizing maps

There are many different clustering algorithms, including k-means, hierarchical clustering, and self-organizing maps (SOMs). A SOM is a type of unsupervised neural network that can be used for either dimensionality reduction or clustering (80) [see Roden et al. (81) for a thorough explanation of SOMs]. Carneiro et al. (33) applied a SOM to airborne geophysical data to identify key geophysical signatures and determine their relationship to rock types for geological mapping in the Brazilian Amazon. Roden et al. (81) identified geological features from seismic attributes using a combination of PCA for dimensionality reduction followed by SOM for clustering. SOMs are often used to identify seismic facies, but standard SOMs do not account for spatial relationships among the data points. Zhao et al. (82) propose imposing a stratigraphy constraint on the SOM algorithm to obtain more detailed facies maps. SOMs have also been applied to seismic waveform data for feature selection (83) and to cluster signals to identify multiple event types (84, 85).

Supervised and unsupervised techniques are commonly used together in ML workflows. Cracknell et al. (86) train an RF classifier to identify lithology from geophysical and geochemical survey data. They then apply a SOM to the volcanic units from the RF-generated geologic map to identify subunits that reveal compositional differences. In the geosciences it is common to have large datasets in which only a small subset of the data are labeled. Such cases call for semi-supervised learning methods designed to learn from both labeled and unlabeled data. In a semi-supervised approach, Köhler et al. (87) detect rockfalls and volcano-tectonic events in continuous waveform data using a SOM for clustering and assigning each cluster a label based on a small number of known examples. Sick et al. (88) also use a SOM with nearest-neighbor classification to classify seismic events by type (quarry blast versus seismic) and depth.
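Of the clustering algorithms named above, k-means is simple enough to sketch in full: alternate between assigning points to their nearest centroid and moving each centroid to the mean of its assigned points. The three synthetic clusters and the naive initialization below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Three synthetic clusters in a 2D feature space, an illustrative stand-in
# for, e.g., event families in a learned feature representation.
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(0, 0.5, (100, 2)) for c in centers])

def kmeans(X, k, n_iter=20):
    # Naive evenly spaced initialization (illustrative); production code
    # would use k-means++ or multiple random restarts.
    centroids = X[:: len(X) // k][:k].copy()
    for _ in range(n_iter):
        # Assign each point to its nearest centroid ...
        d = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        # ... then move each centroid to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

labels, centroids = kmeans(X, k=3)
print("cluster sizes:", np.bincount(labels))
```

A SOM extends this picture by arranging the centroids on a low-dimensional grid and updating neighboring grid cells together, which is what gives it its dimensionality-reduction character.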

Feature learning

Unsupervised feature learning can be used to learn a low-dimensional or sparse feature representation for a dataset. Valentine and Trampert (32) learn a compact feature representation for earthquake waveforms using an autoencoder network, a type of unsupervised DNN designed to learn efficient encodings for data. Qian et al. (89) apply a deep convolutional autoencoder network to prestack seismic data to learn a feature representation that can be used in a clustering algorithm for facies mapping.

Holtzman et al. (27) use nonnegative matrix factorization and HMMs together to learn features to represent earthquake waveforms. K-means clustering is applied to these features to identify temporal patterns among 46,000 low-magnitude earthquakes in the Geysers geothermal field. The authors observe a correlation between the injection rate and spectral properties of the earthquakes.
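Nonnegative matrix factorization itself is compact enough to sketch: approximate a nonnegative data matrix V as W @ H with both factors nonnegative, here via the standard Lee-Seung multiplicative updates. The synthetic rank-3 "spectrogram-like" matrix below is an illustrative stand-in, not the processing chain of (27).

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic nonnegative matrix built from 3 hidden components, an
# illustrative stand-in for a collection of event spectra.
W_true = rng.random((40, 3))
H_true = rng.random((3, 60))
V = W_true @ H_true

# NMF via multiplicative updates: V ~= W @ H with W, H >= 0 throughout.
k = 3
W = rng.random((40, k)) + 0.1
H = rng.random((k, 60)) + 0.1
for _ in range(500):
    H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
    W *= (V @ H.T) / (W @ H @ H.T + 1e-9)

err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {err:.4f}")
```

Because the updates only ever multiply by nonnegative ratios, the factors stay nonnegative, which is what makes the learned components interpretable as additive parts (e.g., spectral building blocks of waveforms).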

Dictionary learning

Sparse dictionary learning is a representation learning method that constructs a sparse representation in the form of a linear combination of basic elements, or atoms, as well as those basic elements themselves. The dictionary of atoms is learned from a set of input data while finding the sparse representations. Dictionary learning methods, which learn an overcomplete basis for sparse representation of data, have been used to de-noise seismic data (90, 91). Bianco and Gerstoft (92) develop a linearized (surface-wave) travel-time tomography approach that sparsely models local behaviors of overlapping groups of pixels from a discrete slowness map following a maximum a posteriori (MAP) formulation. They employ


iterative thresholding and signed K-means dictionary learning to enhance sparsity of the representation of the slowness estimated from travel-time perturbations.
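The sparse-coding half of dictionary learning can be sketched with the classic iterative soft-thresholding algorithm (ISTA), which alternates a gradient step on the data-fit term with a shrinkage step that promotes sparsity. The random dictionary, the 3-atom synthetic signal, and the penalty weight below are illustrative assumptions; the cited work learns its dictionary from data rather than fixing a random one.

```python
import numpy as np

rng = np.random.default_rng(5)

# Overcomplete random dictionary (columns = atoms), with unit-norm columns;
# an illustrative stand-in for a dictionary learned from seismic patches.
D = rng.normal(size=(32, 64))
D /= np.linalg.norm(D, axis=0)

# A signal genuinely sparse in D: a combination of 3 atoms, plus noise.
x_true = np.zeros(64)
x_true[[5, 20, 40]] = [1.5, -2.0, 1.0]
y = D @ x_true + rng.normal(0, 0.01, 32)

def ista(D, y, lam=0.05, n_iter=500):
    """Iterative shrinkage-thresholding for min ||y - Dx||^2/2 + lam*||x||_1."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = x - (D.T @ (D @ x - y)) / L    # gradient step on the data term
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
    return x

x_hat = ista(D, y)
print("nonzeros:", np.count_nonzero(np.abs(x_hat) > 1e-3))
```

Full dictionary learning alternates such a sparse-coding step with an update of the atoms themselves, which is the role the signed K-means step plays in (92).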

Deep generative models

Generative models are a class of ML methods that learn joint probability distributions over the dataset. Generative models can be applied to both unsupervised and supervised learning tasks. Recent work has explored applications of deep generative models, in particular generative adversarial networks (GANs) (93). A GAN is a system of two neural networks with opposing objectives: a generator network that uses training data to learn a model to generate realistic synthetic data, and a discriminator network that learns to distinguish the synthetic data from real training data [see (94) for a clear explanation].

Deep generative models, such as the Deep Rendering Model (95), Variational Autoencoders (VAEs) (96), and GANs (93), are hierarchical probabilistic models that explain data at multiple levels of abstraction, and thereby accelerate learning. The power of abstraction in these models allows their higher levels to learn concepts and categories far more rapidly than their lower levels, owing to strong inductive biases and exposure to more data (97). The unsupervised learning capability of deep generative models is particularly attractive for the many inverse problems in geophysics where labels are often not available.

The use of neural networks can substantially reduce the computational cost of generating synthetic seismograms compared with numerical simulation models. Krischer and Fichtner (98) use a GAN to map seismic source and receiver parameters to synthetic multicomponent seismograms. Mosser et al. (99) use a domain transfer approach, similar to artistic style transfer, with a deep convolutional GAN (DCGAN) to learn mappings from seismic amplitudes to geological structure and vice versa. The authors' approach enables both forward modeling and fast inversion. GANs have also been applied to geological modeling by Dupont et al. (100), who infer local geological patterns in fluvial environments from a limited number of rock type observations using a GAN similar to those used for image inpainting. Chan and Elsheikh (101) generate realistic, complex geological structures and subsurface flow patterns with a GAN. Veillard et al. (102) use both a GAN and a VAE (96) to interpret geological structures in 3D seismic data.
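The adversarial game can be reduced to one dimension to make the opposing objectives concrete. In this toy, real data are drawn from N(3, 1), the "generator" simply shifts standard normal noise by a learned offset theta, and the "discriminator" is a logistic classifier; all of these choices are illustrative assumptions chosen for readability, whereas real GANs use deep networks for both players.

```python
import numpy as np

rng = np.random.default_rng(6)

# Minimal 1D GAN (illustrative toy): real samples ~ N(3, 1); the generator
# shifts noise by theta; the discriminator is D(x) = sigmoid(a*x + b).
a, b, theta = 0.0, 0.0, 0.0
lr_d, lr_g, batch = 0.05, 0.05, 64

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(3000):
    x_real = rng.normal(3.0, 1.0, batch)
    x_fake = rng.normal(0.0, 1.0, batch) + theta

    # Discriminator ascends log D(real) + log(1 - D(fake)).
    d_real, d_fake = sigmoid(a * x_real + b), sigmoid(a * x_fake + b)
    a += lr_d * (np.mean((1 - d_real) * x_real) - np.mean(d_fake * x_fake))
    b += lr_d * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator ascends log D(fake) (non-saturating loss): shift theta so
    # that generated samples look real to the current discriminator.
    d_fake = sigmoid(a * (rng.normal(0.0, 1.0, batch) + theta) + b)
    theta += lr_g * np.mean(1 - d_fake) * a

print(f"learned shift: {theta:.2f} (real data mean is 3.0)")
```

As the generated distribution approaches the real one, the discriminator loses its ability to separate the two and the generator's gradient vanishes, which is the intended equilibrium of the adversarial game.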

Other techniques

Reinforcement learning is an ML framework in which the algorithm learns to make decisions to maximize a reward by trial and error. Draelos et al. (103) propose a reinforcement learning–based approach for dynamic selection of thresholds for single-station earthquake detectors based on the observations at neighboring stations. This approach is a general method for automated parameter tuning that can be used to improve the sensitivity of single-station detectors using information from the seismic network.

Several recent studies in seismology have used techniques for fast near-neighbor search to determine focal mechanisms of seismic events (104), to estimate ground motion and source parameters (105), or to enable large-scale template matching for earthquake detection (106). Each of these three applications requires a database of known or precomputed earthquake features and uses an efficient search algorithm to reduce the computational runtime. By contrast, Yoon et al. (107) take an unsupervised pattern-mining approach to earthquake detection; the authors use a fast similarity search algorithm to search the continuous waveform data for similar or repeating signals, allowing the method to discover new events with previously unknown sources. This approach has been extended to multiple stations (108) and can process up to 10 years of continuous data (109).

Network analysis techniques—methods for analyzing data that can be represented using a graph structure of nodes connected by edges—have also been used for data-driven discovery in the sEg. Riahi and Gerstoft (110) detect weak sources in a dense array of seismic sensors using a graph clustering technique. The authors identify sources by computing components of a graph where each sensor is a node and the edges are determined by the array coherence matrix. Aguiar and Beroza (111) use PageRank, a popular algorithm for link analysis, to analyze the relationships between waveforms and discover potential low-frequency earthquake (LFE) signals.
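The waveform-similarity idea behind template matching can be illustrated with its simplest form: a normalized cross-correlation scan of one template over continuous data. The synthetic waveform, noise level, and detection threshold below are illustrative assumptions, and the brute-force window loop is exactly the cost that large-scale similarity-search methods are designed to avoid.

```python
import numpy as np

rng = np.random.default_rng(7)

# Continuous "data" with two noisy repeats of the same waveform template,
# an illustrative stand-in for similarity-based earthquake detection.
template = np.sin(np.linspace(0, 6 * np.pi, 50)) * np.hanning(50)
data = rng.normal(0, 0.2, 2000)
for t0 in (400, 1500):
    data[t0 : t0 + 50] += template

def ncc(data, template):
    """Normalized cross-correlation of a template against every window."""
    n = len(template)
    t = (template - template.mean()) / template.std()
    out = np.empty(len(data) - n + 1)
    for i in range(len(out)):
        w = data[i : i + n]
        out[i] = np.dot(t, (w - w.mean()) / w.std()) / n
    return out

cc = ncc(data, template)
detections = np.where(cc > 0.6)[0]   # threshold chosen for this toy example
print("high-similarity windows near:", detections)
```

Because the correlation is normalized, the detector responds to waveform shape rather than amplitude, which is what lets similarity search pull weak repeating events out of noise.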

Recommendations and opportunities

ML techniques have been applied to a wide range of problems in the sEg; however, their impact has been limited (Fig. 5). Data challenges can hinder progress and adoption of the new ML tools; however, adoption of these methods has


Fig. 5. Recommendations for advancing ML in geoscience. To make rapid progress in ML applications in geoscience, education beginning at the undergraduate university level is key. ML offers an important new set of tools that will advance science and engineering rapidly if the next generation is well trained in their use. Open-source software such as Scikit-learn and TensorFlow are important components as well, as are open-access publications that all can access for free, such as the LANL/Cornell arXiv and PLOS (Public Library of Science). Also important are selecting the proper ML approach and developing new architectures as needs arise. Working with ML experts is the best approach at present, until there are many experts in the geoscience domain. Competitions, conferences, and special sessions could also help drive the field forward.

[Fig. 5 diagram: contributors to advances in our understanding of the solid Earth, including open science (open data, open source software, open access), geo-data science education, benchmark datasets, data science competitions, challenge problems, conferences & workshops, joint work with ML experts, and new ML architectures & models such as interpretable ML, physics-informed ML, and domain adaptation.]


lagged some other scientific domains with similar data quality issues.

Our recommendations are informed by the characteristics of geoscience datasets that present challenges for standard ML algorithms. Datasets in the sEg represent complex, nonlinear physical systems that act across a vast range of length and time scales. Many phenomena, such as fluid injection and earthquakes, are strongly nonstationary. The resulting data are complex, with multiresolution, spatial, and temporal structures requiring innovative approaches. Further, much existing data are unlabeled. When available, labels are often highly subjective or biased toward frequent or well-characterized phenomena, limiting the effectiveness of algorithms that rely on training datasets. The quality and completeness of datasets create another challenge, as uneven data collection, incomplete datasets, and noisy data are common.

Benchmark datasets

The lack of clear ground truth and standard benchmarks in sEg problems impedes the evaluation of performance in geoscience applications. Ground truth, or reference data for evaluating performance, may be unavailable, incomplete, or biased. Without suitable ground-truth data, validating algorithms, evaluating performance, and adopting best practices are difficult. Automatic earthquake detection provides an illustrative example. New signal processing or ML-based earthquake detection algorithms have been regularly developed and applied over the past several decades. Each method is typically applied to a different dataset, and authors set their own criteria for evaluating performance in the absence of ground truth. This makes it difficult to determine the relative detection performance, advantages, and weaknesses of each method, which prevents the community from adopting and iterating on the best new detection algorithms.

Benchmark datasets and challenge problems have played an important role in driving progress and innovation in ML research. High-quality benchmark datasets have two key benefits: (i) enabling rigorous performance comparisons and (ii) producing better models. Well-known challenge problems include mastering game play (112–115), competitions for movie recommendation [Netflix prize (15)], and image recognition [ImageNet (14)]. The performance gain demonstrated by a CNN (116) in the 2012 ImageNet competition triggered a wave of research in deep learning. In computer vision, it is common practice to report performance of new algorithms on standard datasets, such as the MNIST handwritten digit dataset (117).

Greater use of benchmark datasets can accelerate progress in applying ML to problems in the sEg. This will require an investment from the research community, both in creating and maintaining datasets and in reporting algorithm performance on benchmark datasets in published work. The contribution of compiling and sharing benchmark datasets is unlikely to go unrecognized: the ImageNet image recognition dataset (118) has been cited in over 6000 papers.

Recently, the Institute of Geophysics at the Chinese Earthquake Administration (CEA) and Alibaba Cloud hosted a data science competition, with more than 1000 teams, centered around automatic detection and phase picking of aftershocks following the 2008 Ms 8.0 Wenchuan earthquake (119, 120). The ground-truth phase-arrival data, against which entries were assessed, were determined by CEA analysts. Such challenges are useful for researchers seeking to test and improve their detection algorithms. Future competitions should have greater impact if they are accompanied by some form of broader follow-up, such as publications associated with top-performing entries or a summary of effective methods and lessons learned from competition organizers.

Creating benchmark datasets and determining evaluation metrics are challenging when the underlying data are incompletely understood. The ground truth used as a benchmark may suffer from the same biases as training data. Although benchmark datasets can provide a useful guide, the research community must not disregard or penalize methods that discover new phenomena not represented in the ground truth. Performance evaluation needs to include the overall error rate, along with the relative strengths and weaknesses, including the kinds of errors made by the algorithm (121). Ideally, within a given problem domain, several diverse benchmark datasets would be available to the research community to avoid an overly narrow focus in algorithm development. For example, the performance of earthquake detection algorithms can vary on the basis of the type of events to be detected (e.g., regional events, volcanic tremor, tectonic tremor) and the characteristics of the noise signals. An additional approach would be to create datasets from simulations where the simulated data are released but the underlying model is kept hidden, e.g., a model of complex fault geometry based on seismic reflection data. Researchers could then compete to determine which approaches best recover the input model.
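One ingredient of such benchmarking is a shared scoring rule. A minimal sketch of precision/recall scoring for event detections against a reference catalog is given below; the greedy matching rule and the time tolerance are illustrative choices, not a community standard.

```python
def evaluate_detections(detected, catalog, tol=2.0):
    """Precision/recall of detected event times against a reference catalog.

    Each catalog event may be matched to at most one detection within
    +/- tol seconds (greedy matching; illustrative, not a standard).
    """
    detected = sorted(detected)
    matched, hits = set(), 0
    for t in catalog:
        for i, d in enumerate(detected):
            if i not in matched and abs(d - t) <= tol:
                matched.add(i)
                hits += 1
                break
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(catalog) if catalog else 0.0
    return precision, recall

# Two of three detections match the catalog; both catalog events are found.
p, r = evaluate_detections([10.1, 55.0, 300.2], [10.0, 300.0], tol=0.5)
print(f"precision={p:.2f}, recall={r:.2f}")
```

Reporting both numbers, rather than a single accuracy, exposes the trade-off between false detections and missed events that any threshold choice implies.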

Open science

Adoption of open science principles (122) will better position the sEg community to take advantage of the rapid pace of development in AI. This should include a commitment to make code (open source), datasets (open data), and research (open access) publicly available. Open science initiatives are especially important for validating and ensuring reproducibility of results from more-difficult-to-interpret, "black-box" ML models such as DNNs.

Open source codes, often shared through online platforms (123, 124), have already been adopted for general data processing in seismology with the ObsPy (125, 126) and Pyrocko (127) toolboxes. Scikit-learn is another example of broadly applied open-source software (128). Along with benchmark datasets, greater sharing of the code that implements new ML-based solutions will help accelerate the development and validation of these new approaches. Active research areas like earthquake detection benefit as available open source codes enable direct comparisons of multiple algorithms on the same datasets to assess relative performance. To the extent possible, this should also extend to the sharing of the original datasets and pretrained ML models.

The use of electronic preprints [e.g., (129–131)] may also help to accelerate the pace of research (132) at the intersection of Earth science and AI. Preprints allow authors to share preliminary results with a wider community and receive feedback in advance of the formal review process. The practice of making research available on preprint servers is common in many science, technology, engineering, and mathematics (STEM) fields, including computer vision and natural language processing—two fields that are driving development in deep learning (133); however, this practice has yet to be widely adopted within the sEg community.

New data sources

In recent years, new, unconventional data sources have become available but have not been fully exploited, presenting new opportunities for the application and development of new ML-based analysis tools. Data sources such as light detection and ranging (LiDAR) point clouds (134), distributed acoustic sensing with fiber optic cables (135–137), and crowd-sourced data from smartphones (17), social media (138, 139), web traffic (140), and microelectromechanical systems (MEMS) accelerometers (141) are well suited to applications using ML. Interferometric synthetic aperture radar (InSAR) data are widely used for applications such as identifying crops or deforestation, but have seen minimal use in ML applications for geological and geophysical problems. High-resolution satellite and multispectral imagery (142) provide rich datasets for geological and geophysical applications, including the study of evolving systems such as volcanoes, earthquakes, and land-surface change, and the mapping of geology and soils. A disadvantage of nongovernmental satellite data can be cost; however, imagery from sources such as Google Maps and lower-resolution multispectral data from government-sponsored satellites such as SPOT and ASTER are available without cost.

Machine learning solutions, new models, and architectures

Researchers have many opportunities for collaborative research between the geoscience and ML communities, including new models and algorithms to address data challenges that arise in the sEg [see also Karpatne et al. (143)]. Real-time data collection from geophysical sensors offers new test cases for online learning in streaming data. Domain expertise is required to interpret many geoscientific datasets, making these interesting use cases for the development of interactive


ML algorithms, including scientist-in-the-loop systems.

A challenge that comes with entirely data-driven approaches is the need for large quantities of training data, especially for modeling through deep learning. Moreover, ML models may end up replicating the biases in training data, which can arise during data collection or even through the use of specific training datasets. Thus, extracting maximum value from geoscientific datasets will require methods capable of learning with limited, weak, or biased labels. Furthermore, because the phenomena of interest are governed by complex and dynamic physical processes, there is a need for new approaches to analyzing scientific datasets that combine data-driven and physical modeling (144).

Much of the representational power of modern ML techniques, such as DNNs, comes from the ability to recognize paths to data inversion outside of the established physical and mathematical frameworks, through nonlinearities that give rise to highly realistic yet nonconvex regularizers. Recently, interpretable DNN architectures were constructed based on the analysis of inverse problems in the geosciences (79) that have the potential to mitigate ill-posedness, accelerate reconstruction (after training), and accommodate sparse (constrained) data acquisition. In the framework of linear inverse problems, various imaging operators induce particular network architectures (26). Furthermore, deep generative models will play an important role in bridging multilevel regularized iterative techniques in inverse problems with deep learning. In the same context, priors may be learned from the data.

Another approach to mitigating deep learning's reliance on large datasets is to use simulations to generate supplemental synthetic training data. In such cases, domain adaptation can be used to correct for differences in the data distribution between real and synthetic data. Domain adaptation architectures, including mixed-reality generative adversarial networks (145), iteratively map simulated data to the space of real data and vice versa. Studying trained deep generative models can reveal insights into the underlying data-generating process, and inverting these models involves inference algorithms that can extract useful representations from the data.

A note of caution in applying ML to geoscience problems: these methods should not be applied naïvely. Biased data, mixing training and testing data, overfitting, and improper validation will lead to unreliable results. As elsewhere, working with data scientists will help mitigate these potential issues.
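The danger of mixing training and testing data can be made concrete with a toy experiment: for temporally correlated data, a random train/test split lets a nearest-neighbor model succeed by memorizing adjacent samples, while a chronological split exposes the lack of generalization. Everything below (block-structured labels, a 1-nearest-neighbor model on a time feature) is an illustrative construction, not drawn from any cited study.

```python
import numpy as np

rng = np.random.default_rng(8)

# Autocorrelated toy data: the label is constant over long time blocks, and
# the only feature is the time index, so a 1-nearest-neighbor "model" can
# score well on a random split purely by memorizing adjacent samples.
n_blocks, block = 40, 50
t = np.arange(n_blocks * block, dtype=float)
labels = np.repeat(rng.integers(0, 2, n_blocks), block)

def knn1_accuracy(train_idx, test_idx):
    preds = np.array([labels[train_idx[np.argmin(np.abs(t[train_idx] - t[i]))]]
                      for i in test_idx])
    return float(np.mean(preds == labels[test_idx]))

idx = rng.permutation(len(t))
acc_random = knn1_accuracy(idx[400:], idx[:400])            # random 80/20 split
acc_chrono = knn1_accuracy(np.arange(1600), np.arange(1600, 2000))  # time split

print(f"random split: {acc_random:.2f}, chronological split: {acc_chrono:.2f}")
```

The inflated random-split score reflects temporal leakage rather than predictive skill; for nonstationary geophysical data, evaluation splits should respect the structure (time, station, region) along which the model is expected to generalize.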

Geoscience curriculum

Data science is moving quickly, and geoscientists would benefit from close collaboration with ML researchers (146) to take full advantage of developments at the cutting edge. Collaboration will require focused effort on the part of geoscientists. A greater component of data science and ML in geoscience curricula could help, as would recruiting students trained in data science to work on geoscience research. Interdisciplinary research conferences can also promote, and are already promoting, collaborations by identifying common interests and complementary capabilities.

REFERENCES AND NOTES

1. P. Baldi, P. Sadowski, D. Whiteson, Searching for exotic particles in high-energy physics with deep learning. Nat. Commun. 5, 4308 (2014). doi: 10.1038/ncomms5308; pmid: 24986233

2. A. A. Melnikov, H. P. Nautrup, M. Krenn, V. Dunjko, M. Tiersch, A. Zeilinger, H. J. Briegel, Active learning machine learns to create new quantum experiments. Proc. Natl. Acad. Sci. USA 115, 1221–1226 (2018). doi: 10.1073/pnas.1714936115

3. C. J. Shallue, A. Vanderburg, Identifying Exoplanets with Deep Learning: A Five-planet Resonant Chain around Kepler-80 and an Eighth Planet around Kepler-90. Astron. J. 155, 94 (2018). doi: 10.3847/1538-3881/aa9e09

4. F. Ren, L. Ward, T. Williams, K. J. Laws, C. Wolverton, J. Hattrick-Simpers, A. Mehta, Accelerated discovery of metallic glasses through iteration of machine learning and high-throughput experiments. Sci. Adv. 4, eaaq1566 (2018).

5. M. I. Jordan, T. M. Mitchell, Machine learning: Trends, perspectives, and prospects. Science 349, 255–260 (2015). doi: 10.1126/science.aaa8415; pmid: 26185243

6. R. Kohavi, F. Provost, Glossary of terms. Machine Learning—Special Issue on Applications of Machine Learning and the Knowledge Discovery Process. Mach. Learn. 30, 271–274 (1998). doi: 10.1023/A:1017181826899

7. J. McCarthy, E. Feigenbaum, In Memoriam. Arthur Samuel: Pioneer in Machine Learning. AI Mag. 11, 10 (1990). doi: 10.1609/aimag.v11i3.840

8. I. Goodfellow, Y. Bengio, A. Courville, Deep Learning (MIT Press, 2016).

9. G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. van der Laak, B. van Ginneken, C. I. Sanchez, A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60–88 (2017). doi: 10.1016/j.media.2017.07.005; pmid: 28778026

10. Q. V. Le, Building high-level features using large-scale unsupervised learning, in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (IEEE, 2013), pp. 8595–8598.

11. F. U. Dowla, S. R. Taylor, R. W. Anderson, Seismic discrimination with artificial neural networks: Preliminary results with regional spectral data. Bull. Seismol. Soc. Am. 80, 1346–1373 (1990).

12. P. S. Dysart, J. J. Pulli, Regional seismic event classification at the NORESS array: Seismological measurements and the use of trained neural networks. Bull. Seismol. Soc. Am. 80, 1910 (1990).

13. H. Dai, C. MacBeth, Automatic picking of seismic arrivals in local earthquake data using an artificial neural network. Geophys. J. Int. 120, 758–774 (1995). doi: 10.1111/j.1365-246X.1995.tb01851.x

14. O. Russakovsky et al., ImageNet Large Scale Visual Recognition Challenge. Int. J. Comput. Vis. 115, 211–252 (2015). doi: 10.1007/s11263-015-0816-y

15. J. Bennett, S. Lanning, The Netflix Prize, in Proceedings of KDD Cup and Workshop (New York, 2007), vol. 2007, p. 35.

16. T. Perol, M. Gharbi, M. Denolle, Convolutional neural network for earthquake detection and location. Sci. Adv. 4, e1700578 (2018). doi: 10.1126/sciadv.1700578; pmid: 29487899

17. Q. Kong, R. M. Allen, L. Schreier, Y.-W. Kwon, MyShake: A smartphone seismic network for earthquake early warning and beyond. Sci. Adv. 2, e1501055 (2016). doi: 10.1126/sciadv.1501055; pmid: 26933682

18. R. Reddy, R. R. Nair, The efficacy of support vector machines (SVM) in robust determination of earthquake early warning magnitudes in central Japan. J. Earth Syst. Sci. 122, 1423–1434 (2013). doi: 10.1007/s12040-013-0346-3

19. L. H. Ochoa, L. F. Niño, C. A. Vargas, Fast magnitude determination using a single seismological station record implementing machine learning techniques. Geodesy Geodyn. 9, 34–41 (2017). doi: 10.1016/j.geog.2017.03.010

20. D. Shoji, R. Noguchi, S. Otsuki, H. Hino, Classification of volcanic ash particles using a convolutional neural network and probability. Sci. Rep. 8, 8111 (2018). doi: 10.1038/s41598-018-26200-2; pmid: 29802305

21. D. T. Trugman, P. M. Shearer, Strong Correlation between Stress Drop and Peak Ground Acceleration for Recent M 1–4 Earthquakes in the San Francisco Bay Area. Bull. Seismol. Soc. Am. 108, 929–945 (2018). doi: 10.1785/0120170245

22. B. Derras, P. Y. Bard, F. Cotton, Towards fully data-driven ground-motion prediction models for Europe. Bull. Earthquake Eng. 12, 495–516 (2014). doi: 10.1007/s10518-013-9481-0

23. P. M. R. DeVries, T. B. Thompson, B. J. Meade, Enabling large-scale viscoelastic calculations via neural network acceleration. Geophys. Res. Lett. 44, 2662–2669 (2017). doi: 10.1002/2017GL072716

24. M. Valera, Z. Guo, P. Kelly, S. Matz, V. A. Cantu, A. G. Percus, J. D. Hyman, G. Srinivasan, H. S. Viswanathan, Machine learning for graph-based representations of three-dimensional discrete fracture networks. Computat. Geosci. 22, 695–710 (2018). doi: 10.1007/s10596-018-9720-1

25. P. Käufl, A. P. Valentine, J. Trampert, Probabilistic point source inversion of strong-motion data in 3-D media using pattern recognition: A case study for the 2008 Mw 5.4 Chino Hills earthquake. Geophys. Res. Lett. 43, 8492–8498 (2016). doi: 10.1002/2016GL069887

26. M. T. McCann, K. H. Jin, M. Unser, Convolutional Neural Networks for Inverse Problems in Imaging: A Review. IEEE Signal Process. Mag. 34, 85–95 (2017). doi: 10.1109/MSP.2017.2739299

27. B. K. Holtzman, A. Paté, J. Paisley, F. Waldhauser, D. Repetto, Machine learning reveals cyclic changes in seismic source spectra in Geysers geothermal field. Sci. Adv. 4, eaao2929 (2018). doi: 10.1126/sciadv.aao2929; pmid: 29806015

28. B. Rouet-Leduc, C. Hulbert, N. Lubbers, K. Barros, C. J. Humphreys, P. A. Johnson, Machine Learning Predicts Laboratory Earthquakes. Geophys. Res. Lett. 44, 9276–9282 (2017). doi: 10.1002/2017GL074677

29. B. Rouet-Leduc, C. Hulbert, D. C. Bolton, C. X. Ren, J. Riviere, C. Marone, R. A. Guyer, P. A. Johnson, Estimating Fault Friction From Seismic Signals in the Laboratory. Geophys. Res. Lett. 45, 1321–1329 (2018). doi: 10.1002/2017GL076708

30. C. Hulbert, B. Rouet-Leduc, P. Johnson, C. X. Ren, J. Riviere, D. C. Bolton, C. Marone, Similarity of fast and slow earthquakes illuminated by machine learning. Nat. Geosci. 12, 69–74 (2019). doi: 10.1038/s41561-018-0272-8

31. B. Rouet-Leduc, C. Hulbert, P. A. Johnson, Continuous chatter of the Cascadia subduction zone revealed by machine learning. Nat. Geosci. 12, 75–79 (2019). doi: 10.1038/s41561-018-0274-6

32. A. P. Valentine, J. Trampert, Data space reduction, quality assessment and searching of seismograms: Autoencoder networks for waveform data. Geophys. J. Int. 189, 1183–1202 (2012). doi: 10.1111/j.1365-246X.2012.05429.x

33. C. C. Carneiro, S. J. Fraser, A. P. Crósta, A. M. Silva, C. E. M. Barros, Semiautomated geologic mapping using self-organizing maps and airborne geophysics in the Brazilian Amazon. Geophysics 77, K17–K24 (2012). doi: 10.1190/geo2011-0302.1

34. X. Wu, D. Hale, 3D seismic image processing for faults. Geophysics 81, IM1–IM11 (2016). doi: 10.1190/geo2015-0380.1

35. S. Kuhn, M. J. Cracknell, A. M. Reading, Lithologic mapping using Random Forests applied to geophysical and remote-sensing data: A demonstration study from the Eastern Goldfields of Australia. Geophysics 83, B183–B193 (2018). doi: 10.1190/geo2017-0590.1

36. D. R. Cox, The Regression Analysis of Binary Sequences. J. R. Stat. Soc. B 20, 215–232 (1958). doi: 10.1111/j.2517-6161.1958.tb00292.x

37. A. Reynen, P. Audet, Supervised machine learning on a network scale: Application to seismic event classification and detection. Geophys. J. Int. 210, 1394–1409 (2017). doi: 10.1093/gji/ggx238

38. S. Pawley, R. Schultz, T. Playter, H. Corlett, T. Shipman, S. Lyster, T. Hauck, The Geological Susceptibility of Induced Earthquakes in the Duvernay Play. Geophys. Res. Lett. 45, 1786–1793 (2018). doi: 10.1002/2017GL076100

39. L. R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77, 257–286 (1989). doi: 10.1109/5.18626

40. F. Dammeier, J. R. Moore, C. Hammer, F. Haslinger, S. Loew, Automatic detection of alpine rockslides in continuous seismic data using hidden Markov models. J. Geophys. Res. Earth Surf. 121, 351–371 (2016). doi: 10.1002/2015JF003647

Bergen et al., Science 363, eaau0323 (2019) 22 March 2019 8 of 10

RESEARCH | REVIEW

41. M. Beyreuther, R. Carniel, J. Wassermann, Continuous Hidden Markov Models: Application to automatic earthquake detection and classification at Las Cañadas caldera, Tenerife. J. Volcanol. Geotherm. Res. 176, 513–518 (2008). doi: 10.1016/j.jvolgeores.2008.04.021

42. C. Hammer, M. Beyreuther, M. Ohrnberger, A Seismic-Event Spotting System for Volcano Fast-Response Systems. Bull. Seismol. Soc. Am. 102, 948–960 (2012). doi: 10.1785/0120110167

43. M. Beyreuther, J. Wassermann, Continuous earthquake detection and classification using discrete Hidden Markov Models. Geophys. J. Int. 175, 1055–1066 (2008). doi: 10.1111/j.1365-246X.2008.03921.x

44. M. Beyreuther, C. Hammer, J. Wassermann, M. Ohrnberger, T. Megies, Constructing a Hidden Markov Model based earthquake detector: Application to induced seismicity. Geophys. J. Int. 189, 602–610 (2012). doi: 10.1111/j.1365-246X.2012.05361.x

45. C. Riggelsen, M. Ohrnberger, F. Scherbaum, Dynamic Bayesian networks for real-time classification of seismic signals, in Knowledge Discovery in Databases: PKDD 2007, Lecture Notes in Computer Science, vol. 4702, J. N. Kok et al., Eds. (Springer, 2007), pp. 565–572. doi: 10.1007/978-3-540-74976-9_59

46. C. Riggelsen, M. Ohrnberger, A Machine Learning Approach for Improving the Detection Capabilities at 3C Seismic Stations. Pure Appl. Geophys. 171, 395–411 (2014). doi: 10.1007/s00024-012-0592-3

47. F. Janoos, H. Denli, N. Subrahmanya, Multi-scale graphical models for spatio-temporal processes. Adv. Neural Inf. Process. Syst. 27, 316–324 (2014).

48. S. Srinivasan, J. Hyman, S. Karra, D. O'Malley, H. Viswanathan, G. Srinivasan, Robust system size reduction of discrete fracture networks: A multi-fidelity method that preserves transport characteristics. Computat. Geosci. 22, 1515–1526 (2018). doi: 10.1007/s10596-018-9770-4

49. C. Cortes, V. Vapnik, Support-vector networks. Mach. Learn. 20, 273–297 (1995). doi: 10.1023/A:1022627411411

50. M. J. Cracknell, A. M. Reading, The upside of uncertainty: Identification of lithology contact zones from airborne geophysics and satellite data using random forests and support vector machines. Geophysics 78, WB113–WB126 (2013). doi: 10.1190/geo2012-0411.1

51. M. H. Shahnas, D. A. Yuen, R. N. Pysklywec, Inverse Problems in Geodynamics Using Machine Learning Algorithms. J. Geophys. Res. Solid Earth 123, 296–310 (2018). doi: 10.1002/2017JB014846

52. J. Kortström, M. Uski, T. Tiira, Automatic classification of seismic events within a regional seismograph network. Comput. Geosci. 87, 22–30 (2016). doi: 10.1016/j.cageo.2015.11.006

53. A. E. Ruano, G. Madureira, O. Barros, H. R. Khosravani, M. G. Ruano, P. M. Ferreira, Seismic detection using support vector machines. Neurocomputing 135, 273–283 (2014). doi: 10.1016/j.neucom.2013.12.020

54. L. Breiman, Random Forests. Mach. Learn. 45, 5–32 (2001). doi: 10.1023/A:1010933404324

55. M. J. Cracknell, A. M. Reading, Geological mapping using remote sensing data: A comparison of five machine learning algorithms, their response to variations in the spatial distribution of training data and the use of explicit spatial information. Comput. Geosci. 63, 22–33 (2014). doi: 10.1016/j.cageo.2013.10.008

56. A. M. Reading, M. J. Cracknell, D. J. Bombardieri, T. Chalke, Combining Machine Learning and Geophysical Inversion for Applied Geophysics. ASEG Extended Abstracts 2015, 1 (2015). doi: 10.1071/ASEG2015ab070

57. C. M. Bishop, Neural Networks for Pattern Recognition (Oxford Univ. Press, 1995).

58. A. P. Valentine, J. H. Woodhouse, Approaches to automated data selection for global seismic tomography. Geophys. J. Int. 182, 1001–1012 (2010). doi: 10.1111/j.1365-246X.2010.04658.x

59. M. van der Baan, C. Jutten, Neural networks in geophysical applications. Geophysics 65, 1032–1047 (2000). doi: 10.1190/1.1444797

60. M. M. Poulton, Neural networks as an intelligence amplification tool: A review of applications. Geophysics 67, 979–993 (2002). doi: 10.1190/1.1484539

61. R. W. L. de Wit, A. P. Valentine, J. Trampert, Bayesian inference of Earth's radial seismic structure from body-wave traveltimes using neural networks. Geophys. J. Int. 195, 408–422 (2013). doi: 10.1093/gji/ggt220

62. R. Paolucci, F. Gatti, M. Infantino, C. Smerzini, A. Guney Ozcebe, M. Stupazzini, Broadband ground motions from 3D physics-based numerical simulations using artificial neural networks. Bull. Seismol. Soc. Am. 108, 1272–1286 (2018).

63. P. Paitz, A. Gokhberg, A. Fichtner, A neural network for noise correlation classification. Geophys. J. Int. 212, 1468–1474 (2018). doi: 10.1093/gji/ggx495

64. S. M. Mousavi, S. P. Horton, C. A. Langston, B. Samei, Seismic features and automatic discrimination of deep and shallow induced-microearthquakes using neural network and logistic regression. Geophys. J. Int. 207, 29–46 (2016). doi: 10.1093/gji/ggw258

65. Y. LeCun, Y. Bengio, G. Hinton, Deep learning. Nature 521, 436–444 (2015). doi: 10.1038/nature14539; pmid: 26017442

66. Z. E. Ross, M.-A. Meier, E. Hauksson, P Wave Arrival Picking and First-Motion Polarity Determination With Deep Learning. J. Geophys. Res. Solid Earth 123, 5120–5129 (2018). doi: 10.1029/2017JB015251

67. B. Moseley, A. Markham, T. Nissen-Meyer, Fast approximate simulation of seismic waves with deep learning. arXiv:1807.06873 [physics.geo-ph] (2018).

68. Y. Wu, Y. Lin, Z. Zhou, D. C. Bolton, J. Liu, P. Johnson, Cascaded region-based densely connected network for event detection: A seismic application. arXiv:1709.07943 [cs.LG] (2017).

69. Y. Wu, Y. Lin, Z. Zhou, A. Delorey, Seismic-Net: A deep densely connected neural network to detect seismic events. arXiv:1802.02241 [eess.SP] (2018).

70. W. Zhu, G. C. Beroza, PhaseNet: A deep-neural-network-based seismic arrival time picking method. arXiv:1803.03211 [physics.geo-ph] (2018).

71. M. Titos, A. Bueno, L. García, C. Benítez, A deep neural networks approach to automatic recognition systems for volcano-seismic events. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 11, 1533–1544 (2018). doi: 10.1109/JSTARS.2018.2803198

72. J. Wiszniowski, B. Plesiewicz, J. Trojanowski, Application of real time recurrent neural network for detection of small natural earthquakes in Poland. Acta Geophysica 62, 469–485 (2014). doi: 10.2478/s11600-013-0140-2

73. S. A. Magaña-Zook, S. D. Ruppert, Explosion monitoring with machine learning: A LSTM approach to seismic event discrimination. American Geophysical Union Fall Meeting, abstract S43A-0834 (2017).

74. S. Hochreiter, J. Schmidhuber, Long short-term memory. Neural Comput. 9, 1735–1780 (1997). doi: 10.1162/neco.1997.9.8.1735; pmid: 9377276

75. M. Araya-Polo, J. Jennings, A. Adler, T. Dahlke, Deep-learning tomography. Leading Edge (Tulsa Okla.) 37, 58–66 (2018). doi: 10.1190/tle37010058.1

76. S. Mallat, Group Invariant Scattering. Commun. Pure Appl. Math. 65, 1331–1398 (2012). doi: 10.1002/cpa.21413

77. J. Bruna, S. Mallat, Invariant scattering convolution networks. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1872–1886 (2013). doi: 10.1109/TPAMI.2012.230; pmid: 23787341

78. J. Bruna, Scattering representations for recognition, Ph.D. thesis, Ecole Polytechnique X (2013).

79. S. Gupta, K. Kothari, M. V. de Hoop, I. Dokmanić, Random mesh projectors for inverse problems. arXiv:1805.11718 [cs.CV] (2018).

80. T. Kohonen, Essentials of the self-organizing map. Neural Netw. 37, 52–65 (2013). doi: 10.1016/j.neunet.2012.09.018; pmid: 23067803

81. R. Roden, T. Smith, D. Sacrey, Geologic pattern recognition from seismic attributes: Principal component analysis and self-organizing maps. Interpretation (Tulsa) 3, SAE59–SAE83 (2015). doi: 10.1190/INT-2015-0037.1

82. T. Zhao, F. Li, K. J. Marfurt, Constraining self-organizing map facies analysis with stratigraphy: An approach to increase the credibility in automatic seismic facies classification. Interpretation (Tulsa) 5, T163–T171 (2017). doi: 10.1190/INT-2016-0132.1

83. A. Köhler, M. Ohrnberger, C. Riggelsen, F. Scherbaum, Unsupervised feature selection for pattern search in seismic time series, in Journal of Machine Learning Research Workshop and Conference Proceedings: New Challenges for Feature Selection in Data Mining and Knowledge Discovery, vol. 4, pp. 106–121.

84. A. Esposito, F. Giudicepietro, S. Scarpetta, L. D'Auria, M. Marinaro, M. Martini, Automatic Discrimination among Landslide, Explosion-Quake, and Microtremor Seismic Signals at Stromboli Volcano Using Neural Networks. Bull. Seismol. Soc. Am. 96 (4A), 1230–1240 (2006). doi: 10.1785/0120050097

85. A. Esposito, F. Giudicepietro, L. D'Auria, S. Scarpetta, M. Martini, M. Coltelli, M. Marinaro, Unsupervised Neural Analysis of Very-Long-Period Events at Stromboli Volcano Using the Self-Organizing Maps. Bull. Seismol. Soc. Am. 98, 2449–2459 (2008). doi: 10.1785/0120070110

86. M. J. Cracknell, A. M. Reading, A. W. McNeill, Mapping geology and volcanic-hosted massive sulfide alteration in the Hellyer–Mt Charter region, Tasmania, using Random Forests™ and Self-Organising Maps. Aust. J. Earth Sci. 61, 287–304 (2014). doi: 10.1080/08120099.2014.858081

87. A. Köhler, M. Ohrnberger, F. Scherbaum, Unsupervised pattern recognition in continuous seismic wavefield records using Self-Organizing Maps. Geophys. J. Int. 182, 1619–1630 (2010). doi: 10.1111/j.1365-246X.2010.04709.x

88. B. Sick, M. Guggenmos, M. Joswig, Chances and limits of single-station seismic event clustering by unsupervised pattern recognition. Geophys. J. Int. 201, 1801–1813 (2015). doi: 10.1093/gji/ggv126

89. F. Qian et al., Unsupervised seismic facies analysis via deep convolutional autoencoders. Geophysics 83, A39–A43 (2018). doi: 10.1190/geo2017-0524.1

90. S. Beckouche, J. Ma, Simultaneous dictionary learning and denoising for seismic data. Geophysics 79, A27–A31 (2014). doi: 10.1190/geo2013-0382.1

91. Y. Chen, J. Ma, S. Fomel, Double-sparsity dictionary for seismic noise attenuation. Geophysics 81, V103–V116 (2016). doi: 10.1190/geo2014-0525.1

92. M. Bianco, P. Gerstoft, Travel time tomography with adaptive dictionaries. arXiv:1712.08655 [physics.geo-ph] (2017).

93. I. Goodfellow et al., Generative adversarial networks. Adv. Neural Inf. Process. Syst. 27, 2672–2680 (2014).

94. L. Mosser, O. Dubrule, M. J. Blunt, Reconstruction of three-dimensional porous media using generative adversarial neural networks. Phys. Rev. E 96, 043309 (2017). doi: 10.1103/PhysRevE.96.043309; pmid: 29347591

95. A. B. Patel, M. T. Nguyen, R. Baraniuk, A probabilistic framework for deep learning. Adv. Neural Inf. Process. Syst. 29, 2558–2566 (2016).

96. D. P. Kingma, M. Welling, Auto-encoding variational Bayes. arXiv:1312.6114 [stat.ML] (2013).

97. J. B. Tenenbaum, C. Kemp, T. L. Griffiths, N. D. Goodman, How to grow a mind: Statistics, structure, and abstraction. Science 331, 1279–1285 (2011). doi: 10.1126/science.1192788; pmid: 21393536

98. L. Krischer, A. Fichtner, Generating seismograms with deep neural networks. AGU Fall Meeting Abstracts, abstract S41D-03 (2017).

99. L. Mosser, W. Kimman, J. Dramsch, S. Purves, A. De la Fuente, G. Ganssle, Rapid seismic domain transfer: Seismic velocity inversion and modeling using deep generative neural networks. arXiv:1805.08826 [physics.geo-ph] (2018).

100. E. Dupont, T. Zhang, P. Tilke, L. Liang, W. Bailey, Generating realistic geology conditioned on physical measurements with generative adversarial networks. arXiv:1802.03065 [stat.ML] (2018).

101. S. Chan, A. H. Elsheikh, Parametrization and generation of geological models with generative adversarial networks. arXiv:1708.01810 [stat.ML] (2017).

102. A. Veillard, O. Morère, M. Grout, J. Gruffeille, Fast 3D seismic interpretation with unsupervised deep learning: Application to a potash network in the North Sea, in 80th EAGE Conference and Exhibition 2018 (2018). doi: 10.3997/2214-4609.201800738

103. T. J. Draelos, M. G. Peterson, H. A. Knox, B. J. Lawry, K. E. Phillips-Alonge, A. E. Ziegler, E. P. Chael, C. J. Young, A. Faust, Dynamic tuning of seismic signal detector trigger levels for local networks. Bull. Seismol. Soc. Am. 108, 1346–1354 (2018). doi: 10.1785/0120170200

104. J. Zhang, H. Zhang, E. Chen, Y. Zheng, W. Kuang, X. Zhang, Real-time earthquake monitoring using a search engine method. Nat. Commun. 5, 5664 (2014). doi: 10.1038/ncomms6664; pmid: 25472861

105. L. Yin, J. Andrews, T. Heaton, Reducing process delays for real-time earthquake parameter estimation – An application of KD tree to large databases for Earthquake Early Warning. Comput. Geosci. 114, 22–29 (2018). doi: 10.1016/j.cageo.2018.01.001

106. R. Tibi, C. Young, A. Gonzales, S. Ballard, A. Encarnacao, Rapid and robust cross-correlation-based seismic phase identification using an approximate nearest neighbor method. Bull. Seismol. Soc. Am. 107, 1954–1968 (2017). doi: 10.1785/0120170011

107. C. E. Yoon, O. O'Reilly, K. J. Bergen, G. C. Beroza, Earthquake detection through computationally efficient similarity search. Sci. Adv. 1, e1501057 (2015). doi: 10.1126/sciadv.1501057; pmid: 26665176

108. K. J. Bergen, G. C. Beroza, Detecting earthquakes over a seismic network using single-station similarity measures. Geophys. J. Int. 213, 1984–1998 (2018). doi: 10.1093/gji/ggy100

109. K. Rong, C. E. Yoon, K. J. Bergen, H. Elezabi, P. Bailis, P. Levis, G. C. Beroza, Locality-sensitive hashing for earthquake detection: A case study scaling data-driven science. Proceedings of the International Conference on Very Large Data Bases (PVLDB) 11, 1674 (2018). doi: 10.14778/3236187.3236214

110. N. Riahi, P. Gerstoft, Using graph clustering to locate sources within a dense sensor array. Signal Processing 132, 110–120 (2017). doi: 10.1016/j.sigpro.2016.10.001

111. A. C. Aguiar, G. C. Beroza, PageRank for Earthquakes. Seismol. Res. Lett. 85, 344–350 (2014). doi: 10.1785/0220130162

112. M. Campbell, A. J. Hoane Jr., F.-H. Hsu, Deep Blue. Artif. Intell. 134, 57–83 (2002). doi: 10.1016/S0004-3702(01)00129-1

113. D. Ferrucci, A. Levas, S. Bagchi, D. Gondek, E. T. Mueller, Watson: Beyond Jeopardy! Artif. Intell. 199–200, 93–105 (2013). doi: 10.1016/j.artint.2012.06.009

114. D. Silver et al., Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016). doi: 10.1038/nature16961; pmid: 26819042

115. M. Moravčík et al., DeepStack: Expert-level artificial intelligence in heads-up no-limit poker. Science 356, 508–513 (2017). doi: 10.1126/science.aam6960; pmid: 28254783

116. A. Krizhevsky, I. Sutskever, G. E. Hinton, ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105 (2012).

117. Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998). doi: 10.1109/5.726791

118. J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, L. Fei-Fei, ImageNet: A large-scale hierarchical image database, in 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE, 2009), pp. 248–255. doi: 10.1109/CVPR.2009.5206848

119. Aftershock detection contest, https://tianchi.aliyun.com/competition/introduction.htm?raceId=231606; accessed 7 September 2018.

120. L. Fang, Z. Wu, K. Song, SeismOlympics. Seismol. Res. Lett. 88, 1429–1430 (2017). doi: 10.1785/0220170134

121. D. Sculley, J. Snoek, A. Wiltschko, A. Rahimi, Winner's curse? On pace, progress, and empirical rigor, in International Conference on Learning Representations (ICLR) 2018 Workshop (2018); https://openreview.net/forum?id=rJWF0Fywf.

122. Y. Gil et al., Toward the Geoscience Paper of the Future: Best practices for documenting and sharing research from data to software to provenance. Earth Space Sci. 3, 388–415 (2016). doi: 10.1002/2015EA000136

123. GitHub, https://github.com.

124. GitLab, https://about.gitlab.com.

125. M. Beyreuther et al., ObsPy: A Python Toolbox for Seismology. Seismol. Res. Lett. 81, 530–533 (2010). doi: 10.1785/gssrl.81.3.530

126. L. Krischer et al., ObsPy: A bridge for seismology into the scientific Python ecosystem. Comput. Sci. Discov. 8, 014003 (2015). doi: 10.1088/1749-4699/8/1/014003

127. S. Heimann et al., Pyrocko – An open-source seismology toolbox and library, v. 0.3, GFZ Data Services (2017). doi: 10.5880/GFZ.2.1.2017.001

128. F. Pedregosa et al., Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).

129. arXiv e-Print archive, https://arxiv.org.

130. EarthArXiv Preprints, https://eartharxiv.org.

131. Earth and Space Science Open Archive, https://essoar.org.

132. J. Kaiser, The preprint dilemma. Science 357, 1344–1349 (2017). doi: 10.1126/science.357.6358.1344; pmid: 28963238

133. C. Sutton, L. Gong, Popularity of arXiv.org within computer science. arXiv:1710.05225 [cs.DL] (2017).

134. A. J. Riquelme, A. Abellán, R. Tomás, M. Jaboyedoff, A new approach for semi-automatic rock mass joints recognition from 3D point clouds. Comput. Geosci. 68, 38–52 (2014). doi: 10.1016/j.cageo.2014.03.014

135. N. J. Lindsey et al., Fiber-Optic Network Observations of Earthquake Wavefields. Geophys. Res. Lett. 44, 11,792–11,799 (2017). doi: 10.1002/2017GL075722

136. E. R. Martin et al., A Seismic Shift in Scalable Acquisition Demands New Processing: Fiber-Optic Seismic Signal Retrieval in Urban Areas with Unsupervised Learning for Coherent Noise Removal. IEEE Signal Process. Mag. 35, 31–40 (2018). doi: 10.1109/MSP.2017.2783381

137. G. Marra et al., Ultrastable laser interferometry for earthquake detection with terrestrial and submarine cables. Science 361, 486–490 (2018). doi: 10.1126/science.aat4458

138. T. Sakaki, M. Okazaki, Y. Matsuo, Earthquake shakes Twitter users: Real-time event detection by social sensors, in Proceedings of the 19th International Conference on World Wide Web (ACM, 2010), pp. 851–860. doi: 10.1145/1772690.1772777

139. P. S. Earle, D. C. Bowden, M. Guy, Twitter earthquake detection: Earthquake monitoring in a social world. Ann. Geophys. 54, 708–715 (2012). doi: 10.4401/ag-5364

140. R. Bossu, G. Mazet-Roux, V. Douet, S. Rives, S. Marin, M. Aupetit, Internet Users as Seismic Sensors for Improved Earthquake Response. Eos 89, 225–226 (2008). doi: 10.1029/2008EO250001

141. E. S. Cochran, J. F. Lawrence, C. Christensen, R. S. Jakka, The Quake-Catcher Network: Citizen Science Expanding Seismic Horizons. Seismol. Res. Lett. 80, 26–30 (2009). doi: 10.1785/gssrl.80.1.26

142. DigitalGlobe, http://www.digitalglobe.com.

143. A. Karpatne, I. Ebert-Uphoff, S. Ravela, H. A. Babaie, V. Kumar, Machine learning for the geosciences: Challenges and opportunities. arXiv:1711.04708 [cs.LG] (2017).

144. A. Karpatne et al., Theory-Guided Data Science: A New Paradigm for Scientific Discovery from Data. IEEE Trans. Knowl. Data Eng. 29, 2318–2331 (2017). doi: 10.1109/TKDE.2017.2720168

145. J.-Y. Zhu, T. Park, P. Isola, A. A. Efros, Unpaired image-to-image translation using cycle-consistent adversarial networks. arXiv:1703.10593 [cs.CV] (2017).

146. I. Ebert-Uphoff, Y. Deng, Three steps to successful collaboration with data scientists. Eos 98 (2017). doi: 10.1029/2017EO079977

ACKNOWLEDGMENTS

This article evolved from presentations and discussions at the workshop "Information is in the Noise: Signatures of Evolving Fracture Systems," held in March 2018 in Gaithersburg, Maryland. The workshop was sponsored by the Council on Chemical Sciences, Geosciences and Biosciences of the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences. We thank the members of the council for their encouragement and assistance in developing this workshop. Funding: K.J.B. and G.C.B. acknowledge support from National Science Foundation (NSF) grant no. EAR-1818579. K.J.B. acknowledges support from the Harvard Data Science Initiative. P.A.J. acknowledges support from Institutional Support (LDRD) at Los Alamos and the Office of Science (OBES) grant KC030206. M.V.d.H. acknowledges support from the Simons Foundation under the MATH + X program and the National Science Foundation (NSF) grant no. DMS-1559587. P.A.J. thanks B. Rouet-LeDuc, I. McBrearty, and C. Hulbert for fundamental insights. Competing interests: The authors declare no competing interests.

10.1126/science.aau0323


Machine learning for data-driven discovery in solid Earth geoscience
Karianne J. Bergen, Paul A. Johnson, Maarten V. de Hoop and Gregory C. Beroza

Science 363 (6433), eaau0323. DOI: 10.1126/science.aau0323

Automating geoscience analysis

Solid Earth geoscience is a field that has a very large set of observations, which are ideal for analysis with machine-learning methods. Bergen et al. review how these methods can be applied to solid Earth datasets. Adopting machine-learning techniques is important for extracting information and for understanding the increasing amount of complex data collected in the geosciences.

Science, this issue p. eaau0323

