Disruption Warning Database Development and Exploratory Machine Learning Studies on Alcator C-Mod
K. Montes, C. Rea, R. Granetz
Plasma Science and Fusion Center, Massachusetts Institute of Technology

Introduction
• Predicting and mitigating disruptions in tokamaks is critical to the mission of sustaining a fusion plasma
• To understand what causes disruptions, we want to answer:
  • Which parameters are correlated with the approach of a disruption? What are their threshold levels?
  • Are the thresholds reached with significant warning time?
  • Are there combinations of parameters that are useful?
  • Are the same parameters useful on different tokamaks?
• Goal: Develop a disruption warning algorithm that works in near real-time, embedded in the plasma control system

Disruption Warning Database
• SQL database of > 40 parameters from 1821 shots (~160k time slices) from the 2015 C-Mod campaign
• Only time slices in the I_p flattop are included; the database is composed of non-disruptive discharges and discharges that disrupted during the flattop
• Intentional massive gas injection (MGI) disruptions are ignored
• Each database record consists of all parameter values at one time slice, recorded every 20 ms; for each disruption, additional time slices are taken every 1 ms during the 20 ms period before the disruption

  Total # of Shots                 1821
  Non-Disruptive Flattop Shots     1160
  Disruptions in Flattop            206
  Intentional MGI Disruptions        17

[Figure: Non-causal filtering example, Flattop Shot # 1150501010. Traces: AXUV diode channel (no smoothing); signal from blackened bolometer with non-causal smoothing (not OK near disruptions). Calculation on C-Mod taken from AXUV diode to avoid non-causal filter.]

C-Mod and DIII-D Comparison

  Parameter                        DIII-D     C-Mod
  Major Radius                     1.67 m     68 cm
  Minor Radius                     67 cm      22 cm
  Toroidal Field                   ~2 T       3-8 T
  Plasma Current                   3.0 MA     0.4-2 MA
  Confinement Time [2]             ≈ 0.1 s    ≈ 0.04 s
  Current Relaxation Time [3,4]    ~1 s       ~0.2 s

• A common cause of C-Mod disruptions is radiative collapse caused by high-Z molybdenum from the first wall (1-2 ms timescale)
• In contrast, DIII-D has a low-Z carbon wall; most of its disruptions are due to MHD instabilities
• Large overlap of internal inductance distributions compared to DIII-D for time slices grouped via the multi-class classification case

[Figures: 40 parameters; panels for I_p (MA), n_e (m^-3), and q_95, with DIII-D comparison from [1] C. Rea et al., APS (Oct. 2017); "Power Spike Before C-Mod Disruption", Shot # 1150806029; C-Mod l_i distribution with mean.]

Machine Learning Algorithms
• Supervised Learning: learn $y = f(\vec{x})$
  • Classification: $y$ = discrete (class). Random forest, logistic regression, support vector machines, etc.
  • Regression: $y$ = continuous (likelihood or time). Linear regression, neural networks, random forest, etc.
• Unsupervised Learning: search the inputs $\vec{x}$ for structure & patterns
  • Clustering: discover groupings in parameter space. K-means clustering, self-organizing maps, Gaussian mixture models, etc.
  • Association: discover rules that relate data. Apriori algorithm, equivalence class transformation, etc.

Supervised Learning for Classification
• Given input parameters $\vec{x}$ and historical knowledge of disrupted shots $y$, how can we find patterns to distinguish disruptions in our database?
• Random forest for classification using 3 different labeling schemes
• Minimize an impurity measure to determine the splitting value at each decision node

[Figure: example decision tree. Decision nodes test $x_1 > −0.55$ and $x_2 > 0.3$ with Yes/No branches; the leaves are the regions R_1, R_2, R_3.]

Input features (7 dimensionless or machine-independent parameters):
  1  Plasma Current Error Fraction            ip_error_frac
  2  Internal Inductance                      li
  3  Greenwald Fraction                       n/nG
  4  q_95 (Safety Factor at r = 0.95a)        q95
  5  Poloidal Beta Ratio                      betap
  6  Loop Voltage                             Vloop
  7  Radiated Power Fraction                  prad_frac

Binary Classification:
• ‘non-disrupted’ = sample from shot with no disruption
• ‘disrupted’ = sample from disrupted shot
Classification Accuracy:
• Disrupted: 52.6 %
• Non-Disrupted: 97.0 %
• Overall: 91.2 %

Binary Phase Classification:
• ‘stable’ = non-disrupted or > 40 ms from disruption
• ‘disruptive’ = < 40 ms from disruption
Classification Accuracy:
• Disruptive: 48.5 %
• Stable: 99.3 %
• Overall: 97.3 %

Multi-Class Classification:
• ‘non-disrupted’ = sample from shot with no disruption
• ‘far from disr’ = sample from disrupted shot > 40 ms from disruption
• ‘close to disr’ = sample from disrupted shot < 40 ms from disruption
Classification Accuracy:
• Non-Disrupted: 97.4 %
• Far from Disr: 37.3 %
• Close to Disr: 53.3 %
• Overall: 90.1 %

Conclusions and Future Work
• Avoid processing signals with non-causal filtering; this can introduce post-disruption effects into pre-disruption data
• Signals in the database were pre-processed to avoid excessive smoothing and interpolation
• Analyzed 7 of the 40 database parameters (dimensionless or machine-independent) using a machine learning algorithm
• The difference in timescales between DIII-D and C-Mod is evident when comparing design points and the time evolution of parameters
• Poorer predictive capability on Alcator C-Mod compared to DIII-D may be due to faster disruption-relevant timescales
• At the present data acquisition rate, it is difficult to predict disruptions
• Future work: compare the performance of other ML algorithms and study the dependence on new features as the database is updated

References
[1] C. Rea et al., APS (Oct. 2017)
[2] O. Sauter and Y. Martin, Nuclear Fusion 40 (2000) 955
[3] C. M. Greenfield et al., Plasma Physics and Controlled Fusion 46 (2004) 12B
[4] G. M. Wallace et al., IAEA Conference (2012)
[5] J. Vega et al., Fusion Engineering and Design 88 (2013)
[6] E. Alpaydin, “Introduction to Machine Learning”, 2nd Edition, MIT Press
[7] L. Breiman, “Random Forests”, Machine Learning 45(1), 5-32, 2001
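
Illustrative sketch: labeling schemes
The three labeling schemes listed under Supervised Learning for Classification differ only in how a time slice is mapped to a class. The sketch below is one possible way to express that mapping, not the authors' code. The function name, the scheme names, and the convention that `None` marks a non-disrupted shot are hypothetical. The poster defines the classes with "> 40 ms" and "< 40 ms", so the treatment of a slice at exactly 40 ms is an assumption.

```python
from typing import Optional

# 40 ms boundary between "far" and "close", as stated on the poster.
CLOSE_WINDOW_S = 0.040


def label_sample(time_until_disrupt: Optional[float], scheme: str) -> str:
    """Label one time slice under one of the three schemes.

    time_until_disrupt: seconds before the disruption, or None when the
        shot did not disrupt (hypothetical convention).
    scheme: "binary", "binary_phase", or "multi_class" (hypothetical names).
    """
    disrupted_shot = time_until_disrupt is not None
    # Assumption: a slice exactly 40 ms before the disruption counts as close.
    close = disrupted_shot and time_until_disrupt <= CLOSE_WINDOW_S

    if scheme == "binary":
        return "disrupted" if disrupted_shot else "non-disrupted"
    if scheme == "binary_phase":
        return "disruptive" if close else "stable"
    if scheme == "multi_class":
        if not disrupted_shot:
            return "non-disrupted"
        return "close to disr" if close else "far from disr"
    raise ValueError(f"unknown labeling scheme: {scheme!r}")
```

For example, a slice taken 100 ms before a disruption is 'disrupted' under the binary scheme, 'stable' under the binary phase scheme, and 'far from disr' under the multi-class scheme.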
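
Illustrative sketch: choosing a split by minimizing impurity
The poster states that each decision node picks its splitting value by minimizing an impurity measure, but the extracted text does not preserve which measure was used. The sketch below assumes Gini impurity, a common choice for classification trees, and searches the candidate thresholds of a single feature. The function names and the toy data are hypothetical; a random forest repeats this kind of search over bootstrapped samples and random feature subsets [7].

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def gini(labels: Sequence[str]) -> float:
    """Gini impurity of a set of class labels (0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(values: Sequence[float],
               labels: Sequence[str]) -> Tuple[Optional[float], float]:
    """Return (threshold, weighted impurity) of the best split "value > threshold".

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, parent impurity) if no split improves on the parent node.
    """
    pairs: List[Tuple[float, str]] = sorted(zip(values, labels))
    n = len(pairs)
    best_thr: Optional[float] = None
    best_imp = gini(labels)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        thr = 0.5 * (pairs[i - 1][0] + pairs[i][0])
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        imp = (len(left) * gini(left) + len(right) * gini(right)) / n
        if imp < best_imp:
            best_thr, best_imp = thr, imp
    return best_thr, best_imp


# Toy data (made up for illustration, not taken from the database).
feature = [-1.0, -0.8, -0.3, 0.1]
classes = ["disruptive", "disruptive", "stable", "stable"]
threshold, impurity = best_split(feature, classes)
```

On this toy data the two classes are perfectly separable, so the search selects the midpoint between −0.8 and −0.3 and both child nodes are pure.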
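
Illustrative sketch: causal versus non-causal smoothing
The warning against non-causal filtering can be made concrete with a toy signal. A centered moving average uses samples from the future, so a spike at the disruption leaks into the smoothed values before it. A trailing moving average uses only past and present samples. The functions and the synthetic signal below are hypothetical and are not the processing used for the database.

```python
from typing import List, Sequence


def centered_mean(x: Sequence[float], half_width: int) -> List[float]:
    """Non-causal smoothing: averages samples on both sides of each point."""
    out = []
    for i in range(len(x)):
        lo = max(0, i - half_width)
        hi = min(len(x), i + half_width + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def trailing_mean(x: Sequence[float], width: int) -> List[float]:
    """Causal smoothing: averages the current sample and earlier ones only."""
    out = []
    for i in range(len(x)):
        lo = max(0, i - width + 1)
        out.append(sum(x[lo:i + 1]) / (i + 1 - lo))
    return out


# Synthetic signal: flat at 1.0, then a spike starting at index 10.
signal = [1.0] * 10 + [100.0] * 3
spike_start = 10

non_causal = centered_mean(signal, half_width=2)
causal = trailing_mean(signal, width=5)

# Indices of pre-spike samples whose smoothed value was changed by the spike.
contaminated_non_causal = [i for i in range(spike_start) if non_causal[i] != 1.0]
contaminated_causal = [i for i in range(spike_start) if causal[i] != 1.0]
```

With a half-width of 2, the two samples just before the spike are contaminated by the centered filter, while the trailing filter leaves every pre-spike sample unchanged.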