Anomaly Detection in Gamma Ray Spectra: A Machine Learning Perspective
Nathalie Japkowicz, Colin Bellinger, Shiven Sharma, Rodney Berg, Kurt Ungar
University of Ottawa; Northern Illinois University; Radiation Protection Bureau, Health Canada
Feb 23, 2016
Goal and Methodology
Goal: To identify people concealing radioactive material that may represent a threat to attendees at public gatherings.
Methodology: Analysis of gamma-ray spectra produced by spectrometers at short time intervals, with an on-the-fly decision on whether a threat is present.
General idea: Place spectrometers in strategic locations (e.g., the entry points to the event) and detect whether incoming spectra are similar to or different from a normal spectrum for that particular location.
Gamma-Ray Spectroscopy (Wikipedia)
The gamma-ray spectrum of natural uranium, showing about a dozen discrete lines superimposed on a smooth continuum, allows the identification of the nuclides 226Ra, 214Pb, and 214Bi of the uranium decay chain.
The quantitative study of the energy spectra of gamma-ray sources.
Most radioactive sources produce gamma rays of various energy levels and intensities.
The data
I = Iodine, Tc = Technetium, Th = Thallium, Cs = Cesium, Co = Cobalt
Approach
To apply machine learning / pattern recognition techniques to the data.
Issue 1: There is a lot of background data, but very few alarms. E.g., for one station: 24,712/6 (background/alarms).
Data was augmented with simulated Cobalt entries (though we only used that data for testing).
We used one-class learning / anomaly detection algorithms to deal with this extreme class imbalance.
Issue 2: We discovered that rain was a problem, as it masked the presence of isotopes in the spectra.
Since we had labelled data for both the rain and non-rain classes, we used binary classification on this problem.
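To make the rain / non-rain step concrete, here is a minimal sketch of a binary classifier trained on labelled examples of both classes. It uses Gaussian naive Bayes purely for illustration. The two features (counts in two energy windows), the toy values, and the labels "dry" and "rain" are hypothetical and do not come from the study's data.

```python
import math

def fit_gaussian_nb(X, y):
    """Fit per-class log-priors, feature means and feature variances."""
    model = {}
    n_total = len(X)
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small variance floor avoids division by zero on constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (math.log(n / n_total), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Return the label with the highest log-posterior for feature vector x."""
    def log_posterior(params):
        log_prior, means, variances = params
        log_likelihood = 0.0
        for v, m, var in zip(x, means, variances):
            log_likelihood += (-0.5 * math.log(2 * math.pi * var)
                               - (v - m) ** 2 / (2 * var))
        return log_prior + log_likelihood
    return max(model, key=lambda label: log_posterior(model[label]))

# Hypothetical toy features: counts in two energy windows per spectrum.
X_train = [[100, 50], [104, 48], [98, 53], [101, 51],   # labelled non-rain
           [140, 80], [150, 85], [145, 78], [138, 82]]  # labelled rain
y_train = ["dry"] * 4 + ["rain"] * 4

model = fit_gaussian_nb(X_train, y_train)
print(predict_gaussian_nb(model, [102, 49]))
print(predict_gaussian_nb(model, [147, 81]))
```

Because both classes have plenty of labelled examples, an ordinary two-class learner like this is applicable here, unlike the background/alarm problem, where the alarm class is too small to learn from directly.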
The effect of rain
Hypothesis
Separating rain from non-rain data in a first phase and applying an anomaly detection system on each group of data separately in a second phase could help us improve the results.
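The two-phase idea can be sketched as a simple routing function: phase 1 decides which weather group a spectrum belongs to, and phase 2 applies the anomaly detector fitted for that group. Everything below is a hypothetical stand-in for illustration (the function names, the dictionary keys, the feature names and the thresholds); it is not the system used in the study.

```python
def two_phase_detect(spectrum, is_rain, detectors):
    """Route a spectrum to the anomaly detector for its weather group.

    is_rain:   callable returning True/False for a spectrum (phase 1).
    detectors: dict with keys "rain" and "dry"; each value is a callable
               returning True when the spectrum looks anomalous (phase 2).
    """
    group = "rain" if is_rain(spectrum) else "dry"
    return group, detectors[group](spectrum)

# Hypothetical phase-1 rule: rain raises the overall background level.
def toy_is_rain(spectrum):
    return spectrum["background_counts"] > 120

# Hypothetical phase-2 rules: each group gets its own alarm threshold.
detectors = {
    "dry": lambda s: s["peak_counts"] > 30,
    "rain": lambda s: s["peak_counts"] > 45,
}

dry_spectrum = {"background_counts": 100, "peak_counts": 40}
rain_spectrum = {"background_counts": 150, "peak_counts": 40}

print(two_phase_detect(dry_spectrum, toy_is_rain, detectors))
print(two_phase_detect(rain_spectrum, toy_is_rain, detectors))
```

In this toy setup the same peak count is judged against a different notion of "normal" depending on the weather group, which is the intuition behind modelling the two groups separately.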
Approach (cont’d)
Experiments
Experiments (Cont’d)
We experimented with different classifiers in both phases.
Phase 1:
Classifiers tried: SVM, J48, NB, MLP and IBL.
Winner: NB
Phase 2:
Classifiers tried: oc-SVM, AA, Mahalanobis Distance
Winner: Mahalanobis Distance
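As an illustration of the phase 2 winner, here is a minimal sketch of one-class anomaly scoring with the Mahalanobis distance, D(x) = sqrt((x - mu)^T Sigma^-1 (x - mu)), where mu and Sigma are the mean and covariance estimated from background data only. The two-feature toy data and the threshold rule (largest distance seen on the training background) are assumptions made for this example, not the settings used in the experiments.

```python
import math

def mean_and_covariance(rows):
    """Sample mean vector and (unbiased) covariance matrix of the rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def mahalanobis(x, mean, cov):
    """Distance of x from the background distribution (mean, cov)."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    return math.sqrt(sum(d * s for d, s in zip(diff, solve(cov, diff))))

# Hypothetical background-only training data (two features per spectrum).
background = [[100, 50], [104, 48], [98, 53], [101, 51], [97, 49], [103, 52]]
mean, cov = mean_and_covariance(background)

# Assumed threshold rule: the largest distance seen on the training background.
threshold = max(mahalanobis(row, mean, cov) for row in background)

for candidate in ([100, 51], [101, 70]):
    score = mahalanobis(candidate, mean, cov)
    print(candidate, "anomalous" if score > threshold else "normal")
```

Only background data is needed to fit the model, which is why this kind of detector suits a setting with very few real alarms; alarm-like examples are needed only for testing.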
Experiments (Cont’d)
Results
Conclusions and report on further experiments