Ontology Driven Data Mining
A.K. Sinha, Dept. of Geo Sciences, Virginia Tech
Satish Tadepalli, Dept. of Computer Science, Virginia Tech
Jan 15, 2016
Ontology-Driven Data Mining
Data Mining:
– Analysis of observational data sets to find unsuspected relationships and to summarize the data in novel ways
Ontology
– Represents domain knowledge
– Relationships between concepts in a domain
Ontology-driven data mining
– Use the knowledge represented by ontologies to create a hierarchical structure in the data
– Apply data mining techniques on the structured data sets
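One way to picture an ontology as relationships between concepts is a small "is-a" hierarchy that can be walked in both directions. The sketch below is illustrative only: the root name "Tectonic Setting", the dictionary layout, and the placement of the location names under Convergent Margins are assumptions, not the ontology used in this work.

```python
# Hypothetical, hand-written fragment of a tectonic-setting ontology.
# Keys are concepts; values are their direct sub-concepts ("is-a" children).
# The nesting shown here is an assumption made for illustration.
ONTOLOGY = {
    "Tectonic Setting": ["Convergent Margins", "Continental Flood Basalts"],
    "Convergent Margins": ["Tonga", "New Zealand", "Central America"],
    "Continental Flood Basalts": [],
    "Tonga": [],
    "New Zealand": [],
    "Central America": [],
}


def descendants(concept, ontology=ONTOLOGY):
    """Return every concept below `concept`, in depth-first order."""
    found = []
    for child in ontology.get(concept, []):
        found.append(child)
        found.extend(descendants(child, ontology))
    return found


def path_to(concept, root="Tectonic Setting", ontology=ONTOLOGY):
    """Return the concepts from `root` down to `concept`, or None if absent."""
    if root == concept:
        return [root]
    for child in ontology.get(root, []):
        sub_path = path_to(concept, child, ontology)
        if sub_path is not None:
            return [root] + sub_path
    return None
```

Here `descendants("Convergent Margins")` lists the location-based subclasses, and `path_to("Tonga")` gives the chain of increasingly general concepts that a sample from Tonga belongs to.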
GeoROC Database (http://georoc.mpch-mainz.gwdg.de/)
GeoROC Data and Present Tectonic Setting
Broad tectonic classification of the GeoROC data set for applying data mining techniques
Classes
· Convergent Margins
· Continental Flood Basalts
· Ocean Basin Flood Basalts
· Ocean Island Groups
· Ocean Island Plateaus
· Others
Subclasses (Location-based)
· Tonga
· New Zealand
· Papua New Guinea
· Central America
· Others
Attributes (Chemical/Isotope)
· SiO2
· Al2O3
· MnO
· Sr87/Sr86
· Others
Structuring the data sets based on ontology
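The class, subclass, and attribute levels above can be sketched as a nested grouping of sample records. This is a minimal illustration, not the structuring code used in this work: the record layout, the function names `structure` and `view`, and all numeric values are hypothetical placeholders rather than GeoROC data.

```python
from collections import defaultdict

# Hypothetical sample records; the numbers are placeholders, not GeoROC data.
SAMPLES = [
    {"class": "Convergent Margins", "subclass": "Tonga", "SiO2": 52.0, "MnO": 0.18},
    {"class": "Convergent Margins", "subclass": "Tonga", "SiO2": 55.0, "MnO": 0.16},
    {"class": "Convergent Margins", "subclass": "New Zealand", "SiO2": 60.0, "MnO": 0.12},
    {"class": "Continental Flood Basalts", "subclass": "Others", "SiO2": 49.0, "MnO": 0.20},
]


def structure(samples, attributes):
    """Nest samples as class -> subclass -> list of selected attribute values."""
    tree = defaultdict(lambda: defaultdict(list))
    for sample in samples:
        row = {name: sample[name] for name in attributes}
        tree[sample["class"]][sample["subclass"]].append(row)
    return tree


def view(tree, cls, subclass=None):
    """Return rows for a whole class, or for one subclass within it."""
    if subclass is not None:
        return list(tree[cls][subclass])
    return [row for rows in tree[cls].values() for row in rows]
```

Calling `view(tree, "Convergent Margins")` gives the class-level data set, while `view(tree, "Convergent Margins", "Tonga")` narrows the same data to one location, so a mining technique can be applied at either level.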
Correlation Analysis
[Figure: Correlations in Continental Convergent Margins. Correlation coefficients (axis from -1 to 1) for Si-K, Si-Na2O, and Si-Fe, shown for Cascades, Andean, and Both.]
[Figure: Correlations in Oceanic Convergent Margins. Correlation coefficients (axis from -1 to 1) for Si-K, Si-Na2O, and Si-Fe, shown for Tonga, Mariana, and Both.]
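A per-subclass correlation of this kind could be computed roughly as follows. The slides do not say which coefficient was used, so Pearson's r is an assumption here. The column names `SiO2` and `K2O` and every number are made-up placeholders chosen only to exercise the code; they are not GeoROC values and say nothing about the real correlations.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Placeholder measurements per subclass (NOT real data).
GROUPS = {
    "Tonga": {"SiO2": [50.0, 54.0, 58.0, 62.0], "K2O": [0.3, 0.5, 0.7, 0.9]},
    "Mariana": {"SiO2": [51.0, 55.0, 59.0, 63.0], "K2O": [0.4, 0.9, 0.6, 1.1]},
}


def correlations(groups, x_name, y_name):
    """Correlate two attributes within each group and over all groups pooled."""
    result = {
        name: pearson(values[x_name], values[y_name])
        for name, values in groups.items()
    }
    pooled_x = [v for values in groups.values() for v in values[x_name]]
    pooled_y = [v for values in groups.values() for v in values[y_name]]
    result["Both"] = pearson(pooled_x, pooled_y)
    return result
```

`correlations(GROUPS, "SiO2", "K2O")` returns one coefficient per subclass plus a pooled "Both" entry, mirroring the three groups in each chart.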
Classification Using Neural Networks
Present-day plate tectonic settings and their associated data are the key to recognizing the paleo-tectonic settings of rocks.
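The idea of learning from present-day settings and then labelling other samples can be sketched with the simplest possible network, a single sigmoid unit. This is a stand-in for illustration, not the network architecture used in this work. The two input features are hypothetical pre-scaled attribute values, the rows are placeholders rather than GeoROC data, and the two class labels are picked arbitrarily from the classes listed earlier.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, labels, epochs=200, learning_rate=0.5):
    """Fit one sigmoid unit by batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            error = p - y
            for i, v in enumerate(x):
                grad_w[i] += error * v
            grad_b += error
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


def predict(model, x):
    """Return the index of the predicted class (0 or 1)."""
    weights, bias = model
    p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
    return 1 if p >= 0.5 else 0


# Placeholder training set: two scaled attributes per sample (NOT real data).
ROWS = [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7), (0.9, 0.1), (0.8, 0.2), (0.7, 0.3)]
LABELS = [0, 0, 0, 1, 1, 1]
SETTINGS = ["Ocean Island Groups", "Convergent Margins"]  # label names assumed

MODEL = train(ROWS, LABELS)
```

A sample of unknown setting is then labelled with `SETTINGS[predict(MODEL, features)]`. A practical classifier would need hidden layers, more attributes, and held-out data for validation.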
Ongoing Research
– Data mining of spatial data sets using Gaussian processes
– Sparse data mining
Conclusion
Ontology driven data mining
– Meaningful patterns at multiple levels of abstraction
– Multiple views of same data set
– Ease in choosing the relevant data sets for comparison