University of Illinois at Urbana-Champaign
PET Program Year-End Review, 1999
Neural, Bayesian, and Evolutionary Systems for High-Performance Computational Knowledge Management: Progress Report
Wednesday, August 4, 1999
William H. Hsu, Ph.D.
Automated Learning Group
National Center for Supercomputing Applications
http://www.ncsa.uiuc.edu/People/bhsu
Data Mining: Objectives for Testing and Evaluation
• Objectives
– Scalability: handling disparity in temporal, spatial granularity
– Data integrity: verification (formal model) or validation (testing)
– Multimodality: ability to integrate knowledge/data sources
– Efficiency: consume only the necessary bandwidth for model
– Acquisition (data warehousing)
– Maintenance (incrementality)
– Analysis (interactive, configurable data mining system)
– Visualization (transparent user interface)
• Applicable Technologies
– Selective downsampling: adapting grain size of data model
– Data model validation
– Simple relational database (RDB) model
– Ontology: knowledge base definition, units, abstract data types
– Multimodal sensor integration: mixture models for data fusion
– Data preparation: selection, synthesis, partitioning of data channels
Data Models and Ontologies (Super ADOCS Data Format)
[Figure: data model diagram; surviving labels: Ballistics, Hazard, Diagnostic]
Data Mining: Data Fusion System for Testing and Evaluation
[Figure: block diagram; surviving labels: Multiattribute Data Set (x); Attribute Selection and Partitioning (x'_1, …, x'_n); Subproblem Definition (y'_1, …, y'_n); Partition Evaluator; Metric-Based Model Selection; Learning Architecture; Learning Method; Learning Specification: Subproblem (Architecture, Method); Data Fusion; Overall Prediction]
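The stages named on this slide (attribute partitioning, one learned model per subproblem, data fusion into an overall prediction) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the deck does not say which learners or which fusion rule were used, so the sketch fits a one-input least-squares line per attribute subset and fuses the predictions with inverse-error weights. The function names and the toy data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y ~ a*x + b for a single input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = 0.0 if sxx == 0 else sxy / sxx
    return a, my - a * mx


def subset_mean(row, subset):
    """Summarize one attribute subset of a row by its mean (an assumption)."""
    return sum(row[i] for i in subset) / len(subset)


def train_fusion(rows, targets, partition):
    """Fit one model per attribute subset; weight it by inverse training error."""
    models = []
    for subset in partition:
        xs = [subset_mean(row, subset) for row in rows]
        a, b = fit_line(xs, targets)
        mse = sum((a * x + b - y) ** 2 for x, y in zip(xs, targets)) / len(xs)
        models.append((subset, a, b, 1.0 / (mse + 1e-9)))
    return models


def predict(models, row):
    """Fuse the per-subproblem predictions into one overall prediction."""
    total = sum(w for _, _, _, w in models)
    fused = sum(w * (a * subset_mean(row, s) + b) for s, a, b, w in models)
    return fused / total


# Toy data: attributes 0 and 1 determine the target, attribute 2 does not.
rows = [(0, 0, 3), (1, 1, 1), (2, 2, 2), (3, 3, 0)]
targets = [1, 3, 5, 7]
models = train_fusion(rows, targets, partition=[(0, 1), (2,)])
overall = predict(models, (4, 4, 1))
```

Because the first subset fits the toy target exactly, its weight dominates and the fused prediction follows it closely. A mixture model would replace these fixed weights with learned, input-dependent ones.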
Data Mining: Integrated Modeling and Testing (IMT) Information Systems
• Application Testbed
– Aberdeen Test Center: M1 Abrams main battle tank (SEP data, SDF)
– Reliability testing
• T&E Information Systems: Common Characteristics
– Large-Scale Data Model
    • Input (M1 A2 SEP): 1.8Mb ~ 459Mb; minutes to hours
    • Output: 33 caution/warning channels; internal diagnostics
– Data Integrity Requirements
    • Specification of test objective and metrics (in progress)
    • Generated by end user (e.g., author of test report, instrumentation report)
– Multimodality
    • Selection of relevant data channels (given prediction objective)
    • Data fusion problem: data channels from different categories
– Data Reduction Requirements
    • Excess bandwidth: non-uniform downsampling (frequency reduction)
    • Irrelevant data channels (e.g., targeting with respect to excess RPMs)
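One way to picture non-uniform downsampling is to keep samples on a coarse grid and add extra samples only where a channel changes. The sketch below assumes that policy; the deck does not describe the actual reduction rule, and the function name, the `stride` and `threshold` parameters, and the toy signal are hypothetical.

```python
def downsample_nonuniform(samples, stride, threshold):
    """Keep every `stride`-th sample, plus any sample whose value differs
    from the last kept sample by more than `threshold`.

    `samples` is a list of (time, value) pairs in time order.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    kept = []
    for i, (t, v) in enumerate(samples):
        on_grid = i % stride == 0
        changed = bool(kept) and abs(v - kept[-1][1]) > threshold
        if on_grid or changed:
            kept.append((t, v))
    return kept


# A flat channel with one brief excursion at t = 6.
signal = [(t, 0.0) for t in range(10)]
signal[6] = (6, 5.0)
reduced = downsample_nonuniform(signal, stride=4, threshold=1.0)
# reduced == [(0, 0.0), (4, 0.0), (6, 5.0), (7, 0.0), (8, 0.0)]
```

The flat stretches are sampled at one quarter of the original rate, while the excursion and the return to baseline are both preserved.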
Relevance Determination Problems in Testing and Evaluation
• Problems
– Machine learning for decision support and monitoring
– Extraction of temporal features
– Model selection
– Sensor and data fusion
• Solutions
– Clustering and decomposition of learning tasks
– Selection, synthesis, and partitioning of data channels
• Approach
– Simple relational data model
– Relevance determination (importance ranking) for data channels
– Multimodal Data Fusion
– Hierarchy of time series models
– Quantitative (metric-based) model selection
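As a rough illustration of importance ranking for data channels, the sketch below scores each channel by the absolute Pearson correlation between its samples and a prediction target, then sorts by score. The choice of correlation as the relevance metric is an assumption, as are the channel names and values; the deck does not state which metric was used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rank_channels(channels, target):
    """Return (name, score) pairs sorted from most to least relevant."""
    scores = {name: abs(pearson(vals, target)) for name, vals in channels.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical channels and a hypothetical caution/warning target.
channels = {
    "engine_temp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "oil_pressure": [5.0, 4.0, 3.0, 2.0, 1.5],
    "turret_angle": [2.0, 2.0, 2.0, 2.0, 2.0],
}
warning = [0.0, 0.0, 1.0, 1.0, 1.0]
ranking = rank_channels(channels, warning)
```

A constant channel scores zero and lands at the bottom of the ranking, which is the kind of channel a data reduction step could drop for this prediction objective.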
Deployment of KDD and Visualization Components
• Database Access
– SDF (Super ADOCS Data File) import
– Flat file export
– Internal data model: interaction with learning modules
• Deployment
– Java stand-alone application
– Interactive management of modules, data flow
• Presentation: Web-Based Interface
– Simple, URL-based invocation system
    • Common Gateway Interface (CGI) and Perl
    • Alternative implementation: servlets (http://www.javasoft.com)
– Configurable using forms
• Messaging Systems (Deployment Presentation)
– Between configurators and deployment layer
– Between data management modules and visualization components
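A URL-based invocation system of the kind listed under Presentation can be sketched as a function that reads the module name and its settings from a query string. This is an illustration only: it is written in Python rather than the Perl named on the slide, and the URL, the `module` parameter, and the other parameter names are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def describe_request(url):
    """Turn an invocation URL into (module name, parameter dict)."""
    parts = urlsplit(url)
    # parse_qs maps each key to a list of values; keep the first one.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    module = params.pop("module", None)
    if module is None:
        raise ValueError("URL must name a module")
    return module, params


# Hypothetical invocation of a channel-ranking module.
module, params = describe_request(
    "http://example.org/cgi-bin/kdd?module=rank&channels=33&format=flat"
)
```

A form-based configurator would generate URLs of this shape, and the deployment layer would look up the named module and pass it the remaining parameters.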
NCSA Infrastructure for High-Performance Computation in Data Mining [1]
Rapid KDD Development Environment
NCSA Infrastructure for High-Performance Computation in Data Mining [2]
Cluster (Network of Workstations) Model for Master/Slave Genetic Wrapper
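The diagram for this slide is not part of the transcript. As a rough sketch of the idea in the title, a master process can run a genetic algorithm over attribute subsets and hand the fitness evaluations to a pool of workers. Several things here are assumptions: a thread pool stands in for the cluster nodes, the fitness function is a toy stand-in for training and scoring a learner on the selected attributes, and all names and parameter values are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor

RELEVANT = {0, 2}  # hypothetical ground truth, used only by the toy fitness


def fitness(mask):
    """Toy stand-in for wrapper evaluation: reward relevant attributes,
    penalize irrelevant ones."""
    chosen = {i for i, bit in enumerate(mask) if bit}
    return 2 * len(chosen & RELEVANT) - len(chosen - RELEVANT)


def evolve(n_attrs=6, pop_size=12, generations=20, workers=4, seed=0):
    """Master loop: select, recombine, mutate; workers evaluate fitness."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_attrs))
           for _ in range(pop_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(generations):
            scores = list(pool.map(fitness, pop))  # farmed out to workers
            ranked = [m for _, m in sorted(zip(scores, pop), reverse=True)]
            parents = ranked[: pop_size // 2]
            children = [ranked[0]]  # elitism: the best mask always survives
            while len(children) < pop_size:
                a, b = rng.sample(parents, 2)
                cut = rng.randrange(1, n_attrs)  # one-point crossover
                child = list(a[:cut] + b[cut:])
                if rng.random() < 0.2:  # occasional single-bit mutation
                    j = rng.randrange(n_attrs)
                    child[j] = 1 - child[j]
                children.append(tuple(child))
            pop = children
        scores = list(pool.map(fitness, pop))
    return max(zip(scores, pop))


best_score, best_mask = evolve()
```

Only the fitness calls are distributed, which fits a wrapper setting where each evaluation means training a model and is far more expensive than the selection and crossover done by the master.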