Data Science Publication for NSF Polar Cyberinfrastructure
Dr. Brand Niemann
Director and Senior Data Scientist/Data Journalist
Semantic Community
http://semanticommunity.info/
http://www.meetup.com/Federal-Big-Data-Working-Group/
http://semanticommunity.info/Data_Science/Federal_Big_Data_Working_Group_Meetup
November 4, 2014
• Some prep work is already underway (if you scour the Open Science Codefest site you will find some) to prepare some datasets of relevance to the Polar community. We will provide some of this prepared data to interested parties ahead of the workshop in the next few weeks in case folks want to start hacking early. We will tweet under the hash tag: #nsfpolardatavis
Source: Chris Mattmann
3
Overview
• Build the knowledge base (in MindTouch) and spreadsheet (in Excel) first, which then makes it easier to “storify” the results in the Spotfire (data browser) application.
• Follow the Cross-Industry Standard Process for Data Mining (CRISP-DM):
– 1 Business Understanding (of the Hackathon),
– 2 Data Understanding (by mining the Sessions),
– 3 Data Preparation (by screen scraping and downloading),
– 4 Modeling (enough data for statistical significance?),
– 5 Evaluation (How collected? Where stored? What results? Believe them?), and
– 6 Deployment (Story and Demo).
• The documentation will be in the form of the Data Science Publication for NSF Polar Cyberinfrastructure.
• My goal is to see if I can integrate and federate these multiple data sources.
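As a rough illustration, the six phases listed above (the phases of CRISP-DM, the Cross-Industry Standard Process for Data Mining) can be tracked as a simple ordered checklist. This is only a sketch: the phase names and the parenthetical notes come from the Overview slide, while the names `Phase`, `HACKATHON_NOTES`, `checklist`, and `next_phase` are hypothetical and not part of any tool used in this work. The real process is iterative; the linear ordering here is a simplification.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The six data mining phases, numbered as on the Overview slide."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


# Hackathon-specific note for each phase, copied from the Overview slide.
HACKATHON_NOTES = {
    Phase.BUSINESS_UNDERSTANDING: "of the Hackathon",
    Phase.DATA_UNDERSTANDING: "by mining the Sessions",
    Phase.DATA_PREPARATION: "by screen scraping and downloading",
    Phase.MODELING: "enough data for statistical significance?",
    Phase.EVALUATION: "How collected? Where stored? What results? Believe them?",
    Phase.DEPLOYMENT: "Story and Demo",
}


def checklist(completed):
    """Return one text line per phase, in order, marking completed phases."""
    lines = []
    for phase in Phase:
        mark = "x" if phase in completed else " "
        name = phase.name.replace("_", " ").title()
        lines.append(f"[{mark}] {phase.value} {name} ({HACKATHON_NOTES[phase]})")
    return lines


def next_phase(completed):
    """Return the first phase not yet completed, or None if all are done."""
    for phase in Phase:
        if phase not in completed:
            return phase
    return None


if __name__ == "__main__":
    done = {Phase.BUSINESS_UNDERSTANDING, Phase.DATA_UNDERSTANDING}
    print("\n".join(checklist(done)))
    print("Next:", next_phase(done).name)
```

Keeping the notes in a dictionary keyed by phase makes it easy to swap in the questions for a different project without touching the ordering logic.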
4
Data Science for Business: Data Mining Process
Source: Data Science for Business: Chapter 2, 2014
5
Data Science for NSF Polar Cyberinfrastructure: Knowledge Base