Data Science for Tackling the Challenges of Big Data Dr. Brand Niemann Director and Senior Data Scientist/Data Journalist Semantic Community http://semanticommunity.info/ http://www.meetup.com/Federal-Big-Data-Working-Group/ http://semanticommunity.info/Data_Science/Federal_Big_Data_Working_Group_Meetup November 14, 2014
Data Science for Tackling the Challenges of Big Data
Dr. Brand Niemann, Director and Senior Data Scientist/Data Journalist
– Started November 4th and completed November 12th.
• Mined this MIT Online Course for Data Sets and Ideas:
– Found a subset of the slides that contained data sets and ideas and were interesting and useful visualizations in themselves.
• Professor Karger's Lecture Slides on Visualization User Interfaces Were All About My Heroes:
– Tukey, Tufte, Shneiderman, and Spotfire. (In fact it was everything leading up to Spotfire, but not Spotfire itself!)
• Preserve My Work & Present Tutorial to the Federal Big Data Working Group Meetup:
– MindTouch Knowledge Base, Excel Spreadsheet Index, and Spotfire Interactive Visualizations.
MITProfessionalX 6.BDx Tackling the Challenges of Big Data: Course Assessment
Courseware: Big Data Storage
• I was especially interested in the following since both Professors Stonebraker and Madden presented to our Federal Big Data Working Group Meetup:
– This module begins with an overview of a number of these technologies by renowned database professor Mike Stonebraker. In his unique and ardent fashion, Mike expresses his skepticism about many new technologies, particularly Hadoop/MapReduce and NoSQL, and voices support for many new relational technologies, including column stores and main memory databases.
– After that, Professors Matei Zaharia and Samuel Madden provide a more nuanced view of the tradeoffs between the various approaches, discussing Hadoop and its derivatives, as well as NoSQL and its tradeoffs, in more detail.
– Professor Stonebraker expresses a number of strong opinions in this module. Which of them do you agree with? Which do you disagree with? Why?
3.0 Introduction to Big Data Storage and Discussion