Building Data Science Teams
David Dietrich, Advisory Technical Education Consultant, EMC Education Services
@imdaviddietrich
Boston Data Scientist Meetup, Leading Analytics Series
September 3, 2013
Interpreting the Resume of a Senior Data Scientist
John Smith [email protected]

Skills
R, SAS, Java, data mining, statistics, ontology, bioinformatics, human-computer interaction, research

Experience
2009–Present, Senior Data Scientist, ABC Analytics
2007–2009, Founder & CEO, Genome
Genome specializes in consumer health information. The main product is InherithHealth, a tool for acquisition of family medical histories that provides familial disease risk assessment.
2005–2007, Knowledge Engineer, ScienceExperts.com
Managed technical outsourcing efforts. Developed criterion and evaluated engineering outsourcing agencies and individuals …
2004–2006, Research Scientist, University of Washington
Developed rigorous statistical and computational models for addressing primary shortcomings of observational data analysis in the context of disease risk and drug response.
2000–2004, Research Developer, Nat'l Inst. of Standards and Technology
Designed and implemented prototypes. Evaluated tools for representing rules of autonomous on-road navigation.

Education
Ph.D, Biomedical Informatics, University of Washington, 2011
Dissertation: Detection of Protein–protein Interaction in Living Cells by Flow Cytometry
BS, Computer Science, University of Texas at Austin, 2004
Responsibilities:
• Work with business owners to map business requirements into technical solutions
• Analyze and extract relevant information from large amounts of data to help identify key revenue-driven features
• Perform ad-hoc statistical and data mining analyses
• Design and implement scalable and repeatable solutions, and establish scalable, efficient, automated processes for large scale data analyses
• Work closely with the software engineering team to drive new feature creation
• Design multi-factor experiments and validate hypotheses

Qualifications:
• A proven passion for generating insights from data, with a strong familiarity with the higher-level trends in data growth, open-source platforms, and public data sets
• Experience with statistical languages and packages, including R, S-Plus, SAS and Matlab, and/or Mahout
• Experience working with relational databases and/or distributed computing platforms, and their query interfaces, such as SQL, MapReduce, Hadoop, Cassandra, PIG, and Hive
• Strong communication skills, with ability to communicate at all levels of the organization
• Masters/PhD degree in mathematics, statistics, computer science or a similar quantitative field
• Experience in designing and implementing scalable data mining solutions
• Preferably experience with additional programming languages, including Python, Java, and C/C++
• Ability to travel as-needed to meet with customers
Data Scientist Job Description
Sample Data Scientist Resume
“… If Your Organization Can Arrange It … Have Someone In A Key Operational Role -- Business Unit Head, Chief Operations Officer, Even CEO -- To Be An Enthusiastic Advocate Of Matters Quantitative.”
Now You Know How To Develop Data Science Teams…What Next?
• Determine How You Would Like To Develop Data Science Capabilities
• Hire People To Fill Out Your Data Science Team
• Consider Which Organizational Model Will Work Best For Your Situation
• Assess How Much Executive Engagement You Have Or Need
• Map Out Potential Projects -- Balance Quick Wins With Longer-term Wins
1. EMC Education Services curriculum on Data Science and Big Data Analytics for Business Transformation: http://education.emc.com/guest/campaign/data_science.aspx
2. My Blog on Data Science & Big Data Analytics: http://infocus.emc.com/author/david_dietrich/
3. Blog on applying Data Analytics Lifecycle to measuring innovation data: http://stevetodd.typepad.com/my_weblog/data-science-and-big-data-curriculum/