Data Modelers Save Their Careers: Surviving and Thriving with NoSQL
Joe Maguire, Data Quality Strategies, LLC
http://www.DataQualityStrategies.com/
© 2013 Data Quality Strategies, LLC

Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Jan 26, 2015

Data modeling emerged in the 1970s in response to the needs of database designers. This accident of history has influenced perceptions and practices of data modeling in harmful ways. Most notably, business-focused requirements analysis has been wrongly commingled with relational modeling. Compounding the problem, vendors have produced data-modeling tools that blur the important distinction between the client’s problem and the technologist’s solution.

Enter NoSQL, with its promise of liberating practitioners from the tiresome burden of designing relational databases. The chance to dispense with relational modeling was embraced enthusiastically, but for many organizations, it has meant discarding the only rigorous activity that had any hope of formally expressing the client’s data needs. This is a textbook case of throwing out the baby with the bathwater. This presentation shows you how to save the baby, and your career as a data modeler.

Understanding the client’s data problem remains essential, regardless of the technology used to build the solution. For that matter, understanding the client’s data problem is the first step toward making an informed choice of technology for the solution.
Using concrete, real-world examples, the presenter will show the following:

- How abandoning modeling altogether is a recipe for disaster, even in—or especially in—NoSQL environments
- How experienced relational modelers can leverage their skills for NoSQL projects
- How the NoSQL context both simplifies and complicates the modeling endeavor
- How lessons learned modeling for NoSQL projects can make you a more effective modeler for any kind of project
Transcript
Page 1: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Data Modelers Save Their Careers: Surviving and Thriving with NoSQL

Joe Maguire
Data Quality Strategies, LLC

http://www.DataQualityStrategies.com/

© 2013 Data Quality Strategies, LLC

Page 2: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Thesis

• Relational DBMS’s have dominated,
• ...so relational modeling subsumed other forms, including conceptual modeling.
• As R-DBMS wanes, so does relational modeling – and sadly, whatever it subsumed.
• Conceptual modeling must be saved.
• Relational modelers can step in to save it...
• ...with some significant effort.

25 June 2013 © 2013 Data Quality Strategies, LLC

Page 3: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

My Perspective

• Over three decades in industry
• Career is a three-legged stool
  – Product development for software vendors
  – Solution design for enterprises
  – Author, Industry Analyst, Thought Leader
• Specialize in
  – Modeling
  – Requirements analysis
  – Data architecture
  – Data quality

[email protected]

Page 4: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Agenda

• History
• Current Events
• Your Future as a Data Modeler
• Q&A

Page 5: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

A Big-Picture Framework

Meta-model Data Perspective

Conceptual
• Entities
• Attributes
• Relationships
• Identifiers

Logical
• Tables
• Columns
• Primary and foreign keys

Physical
• Indexes
• Table spaces
• Vertical and horizontal partitioning
• Denormalizations

Page 6: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Good Ideas in the Framework

• Information Hiding
  – e.g., conceptual excludes implementation details
• The Type/Instance distinction
  – Models describe categories, data describes members
• Application/Data Independence
  – Data modeling is separate from process modeling
• User Requirements ≠ System Requirements
  – Users should not participate in logical and physical modeling
• Model-Driven Development
  – Forward and reverse engineering across model levels

Page 7: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

A Big-Picture Framework, distorted

Meta-model Data Perspective

Relational
• Entities / Tables
• Attributes / Columns
• Relationships / FKs
• Identifiers / PKs

Physical
• Indexes
• Table spaces
• Vertical and horizontal partitioning
• Denormalizations

Page 8: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

How the Distortion Happens

• Tool Vendors Dismiss Conceptual Modeling
  – Because their tools cannot support it anyway
• Info Mgmt Specialists Confuse Models with Reality
  – E.g., believing the relational model suffices to describe the universe
• Institutionalized Expediency
  – We know about conceptual modeling, but to save time, we combine it with relational modeling...
  – ...then we formalize that into our dev processes...
  – ...and eventually, that becomes the “best practices.”

Page 9: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Distortions, Revisited

• Summary of Distortions:
  – Distortion: Conceptual means vague
  – Distortion: Logical implies relational
    • Rather than implying XML, OO, KV Store, Array Database, Graph Database
• Results of Distortions:
  – Two levels only: relational and physical
  – Relational modeling used for user requirements

Page 10: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Agenda

• History
• Current Events
• Your Future as a Data Modeler
• Q&A

Page 11: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Current Events: NoSQL

• The “Just Say No” Interpretation

Meta-model Data Perspective

Logical / Relational
• Entities / Tables
• Attributes / Columns
• Relationships / FKs
• Identifiers / PKs

Physical
NO LONGER RELATIONAL:
• Schemas Based on Big Table Implementations
• Alien DDL language
• Limited Support from Modeling Tools

Page 12: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Current Events: NoSQL

• The “Not Only SQL” Interpretation
  – Okay, so there might be some work for you
  – But you’re at risk of being marginalized

Page 13: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Agenda

• History
• Current Events
• Your Future as a Data Modeler
• Summary
• Q&A

Page 14: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Your Future as a Modeler

• Remaining Relevant
  – Selfishly: Saving your career
  – Nobly: Serving your client / company / customer
• What You Can Do:
  – Wait for relational projects
  – Become a NoSQL database designer
  – Help your client choose data platforms
    • That starts with understanding the problems
      – which starts with CONCEPTUAL MODELING.

Page 15: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

A New (?) Modeling Framework

• Conceptual Modeling
• Choosing a Logical Meta-model
• Logical Modeling
• Physical Modeling

• Tool Support?

Page 16: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Conceptual Modeling

• Behaviors and constructs will compare to relational modeling:
  – Keep some
  – Discard some
  – Stress some
  – Change some

Page 17: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Conceptual Data Model Example

Page 18: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Keep Some

• Keep Entities
• Keep Attributes
• Keep Relationships
• Keep Identifiers
• Keep Maximum Cardinality of Relationships

Page 19: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Keep Entities

• Minimum Expressiveness
• Entities, Not Tables
  – Don’t express horizontal or vertical partitioning for performance
    • But yes if motivated by privacy/security/risk
• Entity names, not table names
  – Honor user vocabulary, not IT naming standards

Page 20: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Keep Attributes

• Honor The User Phenomenon
  – Attributes are part of user discourse
• Attributes, Not Columns
  – Worry about scale (nominal, numeric, ordinal, Boolean, cyclic), not data type
  – Attribute names, not column names
• Support In-Progress Models
  – During which attributes can become entities
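
To make the scale-versus-data-type point concrete, here is a minimal Python sketch. The attribute examples (a month number and a shirt size) and the names `cyclic_distance` and `SIZE_ORDER` are illustrative assumptions, not taken from the deck. Both attributes would fit comfortably in an integer or string column, but it is the scale, not the type, that decides which comparisons mean anything to users.

```python
# Illustrative only: the attribute examples and helper names are assumptions.

def cyclic_distance(a, b, period):
    """Distance between two values on a cyclic scale with the given period."""
    d = abs(a - b) % period
    return min(d, period - d)

# Month is cyclic: December (12) and January (1) are neighbors,
# even though plain integer subtraction says they are 11 apart.
december, january = 12, 1
naive_gap = abs(december - january)
cyclic_gap = cyclic_distance(december, january, period=12)

# Size is ordinal: the order comes from what users mean, not from the alphabet.
SIZE_ORDER = ["S", "M", "L"]
sizes = ["L", "S", "M"]
alphabetical = sorted(sizes)
by_scale = sorted(sizes, key=SIZE_ORDER.index)

print(naive_gap, cyclic_gap)     # 11 1
print(alphabetical, by_scale)    # ['L', 'M', 'S'] ['S', 'M', 'L']
```

Choosing "integer" or "string" for either attribute says nothing about these behaviors, which is why the scale is worth capturing during conceptual modeling.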

Page 21: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Keep Relationships

• Minimum Expressiveness
  – Relationships are part of user discourse
• Allow Many-Many and Collection Entities
  – If the latter seem illegal, you’ve been in IT too long
• Relationships, not FKs

Page 22: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Keep Relationships

• Relationships, not Foreign Keys
  – (achievement DOES NOT have code or creatureID)
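
The "relationships, not foreign keys" idea can be sketched in Python as follows. This is a hedged illustration, not the presenter's notation: the slide's diagram is not in the transcript, so the `skill` entity (assumed to be what `code` identifies), the `date` attribute, and every class and function name here are assumptions. The point is that the conceptual model records the relationships themselves, and foreign-key columns appear only when a relational logical model is derived.

```python
from dataclasses import dataclass

# Hypothetical conceptual meta-model: names and example entities are assumptions.

@dataclass(frozen=True)
class Entity:
    name: str
    attributes: tuple   # descriptors the users actually talk about
    identifier: tuple   # attributes that identify one instance

@dataclass(frozen=True)
class Relationship:
    many_side: str      # entity on the "many" end
    one_side: str       # entity on the "one" end

entities = {
    "creature": Entity("creature", ("creatureID", "name"), ("creatureID",)),
    "skill": Entity("skill", ("code", "description"), ("code",)),
    # achievement carries no creatureID or code of its own
    "achievement": Entity("achievement", ("date",), ()),
}

relationships = [
    Relationship("achievement", "creature"),
    Relationship("achievement", "skill"),
]

def relational_columns(entity_name):
    """Derive relational columns: own attributes plus FKs implied by relationships."""
    columns = list(entities[entity_name].attributes)
    for rel in relationships:
        if rel.many_side == entity_name:
            columns.extend(entities[rel.one_side].identifier)
    return columns

print(entities["achievement"].attributes)   # ('date',)
print(relational_columns("achievement"))    # ['date', 'creatureID', 'code']
```

Because the foreign keys are derived rather than drawn, the same conceptual model could feed a non-relational logical model without first stripping out relational artifacts.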

Page 23: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Keep Relationships

• Many-Many Allowed

Page 24: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Keep Identifiers

• Identifiers, Not PKs
  – IDs are not motivated by computerization, but by typography
  – IDs predate the information revolution
    • and the automotive revolution, for that matter
  – Allow collection entities
• Support In-Progress Modeling
  – IDs help the modeler ferret out the homonym problem

Page 25: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Keep Identifiers

• Identifiers, not PKs. (E.g., Collection Entities):
  – (each squad is identified by the skaters on it.)
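
A collection entity of this kind can be sketched in Python with a `frozenset` as the identifier. This is only an illustration under stated assumptions: the `Squad` class, its `of` helper, and the skater names are hypothetical. What it shows is that identity comes from membership, with no surrogate key involved.

```python
from dataclasses import dataclass

# Hypothetical collection entity: class, helper, and skater names are assumptions.

@dataclass(frozen=True)
class Squad:
    skaters: frozenset   # the identifier is the set of skaters itself

    @classmethod
    def of(cls, *names):
        return cls(frozenset(names))

a = Squad.of("Ana", "Bo", "Cy")
b = Squad.of("Cy", "Ana", "Bo")   # same skaters, listed in another order
c = Squad.of("Ana", "Bo")         # different membership

print(a == b)            # True: same skaters, same squad
print(a == c)            # False
print(len({a, b, c}))    # 2 distinct squads
```

A relational design would typically add a squad ID and an associative table; the conceptual model does not need either to say what makes one squad different from another.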

Page 26: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Discard Some

• Discard Foreign Keys
  – They’re relational
• Discard Minimum Cardinality
  – A function of process or policy, not data
  – Over-reported by users
• Discard Most Constraints
  – A function of process or policy, not data
  – Over-reported by users

Page 27: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Discard Minimum Cardinality

• Must EVERY instance of meeting have a person?
  – No. E.g., Cassandra Summit 2014 already has a date and location but has zero persons associated with it.
• More generally: Should the DBMS refuse to store incomplete data?
  – People get interrupted and want to save their partial work.
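
The meeting example can be sketched in Python as follows. The `Meeting` class, the `ready_to_publish` rule, and the placeholder values are assumptions made for illustration. The sketch keeps the maximum cardinality (a meeting may have many persons) but imposes no minimum, and moves the "must have a person" idea into a separate policy check.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical model: class, function, and placeholder values are assumptions.

@dataclass
class Meeting:
    name: str
    date: Optional[str] = None
    location: Optional[str] = None
    persons: list = field(default_factory=list)   # many allowed, none required

def ready_to_publish(meeting):
    """A process/policy rule, checked when it matters, not on every save."""
    return bool(meeting.date and meeting.location and meeting.persons)

# The meeting can be stored with a date and location and zero persons.
summit = Meeting("Cassandra Summit 2014",
                 date="<announced date>", location="<announced location>")
print(len(summit.persons))          # 0
print(ready_to_publish(summit))     # False

summit.persons.append("<first registrant>")
print(ready_to_publish(summit))     # True
```

Keeping the policy check out of the data structure is what lets a user save partial work and come back to it later.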

Page 28: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Keep/Discard Rule of Thumb

• Keep
  – Anything that helps you and the users together discover and name the user categories
• Discard
  – Anything else

Page 29: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Conceptual Data Model Examples

Page 30: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Stress Some

• Stress Consistency Requirements
  – Relational modelers (of non-distributed databases) have not been asking about these.
• Stress Data Volume / Velocity Requirements
  – Can lead or force you to relax application-data independence

Page 31: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Change Some

• Change Your Process
  – From math-y normalization to English-y conversation with users
  – Very difficult to achieve rigor conversationally
• More help:
  – Mastering Data Modeling: A User-Driven Approach by Carlis & Maguire

Page 32: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

A New Modeling Framework

• Conceptual Modeling
• Choosing a Logical Meta-Model
• Logical Modeling
• Physical Modeling

• Tool Support?

Page 33: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Choosing a Logical Meta-Model

• Don’t Assume Relational (Duh...)
• Don’t Assume Big Table, KV-Store, Cassandra
• Lots of Choices
  – Relational
  – Key-Value Store
  – XML/Document Database
  – Graph database
  – Array database
  – ...
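
As a rough illustration of why the meta-model is a choice, the Python sketch below renders one conceptual many-many relationship (person attends meeting) in four logical shapes. The facts, the names, and the particular shapes are assumptions chosen for illustration; real products differ in detail, and this is not a recommendation of any one of them.

```python
import json

# Hypothetical conceptual facts: (person, meeting) pairs. Names are assumptions.
attendance = [("Ana", "kickoff"), ("Bo", "kickoff"), ("Ana", "retro")]

def group_by_meeting(facts):
    grouped = {}
    for person, meeting in facts:
        grouped.setdefault(meeting, []).append(person)
    return grouped

def as_relational(facts):
    """Rows of an associative table, one row per (person, meeting) pair."""
    return [{"person": p, "meeting": m} for p, m in facts]

def as_key_value(facts):
    """One key per meeting; the value is an opaque serialized list of persons."""
    return {f"meeting:{m}": json.dumps(ps)
            for m, ps in group_by_meeting(facts).items()}

def as_document(facts):
    """One nested document per meeting, with persons embedded."""
    return [{"meeting": m, "persons": ps}
            for m, ps in group_by_meeting(facts).items()]

def as_graph(facts):
    """Nodes plus labeled edges."""
    nodes = sorted({p for p, _ in facts} | {m for _, m in facts})
    edges = [(p, "ATTENDS", m) for p, m in facts]
    return nodes, edges

print(as_relational(attendance)[0])   # {'person': 'Ana', 'meeting': 'kickoff'}
print(as_key_value(attendance))
print(as_document(attendance)[0])     # {'meeting': 'kickoff', 'persons': ['Ana', 'Bo']}
print(as_graph(attendance)[1][0])     # ('Ana', 'ATTENDS', 'kickoff')
```

Note that the key-value and document shapes here are organized by meeting, which favors the question "who attended this meeting?" over "which meetings did this person attend?". That is a process requirement shaping the logical model, something the relational and graph shapes leave open.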

Page 34: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

A New Modeling Framework

• Conceptual Modeling
• Choosing a Logical Meta-Model
• Logical Modeling
• Physical Modeling

• Tool Support?

Page 35: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Logical, Physical, and Tool Support

• Minimal Support From Modeling Tools
  – Because few tools support conceptual modeling
  – Because vendors have not caught up to NoSQL yet
• Community Needs to Develop Shapes
  – And the attendant transformations from conceptual shapes to Big-Table shapes
• During Logical NoSQL Modeling, Process Requirements Will Infiltrate

Page 36: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Agenda

• History
• Current Events
• Your Future as a Data Modeler
• Summary
• Q&A

Page 37: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Summary

• Recommit to Conceptual Modeling for Requirements Analysis
  – Some but not all relational-modeling skills will apply
  – Must learn to focus on user communication, not nerdy stuff like intermediate normal forms

Page 38: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Summary

• Remember the fundamentals, so that you can make informed decisions about relaxing them
  – Application-data independence (relax knowingly)
  – Distinguish problems from solutions (relax at your own peril)
  – Consistency level as a user requirement (as you ask, you’ll find immediate consistency is often negotiable)
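
One way to see what is being negotiated when consistency becomes a user requirement is the quorum rule used by replicated stores with tunable consistency: with N replicas, a write acknowledged by W of them and a read that consults R of them are guaranteed to overlap when R + W > N. The Python sketch below is a minimal illustration of that arithmetic only. The function names and the example requirements are assumptions, and no specific product's API is implied.

```python
# Minimal sketch of quorum arithmetic; names and examples are assumptions.

def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1

def reads_see_latest_write(n, w, r):
    """True when every read set must overlap every write set (R + W > N)."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    return r + w > n

N = 3

# Hypothetical answers gathered from users during requirements analysis,
# mapped to (W, R) settings for N replicas.
requirements = {
    "account balance": (quorum(N), quorum(N)),   # users need immediate consistency
    "page-view counter": (1, 1),                 # users accept a short delay
}

for attribute, (w, r) in requirements.items():
    print(attribute, reads_see_latest_write(N, w, r))
# account balance True
# page-view counter False
```

The modeler's contribution is the left-hand column: knowing, per user category, whether a stale read is acceptable. The settings on the right follow from that answer.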

Page 39: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Summary

• Additional Benefits
  – Users will like you better
  – Agile developers will like you better
  – This framework works in traditional, all-SQL environments

Page 40: Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

Q&A

[email protected] • www.DataQualityStrategies.com