NoSQL in MySQL 5.7

Posted on 16-Apr-2017

NoSQL in a SQL world

Airton Lastori airton.lastori@oracle.com April 2016

Copyright © 2015 Oracle and/or its affiliates. All rights reserved. |

Who?

DBA, Dev, Management

Who?

Has never used NoSQL

Uses NoSQL only in non-critical apps

Uses NoSQL in critical apps

Agenda

1. NoSQL?

2. Using relational as non-relational

3. NewSQL

NoSQL? A brief introduction

NoSQL = non-relational

Relational Model

• Edgar F. Codd, 1970, IBM

– Turing Award 1981

• Strong mathematical foundation: set theory, predicate logic, etc.

• Implemented as SQL

The relational model for database management: version 2

SQL = implementation of the relational model

Relational Database Management Systems

• Software for managing data based on the Relational Model

• Logical structure of the data: the user's view

– Data is organized in tables made of rows and columns, with relationship rules between them (constraints)

• Known as structured data

– SQL lets you create, maintain, and query the data in these structures

– Normalization and constraints avoid duplicated information, improving data consistency and quality

– ACID properties (transaction support)

• Physical structure of the data: the machine's view

– B*Tree indexes = very fast lookups, O(log n)
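To make the O(log n) claim concrete, here is a minimal Python sketch. It uses a plain binary search over a sorted list as a stand-in for an index lookup; a real B-tree has a very different layout (wide nodes, disk pages), so this only illustrates the logarithmic growth in probes. The key set and the function name are assumptions made for the example.

```python
def binary_search(sorted_keys, target):
    """Return (index or None, number of probes) for target in sorted_keys."""
    lo, hi = 0, len(sorted_keys) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] == target:
            return mid, probes
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, probes


# Hypothetical key set: 1,000,000 sorted even numbers.
keys = list(range(0, 2_000_000, 2))
index, probes = binary_search(keys, 1_337_000)
print(index, probes)  # finding one key among a million takes at most 20 probes
```

Doubling the number of keys adds only one more probe in the worst case, which is why sorted index structures stay fast as tables grow.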

Several data models

• The relational model does not solve every problem well

• 1960 - navigational

– Hierarchical

– Network

• 1970 - SQL/relational

• 1990 - Object-oriented: partly absorbed by RDBMSs

• 2000 - NoSQL

– Will it be absorbed by RDBMSs? Just hype?

The Web

Massive data, Big Data problems

Very high scale

Always online

Strategy: commodity hardware in the cloud + free software

Relational model: hard to scale and hard to make highly available on a cloud of commodity hardware

www.leavcom.com/pdf/NoSQL.pdf

Standalone

Clustered

Different problems

Problems in clustered environments

• When data is normalized and distributed, it is hard to maintain performance

• When data is distributed, it is hard to maintain consistency and implement transactions

• When data is distributed across commodity hardware, it needs duplication and synchronization for fault tolerance

• Etc.

How about giving up some parts of the relational model in exchange for others?

The Web = trigger for the rise of NoSQL technologies

http://db-engines.com/en/ranking_categories

183 NoSQL

12 categories

Common characteristics

• High performance

– A NoSQL database is usually very fast because it has a simplified architecture

• Avoids JOIN operations

– by storing duplicated, denormalized data

• Designed to scale horizontally

– Remember cloud computing? Exactly...

• Features are usually given up in favor of ease of use, including at scale
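The trade-off behind "avoids JOINs by storing duplicated data" can be sketched in a few lines of Python. The `users` and `orders` structures below are hypothetical and only meant as an illustration: the normalized read needs a second lookup (the join), while the denormalized copy answers in a single key access at the cost of duplicating the user name.

```python
# Normalized: the user name lives in one place and orders reference it by id.
users = {1: {"name": "Jane"}, 2: {"name": "Joe"}}
orders = [
    {"order_id": 100, "user_id": 1, "total": 30},
    {"order_id": 101, "user_id": 2, "total": 12},
]


def order_with_user_joined(order_id):
    """Read an order the normalized way: find the order, then look up its user."""
    order = next(o for o in orders if o["order_id"] == order_id)
    return {**order, "user_name": users[order["user_id"]]["name"]}


# Denormalized: the user name is copied into every order, keyed by order id.
orders_denormalized = {
    o["order_id"]: {**o, "user_name": users[o["user_id"]]["name"]} for o in orders
}

print(order_with_user_joined(100))
print(orders_denormalized[100])
```

Both reads return the same record. The denormalized form is simpler to shard because each record is self-contained, but renaming a user now means updating every copy.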

Schemaless

• key-value

• document

• wide column

• graph

• Etc.

• Changes the logical model, the user's view

– Other access APIs = Not Only SQL

– In many cases it makes the developer's life simpler (the DBA's, not so much...)

Affordable computing through public clouds, free software, and ease of use make the NoSQL movement very relevant

http://db-engines.com/en/ranking_trend (mar-2016)

Using relational as non-relational: web success stories

Major MySQL users

Web, cloud, distributed, and embedded…

Many were startups just a few years ago; they started and grew with MySQL

Uber uses MySQL as NoSQL

eng.uber.com/schemaless-part-one

• Our new solution needed to be able to linearly add capacity by adding more servers

• We needed write availability (replacing Redis as the data pipeline, to gain read consistency without giving up write performance)

• We needed secondary indexes (moving off Postgres while keeping the same functionality)

• We needed operation trust in the system, as it contains mission-critical trip data

• We needed a way of notifying downstream dependencies (multiple interdependent processes, such as billing and analytics, that must be isolated so they can scale without losing data)

“We had an unexpected loss of data on nearly every technology we used at one time or another, except MySQL.”

– Pinterest Engineering

NewSQL: the relational world embraces NoSQL

Support for the key-value model

• Memcached plug-in

MySQL 5.7 Sysbench Benchmark: SQL Point Selects 3x Faster than MySQL 5.6

[Chart: MySQL 5.7 Sysbench OLTP Read Only (SQL Point Selects). Y axis: queries per second, 0 to 1,800,000. X axis: connections, 8 to 1,024. Series: MySQL 5.7, MySQL 5.6, MySQL 5.5. Callout: 1,600,000 QPS.]

Test system: Intel(R) Xeon(R) CPU E7-8890 v3, 4 sockets x 18 cores-HT (144 CPU threads), 2.5 GHz, 512GB RAM, Linux kernel 3.16

Support for the document-oriented model in MySQL 5.7

1. Native JSON datatype

2. JSON Functions

3. Generated Columns

Native JSON type

CREATE TABLE employees (data JSON);

INSERT INTO employees VALUES ('{"id": 1, "name": "Jane"}');

INSERT INTO employees VALUES ('{"id": 2, "name": "Joe"}');

SELECT * FROM employees;

+---------------------------+
| data                      |
+---------------------------+
| {"id": 1, "name": "Jane"} |
| {"id": 2, "name": "Joe"}  |
+---------------------------+

2 rows in set (0.00 sec)

Advantages over TEXT/VARCHAR types

1. Document validation:

INSERT INTO employees VALUES ('some random text');

ERROR 3130 (22032): Invalid JSON text: "Expect a value here." at position 0 in value (or column) 'some random text'.

2. Efficient physical storage: the optimized binary format allows quicker access to object members and array elements
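The validation behaviour can be mimicked outside the database. The sketch below is a rough Python analogy, not MySQL's implementation: a hypothetical `insert_document` helper accepts a string only if it parses as JSON, in the same spirit as a JSON column rejecting 'some random text' where a TEXT column would store it silently.

```python
import json


def insert_document(table, text):
    """Append text to table only if it is valid JSON, like a JSON-typed column."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(
            f"Invalid JSON text at position {exc.pos}: {text!r}"
        ) from exc
    table.append(text)


employees = []  # stands in for the employees table
insert_document(employees, '{"id": 1, "name": "Jane"}')

rejected = None
try:
    insert_document(employees, "some random text")
except ValueError as err:
    rejected = str(err)

print(len(employees), rejected)
```

Only the valid document is stored; the second call fails at position 0, where the parser expected a JSON value.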

JSON Functions

SET @document = '[10, 20, [30, 40]]';

SELECT JSON_EXTRACT(@document, '$[1]');

+---------------------------------+

| JSON_EXTRACT(@document, '$[1]') |

+---------------------------------+

| 20 |

+---------------------------------+

1 row in set (0.01 sec)
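To show how a path such as '$[1]' is resolved, here is a small Python sketch of a path evaluator. It supports only a subset of the MySQL path syntax (`$`, `.key` and `[index]`) and is an illustration of the idea, not of how MySQL evaluates paths internally; the function name merely mirrors JSON_EXTRACT for readability.

```python
import json
import re

# One path step: either .key or [index].
_STEP = re.compile(r"\.(\w+)|\[(\d+)\]")


def json_extract(document, path):
    """Evaluate a simplified JSON path against a JSON document string."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    value = json.loads(document)
    pos = 1
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported path syntax at offset {pos}")
        key, index = match.group(1), match.group(2)
        value = value[key] if key is not None else value[int(index)]
        pos = match.end()
    return value


document = "[10, 20, [30, 40]]"
second = json_extract(document, "$[1]")
nested = json_extract(document, "$[2][0]")
print(second, nested)
```

Unlike MySQL, which returns NULL for a path that does not exist, this sketch raises a `KeyError` or `IndexError` in that case.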

Tests with real data

• Via SF OpenData

• 206K JSON objects representing subdivision parcels.

• Imported from https://github.com/zemirco/sf-city-lots-json + small tweaks

CREATE TABLE features (

id INT NOT NULL auto_increment primary key,

feature JSON NOT NULL

);

{

"type":"Feature",

"geometry":{

"type":"Polygon",

"coordinates":[

[

[-122.42200352825247,37.80848009696725,0],

[-122.42207601332528,37.808835019815085,0],

[-122.42110217434865,37.808803534992904,0],

[-122.42106256906727,37.80860105681814,0],

[-122.42200352825247,37.80848009696725,0]

]

]

},

"properties":{

"TO_ST":"0",

"BLKLOT":"0001001",

"STREET":"UNKNOWN",

"FROM_ST":"0",

"LOT_NUM":"001",

"ST_TYPE":null,

"ODD_EVEN":"E",

"BLOCK_NUM":"0001",

"MAPBLKLOT":"0001001"

}

}

Naive Performance Comparison

# as JSON type

SELECT DISTINCT

feature->"$.type" as json_extract

FROM features;

+--------------+

| json_extract |

+--------------+

| "Feature" |

+--------------+

1 row in set (1.25 sec)

Unindexed traversal of 206K documents

# as TEXT type

SELECT DISTINCT

feature->"$.type" as json_extract

FROM features;

+--------------+

| json_extract |

+--------------+

| "Feature" |

+--------------+

1 row in set (12.85 sec)

Explanation: Binary format of JSON type is very efficient at searching. Storing as TEXT performs over 10x worse at traversal.

The queries use the -> shorthand for JSON_EXTRACT, available since MySQL 5.7.9.

Generated Columns

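+----+------------+---------------------+
| id | my_integer | my_integer_plus_one |
+----+------------+---------------------+
|  1 |         10 |                  11 |
|  2 |         20 |                  21 |
|  3 |         30 |                  31 |
|  4 |         40 |                  41 |
+----+------------+---------------------+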

CREATE TABLE t1 (

id INT NOT NULL PRIMARY KEY auto_increment,

my_integer INT,

my_integer_plus_one INT AS (my_integer+1)

);

UPDATE t1 SET my_integer_plus_one = 10 WHERE id = 1;

ERROR 3105 (HY000): The value specified for generated column

'my_integer_plus_one' in table 't1' is not allowed.

Column automatically maintained based on your specification.

Read-only of course
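A loose analogy for a generated column, sketched in Python: a read-only property derived from another attribute. The `Row` class is hypothetical and models only the two behaviours shown on the slide, automatic maintenance and rejection of direct writes.

```python
class Row:
    """One row of a table shaped like t1, with a derived read-only value."""

    def __init__(self, id, my_integer):
        self.id = id
        self.my_integer = my_integer

    @property
    def my_integer_plus_one(self):
        # Recomputed on every read, similar in spirit to a VIRTUAL column.
        return self.my_integer + 1


row = Row(1, 10)
before = row.my_integer_plus_one  # derived from my_integer = 10
row.my_integer = 20
after = row.my_integer_plus_one  # follows the base value automatically

try:
    row.my_integer_plus_one = 10  # direct writes are rejected
    write_allowed = True
except AttributeError:
    write_allowed = False

print(before, after, write_allowed)
```

The property has no setter, so assigning to it fails, much as the UPDATE above fails with error 3105.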

Generated Columns Support Indexes!

ALTER TABLE features ADD feature_type VARCHAR(30) AS (feature->"$.type");

Query OK, 0 rows affected (0.01 sec)

Records: 0 Duplicates: 0 Warnings: 0

ALTER TABLE features ADD INDEX (feature_type);

Query OK, 0 rows affected (0.73 sec)

Records: 0 Duplicates: 0 Warnings: 0

SELECT DISTINCT feature_type FROM features;

+--------------+

| feature_type |

+--------------+

| "Feature" |

+--------------+

1 row in set (0.06 sec)

From table scan on 206K documents to index scan on 206K materialized values

Down from 1.25 sec to 0.06 sec

ADD INDEX creates the index only; it does not modify table rows.

Adding the virtual column is a metadata-only change (fast); it does not need to touch the table.
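Why the indexed query is so much faster can be sketched in Python: extract the value once per document, keep it in a lookup structure, and answer queries from that structure instead of re-parsing every document. The three sample documents are hypothetical, and the sketch indexes the STREET property rather than `$.type` only because it has more than one distinct value.

```python
import json
from collections import defaultdict

# Hypothetical documents keyed by id, stored as JSON text.
documents = {
    1: '{"type": "Feature", "properties": {"STREET": "MARKET"}}',
    2: '{"type": "Feature", "properties": {"STREET": "HAIGHT"}}',
    3: '{"type": "Feature", "properties": {"STREET": "MARKET"}}',
}


def scan_for_street(street):
    """Unindexed: parse and inspect every document on every query."""
    return sorted(
        doc_id
        for doc_id, doc in documents.items()
        if json.loads(doc)["properties"]["STREET"] == street
    )


# "Index": materialize the extracted value once, mapping value -> ids.
street_index = defaultdict(list)
for doc_id, doc in documents.items():
    street_index[json.loads(doc)["properties"]["STREET"]].append(doc_id)


def lookup_street(street):
    """Indexed: answer from the materialized values without parsing documents."""
    return sorted(street_index.get(street, []))


print(scan_for_street("MARKET"), lookup_street("MARKET"))
```

Both functions return the same ids; the difference is that the indexed version does its parsing once, when the index is built, rather than on every query.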

Generated Columns (cont.)

• Used for “functional index”

• Available as either VIRTUAL (default) or STORED:

• Both types of generated columns permit indexes to be added.

ALTER TABLE features ADD feature_type varchar(30) AS (feature->"$.type") STORED;

Query OK, 206560 rows affected (4.70 sec)

Records: 206560 Duplicates: 0 Warnings: 0

Indexing Options Available

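+------------------------+-------------------+
| STORED                 | VIRTUAL           |
+------------------------+-------------------+
| Primary and Secondary  | Secondary Only    |
| BTREE, Fulltext, GIS   | BTREE Only        |
| Mixed with fields      | Mixed with fields |
| Requires table rebuild | No table rebuild  |
| Not Online             | INSTANT Alter     |
|                        | Faster Insert     |
+------------------------+-------------------+

Bottom line: unless you need a PRIMARY KEY, FULLTEXT or GIS index, VIRTUAL is probably better.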

Virtual vs. Stored Performance

• Approximate worst case scenario via a table scan:

SELECT DISTINCT feature_type FROM features;

+--------------+

| feature_type |

+--------------+

| "Feature" |

+--------------+

VIRTUAL-TEXT (9.89 sec)

STORED-TEXT (0.22 sec)

VIRTUAL-JSON (0.85 sec)

STORED-JSON (0.24 sec)

Clarification: since indexes are themselves materialized (stored), the real-life case for STORED is when generating the column is computationally expensive and you cannot use indexes effectively.

Unquote JSON String

SELECT

DISTINCT JSON_UNQUOTE(feature->"$.type")

as feature_type

FROM features;

+-----------------+

| feature_type |

+-----------------+

| Feature |

+-----------------+

1 row in set (1.22 sec)

JSON Path Search

• JSON_SEARCH gives a simple way to discover the path to a value, which can then be retrieved via: [[database.]table.]column->"$<path spec>"

SELECT JSON_SEARCH(feature,

'one', 'MARKET') AS

extract_path

FROM features

WHERE id = 121254;

+-----------------------+

| extract_path |

+-----------------------+

| "$.properties.STREET" |

+-----------------------+

1 row in set (0.00 sec)

SELECT

feature->"$.properties.STREET"

AS property_street

FROM features

WHERE id = 121254;

+-----------------+

| property_street |

+-----------------+

| "MARKET" |

+-----------------+

1 row in set (0.00 sec)

JSON Array Creation

SELECT JSON_ARRAY(id,

feature->"$.properties.STREET",

feature->"$.type") AS json_array

FROM features ORDER BY RAND() LIMIT 3;

+-------------------------------+

| json_array |

+-------------------------------+

| [65298, "10TH", "Feature"] |

| [122985, "08TH", "Feature"] |

| [172884, "CURTIS", "Feature"] |

+-------------------------------+

3 rows in set (2.66 sec)

JSON Object Creation

SELECT JSON_OBJECT('id', id,

'street', feature->"$.properties.STREET",

'type', feature->"$.type"

) AS json_object

FROM features ORDER BY RAND() LIMIT 3;

+--------------------------------------------------------+

| json_object |

+--------------------------------------------------------+

| {"id": 122976, "type": "Feature", "street": "RAUSCH"} |

| {"id": 148698, "type": "Feature", "street": "WALLACE"} |

| {"id": 45214, "type": "Feature", "street": "HAIGHT"} |

+--------------------------------------------------------+

3 rows in set (3.11 sec)

JSON_REPLACE

SELECT JSON_REPLACE(feature, '$.type', JSON_ARRAY('feature', 'bug')) as

json_object FROM features LIMIT 1;

+--------------------------------------------------------+

| json_object |

+--------------------------------------------------------+

| {"type": ["feature", "bug"], "geometry": {"type": ..}} |

+--------------------------------------------------------+

JSON Functions

• MySQL 5.7 supports functions to create, search, modify, and return JSON values:

JSON_ARRAY_APPEND()

JSON_ARRAY_INSERT()

JSON_ARRAY()

JSON_CONTAINS_PATH()

JSON_CONTAINS()

JSON_DEPTH()

JSON_EXTRACT()

JSON_INSERT()

JSON_KEYS()

JSON_LENGTH()

JSON_MERGE()

JSON_OBJECT()

JSON_QUOTE()

JSON_REMOVE()

JSON_REPLACE()

JSON_SEARCH()

JSON_SET()

JSON_TYPE()

JSON_UNQUOTE()

JSON_VALID()

https://dev.mysql.com/doc/refman/5.7/en/json-functions.html

JSON Comparator

SELECT CAST(1 AS JSON) = 1;

+---------------------+

| CAST(1 AS JSON) = 1 |

+---------------------+

| 1 |

+---------------------+

1 row in set (0.01 sec)

JSON value of 1 equals 1

JSON or column?

• You choose

• Both approaches have advantages

Storing as a Column

• Easier to apply a schema to your application

• Schema may make applications easier to maintain over time, as change is controlled;

• Do not have to expect as many permutations

• Allows some constraints over data

Storing as JSON

• More flexible way to represent data that is hard to model in schema;

• Imagine you are a SaaS application serving many customers

• Strong use-case to support custom-fields

• Historically this may have used the entity-attribute-value (EAV) model, which does not always perform well

JSON (cont.)

• Easier denormalization; an optimization that is important in some specific situations

• No painful schema changes*

• Easier prototyping

• Fewer types to consider

• No enforced schema, start storing values immediately

* MySQL 5.6 has Online DDL, so this is not as large an issue as it was historically.

Schema + Schemaless

SSDs have capacity_in_gb, CPUs have a core_count. These attributes are not consistent across products.

CREATE TABLE pc_components (

id INT NOT NULL PRIMARY KEY,

description VARCHAR(60) NOT NULL,

vendor VARCHAR(30) NOT NULL,

serial_number VARCHAR(30) NOT NULL,

attributes JSON NOT NULL

);
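A minimal Python sketch of how an application might use such a hybrid table: fixed columns are read directly, while product-specific attributes are pulled from the JSON document and may be absent. The two sample rows, the vendor and serial values, and the `attribute` helper are all hypothetical.

```python
import json

# Hypothetical rows shaped like pc_components: fixed columns plus a JSON document.
pc_components = [
    {
        "id": 1,
        "description": "Solid-state drive",
        "vendor": "ExampleVendor",
        "serial_number": "SN-0001",
        "attributes": json.dumps({"capacity_in_gb": 512}),
    },
    {
        "id": 2,
        "description": "Processor",
        "vendor": "ExampleVendor",
        "serial_number": "SN-0002",
        "attributes": json.dumps({"core_count": 8}),
    },
]


def attribute(row, name, default=None):
    """Read one product-specific attribute from the row's JSON document."""
    return json.loads(row["attributes"]).get(name, default)


capacities = [attribute(row, "capacity_in_gb") for row in pc_components]
core_counts = [attribute(row, "core_count") for row in pc_components]
print(capacities, core_counts)
```

Every row has the same fixed columns, so those keep the benefits of a schema, while an attribute that a product lacks simply comes back as the default.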

Summary

1. The NoSQL movement is highly relevant, and the Web giants are its protagonists

2. NoSQL complements relational databases

3. NewSQL = combining the two worlds

4. MySQL remains very relevant on the Web

5. The Memcached plugin and JSON are examples in MySQL of how relational databases can embrace NoSQL

Thank you!

@MySQLBR meetup.com/MySQL-BR facebook.com/MySQLBR

pt.planet.mysql.com

Questions?

NoSQL in a SQL world. Contact: airton.lastori@oracle.com twitter.com/mysqlbr facebook.com/mysqlbr
