From Relational to NoSQL
SQL & RDBMS
Relational database management system
Relational model
SQL & RDBMS
➢ Relational algebra
➢ Set theory
➢ 40+ years of optimizations
[Figure: Venn diagram of SQL joins, labeled "Those who do not understand set theory", "????", and "Those who understand set theory"]
ACID
➢ ACID transactions
○ Atomic: Everything in a transaction succeeds or the entire transaction is rolled back.
○ Consistent: A transaction cannot leave the database in an inconsistent state.
○ Isolated: Transactions cannot interfere with each other.
○ Durable: Completed transactions persist, even when servers restart, etc.
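The atomicity property above can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper, and the account names are hypothetical and exist only for this example. The credit is applied first so that the failing debit shows the rollback undoing work already done.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as a single transaction."""
    # `with conn` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        # If this debit violates the CHECK constraint, the credit above is undone.
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


transfer(conn, "alice", "bob", 30)       # succeeds: both updates are committed
try:
    transfer(conn, "alice", "bob", 500)  # debit fails: both updates are rolled back
except sqlite3.IntegrityError:
    pass
```

After the failed transfer, neither account has changed: the transaction either applies completely or not at all.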
Limits
➢ Hard to distribute
➢ Constraining schema
➢ Object-relational mapper (ORM)
■ Impedance mismatch
➢ Performance
➢ Costs
Slowness problems?
Design
➢ Slowness
○ Often a design problem
○ Every model is optimized for one use case
➢ SQL or NoSQL makes no difference to this problem
Index
➢ An index is a redundant data structure organized to speed up certain lookups.
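As a rough illustration of this definition, the sketch below uses Python's built-in `sqlite3` module to compare the query plan of the same lookup before and after an index is created. The `users` table, the index name `idx_users_email`, and the `query_plan` helper are assumptions made for the example. The exact wording of the plan varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def query_plan(conn, sql, params=()):
    """Return SQLite's description of how it would execute `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


lookup = "SELECT * FROM users WHERE email = ?"
args = ("user42@example.com",)

plan_before = query_plan(conn, lookup, args)   # full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup, args)    # search through the index

print(plan_before)
print(plan_after)
```

The index duplicates the `email` values in a separate sorted structure, which is why it speeds up this lookup while adding work to every write.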
Normalization / Denormalization
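The trade-off between the two can be sketched with plain Python dictionaries. This is an illustration under assumed data, not a schema from the talk: the `customers` and `orders` structures and the helper names are hypothetical.

```python
# Normalized: each fact is stored once, orders reference customers by id.
customers = {1: {"name": "Ada", "city": "Paris"}}
orders = [
    {"id": 100, "customer_id": 1, "total": 42},
    {"id": 101, "customer_id": 1, "total": 17},
]


def orders_with_customer(orders, customers):
    """Read-time join: look up the customer name for every order."""
    return [
        {**order, "customer_name": customers[order["customer_id"]]["name"]}
        for order in orders
    ]


# Denormalized: the customer name is copied into every order document,
# so reads need no join, but the name now exists in several places.
orders_denormalized = orders_with_customer(orders, customers)


def rename_customer_denormalized(docs, customer_id, new_name):
    """Update every copy of the name and return how many documents changed."""
    touched = 0
    for doc in docs:
        if doc["customer_id"] == customer_id:
            doc["customer_name"] = new_name
            touched += 1
    return touched


customers[1]["name"] = "Ada L."  # normalized: one write
touched = rename_customer_denormalized(orders_denormalized, 1, "Ada L.")  # one write per copy
```

Denormalization moves cost from reads to writes, which fits the earlier point that every model is optimized for one use case.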
Vertical scaling
n1-standard-1: 1 CPU, 3.75 GB RAM, $51/month
n1-highmem-32: 32 CPUs, 208 GB RAM, $1900/month
Replication for reads
[Diagram: one master and two slaves; writes go to the master, reads are served by the slaves]
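A minimal in-memory sketch of this read/write split follows. The `Node` and `ReplicatedStore` classes are hypothetical, and the copy to the slaves is done synchronously for simplicity, whereas real systems often replicate asynchronously.

```python
import itertools


class Node:
    """A single database server, reduced to a name and a key/value dict."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ReplicatedStore:
    """Sends writes to the master and spreads reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves
        self._next_slave = itertools.cycle(slaves)  # round-robin read balancing

    def write(self, key, value):
        self.master.data[key] = value
        # Simplification: copy to every slave immediately.
        for slave in self.slaves:
            slave.data[key] = value

    def read(self, key):
        slave = next(self._next_slave)
        return slave.name, slave.data.get(key)


master = Node("master")
store = ReplicatedStore(master, [Node("slave-1"), Node("slave-2")])
store.write("user:1", "Ada")
first_read = store.read("user:1")    # served by slave-1
second_read = store.read("user:1")   # served by slave-2
```

Adding slaves increases read capacity only: every write still goes through the single master.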
Sharding to increase write capacity
[Diagram: three shards A, B, and C, each handling its own reads and writes]
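One common way to route keys to shards is to hash the key and take the result modulo the number of shards. The sketch below assumes that strategy; the talk does not say which one is meant. `shard_for` and `ShardedStore` are hypothetical names.

```python
import hashlib

SHARDS = ["A", "B", "C"]


def shard_for(key, shards=SHARDS):
    """Pick a shard deterministically from a hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


class ShardedStore:
    """Each shard holds a disjoint subset of the keys and takes its own writes."""

    def __init__(self, shards=SHARDS):
        self.names = list(shards)
        self.shards = {name: {} for name in self.names}

    def write(self, key, value):
        self.shards[shard_for(key, self.names)][key] = value

    def read(self, key):
        return self.shards[shard_for(key, self.names)].get(key)


store = ShardedStore()
for i in range(300):
    store.write(f"user:{i}", i)
sizes = {name: len(data) for name, data in store.shards.items()}
```

With simple modulo hashing, changing the number of shards remaps most keys. Consistent hashing, as used by Dynamo, is one way to limit that.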
Concepts
➢ Horizontal/vertical scaling
➢ Denormalization
➢ Sharding/partitioning
➢ Replication
NoSQL
Who?
Google Bigtable, MapReduce (2006)
Amazon Dynamo (2007)
Facebook Cassandra (2008)
=> Change of scale
How?
Distribution on commodity hardware
It will fail
Abandoning ACID principles, consistency in particular
➢ BASE
○ Basically Available
○ Soft state
○ Eventual consistency
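Eventual consistency can be sketched as a store that acknowledges a write before its replicas have received it. This toy model is an assumption made for illustration: the `EventuallyConsistentStore` class and its explicit `sync` step stand in for background replication in a real system.

```python
from collections import deque


class EventuallyConsistentStore:
    """Writes are acknowledged immediately and reach the replicas later."""

    def __init__(self, replica_count=2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = deque()  # replication log not yet applied to replicas

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))  # acknowledged before replication

    def read_replica(self, index, key):
        # May return stale data until sync() has run.
        return self.replicas[index].get(key)

    def sync(self):
        """Apply all pending writes to every replica, in order."""
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica[key] = value


store = EventuallyConsistentStore()
store.write("cart:1", ["book"])
stale = store.read_replica(0, "cart:1")   # replica has not caught up yet
store.sync()
fresh = store.read_replica(0, "cart:1")   # replicas have converged
```

Between the write and the sync, readers may see old data. That window is the "stale data OK" part of the trade-off.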
How?
ACID
➢ Strong consistency
➢ Isolation
➢ Focus on “commit”
➢ Nested transactions
➢ Availability?
➢ Conservative (pessimistic)
➢ Difficult evolution (e.g. schema)
BASE
➢ Weak consistency
○ stale data OK
➢ Availability first
➢ Best effort
➢ Approximate answers OK
➢ Aggressive (optimistic)
➢ Simpler
➢ Faster
➢ Easier evolution
NoSQL ACID vs BASE
continuum
CAP Theorem
Consistency + Availability: RDBMS
Fallacies of distributed computing
1. The network is reliable.
2. Latency is zero.
3. Bandwidth is infinite.
4. The network is secure.
5. Topology doesn't change.
6. There is one administrator.
7. Transport cost is zero.
8. The network is homogeneous.
CAP Theorem
Consistency + Partition tolerance: Bigtable, MongoDB, Redis
CAP Theorem
Partition tolerance + Availability: Dynamo, Cassandra
CAP Theorem
Consistency + Partition tolerance + Availability: No
Pick two! Eric A. Brewer, 2000
Towards Robust Distributed Systems
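The "pick two" choice becomes concrete once the network splits. The toy model below is a deliberately simplified sketch, not a real replication protocol. In `"CP"` mode a two-node cluster refuses writes it cannot replicate. In `"AP"` mode it accepts them and lets the nodes diverge. `TwoNodeCluster` and `Unavailable` are hypothetical names.

```python
class Unavailable(Exception):
    """Raised when a CP cluster refuses a request during a partition."""


class TwoNodeCluster:
    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.nodes = [{}, {}]
        self.partitioned = False  # True when the nodes cannot talk to each other

    def write(self, node, key, value):
        if not self.partitioned:
            # Healthy network: every node receives the write.
            for data in self.nodes:
                data[key] = value
        elif self.mode == "CP":
            # Consistency first: refuse rather than let the nodes diverge.
            raise Unavailable("cannot replicate during a partition")
        else:
            # Availability first: accept locally, the nodes now disagree.
            self.nodes[node][key] = value

    def read(self, node, key):
        return self.nodes[node].get(key)


ap = TwoNodeCluster("AP")
ap.write(0, "k", "v1")
ap.partitioned = True
ap.write(0, "k", "v2")   # accepted by node 0 only
divergent = (ap.read(0, "k"), ap.read(1, "k"))
```

In this sketch a partitioned AP cluster stays writable but returns different answers depending on which node is asked. A CP cluster would raise `Unavailable` for the same write.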
Use cases
➢ Performance & cost reduction
○ Distribution
○ Elasticity (cloud)
➢ Development agility
○ NoORM
○ NoSchema
➢ Alternative data models
○ Graph
NoSQL database types
One More Thing...
The most widely used NoSQL database is...
Lucene!
Solr / Elasticsearch
Resources
➢ Presentations
○ Dynamo and BigTable in light of the CAP theorem
○ Towards Robust Distributed Systems
➢ Papers
○ Dynamo: Amazon’s Highly Available Key-value Store
○ Bigtable: A Distributed Storage System for Structured Data
➢ Videos
○ Introduction to NoSQL by Martin Fowler
➢ Books
○ NoSQL Distilled
○ Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement