Cubes 1.0 Overview
light data warehouse and conceptual modelling
Štefan Urbánek, @Stiivi, November 2014
understanding through metadata
datamodel
reporting apps / modules
metadata
❄
logical
physical
Categorical Data
∑ =
OLAP (online analytical processing)
lightweight framework for
conceptual modelling and analytics
Original Cubes
before 1.0
process or server
store
|
Workspace
1 × store, 1 × model
We needed more!
Models
Stores
file
∑
Postgres Mongo API
API
database
multiple model parts, different sources
multiple data sources, heterogeneous
Cubes 1.0
Python ≥ 3.4
works with ≥ 2.7 too, for the “two” series
■ analytical workspace
■ model providers
■ new and improved backends
■ better extensibility
■ authorisation
Analytical Workspace
Cubes
Model Providers
Stores
sales churn events activations
Static Model Provider
API Model Provider
BI Data(Postgres)
BI Data 2(Mongo)
Events(API)
Workspace
Cubes
Model Providers
Stores
sales churn events activations
Static Model Provider
BI Data(Postgres)
BI Data 2(Mongo)
crm sales events
[workspace]
models_path: /var/lib/cubes/models

[models]
crm: crm.cubesmodel
sales: sales.cubesmodel
events: events.cubesmodel

[store crm]
type: sql
url: postgresql://localhost/crm

[store events]
type: mongo
host: localhost
collection: events
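The configuration above can be read with nothing more than the standard library. The following is a minimal sketch, not how Cubes itself loads its workspace: the helper name `read_workspace_config` is hypothetical, and the only assumption is that every section named `store <name>` describes one store.

```python
import configparser

SLICER_INI = """
[workspace]
models_path: /var/lib/cubes/models

[models]
crm: crm.cubesmodel
sales: sales.cubesmodel
events: events.cubesmodel

[store crm]
type: sql
url: postgresql://localhost/crm

[store events]
type: mongo
host: localhost
collection: events
"""


def read_workspace_config(text):
    """Return (models, stores) dictionaries from an ini-style config (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)

    # Model name -> model file, as listed in the [models] section.
    models = dict(parser["models"])

    # Every "[store <name>]" section becomes one entry keyed by <name>.
    stores = {}
    for section in parser.sections():
        if section.startswith("store "):
            name = section.split(None, 1)[1]
            stores[name] = dict(parser[section])
    return models, stores


models, stores = read_workspace_config(SLICER_INI)
print(sorted(models))          # model names
print(stores["crm"]["url"])    # connection URL of the "crm" store
```

Each store carries its own `type`, which is what lets one workspace mix SQL, Mongo and API backends.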
BYOB
bring your own backend
Slicer
Backend
Browser
Store
Provider
Logical Physical
physical data store (database or API)
Browser
Store
Provider
∑ aggregate
connect
create model
model
cubes
dimensions
model
backend objects
Model Provider
model
cubes
dimensions
Model Provider
■ metadata on-the-fly
■ local or external source
■ might be linked to a store
model
cubes
dimensions
Model
required
automatic
automatic
automatic
required
Slicer cube dimension
key/attribute
property
column (table)
Dimensions
dimension
Cubes / Facts
metric
table
collection
event
Google Analytics
Mixpanel
MongoDB
SQL
Backend
Model Improvements
Model
■ measures → aggregates
■ more front-end metadata: cube categories, dimension role and cardinality
■ customised dimension linking
"measures": [
    { "name": "amount", "label": "Sales Amount" },
    { "name": "vat", "label": "VAT" }
]
"aggregates": [
    { "name": "total_sales", "label": "Total Sales Amount", "measure": "amount", "function": "sum" },
    { "name": "total_vat", "label": "Total VAT", "measure": "vat", "function": "sum" },
    { "name": "item_count", "label": "Item Count", "function": "count" }
]
Aggregates
■ custom name
■ can refer to other aggregates: post-aggregation calculations
■ functions are backend-specific
SQL aggregations: sum, count, count_nonempty, count_distinct, min, max, avg, stddev, variance, …
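To make the measure/aggregate split concrete, here is a small in-memory sketch that evaluates aggregate definitions of the shape shown in the model over a list of facts. It is an illustration only: the function `compute_aggregates`, the `FUNCTIONS` table and the sample facts are hypothetical, and a real backend would translate the same definitions into its own query language instead of looping in Python.

```python
import statistics

# Hypothetical mapping from aggregate function names to Python callables.
FUNCTIONS = {
    "sum": sum,
    "min": min,
    "max": max,
    "avg": statistics.mean,
}

AGGREGATES = [
    {"name": "total_sales", "measure": "amount", "function": "sum"},
    {"name": "total_vat", "measure": "vat", "function": "sum"},
    {"name": "item_count", "function": "count"},
]


def compute_aggregates(aggregates, facts):
    """Evaluate each aggregate over the facts and return {aggregate name: value}."""
    result = {}
    for aggregate in aggregates:
        function = aggregate["function"]
        if function == "count":
            # "count" needs no measure, it counts facts.
            result[aggregate["name"]] = len(facts)
        else:
            values = [fact[aggregate["measure"]] for fact in facts]
            result[aggregate["name"]] = FUNCTIONS[function](values)
    return result


# Made-up sample facts, for illustration only.
facts = [
    {"amount": 100, "vat": 20},
    {"amount": 250, "vat": 50},
    {"amount": 50, "vat": 10},
]
summary = compute_aggregates(AGGREGATES, facts)
print(summary)  # {'total_sales': 400, 'total_vat': 80, 'item_count': 3}
```

The point of the split is visible here: a measure is only a fact property, while an aggregate has its own name and label and may need no measure at all.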
Contextual Dimensions
{
    "measures": [ … ],
    "dimensions": [
        {"name": "date", "hierarchies": ["ym", "yqm"]},
        {"name": "date", "alias": "contract_date"}
    ],
    …
}
customisable linking properties:
alias, hierarchies, exclude_hierarchies, default_hierarchy_name, cardinality, nonadditive
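One way to picture dimension linking is as a shared dimension template that each cube copies and adjusts. The sketch below is a simplified model of that idea and not the Cubes implementation: `link_dimension`, the shared `date` template and its hierarchy list (including `ymd`) are assumptions made for the example, and only three of the listed properties are handled.

```python
import copy

# Hypothetical shared dimension; the hierarchy names beyond "ym" and "yqm" are made up.
SHARED_DIMENSIONS = {
    "date": {"name": "date", "hierarchies": ["ym", "yqm", "ymd"]},
}


def link_dimension(link, shared):
    """Return a cube-specific copy of a shared dimension, adjusted by the link properties."""
    template = shared[link["name"]]
    dimension = copy.deepcopy(template)

    # "hierarchies" keeps only the named hierarchies, in the template's order.
    if "hierarchies" in link:
        dimension["hierarchies"] = [
            h for h in template["hierarchies"] if h in link["hierarchies"]
        ]
    # "exclude_hierarchies" removes the named hierarchies.
    if "exclude_hierarchies" in link:
        dimension["hierarchies"] = [
            h for h in dimension["hierarchies"] if h not in link["exclude_hierarchies"]
        ]
    # "alias" renames the dimension within the cube.
    if "alias" in link:
        dimension["name"] = link["alias"]
    return dimension


links = [
    {"name": "date", "hierarchies": ["ym", "yqm"]},
    {"name": "date", "alias": "contract_date"},
]
linked = [link_dimension(link, SHARED_DIMENSIONS) for link in links]
for dimension in linked:
    print(dimension["name"], dimension["hierarchies"])
```

The same `date` dimension can therefore appear twice in one cube under different names, each with its own subset of hierarchies, while the shared template stays untouched.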
Dimension Roles
dimension.role
level.role
hint for reporting applications or backends
time
year, month, day, …
Cardinality
dimension.cardinality
tiny < low < medium < high
level.cardinality
<<
overload precautions
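As an illustration of how a front-end or backend might use the cardinality hint as an overload precaution, the sketch below refuses an unpaginated drilldown through a dimension whose cardinality is above a chosen limit. The function names, the default limit and the exact policy are assumptions made for this example, not the rule Cubes applies.

```python
# Ordered from smallest to largest, as on the slide: tiny < low < medium < high.
CARDINALITIES = ["tiny", "low", "medium", "high"]


def exceeds(cardinality, limit):
    """True if `cardinality` is strictly above `limit` in the ordering."""
    return CARDINALITIES.index(cardinality) > CARDINALITIES.index(limit)


def check_drilldown(dimension, page_size=None, limit="medium"):
    """Raise ValueError for an unpaginated drilldown on a large dimension (hypothetical policy)."""
    if exceeds(dimension["cardinality"], limit) and page_size is None:
        raise ValueError(
            "dimension '{}' has {} cardinality, pagination is required".format(
                dimension["name"], dimension["cardinality"]
            )
        )


check_drilldown({"name": "region", "cardinality": "low"})                  # fine
check_drilldown({"name": "customer", "cardinality": "high"}, page_size=100)  # fine, paginated
try:
    check_drilldown({"name": "customer", "cardinality": "high"})
except ValueError as error:
    print(error)
```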
Browser
∑
Browser
■ uses logical model
■ implements aggregation
■ builds queries
■ retrieves data
Logical Physical
physical data store (database or API)
Browser
Store
∑ aggregate
model
Browser Methods
■ features()
■ aggregate(cell, drilldown,…)
■ members(cell, dimension, …)
■ facts(cell, …)
■ fact(id)
■ cell_details(cell, drilldown, …)
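To show what a "bring your own backend" browser has to provide, here is a toy browser over a plain Python list. It is a sketch under simplifying assumptions: `ListBrowser` is a hypothetical class, a cell is represented as a plain dict of dimension values rather than a Cubes cell object, drilldown takes a single dimension name, and only one summed measure is supported.

```python
from collections import defaultdict


class ListBrowser:
    """Toy browser over a list of dict facts (illustration only, not Cubes code)."""

    def __init__(self, facts, measure):
        self._facts = list(facts)
        self._measure = measure

    def _summarise(self, rows):
        # One count and one sum, named like the aggregates in a typical reply.
        return {
            "record_count": len(rows),
            self._measure + "_sum": sum(row[self._measure] for row in rows),
        }

    def facts(self, cell):
        """Facts within the cell, here a dict of dimension -> required value."""
        return [fact for fact in self._facts
                if all(fact[dim] == value for dim, value in cell.items())]

    def fact(self, fact_id):
        """Single fact by its id, or None."""
        return next((f for f in self._facts if f["id"] == fact_id), None)

    def members(self, cell, dimension):
        """Distinct values of a dimension within the cell."""
        return sorted({fact[dimension] for fact in self.facts(cell)})

    def aggregate(self, cell, drilldown=None):
        """Summary for the cell plus one row per drilldown value."""
        selected = self.facts(cell)
        result = {"summary": self._summarise(selected), "cells": []}
        if drilldown is not None:
            groups = defaultdict(list)
            for fact in selected:
                groups[fact[drilldown]].append(fact)
            for key in sorted(groups):
                row = {drilldown: key}
                row.update(self._summarise(groups[key]))
                result["cells"].append(row)
        return result


# Made-up sample data.
browser = ListBrowser(
    [
        {"id": 1, "year": 2009, "region": "east", "amount": 10},
        {"id": 2, "year": 2009, "region": "west", "amount": 20},
        {"id": 3, "year": 2010, "region": "east", "amount": 30},
    ],
    measure="amount",
)
result = browser.aggregate({}, drilldown="year")
print(result["summary"])
print(browser.members({"region": "east"}, "year"))
```

A real backend would answer the same calls by building queries against its store instead of filtering a list.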
Split Cell
True False
__within_split__
generated dimension
aggregate(split=cell)
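The effect of a split can be sketched in a few lines: every fact is tagged with whether it falls inside the split cell, and that flag is treated as one more drilldown dimension. The function below is a hypothetical in-memory illustration, with the split cell given as a plain dict and a single summed measure; it is not how the SQL backend implements it.

```python
SPLIT_DIMENSION = "__within_split__"


def aggregate_with_split(facts, split, measure):
    """Aggregate facts into at most two rows: inside and outside the split cell."""
    totals = {}
    for fact in facts:
        # A fact is "within" the split if it matches every value of the split cell.
        within = all(fact[dim] == value for dim, value in split.items())
        row = totals.setdefault(within, {"record_count": 0, measure + "_sum": 0})
        row["record_count"] += 1
        row[measure + "_sum"] += fact[measure]
    return [
        dict({SPLIT_DIMENSION: flag}, **totals[flag])
        for flag in (True, False)
        if flag in totals
    ]


# Made-up sample data.
facts = [
    {"status": 1, "amount": 10},
    {"status": 1, "amount": 20},
    {"status": 2, "amount": 5},
]
rows = aggregate_with_split(facts, {"status": 1}, "amount")
for row in rows:
    print(row)
```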
Post-aggregation
■ computed on aggregation result in Python
■ moving averages, deviation, variance
wma, sma, sms, smstd, smsrd, smsvar
■ aggregate property: window_size
“statutils”
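A post-aggregation calculation runs over the rows that the backend has already aggregated. The sketch below shows a simple moving average with a `window_size`; the function names are hypothetical, and averaging the leading rows over a shorter window is a choice made for this example, not necessarily what Cubes does.

```python
def simple_moving_average(values, window_size):
    """Average of each value and up to window_size - 1 values before it."""
    averages = []
    for index in range(len(values)):
        window = values[max(0, index - window_size + 1): index + 1]
        averages.append(sum(window) / len(window))
    return averages


def add_moving_average(rows, aggregate, window_size):
    """Return copies of the rows with an extra '<aggregate>_sma' field."""
    values = [row[aggregate] for row in rows]
    averages = simple_moving_average(values, window_size)
    return [
        dict(row, **{aggregate + "_sma": average})
        for row, average in zip(rows, averages)
    ]


# Made-up drilldown rows, already aggregated by month.
rows = [
    {"month": 1, "amount_sum": 10},
    {"month": 2, "amount_sum": 20},
    {"month": 3, "amount_sum": 30},
    {"month": 4, "amount_sum": 40},
]
smoothed = add_moving_average(rows, "amount_sum", window_size=2)
print([row["amount_sum_sma"] for row in smoothed])  # [10.0, 15.0, 25.0, 35.0]
```

Because the calculation happens in Python on the result, it works the same way regardless of which backend produced the rows.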
Store
Store
■ provides database or API connection
■ might provide a model
■ slicer tool actions (future): validation, schema, optimization, ...
Logical Physical
physical data store (database or API)
Browser
Store
connect
SQL Backend
also known as ROLAP
or SQL query generator
SQL Overview
■ new query builder
■ join optimisation
■ support for outer-joins
■ support for “split” dimension
■ new aggregate functions
fact table
❄
join optimisation
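Each join between a master table (typically the fact table) and a detail table (typically a dimension table) can name a method: match, master or detail. A rough sketch of how those methods correspond to SQL join types follows. The mapping reflects the usual reading (match keeps only matching keys, master keeps every master row, detail keeps every detail row); the helper `join_clause` is hypothetical, and the real query builder works on SQLAlchemy objects rather than strings.

```python
# Assumed mapping from join method to SQL join type, with master on the left.
JOIN_KEYWORDS = {
    "match": "INNER JOIN",        # keys must exist on both sides
    "master": "LEFT OUTER JOIN",  # keep every master (fact) row
    "detail": "RIGHT OUTER JOIN", # keep every detail (dimension) row
}


def join_clause(join):
    """Render one join definition as a SQL fragment (illustration only)."""
    master_table, master_column = join["master"].split(".")
    detail_table, detail_column = join["detail"].split(".")
    keyword = JOIN_KEYWORDS[join.get("method", "match")]
    return "{m} {kw} {d} ON {m}.{mc} = {d}.{dc}".format(
        m=master_table, kw=keyword, d=detail_table,
        mc=master_column, dc=detail_column,
    )


join = {
    "master": "fact_contracts.contract_date_id",
    "detail": "dim_date.id",
    "method": "detail",
}
clause = join_clause(join)
print(clause)
```

With the detail method, every date in the date dimension shows up in the result even when no contract falls on it, which is what a report with a continuous time axis needs.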
[Diagram: the three join methods (match, detail, master) illustrated on a facts table as master and a date table as detail]
"joins": [
    {
        "master": "fact_contracts.contract_date_id",
        "detail": "dim_date.id",
        "method": "detail"
    }
]
Authentication and Authorisation
{
    "lidia": {
        "allowed_cubes": ["sales"],
        "cube_restrictions": { "sales": ["store:3"] }
    },
    "martin": {
        "allowed_cubes": ["sales"],
        "cube_restrictions": { "sales": ["store:5"] }
    }
}

[workspace]
authorization: simple

[authorization]
rights_file: access_rights.json
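The rights file can be read as "which cubes may this user see, and which cuts are always applied for them". The sketch below evaluates such a file in plain Python; the function `authorize` is hypothetical and covers only the two properties shown here, not the full behaviour of the simple authorizer.

```python
import json

RIGHTS = json.loads("""
{
    "lidia": {
        "allowed_cubes": ["sales"],
        "cube_restrictions": { "sales": ["store:3"] }
    },
    "martin": {
        "allowed_cubes": ["sales"],
        "cube_restrictions": { "sales": ["store:5"] }
    }
}
""")


def authorize(rights, user, cube):
    """Return the cuts to enforce for `user` on `cube`, or raise PermissionError."""
    entry = rights.get(user)
    if entry is None or cube not in entry.get("allowed_cubes", []):
        raise PermissionError("{} may not access cube {}".format(user, cube))
    # Restrictions are cut strings that get combined with the user's own query.
    return entry.get("cube_restrictions", {}).get(cube, [])


print(authorize(RIGHTS, "lidia", "sales"))   # ['store:3']
print(authorize(RIGHTS, "martin", "sales"))  # ['store:5']
```

Both users query the same `sales` cube, but each sees only the store named in their restriction.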
!
Authorizer
Slicer server
✂
Model Queries
■ GET /cubes
overview of cubes from all providers
■ GET /cube/sales/model
detailed cube model with described dimensions
Browser Queries
■ GET /cube/name/aggregate
■ GET /cube/name/members/dim
■ GET /cube/name/facts
■ GET /cube/name/fact
■ GET /cube/name/cell
Aggregate
GET /cube/sales/aggregate?cut=date:2010&split=status:1&drilldown=date|region&page=10&page_size=100
{
    "cell": [],
    "total_cell_count": 2,
    "drilldown": [
        { "record_count": 31, "amount_sum": 550840, "date.year": 2009 },
        { "record_count": 31, "amount_sum": 566020, "date.year": 2010 }
    ],
    "summary": { "record_count": 62, "amount_sum": 1116860 }
}
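From a client's point of view an aggregate request is just a URL and a JSON reply. The sketch below builds such a URL and reads a reply of the shape shown above, using only the standard library. The base address `http://localhost:5000` and the helper `aggregate_url` are assumptions for the example, and no request is actually sent.

```python
import json
from urllib.parse import urlencode


def aggregate_url(base, cube, **params):
    """Build an aggregate URL; ':' and '|' are kept readable in the query string."""
    query = urlencode(params, safe=":|")
    return "{}/cube/{}/aggregate?{}".format(base.rstrip("/"), cube, query)


url = aggregate_url(
    "http://localhost:5000",  # assumed address of a running Slicer server
    "sales",
    cut="date:2010",
    drilldown="date|region",
)
print(url)

# A reply of the same shape as the one on the slide.
reply = json.loads("""
{
    "cell": [],
    "total_cell_count": 2,
    "drilldown": [
        { "record_count": 31, "amount_sum": 550840, "date.year": 2009 },
        { "record_count": 31, "amount_sum": 566020, "date.year": 2010 }
    ],
    "summary": { "record_count": 62, "amount_sum": 1116860 }
}
""")

# The drilldown rows add up to the summary.
by_year = {row["date.year"]: row["amount_sum"] for row in reply["drilldown"]}
print(by_year)
print(sum(by_year.values()) == reply["summary"]["amount_sum"])
```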
Special Characters
"category:10\-24" → "10-24"
"city:Nové\ Mesto\ nad\ Váhom" → "Nové Mesto nad Váhom"
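A client has to apply this escaping before it puts a value into a cut. The helpers below are a minimal sketch: the names `escape_value` and `unescape_value` are hypothetical, and the set of escaped characters (backslash, space, `-`, `:`, `|`, `;`) is an assumption based on the characters that carry meaning in the examples in this deck, not an authoritative list.

```python
import re

# Assumed set of characters that need a backslash inside a cut value.
_SPECIAL = re.compile(r"([\\ \-:|;])")
_ESCAPED = re.compile(r"\\(.)")


def escape_value(value):
    """Prefix every special character with a backslash."""
    return _SPECIAL.sub(r"\\\1", value)


def unescape_value(value):
    """Remove the escaping backslashes again."""
    return _ESCAPED.sub(r"\1", value)


print("category:" + escape_value("10-24"))
print("city:" + escape_value("Nové Mesto nad Váhom"))
print(unescape_value(r"Nové\ Mesto\ nad\ Váhom"))
```

Without the escaping, `10-24` would read as a range from 10 to 24 instead of a single category key.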
Relative Time
date:yesterday
date:90daysago-today
expiration_date:lastmonth-next2months
uses dimension roles and Calendar
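Relative time tokens have to be resolved against "now" into concrete dimension paths. The sketch below handles only the token shapes that appear above (today, yesterday, N days or months ago, last or next N days or months) and returns `[year, month, day]` or `[year, month]` lists. The function names and the result format are assumptions for illustration; the Cubes calendar supports more units and depends on the roles of the dimension's levels.

```python
import re
from datetime import date, timedelta

_TOKEN = re.compile(
    r"^(?:(last|next)(\d*)(day|month)s?|(\d+)(day|month)s?ago)$"
)
_SIMPLE = {"today": 0, "yesterday": -1, "tomorrow": 1}


def _shift(today, unit, offset):
    """Move `today` by `offset` units and truncate the result to that unit."""
    if unit == "day":
        moved = today + timedelta(days=offset)
        return [moved.year, moved.month, moved.day]
    # Months: count months since year 0, shift, convert back.
    index = today.year * 12 + (today.month - 1) + offset
    return [index // 12, index % 12 + 1]


def resolve(token, today):
    """Resolve one relative token such as '90daysago' or 'next2months'."""
    if token in _SIMPLE:
        return _shift(today, "day", _SIMPLE[token])
    match = _TOKEN.match(token)
    if match is None:
        raise ValueError("unknown relative time: {}".format(token))
    if match.group(1):
        count = int(match.group(2) or 1)
        sign = -1 if match.group(1) == "last" else 1
        unit = match.group(3)
    else:
        count, unit, sign = int(match.group(4)), match.group(5), -1
    return _shift(today, unit, sign * count)


def resolve_range(text, today):
    """Resolve 'from-to' into a pair of paths."""
    start, end = text.split("-")
    return resolve(start, today), resolve(end, today)


today = date(2014, 11, 15)  # fixed date so the example is reproducible
print(resolve("yesterday", today))
print(resolve_range("90daysago-today", today))
print(resolve_range("lastmonth-next2months", today))
```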
Output Format
format=csv
format=json
format=json_lines
* for facts and members
Deployment
reporting for your app or stand-alone
Public
store
Slicer server
HTML & JS Application
HTTP request
JSON reply
model
Public
store
WSGI
HTML & JS Application
HTTP request
JSON reply
Slicer Flask App
model
Public
store
JSON reply
Cubes Python API
Django, Flask, …
HTML
model
Public
store
Flask
HTML
Slicer Blueprint
model
Internal
Public
store
Slicer server
Web Application
PHP, RoR, Django
HTTP request
JSON reply
model
HTML
Front-ends
generic ad-hoc reporting
✂ Slicer
Jose Juan Montes, GitHub: jjmontesl/cubesviewer
Cubes Viewer
checkgermany.de
Front-end by: Felix Ebert (@femeb)
Data by: Friedrich Lindenberg (@pudo)
Cubes 0.10.2
Summary & Future
Summary
■ heterogeneous pluggable environment
■ easier to extend
■ better SQL query generator
Not Mentioned
■ localisation
■ namespaces
■ calendar
■ query logging
Incubated
■ non-additive properties
■ periods-to-date
■ modeler app
■ cubes.js
Future
■ arithmetic expressions
■ SQL improvements
■ improved API for custom browsers
■ cubes.js
Nutrition Facts
Serving Size 1 cube
Amount Per Serving / % Daily Value
Total Fat 0g / 0%
Saturated Fat 0g
Trans Fat 0g
Total Carbohydrate 0g / 0%
Dietary Fiber 0g
Sugars 0g
Want to contribute?
#TODO, #FIXME, Issue #
https://github.com/DataBrewery/cubes/issues
Credits
Thanks for 1.0
Robin Thomas
Ryan Berlew
Jose Juan Montes
Squarespace
and all contributors on Github
Thank You
@Stiivi