Copyright © 2017, Oracle and/or its affiliates. All rights reserved. | SQL Analytics for Analysis, Reporting and Modeling Key SQL Functionality for ANALYTICS in the cloud and on-premise with Oracle Database: 18c 12c Release 2

May 31, 2020


dariahiddleston
Transcript
Page 1: SQL Analytics for Analysis, Reporting and Modeling

Key SQL Functionality for ANALYTICS in the cloud and on-premise with Oracle Database: 18c, 12c Release 2

Page 2: LiveSQL – The Easiest Way to Explore, Learn and Try SQL

• Free service – livesql.oracle.com

• Features include:
  – Access to the very latest 18c features
  – Ability to save collections of statements as a script
  – Access to growing library of tutorials
  – Share saved scripts with others
  – Embedded educational tutorials
  – Data access examples for popular languages including Java
  – Comes complete with sample schemas
    • Human Resources schema
    • Sales History schema
    • SCOTT schema
    • World Population data
    • Dino Date demo data
    • Olympic data

Page 3: Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Page 4: Oracle Database 18 Release 1 – New SQL Features

• Overview of new SQL Features
  – SAFE HARBOR STATEMENT

• What’s new in 18 Release 1 (SAFE HARBOR STATEMENT)
  – ROUND()
  – Polymorphic Table Functions
  – Approximate Query Processing
    • Approx. Top/Bottom-N
  – Analytic Views
    • MDX Support
  – Private temporary tables
  – Inline External Tables
  – Column-Based Collation

Page 5: Oracle Database 12c Release 2 – New SQL Features

• Overview of new SQL Features

What’s new in 12c Release 2

• LISTAGG
  – Support for larger VARCHAR2 objects
• CAST/VALIDATE
• Approximate statistics
  – APPROX_PERCENTILE
  – APPROX_MEDIAN
• Approximate aggregations
  – APPROX_xxxx_DETAIL, APPROX_xxxx_AGG
  – TO_APPROX_xxxx
• External tables
  – External table - MODIFY clause
  – Partitioned external table
  – Accessing data in Hive, HDFS etc
• Analytic Views

Page 6: Oracle Database 12c Release 2 - Core SQL Features

• Overview of core SQL Features

• Schema modeling enhancements
  – Invisible columns
  – Default value enhancements
  – Identity columns
• Storage optimizations
  – Attribute Clustering
  – Zone Maps
  – Zone Maps and attribute clustering
  – Zone Maps and partitioning
  – Zone Maps and storage indexes
• SQL for advanced analysis
  – TOP-N
  – MATCH_RECOGNIZE
  – APPROX_COUNT_DISTINCT
• Query rewrites
  – Materialized views
  – In-place/out-of-place refresh
  – Synchronous refresh
• Multilingual support
  – Data bound collations

Page 7: Analytic SQL @ #oow17

Key sessions and Labs

• Link to Complete Data Warehouse and Big Data Guide to #oow17
• Link to #oow17 web app for data warehousing and big data
• List of SQL sessions
• List of data warehouse sessions
• List of hands-on labs

Page 8: What’s new in 18 Release 1

… even more Approximate query processing features to self-describing Table Functions

Page 9: New ROUND_TIES_TO_EVEN() Function in 18.1

• This enhancement will provide a new rounding function

    ROUND_TIES_TO_EVEN(n [, integer])

• ROUND_TIES_TO_EVEN and ROUND have the same behavior except when the rounding digit is at the midpoint.
  – ROUND_TIES_TO_EVEN will return the nearest value with an even least significant digit.
  – ROUND will return the nearest value above (for positive numbers) or below (for negative numbers).

• Will not support BINARY_FLOAT and BINARY_DOUBLE

Page 10: Comparing ROUND() and ROUND_TIES_TO_EVEN()

Value    ROUND(Value,0)    ROUND_TIES_TO_EVEN(Value,0)
 1.6      2                 2
-1.6     -2                -2
 0.5      1                 0
-0.5     -1                 0
 2.5      3                 2
-2.5     -3                -2
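The comparison above could be checked with a query along the following lines. This is a minimal sketch that assumes a release in which ROUND_TIES_TO_EVEN is available; the inline view and the alias val are illustrative and do not come from the slides.

```sql
-- Compare ROUND (ties go away from zero) with ROUND_TIES_TO_EVEN (ties go to the even digit).
-- The inline view only supplies the six sample values used in the comparison table.
SELECT val,
       ROUND(val, 0)              AS round_result,
       ROUND_TIES_TO_EVEN(val, 0) AS ties_to_even_result
FROM  (SELECT  1.6 AS val FROM dual UNION ALL
       SELECT -1.6 FROM dual UNION ALL
       SELECT  0.5 FROM dual UNION ALL
       SELECT -0.5 FROM dual UNION ALL
       SELECT  2.5 FROM dual UNION ALL
       SELECT -2.5 FROM dual);
```

The two result columns should differ only for the midpoint inputs (0.5, -0.5, 2.5, -2.5).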

Page 11: Top-N approximate aggregation

• Approximate results for common top n queries
  – How many approximate page views did the top five blog posts get last week?
  – What were the top 50 customers in each region and their approximate spending?

• Orders of magnitude faster processing with high accuracy (error rate < 0.5%)

• New approximate functions APPROX_COUNT(), APPROX_SUM(), APPROX_RANK()

Top 5 blogs with approximate hits:

    SELECT blog_post, APPROX_COUNT(*)
    FROM weblog
    GROUP BY blog_post
    HAVING APPROX_RANK(ORDER BY APPROX_COUNT(*) DESC) <= 5;

Top 50 customers per region with approximate spending:

    SELECT region, customer_name,
           APPROX_RANK(PARTITION BY region ORDER BY APPROX_SUM(sales) DESC) appr_rank,
           APPROX_SUM(sales) appr_sales
    FROM sales_transactions
    GROUP BY region, customer_name
    HAVING APPROX_RANK(...) <= 50;

Page 12: Polymorphic Tables: Self-Describing, Fully Dynamic SQL

• Part of ANSI 2016
• Embed sophisticated algorithms in SQL
  – Hides implementation of algorithm
  – Leverage powerful, dynamic capabilities of SQL
  – Pass in any table-columns for processing
  – Returns rowset (table, JSON, XML doc, etc.)
• Applies built-in algorithms and/or custom algorithms
• Returns an enhanced set of rows-columns as output (table)
• E.g. return credit score and associated risk level

[Figure: SQL INPUTS passed to a MODEL that returns SQL output; column labels include STATE_ID, POP, LOANS, A_SCORE, RISK, A_LOAN]

Page 13: PTFs: Use Cases For Fully Dynamic SQL

• Embed credit risk evaluation model
  – Hides implementation of credit risk model
  – Pass in key columns to evaluate credit risk
  – PTF returns credit score and associated risk level

    SELECT state, AVG(credit_score)
    FROM CREDIT_RISK(
           tab  => table(CUSTOMERS),
           cols => columns(DOB, ZIP, LoanDefault),
           outs => columns(credit_score, risk_level))
    WHERE risk_level = 'High'
    GROUP BY state;

• Simplify access to external data sets
  – Pass in any server connection details and any source file
  – Returns row-column based formatted results

    SELECT *
    FROM HDFS_READER(
           host_port => 'http://<host>:<port>',
           path      => 'customer_reviews_2013.json',
           outs      => columns("cust_id"   varchar(20),
                                "prod.id"   integer,
                                "prod.desc" varchar(500)));

Page 14: Enhancements to Analytic Views

• More calculations within Analytic Views:
  – Ranking and statistical functions
  – Hierarchical expressions
• Broader schema support for Analytic Views:
  – Snowflake schemas; flat/denormalized fact tables (in addition to star schemas)
• Dynamic definition of calculations within SQL queries
• Support for MDX

Page 15: Private Temporary Tables

Global temporary tables
• Persistent, shared (global) table definition
• Temporary, private (session-based) data content
  – Data physically exists for a transaction or session
  – Session-private statistics

[Figure: global temporary table ACC_TMP]

Private temporary tables (18.1)
• Temporary, private (session-based) table definition
  – Private table name and shape
• Temporary, private (session-based) data content
  – Session or transaction duration

[Figure: private temporary tables ACC_PTMP]
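A private temporary table might be created and used as in the sketch below. It assumes the syntax as it shipped in Oracle Database 18c and the default name prefix ORA$PTT_; the table name and columns are hypothetical.

```sql
-- Assumption: the default PRIVATE_TEMP_TABLE_PREFIX (ORA$PTT_) is in effect,
-- so the table name has to start with that prefix.
CREATE PRIVATE TEMPORARY TABLE ora$ptt_acc_ptmp (
  acc_id  NUMBER,
  balance NUMBER
) ON COMMIT DROP DEFINITION;   -- transaction duration

INSERT INTO ora$ptt_acc_ptmp VALUES (1, 100);

-- Both the definition and the rows are visible only to this session.
SELECT * FROM ora$ptt_acc_ptmp;

COMMIT;   -- the definition and its rows are dropped here
```

For session duration, ON COMMIT PRESERVE DEFINITION keeps the table until the session ends.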

Page 16: Inline External Tables

• External table definition provided at runtime
  – Similar to inline view
• No need to pre-create external tables that are used one time only
  – Increased developer productivity

With an inline external table:

    INSERT INTO sales
      SELECT sales_xt.*
      FROM EXTERNAL(
             (prod_id number, … )
             TYPE ORACLE_LOADER
             …
             LOCATION ('new_sales_kw13')
             REJECT LIMIT UNLIMITED ) sales_xt;

With a pre-created external table:

    CREATE TABLE sales_xt
      (prod_id number, … )
      ORGANIZATION EXTERNAL (
        TYPE ORACLE_LOADER
        …
        LOCATION ('new_sales_kw13') )
      REJECT LIMIT UNLIMITED;

    INSERT INTO sales SELECT * FROM sales_xt;

    DROP TABLE sales_xt;

Page 17: Column-Based Collation

• Precise and consistent application of linguistic comparison in queries
  – Adds COLLATE clause to declare column’s collation to be used in all queries
  – COLLATE operator precisely controls collation in expressions
• Case- and accent-insensitive collations (e.g. BINARY_CI) simplify implementation of case-insensitive queries
• Feature is based on ISO/IEC SQL Standard and simplifies application migration from other databases supporting the COLLATE clause

    CREATE TABLE products(
      product_code VARCHAR2(20 BYTE)  COLLATE BINARY,
      product_name VARCHAR2(100 BYTE) COLLATE GENERIC_M_CI
    );
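The COLLATE operator mentioned above can override a column's declared collation for a single expression. The query below is a sketch against the products table from this slide; the literal values are made up, and it assumes the database is configured to allow data-bound collation (for example MAX_STRING_SIZE = EXTENDED).

```sql
-- product_code is declared COLLATE BINARY, so it compares case-sensitively by default.
-- The COLLATE operator switches this one comparison to a case-insensitive collation.
SELECT product_code, product_name
FROM   products
WHERE  product_code COLLATE BINARY_CI = 'ab-1001';

-- product_name is declared COLLATE GENERIC_M_CI, so no operator is needed here:
-- the comparison is already case-insensitive.
SELECT product_code, product_name
FROM   products
WHERE  product_name = 'blue widget';
```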

Page 18: What’s new in 12c Release 2

From Approximate query processing to new VALIDATE Functionality to new dimensional modeling with analytic views

Page 19: Pre-12.2 LISTAGG

• Pre 12.2 syntax to manage lists was relatively simple:

    LISTAGG(c.cust_first_name||' '||c.cust_last_name, ',')
      WITHIN GROUP (ORDER BY c.country_id) AS Customer

• The key issue is the overflow error:
  – ORA-01489: result of string concatenation is too long

• Solutions in 12.2
  – Increasing the VARCHAR2 size: support for VARCHAR2 up to 32k
  – Handle overflow errors: new syntax to truncate the string, optionally display a count of the truncated items, and set a truncation indicator

Page 20: New Keywords For Use With LISTAGG

• With 12.2 we have made it easier to manage lists:

    LISTAGG(<measure_column>[, <delimiter>] . . .

  – ON OVERFLOW ERROR (default)
  – ON OVERFLOW TRUNCATE
  – ON OVERFLOW TRUNCATE '. . .'
  – WITH COUNT
  – WITHOUT COUNT (default)
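The overflow keywords are written inside the LISTAGG call, after the delimiter. A minimal sketch, assuming a customers table with the columns used on the previous slide:

```sql
-- Truncate the list instead of raising ORA-01489, append '...' as the
-- truncation indicator, and show how many items were left out.
SELECT country_id,
       LISTAGG(cust_first_name || ' ' || cust_last_name, ','
               ON OVERFLOW TRUNCATE '...' WITH COUNT)
         WITHIN GROUP (ORDER BY cust_last_name) AS customer_list
FROM   customers
GROUP  BY country_id;
```

Replacing WITH COUNT by WITHOUT COUNT keeps the indicator but omits the number of dropped items.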

Page 21: Detecting Data Conversion Errors - VALIDATE_CONVERSION

Identifying invalid data in the input streams

• Useful to detect if input value can be converted to destination type. Returns 1 if conversion is successful, otherwise returns 0
• VALIDATE_CONVERSION('123a' AS NUMBER) --> returns 0
• VALIDATE_CONVERSION('123' AS NUMBER) --> returns 1
• Can be efficiently used as filter to avoid bad data while importing foreign data sources, ETL processing
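Used as a filter, VALIDATE_CONVERSION can separate bad rows before a load. The sketch below assumes a hypothetical staging table stage_orders whose amount_txt and order_date_txt columns hold raw text.

```sql
-- List staging rows whose text values would fail conversion,
-- so they can be reviewed instead of aborting the load.
SELECT order_id, amount_txt, order_date_txt
FROM   stage_orders
WHERE  VALIDATE_CONVERSION(amount_txt AS NUMBER) = 0
   OR  VALIDATE_CONVERSION(order_date_txt AS DATE, 'YYYY-MM-DD') = 0;
```

Inverting the predicates (both equal to 1) selects only the rows that are safe to convert.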

Page 22: Handling data conversion errors - TO_xxxx(), CAST()

- Replacing incorrect or missing data with default values

• Pre 12.2: TO_NUMBER('123a') --> returns invalid number error (ORA-01722)

New 12.2 Features
• New syntax DEFAULT <default_value> ON CONVERSION ERROR
  – Replace conversion failure with user defined default value
  – TO_NUMBER('123a' DEFAULT '123' ON CONVERSION ERROR) --> returns 123
• This new syntax can be used for TO_NUMBER, TO_DATE, TO_TIMESTAMP, TO_TIMESTAMP_TZ, TO_DMINTERVAL, TO_YMINTERVAL and CAST
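The same clause has a slightly different position in each function. The sketch below shows it in TO_NUMBER, CAST and TO_DATE; the input literals and default values are arbitrary examples.

```sql
-- Each expression falls back to its DEFAULT value because the input cannot be converted.
SELECT TO_NUMBER('123a' DEFAULT -1 ON CONVERSION ERROR)            AS n1,
       CAST('abc' AS NUMBER DEFAULT 0 ON CONVERSION ERROR)         AS n2,
       TO_DATE('not-a-date' DEFAULT '1900-01-01' ON CONVERSION ERROR,
               'YYYY-MM-DD')                                       AS d1
FROM   dual;
```

In TO_DATE the clause comes before the format model, and the default string is interpreted with that same format.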

Page 23: Review: Analytic Views in 12.2

Enhanced Analysis and Simplified Access

• Organizes data into a user and application friendly business model
  – Intuitive for the end user
• Defined with SQL DDL
  – Includes hierarchical expressions and calculated measures
  – Easy to define, supported by SQL Developer
• Easily queried with simple SQL SELECT
  – Smart Analytic View (containing hierarchies and calculations) = Simple Query

Page 24: Review: Analytic Views in 12.2

Embedded Calculations

• Define centrally in the Database and access with any application
  – Single version of the truth
• Easily create new measures
  – Simplified syntax based on business model
  – Includes dimensional and hierarchical functions

Sales Year to Date:

    sales_ytd AS
      (SUM(sales) OVER (HIERARCHY time_hierarchy
                        BETWEEN UNBOUNDED PRECEDING AND 0 FOLLOWING
                        WITHIN ANCESTOR AT LEVEL year))

Product Share of Parent:

    share_product_parent_sales AS
      (SHARE_OF(sales HIERARCHY product_hierarchy PARENT))

Page 25: Approximate Statistics

• Issue: PERCENTILE_CONT, PERCENTILE_DISC, MEDIAN functions require sorting and can consume large amounts of resources
• Solution: New approximate SQL functions use fewer resources: APPROX_PERCENTILE, APPROX_MEDIAN
  – Use less memory, no sorting, no use of temp

Page 26: How to get more information about result set

Additional keywords

• Each function can use different algorithms and report error rates and confidence levels:

1. DETERMINISTIC / NONDETERMINISTIC [default]
  – Non-deterministic is faster but results may vary, good for personal data discoveries
  – Deterministic, slightly slower; better where results are shared with other users
2. ERROR_RATE
  – Returns the margin of error associated with result
3. CONFIDENCE
  – Returned as a percentage that indicates the level of confidence
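The keywords are passed inside the function call. A minimal sketch, assuming an employees table shaped like the one in the Human Resources sample schema (department_id, salary):

```sql
-- Approximate median per department, plus the error rate and confidence
-- reported for that estimate, and an approximate 90th percentile.
SELECT department_id,
       APPROX_MEDIAN(salary DETERMINISTIC)               AS approx_median_sal,
       APPROX_MEDIAN(salary DETERMINISTIC, 'ERROR_RATE') AS median_error_rate,
       APPROX_MEDIAN(salary DETERMINISTIC, 'CONFIDENCE') AS median_confidence,
       APPROX_PERCENTILE(0.9 DETERMINISTIC)
         WITHIN GROUP (ORDER BY salary)                  AS approx_p90_sal
FROM   employees
GROUP  BY department_id;
```

Omitting DETERMINISTIC selects the default non-deterministic algorithm described above.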

Page 27: New Functions For Building Approximate Aggregates

1. APPROX_xxxxxx_DETAIL(expr [DETERMINISTIC])
  – Builds summary table containing results for all dimensions in GROUP BY clause
  – Data stored within MV as a BLOB object
2. APPROX_xxxxxx_AGG(expr)
  – Builds higher level summary table based on results from table derived from _DETAIL function
  – Does not re-query base fact table, derives new aggregates from _DETAIL table
  – Data stored within MV as a BLOB object
3. TO_APPROX_xxxxxx(detail, percentage, order)
  – Returns results from the specified aggregated results table

    select ... to_approx_percentile(approx_percentile_agg(detail), 0.5)
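The three functions are meant to be chained: detail at the lowest level, aggregation when rolling up, and a TO_ function to read the value. The sketch below assumes a hypothetical sales_by_location table and uses the percentile variants; in the released documentation TO_APPROX_PERCENTILE also takes a datatype argument ('NUMBER' here), which the slide omits.

```sql
-- Step 1: store percentile details at city level (kept as a BLOB in the MV).
CREATE MATERIALIZED VIEW sales_pct_city_mv AS
SELECT country, state, city,
       APPROX_PERCENTILE_DETAIL(amount_sold) AS city_detail
FROM   sales_by_location
GROUP  BY country, state, city;

-- Steps 2 and 3: roll the city details up to state level without touching
-- the base table, then read the approximate median from the aggregate.
SELECT country, state,
       TO_APPROX_PERCENTILE(APPROX_PERCENTILE_AGG(city_detail), 0.5, 'NUMBER')
         AS approx_median_sold
FROM   sales_pct_city_mv
GROUP  BY country, state;
```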

Page 28: External Tables

• Key issues:
  – Definition of external table is fixed at creation time
  – Need ability to define table once and use it multiple times, to access different external files
  – Apply same table definition to different inputs
• Solution:
  – Added EXTERNAL MODIFY clause
  – Ease of use enhancement for using external tables
  – Clause allows external table to be overridden at query time
  – Properties: DEFAULT_DIRECTORY, certain ACCESS PARAMETERS, LOCATION and REJECT LIMIT
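At query time the override follows the table name. The sketch below assumes an existing external table named sales_xt; the file name is hypothetical.

```sql
-- Reuse the sales_xt definition but point this one query at a different
-- source file and allow up to 100 rejected rows.
SELECT COUNT(*)
FROM   sales_xt
       EXTERNAL MODIFY (
         LOCATION ('new_sales_kw14.csv')
         REJECT LIMIT 100
       );
```

The stored definition of sales_xt is unchanged; the overrides apply only to this statement.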

Page 29: Core SQL in 12c Release 2

From storage optimizations to SQL pattern matching to data bound collations to support multi-lingual systems

Page 30: Overview of Schema Modeling Enhancements

• Invisible Columns
• DEFAULT VALUE enhancements
  – Metadata-Only Default column values for NULL’able columns
  – Default values for columns on explicit NULL insertion
  – Default values for columns based on sequences
• Multiple Indexes on the same columns
• IDENTITY columns
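Several of these features can be combined in one table definition. The following is a sketch with hypothetical object names (order_no_seq, orders_demo):

```sql
CREATE SEQUENCE order_no_seq;

CREATE TABLE orders_demo (
  order_id   NUMBER GENERATED ALWAYS AS IDENTITY,      -- IDENTITY column
  order_no   NUMBER DEFAULT order_no_seq.NEXTVAL,      -- default based on a sequence
  status     VARCHAR2(10) DEFAULT ON NULL 'NEW',       -- default on explicit NULL insertion
  audit_note VARCHAR2(100) INVISIBLE                   -- invisible column
);

-- The explicit NULL for status is replaced by the default 'NEW'.
INSERT INTO orders_demo (status) VALUES (NULL);

-- SELECT * leaves out the invisible column; naming it explicitly returns it.
SELECT * FROM orders_demo;
SELECT order_id, order_no, status, audit_note FROM orders_demo;
```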

Page 31: Attribute Clustering

Concepts and Benefits

• Orders data so that it is in close proximity based on selected columns values: “attributes”
• Attributes can be from a single table or multiple tables
  – e.g. from fact and dimension tables
• Significant IO pruning when used with zone maps
• Reduced block IO for table lookups in index range scans
• Queries that sort and aggregate can benefit from pre-ordered data
• Enable improved compression ratios
  – Ordered data is likely to compress more than unordered data

Page 32: Basics of Zone Maps

• Independent access structure built for a table
  – Implemented using a type of materialized view
  – For partitioned and non-partitioned tables
• One zone map per table
  – Zone map on partitioned table includes aggregate entry per [sub]partition
• Used transparently
  – No need to change or hint queries
• Implicit or explicit creation and column selection
  – Through Attribute Clustering: CREATE TABLE … CLUSTERING
  – CREATE MATERIALIZED ZONEMAP … AS SELECT …
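Both creation paths can be sketched as follows. The table and column names are hypothetical, and zone maps may only be available on certain platforms or with certain options, so treat this as an illustration of the syntax rather than something every installation can run.

```sql
-- Implicit creation: attribute clustering with a zone map on the clustering columns.
CREATE TABLE sales_ac (
  sale_id NUMBER,
  cust_id NUMBER,
  prod_id NUMBER,
  amount  NUMBER
)
CLUSTERING BY LINEAR ORDER (cust_id, prod_id)
YES ON LOAD YES ON DATA MOVEMENT
WITH MATERIALIZED ZONEMAP;

-- Explicit creation: a zone map on chosen columns of an existing table named sales.
CREATE MATERIALIZED ZONEMAP sales_zmap
  ON sales (cust_id, prod_id);
```

Queries that filter on cust_id or prod_id need no changes; the optimizer decides whether zones can be skipped.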

Page 33: Pattern Recognition In Sequences of Rows

SQL Pattern Matching - Concepts

• Recognize patterns in sequences of events using SQL
  – Sequence is a stream of rows
  – Event equals a row in a stream
  – New SQL construct MATCH_RECOGNIZE
  – Logically partition and order the data
  – ORDER BY and PARTITION BY are optional – but be careful
• Pattern defined using regular expression using variables
  – Regular expression is matched against a sequence of rows
  – Each pattern variable is defined using conditions on rows and aggregates
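These concepts map directly onto the clauses of MATCH_RECOGNIZE. The sketch below looks for a V-shaped price movement (a fall followed by a rise) and assumes a hypothetical ticker table with symbol, tstamp and price columns.

```sql
SELECT *
FROM   ticker
MATCH_RECOGNIZE (
  PARTITION BY symbol                       -- one logical stream per symbol
  ORDER BY tstamp                           -- order of events within the stream
  MEASURES strt.tstamp       AS start_tstamp,
           LAST(down.tstamp) AS bottom_tstamp,
           LAST(up.tstamp)   AS end_tstamp
  ONE ROW PER MATCH
  AFTER MATCH SKIP TO LAST up
  PATTERN (strt down+ up+)                  -- regular expression over pattern variables
  DEFINE
    down AS down.price < PREV(down.price),  -- condition on a row versus the previous row
    up   AS up.price   > PREV(up.price)
);
```

Without ORDER BY the row sequence is not deterministic, which is why the slide advises care when leaving it out.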

Page 34: Distinct Counts to support “How Many Unique…”

Businesses need to answer lots of different “How many…” type questions
  – How many unique sessions today
  – How many unique customers logged on
  – How many unique events occurred

Most queries don’t need precise answers; an approximate answer is good enough
  – Approximate answers can be returned significantly faster
  – Approximate answers consume fewer resources, leaving resources for other queries
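A question such as "how many unique customers logged on" can be phrased with APPROX_COUNT_DISTINCT. The sketch assumes a hypothetical login_events table with customer_id and login_time columns.

```sql
-- Approximate number of distinct customers per day.
SELECT TRUNC(login_time)                  AS login_day,
       APPROX_COUNT_DISTINCT(customer_id) AS approx_unique_customers
FROM   login_events
GROUP  BY TRUNC(login_time)
ORDER  BY login_day;
```

The exact equivalent would be COUNT(DISTINCT customer_id), which is the form to use where a precise figure is required.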

Page 35: Overview of Materialized Views in Oracle Database 12c

• Objectives
  – Improve performance of refresh operation
  – Minimize staleness time of materialized views
• Two fundamental new concepts for refresh
  – Out-of-place refresh
    • Refresh “shadow MV” and swap with original MV after refresh
  – Synchronous refresh
    • Refresh base tables and MVs synchronously, leveraging equi-partitioning of the objects
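An out-of-place refresh is requested through a parameter of DBMS_MVIEW.REFRESH. A minimal sketch, assuming a materialized view named SALES_MV exists in the current schema:

```sql
BEGIN
  DBMS_MVIEW.REFRESH(
    list           => 'SALES_MV',
    method         => 'C',       -- complete refresh
    atomic_refresh => FALSE,     -- out-of-place refresh is used with non-atomic refresh
    out_of_place   => TRUE);     -- build the result outside the MV, then swap it in
END;
/
```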

Page 36: Enhancements to External Tables

• Issues:
  – Definition of external table is fixed at creation time
  – Need ability to define table once and use it multiple times, to access different external files
  – Need better integration with big data source files
• Solutions:
  – Added EXTERNAL MODIFY clause to allow overriding properties
  – Partitioned external tables for source files stored on file system, Apache Hive storage, or HDFS

Page 37: Data-Bound Collations

“…a named set of rules describing how to compare and match character strings to put them in a specified order…”

• Based on the ISO/IEC/ANSI SQL standard 9075:1999
• Character set is always declared at the database level
• Collation declared for a column
  – Does not determine the character set of data in the column
• Why is it important?
  – It simplifies application migration to the Oracle Database from a number of non-Oracle databases implementing collation in a similar way

Page 38: ROUND() Function

New financial rounding features

Page 39: New ROUND_TIES_TO_EVEN() Function in 18.1

• Formal definition for ROUND_TIES_TO_EVEN functionality

Round Ties To Even: the floating-point number nearest to the infinitely precise result shall be delivered; if the two nearest floating-point numbers bracketing an unrepresentable infinitely precise result are equally near, the one with an even least significant digit shall be delivered

Page 42: Polymorphic Table Functions

Page 43: What is a Self-Describing / Polymorphic Table Function?

ANSI SQL 2016: Definition

• Polymorphic Table Functions (PTF) are user-defined functions that can be invoked in the FROM clause.
• Capable of processing any table
  – Row type is not declared at definition time
  – Produces a result table whose row type may or may not be declared at definition time.
• Allows application developers to leverage the long-defined dynamic SQL
  – Simple SQL access to powerful and complex custom functions.

[Figure: CREDIT RISK MODEL shown as a black box]

Page 44: PTF Taxonomy

[Figure: PTF taxonomy diagram]

Page 45: PTF Taxonomy - Explained

• Non-Leaf PTF: Transforms an arbitrary input row stream into an output row stream.
  • Row Semantics – The PTF acts on a single row at a time, to produce its zero, one, or many output rows.
  • Table Semantics – The PTF acts on a set of rows, where the input table is optionally partitioned into disjoint sets and each set is optionally ordered.
• Leaf PTF: Doesn’t have input parameters of table or query type. Typically used for accessing “foreign” data sources.

On the Roadmap

Page 46: Top 5 PTF Optimizations

✓ Pass through columns
✓ Projection and predicate push-down/push-through
✓ PTF execution in-lined with SQL execution
✓ Bulk data transfer into and out of PTF
✓ Parallel Execution

Page 47: Part 1 - Define Implementation Package

    CREATE OR REPLACE PACKAGE echo_package AS
      -- @Required
      procedure Describe(
        -- Generic Arguments:
        newcols OUT DBMS_TF.columns_new_t,
        -- Specific Arguments:
        tab  IN OUT DBMS_TF.table_t,
        cols IN     DBMS_TF.columns_t);
      -- @Optional
      procedure Open;
      -- @Required
      procedure Fetch_Rows;
      -- @Optional
      procedure Close;
    end;

Page 48: Part 2 - Define Polymorphic Table Function

    CREATE OR REPLACE FUNCTION echo(tab table, cols columns)
      RETURN TABLE PIPELINED ROW POLYMORPHIC USING echo_package;

Page 49: Part 3a - Implementation of Package Body

    CREATE OR REPLACE PACKAGE BODY echo_package AS
      PROCEDURE Describe(
        -- Generic Arguments:
        newcols OUT DBMS_TF.columns_new_t,
        -- Specific Arguments:
        tab  IN OUT DBMS_TF.table_t,
        cols IN     DBMS_TF.columns_t)
      as
        read_count pls_integer := 0;
      begin
        . . .
      end;

Page 50: Part 3b - Implementation of Package Body

    PROCEDURE Open
    as
      env DBMS_TF.env_t := DBMS_TF.Get_Env();
    begin
      DBMS_TF.Trace('Open()');
      DBMS_TF.Trace('Get_Col.Count = ' || env.get_columns.count, prefix => '....');
      DBMS_TF.Trace('Put_Col.Count = ' || env.put_columns.count, prefix => '....');
    end;

Page 51: Part 3c - Implementation of Package Body

    PROCEDURE Fetch_Rows
    as
      Col       DBMS_TF.tab_varchar2_t;
      col_count pls_integer := DBMS_TF.Get_Env().get_columns.count;
    begin
      . . .
    end;

Page 52: Part 3d - Implementation of Package Body

    PROCEDURE Close
    as
    begin
      DBMS_TF.Trace('Close()', separator => '*');
    end;

Page 53: Using A Polymorphic Table

    SELECT * FROM ECHO(emp, COLUMNS(ename, job)) WHERE deptno = 20;

         EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO ECHO_ENAME      ECHO_JOB
    ---------- ---------- --------- ---------- --------- ---------- ---------- ---------- --------------- ---------------
          7369 SMITH      CLERK           7902 17-DEC-80        800                    20 ECHO-SMITH      ECHO-CLER
          7566 JONES      MANAGER         7839 02-APR-81       2975                    20 ECHO-JONES      ECHO-MANA
          7788 SCOTT      ANALYST         7566 19-APR-87       3000                    20 ECHO-SCOTT      ECHO-ANAL
          7876 ADAMS      CLERK           7788 23-MAY-87       1100                    20 ECHO-ADAMS      ECHO-CLER
          7902 FORD       ANALYST         7566 03-DEC-81       3000                    20 ECHO-FORD       ECHO-ANAL

Explain Plan for Polymorphic Table

EXPLAIN PLAN FOR
SELECT *
FROM ECHO(emp, COLUMNS(ename, job))
WHERE deptno = 20;

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |     5 |   500 |     2   (0)| 00:00:01 |
|   1 |  VIEW                        |      |     5 |   500 |     2   (0)| 00:00:01 |
|   2 |   POLYMORPHIC TABLE FUNCTION | ECHO |       |       |            |          |
|   3 |    VIEW                      |      |     5 |   435 |     2   (0)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL        | EMP  |     5 |   435 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("EMP"."DEPTNO"=20)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

Explain Plan for Parallel Execution of Polymorphic Table

ALTER TABLE emp PARALLEL 2;

EXPLAIN PLAN FOR
SELECT *
FROM ECHO(emp, COLUMNS(ename, job))
WHERE deptno = 20;

-------------------------------------------------------------------------------------------
| Id  | Operation                      | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |     5 |   500 |     2   (0)| 00:00:01 |
|   1 |  PX COORDINATOR                |          |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)          | :TQ10000 |     5 |   500 |     2   (0)| 00:00:01 |
|   3 |    VIEW                        |          |     5 |   500 |     2   (0)| 00:00:01 |
|   4 |     POLYMORPHIC TABLE FUNCTION | ECHO     |       |       |            |          |
|   5 |      VIEW                      |          |     5 |   435 |     2   (0)| 00:00:01 |
|   6 |       PX BLOCK ITERATOR        |          |     5 |   435 |     2   (0)| 00:00:01 |
|*  7 |        TABLE ACCESS FULL       | EMP      |     5 |   435 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   7 - filter("EMP"."DEPTNO"=20)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

Explain Plan for Polymorphic Table - using IMCDTs

EXPLAIN PLAN FOR
WITH e AS (SELECT /*+ MATERIALIZE */ * FROM emp)
SELECT *
FROM ECHO(e, COLUMNS(ename, job))
WHERE deptno = 20;

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                           |    14 |  1400 |     4   (0)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION                |                           |       |       |            |          |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY) | SYS_TEMP_0FD9D6612_276EFC |       |       |            |          |
|   3 |    TABLE ACCESS FULL                      | EMP                       |    14 |  1218 |     2   (0)| 00:00:01 |
|   4 |   VIEW                                    |                           |    14 |  1400 |     2   (0)| 00:00:01 |
|   5 |    POLYMORPHIC TABLE FUNCTION             | ECHO                      |       |       |            |          |
|   6 |     VIEW                                  |                           |    14 |  1218 |     2   (0)| 00:00:01 |
|*  7 |      VIEW                                 |                           |    14 |  1218 |     2   (0)| 00:00:01 |
|   8 |       TABLE ACCESS FULL                   | SYS_TEMP_0FD9D6612_276EFC |    14 |  1218 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------------

Explain Plan for Polymorphic Table - Using Results Cache

EXPLAIN PLAN FOR
WITH e AS (SELECT /*+ result_cache */ * FROM echo(emp, COLUMNS(ename, job)))
SELECT *
FROM e
WHERE deptno = 20;

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                            |    14 |  1400 |     2   (0)| 00:00:01 |
|*  1 |  VIEW                          |                            |    14 |  1400 |     2   (0)| 00:00:01 |
|   2 |   RESULT CACHE                 | df9wucm9ak4br4mdpt7t2z1xv8 |       |       |            |          |
|   3 |    VIEW                        |                            |    14 |  1400 |     2   (0)| 00:00:01 |
|   4 |     POLYMORPHIC TABLE FUNCTION | ECHO                       |       |       |            |          |
|   5 |      VIEW                      |                            |    14 |  1218 |     2   (0)| 00:00:01 |
|   6 |       TABLE ACCESS FULL        | EMP                        |    14 |  1218 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("DEPTNO"=20)

Result Cache Information (identified by operation id):
------------------------------------------------------
   2 - column-count=10; dependencies=(SCOTT.EMP, SCOTT.ECHO_PACKAGE, SCOTT.ECHO_PACKAGE, SCOTT.ECHO); attributes=(dynamic); name="select /*+ result_cache */ * from ECHO(emp, columns(ename, job))"

Explain Plan for Polymorphic Table - Temporal Queries

EXPLAIN PLAN FOR
WITH e AS (SELECT * FROM emp AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' MINUTE))
SELECT *
FROM echo(e, COLUMNS(ename, job));

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |    82 |  8200 |     2   (0)| 00:00:01 |
|   1 |  VIEW                        |      |    82 |  8200 |     2   (0)| 00:00:01 |
|   2 |   POLYMORPHIC TABLE FUNCTION | ECHO |       |       |            |          |
|   3 |    VIEW                      |      |    82 |  7134 |     2   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL        | EMP  |    82 |  7134 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Summary
Key Benefits of Polymorphic Tables

• Simpler to design and build
• Provides complete reusability
• Simpler to make parallel enabled
• Simpler to deploy
• Moves more processing back inside DB

Approximate Top-N Filtering

Top-N Queries

• What are the top five products sold by week for the past year?
• Who are the top five earners by region?
• How many page views did the top five blog posts get last week?
• How much did my top fifty customers each spend last year?
• What components are failing most often by vehicle model?

[Figure: weblog data is read, sorted, then reduced to the Top-N. Sorting is time-consuming.]

Top-N approximate aggregation

• Approximate results for common top-N queries
  – How many approximate page views did the top five blog posts get last week?
  – What were the top 50 customers in each region and their approximate spending?
• Orders of magnitude faster processing with high accuracy (error rate < 0.5%)
• New approximate functions APPROX_COUNT(), APPROX_SUM(), APPROX_RANK()

Top 5 blogs with approximate hits:

SELECT blog_post, APPROX_COUNT(*)
FROM weblog
GROUP BY blog_post
HAVING APPROX_RANK(order by APPROX_COUNT(*) DESC) <= 5;

Top 50 customers per region with approximate spending:

SELECT region, customer_name,
       APPROX_RANK(PARTITION BY region ORDER BY APPROX_SUM(sales) DESC) appr_rank,
       APPROX_SUM(sales) appr_sales
FROM sales_transactions
GROUP BY region, customer_name
HAVING APPROX_RANK(...) <=50;

Approximate Top-N Queries

• Approx. functions:
  – APPROX_COUNT and APPROX_RANK
• High performance
  – The benefit is most significant for large data sets
• High accuracy
  – Maximum error reporting
• "Top-N Structure" is small and memory-resident
  – No disk sorts

[Figure: weblog data is read into the Top-N structure, which yields the Top-N without a sort step.]
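The "maximum error reporting" point can be sketched as follows. This is an illustrative query, not taken from the slides: it reuses the sales_transactions table from the earlier slide, spells out the APPROX_RANK expression in the HAVING clause, and assumes the optional 'MAX_ERROR' argument that the 18c SQL reference describes for APPROX_SUM and APPROX_COUNT.

```sql
-- Sketch only: sales_transactions(region, customer_name, sales) is assumed.
-- The 'MAX_ERROR' form is assumed to report the maximum error between the
-- approximate and the actual sum for each group.
SELECT region,
       customer_name,
       APPROX_SUM(sales)              AS appr_sales,
       APPROX_SUM(sales, 'MAX_ERROR') AS appr_sales_max_error,
       APPROX_RANK(PARTITION BY region
                   ORDER BY APPROX_SUM(sales) DESC) AS appr_rank
FROM   sales_transactions
GROUP  BY region, customer_name
HAVING APPROX_RANK(PARTITION BY region
                   ORDER BY APPROX_SUM(sales) DESC) <= 50;
```

The APPROX_RANK expression in the select list matches the one in the HAVING clause, so the reported rank is the same one used for filtering.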

Analytic View Enhancements

Enhancements to Analytic Views

• More calculations within Analytic Views:
  – Ranking and statistical functions
    • RANK_*, PERCENTILE_*, STATS_*, COVAR_*
  – Hierarchical expressions
    • HIER_DEPTH, HIER_LEVEL, HIER_MEMBER_NAME, etc.
• Broader schema support for Analytic Views:
  – Snowflake schemas; flat/denormalized fact tables (in addition to star schemas)
• More powerful SQL over Analytic Views:
  – Dynamic definition of calculations within SQL queries

MDX Query Language with Analytic Views

• Support for MDX (Multi-Dimensional Expression) query language
  – Initially certified for use by Microsoft Excel Pivot Tables
    • Support/certification for other applications to follow
  – Includes a multi-dimensional query cache
    • Similar to the SQL Result Cache

SELECT {[Measures].[Sales], [Measures].[Units_Sold]} ON COLUMNS,
       {[Time].[Calendar].[Year].&[2014], [Time].[Calendar].[Year].&[2015]} ON ROWS
FROM [Sales_View]
WHERE ([Customer].[Region].[North America],
       [Product].[Departments].[Category].&[Cameras])

[Figure label: Analytic View]

Private Temporary Tables

Global temporary tables
• Persistent, shared (global) table definition
• Temporary, private (session-based) data content
  – Data physically exists for a transaction or session
  – Session-private statistics

Private temporary tables (18.1)
• Temporary, private (session-based) table definition
  – Private table name and shape
• Temporary, private (session-based) data content
  – Session or transaction duration

[Figure: example tables ACC_TMP (global temporary) and ACC_PTMP (private temporary)]
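A minimal sketch of a private temporary table follows. The table name and columns are hypothetical; the ORA$PTT_ prefix is the default value of the PRIVATE_TEMP_TABLE_PREFIX initialization parameter, and the name must start with whatever prefix is configured.

```sql
-- Hypothetical table: both the definition and the data are private to
-- this session. ON COMMIT PRESERVE DEFINITION keeps the table for the
-- session; ON COMMIT DROP DEFINITION (the default) drops it at the end
-- of the transaction.
CREATE PRIVATE TEMPORARY TABLE ora$ptt_acc_ptmp (
  account_id NUMBER,
  balance    NUMBER(12,2)
) ON COMMIT PRESERVE DEFINITION;

INSERT INTO ora$ptt_acc_ptmp VALUES (1001, 250.00);

SELECT account_id, balance
FROM   ora$ptt_acc_ptmp;

-- Optional: the table would also disappear when the session ends.
DROP TABLE ora$ptt_acc_ptmp;
```

Another session can create its own table with the same name and a different shape, which is the "private table name and shape" point above.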

Inline External Tables

In-lining external tables

• External tables
  – first class object where row data resides outside database
  – maps external data to internal data (table columns)
  – access type:
    • oracle_loader (default)
    • oracle_datapump
    • oracle_hive
    • oracle_hdfs
  – default directory (directory object)
  – access parameters (opaque)
  – location list (data source)
  – reject limit

Inline external tables

• Inline external tables (inline XT)
  – don't have to create an external table
  – query with inline XT clause, similar to inline view
  – syntax similar to external table DDL, except for column list

Inline external tables
• Example

select myext.*
from external
(
  (deptno number(2), dname varchar2(12), loc varchar2(13))
  type ORACLE_LOADER
  default directory scott_def_dir1
  access parameters
  (
    records delimited by newline
    badfile scott_def_dir2:'deptXT1.bad'
    logfile scott_def_dir2:'deptXT2.log'
    fields terminated by ','
    missing field values are null
  )
  location ('tkexld01.dat')
  reject limit unlimited
) myext;

Inline external tables
• Example, cont.

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 674205990

---------------------------------------------
| Id  | Operation                   | Name  |
---------------------------------------------
|   0 | SELECT STATEMENT            |       |
|   1 |  EXTERNAL TABLE ACCESS FULL | MYEXT |
---------------------------------------------

Inline external tables
• Example, cont.

-- inline XT in WITH clause
with dext as (
  select * from external
  ((deptno char(2), dname char(14), loc char(13))
    type oracle_loader
    default directory scott_def_dir1
    access parameters (fields terminated by ',')
    location ('tkexld01.dat')
    reject limit unlimited
  )
)
select d.dname
from dext d
where d.deptno = 10
order by 1;

Data Bound Collations

Data-Bound Collation

• Precise and consistent application of linguistic comparison in queries
  – Adds COLLATE clause to declare column's collation to be used in all queries
  – COLLATE operator precisely controls collation in expressions
• Case- and accent-insensitive collations (e.g. BINARY_CI) simplify implementation of case-insensitive queries
• Feature is based on ISO/IEC SQL Standard and simplifies application migration from other databases supporting the COLLATE clause

Column-Based Data-Bound Collation

"… a named set of rules describing how to compare and match character strings to put them in a specified order …"

• Based on the ISO/IEC/ANSI SQL standard 9075:1999
• Character set is always declared at the database level
• Collation declared for a column
  – Does not determine the character set of data in the column
• Why is it important?
  – It simplifies application migration to the Oracle Database from a number of non-Oracle databases implementing collation in a similar way

Column-Based Data-Bound Collation

• Oracle supports around 100 linguistic collations
  – Parameterized by adding the suffix _CI or the suffix _AI
    • _CI - Specifies a case-insensitive sort
    • _AI - Specifies an accent-insensitive sort

CREATE TABLE products(
  product_code        VARCHAR2(20 BYTE)   COLLATE BINARY,
  product_name        VARCHAR2(100 BYTE)  COLLATE GENERIC_M_CI,
  product_category    VARCHAR2(5 BYTE)    COLLATE BINARY,
  product_description VARCHAR2(1000 BYTE) COLLATE BINARY_CI);

  – Product_name is to be compared using GENERIC_M_CI - case-insensitive version of generic multilingual collation
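As a hedged illustration of how the declared collations affect queries against the products table above, the two queries below contrast a column-level collation with the COLLATE operator. The literal values are made up for the example.

```sql
-- product_name is declared COLLATE GENERIC_M_CI, so this comparison is
-- case-insensitive without wrapping the column in UPPER() or LOWER().
SELECT product_code, product_name
FROM   products
WHERE  product_name = 'usb cable';

-- The COLLATE operator overrides the declared collation for a single
-- expression: product_code is declared BINARY but is compared
-- case-insensitively here, and the sort on product_name is forced to BINARY.
SELECT product_code, product_name
FROM   products
WHERE  product_code COLLATE BINARY_CI = 'ab-100'
ORDER  BY product_name COLLATE BINARY;
```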

Managing Large Strings
Overview of new VARCHAR2 features and new keywords in LISTAGG

Pre-12.2 LISTAGG

• Pre-12.2 syntax to manage lists was relatively simple:

LISTAGG(c.cust_first_name||' '||c.cust_last_name, ',') WITHIN GROUP (ORDER BY c.country_id) AS Customer

• Key issue is the overflow error:
  – ORA-01489: result of string concatenation is too long
• Solutions in 12.2
  – Increase VARCHAR2 size to support larger strings
  – Handle overflow errors: new syntax support to truncate the string, optionally display a count of truncated items, and set a truncation indicator

Support For Larger VARCHAR2 objects
Avoids overflowing LISTAGG function by increasing size of VARCHAR2 objects

• Introduced in 12c Release 1
  – VARCHAR2 objects support up to 32K

SQL> show parameter MAX_STRING_SIZE

NAME            TYPE   VALUE
--------------- ------ --------
max_string_size string STANDARD

ALTER SYSTEM SET max_string_size=extended SCOPE=SPFILE;

  – Need to run rdbms/admin/utl32k.sql script

New Keywords For Use With LISTAGG

• With 12.2 we have made it easier to manage lists:

LISTAGG(<measure_column>[, <delimiter>] . . .

  – What to do when an overflow occurs
    • ON OVERFLOW ERROR (default)
    • ON OVERFLOW TRUNCATE <truncation_indicator>
  – Control whether to show how many values were truncated
    • WITHOUT COUNT (default)
    • WITH COUNT

New Keywords For Use With LISTAGG: WITHOUT COUNT

SELECT g.country_region,
       LISTAGG(c.cust_first_name||' '||c.cust_last_name, ','
               ON OVERFLOW TRUNCATE WITHOUT COUNT)
         WITHIN GROUP (ORDER BY c.country_id) AS Customer
FROM customers c, countries g
WHERE g.country_id = c.country_id
GROUP BY country_region
ORDER BY country_region;

Keywords: ON OVERFLOW TRUNCATE WITHOUT COUNT

New Keywords For Use With LISTAGG: WITH COUNT

SELECT g.country_region,
       LISTAGG(c.cust_first_name||' '||c.cust_last_name, ','
               ON OVERFLOW TRUNCATE '***' WITH COUNT)
         WITHIN GROUP (ORDER BY c.country_id) AS Customer
FROM customers c, countries g
WHERE g.country_id = c.country_id
GROUP BY country_region
ORDER BY country_region;

Managing Data Conversion Errors

Pre-12.2 Data Conversion Errors
Parsing Data

• Issue: When parsing data input from a web form or loading data from external files, converting to a specific data type typically generates an error:

SQL Error: ORA-01722: invalid number

• Solutions:
  – Detect data conversion errors with the new VALIDATE_CONVERSION function
  – Enhancements to most conversion functions, such as TO_NUMBER, TO_DATE and CAST, to handle data conversion errors and replace them with user-provided default values

Detecting conversion errors - VALIDATE_CONVERSION
Identifying invalid data in the input streams

• Useful to detect if input value can be converted to destination type. Returns 1 if conversion is successful, otherwise returns 0
  • VALIDATE_CONVERSION('123a' AS NUMBER) --> returns 0
  • VALIDATE_CONVERSION('123' AS NUMBER) --> returns 1
• Can be efficiently used as filter to avoid bad data while importing foreign data sources, ETL processing

Two Methods for Dealing With Conversion Errors
Find row-column values that are causing errors: VALIDATE_CONVERSION

SELECT VALIDATE_CONVERSION(empno AS NUMBER)  AS is_empno,
       VALIDATE_CONVERSION(mgr AS NUMBER)    AS is_mgr,
       VALIDATE_CONVERSION(hiredate AS DATE) AS is_hiredate,
       VALIDATE_CONVERSION(sal AS NUMBER)    AS is_sal,
       VALIDATE_CONVERSION(comm AS NUMBER)   AS is_comm,
       VALIDATE_CONVERSION(deptno AS NUMBER) AS is_deptno
FROM staging_emp;
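The same function can also act as a row filter, which is the ETL use mentioned earlier. The sketch below assumes that staging_emp holds its values as character strings and that dates arrive in DD-MON-RR form; both are assumptions for illustration only.

```sql
-- Sketch only: list the staging rows whose SAL or HIREDATE text would
-- fail conversion. The optional second argument is a format model.
SELECT empno, sal, hiredate
FROM   staging_emp
WHERE  VALIDATE_CONVERSION(sal AS NUMBER) = 0
   OR  VALIDATE_CONVERSION(hiredate AS DATE, 'DD-MON-RR') = 0;
```

Reversing the tests to `= 1` and combining them with AND would instead select only the rows that are safe to load.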

Handling data conversion errors - TO_xxxx(), CAST()
Replacing incorrect or missing data with default values

• Pre-12.2: TO_NUMBER('123a') --> returns invalid number error (ORA-01722)

New 12.2 Features
• New syntax DEFAULT <default_value> ON CONVERSION ERROR
  – Replace conversion failure with user-defined default value
  – TO_NUMBER('123a' DEFAULT '123' ON CONVERSION ERROR) --> returns 123
• This new syntax can be used for TO_NUMBER, TO_DATE, TO_TIMESTAMP, TO_TIMESTAMP_TZ, TO_DMINTERVAL, TO_YMINTERVAL and CAST
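A short, hedged sketch of the DEFAULT ... ON CONVERSION ERROR clause outside of CAST is shown below. The input strings are invented, and the TO_DATE call assumes an English NLS_DATE_LANGUAGE so that the misspelled month name fails to parse.

```sql
-- Sketch only: each call falls back to its default when the first
-- argument cannot be converted. For TO_DATE the default is a string that
-- is converted with the same format model as the input.
SELECT TO_NUMBER('123a' DEFAULT -1 ON CONVERSION ERROR) AS n,
       TO_DATE('Febuary 15, 2016'
               DEFAULT 'January 01, 2016' ON CONVERSION ERROR,
               'Month dd, YYYY') AS d
FROM   dual;
```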

Using CAST and TO_XXXX FUNCTIONS
Using enhanced functions to remove incorrect data types and correct conversion errors

INSERT INTO emp
SELECT empno, ename, job,
       CAST(mgr AS NUMBER DEFAULT 9999 ON CONVERSION ERROR),
       CAST(hiredate AS DATE DEFAULT sysdate ON CONVERSION ERROR),
       CAST(sal AS NUMBER DEFAULT 0 ON CONVERSION ERROR),
       CAST(comm AS NUMBER DEFAULT null ON CONVERSION ERROR),
       CAST(deptno AS NUMBER DEFAULT 99 ON CONVERSION ERROR)
FROM staging_emp
WHERE VALIDATE_CONVERSION(empno AS NUMBER) = 1;

Approximate Statistics
Approximate query processing for faster analysis within big data lakes

Approximate Analysis

• PERCENTILE_CONT, PERCENTILE_DISC, MEDIAN
  – functions require sorting and can consume large amounts of resources
• New approximate SQL functions: APPROX_PERCENTILE, APPROX_MEDIAN
• Results can be 'DETERMINISTIC'
  – Different algorithms used for deterministic and non-deterministic result sets
  – If keyword is not present, it means deterministic results are not mandatory

Approximate Analysis

APPROX_PERCENTILE(pct_expr [DETERMINISTIC][,resulttype]) WITHIN GROUP (ORDER BY expr [ DESC | ASC ])

APPROX_MEDIAN(expr [DETERMINISTIC][,resulttype])

* pct_expr - evaluates to a numeric value between 0 and 1, because it is a percentile value
* resulttype - optional. If not used then function returns the value at the specified percentile. If specified then values are 'ERROR_RATE' or 'CONFIDENCE'
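The syntax above can be exercised with a query such as the following sketch against the SCOTT emp table. The 0.9 percentile and the column aliases are arbitrary choices for illustration.

```sql
-- Sketch only: 90th-percentile salary per department, plus the error
-- rate and confidence reported for the deterministic variant.
SELECT deptno,
       APPROX_PERCENTILE(0.9) WITHIN GROUP (ORDER BY sal) AS p90_sal,
       APPROX_PERCENTILE(0.9 DETERMINISTIC)
         WITHIN GROUP (ORDER BY sal) AS p90_sal_det,
       APPROX_PERCENTILE(0.9 DETERMINISTIC, 'ERROR_RATE')
         WITHIN GROUP (ORDER BY sal) AS p90_error_rate,
       APPROX_PERCENTILE(0.9 DETERMINISTIC, 'CONFIDENCE')
         WITHIN GROUP (ORDER BY sal) AS p90_confidence
FROM   emp
GROUP  BY deptno;
```

DETERMINISTIC is a keyword placed after the percentile expression, while the result type is a quoted string passed as a second argument.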

Accuracy and Performance

Results for accuracy
• Real world customer data set (manufacturing use case)
• Error range around 0.1 - 1.0%
• In general accuracy will not be a major concern

Performance Results
• Using TPC-H schema and workload
• 6-13x improvement
• Note that major savings come from:
  – Use of bounded memory regardless of the input size per group by key
  – Reduction in chance of spill to disk

Approximate Analysis
How to get more information about result set

• Queries will be able to report error rates and confidence levels as follows:

SELECT APPROX_MEDIAN(sal) AS median_sal,
       APPROX_MEDIAN(sal DETERMINISTIC) AS median_sal_det,
       APPROX_MEDIAN(sal, 'ERROR_RATE') AS error_rate,
       APPROX_MEDIAN(sal, 'CONFIDENCE') AS confidence
FROM emp;

Using approximate processing with zero code changes!
Converting Existing Queries To Return Approximate Answers

• Using following parameters to convert existing queries:
  – approx_for_count_distinct = TRUE/FALSE [DEFAULT]
    • Convert existing COUNT(DISTINCT …) functions to use approximate processing
  – approx_for_percentile = 'PERCENTILE_CONT/PERCENTILE_DISC/MEDIAN/ALL'
  – approx_percentile_deterministic = TRUE/FALSE [DEFAULT]
• Can be set at session and database level
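At session level, the parameters listed above could be set as in the sketch below. The value 'ALL' is one of the settings named on the slide; the final query is an ordinary exact-syntax query against the SCOTT emp table, included only to show that its text does not change.

```sql
-- Sketch only: switch this session to approximate processing.
ALTER SESSION SET approx_for_count_distinct = TRUE;
ALTER SESSION SET approx_for_percentile = 'ALL';
ALTER SESSION SET approx_percentile_deterministic = TRUE;

-- Unchanged application SQL. With the settings above, the database is
-- expected to evaluate COUNT(DISTINCT ...) and MEDIAN with their
-- approximate counterparts.
SELECT deptno,
       COUNT(DISTINCT job) AS jobs,
       MEDIAN(sal)         AS median_sal
FROM   emp
GROUP  BY deptno;
```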

Impact of PERCENTILE_CONT Processing

1. Query accesses 105M rows from source table NDV
2. SORT GROUP BY operation consumes temp and memory: 11GB + 1GB

Benefits of APPROX_PERCENTILE Processing

1. Query accesses 105M rows from source table NDV
2. SORT GROUP BY operation consumes ZERO temp and 830KB memory

Benefits of APPROX_PERCENTILE: 13X Faster

Approximate Aggregations

Why create a reusable approximate result set?

• Requirement: Support fast access to approximate answers for wide range of GROUP BY queries
• Objective: Avoid revisiting and re-scanning base tables
• Use cases for storing reusable approximate aggregations
  – CTAS as part of ETL process for staging data
  – CTAS as part of larger analytical process
    • pushing data into dashboards and supporting drill-down click-through analysis
  – Materialized views for query rewrite of approximate queries
  – Materialized views for transparent query rewrite to approximate queries

Page 106: SQL Analytics for Analysis, Reporting and Modeling · • Part of ANSI 2016 • Embed sophisticated algorithms in SQL –Hides implementation of algorithm –Leverage powerful, dynamic

Copyright©2017,Oracleand/oritsaffiliates.Allrightsreserved.|

Building Reusable Approximate Result Sets

COUNTRY  STATE  PRODUCT  …
US       CA     A
US       CA     B
...
US       IL     A
US       IL     C
US       IL     D
US       TX     A
US       CO     D
US       CO     F
US       CO     H
US       NY     A
US       NY     A
US       NY     G

… APPROX_COUNT_DISTINCT_DETAIL(product) AS ac_prod … GROUP BY country, state

COUNTRY  STATE  AC_PROD (INTERNAL)
US       CA     BLOB
US       IL     BLOB
US       TX     BLOB
US       CO     BLOB
US       NY     BLOB

Builds summary table containing results for all dimensions in GROUP BY clause
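
As a minimal sketch, a summary table like the one above could be built with a CTAS statement. The table names sales_txn and sales_acd_state are hypothetical (the slide elides them); only APPROX_COUNT_DISTINCT_DETAIL and the GROUP BY columns come from the slide.

```sql
-- Assumption: sales_txn(country, state, product, ...) is a hypothetical base table.
-- Stores one mergeable approximate-distinct-count detail (a BLOB) per (country, state).
CREATE TABLE sales_acd_state AS
SELECT country,
       state,
       APPROX_COUNT_DISTINCT_DETAIL(product) AS ac_prod
FROM   sales_txn
GROUP  BY country, state;
```

The detail column is not human-readable. It is meant to be converted or merged later by the companion functions.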


Creating a STATE level approximation

… APPROX_COUNT_DISTINCT_DETAIL(product) AS ac_prod … GROUP BY country, state

COUNTRY  STATE  AC_PROD (INTERNAL)
US       CA     BLOB
US       IL     BLOB
US       TX     BLOB
US       CO     BLOB
US       NY     BLOB

… TO_APPROX_COUNT_DISTINCT(ac_prod) … WHERE state = 'CA'

COUNTRY  STATE  AC_PROD
US       CA     2

Returns results from the specified aggregated results table
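
Spelled out in full, the state-level lookup might look like the sketch below. The summary table name sales_acd_state is an assumption; the slide shows only the function call and the predicate.

```sql
-- Assumption: sales_acd_state is a hypothetical summary table whose ac_prod column
-- was produced by APPROX_COUNT_DISTINCT_DETAIL(product) ... GROUP BY country, state.
SELECT country,
       state,
       TO_APPROX_COUNT_DISTINCT(ac_prod) AS ac_prod   -- convert the detail to a number
FROM   sales_acd_state
WHERE  state = 'CA';
```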


Creating a STATE level approximation

… APPROX_COUNT_DISTINCT_DETAIL(product) AS ac_prod … GROUP BY country, state

COUNTRY  STATE  AC_PROD (INTERNAL)
US       CA     BLOB
US       IL     BLOB
US       TX     BLOB
US       CO     BLOB
US       NY     BLOB

… APPROX_COUNT_DISTINCT_AGG(ac_prod) … GROUP BY country

COUNTRY  AC_PROD
US       BLOB
CANADA   BLOB
MEXICO   BLOB
BRAZIL   BLOB

Builds higher level summary table based on results from table derived from _DETAIL function


Building Reusable Approximate Result Sets

… APPROX_COUNT_DISTINCT_AGG(ac_prod) … GROUP BY country

COUNTRY  AC_PROD
US       BLOB
CANADA   BLOB
MEXICO   BLOB
BRAZIL   BLOB

… TO_APPROX_COUNT_DISTINCT(ac_prod) …

COUNTRY  AC_PROD
US       84


Returning COUNTRY data from STATE level approximation

… APPROX_COUNT_DISTINCT_DETAIL(product) AS ac_prod … GROUP BY country, state

COUNTRY  STATE  AC_PROD (INTERNAL)
US       CA     BLOB
US       IL     BLOB
US       TX     BLOB
US       CO     BLOB
US       NY     BLOB

… TO_APPROX_COUNT_DISTINCT(APPROX_COUNT_DISTINCT_AGG(ac_prod)) … GROUP BY country

COUNTRY  AC_PROD
US       84
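
A sketch of the complete roll-up query follows. The state-level table name sales_acd_state is hypothetical; the nesting of the two functions is the one shown on the slide.

```sql
-- Assumption: sales_acd_state holds one APPROX_COUNT_DISTINCT_DETAIL value (ac_prod)
-- per (country, state). The base table is not touched by this query.
SELECT country,
       TO_APPROX_COUNT_DISTINCT(
         APPROX_COUNT_DISTINCT_AGG(ac_prod)   -- merge the state-level details
       ) AS ac_prod                           -- then convert the merged detail to a number
FROM   sales_acd_state
GROUP  BY country;
```

Merging details rather than summing state-level counts matters because the same product can be sold in several states and should be counted once per country.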


Query Rewrite with Approximate MVs


Building an MV Containing Approximate Results

CREATE MATERIALIZED VIEW pctl_mview
ENABLE QUERY REWRITE AS
SELECT state, county, APPROX_PERCENTILE_DETAIL(volume) AS pctl_detail
FROM sales_fact
GROUP BY state, county;

Builds materialized view containing results for all dimensions in GROUP BY clause


Queries Rewrite to use an Approximate MV

SELECT state, county, APPROX_PERCENTILE(0.1) WITHIN GROUP (ORDER BY volume)
FROM sales_fact
WHERE state = 'CA'
GROUP BY state, county;

Query rewrites to use materialized view PCTL_MVIEW containing approximate results
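
For comparison, the stored details can also be read directly. The sketch below assumes the materialized view has one pctl_detail value per (state, county), as in the MV definition on the previous slide, and is only roughly what a rewritten query would compute. The column alias p10_volume is an invention for this example.

```sql
-- County level: convert each stored detail to the 10th percentile.
SELECT state,
       county,
       TO_APPROX_PERCENTILE(pctl_detail, 0.1, 'NUMBER') AS p10_volume
FROM   pctl_mview
WHERE  state = 'CA';

-- State level: merge the county details first, then convert.
SELECT state,
       TO_APPROX_PERCENTILE(APPROX_PERCENTILE_AGG(pctl_detail), 0.1, 'NUMBER') AS p10_volume
FROM   pctl_mview
GROUP  BY state;
```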


What about non-approximate queries?

Exciting feature: Optimizer can rewrite exact functions to use MV

ALTER SESSION SET approx_for_percentile = 'all';

SELECT state, county, MEDIAN(volume)
FROM sales_fact
WHERE state = 'CA'
GROUP BY state, county;

Query results returned from materialized view PCTL_MVIEW containing approximate results


In-Database Dimensional Modeling


Review: Analytic Views in 12.2
Enhanced Analysis and Simplified Access

• Organizes data into a user and application friendly business model
– Intuitive for the end user

• Defined with SQL DDL
– Includes hierarchical expressions and calculated measures
– Easy to define, supported by SQL Developer

• Easily queried with simple SQL SELECT
– Smart Analytic View (containing hierarchies and calculations) = Simple Query


Review: Analytic Views in 12.2
Embedded Calculations

• Define centrally in the Database and access with any application
– Single version of the truth

• Easily create new measures
– Simplified syntax based on business model
– Includes dimensional and hierarchical functions

Sales Year to Date:
sales_ytd AS (SUM(sales) OVER (HIERARCHY time_hierarchy BETWEEN UNBOUNDED PRECEDING AND 0 FOLLOWING WITHIN ANCESTOR AT LEVEL year))

Product Share of Parent:
share_product_parent_sales AS (SHARE_OF(sales HIERARCHY product_hierarchy PARENT))


Review: Analytic Views in 12.2
Smart Views and Simple Queries

[Figure: calendar grid (time hierarchy) and a Total > Region > Country hierarchy diagram]

SELECT time_hierarchy.member_name AS time,
       product_hierarchy.member_name AS product,
       customer_hierarchy.member_name AS customer,
       sales AS sales,
       sales_ytd_pct_chg_yr_ago AS sales_ytd_pct_chg,
       share_product_parent_sales AS prod_share_sales
FROM sales_analysis HIERARCHIES (time_hierarchy, product_hierarchy, customer_hierarchy)
WHERE time_hierarchy.level_name = 'YEAR'
  AND product_hierarchy.level_name = 'DEPARTMENT'
  AND customer_hierarchy.level_name = 'REGION';

Callouts on the query: Calculations; Aggregate data at Year, Department and Region


External Tables
Enhancements in Database 12c Release 2
MODIFY clause
Partitioned


External Tables

• Key issues:
– Definition of external table is fixed at creation time
– Need ability to define table once and use it multiple times, to access different external files
– Apply same table definition to different inputs

• Solution:
– Added EXTERNAL MODIFY clause
– Ease of use enhancement for using external tables
– Clause allows external table properties to be overridden at query time
– Properties: DEFAULT DIRECTORY, certain ACCESS PARAMETERS, LOCATION and REJECT LIMIT


External Tables – existing functionality

• Example: LOCATION specification is fixed

CREATE TABLE SALES_TRANSACTIONS_EXT (
  PROD_ID NUMBER,
  CUST_ID NUMBER,
  PROMO_ID NUMBER)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY data_file_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS (PROD_ID (1-6) CHAR,
            CUST_ID (7-11) CHAR,
            PROMO_ID (12-15) CHAR))
  LOCATION ('sh_sales1.dat'))
REJECT LIMIT UNLIMITED


Override settings with EXTERNAL MODIFY clause

• Example: Override LOCATION specification (continued)

SELECT * FROM SALES_TRANSACTIONS_EXT EXTERNAL MODIFY (LOCATION ('sh_sales2.dat'))

• NOTE: LOCATION and REJECT LIMIT specifications can be specified as bind values in the EXTERNAL MODIFY clause
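
The sketch below illustrates overriding more than one property and using a bind value, as the note describes. The file name sh_sales3.dat, the reject limit of 100 and the SQL*Plus bind variable fname are assumptions made for illustration, and the exact bind syntax should be checked against the SQL reference for your release.

```sql
-- Same table definition, different file and a stricter reject limit for this query only.
SELECT COUNT(*)
FROM   sales_transactions_ext
       EXTERNAL MODIFY (LOCATION ('sh_sales2.dat') REJECT LIMIT 100);

-- SQL*Plus-style bind variable supplying the file name (hypothetical file).
VARIABLE fname VARCHAR2(100)
EXEC :fname := 'sh_sales3.dat'

SELECT COUNT(*)
FROM   sales_transactions_ext
       EXTERNAL MODIFY (LOCATION (:fname));
```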


Partitioned External Tables


Partitioned External Table

• Similar to partitioned tables stored in Oracle database
• Source files can be stored on file system, Apache Hive storage, or HDFS

• Benefits:
– Fast query performance
– Enhanced data maintenance
– Support static and dynamic (bloom, nested loop, subquery) partition pruning
– Support full and partial partition-wise join

• Partitioning strategies supported:

Primary \ Secondary   Range   List   Auto-List   Interval
Range                 Y       Y      N           N
List                  Y       Y      N           N
Interval              N       N      N           N


Keywords For Partitioned External Table

• Partitioning strategy determined by PARTITION clause
– partition by range (c1)

• Partition templates define organization for each partition
– partition p1 values less than (7655) location('./tkexpetu_p1a.dat', './tkexpetu_p1b.dat'),
– partition p2 values less than (7845) default directory def_dir2 location('./tkexpetu_p2.dat'),
– partition p3 values less than (7935) location(def_dir3:'./tkexpetu_p3*.dat')


Example Partitioned External Table

create table salesrp_xt_hdfs (c1 number, c2 number)
organization external (
  type oracle_hdfs
  default directory def_dir1
  access parameters (
    com.oracle.bigdata.cluster=hadoop_cl_1
    com.oracle.bigdata.fields: (c1 int, c2 int)
    com.oracle.bigdata.rowformat=delimited fields terminated by ','))
reject limit unlimited
partition by range (c1) (
  partition p1 values less than (7655)
    location('./tkexpetu_p1a.dat', './tkexpetu_p1b.dat'),
  partition p2 values less than (7845)
    default directory def_dir2 location('./tkexpetu_p2.dat'),
  partition p3 values less than (7935)
    location(def_dir3:'./tkexpetu_p3*.dat'));


Explain Plan For Accessing Partitioned External Table

select * from salesrp_xt_hdfs partition (p2) order by c2;

-------------------------------------------------------------------------
| Id | Operation                     | Name            | Pstart | Pstop |
-------------------------------------------------------------------------
|  0 | SELECT STATEMENT              |                 |        |       |
|  1 |  SORT ORDER BY                |                 |        |       |
|  2 |   PARTITION RANGE SINGLE      |                 |      2 |     2 |
|  3 |    EXTERNAL TABLE ACCESS FULL | SALESRP_XT_HDFS |      2 |     2 |
-------------------------------------------------------------------------

select AVG(s.L_PARTKEY) from scott.emp e, salesrp_xt s where s.l_orderkey = e.sal and e.job = 'SALESMAN';

-------------------------------------------------------------------------
| Id | Operation                     | Name            | Pstart | Pstop |
-------------------------------------------------------------------------
|  0 | SELECT STATEMENT              |                 |        |       |


Creating External Tables for Big Data


Metadata: Extend Oracle External Tables

• New types of external tables
– ORACLE_HIVE (leverage Hive metadata)
– ORACLE_HDFS (specify metadata)

• Access parameters used to describe how to identify sources and process data on the Hadoop cluster

CREATE TABLE movielog (
  click VARCHAR2(4000))
ORGANIZATION EXTERNAL (
  TYPE ORACLE_HIVE
  DEFAULT DIRECTORY DEFAULT_DIR
  ACCESS PARAMETERS
  (
    com.oracle.bigdata.tablename logs
    com.oracle.bigdata.cluster mycluster
  ))
REJECT LIMIT UNLIMITED;


Access Parameters: HDFS Example

CREATE TABLE WEB_SALES_CSV (
  WS_SOLD_DATE_SK NUMBER,
  WS_SOLD_TIME_SK NUMBER,
  WS_ITEM_SK NUMBER)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_HDFS
  DEFAULT DIRECTORY DEFAULT_DIR
  ACCESS PARAMETERS (
    com.oracle.bigdata.cluster=orabig
    com.oracle.bigdata.fileformat=TEXTFILE
    com.oracle.bigdata.rowformat: DELIMITED FIELDS TERMINATED BY '|'
    com.oracle.bigdata.erroropt: {"action":"replace", "value":"-1"}
  )
  LOCATION ('/data/tpcds/benchmarks/bigbench/data/web_sales'))
REJECT LIMIT UNLIMITED;

• Access Parameters describe source data and processing rules

• Schema-on-Read


Access Parameters: ORACLE_HIVE

CREATE TABLE WEB_SALES_CSV (
  WS_SOLD_DATE_SK NUMBER,
  WS_SOLD_TIME_SK NUMBER,
  WS_ITEM_SK NUMBER)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_HIVE
  DEFAULT DIRECTORY DEFAULT_DIR
  ACCESS PARAMETERS (
    com.oracle.bigdata.cluster=orabig
    com.oracle.bigdata.tablename: csv.web_sales
    com.oracle.bigdata.erroropt: {"action":"replace", "value":"-1"}
    com.oracle.bigdata.datamode=automatic
  ))
REJECT LIMIT UNLIMITED;

• Access Parameters refer to metadata description in Hive

• Add processing rules


Use ORACLE_HIVE When Possible

• Oracle Database query execution accesses Hive metadata at describe time
– Changes to underlying Hive access parameters will not impact Oracle table (one exception … column list)

• Metadata an enabler for performance optimizations
– Partition pruning and predicate pushdown into intelligent sources

• Utilize tooling for simplified table definitions
– SQL Developer and DBMS_HADOOP packages


Viewing Hive Metadata from Oracle Database

• ALL_HIVE_DATABASES, ALL_HIVE_TABLES, ALL_HIVE_COLUMNS

[Figure: ALL_HIVE_COLUMNS]
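
A query against ALL_HIVE_COLUMNS might look like the sketch below. The column names are given from memory of the Big Data SQL dictionary views and should be verified with DESCRIBE before use; the cluster, database and table values are borrowed from the DBMS_HADOOP example later in the deck.

```sql
-- Assumption: the view exposes these columns; confirm with DESC all_hive_columns.
SELECT column_name,
       hive_column_type,
       oracle_column_type
FROM   all_hive_columns
WHERE  cluster_id    = 'orabig'
AND    database_name = 'parq'
AND    table_name    = 'store_sales';
```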


Creating Tables
SQL Developer with Hive JDBC

1. Right-click on Hive table. Use in Oracle Big Data SQL
2. Review generated columns. Update as needed - focusing on data types and precision
3. Add optional access parameters. Automatically generate table or save DDL.

See: https://blogs.oracle.com/datawarehousing/entry/oracle_sql_developer_data_modeler


Creating Tables
DBMS_HADOOP Package

• PL/SQL Package used to create table or generate DDL

• Combine with ALL_HIVE* dictionary views to automate creation of many tables

• Consider optimizing data type conversions - especially precision
– string -> varchar2(?)

declare
  DDLout VARCHAR2(4000);
begin
  DDLout := null;
  dbms_hadoop.create_extddl_for_hive(
    CLUSTER_ID      => 'orabig',
    DB_NAME         => 'parq',
    HIVE_TABLE_NAME => 'store_sales',
    HIVE_PARTITION  => FALSE,
    TABLE_NAME      => 'store_sales_orcl',
    PERFORM_DDL     => FALSE,
    TEXT_OF_DDL     => DDLout);
  dbms_output.put_line(DDLout);
end;
/


Schema Modeling Features
Invisible columns, default values, indexing multiple columns and identity columns


Overview of Schema Modeling Enhancements

• Invisible Columns
• DEFAULT VALUE enhancements
– Metadata-Only Default column values for NULL’able columns
– Default values for columns on explicit NULL insertion
– Default values for columns based on sequences
• Multiple Indexes on the same columns
• IDENTITY columns


Invisible Columns
Examples

• Create a simple table with invisible column

CREATE TABLE hr.admin_emp (
  empno NUMBER(5),
  name VARCHAR2(30) NOT NULL,
  status VARCHAR2(10) INVISIBLE)
TABLESPACE admin_tbs
STORAGE (INITIAL 50K);

• Modify to make the status column visible:

ALTER TABLE hr.admin_emp MODIFY (status VISIBLE);
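
The effect of an invisible column on everyday statements can be sketched as follows. The table emp_demo and its rows are hypothetical and exist only for this illustration.

```sql
-- Hypothetical demo table with one invisible column.
CREATE TABLE emp_demo (
  empno  NUMBER(5),
  name   VARCHAR2(30) NOT NULL,
  status VARCHAR2(10) INVISIBLE);

-- An INSERT without a column list addresses only the visible columns.
INSERT INTO emp_demo VALUES (1, 'Ann');

-- An invisible column can still be set when it is named explicitly.
INSERT INTO emp_demo (empno, name, status) VALUES (2, 'Bob', 'ACTIVE');

-- SELECT * returns EMPNO and NAME only.
SELECT * FROM emp_demo;

-- Naming the column makes it appear in the result.
SELECT empno, name, status FROM emp_demo;
```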


Invisible Columns – Usage in Views

• Invisible columns at the view level are supported.
• View columns will be visible unless explicitly overridden by the ‘invisible’ syntax, irrespective of the visibility of the table column.
• Invisible columns at the editioning view level are supported.
• Examples:
– CREATE OR REPLACE VIEW emp_v (empno, ename, status INVISIBLE) AS SELECT empno, ename, status FROM emp;


DEFAULT VALUE Enhancements
Metadata-only DEFAULT Column Values For NULL’able Columns

• Current scenario when adding a NULL’able column with a default value
– Adds column to metadata
– Run as serial recursive SQL to populate existing rows with default value
– Holds an Exclusive DML and KGL lock during the operation

• New in Oracle Database 12c:
– Make the entire DDL a metadata only operation
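
As an illustration, the statement below adds a nullable column with a default. The table sales_fact is reused from the earlier MV example and the column channel is hypothetical. Under the 12c behavior described above, the expectation is that existing rows are not physically updated, yet queries still see the default for them.

```sql
-- Hypothetical new column on an existing large table.
ALTER TABLE sales_fact ADD (channel VARCHAR2(10) DEFAULT 'DIRECT');

-- Rows that existed before the ALTER report the default value.
SELECT channel, COUNT(*) FROM sales_fact GROUP BY channel;
```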


DEFAULT VALUE Enhancements
Column Defaulting for specific NULL insertion

• Allow SQL column defaulting when user specifies a NULL value on a NOT NULL column in an insert statement

• Example:

CREATE TABLE test (a1 NUMBER DEFAULT ON NULL 10 NOT NULL, a2 VARCHAR2(10));

INSERT INTO test (a1, a2) VALUES (NULL, 'abc');

SELECT a1, a2 FROM test;

A1       A2
-------- -------
10       abc


DEFAULT VALUE Enhancements
Column Defaulting Using A Sequence

• Allow sequence [CURRVAL|NEXTVAL] to be used in SQL default expression

• Example:

CREATE SEQUENCE s1 START WITH 1;

CREATE TABLE test (a1 NUMBER DEFAULT s1.NEXTVAL, a2 VARCHAR2(10));

INSERT INTO test (a2) VALUES ('abc');

INSERT INTO test (a2) VALUES ('xyz');

SELECT * FROM test;

A1     A2
------ ------------
1      abc
2      xyz


Examples of Multiple Indexes On Same Set Of Columns

• Create table and index

CREATE TABLE test (c1 INT, c2 INT);
CREATE INDEX test_idx ON test (c1, c2);

• Create bitmap index on the same set of columns as TEST_IDX:

CREATE BITMAP INDEX test_idx2 ON test (c1, c2) INVISIBLE;

• “Activate” new index

ALTER INDEX test_idx INVISIBLE;
ALTER INDEX test_idx2 VISIBLE;


Multiple Indexes On Same Set Of Columns
Usage Constraints

• Only one visible index on the same set of columns at any point of time
• To create a visible index, existing indexes on the same set of columns need to be invisible
• ALTER INDEX … VISIBLE will only be allowed if all other indexes on the same set of columns are invisible


Identity Columns
Concept

• Identity columns enable a simple way of creating a unique identifier as part of a schema model
– Part of ANSI Standard

• Identity columns will default to a monotonically increasing integer on insert DML from a sequence generator, whose options are specified by the identity syntax
– Note: uniqueness is not enforced as part of the IDENTITY definition


Example of Identity Columns

• Create simple table with identity column
– Generated by default, start with 100:

CREATE TABLE test (C1 NUMBER GENERATED AS IDENTITY (START WITH 100));

• Add identity column, increment by 10
– Existing rows will be updated with a value from sequence generator, but order is not deterministic

ALTER TABLE test ADD (C1 NUMBER GENERATED AS IDENTITY (INCREMENT BY 10));


Identity Columns
Example, cont.

• Create simple table with default identity column at NULL insertion

CREATE TABLE test (C1 NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY, C2 VARCHAR2(10));

• Identity column generated by default starts with 1

INSERT INTO test (C2) VALUES ('abc');
INSERT INTO test (C1, C2) VALUES (NULL, 'xyz');
SELECT c1, c2 FROM test;

C1     C2
------ ------------
1      abc
2      xyz
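
Because uniqueness is not enforced by the IDENTITY definition itself, a key constraint is usually declared alongside it. The sketch below uses a hypothetical table orders_demo and combines GENERATED ALWAYS with a primary key.

```sql
-- Hypothetical table: identity supplies the values, PRIMARY KEY enforces uniqueness.
CREATE TABLE orders_demo (
  order_id NUMBER GENERATED ALWAYS AS IDENTITY (START WITH 100 INCREMENT BY 10)
           PRIMARY KEY,
  note     VARCHAR2(30));

-- order_id is omitted and filled in by the sequence generator.
INSERT INTO orders_demo (note) VALUES ('first');
INSERT INTO orders_demo (note) VALUES ('second');

-- With GENERATED ALWAYS, supplying a value explicitly is rejected, for example:
-- INSERT INTO orders_demo (order_id, note) VALUES (1, 'manual');

SELECT order_id, note FROM orders_demo ORDER BY order_id;
```

GENERATED BY DEFAULT, as on the slide above, would accept the explicit value instead, which is where a primary key becomes the only protection against duplicates.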


Storage Optimizations
Attribute Clustering and Zone Maps


Attribute Clustering
Concepts and Benefits

• Orders data so that it is in close proximity based on selected columns values: “attributes”

• Attributes can be from a single table or multiple tables
– e.g. from fact and dimension tables

• Significant IO pruning when used with zone maps

• Reduced block IO for table lookups in index range scans

• Queries that sort and aggregate can benefit from pre-ordered data

• Enable improved compression ratios
– Ordered data is likely to compress more than unordered data


Attribute Clustering for Zone Maps

ALTER TABLE sales ADD CLUSTERING BY LINEAR ORDER (category);

ALTER TABLE sales MOVE;

Ordered rows containing category values BOYS, GIRLS and MEN.

Zone maps catalogue regions of rows, or zones, that contain particular column value ranges.

By default, each zone is up to 1024 blocks.

For example, we only need to scan this zone if we are searching for category “GIRLS”. We can skip all other zones.


Attribute Clustering Basics

• Two types of attribute clustering
– LINEAR ORDER BY
  • Classical ordering
– INTERLEAVED ORDER BY
  • Multi-dimensional ordering

• Simple attribute clustering on a single table

• Join attribute clustering
– Cluster on attributes derived through join of multiple tables
  • Up to four tables
  • Non-duplicating join (PK or UK on joined table is required)
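
Join attribute clustering could be declared roughly as in the sketch below. The tables products_demo and sales_ac_demo and their columns are hypothetical, and the clause layout follows the form of the CLUSTERING clause as best recalled from the Data Warehousing Guide, so it should be checked against the documentation for your release.

```sql
-- Hypothetical dimension table; the primary key makes the join non-duplicating.
CREATE TABLE products_demo (
  prod_id          NUMBER PRIMARY KEY,
  prod_category    VARCHAR2(30),
  prod_subcategory VARCHAR2(30));

-- Hypothetical fact table, clustered by attributes of the joined dimension.
CREATE TABLE sales_ac_demo (
  prod_id NUMBER NOT NULL,
  country VARCHAR2(2),
  amount  NUMBER)
CLUSTERING
  sales_ac_demo JOIN products_demo
    ON (sales_ac_demo.prod_id = products_demo.prod_id)
  BY LINEAR ORDER (products_demo.prod_category, products_demo.prod_subcategory);
```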


Attribute Clustering With Zone Maps
Example: LINEAR ORDER

• CLUSTERING BY LINEAR ORDER (category, country)

• Zone map benefits are most significant with ordered data

Pruning with:

SELECT .. FROM table WHERE category = 'BOYS';

SELECT .. FROM table WHERE category = 'BOYS' AND country = 'US';


Attribute Clustering With Zone Maps
Example: INTERLEAVED ORDER

• CLUSTERING BY INTERLEAVED ORDER (category, country)

• Zone map benefits are most significant with ordered data

Pruning with:

SELECT .. FROM table WHERE category = 'BOYS';

SELECT .. FROM table WHERE country = 'US';

SELECT .. FROM table WHERE category = 'BOYS' AND country = 'US';


Basics of Zone Maps

• Independent access structure built for a table
– Implemented using a type of materialized view
– For partitioned and non-partitioned tables

• One zone map per table
– Zone map on partitioned table includes aggregate entry per [sub]partition

• Used transparently
– No need to change or hint queries

• Implicit or explicit creation and column selection
– Through Attribute Clustering: CREATE TABLE … CLUSTERING
– CREATE MATERIALIZED ZONEMAP … AS SELECT …
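
The two creation routes might be written as in the sketch below. The names sales_zm_demo and sales_zmap are hypothetical, and the sales table with category and country columns is assumed from the earlier slides. Zone maps are only available on certain Oracle platforms, so these statements may be rejected elsewhere.

```sql
-- Implicit: the zone map is created together with attribute clustering.
CREATE TABLE sales_zm_demo (
  category VARCHAR2(10),
  country  VARCHAR2(2),
  amount   NUMBER)
CLUSTERING BY LINEAR ORDER (category, country)
WITH MATERIALIZED ZONEMAP;

-- Explicit: a basic zone map on chosen columns of an existing table.
CREATE MATERIALIZED ZONEMAP sales_zmap ON sales (category, country);
```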


Zone Maps
Staleness

• DML and partition operations can cause zone maps to become fully or partially stale
– Direct path insert does not make zone maps stale

• Single table ‘local’ zone maps
– Update and insert marks impacted zones as stale (and any aggregated partition entry)
– No impact on zone maps for delete

• Joined zone map
– DML on fact table equivalent behavior to single table zone map
– DML on dimension table makes dependent zone maps fully stale


Refreshing Zone Maps

• Incremental and full refresh, as required by DML
– Zone map refresh does not require a materialized view log
  • Only stale zones are scanned to refresh the MV
– For joined zone map
  • DML on fact table: incremental refresh
  • DML on dimension table: full refresh

• Zone map maintenance through
– DBMS_MVIEW.REFRESH()
– ALTER MATERIALIZED ZONEMAP <xx> REBUILD;
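
Both maintenance routes are sketched below for a zone map named sales_zmap. The name is hypothetical, standing in for the slide's <xx> placeholder.

```sql
-- Rebuild the zone map with DDL.
ALTER MATERIALIZED ZONEMAP sales_zmap REBUILD;

-- Or refresh it through the materialized view refresh API.
BEGIN
  DBMS_MVIEW.REFRESH('SALES_ZMAP');
END;
/
```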


Zone Maps with Attribute Clustering
Combined Benefits

Attribute Clustering: Orders data so that columns values are stored together on disk

Zone maps: Stores min/max of specified columns per zone. Used to filter un-needed data during query execution

• Improved query performance and concurrency
– Reduced physical data access
– Significant IO reduction for highly selective operations

• Optimized space utilization
– Less need for indexes
– Improved compression ratios through data clustering

• Full application transparency
– Any application will benefit


Attribute Clustering with In-Memory Column Store
Snowflake Schema Benchmark

• Attribute clusters alone - no zone maps
• With attribute clustering versus without (baseline)
• Warehousing benchmark run on snowflake-schema
• In-Memory Column Store
• Result with attribute clustering:
– Overall, 1.4X response time improvement over baseline
– Improved sort and aggregation performance
  • Pre-ordered rows can require less sorting


Zone Maps With Attribute Clustering
Star Schema Benchmark

• Overall, 2.6X end-to-end elapsed time improvement
– Comparing with and without zone map and attribute clustering


Zone Maps and Partitioning

• Zone maps can prune partitions for columns that are not included in the partition (or subpartition) key

SALES
Partition key: ORDER_DATE    JAN  FEB  MAR  APR
Zone map: SHIP_DATE          JAN  FEB  MAR

Zone map column SHIP_DATE correlates with partition key ORDER_DATE


Zone Maps and Partitioning, cont.

Query predicate: WHERE ship_date = TO_DATE('10-JAN-2011')

SALES
Partition key: ORDER_DATE    JAN  FEB  MAR  APR
Zone map: SHIP_DATE          JAN  FEB  MAR

MAR and APR partitions are pruned
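Pruning on a column that is not the partition key can be sketched in Python as follows. The order data, the shipping delay rule and all names are made up for illustration; the only idea taken from the slide is that a min/max of SHIP_DATE kept per ORDER_DATE partition lets a predicate on SHIP_DATE skip whole partitions.

```python
from datetime import date, timedelta

MONTHS = {1: "JAN", 2: "FEB", 3: "MAR", 4: "APR"}

# Hypothetical SALES rows: one order per day, shipped 0 to 20 days later,
# so SHIP_DATE correlates with ORDER_DATE.
orders = []
d = date(2011, 1, 1)
while d <= date(2011, 4, 30):
    delay = d.day % 21
    orders.append({"order_date": d, "ship_date": d + timedelta(days=delay)})
    d += timedelta(days=1)

# Partition by month of ORDER_DATE (the partition key).
partitions = {}
for row in orders:
    partitions.setdefault(MONTHS[row["order_date"].month], []).append(row)

# Zone map kept at partition level: min/max SHIP_DATE per partition.
zone_map = {
    name: (min(r["ship_date"] for r in rows), max(r["ship_date"] for r in rows))
    for name, rows in partitions.items()
}

def partitions_to_scan(ship_date):
    """Partitions whose SHIP_DATE range could contain the given date."""
    return [n for n, (lo, hi) in zone_map.items() if lo <= ship_date <= hi]

scanned = partitions_to_scan(date(2011, 1, 10))
pruned = [n for n in partitions if n not in scanned]
print("scan:", scanned, "pruned:", pruned)
```

In this toy data set the FEB partition is skipped as well, because no February order can ship in January; how many partitions survive depends on how strongly the two columns correlate.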


Zone Maps and Storage Indexes

• Attribute clustering and zone maps work transparently with Exadata storage indexes
  – The benefits of Exadata storage indexes continue to be fully exploited

• In addition, zone maps (when used with attribute clustering)
  – Enable additional and significant I/O optimization
    • Provide an alternative to indexes, especially on large tables
    • Join and fact-dimension queries, including dimension hierarchy searches
    • Particularly relevant in star and snowflake schemas
  – Are able to prune entire partitions and sub-partitions
  – Are effective for both direct and conventional path reads
  – Include optimizations for joins and index range scans
  – Part of the physical database design: explicitly created and controlled by the DBA


Summary

• Making I/O elimination techniques even more effective
• Attribute clustering is used to store related data in close proximity
  – Ensures that similar data falls within the same zone
• Zone maps provide I/O reduction for single tables, table joins and dimensional hierarchies


Top-N Filtering


Native Support for TOP-N Queries
New OFFSET and FETCH FIRST clause

• ANSI 2008/2011 compliant with some additional extensions
• Specify offset and number or percentage of rows to return
• Provisions to return additional rows with the same sort key as the last row (WITH TIES option)


Native Support for TOP-N Queries
Internal processing

Find 5 percent of employees with the lowest salaries

SELECT employee_id, last_name, salary
FROM employees
ORDER BY salary
FETCH FIRST 5 PERCENT ROWS ONLY;


Native Support for TOP-N Queries
Internal processing, cont.

▪ Internally the query is transformed into an equivalent query using window functions

SELECT employee_id, last_name, salary
FROM (SELECT employee_id, last_name, salary,
             ROW_NUMBER() OVER (ORDER BY salary) rn,
             COUNT(*) OVER () total
      FROM employees)
WHERE rn <= CEIL(total * 5/100);

▪ Additional Top-N optimization:
  – SELECT list may include expensive PL/SQL functions or costly expressions
  – Evaluation of SELECT list expressions is limited to rows in the final result set
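The row-count arithmetic behind FETCH FIRST n PERCENT can be mirrored in a few lines of Python. This is a sketch of the semantics only, assuming the CEIL(total * n / 100) rule from the transformed query above; the function name, the employee data and the WITH TIES handling are illustrative and not taken from the database implementation.

```python
import math

def fetch_first_percent(rows, key, percent, with_ties=False):
    """Return the first `percent` of rows in sort order, like rn <= CEIL(total * percent / 100)."""
    ordered = sorted(rows, key=key)
    limit = math.ceil(len(ordered) * percent / 100)
    result = ordered[:limit]
    if with_ties and result:
        # WITH TIES: also return following rows that share the last sort key.
        last_key = key(result[-1])
        for row in ordered[limit:]:
            if key(row) != last_key:
                break
            result.append(row)
    return result

# Hypothetical data: 40 employees, four of them share the lowest salary.
employees = [{"employee_id": i, "salary": 2000 + 100 * (i % 10)} for i in range(1, 41)]

lowest = fetch_first_percent(employees, key=lambda e: e["salary"], percent=5)
lowest_ties = fetch_first_percent(employees, key=lambda e: e["salary"], percent=5,
                                  with_ties=True)
print(len(lowest), len(lowest_ties))
```

With 40 rows, 5 percent is exactly 2 rows; the WITH TIES variant returns all four employees that share the lowest salary.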


SQL for Advanced Analysis: Pattern matching with MATCH_RECOGNIZE


Pattern Recognition In Sequences of Rows

• Recognize patterns in sequences of events using SQL
  – Sequence is a stream of rows
  – Event equals a row in a stream

• New SQL construct MATCH_RECOGNIZE
  – Logically partition and order the data
    • ORDER BY and PARTITION BY are optional, but be careful: without ORDER BY the row order, and therefore the match result, is not deterministic
  – Pattern defined using regular expression using variables
  – Regular expression is matched against a sequence of rows
  – Each pattern variable is defined using conditions on rows and aggregates


Business Problem: Finding Suspicious Money Transfers

• Suspicious money transfer pattern for an account is:
  – 3 or more small (< 2K) money transfers within 30 days
  – Large transfer (>= 1M) within 10 days of last small transfer

• Report account, date of first small transfer, date of last large transfer


Finding Suspicious Money Transfers
Data Set

TIME        USER ID  EVENT       AMOUNT
1/1/2012    John     Deposit     1,000,000
1/2/2012    John     Transfer    1,000
1/5/2012    John     Withdrawal  2,000
1/10/2012   John     Transfer    1,500
1/20/2012   John     Transfer    1,200
1/25/2012   John     Deposit     1,200,000
1/27/2012   John     Transfer    1,000,000
2/2/2012    John     Deposit     500,000

Finding Suspicious Money Transfers, cont.

Rows in the data set that match the pattern:
• Three small transfers within 30 days: 1/2/2012 (1,000), 1/10/2012 (1,500), 1/20/2012 (1,200)
• Large transfer within 10 days of last small transfer: 1/27/2012 (1,000,000)

SQL Pattern Matching in action
New syntax for discovering patterns using SQL: finding suspicious money transfers

MATCH_RECOGNIZE()

SELECT . . .
FROM (SELECT * FROM event_log WHERE event = 'transfer')
MATCH_RECOGNIZE (
  . . .
)

Define how the data is to be processed
STEP 1: Set the PARTITION BY and ORDER BY clauses

SELECT . . .
FROM (SELECT * FROM event_log WHERE event = 'transfer')
MATCH_RECOGNIZE (
  PARTITION BY userid ORDER BY time
)

Define PATTERN clause
STEP 2: Define the PATTERN: three or more small amount (< 2K) money transfers within 30 days

SELECT . . .
FROM (SELECT * FROM event_log WHERE event = 'transfer')
MATCH_RECOGNIZE (
  PARTITION BY userid ORDER BY time
  PATTERN ( X{3,} )
)

Define PATTERN clause
STEP 2: Define the PATTERN variables: large transfer (>= 1M) within 10 days of last small transfer

SELECT . . .
FROM (SELECT * FROM event_log WHERE event = 'transfer')
MATCH_RECOGNIZE (
  PARTITION BY userid ORDER BY time
  PATTERN ( X{3,} Y )
)

Define PATTERN clause
STEP 2: Define the PATTERN variables. Describe the details of each pattern: small amount is less than 2K and within 30 days

SELECT . . .
FROM (SELECT * FROM event_log WHERE event = 'transfer')
MATCH_RECOGNIZE (
  PARTITION BY userid ORDER BY time
  PATTERN ( X{3,} Y )
  DEFINE
    X AS (amount < 2000) AND LAST(X.time) - FIRST(X.time) < 30
)

Define PATTERN clause
STEP 2: Define the PATTERN variables. Describe the details of each pattern: large amount is at least 1M

SELECT . . .
FROM (SELECT * FROM event_log WHERE event = 'transfer')
MATCH_RECOGNIZE (
  PARTITION BY userid ORDER BY time
  PATTERN ( X{3,} Y )
  DEFINE
    X AS (amount < 2000) AND LAST(X.time) - FIRST(X.time) < 30,
    Y AS (amount >= 1000000)
)

Define PATTERN clause
STEP 2: Define the PATTERN variables: large transfer within 10 days of last small transfer

SELECT . . .
FROM (SELECT * FROM event_log WHERE event = 'transfer')
MATCH_RECOGNIZE (
  PARTITION BY userid ORDER BY time
  PATTERN ( X{3,} Y )
  DEFINE
    X AS (amount < 2000) AND LAST(X.time) - FIRST(X.time) < 30,
    Y AS (amount >= 1000000 AND Y.time - LAST(X.time) < 10)
)

Define Measures To Be Calculated
STEP 3: Define the MEASURES: report account, date of first small transfer, date of last large transfer

SELECT . . .
FROM (SELECT * FROM event_log WHERE event = 'transfer')
MATCH_RECOGNIZE (
  PARTITION BY userid ORDER BY time
  MEASURES FIRST(x.time) first_t, y.time last_t, y.amount amount
  PATTERN ( X{3,} Y )
  DEFINE
    X AS (amount < 2000) AND LAST(X.time) - FIRST(X.time) < 30,
    Y AS (amount >= 1000000 AND Y.time - LAST(X.time) < 10)
)

Define How Much Data Is Returned
STEP 4: Control the output: output one row each time we find a match to our pattern

SELECT . . .
FROM (SELECT * FROM event_log WHERE event = 'transfer')
MATCH_RECOGNIZE (
  PARTITION BY userid ORDER BY time
  MEASURES FIRST(x.time) first_t, y.time last_t, y.amount amount
  ONE ROW PER MATCH
  PATTERN ( X{3,} Y )
  DEFINE
    X AS (amount < 2000) AND LAST(X.time) - FIRST(X.time) < 30,
    Y AS (amount >= 1000000 AND Y.time - LAST(X.time) < 10)
)

Define output columns
Finally list columns to return as part of the query result set…

SELECT userid, first_t, last_t, amount
FROM (SELECT * FROM event_log WHERE event = 'transfer')
MATCH_RECOGNIZE (
  PARTITION BY userid ORDER BY time
  MEASURES FIRST(x.time) first_t, y.time last_t, y.amount amount
  ONE ROW PER MATCH
  PATTERN ( X{3,} Y )
  DEFINE
    X AS (amount < 2000) AND LAST(X.time) - FIRST(X.time) < 30,
    Y AS (amount >= 1000000 AND Y.time - LAST(X.time) < 10)
)
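To make the matching logic concrete, the Python sketch below replays the four steps against the sample data set. It is a simplified stand-in, not the database's matching engine: the greedy-then-backtrack loop, the tuple layout and the case-insensitive event filter are assumptions made for the illustration.

```python
from datetime import date

# (time, userid, event, amount): the sample data set from the slides
events = [
    (date(2012, 1, 1), "John", "Deposit", 1_000_000),
    (date(2012, 1, 2), "John", "Transfer", 1_000),
    (date(2012, 1, 5), "John", "Withdrawal", 2_000),
    (date(2012, 1, 10), "John", "Transfer", 1_500),
    (date(2012, 1, 20), "John", "Transfer", 1_200),
    (date(2012, 1, 25), "John", "Deposit", 1_200_000),
    (date(2012, 1, 27), "John", "Transfer", 1_000_000),
    (date(2012, 2, 2), "John", "Deposit", 500_000),
]

def find_suspicious(events):
    # Inner query: keep transfers only (compared case-insensitively here).
    transfers = [e for e in events if e[2].lower() == "transfer"]
    # STEP 1: PARTITION BY userid ORDER BY time
    by_user = {}
    for e in sorted(transfers, key=lambda e: (e[1], e[0])):
        by_user.setdefault(e[1], []).append(e)

    matches = []
    for user, rows in by_user.items():
        i = 0
        while i < len(rows):
            # X: amount < 2000 and within 30 days of the first X row (greedy).
            j = i
            while (j < len(rows) and rows[j][3] < 2000
                   and (rows[j][0] - rows[i][0]).days < 30):
                j += 1
            match_end = None
            # X{3,} Y: try the longest run of X first, never fewer than 3 rows.
            for k in range(j, i + 2, -1):
                if k < len(rows):
                    y = rows[k]
                    # Y: amount >= 1M and within 10 days of the last X row.
                    if y[3] >= 1_000_000 and (y[0] - rows[k - 1][0]).days < 10:
                        # STEP 3 and 4: one output row per match with the measures.
                        matches.append((user, rows[i][0], y[0], y[3]))
                        match_end = k
                        break
            # Resume after the match, or move the start one row forward.
            i = match_end + 1 if match_end is not None else i + 1
    return matches

result = find_suspicious(events)
print(result)
```

For the sample data this reports one match for John: first small transfer on 1/2/2012, large transfer of 1,000,000 on 1/27/2012.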


Adding New Requirements
Using SQL makes it very easy to extend the pattern for new requirements

• Additional requirement:
  – Check for transfers across different accounts
    • Total sum of small transfers must be less than 20K

TIMESTAMP   USER ID  EVENT       TRANSFER_TO  AMOUNT
1/1/2012    John     Deposit     -            1,000,000
1/2/2012    John     Transfer    Bob          1,000
1/5/2012    John     Withdrawal  -            2,000
1/10/2012   John     Transfer    Allen        1,500
1/20/2012   John     Transfer    Tim          1,200
1/25/2012   John     Deposit     -            1,200,000
1/27/2012   John     Transfer    Tim          1,000,000
2/2/2012    John     Deposit     -            500,000

Adding New Requirements, cont.

Rows in the data set that match the extended pattern:
• Three small transfers within 30 days to different accounts and total sum < 20K: 1/2/2012 to Bob (1,000), 1/10/2012 to Allen (1,500), 1/20/2012 to Tim (1,200)
• Large transfer within 10 days of last small transfer: 1/27/2012 to Tim (1,000,000)

Adding New Requirements
Modify the pattern variables in DEFINE: check the transfer account

SELECT userid, first_t, last_t, amount
FROM (SELECT * FROM event_log WHERE event = 'transfer')
MATCH_RECOGNIZE (
  PARTITION BY userid ORDER BY time
  MEASURES FIRST(x.time) first_t, y.time last_t, y.amount amount
  ONE ROW PER MATCH
  PATTERN ( X{3,} Y )
  DEFINE
    X AS (amount < 2000) AND LAST(X.time) - FIRST(X.time) < 30
         AND PREV(X.transfer_to) <> X.transfer_to
  . . .
)

Adding New Requirements
Modify the pattern variables in DEFINE: check the total of the small transfers is less than 20K

SELECT userid, first_t, last_t, amount
FROM (SELECT * FROM event_log WHERE event = 'transfer')
MATCH_RECOGNIZE (
  PARTITION BY userid ORDER BY time
  MEASURES FIRST(x.time) first_t, y.time last_t, y.amount amount
  ONE ROW PER MATCH
  PATTERN ( X{3,} Y )
  DEFINE
    X AS (amount < 2000) AND LAST(X.time) - FIRST(X.time) < 30
         AND PREV(X.transfer_to) <> X.transfer_to,
    Y AS (amount >= 1000000 AND Y.time - LAST(X.time) < 10
          AND SUM(X.amount) < 20000)
);
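The two added conditions can be traced with a small Python sketch. It is a simplified illustration rather than the database's matcher: it takes the transfers of a single account already ordered by time, it does not backtrack, and it assumes that the first small transfer has no predecessor to compare against and is accepted (in SQL, a comparison against a NULL value from PREV on the first row would need explicit handling). The function name and tuple layout are hypothetical.

```python
from datetime import date

def match_extended(rows):
    """rows: (time, transfer_to, amount) for ONE account, transfers only, ordered by time."""
    matches = []
    i = 0
    while i < len(rows):
        # X: small amount, within 30 days of the first X row,
        # and sent to a different account than the previous row.
        j = i
        while j < len(rows):
            t, to, amount = rows[j]
            small = amount < 2000
            in_window = (t - rows[i][0]).days < 30
            new_target = j == i or rows[j - 1][1] != to  # assumption: first row passes
            if not (small and in_window and new_target):
                break
            j += 1
        matched = False
        if j - i >= 3 and j < len(rows):
            t, _, amount = rows[j]
            total_small = sum(r[2] for r in rows[i:j])
            # Y: large amount, within 10 days of last X, and SUM(X.amount) < 20000.
            if (amount >= 1_000_000 and (t - rows[j - 1][0]).days < 10
                    and total_small < 20_000):
                matches.append((rows[i][0], t, amount, total_small))
                matched = True
        i = j + 1 if matched else i + 1
    return matches

transfers = [
    (date(2012, 1, 2), "Bob", 1_000),
    (date(2012, 1, 10), "Allen", 1_500),
    (date(2012, 1, 20), "Tim", 1_200),
    (date(2012, 1, 27), "Tim", 1_000_000),
]
found = match_extended(transfers)

# Same amounts, but two consecutive small transfers go to the same account.
same_target = [(date(2012, 1, 2), "Bob", 1_000), (date(2012, 1, 10), "Bob", 1_500)] + transfers[2:]
not_found = match_extended(same_target)
print(found, not_found)
```

The sample data still matches (the small transfers total 3,700), while the variant with repeated transfers to Bob does not.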


SQL for Advanced Analysis: Approximate count distinct


Exploring Today’s Big Data Lakes

• Key business challenges
  – Many queries rely on counts and/or statistical calculations
    • NDVs (numbers of distinct values), Pareto’s 80:20 rule, identifying outliers etc.
  – Exact processing of large data sets is resource intensive
  – Exploratory queries don’t require completely accurate results
    • Trending analysis, social analysis, sessionization analytics

• Oracle’s solutions
  – Provide “approximate result” capabilities in SQL
  – Key objectives
    • Return approximate results faster, with minimal deviation from actual
    • Use fewer resources, allowing more queries to run


Getting Approximate Counts
…significantly faster solution

Answer “How many…” type questions
  – How many unique sessions today
  – How many unique customers logged on
  – How many unique events occurred

COUNT(DISTINCT expr)
  – Returns the exact number of rows that contain distinct values of specified expression
  – Can be resource intensive because it requires sorting

APPROX_COUNT_DISTINCT(expr)
  – Processes large amounts of data significantly faster
  – Uses HyperLogLog algorithm
  – Negligible deviation from exact result
    • Ignores rows containing null values
  – Supports any scalar data type
    • Does not support BFILE, BLOB, CLOB, LONG, LONG RAW, or NCLOB
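The idea behind a HyperLogLog estimate can be sketched in Python. This is a textbook-style version written for illustration, not the implementation inside the database: the register count (2^12), the SHA-1 based hash and the session data are assumptions. Each value is hashed, a few hash bits select a register, and the register remembers the longest run of leading zero bits seen; the estimate is derived from the registers alone, so no sort and no list of distinct values is kept.

```python
import hashlib
import math

class HyperLogLog:
    def __init__(self, p=12):
        self.p = p                  # number of index bits
        self.m = 1 << p             # number of registers
        self.registers = [0] * self.m

    def add(self, value):
        if value is None:           # nulls are ignored, as for the SQL function
            return
        digest = hashlib.sha1(repr(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")          # 64-bit hash
        index = h >> (64 - self.p)                     # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)          # remaining bits
        rank = (64 - self.p) - rest.bit_length() + 1   # position of leftmost 1-bit
        if rank > self.registers[index]:
            self.registers[index] = rank

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw

# Hypothetical workload: 60,000 events from 20,000 distinct sessions, plus a null.
session_ids = [f"session-{i % 20000}" for i in range(60000)] + [None]

hll = HyperLogLog()
for s in session_ids:
    hll.add(s)

exact = len({s for s in session_ids if s is not None})
approx = hll.estimate()
print(exact, round(approx))
```

With 4,096 registers the expected relative error is roughly 1.04 / sqrt(4096), about 1.6 percent, while memory use stays fixed regardless of the number of rows.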


Performance and Accuracy of APPROX_COUNT_DISTINCT

Performance results
• Real world customer workload
• 5-50x improvement

Results for accuracy
• Real world customer workload
• Accuracy that is typically 97% with 95% confidence

Notes:
• This approach does not use sampling; it uses a hash-based approach
• Ignores rows that contain a null value for specified expression
• Supports any scalar data type other than BFILE, BLOB, CLOB, LONG, LONG RAW, or NCLOB


COUNT(DISTINCT) Processing

1. Query is processing all 6,000M rows in table LINEITEM
2. Table access consumes 541MB memory
3. Sort operation to manage count + distinct operations
4. Distinct + count processing consumes 8GB of memory and 164GB of temp


Benefits of APPROX_COUNT_DISTINCT processing

1. Query is processing all 6,000M rows in table LINEITEM
2. Table access consumes 542MB memory
3. Only sort operation is now AGGREGATE APPROX
4. Approximate processing ONLY consumes 524MB of memory and zero GB of temp


Benefits of APPROX_COUNT_DISTINCT: 50X Faster

1. COUNT(DISTINCT …) timeline: 3,059 seconds on 6,000M rows
2. APPROX_COUNT_DISTINCT indicator in explain plan
3. APPROX_COUNT_DISTINCT timeline: 69 seconds on 6,000M rows


Using Statistical Analytics For Intelligent Analysis

• Key business requirements
  – Searching for outliers within a data set
  – Pareto (80/20) analysis
  – Data points 3 SDs from mean
    • Data outside 3 SDs is often considered an anomaly

• Typical use cases include
  – Quality monitoring and assurance
  – Monitoring SLA performance
  – Anomaly/outlier detection
  – Tracking activity/visibility on social media sites
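The "3 SDs from mean" rule can be written down in a few lines of Python. The response-time values and the function name are invented for the example, and the population standard deviation is used; a different choice (sample standard deviation, robust statistics) may be more appropriate for real data.

```python
import statistics

def outliers_beyond(values, n_sd=3):
    """Return the values that lie more than n_sd standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [v for v in values if sd > 0 and abs(v - mean) > n_sd * sd]

# Hypothetical SLA measurements in milliseconds: 19 normal values and one spike.
response_ms = [100 + (i % 5) - 2 for i in range(19)] + [1000]

anomalies = outliers_beyond(response_ms)
print(anomalies)
```

Note that a single extreme value also inflates the standard deviation, so in very small samples even a large spike may stay inside the 3 SD band.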


Materialized Views


Overview of Materialized Views in Oracle Database 12c

• Objectives
  – Improve performance of refresh operation
  – Minimize staleness time of materialized views

• Two fundamental new concepts for refresh
  – Out-of-place refresh
    • Refresh “shadow MV” and swap with original MV after refresh
  – Synchronous refresh
    • Refresh base tables and MVs synchronously, leveraging equi-partitioning of the objects


Materialized Views: In-Place vs. Out-of-Place Refresh

In-place refresh
• Apply refresh statement to MV directly
• MV remains unusable during execution of refresh statement
• Potential suboptimal processing
  – Conventional DMLs don’t scale well
  – Truncate and direct path load only used in limited cases
• MV becomes fragmented after certain numbers of refreshes

Out-of-place refresh
• Create outside table(s)
  – Populate outside table(s)
  – Switch outside table(s) to become new MV or MV partition
• High MV availability
• Efficiency due to direct load
• Addresses fragmentation problem
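The out-of-place idea (build a fresh copy on the side, then switch) can be pictured with a toy Python model. Everything here is hypothetical: the class, the in-memory "base table" and the aggregate query stand in for database objects and only illustrate why readers keep seeing a complete result until the switch.

```python
# Hypothetical base table of (product, quantity) rows.
base_table = [("widget", 10), ("gadget", 5)]

def totals():
    """The defining query of the toy materialized view: quantity per product."""
    result = {}
    for product, qty in base_table:
        result[product] = result.get(product, 0) + qty
    return result

class MaterializedView:
    def __init__(self, query):
        self.query = query
        self.data = query()

    def refresh_out_of_place(self):
        shadow = self.query()               # populate the outside (shadow) copy
        old, self.data = self.data, shadow  # switch in a single step
        return old                          # the old copy can now be dropped

mv = MaterializedView(totals)
base_table.append(("widget", 7))   # the MV is now stale
stale = dict(mv.data)              # readers still see a complete (old) result
old = mv.refresh_out_of_place()
print(stale, mv.data)
```

Until the switch, queries against the toy view return the old but complete result; afterwards they see the refreshed totals.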


Overview of Synchronous Refresh

• Materialized View and base tables refreshed together
  – Materialized View and base tables always “in sync”
  – Materialized Views always fresh

• Improved availability of Materialized View for rewrite
• Materialized View and fact tables must be equi-partitioned
  – Partition key of fact table must functionally determine partition key of MV

• Synchronous Refresh uses partition exchange of changed fact table and Materialized View partitions


Data Bound Collations: Enhancements to tables/views to support searching multilingual text strings


Data-Bound Collations

“…a named set of rules describing how to compare and match character strings to put them in a specified order…”

• Based on the ISO/IEC/ANSI SQL standard 9075:1999
• Character set is always declared at the database level
• Collation declared for a column
  – Does not determine the character set of data in the column

• Why is it important?
  – It simplifies application migration to the Oracle Database from a number of non-Oracle databases implementing collation in a similar way


Data-Bound Collations

• Oracle supports around 100 linguistic collations
  – Parameterized by adding the suffix _CI or the suffix _AI
    • _CI: specifies a case-insensitive sort
    • _AI: specifies an accent-insensitive sort

CREATE TABLE products(
  product_code        VARCHAR2(20 BYTE)   COLLATE BINARY,
  product_name        VARCHAR2(100 BYTE)  COLLATE GENERIC_M_CI,
  product_category    VARCHAR2(5 BYTE)    COLLATE BINARY,
  product_description VARCHAR2(1000 BYTE) COLLATE BINARY_CI);

  – product_name is to be compared using GENERIC_M_CI, the case-insensitive version of the generic multilingual collation
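What case-insensitive and accent-insensitive comparison means can be approximated in Python with the standard library. This is only an analogy: the key functions below are illustrative, they are not the linguistic rules the database applies for its named collations, and the accent-insensitive key is written to ignore case as well.

```python
import unicodedata

def key_ci(s):
    """Case-insensitive comparison key (rough analogue of a _CI collation)."""
    return s.casefold()

def key_ai(s):
    """Accent- and case-insensitive comparison key (rough analogue of _AI)."""
    decomposed = unicodedata.normalize("NFD", s)   # split letters from accents
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()

# Hypothetical product names that differ only in case or accents.
names = ["Résumé paper", "RÉSUMÉ PAPER", "resume paper"]

ci_groups = {key_ci(n) for n in names}
ai_groups = {key_ai(n) for n in names}
print(len(ci_groups), len(ai_groups))
```

Under the case-insensitive key the accented and unaccented spellings remain distinct (two groups); under the accent-insensitive key all three names compare equal.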


#oow17
Get the most from #oow17: Must-See DW and Big Data Sessions and Hands-on Labs


Safe Harbor Statement

The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
