Managing and Analyzing Time Series data using Warp 10 Santa Clara, California | April 24th – 27th, 2017


Apr 15, 2020


Managing and Analyzing Time Series data using Warp 10

Santa Clara, California | April 24th – 27th, 2017


`whoami`


Mathias Herberts

Co-Founder and CTO of Cityzen Data, maker of Warp 10

Former banker and Googler

Currently Frenchman

@herberts


Time Series

Where the database hype is


Open Source Solutions


Warp 10


Warp 10

A rich tool suite for managing and analyzing sensor data

Way more than a simple monitoring solution


Data Model

Everything there is to know about Geo Time Series®


Geo Time Series®


Geo Time Series®


Geo Time Series®

• Configurable time units, from ms to ns

• Support for four value types: LONG, DOUBLE, BOOLEAN, STRING

• Full unicode (UTF-8) support in all strings (classes, labels, attributes, values)

• 1 cm precision for latitude/longitude, 1 mm precision for elevation

• Geo part (latitude, longitude, elevation) is optional
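
The data model listed above can be pictured as a small record type. The sketch below is illustrative only: it is written in Python, and the class and field names (Datapoint, GeoTimeSeries, class_name and so on) are hypothetical, not part of any Warp 10 API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union

# The four value types named on the slide: LONG, DOUBLE, BOOLEAN, STRING.
Value = Union[int, float, bool, str]


@dataclass
class Datapoint:
    """One reading; the geo part is optional, as the slide states."""
    timestamp: int                      # expressed in the configured time unit
    value: Value
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    elevation: Optional[int] = None


@dataclass
class GeoTimeSeries:
    """Hypothetical in-memory picture of a series: metadata plus datapoints."""
    class_name: str
    labels: Dict[str, str] = field(default_factory=dict)
    attributes: Dict[str, str] = field(default_factory=dict)
    points: List[Datapoint] = field(default_factory=list)

    def add(self, point: Datapoint) -> None:
        # Assumption of this sketch: latitude and longitude come as a pair.
        if (point.latitude is None) != (point.longitude is None):
            raise ValueError("latitude and longitude must be given together")
        self.points.append(point)
```

Labels and attributes are kept apart here because the later slides treat them differently: labels take part in selecting series, while attributes can be modified after the fact.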


Architecture

One size does not fit all so Warp 10 comes in multiple versions


Standalone version (M-L)

Datalog

(or in-memory)


Distributed version (XL)


Embedded version (S)


APIs

How to interact with Warp 10


Storing data - /api/v0/update

• Simple text based format

TIMESTAMP/LAT:LON/ELEVATION CLASS{LABELS} VALUE
=TIMESTAMP/LAT:LON/ELEVATION VALUE
T+OFFSET/LAT:LON/ELEVATION CLASS{LABELS} VALUE

• HTTP or WebSocket endpoints + input plugins to support 3rd party formats

• No limit on the number of datapoints you can push in each request

• Throttling mechanisms to limit the number of active series and the datapoint rate
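
As a rough illustration of the text format above, the following Python sketch renders one datapoint as a TIMESTAMP/LAT:LON/ELEVATION CLASS{LABELS} VALUE line. The function names are hypothetical, and the value encoding (T/F for booleans, single-quoted percent-encoded strings) and the percent-encoding of class and label names are assumptions that should be checked against the Warp 10 documentation.

```python
from urllib.parse import quote


def encode_value(value):
    """Render a value; bool is tested first because bool is a subclass of int."""
    if isinstance(value, bool):
        return "T" if value else "F"               # assumed BOOLEAN encoding
    if isinstance(value, int):
        return str(value)                          # LONG
    if isinstance(value, float):
        return repr(value)                         # DOUBLE
    if isinstance(value, str):
        return "'" + quote(value, safe="") + "'"   # assumed STRING encoding
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def format_datapoint(timestamp, class_name, labels, value,
                     lat=None, lon=None, elevation=None):
    """Build one line; the geo fields are left empty when not provided."""
    location = "" if lat is None or lon is None else f"{lat}:{lon}"
    elev = "" if elevation is None else str(elevation)
    label_text = ",".join(
        f"{quote(name, safe='')}={quote(val, safe='')}"
        for name, val in sorted(labels.items())
    )
    head = f"{timestamp}/{location}/{elev}"
    return f"{head} {quote(class_name, safe='')}{{{label_text}}} {encode_value(value)}"
```

A call such as format_datapoint(1493000000000000, "temperature", {"sensor": "a1"}, 21.5) yields a line with both geo fields empty, which matches the slide's statement that the geo part is optional.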


Fetching raw data - /api/v0/fetch

• Retrieve data in a time period

• Retrieve a number of datapoints before an instant

• Regular expression matching on classes and labels + attributes

• Various supported output formats


Identify series - /api/v0/find

• List series matching some criteria

• Regular expression matching on classes and labels + attributes

• Companion endpoint for meta
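
The matching described above can be sketched in a few lines of Python over an in-memory catalog. This is not how the endpoint is implemented. The '~' prefix marking a regular expression follows the selector used later in this deck ('~(ALT|CAS)'); applying the same prefix to label values, and requiring a full match, are assumptions of this sketch, as are the function names.

```python
import re


def _matches(pattern, text):
    """A leading '~' marks a regular expression; anything else is an exact match."""
    if pattern.startswith("~"):
        return re.fullmatch(pattern[1:], text) is not None
    return pattern == text


def find(series, class_pattern, label_patterns):
    """Return the (class, labels) pairs matching the class and every label pattern.

    series: iterable of (class_name, labels_dict) pairs.
    label_patterns: dict mapping a label name to an exact value or '~regex'.
    """
    found = []
    for class_name, labels in series:
        if not _matches(class_pattern, class_name):
            continue
        if all(name in labels and _matches(pattern, labels[name])
               for name, pattern in label_patterns.items()):
            found.append((class_name, labels))
    return found
```

With a catalog holding classes ALT, CAS and HDG, find(catalog, "~(ALT|CAS)", {}) would keep the first two and drop the third.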


Modifying attributes - /api/v0/meta

• Set attributes for Geo Time Series®

• Use attributes for life cycle management

• Use attributes for signaling transient situations

• Use attributes for geographic search of series (via GeoHash for example)


Deleting data - /api/v0/delete

• Delete a time range or complete series

• Option to delete data older than a certain age on the distributed version

• Select series to delete based on regex searches on classes and labels/attributes

• Based on attributes, enforce a flexible lifecycle management mechanism


Perform analytics - /api/v0/exec

• POST analysis script

• Analysis performed server side, as close as possible to the data

• Retrieve results as a JSON object

• Not a query language, way more than that as we’ll see next
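
A minimal client-side sketch of the exchange above, using only the Python standard library: the script is sent as the POST body and the response is parsed as JSON. The base URL is a placeholder and the helper names are hypothetical. No token header is set, on the assumption that the script carries its own token, as the WarpScript example later in this deck does.

```python
import json
import urllib.request


def build_exec_request(base_url, warpscript):
    """Prepare a POST of the script text to the exec endpoint (nothing is sent yet)."""
    return urllib.request.Request(
        url=f"{base_url}/api/v0/exec",
        data=warpscript.encode("utf-8"),
        method="POST",
    )


def run(base_url, warpscript):
    """Send the script and return the parsed JSON result."""
    request = build_exec_request(base_url, warpscript)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending keeps the first helper usable without a running server, for instance to inspect what would be posted.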


Streaming endpoints

• /api/v0/streamupdate

• /api/v0/plasma - Subscribe to raw series data

• /api/v0/mobius - Subscribe to periodic analysis


Security

Sensor data are sensitive


Authentication / Authorization

• Token based security

• Tokens identify data which can be accessed

• Tokens can carry data transformations applied on the fly

• Tokens can force some label values (both for read and write)

• Tokens can be revoked using TRLs

• Full multi-tenancy support


Privacy / Integrity

• Metadata (class, labels, attributes) are encrypted

• Datapoints can be encrypted too (and mixed with unencrypted datapoints)

• SSL support via Haproxy/Nginx front ends

• Optional integrity checks and encryption between components


Analytics

Beyond simple queries


A language dedicated to time series analysis


Advanced stack based language

• Result object is a JSON array of the various stack levels

• Support for variables and context saving

• Code serialization

• Support for complex constructs, loops, conditionals, macros

• Secure code execution

• Data Flow model


5 high level frameworks

• BUCKETIZE - transform a series so it has regularly spaced ticks

• MAP - apply a function on a sliding window

• REDUCE - tick by tick computation on multiple series, producing a single one

• FILTER - select series based on various criteria

• APPLY - tick by tick application of an n-ary function
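
To make three of these frameworks concrete, here is a plain Python sketch of the ideas behind BUCKETIZE (keeping the last value per bucket), REDUCE (summing several series tick by tick) and MAP (a rate over a window of one preceding point). It illustrates the concepts only and is not Warp 10's implementation: series are simple lists of (tick, value) pairs, the function names are hypothetical, and the rate is expressed per tick unit, which may differ from how Warp 10 scales it.

```python
def bucketize_last(points, lastbucket, bucketspan, bucketcount):
    """Keep the last value seen in each bucket.

    Bucket i (0 is the most recent) covers the ticks in
    (lastbucket - (i + 1) * bucketspan, lastbucket - i * bucketspan]
    and is reported at its end tick.
    """
    buckets = {}
    for tick, value in sorted(points):
        if tick > lastbucket:
            continue                      # newer than the last bucket: ignored
        index = (lastbucket - tick) // bucketspan
        if index < bucketcount:
            buckets[lastbucket - index * bucketspan] = value
    return sorted(buckets.items())


def reduce_sum(series_list):
    """Tick by tick sum of several series, producing a single one."""
    totals = {}
    for points in series_list:
        for tick, value in points:
            totals[tick] = totals.get(tick, 0) + value
    return sorted(totals.items())


def map_rate(points):
    """Rate of change between each point and the one before it."""
    ordered = sorted(points)
    return [
        (t1, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(ordered, ordered[1:])
    ]
```

Bucketizing first is what makes the tick by tick sum meaningful: once every series has the same regularly spaced ticks, values from different series line up.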


800 functions! != % & && * ** + +! - ->B64 ->B64URL ->BIN ->BYTES ->DOUBLEBITS ->FLOATBITS ->GEOHASH ->HEX ->HHCODE ->HHCODELONG ->JSON ->LIST ->MAP ->MAT ->OPB64 ->PICKLE ->Q ->SET ->TSELEMENTS ->V ->VEC ->Z / < << <= == > >= >> >>> ABS ACOS ADDDAYS ADDMONTHS ADDVALUE ADDYEARS AESUNWRAP AESWRAP AGO AND APPEND APPLY ASIN ASSERT ATAN ATBUCKET ATINDEX ATTICK ATTRIBUTES AUTHENTICATE B64-> B64TOHEX B64URL-> BBOX BIN-> BINTOHEX BITCOUNT BITGET BITSTOBYTES BOOTSTRAP BREAK BUCKETCOUNT BUCKETIZE BUCKETSPAN BYTES-> BYTESTOBITS BYTESTOBITS CALL CBRT CEIL CHUNK CLEAR CLEARDEFS CLEARSYMBOLS CLEARTOMARK CLIP CLONE CLONEEMPTY CLONEREVERSE COMMONTICKS COMPACT CONTAINS CONTAINSKEY CONTAINSVALUE CONTINUE COPYGEO COPYSIGN CORRELATE COS COSH COUNTER COUNTERDELTA COUNTERVALUE COUNTTOMARK CPROB CROP CSTORE CUDF DEBUGOFF DEBUGON DEDUP DEF DEFINED DEFINEDMACRO DELETE DEPTH DET DIFFERENCE DISCORDS DOC DOCMODE DOUBLEBITS-> DOUBLEEXPONENTIALSMOOTHING DROP DROPN DTW DUMP DUP DUPN DURATION DWTSPLIT E ELAPSED ELEVATIONS EMPTY ESDTEST EVAL EVALSECURE EVERY EXP EXPM1 EXPORT FAIL FDWT FETCH FETCHBOOLEAN FETCHDOUBLE FETCHLONG FETCHSTRING FFT FFTAP FILLNEXT FILLPREVIOUS FILLTICKS FILLVALUE FILTER FIND FINDSETS FINDSTATS FIRSTTICK FLATTEN FLOATBITS-> FLOOR FOR FOREACH FORGET FORSTEP FROMBIN FROMBITS FROMHEX FUSE GEO.DIFFERENCE GEO.INTERSECTION GEO.INTERSECTS GEO.REGEXP GEO.UNION GEO.WITHIN GEO.WKT GEOHASH-> GEOPACK GEOUNPACK GET GETHOOK GETSECTION GRUBBSTEST GZIP HASH HAVERSINE HEADER HEX-> HEXTOB64 HEXTOBIN HHCODE-> HUMANDURATION HYBRIDTEST HYBRIDTEST2 HYPOT IDENT IDWT IEEEREMAINDER IFFT IFT IFTE IMMUTABLE INTEGRATE INTERPOLATE INTERSECTION INV ISNULL ISNaN ISO8601 ISODURATION ISONORMALIZE JOIN JSON-> JSONLOOSE JSONSTRICT KEYLIST LABELS LASTBUCKET LASTSORT LASTTICK LBOUNDS LFLATMAP LIMIT LIST-> LMAP LOAD LOCATIONOFFSET LOCATIONS LOCSTRINGS LOG LOG10 LOG1P LORAENC LORAMIC LOWESS LR LSORT LTTB MACROBUCKETIZER MACROFILTER MACROMAPPER MACROREDUCER MAKEGTS MAP MAP-> MAPID MARK MAT-> MATCH MATCHER MAX 
MAXBUCKETS MAXDEPTH MAXGTS MAXLONG MAXLOOP MAXOPS MAXPIXELS MAXSYMBOLS MD5 MERGE META METASET METASORT MIN MINLONG MODE MONOTONIC MSGFAIL MSORT MSTU MUSIGMA NAME NBOUNDS NDEBUGON NEWGTS NEXTAFTER NEXTUP NONEMPTY NOOP NORMALIZE NOT NOTAFTER NOTBEFORE NOTIMINGS NOW NPDF NRETURN NSUMSUMSQ NULL NaN ONLYBUCKETS OPB64-> OPB64TOHEX OPS OPTDTW OR PACK PAPPLY PARSE PARSESELECTOR PARTITION PATTERNDETECTION PATTERNS PFILTER PGraphics PI PICK PICKLE-> PIGSCHEMA PREDUCE PROB PROBABILITY PUT Palpha Parc Pbackground PbeginContour PbeginShape Pbezier PbezierDetail PbezierPoint PbezierTangent PbezierVertex Pblend PblendMode Pblue Pbox Pbrightness Pclear Pclip Pcolor PcolorMode Pconstrain Pcopy PcreateFont Pcurve PcurveDetail PcurvePoint PcurveTangent PcurveTightness PcurveVertex Pdecode Pdist Pellipse PellipseMode Pencode PendContour PendShape Pfill Pget Pgreen Phue Pimage PimageMode Plerp PlerpColor Pline Pmag Pmap PnoClip PnoFill PnoStroke PnoTint Pnorm Ppixels Ppoint PpopMatrix PpopStyle PpushMatrix PpushStyle Pquad PquadraticVertex Prect PrectMode Pred PresetMatrix Protate ProtateX ProtateY ProtateZ Psaturation Pscale Pset PshapeMode PshearX PshearY Psphere PsphereDetail Pstroke PstrokeCap PstrokeJoin PstrokeWeight Ptext PtextAlign PtextAscent PtextDescent PtextFont PtextLeading PtextMode PtextSize PtextWidth Ptint Ptranslate Ptriangle PupdatePixels Pvertex Q-> QCONJUGATE QDIVIDE QMULTIPLY QROTATE QROTATION QUANTIZE RAND RANDPDF RANGE RANGECOMPACT REDEFS REDUCE RELABEL REMOVE RENAME REPLACE REPLACEALL RESET RESETS RESTORE RETURN REV REVBITS REVERSE REXEC REXECZ RINT RLOWESS ROLL ROLLD ROT ROTATIONQ ROUND RSADECRYPT RSAENCRYPT RSAGEN RSAPRIVATE RSAPUBLIC RSASIGN RSAVERIFY RSORT RTFM RUN RUNNERNONCE RVALUESORT SAVE SECTION SECUREKEY SET SET-> SETATTRIBUTES SETVALUE SHA1 SHA1HMAC SHA256 SHA256HMAC SHRINK SIGNUM SIN SINGLEEXPONENTIALSMOOTHING SINH SIZE SNAPSHOT SNAPSHOTALL SNAPSHOTALLTOMARK SNAPSHOTCOPY SNAPSHOTCOPYALL SNAPSHOTCOPYALLTOMARK SNAPSHOTCOPYTOMARK SNAPSHOTTOMARK SORT 
SORTBY SPLIT SQRT STACKATTRIBUTE STACKTOLIST STANDARDIZE STL STLESDTEST STOP STORE STRICTMAPPER STRICTPARTITION STRICTREDUCER STU SUBLIST SUBMAP SUBSTRING SWAP SWITCH TAN TANH TEMPLATE TEMPLATE THRESHOLDTEST TICKINDEX TICKLIST TICKS TIMECLIP TIMEMODULO TIMESCALE TIMESHIFT TIMESPLIT TIMINGS TLTTB TOBIN TOBITS TOBOOLEAN TODEGREES TODOUBLE TOHEX TOKENINFO TOLONG TOLOWER TORADIANS TOSELECTOR TOSTRING TOTIMESTAMP TOTIMESTAMP TOUPPER TR TRANSPOSE TRIM TSELEMENTS TSELEMENTS-> TYPEOF UDF ULP UNBUCKETIZE UNGZIP UNION UNIQUE UNLIST UNMAP UNPACK UNSECURE UNTIL UNWRAP UNWRAPEMPTY UNWRAPSIZE UPDATE URLDECODE URLENCODE UUID V-> VALUEDEDUP VALUEHISTOGRAM VALUELIST VALUES VALUESORT VALUESPLIT VEC-> WEBCALL WHILE WRAP WRAPOPT WRAPRAW WRAPRAWOPT Z-> ZDISCORDS ZIP ZPATTERNDETECTION ZPATTERNS ZSCORE ZSCORETEST [ [] ] ^ bucketizer.and bucketizer.count bucketizer.count.exclude-nulls bucketizer.count.include-nulls bucketizer.count.nonnull bucketizer.first bucketizer.join bucketizer.join.forbid-nulls bucketizer.last bucketizer.mad bucketizer.max bucketizer.max.forbid-nulls bucketizer.mean bucketizer.mean.circular bucketizer.mean.circular.exclude-nulls bucketizer.mean.exclude-nulls bucketizer.median bucketizer.min bucketizer.min.forbid-nulls bucketizer.or bucketizer.percentile bucketizer.sum bucketizer.sum.forbid-nulls d e filter.byattr filter.byclass filter.bylabels filter.bylabelsattr filter.bymetadata filter.last.eq filter.last.ge filter.last.gt filter.last.le filter.last.lt filter.last.ne filter.latencies h m mapper.abs mapper.abscissa mapper.add mapper.and mapper.ceil mapper.count mapper.count.exclude-nulls mapper.count.include-nulls mapper.count.nonnull mapper.day mapper.delta mapper.distinct mapper.dotproduct mapper.dotproduct.positive mapper.dotproduct.sigmoid mapper.dotproduct.tanh mapper.eq mapper.exp mapper.finite mapper.first mapper.floor mapper.ge mapper.geo.approximate mapper.geo.clear mapper.geo.outside mapper.geo.within mapper.gt mapper.hdist mapper.highest mapper.hour 
mapper.hspeed mapper.join mapper.join.forbid-nulls mapper.kernel.cosine mapper.kernel.epanechnikov mapper.kernel.gaussian mapper.kernel.logistic mapper.kernel.quartic mapper.kernel.silverman mapper.kernel.triangular mapper.kernel.tricube mapper.kernel.triweight mapper.kernel.uniform mapper.last mapper.le mapper.log mapper.lowest mapper.lt mapper.mad mapper.max mapper.max.forbid-nulls mapper.max.x mapper.mean mapper.mean.circular mapper.mean.circular.exclude-nulls mapper.mean.exclude-nulls mapper.median mapper.min mapper.min.forbid-nulls mapper.min.x mapper.minute mapper.mod mapper.month mapper.mul mapper.ne mapper.npdf mapper.or mapper.parsedouble mapper.percentile mapper.pow mapper.product mapper.rate mapper.replace mapper.round mapper.sd mapper.sd.forbid-nulls mapper.second mapper.sigmoid mapper.sum mapper.sum.forbid-nulls mapper.tanh mapper.tick mapper.toboolean mapper.todouble mapper.tolong mapper.tostring mapper.truecourse mapper.var mapper.var.forbid-nulls mapper.vdist mapper.vspeed mapper.weekday mapper.year max.tick.sliding.window max.time.sliding.window ms ns op.add op.add.ignore-nulls op.and op.and.ignore-nulls op.div op.eq op.ge op.gt op.le op.lt op.mask op.mul op.mul.ignore-nulls op.ne op.negmask op.or op.or.ignore-nulls op.sub pi ps reducer.and reducer.and.exclude-nulls reducer.argmax reducer.argmin reducer.count reducer.count.exclude-nulls reducer.count.include-nulls reducer.count.nonnull reducer.join reducer.join.forbid-nulls reducer.join.nonnull reducer.join.urlencoded reducer.mad reducer.max reducer.max.forbid-nulls reducer.max.nonnull reducer.mean reducer.mean.circular reducer.mean.circular.exclude-nulls reducer.mean.exclude-nulls reducer.median reducer.min reducer.min.forbid-nulls reducer.min.nonnull reducer.or reducer.or.exclude-nulls reducer.percentile reducer.product reducer.sd reducer.sd.forbid-nulls reducer.shannonentropy.0 reducer.shannonentropy.1 reducer.sum reducer.sum.forbid-nulls reducer.sum.nonnull reducer.var reducer.var.forbid-nulls 
s us w { {} | || } ~ ~=


Compact expressiveness

<%
  'Display write requests count for each region' DOC

  SAVE 'context' STORE

  'cell' STORE
  'PT60m' DURATION 'duration' STORE
  '@TOKEN_READ@' 'TOKEN' STORE
  NOW 'now' STORE

  [ $TOKEN 'writeRequestCount' { 'cell' $cell 'Context' 'regionserver' } $now $duration ] FETCH

  // Remove resets
  false RESETS

  // Align ticks
  [ SWAP bucketizer.last $now 60 STU * 0 ] BUCKETIZE

  // Sum by hname
  [ SWAP [ 'hname' ] reducer.sum ] REDUCE
  FILLNEXT FILLPREVIOUS

  // Compute rates
  [ SWAP mapper.rate 1 0 0 ] MAP

  $context RESTORE
%>


Extensibility

When built-in functions are not enough


Macros

• Separate responsibilities between macro authors and their users

• Shorten the actual WarpScript code

• Macros deployed server side, invoked via @macro.name

• Server side macros are hot swappable

• Can also be packed in jar files


Modifying WarpScript via Extensions

• Remove functions you don’t want

• Alter the behavior of existing functions

• Add new functions

• Allows WarpScript to connect to third party systems including other TSDBs

• Extensions are Java classes extending WarpScriptExtension


CALLing external programs

#!/usr/bin/env python -u

import cPickle, sys, urllib, base64

# Output the maximum number of instances of this 'callable' to spawn
print 10

# Loop, reading stdin, doing our stuff and outputting to stdout

while True:
  try:
    line = sys.stdin.readline()
    line = line.strip()
    line = urllib.unquote(line.decode('utf-8'))
    # Remove Base64 encoding
    str = base64.b64decode(line)
    args = cPickle.loads(str)

    # Do our stuff
    output = ….

    # Output result (URL encoded UTF-8).
    print urllib.quote(output.encode('utf-8'))
  except Exception as err:
    print ' ' + urllib.quote(repr(err).encode('utf-8'))

...
->PICKLE 'UTF-8' BYTES-> ->B64
'path/to/file' CALL
B64-> PICKLE->
....
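
The script above targets Python 2. As a hedged sketch, the same three encoding steps can be written for Python 3 as small helpers. The function names are hypothetical, and whether the pickles produced by ->PICKLE load unchanged under Python 3 is an assumption not verified here.

```python
import base64
import pickle
from urllib.parse import quote, unquote


def decode_request(line):
    """Undo the URL encoding and Base64 layers, then unpickle the arguments."""
    payload = base64.b64decode(unquote(line.strip()))
    return pickle.loads(payload)


def encode_response(text):
    """URL-encode the UTF-8 result, as the last step of the loop does."""
    return quote(text.encode("utf-8"))


def encode_error(err):
    """Errors are reported as a leading space followed by the URL-encoded message."""
    return " " + quote(repr(err).encode("utf-8"))
```

A worker loop would read lines from stdin, pass each one to decode_request, and print the result of encode_response, or of encode_error when an exception is caught.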


Visualizing data

A picture is worth a thousand words


Quantum Web Components

• WarpScript centric, no JS plumbing

• Display graphs, graphs & maps or images (more on this in the next slide)

• Integrated into the Quantum IDE


Processing available in WarpScript

800 'width' STORE 800 'height' STORE
400.0 'maxspeed' STORE 40000.0 'maxalt' STORE
3.0 2.0 2.0 @orbit/heatmap/kernel/triangular 'kernel' STORE
@orbit/heatmap/palette/classic 'palette' STORE
'TOKEN' 'token' STORE

$width $height '2D' PGraphics
'MULTIPLY' PblendMode 'CENTER' PimageMode
[ $token '~(ALT|CAS)' {} NOW -2000000 ] FETCH
DUP 0 GET LASTTICK 'now' STORE
[ SWAP bucketizer.last $now STU 0 ] BUCKETIZE

// Create heatmap
<%
  7 GET LIST-> DROP 'CAS' STORE 'ALT' STORE
  <% $CAS ISNULL NOT $ALT ISNULL NOT && %>
  <% $kernel $CAS $maxspeed / $width * $ALT $maxalt / 1.0 SWAP - $height * Pimage %>
  IFT
  0 NaN NaN NaN NULL
%> MACROREDUCER 'GRAPHER' STORE
[ SWAP [] $GRAPHER ] REDUCE DROP

// Colorize
Ppixels <% DROP Palpha $palette SWAP GET %> LMAP
PupdatePixels Pencode Pdecode
$width $height '2D' PGraphics

// Do the grid
PnoFill 0 0 $width 1 - $height 1 - Prect
2.0 PstrokeWeight 200.0 Pcolor Pstroke
250.0 $maxspeed / $width * DUP 0 SWAP $height Pline
0 10000 $maxalt / 1.0 SWAP - $height * DUP $width SWAP Pline

SWAP 0 0 Pimage Pencode


Plugins for third party tools


Batch analysis

When you need insights on your whole dataset


Enhancing existing batch analytics systems

• Off-the-shelf batch frameworks are not well suited to time series analytics

• Need for dedicated functions

• Why re-invent the wheel when WarpScript is available?

• Integrated into 3 of the most popular batch analytics frameworks


Augmenting Pig, Spark and Flink

REGISTER warp10-pig-0.0.10-rc2.jar;
SET warp.timeunits 'us';
DEFINE WarpScriptRun io.warp10.pig.WarpScriptRun();

GTS = LOAD '$input' USING PigStorage() AS (gts: chararray);

-- Retain only the 'frequency' GTS and chunk them by 5 minutes

FREQCHUNKS = FOREACH GTS GENERATE FLATTEN(WarpScriptRun('DUP UNWRAPEMPTY NAME "frequency" == <% UNWRAP 0 5 m 0 0 "chunkid" false CHUNK WRAP %> <% [] %> IFTE ->V', gts));

-- Flatten the bag

CHUNKS = FOREACH FREQCHUNKS GENERATE FLATTEN($0);

-- Generate station id, chunk id, gts

BYSTATIONCHUNK = FOREACH CHUNKS GENERATE FLATTEN(WarpScriptRun('DUP UNWRAP LABELS DUP "chunkid" GET SWAP "stationid" GET', $0)) AS (stationid: chararray, chunkid: chararray, gts: chararray);

-- Group by station id, chunk id

STATIONCHUNKGROUP = GROUP BYSTATIONCHUNK BY (stationid, chunkid) PARALLEL 20;

-- Merge the GTS to reconstruct the chunk and emit station id, chunk id, gts

FULLCHUNKS = FOREACH STATIONCHUNKGROUP GENERATE FLATTEN(WarpScriptRun('V-> <% DROP 2 GET UNWRAP %> LMAP MERGE DUP LABELS SWAP WRAP SWAP DUP "chunkid" GET SWAP "stationid" GET', BYSTATIONCHUNK)) AS (stationid: chararray, chunkid: chararray, gts: chararray);

STORE FULLCHUNKS INTO '$output' USING PigStorage('\t');


Augmenting Pig, Spark and Flink

DataFrame df = sqlc.read().parquet(...);

RDD<Row> rdd = df.rdd();
JavaRDD<Row> jrdd = rdd.toJavaRDD();

JavaRDD<Row> out = jrdd.mapPartitions(new WarpScriptFlatMapFunction<Iterator<Row>,Row>("@ext-macro.mc2"));

JavaPairRDD<Row, Iterable<Row>> grouped = out.groupBy(new WarpScriptFunction<Row, Row>("[ 0 1 ] SUBLIST ->SPARKROW"));

JavaRDD<Row> merged = grouped.map(new WarpScriptFunction<Tuple2<Row,Iterable<Row>>, Row>("LIST-> DROP 0 GET [] SWAP <% SPARK-> 2 GET UNWRAP +! %> FOREACH MERGE WRAPRAW + 2 GET 1 ->LIST ->SPARKROW"));

List<StructField> fields = new ArrayList<StructField>();
fields.add(DataTypes.createStructField("wrapper", DataTypes.BinaryType, false));
StructType st = new StructType(fields.toArray(new StructField[0]));

DataFrame df2 = sqlc.createDataFrame(merged, st);

df2.write().parquet("/path/to/output/parquetfile");


Streaming analysis

Because some analysis has to be done on the fly!


Augmenting Spark Streaming, Flink & Storm

{
  'type' 'spout'
  'id' 'spout-0'
  'output' { 'stream-0' [ 'field-2' 'field-1' ] }
  'parallelism' 1
  'every' 500
  'debug' true
  'macro'
  0 'counter' STORE
  <%
    $counter 1 + 'counter' STORE
    'NOW' 'https://host:port/api/v0/exec' REXEC 'now' STORE
    { 'stream-0' [ [ 'now' $now ] ] }
  %>
}

{
  'type' 'bolt'
  'id' 'bolt-0'
  'parallelism' 2
  'debug' true
  'input' { 'spout-0' { 'stream-0' 'shuffle' } }
  'output' { 'stream-1' [ 'outfield' ] }
  'macro'
  <%
    SNAPSHOT [ SWAP ] 'value' STORE
    $value 0 GET _storm.LOG
    { 'stream-1' [ $value ] }
  %>
}


Use cases

Already managing gazillions of datapoints


Put to work in many different verticals

IT monitoring, several 100k servers, 300M+ series, in-memory, distributed versions, 10s of trillions of datapoints

Synchrophasor data, batch processing of 100s of billions of datapoints

Telecommunication systems for High Frequency Trading

Weather forecasting data

Power generator data, real-time collection of CAN, ModBUS data, embedded and distributed versions

Heatpumps, edge analytics and distributed version

Flight data analytics on ARINC data of complete fleets of aircraft spanning multiple years

Financial transaction supervision

….

All with the same Open Source tool suite


Conclusion

Put Warp 10 to work on your own use cases


Thank you!

• 3 steps to get started

curl -O -L https://dl.bintray.com/cityzendata/generic/io/warp10/warp10/1.2.7-rc2/warp10-1.2.7-rc2.tar.gz

tar zxpf warp10-1.2.7-rc2.tar.gz

export JAVA_HOME=/path/to/java/home; cd warp10-1.2.7-rc2; ./bin/warp10-standalone.init start

• Please rate my session

• Available for questions now and after, come talk to me

@warp10io

http://www.warp10.io/

http://groups.google.com/forum/#!forum/warp10-users

https://github.com/cityzendata