Roadmap: Operating Pentaho at Scale Jens Bleuel Senior Product Manager, Pentaho

6 Roadmap Operating Pentaho at Scale - … –Worker Nodes Hear about new upcoming capabilities for scaling out the Pentaho platform in large enterprise operations. This will cover

May 22, 2018

Transcript
Page 1:

Roadmap: Operating Pentaho at Scale
Jens Bleuel, Senior Product Manager, Pentaho

Page 2:

Agenda – Worker Nodes

Hear about new upcoming capabilities for scaling out the Pentaho platform in large enterprise operations. This will cover 8.0 and roadmap topics.

• Worker Nodes: Overview and Business Benefits
• How is this different from AEL / Hadoop MapReduce
• Typical Customer Scenarios
• Architecture & Capabilities including Monitoring & Logging
• Improvements in Related Areas
• Demonstration
• Availability & Roadmap

Page 3:

Worker Nodes – Overview

• Worker Nodes can scale work items across multiple nodes (containers), such as:
  – PDI jobs and transformations (in 8.0)
  – Report executions (not in 8.0)
  – […]
• It operates easily and securely across an elastic architecture, which adds machine resources as they are required for processing
• Worker Nodes can operate on premises or in the cloud
• Uses popular technologies under the hood such as Docker (container platform), Chronos (scheduler) and Mesos/Marathon (container orchestration)

[Diagram: Distribute and Scale across Worker Node (a), Worker Node (b), Worker Node (c…)]

Page 4:

Worker Nodes – Business Benefits

Large enterprises need the ability to seamlessly and efficiently spin up resources to handle hundreds or more work items at different times, with different dependencies and processing requirements. Worker Nodes addresses these needs and delivers:

• Faster time to value and reduced TCO, because it enables customers to deploy their own scale-out processes without required services
• More efficient management of changing workloads by spinning resources up and down as needed
• Increased business agility thanks to containerization, which enables portability of processes across on-prem and cloud environments without the need to re-engineer them
  – Even in pure on-prem, WN provides elasticity and resource optimization

Page 5:

How Is This Different from AEL / Hadoop MapReduce?

SCALE OUT ON DATA

AEL / Hadoop MapReduce (simplified):
• Data is distributed across nodes
• The processing takes place at the node level
• Helps scale out data volume

SCALE OUT ON PROCESSES (WORK ITEMS)

Worker Nodes (simplified):
• Work items like PDI jobs and PDI transformations get distributed across nodes; this is about the processing and orchestration (in contrast to distributing data)
• Helps scale out Pentaho processes

These two architectures can also be combined: within a Worker Node, a PDI transformation can also scale out with AEL or MapReduce.
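The idea of scaling out on processes rather than on data can be illustrated with a small sketch. This is not how Worker Nodes is implemented; it only shows whole work items being handed to a pool of workers, with each item running on exactly one worker. The file names and the `run_work_item` function are hypothetical, and a local thread pool stands in for the container nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical work items: PDI jobs (.kjb) and transformations (.ktr).
WORK_ITEMS = [
    "load_sales.kjb",
    "clean_orders.ktr",
    "aggregate_daily.ktr",
    "export_report.kjb",
]


def run_work_item(name: str) -> str:
    """Stand-in for executing one PDI job or transformation on a node."""
    kind = "job" if name.endswith(".kjb") else "transformation"
    return f"{kind}:{name}:done"


def scale_out_on_processes(items, worker_count):
    """Distribute whole work items across workers.

    The data inside each work item is not split up; only the items
    themselves are spread over the available workers.
    """
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        # map() returns results in the same order as the input items.
        return list(pool.map(run_work_item, items))


results = scale_out_on_processes(WORK_ITEMS, worker_count=3)
for line in results:
    print(line)
```

In an AEL or MapReduce style setup, by contrast, a single transformation's data would be partitioned across nodes, which is why the two approaches can be combined.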

Page 6:

Typical Customer Scenarios

Customer Type                          Typical Number of Work Items   Scale-Out Need
Small                                  Up to 10                       No
Medium                                 10 through 100                 Sometimes
Enterprise with one department         +/- 100                        Yes
Enterprise with multiple departments   Hundreds or thousands          Yes

Page 7:

Typical Customer Examples – SLAs and Time Windows

• Need to meet customer SLAs
  – Data from hundreds of sources needs to be collected and aggregated
  – This is done by hundreds of PDI jobs and transformations
  – All these jobs and transformations need to finish within a defined time window (for example, between 5 am and 7 am) so that the data is available and accurate for the target audience
• Worker Nodes provides the technology to run processes in parallel and scale out when needed, for example at peak times (end of month)
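A rough way to reason about such a time window is to estimate how many parallel workers are needed so that all jobs finish in time. The sketch below is only an illustration under stated assumptions: job durations are known in advance, jobs are independent, and each worker runs one job at a time. The figures (300 jobs of 6 minutes each, a 120-minute window from 5 am to 7 am) are invented for the example and do not come from the presentation.

```python
import math


def makespan_longest_first(durations_min, workers):
    """Finish time when jobs are assigned, longest first, to the least-loaded worker."""
    loads = [0] * workers
    for duration in sorted(durations_min, reverse=True):
        least_loaded = loads.index(min(loads))
        loads[least_loaded] += duration
    return max(loads)


def workers_needed(durations_min, window_min):
    """Smallest worker count for which the greedy schedule fits the window."""
    if not durations_min:
        return 0
    if max(durations_min) > window_min:
        raise ValueError("a single job is longer than the time window")
    # Lower bound: total work divided by the window length.
    workers = math.ceil(sum(durations_min) / window_min)
    # With one worker per job the schedule always fits, so this loop ends.
    while makespan_longest_first(durations_min, workers) > window_min:
        workers += 1
    return workers


# Hypothetical workload: 300 jobs of 6 minutes each, window 5 am to 7 am.
jobs = [6] * 300
print(workers_needed(jobs, window_min=120))  # 15
```

On a single worker the same workload would take 1,800 minutes (30 hours), which is the kind of gap that running processes in parallel is meant to close.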

Page 8:

Typical Customer Examples – Shared Services

Example of one project:
• 800 daily batches from different departments in an enterprise
• One server with 120 GB memory and many CPUs
• This machine hosts lots of VMs in parallel

Issue: When there is too much workload, one machine is not enough
• Worker Nodes solves this by scaling out on a cluster

Page 9:

Typical Customer Examples – Scalable on Demand

• Need to support growing data volumes and customer requirements
• Worker Nodes provides a flexible and scalable architecture, on premises or in the cloud, for growing demand
• This is seamless and does not require changes to the underlying architecture

[Diagram: Distribute and Scale. BASE TIMES: Worker Node (1) to Worker Node (3). PEAK TIMES: Worker Node (1) to Worker Node (5).]
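The base-time versus peak-time picture can be read as a simple elastic scaling rule. The following is a minimal sketch, not the product's actual scaling logic: it assumes a hypothetical fixed capacity of work items per node and clamps the result between a minimum and maximum node count. All parameter names and numbers are assumptions chosen so that the result matches the three and five nodes shown in the diagram.

```python
import math


def desired_worker_nodes(queued_items, items_per_node, min_nodes=1, max_nodes=10):
    """Number of nodes needed for the current queue, within fixed limits."""
    if items_per_node <= 0:
        raise ValueError("items_per_node must be positive")
    needed = math.ceil(queued_items / items_per_node)
    return max(min_nodes, min(max_nodes, needed))


# Hypothetical load: 30 queued work items at base times, 50 at peak times,
# with each node assumed to handle 10 work items.
base_nodes = desired_worker_nodes(queued_items=30, items_per_node=10)
peak_nodes = desired_worker_nodes(queued_items=50, items_per_node=10)
print(base_nodes, peak_nodes)  # 3 5
```

The same rule scales back down once the peak is over, because the node count is recomputed from the current queue rather than remembered.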

Page 10:

Worker Nodes – New in 8.0

• Containerized scale-out
• Pentaho PDI “work items”

[Architecture diagram labels: Pentaho Clients; Pentaho Server; Pentaho Repository; WORKER NODES; Orchestration Framework; Orchestration (scheduler, monitoring, security, etc.); Controller; Master (Working); Master (Standby), shown twice; Container Framework; WN 1, e.g. KJB; WN 2, e.g. KTR; WN …n “Executor”]

Page 11:

Worker Nodes Capabilities

• Deploy consistently in physical, virtual, and cloud environments
  Adapts to customer needs (bare metal vs. virtualization vs. cloud), with no need to modify the product when the strategy changes.
• Scale and load balance services
  This helps to deal with peaks and limited time windows by allocating the resources that are needed.
• Hybrid deployments can be used to distribute load
  Even when the on-premises resources are not sufficient, scaling out into the cloud is possible to provide more resources.

Page 12:

Monitoring and Logging

Page 13:

Monitoring – Overview

Page 14:

Monitoring – Worker Node Example

Page 15:

Improvements in Related Areas: Open and Save Dialogs

Page 16:

Pain Point: Save a New Job/Transformation

• Whenever you save a new transformation/job into the repository, the default folder is set to the user’s home folder.

In previous versions: the user needs to change the folder every time they save a new transformation or job.

Page 17:

New Save Dialog in 8.0 – Overview

• Remembers the last opened folder!
• Just enter the file name! (and/or change the folder)
• Similar to the Open Dialog with additional functionality (see next slide).

Page 18:

New Open Dialog in 8.0 – Overview

Recents

Open shows the last opened folder. This is a big time saver!

Search

Page 19:

Improvements in Related Areas: Run Configurations

Page 20:

Pain Point: Remote Pentaho Server Execution before 8.0

To execute on the Pentaho Server before 8.0, you need to define a Slave server and give the credentials. Then execute on the selected server.

Page 21:

Execute on the Pentaho Server

• By selecting the Pentaho server option, you no longer need to define a Slave server when you want to execute remotely.
• Behind the scenes, this option executes the transformation or job via the Scheduler. This is the same as doing a “Schedule Now.”

This new functionality improves ease of use, also for Worker Nodes.

Page 22:

Run Configurations within Job Entries

• A Run Configuration can be used in the Run dialog and also in the job entries that can execute jobs or transformations remotely and on Worker Nodes

[Screenshots: example in 7.1 compared with 8.0]

Page 23:

Demonstration

Page 24:

Availability and Roadmap

Page 25:

Availability

• Worker Nodes is EE only
• Initially, 8.0 Worker Nodes will be Limited Availability
  – Fully supported, production deployment
  – Distribution to a limited number of customers
• Requires additional download and implementation services

Page 26:

Roadmap

• Pentaho Server & Repository as a Service including High Availability
• Improved Monitoring and Logging
• Extend to other Pentaho work items such as Reports
• Integrated with other Hitachi Vantara Services and Products

[Diagram labels: Container Framework; Pentaho Server; WN 1, e.g. KJB; WN 2, e.g. KTR; WN …n “Executor”; Pentaho Repository]

Page 27:

Summary

What we covered today:

• The upcoming capabilities for scaling out the Pentaho platform and when to use them
• How to use the new way of scaling out work items (Pentaho processes such as PDI jobs and transformations) across multiple nodes

Page 28:

Next Steps

Want to learn more?

• Meet-the-Expert:
  – Pedro Teixera
• Other recommended breakout sessions:
  – Matt Howard: Pentaho 8.0 and Roadmap
  – Rakesh Saha and Jens Bleuel: Roadmap: Processing Big Data
  – Matt Casters: PDI Best Architecture Practices
  – Steve Szabo: PDI Sizing Overview and Case Study
  – Jonathan Jarvis: Understanding Parallelism with PDI and Adaptive Execution with Spark
  – Mark Burnett: Understanding the Big Data Technology Ecosystem
